You could use a setup with two cameras. The first one renders the player's first-person view. The second one points at the Tilemap and renders into a render texture, which you assign as the "Target Texture" in the camera component. That render texture can then be used as the texture of a RawImage UI object, so you can place the minimap window on a canvas just like any other UI object.
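If you would rather wire this up from a script than in the inspector, a minimal sketch could look like the following. The class name `MinimapSetup`, the field names, and the 256x256 resolution are my own assumptions for illustration; creating a Render Texture asset and assigning it in the inspector works just as well.

```csharp
using UnityEngine;
using UnityEngine.UI;

// Hypothetical helper (name and fields are illustrative, not from any library):
// connects a second camera to a RawImage through a RenderTexture.
public class MinimapSetup : MonoBehaviour
{
    [SerializeField] private Camera minimapCamera;   // the camera pointing at the Tilemap
    [SerializeField] private RawImage minimapImage;  // the RawImage placed on your canvas
    [SerializeField] private int textureSize = 256;  // assumed resolution, adjust to taste

    private RenderTexture minimapTexture;

    private void Awake()
    {
        // Width, height, and depth buffer bits.
        minimapTexture = new RenderTexture(textureSize, textureSize, 16);

        // Same effect as setting "Target Texture" in the camera component.
        minimapCamera.targetTexture = minimapTexture;

        // Show what the minimap camera renders inside the UI.
        minimapImage.texture = minimapTexture;
    }

    private void OnDestroy()
    {
        // Detach and free the texture so it does not leak.
        if (minimapCamera != null)
        {
            minimapCamera.targetTexture = null;
        }
        if (minimapTexture != null)
        {
            minimapTexture.Release();
            Destroy(minimapTexture);
        }
    }
}
```

Attach the script to any GameObject in the scene and drag the second camera and the RawImage into the two fields.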
Which camera renders what can be controlled via layers, using each camera's "Culling Mask". Alternatively, you can try to design the spatial layout of your scene so that the two cameras and the things they render never get close enough to each other for anything to appear in the field of view of the wrong camera. Usually I would not advise that (because you never know how your scene might evolve), but your use case sounds simple enough that it could work.
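For the layer approach, here is a rough sketch of setting the culling masks from code. It assumes you have created a layer called "Minimap" under Tags and Layers and put the Tilemap on it; the layer name and the class name `MinimapLayers` are just placeholders I made up. You can get the same result by ticking the layers in each camera's "Culling Mask" dropdown.

```csharp
using UnityEngine;

// Hypothetical helper (illustrative names): splits rendering between
// the two cameras by layer.
public class MinimapLayers : MonoBehaviour
{
    [SerializeField] private Camera playerCamera;   // first-person camera
    [SerializeField] private Camera minimapCamera;  // camera pointing at the Tilemap

    // Assumed layer name; it must exist in Tags and Layers,
    // otherwise GetMask returns 0 and the minimap camera renders nothing.
    [SerializeField] private string minimapLayerName = "Minimap";

    private void Awake()
    {
        int minimapMask = LayerMask.GetMask(minimapLayerName);

        // The minimap camera renders only objects on the minimap layer.
        minimapCamera.cullingMask = minimapMask;

        // The player camera keeps its current layers, minus the minimap layer.
        playerCamera.cullingMask &= ~minimapMask;
    }
}
```

With this in place the two cameras can overlap in space without the Tilemap ever showing up in the first-person view.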