1

I am working with high res NOAA aerial imagery data from Hurrican Ida, RGB. Data is available here under download - any of the date times will download a large amount of tifs https://storms.ngs.noaa.gov/storms/ida/index.html#18/29.46140/-90.30946

I want to merge multiple tifs along (x , y , bands) in order to create a complete 'mosaic', to identify any missing patches of imagery, and then create a building outline mask.

I have a function that can merge 20 Tifs, but above that it kills the kernel. Is there a better way to merge further? Ideally would want 40 in an image.

import rioxarray
from rioxarray import merge
from rasterio.plot import show 
 
def combine_tif_large(file_list, title=''):
        """Plot the combined image of a set of tif files - chunks long file list into groups of 50 (n) to prevent datasets closing
        file_list = list of file paths for files to be merged
        title = title for plot 
        """
        n = 50
        fig, ax = plt.subplots( figsize=(20,20))
        chunked_files = [file_list[i:i + n] for i in range(0, len(file_list), n)]
    
        final = [] 
        for chunk in chunked_files:
            now = datetime.now()
            current_time = now.strftime("%H:%M:%S")
            print("Current Time =", current_time)
            print('starting chunk..')
            elements = []
            for val in chunk:
                elements.append(rioxarray.open_rasterio(val))
            print('finish chunk')
            print('starting  merge')
            merged = merge.merge_arrays(elements, nodata=0.0)
            final.append(merged)
        print('starting final merge')
        merge_final = merge.merge_arrays(final, nodata=0.0)
        image = merge_final.values
        show(image, ax = ax, title= title) 
2
  • 1
    If your kernel is being killed because the data is too large to fit into memory, it's probably too large to generate a single image from! You'll want to downsample the data somehow. the simplest approach would be to literally sample every Nth pixel... something like da = rioxarray.open_rasterio(val); da = da[::100, ::100]. You could use dask or all kinds of tricks to work with a larger-than-memory dataset but if you want to visualize the whole thing you're going to have to sample from or summarize it somehow. Commented May 5, 2022 at 1:43
  • 1
    thanks that makes a lot of sense! I guess I was thinking that I had to keep the resolution for the network, but thats not needed for the plot Commented May 6, 2022 at 14:07

0

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.