I'm interested in building a distributed dask array out of a bunch of netcdf files I have lying around. I started down the path outlined in "Distributed Dask arrays", but got stuck on the deprecation of 'distributed.collections'.
What is the best way to create a distributed dask array now? I have my dask-scheduler and dask-worker processes running, and I can successfully execute the following:
from distributed import Client, progress
client = Client('scheduler-address:8786')
futures = client.map(load_netcdf, filenames)
progress(futures)
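For context, `load_netcdf` here is a hypothetical helper, assumed to read one variable from one file into an in-memory numpy array (one chunk per file). A minimal sketch, using scipy's netcdf reader as a stand-in for netCDF4, with an illustrative variable name `"data"`:

```python
import numpy as np
from scipy.io import netcdf_file  # stand-in for netCDF4; the API is similar

def load_netcdf(path, varname="data"):
    """Read one variable from a NetCDF file into an in-memory numpy array."""
    # mmap=False copies the data so the array outlives the closed file handle
    with netcdf_file(path, "r", mmap=False) as f:
        return np.asarray(f.variables[varname][:])

# Tiny demo: write a 3-element NetCDF file, then load it back.
with netcdf_file("/tmp/demo.nc", "w") as f:
    f.createDimension("x", 3)
    v = f.createVariable("data", "f8", ("x",))
    v[:] = [1.0, 2.0, 3.0]

arr = load_netcdf("/tmp/demo.nc")
```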
What next?