
I'm trying to load HDF5 datasets in a PyTorch training loop.

Regardless of the num_workers setting on the DataLoader, this randomly throws "KeyError: 'Unable to open object (component not found)'" (traceback below).

I'm able to start the training loop, but I can't get through a quarter of one epoch before this error occurs, and it happens for random 'datasets' (each a 2D array). I can load those same arrays separately in the console with the usual f['group/subgroup'][()], so the HDF5 file doesn't appear to be corrupted and there's nothing obviously wrong with the datasets/arrays themselves.

I've tried:

  • adjusting num_workers, as suggested for various other issues people have had with PyTorch; it still happens with num_workers=0.
  • upgrading/downgrading torch, numpy and Python versions.
  • calling f.close() at the end of the data loader's __getitem__ (see the sketch after this list).
  • using a fresh conda env and reinstalling dependencies.
  • opening the parent group first, then initialising the array, e.g. X = f[ID] then X = X[()].
  • using double slashes in the HDF5 path.
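
For completeness, this is roughly what the close-on-every-call variant looked like; a minimal sketch, assuming the same dataset fields (self.hdf_file, self.list_IDs, self.y_list, self.transform) as the excerpt further down, with a context manager so the handle is released even if a transform raises:

import h5py
import numpy as np

def __getitem__(self, index):
    ID = self.list_IDs[index]

    # Open read-only for this call; the with-block guarantees the
    # file is closed even if indexing or the transform raises.
    with h5py.File(self.hdf_file, 'r', libver='latest', swmr=True) as f:
        X = f[ID][()]

    X = X[:, :, np.newaxis]  # (H, W) -> (H, W, C)
    y = self.y_list[index]
    if self.transform:
        X = self.transform(X)
    return ID, X, y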

Because this recurs with num_workers=0, I figure it's not a multiprocessing issue, although the traceback seems to point to lines in torch/utils/data/dataloader.py that prep the next batch.

I just can't figure out why h5py can't see the odd individual dataset, randomly.

IDs are strings that match HDF5 paths, e.g.: ID = "ID_12345//Ep_-1//AN_67891011//ABC"
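
As a diagnostic, every ID can be swept against the file up front, outside the DataLoader; a minimal sketch, assuming the same hdf_file and list_IDs as in the excerpt below (h5py's in operator resolves the same paths as indexing, so the double slashes should be handled the same way):

import h5py

def find_missing_ids(hdf_file, list_IDs):
    """Return the IDs that don't resolve to an object in the file."""
    with h5py.File(hdf_file, 'r') as f:
        return [ID for ID in list_IDs if ID not in f]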

Excerpt from the dataset class used by the DataLoader:

import h5py
import numpy as np

def __getitem__(self, index):

    ID = self.list_IDs[index]

    # Open the hdf file in read mode:
    f = h5py.File(self.hdf_file, 'r', libver='latest', swmr=True)

    X = f[ID][()]

    X = X[:, :, np.newaxis]  # torchvision 0.2.1 needs (H x W x C) for transforms

    y = self.y_list[index]

    if self.transform:
        X = self.transform(X)

    return ID, X, y

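A common alternative pattern (sketched here; not what the code above does) is to open the file lazily and cache one handle per DataLoader worker, so each worker process creates its own h5py.File after the fork rather than sharing one. The class and field names below are hypothetical, following the excerpt above:

import h5py
import numpy as np
from torch.utils.data import Dataset

class HDF5Dataset(Dataset):
    """Hypothetical variant: one cached file handle per worker process."""

    def __init__(self, hdf_file, list_IDs, y_list, transform=None):
        self.hdf_file = hdf_file
        self.list_IDs = list_IDs
        self.y_list = y_list
        self.transform = transform
        self._f = None  # opened lazily, after any worker fork

    def __getitem__(self, index):
        if self._f is None:
            self._f = h5py.File(self.hdf_file, 'r',
                                libver='latest', swmr=True)
        ID = self.list_IDs[index]
        X = self._f[ID][()][:, :, np.newaxis]
        y = self.y_list[index]
        if self.transform:
            X = self.transform(X)
        return ID, X, y

    def __len__(self):
        return len(self.list_IDs)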

Expected: the training loop runs to completion.

Actual: IDs/datasets/examples load fine initially; then, after somewhere between 20 and 200 steps...

Traceback (most recent call last):
  File "Documents/BSSA-loc/mamdl/models/main_v3.py", line 287, in <module>
    main()
  File "Documents/BSSA-loc/mamdl/models/main_v3.py", line 203, in main
    for i, (IDs, images, labels) in enumerate(train_loader):
  File "/home/james/anaconda3/envs/jc/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 615, in __next__
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "/home/james/anaconda3/envs/jc/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 615, in <listcomp>
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "/home/james/Documents/BSSA-loc/mamdl/src/data_loading/Data_loader_v3.py", line 59, in __getitem__
    X = f[ID][()]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/james/anaconda3/envs/jc/lib/python3.7/site-packages/h5py/_hl/group.py", line 262, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: 'Unable to open object (component not found)'

1 Answer

For the record, my best guess is that this was due to a bug in my code for HDF5 construction, which was stopped and restarted multiple times in append mode. Some datasets appeared complete when queried with f['group/subgroup'][()] but could not be loaded by the PyTorch DataLoader.

Haven't had this issue since rebuilding the HDF5 file differently.
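
One way to sanity-check a rebuilt file is to force a full read of every dataset, the same access the DataLoader performs; a minimal sketch assuming nothing beyond h5py itself (verify_hdf is a hypothetical helper, not part of my code above):

import h5py

def verify_hdf(path):
    """Try to fully read every dataset in the file; collect failures."""
    bad = []

    def visit(name, obj):
        if isinstance(obj, h5py.Dataset):
            try:
                obj[()]  # force a full read, as the DataLoader would
            except Exception as e:
                bad.append((name, repr(e)))

    with h5py.File(path, 'r') as f:
        f.visititems(visit)
    return bad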
