I have a list of dicts.

data = [{'a': 'qwerty', 'b': 123}]

I create a dataframe:

df = pd.DataFrame(data)

Now I want to persist it:

df.to_hdf(filename, 'book', format='table', mode='a', append=True)

Now I want to persist another batch of data whose string value is slightly longer:

data = [{'a': 'qwerty2', 'b': 123}]
df = pd.DataFrame(data)
df.to_hdf(filename, 'book', format='table', mode='a', append=True)

It fails with this error:

ValueError: Trying to store a string with len [7] in [values_block_2] column but
this column has a limit of [6]!
Consider using min_itemsize to preset the sizes on these columns

It only works when the strings in the column keep the same length; if the length differs, I get the error above. How do I make pandas work with strings of any length?

  • Are you getting this error on your example? I am not. Commented Feb 19, 2018 at 9:29
  • @Stev, I edited my question; it is now reproducible. Commented Feb 20, 2018 at 10:56

1 Answer

I finally found an answer to my own question. The problem is that the first to_hdf call automatically creates a schema from the data it is given, including a maximum length for each string column. If a later batch contains a string that exceeds the limit set for that column by the first batch, the append fails with the error: ValueError: Trying to store a string with len

So the solution is to pass the min_itemsize argument to to_hdf when the table is first created, with a size large enough for the later batches:

df.to_hdf(filename, 'book', format='table', mode='a', append=True, min_itemsize={'a': 7})
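
For anyone who wants to try this end to end, here is a minimal sketch using the two batches from the question. The scratch file path and the limit of 32 are my own choices for illustration, not anything pandas requires, and it assumes the optional PyTables package is installed. It also spells the table option as format='table', which is what recent pandas versions accept.

```python
import os
import tempfile

import pandas as pd  # to_hdf/read_hdf also need the optional PyTables package ("tables")

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "book_example.h5")

first = pd.DataFrame([{"a": "qwerty", "b": 123}])
second = pd.DataFrame([{"a": "qwerty2", "b": 123}])

# Reserve room for longer strings in column 'a' when the table is created.
# 32 is an arbitrary upper bound chosen for this sketch.
first.to_hdf(path, key="book", format="table", mode="a", append=True,
             min_itemsize={"a": 32})

# The longer string now fits within the reserved width, so the append succeeds.
second.to_hdf(path, key="book", format="table", mode="a", append=True)

# Read everything back to check that both batches were stored.
result = pd.read_hdf(path, key="book")
print(result)
```

Note that min_itemsize is passed only on the write that creates the table; the second call appends to the existing schema.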

In other words, you can treat an HDF table like a simple SQL table where you need to predefine the size of each string column.
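
If all the batches are available before the first write, one way to choose that size is to measure the longest string up front. This is only a sketch: max_utf8_len is a hypothetical helper of mine, not a pandas function, and it measures UTF-8 encoded length on the assumption that the limit applies to the stored bytes rather than to characters. When later batches are not known in advance, a generous fixed upper bound is the simpler option.

```python
import pandas as pd


def max_utf8_len(frames, column):
    """Return the longest UTF-8 encoded length found in `column` across frames.

    Hypothetical helper for illustration; raises ValueError if there are no rows.
    """
    return max(
        len(str(value).encode("utf-8"))
        for frame in frames
        for value in frame[column]
    )


batches = [
    pd.DataFrame([{"a": "qwerty", "b": 123}]),
    pd.DataFrame([{"a": "qwerty2", "b": 123}]),
]

# Width to reserve for column 'a' on the first write.
itemsize = {"a": max_utf8_len(batches, "a")}
print(itemsize)
```

The resulting dict can be passed as min_itemsize on the first to_hdf call.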

Alternatively, if the table already exists with a limit that is too small, you need to write the data into a new file.


1 Comment

Well done :) I am sure other people will find this useful.
