I would like to read a CSV file in pieces with pd.read_csv(path, chunksize=N) until the end of the file, in an elegant and efficient way. The problem is that once the reader has consumed the whole file, the following error occurs:
df.get_chunk()
Traceback (most recent call last):
File "<ipython-input-115-061ea8dbcbad>", line 1, in <module>
df.get_chunk()
File "C:\Users\fedel\Anaconda2\lib\site-packages\pandas\io\parsers.py", line 784, in get_chunk
return self.read(nrows=size)
File "C:\Users\fedel\Anaconda2\lib\site-packages\pandas\io\parsers.py", line 763, in read
ret = self._engine.read(nrows)
File "C:\Users\fedel\Anaconda2\lib\site-packages\pandas\io\parsers.py", line 1213, in read
data = self._reader.read(nrows)
File "pandas\parser.pyx", line 766, in pandas.parser.TextReader.read (pandas\parser.c:7988)
File "pandas\parser.pyx", line 813, in pandas.parser.TextReader._read_low_memory (pandas\parser.c:8629)
StopIteration
and the code can't continue anymore!
I believe a try/except statement would let me suppress that error so the code could move on to the next task. Say I have a DataFrame like the one you can generate with the following lines of code:
import numpy as np
import pandas as pd

path = r"C:\Users\fedel\Desktop" + '\\fileName.csv'
pd.DataFrame(np.random.randn(30, 3), columns=list('abc')).to_csv(path, index=False)
df = pd.read_csv(path, chunksize=6)
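As an aside, the object returned by pd.read_csv(..., chunksize=...) is itself an iterator, so a plain for loop consumes it to the end without ever raising StopIteration into your code. A minimal sketch, using a hypothetical local file name instead of the Desktop path above:

```python
import numpy as np
import pandas as pd

path = "fileName.csv"  # hypothetical path for this sketch

# Same sample data as above: 30 rows, 3 columns.
pd.DataFrame(np.random.randn(30, 3), columns=list('abc')).to_csv(path, index=False)

# Iterating the reader yields DataFrames of up to `chunksize` rows each
# and stops cleanly at end of file.
pieces = [chunk for chunk in pd.read_csv(path, chunksize=6)]
print(len(pieces))                   # 5 chunks
print(sum(len(p) for p in pieces))   # 30 rows in total
```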
I think a statement like the following could catch that error and let the code continue with the next task:
while True:
    try:
        df.get_chunk()
    except TypeOfError:
        funcyfunction()
Could you fix these last exception-handling lines of code, please?
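A minimal sketch of that loop, assuming the StopIteration shown in the traceback is the exception to catch, and counting rows as a stand-in for whatever per-chunk processing funcyfunction() is meant to do (the hypothetical path below replaces the Desktop path from the question):

```python
import numpy as np
import pandas as pd

path = "fileName.csv"  # hypothetical path for this sketch
pd.DataFrame(np.random.randn(30, 3), columns=list('abc')).to_csv(path, index=False)

reader = pd.read_csv(path, chunksize=6)
total_rows = 0
while True:
    try:
        chunk = reader.get_chunk()   # raises StopIteration past the last chunk
    except StopIteration:
        break                        # end of file reached: leave the loop
    total_rows += len(chunk)         # stand-in for real per-chunk processing
print(total_rows)                    # 30
```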
You can also use df = pd.read_csv(path, chunksize=6, error_bad_lines=False) to skip the lines causing errors.