
I have a large CSV file, approximately 6 GB, and it takes a long time to load into Python. I get the following error:

import pandas as pd
df = pd.read_csv('nyc311.csv', low_memory=False)


Python(1284,0x7fffa37773c0) malloc: *** mach_vm_map(size=18446744071562067968) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/io/parsers.py", line 646, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/io/parsers.py", line 401, in _read
    data = parser.read()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/io/parsers.py", line 939, in read
    ret = self._engine.read(nrows)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/io/parsers.py", line 1508, in read
    data = self._reader.read(nrows)
  File "pandas/parser.pyx", line 851, in pandas.parser.TextReader.read (pandas/parser.c:10438)
  File "pandas/parser.pyx", line 939, in pandas.parser.TextReader._read_rows (pandas/parser.c:11607)
  File "pandas/parser.pyx", line 2024, in pandas.parser.raise_parser_error (pandas/parser.c:27037)
pandas.io.common.CParserError: Error tokenizing data. C error: out of memory

I don't think I understand the error; the last line seems to suggest that the file is too big to load? I also tried the low_memory=False option, but that did not work either.

I'm also not sure what "can't allocate region" means. Could it be that the header includes a 'region' column and pandas cannot locate the data underneath it?

2 Comments

  • You need to read the file in chunks; use the chunksize parameter. Commented Feb 8, 2017 at 4:24
  • Just a heads-up on another cause of this: in pandas 0.20.3 I hit the *** error: can't allocate region *** set a breakpoint in malloc_error_break to debug error in a script that I had last run under a previous pandas version. The thing that rectified the error in this case was removing the low_memory=False option. The script loads a large (1.2 GB) dataset on a machine with 32 GB of RAM, where it and larger datasets normally load happily, but it failed at df = pd.read_csv(datasetName, low_memory=False) until low_memory=False was removed. Commented Feb 10, 2018 at 13:52

1 Answer


An out-of-memory error is about RAM: the combined memory footprint of all in-RAM objects must be smaller than the available RAM, and here it is not.

You can see this from malloc: *** mach_vm_map(size=18446744071562067968) failed. The requested size, 18446744071562067968, equals 2^64 - 2^31, which is what a negative 32-bit value (-2^31) looks like when reinterpreted as an unsigned 64-bit size; in other words, an internal size computation overflowed while the parser was allocating its buffers.

Try reading the file in chunks:

df = pd.read_csv('nyc311.csv', chunksize=5000)  # returns an iterator of DataFrames, not one DataFrame
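A minimal sketch of the chunked workflow: with chunksize set, pd.read_csv returns an iterator, so you process each chunk (for example, filtering to just the rows you need) and only concatenate the reduced pieces. The tiny in-memory CSV and its column names (complaint_type, borough) are made up for the example; substitute your real file and columns.

```python
import io
import pandas as pd

# A small in-memory CSV stands in for the real 6 GB 'nyc311.csv'.
csv_data = io.StringIO(
    "complaint_type,borough\n"
    "Noise,BROOKLYN\n"
    "Heat,QUEENS\n"
    "Noise,MANHATTAN\n"
)

# chunksize makes read_csv yield DataFrames of at most that many rows
# instead of loading the whole file at once; keep only what you need
# from each chunk so the filtered pieces fit in RAM.
chunks = []
for chunk in pd.read_csv(csv_data, chunksize=2):
    chunks.append(chunk[chunk["complaint_type"] == "Noise"])

df = pd.concat(chunks, ignore_index=True)
print(len(df))  # 2
```

If you truly need every row in memory at once, chunking alone won't help; in that case reduce the footprint per row (e.g. read only the columns you need) or use more RAM.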

Or, if reading this CSV is only one part of your program and other dataframes were created earlier, free the ones you no longer need before the big read:

import gc

del old_df          # drop references to dataframes not in use
gc.collect()        # run a collection pass to reclaim the memory
del gc.garbage[:]   # clear any uncollectable objects the collector has tracked
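To see that this actually releases a meaningful amount of memory, you can check a DataFrame's footprint before deleting it. This is a sketch; old_df is a placeholder for whatever earlier dataframe your program no longer needs.

```python
import gc
import pandas as pd

# Placeholder for an earlier dataframe that is no longer needed.
old_df = pd.DataFrame({"a": range(1_000_000)})

# One million int64 values is roughly 8 MB before any string columns.
bytes_before = old_df.memory_usage(deep=True).sum()
print(f"old_df held about {bytes_before / 1e6:.0f} MB")

del old_df                   # drop the only reference
unreachable = gc.collect()   # returns the number of unreachable objects found
print(f"gc.collect() found {unreachable} unreachable objects")
```

Note that del only removes the name; the memory is freed once no other references to the object remain.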



2 Comments

Hi, thank you for your comment. Why would I get the following error message: Python(5431,0x7fffa37773c0) malloc: *** mach_vm_map(size=18446744071562067968) failed (error code=3) *** error: can't allocate region *** set a breakpoint in malloc_error_break to debug, followed by Python(5431,0x7fffa37773c0) malloc: *** error for object 0x104623257: pointer being freed was not allocated *** set a breakpoint in malloc_error_break to debug?
@song0089 malloc means memory allocation, so there is some issue with allocating free memory to store your dataframe. Allocation starts from a pointer, each row of your dataframe is saved in memory, and the pointer is advanced each time; as you can see, at object 0x104623257 (which may be some nth row) the allocator had no free address (i.e. memory) left to store that row, which is why you are getting this error. If you're satisfied, kindly upvote/accept the answer, as is common practice here.
