I have a 60 MB file with lots of lines.
Each line has the following format:
(x,y)
Each line will be parsed as a numpy vector of shape (1,2).
At the end, everything should be concatenated into one big numpy array of shape (N,2), where N is the number of lines.
What is the fastest way to do this? Right now it takes too much time (more than 30 minutes).
My Code:
import numpy as np

points = None
with open(fname) as f:
    for line in f:
        point = parse_vector_string_to_array(line)
        if points is None:
            points = point
        else:
            # grows the array one row at a time
            points = np.vstack((points, point))
Where the parser is:
def parse_vector_string_to_array(string):
    # eval the "(x,y)" text and wrap the pair in a (1,2) array
    x, y = eval(string)
    array = np.array([[x, y]])
    return array
The problem is `points = np.vstack((points, point))`: it copies the whole `points` array for every new line, so the cost grows with the square of the number of lines. Instead, make `points` a python list and append to it. Don't convert it to a numpy array until you have finished reading the file, as in the sketch below.
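A minimal sketch of that approach, keeping the question's `fname` and `eval`-based parsing (the helper name `load_points` is made up for illustration). Each parsed pair is appended to a plain list, and a single `np.array` call at the end builds the (N, 2) result, so no intermediate (1, 2) arrays are needed:

import numpy as np

def load_points(fname):
    # Appending to a Python list is cheap per line, unlike np.vstack,
    # which copies the whole accumulated array on every iteration.
    rows = []
    with open(fname) as f:
        for line in f:
            x, y = eval(line)      # same parsing as in the question
            rows.append((x, y))
    # Convert once, after the whole file has been read.
    return np.array(rows)

points = load_points(fname)        # points.shape == (N, 2)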