I'm a little stumped on this one. I've created a proof of concept where I built a Pandas Dataframe with a static Numpy Array of numbers. I got this working fine, but now I'm taking it a step further and importing a CSV file to build this same Dataframe and Numpy Array. Here is the snippet of the file and what I've written. I want to take the second column of 'numbers' and build an array of 6 numbers per line. For example, [[11],[21],[27],[36],[62],[24]], [[14],[18],[36],[49],[67],[18]], etc.
CSV:
date,numbers,multiplier
09/26/2020,11 21 27 36 62 24,3
09/30/2020,14 18 36 49 67 18,2
10/03/2020,18 31 36 43 47 20,2
CODE:
data = pd.read_csv('pbhistory.csv')
data['date'] = pd.to_datetime(data.date, infer_datetime_format=True)
data.sort_values(by='date', ascending=True, inplace=True)
df = pd.DataFrame(data.numbers).to_numpy()
df2 = pd.DataFrame(df, columns=['1', '2', '3', '4', '5', '6'])
print(df2.head())
ERROR: I'm expecting 6 columns of data from df2 as I thought it was converted to an array properly after importing the 'numbers' column from the CSV, but I get the following:
ValueError: Shape of passed values is (1414, 1), indices imply (1414, 6)
So, I change the code to df2 = pd.DataFrame(df, columns=['1']) and get the following output. The problem is, I need it to be in 6 columns, not 1.
1
0 11 21 27 36 62 24
1 14 18 36 49 67 18
2 18 31 36 43 47 20
So, as you can see, I'm only getting one column with all numbers, instead of an array of numbers with 6 columns.