I want to add a column of 1s at the beginning of a pandas dataframe that is created from an external data file, 'ex1data1.txt'. I wrote the following code. The problem is that print(data) at the end prints None. What is wrong with this code? I want data to be a pandas dataframe. raw_data and X0_ are fine; I have printed them.

import numpy as np
import pandas as pd
raw_data = pd.read_csv('ex1data1.txt', header= None, names= ['x1','y'])
X0_ = np.ones(len(raw_data))
idx = 0
data = raw_data.insert(loc=idx, column='x0', value=X0_)
print(data)

2 Answers

Another solution might look like this:

import numpy as np
import pandas as pd
raw_data = pd.read_csv('ex1data1.txt', header= None, names= ['x1','y'])

raw_data.insert(loc=0, column='x0', value=1.0)

print(raw_data)

4 Comments

Is pd.Series actually necessary here? I think it's natural to assign a NumPy array to a series.
You are right, it's not necessary; I'll remove it from the example
Actually the numpy array is not necessary either. You can just pass value=1 and it will broadcast.
Using value=1 will fill the column with ints, whereas the NumPy array will fill it with floats. But then, if floats are needed, value=1.0 would work just as well, yeah.
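
To make the dtype point from these comments concrete, here is a minimal sketch. It assumes a small made-up frame (the values are illustrative, not taken from 'ex1data1.txt') and the hypothetical names df_int and df_float; it only shows that a scalar passed as value is broadcast and that its type decides the column dtype.

```python
import pandas as pd

# Illustrative data only; not the contents of 'ex1data1.txt'.
df_int = pd.DataFrame({'x1': [6.1, 5.5], 'y': [17.6, 9.1]})
df_float = df_int.copy()

# A scalar is broadcast to every row of the new column.
df_int.insert(loc=0, column='x0', value=1)      # integer column
df_float.insert(loc=0, column='x0', value=1.0)  # float column

print(df_int['x0'].dtype)
print(df_float['x0'].dtype)
```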

pd.DataFrame.insert

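You can use pd.DataFrame.insert, but note that it works in place and returns None, so its result should not be assigned back to a variable; that assignment is why data is None in the question. You may also want to set the dtype explicitly, for example to int: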

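import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  columns=['col1', 'col2', 'col3'])

arr = np.ones(len(df.index), dtype=int)  # one 1 per row, as integers
idx = 0
df.insert(loc=idx, column='col0', value=arr)  # modifies df in place, returns None

print(df)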

   col0  col1  col2  col3
0     1     1     2     3
1     1     4     5     6
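
To tie this back to the question, the following sketch shows what the original assignment captures. It assumes 'ex1data1.txt' holds two comma-separated numeric columns and uses an in-memory stand-in (the hypothetical sample variable) so that it runs without the file.

```python
import io
import numpy as np
import pandas as pd

# Hypothetical stand-in for 'ex1data1.txt' (assumed: two comma-separated columns).
sample = io.StringIO("6.1101,17.592\n5.5277,9.1302\n8.5186,13.662\n")
raw_data = pd.read_csv(sample, header=None, names=['x1', 'y'])

# insert() modifies raw_data in place and returns None,
# so assigning its return value gives None.
result = raw_data.insert(loc=0, column='x0', value=np.ones(len(raw_data)))

print(result)                  # None
print(list(raw_data.columns))  # ['x0', 'x1', 'y']
```

After the call, raw_data itself is the dataframe with the extra column, so printing raw_data (rather than the return value) gives the expected result.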

Direct definition + reordering

One clean solution is to simply add a column and then move it from the end to the beginning of your dataframe. Here's a complete example:

import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  columns=['col1', 'col2', 'col3'])

df['col0'] = 1  # adds column to end of dataframe
cols = [df.columns[-1]] + df.columns[:-1].tolist()  # move last column to front
df = df[cols]  # apply new column ordering

print(df)

   col0  col1  col2  col3
0     1     1     2     3
1     1     4     5     6
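
If you would rather keep the original frame untouched and get a new dataframe back (which is what the assignment in the question expected), one option is to wrap the in-place call in a small helper. This is only a sketch; with_leading_ones and its name parameter are hypothetical names, not part of pandas.

```python
import pandas as pd

def with_leading_ones(df, name='x0'):
    """Return a copy of df with a column of 1.0 inserted at position 0."""
    out = df.copy()                            # leave the caller's frame unchanged
    out.insert(loc=0, column=name, value=1.0)  # in-place on the copy only
    return out

raw = pd.DataFrame({'x1': [6.1, 5.5], 'y': [17.6, 9.1]})  # illustrative data
data = with_leading_ones(raw)

print(list(raw.columns))   # ['x1', 'y']
print(list(data.columns))  # ['x0', 'x1', 'y']
```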
