Pandas dataframe creation returning none

Question

I want to add a column of 1s at the beginning of a pandas dataframe that is created from an external data file, 'ex1data1.txt'. I wrote the following code. The problem is that the final print(data) prints None. What is wrong with this code? I want data to be a pandas dataframe. raw_data and X0_ are fine; I have printed them.

import numpy as np
import pandas as pd
raw_data = pd.read_csv('ex1data1.txt', header= None, names= ['x1','y'])
X0_ = np.ones(len(raw_data))
idx = 0
data = raw_data.insert(loc=idx, column='x0', value=X0_)
print(data)

heidemarie · Accepted Answer · 2018-06-09 20:36:16Z


Another solution might look like this:

import numpy as np
import pandas as pd
raw_data = pd.read_csv('ex1data1.txt', header= None, names= ['x1','y'])

raw_data.insert(loc=0, column='x0', value=1.0)

print(raw_data)

edited Jun 9, 2018 at 20:36

answered Jun 9, 2018 at 20:29


4 Comments

jpp Over a year ago

Is pd.Series actually necessary here? I think it's natural to assign a NumPy array to a series.

heidemarie Over a year ago

You are right, it's not necessary; I'll remove it from the example

user2285236 Over a year ago

Actually the numpy array is not necessary either. You can just pass value=1 and it will broadcast.

heidemarie Over a year ago

Using value=1 fills the column with ints, whereas the np array fills it with floats. But then, if floats are needed, value=1.0 would work just as well, yeah.
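The dtype difference discussed in these comments can be checked with a minimal sketch like the one below. The small dataframe and the column names from_int, from_float and from_ones are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Stand-in data; the question loads 'ex1data1.txt' instead.
df = pd.DataFrame({'x1': [6.1, 5.5], 'y': [17.6, 9.1]})

# A scalar is broadcast to every row; the column dtype follows the scalar.
df.insert(loc=0, column='from_int', value=1)
df.insert(loc=1, column='from_float', value=1.0)

# np.ones produces floats unless a dtype is passed explicitly.
df.insert(loc=2, column='from_ones', value=np.ones(len(df)))

# Show the resulting dtype of each column.
print(df.dtypes)
```

Printing df.dtypes should show an integer dtype for from_int and a float dtype for the other two inserted columns.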

jpp · Accepted Answer · 2021-07-15 11:56:26Z

pd.DataFrame.insert

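You can use pd.DataFrame.insert, but note that it modifies the dataframe in place and returns None, so its result should not be assigned to a variable. That is why data is None in the question. If you need an integer column, set the dtype explicitly, since np.ones produces floats by default: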

import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  columns=['col1', 'col2', 'col3'])

arr = np.ones(len(df.index), dtype=int)
idx = 0
df.insert(loc=idx, column='col0', value=arr)

print(df)

   col0  col1  col2  col3
0     1     1     2     3
1     1     4     5     6
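To tie this back to the question, here is a minimal sketch of why data ended up as None. It uses a small made-up dataframe in place of 'ex1data1.txt', and the variable name result is only for illustration.

```python
import numpy as np
import pandas as pd

# Stand-in for pd.read_csv('ex1data1.txt', header=None, names=['x1', 'y']).
raw_data = pd.DataFrame({'x1': [6.1, 5.5, 8.5], 'y': [17.6, 9.1, 13.7]})

# insert() changes raw_data itself and has no return value.
result = raw_data.insert(loc=0, column='x0', value=np.ones(len(raw_data)))

print(result)                  # None
print(list(raw_data.columns))  # ['x0', 'x1', 'y']

# If a separate name is wanted, bind it after the in-place insert.
data = raw_data
print(data)
```

If raw_data must stay unchanged, one option is to call raw_data.copy() first and insert into the copy.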

Direct definition + reordering

One clean solution is to add the new column at the end and then move it to the beginning of your dataframe. Here's a complete example:

import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  columns=['col1', 'col2', 'col3'])

df['col0'] = 1  # adds column to end of dataframe
cols = [df.columns[-1]] + df.columns[:-1].tolist()  # move last column to front
df = df[cols]  # apply new column ordering

print(df)

   col0  col1  col2  col3
0     1     1     2     3
1     1     4     5     6
