Python - initiate empty dataframe and populate from another dataframe

Question

Working with python pandas 0.19.

I want to create a new dataframe (df2) as a subset of an existing dataframe (df1). df1 looks like this:

In [1]: df1.head()
Out [1]:
    col1_name    col2_name    col3_name
0          23           42           55
1          27           55           57
2          52           20           52
3          99           18           53   
4          65           32           51

The logic is:

df2 = []

for i in range(0,N):
    loc = some complicated logic
    df1_sub = df1.ix[loc,]
    df2.append(df1_sub)

df2 = pd.DataFrame.from_records(df2)

The resulting df2 is indeed a dataframe, but every row just repeats the column names of df1. It looks like this:

In [2]: df2.head()
Out [2]:
    col1_name    col2_name    col3_name
0   col1_name    col2_name    col3_name
1   col1_name    col2_name    col3_name
2   col1_name    col2_name    col3_name
3   col1_name    col2_name    col3_name
4   col1_name    col2_name    col3_name

I know it's probably related to the conversion from list to dataframe but I'm not sure what exactly I'm missing here. Or is there a better way of doing this?

Please include df1.head() and the final result that you want. That makes the problem easier to understand.
– Mohammad Yusuf, Commented Jan 6, 2017 at 14:40
I'm not sure exactly what you are asking, but there are several things that need to be addressed. Do not use .ix unless absolutely necessary. You shouldn't have to create a list of dataframes to do this, but if you do, the last line should be changed to pd.concat(df2). Please provide more info, as it might be possible to construct the logic without a for loop. Also, the name df2 implies you have a DataFrame. Use something like df_list instead.
– Ted Petrou, Commented Jan 6, 2017 at 14:43
In the for loop, check the value of loc; it may tell you if there is something wrong.
– Shijo, Commented Jan 6, 2017 at 14:49
@Ted Petrou pd.concat(df2) is the way to go. The logic is indeed complicated. I'll even have to run a while loop within the for loop: take a slice from df1 called df1_sub, take out one row of df1_sub if a condition is met, and keep checking the remaining df1_sub until the condition is no longer met.
– data-monkey, Commented Jan 6, 2017 at 14:49

Oxymoron88 · Accepted Answer · 2017-01-06 14:44:09Z

0

How about just slicing the dataframe with a Boolean mask?

import pandas as pd
DF1 = pd.DataFrame()
DF1['x'] = ['a','b','c','a','c','b']
DF1['y'] = [1,3,2,-1,-2,-3]

DF2 = DF1[[(x == 'a' and y > 0) for x,y in zip(DF1['x'], DF1['y'])]]

This should be far more efficient than appending inside a loop. DF1[...] accepts any Boolean sequence with one entry per row, so an arbitrarily complicated condition can go inside the brackets.
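The same selection can also be written without a Python-level loop. The following is a minimal sketch, assuming the condition can be expressed column-wise; the name `mask` is introduced here for illustration and is not part of the original answer.

```python
import pandas as pd

DF1 = pd.DataFrame({'x': ['a', 'b', 'c', 'a', 'c', 'b'],
                    'y': [1, 3, 2, -1, -2, -3]})

# Build the Boolean mask column-wise. `&` is the element-wise "and",
# and each comparison needs its own parentheses.
mask = (DF1['x'] == 'a') & (DF1['y'] > 0)

# Keep only the rows where the mask is True.
DF2 = DF1[mask]
```

If the condition cannot be vectorized, the list comprehension above still works, since both forms end up passing one Boolean per row to DF1[...].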

DeepSpace · Accepted Answer · 2017-01-06 14:47:44Z

0

You can take advantage of pandas' Boolean masking (Boolean indexing, in the style of NumPy).

import pandas as pd

df1 = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': ['a', 'b', 'c', 'd', 'e'],
                    'c': [10, 11, 12, 13, 14]})

#      a  b   c
#   0  1  a  10
#   1  2  b  11
#   2  3  c  12
#   3  4  d  13
#   4  5  e  14

Let's assume that df2 should be a subset of df1: it should have columns b and c and only the rows where column a has an even value:

df2 = df1[df1['a'] % 2 == 0][['b', 'c']]
#    b   c
# 1  b  11
# 3  d  13
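As a possible variation, the row filter and the column selection can be combined into a single `.loc` call, which avoids chaining two indexing steps. This is a sketch of the same example, not part of the original answer; `even_rows` is a name introduced here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': ['a', 'b', 'c', 'd', 'e'],
                    'c': [10, 11, 12, 13, 14]})

# Boolean Series: True where column a holds an even value.
even_rows = df1['a'] % 2 == 0

# .loc takes the row mask first and the list of column labels second.
df2 = df1.loc[even_rows, ['b', 'c']]
```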

data-monkey · Accepted Answer · 2017-01-06 14:52:08Z

0

As per Ted Petrou, the solution is simply:

pd.concat(df2)

I was confused by the data type of df2: inside the loop it is a plain list of dataframes, not a dataframe.

Given the logic within the for loop, it is not possible to select the rows from df1 directly with a single index expression.
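A minimal sketch of the loop-then-concat pattern follows. The helper `pick_rows` is a hypothetical stand-in for the "complicated logic" in the question, and the list name `pieces` is likewise an assumption. The sketch uses `.loc` because `.ix` was deprecated and later removed in pandas 1.0. Note also that iterating over a dataframe yields its column labels, which is plausibly why `from_records` on a list of dataframes produced rows of column names.

```python
import pandas as pd

df1 = pd.DataFrame({'col1_name': [23, 27, 52, 99, 65],
                    'col2_name': [42, 55, 20, 18, 32],
                    'col3_name': [55, 57, 52, 53, 51]})


def pick_rows(df, i):
    """Hypothetical placeholder for the question's row-selection logic."""
    # Illustration only: rows whose col1_name leaves remainder i modulo 2.
    return df.index[df['col1_name'] % 2 == i]


pieces = []  # a plain Python list of dataframes, not a dataframe yet
for i in range(2):
    labels = pick_rows(df1, i)
    pieces.append(df1.loc[labels])

# Stack the collected slices row-wise into one dataframe.
df2 = pd.concat(pieces)

# Iterating over a dataframe gives column labels, not rows.
column_labels = list(df1)
```

Collecting the slices in a list and calling `pd.concat` once at the end is generally cheaper than growing a dataframe inside the loop.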
