I'm working with Python pandas 0.19.
I want to create a new dataframe (df2) as a subset of an existing dataframe (df1). df1 looks like this:
In [1]: df1.head()
Out[1]:
   col1_name  col2_name  col3_name
0         23         42         55
1         27         55         57
2         52         20         52
3         99         18         53
4         65         32         51
The logic is:
df2 = []
for i in range(0,N):
    loc = some complicated logic
    df1_sub = df1.ix[loc,]
    df2.append(df1_sub)
df2 = pd.DataFrame.from_records(df2)
The resulting df2 is indeed a dataframe, but every row just contains the column names of df1. It looks like this:
In [2]: df2.head()
Out[2]:
   col1_name  col2_name  col3_name
0  col1_name  col2_name  col3_name
1  col1_name  col2_name  col3_name
2  col1_name  col2_name  col3_name
3  col1_name  col2_name  col3_name
4  col1_name  col2_name  col3_name
I know it's probably related to the conversion from list to dataframe but I'm not sure what exactly I'm missing here. Or is there a better way of doing this?
Please post df1.head() and the final result that you want. That makes the problem easier to understand.
Avoid .ix unless absolutely necessary. You shouldn't have to create a list of dataframes to do this, but if you do, the last line should be changed to pd.concat(df2). Please provide more info, as it might be possible to construct the logic without a for loop. Also, the name df2 implies you have a DataFrame. Use something like df_list instead.
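The pd.concat suggestion could look roughly like the sketch below. It is only an illustration under assumptions: the sample values are copied from the df1.head() output in the question, `selections` is a made-up stand-in for the "complicated logic" (here, a list of row labels per iteration), and .loc is used in place of .ix, which later pandas versions removed. The last line also hints at a likely cause of the original symptom: iterating over a DataFrame yields its column labels, so a function that treats each list element as one record would see only the column names.

```python
import pandas as pd

# Sample data copied from the df1.head() output in the question.
df1 = pd.DataFrame({
    "col1_name": [23, 27, 52, 99, 65],
    "col2_name": [42, 55, 20, 18, 32],
    "col3_name": [55, 57, 52, 53, 51],
})

# Hypothetical stand-in for "some complicated logic": each entry is
# the list of row labels selected in one loop iteration.
selections = [[0, 1], [3], [2, 4]]

# Collect the sub-DataFrames in a list with a name that says what it holds.
df_list = []
for loc in selections:
    # .loc with a list of labels always returns a DataFrame.
    df_list.append(df1.loc[loc])

# Stack the pieces row-wise into a single DataFrame.
df2 = pd.concat(df_list)

# Iterating over a DataFrame gives its column labels, not its rows.
labels_seen_when_iterating = list(df_list[0])
```

If the original row labels are not needed in the result, pd.concat(df_list, ignore_index=True) renumbers the rows from 0. Note that a scalar `loc` would make df1.loc[loc] a Series rather than a DataFrame, so the selection is kept as a list here.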