Fetch values in dataframe

Question

I have a dataframe as follows:

        A   B   
1   2   3   4   5
4   5   6   7   8

I am fetching data from this dataframe as follows:

print(file_dataframe.columns)

Index(['A', 'B', 'Unnamed: 2'], dtype='object')

file_dataframe_values = [cell for column in file_dataframe.columns for cell in file_dataframe[column].values.tolist()]
print(file_dataframe_values)

['3', '6', '4', '7', '5', '8']

Why are the first two values of each row (1, 2 and 4, 5) missing from the output?

When I use the following dataframe:

    A
1   2   3   4   5
4   5   6   7   8


print(file_dataframe.columns)

Index(['A', 'Unnamed: 1', 'Unnamed: 2', 'Unnamed: 3'], dtype='object')

file_dataframe_values = [cell for column in file_dataframe.columns for cell in file_dataframe[column].values.tolist()]
print(file_dataframe_values)

['2', '5', '3', '6', '4', '7', '5', '8']

When I use the following dataframe, where the first row is empty:

1   2   3   4   5
4   5   6   7   8


print(file_dataframe.columns)

Index(['Unnamed: 0', 'Unnamed: 1', 'Unnamed: 2', 'Unnamed: 3', 'Unnamed: 4'], dtype='object')

file_dataframe_values = [cell for column in file_dataframe.columns for cell in file_dataframe[column].values.tolist()]
print(file_dataframe_values)

['1', '4', '2', '5', '3', '6', '4', '7', '5', '8']

Can anyone please explain this behavior?

I think pandas interprets unnamed column(s) before the first named one as a (Multi)Index if you don't specify 'header = 1'. In contrast, if all columns are unnamed, pandas labels every one of them 'Unnamed', since a dataframe can't be made of index columns only.
– sudonym, Commented Jun 21, 2018 at 8:24

jpp · Accepted Answer · 2018-06-21 08:24:55Z


Don't judge a dataframe by `print`

In the first instance, you have a dataframe with a MultiIndex:

import pandas as pd

df = pd.DataFrame([[3, 4, 5], [6, 7, 8]],
                  columns=['A', 'B', ''],
                  index=pd.MultiIndex.from_tuples([(1, 2), (4, 5)]))

print(df)

     A  B   
1 2  3  4  5
4 5  6  7  8

In the second instance, you have a dataframe with a regular Index:

df = pd.DataFrame([[2, 3, 4, 5], [5, 6, 7, 8]],
                  columns=['A', '', '', ''],
                  index=[1, 4])

print(df)

   A         
1  2  3  4  5
4  5  6  7  8

When you extract columns, index and values from each dataframe, you will get different results. This shouldn't be surprising. However, it does require you to learn about Pandas indexing, which is a useful exercise in any case. The official docs on indexing and selecting data, and on MultiIndex (advanced indexing), may be helpful.

Unfortunately, there's no shortcut. This is purely API-specific logic.
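
As a minimal sketch of the difference, the two frames above can be inspected side by side. The helper `flatten_columns` is a hypothetical name, not part of the question; it mirrors the question's list comprehension but walks the columns by position, because the second frame has duplicate (empty) column labels.

```python
import pandas as pd

# First instance: the two leading values of each row live in a MultiIndex.
df_multi = pd.DataFrame([[3, 4, 5], [6, 7, 8]],
                        columns=['A', 'B', ''],
                        index=pd.MultiIndex.from_tuples([(1, 2), (4, 5)]))

# Second instance: only the first value of each row is the (regular) index.
df_single = pd.DataFrame([[2, 3, 4, 5], [5, 6, 7, 8]],
                         columns=['A', '', '', ''],
                         index=[1, 4])


def flatten_columns(df):
    """Return all cell values, column by column (the index is not included)."""
    return [cell for i in range(df.shape[1]) for cell in df.iloc[:, i].tolist()]


print(df_multi.index.nlevels, df_multi.index.tolist())    # index holds (1, 2) and (4, 5)
print(df_single.index.nlevels, df_single.index.tolist())  # index holds 1 and 4
print(flatten_columns(df_multi))
print(flatten_columns(df_single))
```

Values stored in the index never show up when iterating over `df.columns`, which is why the first frame appears to "start" at 3 and the second at 2.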


3 Comments

Piyush S. Wanare Over a year ago

I don't actually have an index in my data. It is Excel sheet data, and pandas is treating the leading columns as an index. Is there a way to stop it from treating them as an index?

jpp Over a year ago

@PiyushS.Wanare, use df = df.reset_index() to elevate the index levels to columns.
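
A short sketch of that suggestion, using the MultiIndex frame from the answer as a stand-in for the Excel data (the variable names `flat` and `values` are illustrative only):

```python
import pandas as pd

df = pd.DataFrame([[3, 4, 5], [6, 7, 8]],
                  columns=['A', 'B', ''],
                  index=pd.MultiIndex.from_tuples([(1, 2), (4, 5)]))

# reset_index() moves every index level into an ordinary column.
# Unnamed MultiIndex levels receive the default labels 'level_0', 'level_1'.
flat = df.reset_index()

# The question's comprehension now sees the former index values as well.
values = [cell for column in flat.columns for cell in flat[column].tolist()]

print(list(flat.columns))
print(values)
```

After the reset, the frame has a plain `RangeIndex` and the former index values come first in the flattened list.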

Piyush S. Wanare Over a year ago

@jpp, I got it working by passing header=None when reading the Excel file.
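
A rough sketch of the `header=None` approach. To stay self-contained it uses `pd.read_csv` on an in-memory string as a stand-in for `pd.read_excel` (which also accepts `header=None`); the sample text `raw` is an assumption about how the sheet is laid out, not the asker's actual file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the sheet: a header row whose first two cells
# and last cell are blank, followed by two rows of numbers.
raw = ",,A,B,\n1,2,3,4,5\n4,5,6,7,8\n"

# header=None: no row is used for column labels, so the columns are
# numbered 0..4 and the sheet's header row comes back as the first data row.
df = pd.read_csv(io.StringIO(raw), header=None)

# Drop that first row to keep only the two rows of numbers.
data = df.iloc[1:].reset_index(drop=True)

print(list(df.columns))
print(df.shape, data.shape)
```

With this approach no column is consumed as an index, at the cost of losing the `A` and `B` labels unless they are reassigned afterwards.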
