I recently started learning Python for data analysis, and I am having trouble understanding some cases of object assignment when using pandas DataFrame and Series.
First of all, I understand that changing the value of one variable will not change another variable whose value was assigned from the first one. The typical example:
a = 7
b = a
a = 12
At this point a = 12 and b = 7. But when using pandas I have the following situation:
import pandas as pd
my_df = pd.DataFrame({'Col1': [2, 7, 9],'Col2': [1, 6, 12],'Col3': [1, 6, 9]})
pd_colnames = pd.Series(my_df.columns.values)
list_colnames = list(my_df.columns.values)
Now these two objects contain the same text, one as a pd.Series and the other as a list. But if I change some column names, the values in the Series change as well:
>>> my_df.columns.values[0:2] = ['a','b']
>>> pd_colnames
0 a
1 b
2 Col3
dtype: object
>>> list_colnames
['Col1', 'Col2', 'Col3']
Can somebody explain to me why the values did not change in the built-in list, while the values in the pandas.Series changed when I modified the data frame?
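The difference between the two cases may come down to rebinding a name versus mutating an object in place. The following is a minimal plain-Python sketch of that distinction, not a description of pandas internals: the names `alias` and `snapshot` are made up for illustration, and the assumption is that a Series built from `columns.values` behaves like `alias` (a second handle on the same data), while `list(...)` behaves like `snapshot` (a new container).

```python
# Rebinding: `a` is pointed at a new object, so `b` is unaffected.
a = 7
b = a
a = 12

# In-place mutation: two names can refer to the same mutable object.
names = ['Col1', 'Col2', 'Col3']
alias = names            # no copy, just a second name for the same list
snapshot = list(names)   # a new list that holds the same strings

names[0:2] = ['a', 'b']  # slice assignment changes the list in place

print(b)         # 7
print(alias)     # ['a', 'b', 'Col3']
print(snapshot)  # ['Col1', 'Col2', 'Col3']
```

Slice assignment on `columns.values` is an in-place write of the same kind, so any object that shares the underlying array would see the change.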
What can I do to avoid this behavior in pandas.Series? I have a data frame whose column names I sometimes need in English and sometimes in Spanish, and I'd like to keep both sets as pandas.Series objects so that I can work with them.
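One way this might be approached is to copy the column labels before wrapping them in a Series, and to rename columns by assigning to `my_df.columns` instead of writing into `columns.values`. This is a sketch under those assumptions: `names_en` and `names_es` are hypothetical names, and the Spanish labels are invented for the example.

```python
import pandas as pd

my_df = pd.DataFrame({'Col1': [2, 7, 9], 'Col2': [1, 6, 12], 'Col3': [1, 6, 9]})

# Copy the label array so the Series owns its data and cannot be
# affected by later changes to the data frame's columns.
names_en = pd.Series(my_df.columns.values.copy())

# Hypothetical Spanish labels, made up for this example.
names_es = pd.Series(['Columna1', 'Columna2', 'Columna3'])

# Replace the labels through the public setter, which builds a new Index
# rather than overwriting the existing array in place.
my_df.columns = names_es.tolist()

print(names_en.tolist())       # ['Col1', 'Col2', 'Col3']
print(my_df.columns.tolist())  # ['Columna1', 'Columna2', 'Columna3']
```

With both Series kept separately, switching languages is a matter of assigning `names_en.tolist()` or `names_es.tolist()` to `my_df.columns`.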
some_series.values.tolist() or list(some_series.values). The majority of the time, it’s completely unnecessary. On the rare occasion that you do need a list, you can simply use some_series.tolist().