Append columns from excel file to csv file based on if statement

Question

I have two files:

One with 'filename' and 'value_count' columns (ValueCounts.csv)
Another with 'filename' and 'latitude' and 'longitude' columns (GeoData.xlsx)

I have started by creating a dataframe for each file, containing only the specific columns I intend to use. My code for this is as follows:

Xeno_values = pd.read_csv(r'C:\file_path\ValueCounts.csv')
img_coords = pd.read_excel(r'C:\file_path\GeoData.xlsx')

df_values = pd.DataFrame(Xeno_values, columns = ['A','B'])
df_coords = pd.DataFrame(img_coords, columns = ['L','M','W'])

However, when I print() each dataframe, all the column values are returned as 'NaN'.

How do I correct this? And how do I then write an if statement that iterates over the data and says:

if 'filename' (col 'A') in df_values == 'filename' (col 'W') in df_coords, append 'latitude' (col 'L') and 'longitude' (col 'M') to df_values

If any clarification is needed please do ask.

Thanks, R

eNc · Accepted Answer · 2020-06-10 14:28:43Z


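Check out the documentation for pandas read_csv (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html) and read_excel. These functions already return the data in a dataframe. Your code is trying to create a dataframe from a dataframe, which is fine if you don't specify columns. If you do specify them, pandas selects columns by those labels, and any label that doesn't exist in the source comes back as all NaN. Here 'A', 'B', 'L', 'M' and 'W' look like Excel column letters rather than the header names in your files, which would explain the NaN values.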
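To see the NaN behaviour in isolation, here is a minimal sketch. The small in-memory dataframe and its values are made up for illustration; they stand in for what read_csv would return.

```python
import pandas as pd

# Hypothetical stand-in for the dataframe returned by pd.read_csv
source = pd.DataFrame({"filename": ["a.jpg", "b.jpg"], "value_count": [3, 5]})

# 'A' and 'B' are not column labels in `source`, so pandas creates
# those columns and fills them with NaN
missing = pd.DataFrame(source, columns=["A", "B"])
print(missing)

# Asking for labels that do exist keeps the data intact
present = pd.DataFrame(source, columns=["filename", "value_count"])
print(present)
```

The first print shows two columns of NaN, the second shows the original data.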

So if you want to load the dataframes:

df_values = pd.read_csv(r'C:\file_path\ValueCounts.csv')
df_coords = pd.read_excel(r'C:\file_path\GeoData.xlsx')

Will do the trick. And if you just want specific columns:

df_values = pd.read_csv(r'C:\file_path\ValueCounts.csv', usecols=['A','B'])
df_coords = pd.read_excel(r'C:\file_path\GeoData.xlsx', usecols=['L','M','W'])

Make sure that those column names actually exist in your files (one is a CSV, the other an Excel workbook).
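If you only know the column positions and not the header names, usecols also accepts integer positions. A small sketch, using an in-memory CSV with made-up headers and values so it runs without your files:

```python
import io
import pandas as pd

# Made-up CSV content standing in for ValueCounts.csv
csv_text = "filename,value_count,notes\na.jpg,3,x\nb.jpg,5,y\n"

# Select columns by header name
by_name = pd.read_csv(io.StringIO(csv_text), usecols=["filename", "value_count"])

# Select the same columns by zero-based position
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[0, 1])

print(by_name.equals(by_position))

# For the Excel file, read_excel additionally accepts a string of
# Excel column letters, e.g. (not run here, path is the asker's):
# df_coords = pd.read_excel(r'C:\file_path\GeoData.xlsx', usecols="L,M,W")
```

Note that a list of strings such as ['L','M','W'] is treated as header names by read_excel, while the single string "L,M,W" is treated as Excel column letters.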

If you want to rename columns (make sure you're doing all columns here):

df_values.columns = ['Filename', 'Date']
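If you only want to rename some of the columns, DataFrame.rename with a mapping may be more convenient than assigning the full list. A minimal sketch with made-up data; the new name 'filename' is just an example:

```python
import pandas as pd

# Made-up dataframe with placeholder column labels
df_values = pd.DataFrame({"A": ["a.jpg", "b.jpg"], "B": [3, 5]})

# Only the columns listed in the mapping are renamed; 'B' is left alone
renamed = df_values.rename(columns={"A": "filename"})
print(list(renamed.columns))
```

rename returns a new dataframe, so assign the result if you want to keep it.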

For adding lat/long to df_values you could try:

df = pd.merge(df_values, df_coords[['filename', 'LAT', 'LONG']], on='filename', how='inner')

Which assumes that there are columns 'filename' in both the values and coords dataframes, and that the coords dataframes has columns 'LAT' and 'LONG' in it.
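Here is a small sketch of what the merge does when the two files don't contain the same filenames, or list them in a different order. The data is made up; the column names follow the assumptions above.

```python
import pandas as pd

# Made-up data: 'b.jpg' has no coordinates, 'd.jpg' has no value count,
# and the rows are in a different order in each dataframe
df_values = pd.DataFrame({
    "filename": ["a.jpg", "b.jpg", "c.jpg"],
    "value_count": [3, 5, 2],
})
df_coords = pd.DataFrame({
    "filename": ["c.jpg", "a.jpg", "d.jpg"],
    "LAT": [10.0, 20.0, 30.0],
    "LONG": [1.0, 2.0, 3.0],
})

# Inner join: rows are matched on the value of 'filename', not on position,
# and only filenames present in both dataframes are kept
inner = pd.merge(df_values, df_coords[["filename", "LAT", "LONG"]],
                 on="filename", how="inner")
print(inner)

# Left join: keeps every row of df_values; unmatched rows get NaN coordinates
left = pd.merge(df_values, df_coords, on="filename", how="left")
print(left)
```

Because rows are paired by matching filename, nothing gets offset when a filename is missing from one side. Use how='left' instead of how='inner' if you want to keep the rows of df_values that have no coordinates.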

Lastly, check out a tutorial on pandas (https://www.tutorialspoint.com/python_pandas/index.htm). Becoming more familiar with it will help you wrangle data better.

edited Jun 10, 2020 at 14:28

answered Jun 10, 2020 at 12:37


4 Comments

CephaloRhod Over a year ago

Hi @eNc. Thanks for the quick response! This is great in theory, however doesn't 'pd.merge' essentially just paste the selected columns alongside each other? If so, this is not appropriate, as not all the filenames in one file are included in the other, so as soon as one filename is dropped the entire dataset will be offset. Hence the need for an if statement to ensure that the lat/long data is only appended where the filenames are equal. Or does merge do this anyway?

eNc Over a year ago

@CephaloRhod I've updated the solution to use the intersection of filenames from both dataframes. Take a look at pandas merge (pandas.pydata.org/pandas-docs/stable/reference/api/…). What the last line of code there does is take the lat and long from coords and merge them with df_values where the filename column has the same value in both dataframes. It returns a new dataframe, df.

CephaloRhod Over a year ago

Fantastic, this is now working as desired. Thank you very much, you've been a massive help. I'm just an ecologist drowning in a programmer's world, but I'm desperately trying to learn! I'll make sure I check out some of those resources you linked me to more thoroughly when I have some more time. Thanks again mate @eNc :)

eNc Over a year ago

@CephaloRhod Glad it's working for you. Good luck with your research, and please accept my solution (check mark next to the arrows) if it's answered your question.
