First things first, here's the code:
import pandas as pd
headers = ["Category", "Brand", "Product_Name", "Shipping", "Price"]
xl = pd.ExcelFile("C:\\Users\\*myusername*\\Desktop\\products.xlsx")
df = xl.parse("products")
print(df)
df = df.sort_values(by=headers[4], axis='columns', na_position='last')
writer = pd.ExcelWriter('C:\\Users\\*myusername*\\Desktop\\output.xlsx')
df.to_excel(writer, sheet_name="Sheet1", columns=headers, index=False)
writer.save()
print("Done")
I'm trying to sort some data I scraped from Newegg as a practice project. I plan to expand this code to do a variety of things with the data, but I thought I'd start easy and just sort everything by the Price column.
When I run the above code, it throws the following error:
File "<input>", line 9, in <module>
File "C:\Users\mmiller3\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\frame.py", line 4421, in sort_values
stacklevel=stacklevel)
File "C:\Users\mmiller3\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\generic.py", line 1382, in _get_label_or_level_values
raise KeyError(key)
KeyError: 'Price'
When I print(df) immediately after creating it from the parsed sheet, it correctly displays the 5 headers and all the data they hold. The header 'Price' definitely exists.
print(df) output below:
Category Brand ... Shipping Costs Price
0 Desktop Memory G.SKILL ... Free Shipping 199.99
1 Desktop Memory G.SKILL ... Free Shipping 143.99
That's only a small snippet of the output; it goes on for 147 rows.
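Since print(df) truncates the middle columns, one way to double-check the header is to list the column labels directly. This is only a sketch with made-up stand-in data (the frame below is an assumption, not the real spreadsheet); repr() is used so that stray leading or trailing whitespace in a label would be visible.

```python
import pandas as pd

# Stand-in frame with made-up rows; the real data comes from products.xlsx.
df = pd.DataFrame({
    "Category": ["Desktop Memory", "Desktop Memory"],
    "Brand": ["G.SKILL", "G.SKILL"],
    "Price": [199.99, 143.99],
})

# repr() exposes hidden whitespace such as 'Price ' or ' Price'.
labels = [repr(c) for c in df.columns]
print(labels)

# Membership test against the column labels themselves.
has_price = "Price" in df.columns
print(has_price)
```

If the label shows up here exactly as 'Price', the header itself is probably not the problem.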
I've tried a number of things, including replacing headers[4] with a more straightforward "Price" and indicating "E" for the column rather than using the header.
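For anyone who wants to reproduce this without the spreadsheet, here is a minimal sketch using made-up data. The frame and the try_sort helper are hypothetical, written only for this illustration. It runs the same sort_values call twice, once with axis='columns' and once with the default axis, so the two results can be compared. According to the pandas documentation, with axis=1 or 'columns' the `by` argument refers to index (row) labels rather than column labels, which may be relevant here.

```python
import pandas as pd

# Made-up stand-in data; not the real scraped sheet.
df = pd.DataFrame({
    "Category": ["Desktop Memory", "Desktop Memory", "Desktop Memory"],
    "Brand": ["G.SKILL", "G.SKILL", "G.SKILL"],
    "Price": [199.99, 143.99, 89.99],
})

def try_sort(frame, **kwargs):
    """Hypothetical helper: return the sorted frame, or the KeyError if the sort fails."""
    try:
        return frame.sort_values(by="Price", na_position="last", **kwargs)
    except KeyError as exc:
        return exc

# Same call as in the question: 'Price' is looked up among the row labels.
with_axis = try_sort(df, axis="columns")

# Default axis (0): 'Price' is looked up among the column labels.
without_axis = try_sort(df)

print(type(with_axis).__name__)
print(without_axis)
```

The only difference between the two calls is the axis argument, so comparing their results should show whether that argument is what triggers the KeyError.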
At this point I'm stumped. The only other reference I've found to this specific issue came down to a simple syntax error, and I'm not making that same error.
Any help you guys can offer me would be appreciated.