I have an Excel sheet where column A holds a Date/Time and column N extracts just the year from that date, e.g. "=YEAR(A2)". I am trying to use Python (openpyxl, pandas, whatever works) to read column N and then fill column O with the unique years from column N. Right now my issue is that when I read the sheet with pandas, I get NaN for every row except the header.
This is my Python code:
import pandas as pd
files = 'A_data.xlsx'
sheetName = "Sheet1"
# generate path plus files for workbook.
print(files)
df = pd.read_excel(files,usecols='N')
print(df)
And this is what I get back from printing the df:
0 Year
1 NaN
2 NaN
3 NaN
4 NaN
...
...
6285 NaN
6286 NaN
6287 NaN
6288 NaN
6289 NaN
[6290 rows x 1 columns]
I tried replacing the formulas with their actual values, and interestingly that solved the problem, but that isn't how I want to handle it if I can avoid it. Any help would be greatly appreciated.
What I have tried so far: I have posted my code and a sample spreadsheet. Replacing the formulas with plain numbers fixed it, oddly enough. I also tried telling pandas to ignore the header row, but that did not solve the problem. As an alternative to reading the column in Python, I tried the Excel formula "UNIQUE", but when I opened the sheet afterwards Excel complained of issues; those went away when I commented out that one line.
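For reference, here is a minimal sketch of the end result I am after, written so that it does not depend on the formula column at all. As far as I understand, pandas does not evaluate formulas; it only reads the result cached in the file, so a workbook saved without computed results would show NaN. The sketch assumes column A contains real date/time cells, row 1 is a header, and the sheet is called "Sheet1". The function names unique_years and fill_unique_years and the "Unique Years" header are made up for illustration.

```python
from datetime import date, datetime


def unique_years(values):
    """Return the sorted distinct years found in an iterable of cell values.

    Anything that is not a date/datetime (None, header text, numbers) is skipped.
    """
    years = {v.year for v in values if isinstance(v, (date, datetime))}
    return sorted(years)


def fill_unique_years(path, sheet_name="Sheet1", src_col=1, dst_col=15):
    """Read dates from column A (1) and write the distinct years into column O (15).

    Hypothetical helper: the column numbers and header text are assumptions.
    """
    from openpyxl import load_workbook  # imported lazily; requires openpyxl

    wb = load_workbook(path)
    ws = wb[sheet_name]

    # Row 1 is assumed to be the header, so the data starts at row 2.
    dates = (
        row[0]
        for row in ws.iter_rows(
            min_row=2, min_col=src_col, max_col=src_col, values_only=True
        )
    )

    ws.cell(row=1, column=dst_col, value="Unique Years")
    for row_number, year in enumerate(unique_years(dates), start=2):
        ws.cell(row=row_number, column=dst_col, value=year)

    # Note: openpyxl keeps formulas on save but not their cached results,
    # so column N stays empty for pandas until Excel recalculates and saves.
    wb.save(path)
```

I would call it as fill_unique_years('A_data.xlsx'). Working from column A directly means the years come from the stored dates and not from whatever "=YEAR(A2)" last evaluated to.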