I am trying to modify the .csv files in a folder. The files contain flight information for the years 2011-2016.

However, the year does not appear anywhere in the values.

I would like to solve this by taking the year from each file's name. After reading a file into a pandas DataFrame, I add a new 'year' column. I then export the modified data to a new .csv file whose name is just the year.

However, I am getting this error:

ValueError: Length of values does not match length of index

Code below for your reference.

import pandas as pd
import glob
import re
import os

path = r'data_caap/'                   
all_files = glob.glob(os.path.join(path, "*.csv"))


for f in all_files:
    df = pd.read_csv(f)
    year= re.findall(r'\d{4}', f)

    #Error here
    df['year']=year
    #Error here

    df.to_csv(year)
  • can you try print(re.findall(r'\d{4}', f))? Commented Jun 2, 2018 at 5:44
  • ['2001'] ['2002'] ['2003'] ['2004'] ['2005'] ['2006'] ['2007'] ['2008'] ['2009'] ['2010'] ['2011'] ['2012'] ['2013'] ['2014'] ['2015'] ['2016'] Commented Jun 2, 2018 at 5:45
  • Must be df['year']=year[0]. findall returns a list. Commented Jun 2, 2018 at 5:45

1 Answer

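I found the cause of the error. re.findall returns a list, so year was a one-element list such as ['2011'] rather than a single value, and pandas refuses to assign a list whose length does not match the number of rows. As DyZ put it in the comments:

Must be df['year']=year[0]. findall returns a list. – DyZ

Thanks a lot, @DyZ.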
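
For anyone landing here later, this is a minimal sketch of how the corrected loop might look. It is not the exact code I ended up with. The function name add_year_column, the out_dir argument, and the demo file flights_2011.csv with its flights column are all made up for illustration. It also assumes each filename contains exactly one four-digit year. Note that df.to_csv expects a path string rather than a list, so the output name is built from the year string as well.

```python
import glob
import os
import re
import tempfile

import pandas as pd


def add_year_column(src_dir, out_dir):
    """Add a 'year' column taken from each CSV's filename; return the paths written.

    Hypothetical helper for illustration, not part of pandas.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for f in sorted(glob.glob(os.path.join(src_dir, "*.csv"))):
        # Search only the filename so digits in the directory path cannot match.
        match = re.search(r"\d{4}", os.path.basename(f))
        if match is None:
            continue  # no four-digit year in this filename; skip it
        year = match.group(0)  # a single string such as '2011', not a list
        df = pd.read_csv(f)
        df["year"] = year  # a scalar is broadcast to every row
        out_path = os.path.join(out_dir, year + ".csv")
        df.to_csv(out_path, index=False)
        written.append(out_path)
    return written


# Small demo with a made-up file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "data_caap")
    os.makedirs(src)
    demo = pd.DataFrame({"flights": [10, 20, 30]})  # invented sample data
    demo.to_csv(os.path.join(src, "flights_2011.csv"), index=False)
    out = add_year_column(src, os.path.join(tmp, "out"))
    result = pd.read_csv(out[0])
```

Using re.search on os.path.basename(f) instead of re.findall on the full path is a design choice on my part: it returns one match object, and it avoids picking up digits from a parent directory name. Indexing the findall result with year[0], as suggested in the comments, works just as well when the path has no other digits.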
