I have many CSV files in subdirectories of one folder. They all contain tweets plus other metadata. I want to remove most of the metadata and keep only the tweets and their timestamps. I used glob to read the files, and the removal step seems to work. However, I am not sure how to save the output so that every file is saved under its original file name.

import pandas as pd
import glob
path = r'D:\tweets'
myfiles= glob.glob(r'D:\tweets\**\*.csv', recursive=True)
for f in myfiles:
    df = pd.read_csv(f)
df = df.drop(["name", "id","conversation_id","created_at","date"], axis=1)
df = df[df["language"].str.contains("bn|ca|ckbu|id||zh")==False]
df.to_csv("output_filename.csv", index=False, encoding='utf8')
  • Are there indentation problems in your question? If not, only the last file in D:\tweets is getting converted back into a CSV. Commented Jun 13, 2021 at 14:26
  • The only indentation I have is in the sixth line (df = pd.read_csv(f)) Commented Jun 13, 2021 at 14:32
  • Because you are processing each file in the list myfiles, the rest of your code needs to be inside the for loop. I don't think you want to overwrite the original files, so something like this will help: df.to_csv(os.path.splitext(f)[0]+"_transformed.csv") Commented Jun 13, 2021 at 14:46
  • @simpleApp This worked like magic!!! Thank you so so much!! Commented Jun 13, 2021 at 15:16
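
The suggestion in the comments can be sketched as a small planning step that maps each source CSV to a sibling file with a suffix. This is a minimal sketch rather than the commenter's exact code: the helper name `plan_outputs` and its `suffix` argument are hypothetical. It also skips files that already carry the suffix, on the assumption that the script may be run more than once and the recursive glob would otherwise pick up its own output.

```python
import glob
import os


def plan_outputs(root, suffix="_transformed"):
    """Map every CSV under root (recursively) to a sibling output path.

    plan_outputs and suffix are illustrative names, not part of the question.
    """
    pattern = os.path.join(root, "**", "*.csv")
    plan = {}
    for src in glob.glob(pattern, recursive=True):
        base, ext = os.path.splitext(src)
        if base.endswith(suffix):
            # Output of an earlier run; skip it so it is not processed again.
            continue
        plan[src] = base + suffix + ext
    return plan
```

Inside the loop, each `dst` from `plan_outputs(path).items()` would then replace the fixed "output_filename.csv" in the `to_csv` call.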

1 Answer

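Move the processing and the to_csv call inside the loop. Writing to f saves every file under its original name, but note that it overwrites the original file: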

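for f in myfiles:
    df = pd.read_csv(f)
    # Drop the metadata columns that are not needed.
    df = df.drop(["name", "id", "conversation_id", "created_at", "date"], axis=1)
    # Keep only rows whose language code does not match the pattern.
    # The original pattern contained "id||zh"; the empty alternative matches
    # every string, so it is assumed to be a typo and written as "id|zh" here.
    df = df[df["language"].str.contains("bn|ca|ckbu|id|zh") == False]
    # Writing back to f keeps the file name and replaces the original contents.
    df.to_csv(f, index=False, encoding='utf8')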
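
If overwriting the originals is too risky, one alternative is to keep each file's name and subfolder but write under a separate output root. The following is a minimal sketch under that assumption; the helper `mirrored_path` and an output folder such as D:\tweets_clean are hypothetical and do not appear in the original post.

```python
from pathlib import Path


def mirrored_path(src, in_root, out_root):
    """Return the path under out_root that mirrors src's position under in_root.

    Also creates the destination folder so a later to_csv call can write there.
    mirrored_path is an illustrative helper, not part of the original answer.
    """
    src, in_root, out_root = Path(src), Path(in_root), Path(out_root)
    dest = out_root / src.relative_to(in_root)  # same subfolders, same file name
    dest.parent.mkdir(parents=True, exist_ok=True)
    return dest
```

Inside the loop, `df.to_csv(mirrored_path(f, path, r"D:\tweets_clean"), index=False, encoding='utf8')` would then leave the source files untouched. The output root should sit outside the input folder; otherwise the recursive glob would pick up the cleaned files on the next run.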