
I have the following template DataFrame:

import pandas as pd

df = pd.DataFrame({
    'File_name_Column': ['File1', 'File2', 'File3', 'File1', 'File2', 'File3'],
    'Column3': ['xxr', 'xxv', 'xxw', 'xxt', 'xxe', 'xxz'],
    'Column4': ['wer', 'cad', 'sder', 'dse', 'sdf', 'csd'],
    'Column5': ['xxr', 'xxv', 'xxw', 'xxt', 'xxe', 'xxz'],
    'Column6': ['xxr', 'xxv', 'xxw', 'xxt', 'xxe', 'xxz'],
})

I want to write several .txt files, one per distinct value in the column "File_name_Column", each named after that value.

I want to use something like this, but it's not working:


df.to_csv(f'{df_File_name_Column}.txt', sep='|', index=False, header=False)

Desired Output:
File1.txt
'xxr'|'wer'|'xxr'|'xxr'
'xxt'|'dse'|'xxt'|'xxt'

File2.txt
'xxv'|'cad'|'xxv'|'xxv'
'xxe'|'sdf'|'xxe'|'xxe'

File3.txt
'xxw'|'sder'|'xxw'|'xxw'
'xxz'|'csd'|'xxz'|'xxz'

Note¹: The DataFrame has millions of rows.

Note²: I cannot use the open() function, because I'm migrating this pipeline to a platform that doesn't support it.

  • You cannot write Python code in the platform, open is a built-in function... Commented Oct 24, 2021 at 16:29
  • Pandas is supported; open() is not. This is a customer requirement that I don't understand but can't argue with. It comes from restrictions on directory-write functions. Commented Oct 24, 2021 at 16:32

1 Answer

One approach, groupby + to_csv:

for key, group in df.groupby("File_name_Column"):
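    # Drop the key column by name; the positional axis argument (drop(..., 1)) was removed in pandas 2.0
    group.drop(columns="File_name_Column").to_csv(f"{key}.txt", sep='|', index=False, header=False)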
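
For reference, here is a self-contained sketch of the same groupby + to_csv idea that can be run end to end. The helper name write_groups, its quote_values flag, and the temporary output directory are hypothetical and only there for illustration. The desired output in the question shows single-quoted values, so the sketch assumes that quoting is wanted and uses the quoting and quotechar parameters of to_csv to produce it; set quote_values=False if the quotes were just notation.

```python
import csv
import tempfile
from pathlib import Path

import pandas as pd


def write_groups(df, key_column, out_dir, quote_values=False):
    """Write one pipe-delimited .txt file per distinct value of key_column.

    Hypothetical helper: the name and the quote_values flag are illustrative.
    """
    out_dir = Path(out_dir)
    written = []
    # sort=False keeps groups in order of first appearance and avoids sorting keys
    for key, group in df.groupby(key_column, sort=False):
        path = out_dir / f"{key}.txt"
        # Drop the key column so only the data columns are written
        group.drop(columns=key_column).to_csv(
            path,
            sep="|",
            index=False,
            header=False,
            quoting=csv.QUOTE_ALL if quote_values else csv.QUOTE_MINIMAL,
            quotechar="'",
        )
        written.append(path)
    return written


df = pd.DataFrame({
    "File_name_Column": ["File1", "File2", "File3", "File1", "File2", "File3"],
    "Column3": ["xxr", "xxv", "xxw", "xxt", "xxe", "xxz"],
    "Column4": ["wer", "cad", "sder", "dse", "sdf", "csd"],
    "Column5": ["xxr", "xxv", "xxw", "xxt", "xxe", "xxz"],
    "Column6": ["xxr", "xxv", "xxw", "xxt", "xxe", "xxz"],
})

# Assumption: a temporary directory stands in for the real output location
out_dir = tempfile.mkdtemp()
paths = write_groups(df, "File_name_Column", out_dir, quote_values=True)

# Read the first file back only to check the result; not needed in the pipeline
file1_lines = paths[0].read_text().splitlines()
print(file1_lines)
```

Note that to_csv opens the target file itself when given a path, so no explicit open() call appears in the pipeline code. Whether that satisfies the platform's restriction is something only a deployment there can confirm.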

3 Comments

This seems to work like a charm. Thanks. Can you think of another approach?
@TanaiGoncalves Not really
OK, no problem. I think it will work, but I'll only know once it's deployed on the platform. Thanks!
