Merging 2 text files into one text file with 2 columns in Python

Question

I have 2 text files, each with the same number of lines. I want to merge them into a single CSV file with 2 fields, plus an additional field for the line number. Is this possible in Python?

File1:
This is a source first line 
This is a source second line
This is a source third line 

File2:
This is a transformed line 1
This is a transformed line 2
This is a transformed line 3 

Outputfile:
1,This is a source first line    ,This is a transformed line 1
2,This is a source second line   ,This is a transformed line 2
3,This is a source third  line   ,This is a transformed line 3

zip_longest is your friend

– dawg, Commented Dec 18, 2018 at 19:36
stackoverflow.com/questions/53654574/…

– Aaron_ab, Commented Dec 18, 2018 at 19:41

dawg · Accepted Answer · 2018-12-18 21:42:02Z

1

Given:

$ cat file1
This is a source first line 
This is a source second line
This is a source third line 
$ cat file2
This is a transformed line 1
This is a transformed line 2
This is a transformed line 3

You can do:

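from itertools import zip_longest  # named izip_longest in Python 2

fn1, fn2 = 'file1', 'file2'  # paths to the two input files shown above

with open(fn1) as f1, open(fn2) as f2:
    # Pair the files line by line and number the pairs starting at 1
    for i, (l1, l2) in enumerate(zip_longest(f1, f2), 1):
        print('{}: {}\t{}'.format(i, l1.strip(), l2.strip()))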

Prints:

1: This is a source first line  This is a transformed line 1
2: This is a source second line This is a transformed line 2
3: This is a source third line  This is a transformed line 3

Now suppose you have:

$ cat file1
This is a source first line 
This is a source second line
This is a source third line 
$ cat file2
This is a transformed line 1
This is a transformed line 2
This is a transformed line 3 
This is line 4

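To handle that, format the output as true columns (using {:40} to pad each value to 40 characters) and pass a fillvalue to zip_longest (izip_longest in Python 2), so that the shorter file contributes empty strings instead of None:

from itertools import zip_longest  # named izip_longest in Python 2

with open(fn1) as f1, open(fn2) as f2:
    for i, (l1, l2) in enumerate(zip_longest(f1, f2, fillvalue=""), 1):
        print('{}: {:40}{:40}'.format(i, l1.strip(), l2.strip()))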

Prints:

1: This is a source first line             This is a transformed line 1            
2: This is a source second line            This is a transformed line 2            
3: This is a source third line             This is a transformed line 3            
4:                                         This is line 4
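
The effect of fillvalue is easiest to see on small in-memory lists. The sketch below is illustrative only: the lists stand in for the lines of the two files, and the names source and transformed are made up for this example.

```python
from itertools import zip_longest

# Hypothetical stand-ins for the lines of file1 and file2
source = ["first", "second", "third"]
transformed = ["line 1", "line 2", "line 3", "line 4"]

# Without fillvalue, the shorter input is padded with None,
# so calling .strip() on that entry would raise AttributeError.
default_pairs = list(zip_longest(source, transformed))

# With fillvalue="", every entry is a string and can be formatted safely.
padded_pairs = list(zip_longest(source, transformed, fillvalue=""))

# Number the pairs from 1 and pad each value to a 10-character column
rows = ['{}: {:10}{:10}'.format(i, a, b).rstrip()
        for i, (a, b) in enumerate(padded_pairs, 1)]
print('\n'.join(rows))
```

The last pair is (None, "line 4") in default_pairs and ("", "line 4") in padded_pairs, which is why only the second form survives string formatting with a width.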

edited Dec 18, 2018 at 21:42

answered Dec 18, 2018 at 19:46

1 Comment

Ramachandra Sedimbi Over a year ago

Thanks @dawg. I used zip_longest and was able to get the output file I needed. Thanks, everyone, for all the valuable input.

KuboMD · Answer · 2018-12-18 19:52:30Z

0

We can do this without importing anything. Suppose we have two files:

File1:
This is a source first line 
This is a source second line
This is a source third line

File2:
This is a transformed line 1
This is a transformed line 2
This is a transformed line 3

Then...

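with open("file1.txt") as f, open("file2.txt") as f2, open("outFile.txt", "w+") as o:
    lines = len(f.readlines())  # count the lines in the first file
    f.seek(0)                   # rewind the first file to its start
    for i in range(lines):
        # line number, first file's line, tabs and commas, second file's line
        o.write("{},{} \t\t,{}\n".format(i+1, f.readline().strip(), f2.readline().strip()))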

To explain: we open the two input files and the output file. We count the lines in the first file, then move its read position back to the start. Then, for each line number, we write one row to the output file containing the index, the first file's line, the tabs and commas, and the second file's line. Our output:

1,This is a source first line           ,This is a transformed line 1
2,This is a source second line          ,This is a transformed line 2
3,This is a source third line           ,This is a transformed line 3

answered Dec 18, 2018 at 19:52

1 Comment

dawg Over a year ago

1) No need to read the file then rewind to get the line count -- just use enumerate 2) This will throw an error if one file is a different length than the other.

Kevin S · Answer · 2018-12-18 20:11:13Z

0

with open(r'C:/file1.txt') as f1, open(r'C:/file2.txt') as f2, open(r'C:/destination.txt', 'w') as o:
    for index, (line1, line2) in enumerate(zip(f1, f2), 1):
        o.write('{}:,{} ,{}\n'.format(index, line1.rstrip(), line2.rstrip()))

The nice thing about this solution is that it doesn't read the entire files into memory; it iterates over the input files line by line and writes each pair to the output file as it goes. I assumed, based on the original question, that both files have the same number of lines. If they don't, use itertools.zip_longest instead of zip here, since zip stops at the end of the shorter file.
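
Since the question asks for a CSV file, one possible variation is to let the csv module write the rows, combined with zip_longest so that uneven files are not truncated. This is a sketch rather than part of the original answer: the function name merge_to_csv and the file names in the usage note are made up for illustration.

```python
import csv
from itertools import zip_longest


def merge_to_csv(path1, path2, out_path):
    """Write rows of (line number, line from path1, line from path2)."""
    # newline='' is what the csv module documentation recommends for output files
    with open(path1) as f1, open(path2) as f2, \
            open(out_path, 'w', newline='') as out:
        writer = csv.writer(out)
        # fillvalue='' pads the shorter file with empty fields
        for index, (line1, line2) in enumerate(
                zip_longest(f1, f2, fillvalue=''), 1):
            # csv.writer quotes any field that contains a comma or a quote
            writer.writerow([index, line1.strip(), line2.strip()])
```

It could be called as merge_to_csv('file1.txt', 'file2.txt', 'merged.csv'). Like the loop above, it still processes one pair of lines at a time, so neither input is loaded into memory in full.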

edited Dec 18, 2018 at 20:11

answered Dec 18, 2018 at 20:03

4 Comments

dawg Over a year ago

That will truncate the longer of the two files.

Kevin S Over a year ago

The original post says "I have 2 text files each having same number of lines"

dawg Over a year ago

How silly to design a solution based on that assumption when there is an easy and Pythonic solution to avoid it.

Kevin S Over a year ago

I've updated with a comment to indicate that zip_longest is an option. I do agree it's more flexible.
