2

I have two files in this form:

File1: id:0.0260509118455
File2: id:X:Y

I'd like to get a third file in which every line of file1 is joined with the lines of file2 that contain the same id, i.e.:

File3: id:0.0260509118455:X:Y

(file1 has 100 lines, file2 has 666 lines.) There are no unpairable lines.

3 Answers

2

To join files containing database tables, use the join command after sorting the tables into key order:

sort -b -t : file1 > sorted-file1
sort -b -t : file2 > sorted-file2
join -t : sorted-file1 sorted-file2
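As a rough illustration of the steps above, here is a self-contained run on made-up data. The ids (a1, b2, c3), the X/Y values and the temporary directory are assumptions for the demo, not taken from the question. The sort is restricted to the first field with -k 1,1 and run under LC_ALL=C, which is my own addition so that sort and join agree on key order.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'b2:0.5' 'a1:0.0260509118455' > file1
printf '%s\n' 'a1:X1:Y1' 'c3:X3:Y3' 'b2:X2:Y2' > file2

# Sort both files on the first ':'-separated field only.
LC_ALL=C sort -t : -k 1,1 file1 > sorted-file1
LC_ALL=C sort -t : -k 1,1 file2 > sorted-file2

# Join on field 1 (join's default key), using ':' as the separator.
LC_ALL=C join -t : sorted-file1 sorted-file2 > file3
cat file3
# a1:0.0260509118455:X1:Y1
# b2:0.5:X2:Y2
```

The c3 line has no partner in file1, so join drops it by default. Sorting on the whole line instead of the key can order ids differently from what join expects when one id is a prefix of another (for example a and a-b), which is why the sketch sorts on the key alone.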


4
  • What if I want the lines sorted not by id but by the numerical field of File1? Commented Feb 27, 2017 at 16:31
  • Then you use the sort command on the joined output. Commented Feb 27, 2017 at 16:36
  • Why not join -t: <(sort -b -t: file1) <(sort -b -t: file2) to eliminate the need of extra files? Commented Feb 28, 2017 at 20:31
  • Because the question is not specific to merely those particular shells where that would actually work, and because for all we know the need for extra files is already eliminated and the files are already in sort order. In the more straightforward form given, it's simpler for readers to work out how to take out the redundant sort steps, when they are in fact redundant. Commented Feb 28, 2017 at 23:40
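To make the suggestion about sorting the joined output concrete, a minimal sketch follows. The file names joined and joined-by-value and the sample values are hypothetical. It assumes the numeric field is the second ':'-separated field and is written in plain decimal notation, which is what sort -n understands.

```shell
# Hypothetical joined output, in id order.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'a1:0.7:X1:Y1' 'b2:0.0260509118455:X2:Y2' 'c3:0.31:X3:Y3' > joined

# Re-sort numerically on field 2 only; LC_ALL=C keeps '.' as the decimal point.
LC_ALL=C sort -t : -k 2,2n joined > joined-by-value
cat joined-by-value
# b2:0.0260509118455:X2:Y2
# c3:0.31:X3:Y3
# a1:0.7:X1:Y1
```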
0

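You should be able to do this with the paste command. It merges files line by line rather than by key, so it only works if both files list the same ids in the same order, one line per id.

First remove the id: field from File2, keeping the remaining fields:

awk -F: '{ print $2, $3 }' OFS=: File2 > File4

Then paste the two files together, using : as the delimiter instead of the default tab:

paste -d: File1 File4 > File3

That should do the job.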
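A small sketch of the paste approach on made-up data. It relies on a strong assumption: both sample files list the same ids in the same order, one line each, because paste pairs lines purely by position. The sample values are hypothetical, and cut stands in for the awk step to drop the id field.

```shell
# Hypothetical, already-aligned sample files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'a1:0.1' 'b2:0.2' > File1
printf '%s\n' 'a1:X1:Y1' 'b2:X2:Y2' > File2

# Keep everything after the id field of File2.
cut -d : -f 2- File2 > File4

# Glue line N of File1 to line N of File4 with ':' between them.
paste -d : File1 File4 > File3
cat File3
# a1:0.1:X1:Y1
# b2:0.2:X2:Y2
```

If the files differ in length or order, as with 100 and 666 lines, paste will pair unrelated lines without complaint, so a key-based method is safer there.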

0

You can also do it with awk, checking the id, without the need to sort or to pre-process the files in any way:

awk -F: 'NR==FNR{a[$1]=$0;next}$1 in a {print a[$1],$2,$3}' OFS=: file1 file2 >file3

PS: For performance, the small file (file1, 100 lines) is loaded into memory first, and each line of the big file is then looked up against it.
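Here is the same one-liner spread out and run on made-up data, as a sketch of how the two passes behave. The ids and X/Y values are hypothetical, and a1 deliberately appears twice in file2 to show what happens with repeated ids.

```shell
# Hypothetical sample data; neither file is sorted.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'b2:0.5' 'a1:0.0260509118455' > file1
printf '%s\n' 'a1:X1:Y1' 'c3:X3:Y3' 'b2:X2:Y2' 'a1:X4:Y4' > file2

awk -F: '
    NR == FNR { a[$1] = $0; next }      # first file: store each line under its id
    $1 in a   { print a[$1], $2, $3 }   # second file: print only lines whose id was stored
' OFS=: file1 file2 > file3
cat file3
# a1:0.0260509118455:X1:Y1
# b2:0.5:X2:Y2
# a1:0.0260509118455:X4:Y4
```

The output follows the order of file2, an id that occurs several times in file2 yields one output line per occurrence, and ids missing from file1 (c3 here) are skipped.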
