How to add two columns from two different files with reference to matching values in another column?

Question

File 1:

1  0.3
2  0.1
3  0.4
4  0.8

File 2:

2  0.7
4  0.2
6  0.5
8  0.9

Examining field 1 in both File 1 and File 2, we see the strings 2 and 4 are in common. These are my reference rows. For these reference rows, I would like to add the values from field 2 in both files.

In other words,

search File 1 and File 2 for matching strings in $1. In this case, 2 and 4.
for $1 = 2, then $2 = 0.1 + 0.7 = 0.8
for $1 = 4, then $2 = 0.8 + 0.2 = 1.0

Desired output in File 3:

1  0.3
2  0.8
3  0.4
4  1.0

That is, File 3 is File 1, except that in every row where $1 in File 1 matches $1 in File 2, $2 is the sum of the $2 values from both files.

Summary

I would like a script that searches for matches in $1 between two files, then prints $2 (File 1) + $2 (File 2) wherever a $1 match is found. The output is File 3, which is File 1 with the summed values wherever matches occurred. Any assistance is much appreciated!

sort + join + ( cut + sed + bc ) or + awk

– KamilCuk, Commented Mar 5, 2019 at 12:07
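
The comment above only names the tools. A minimal sketch of what a sort + join + awk pipeline could look like follows; the temporary file names are hypothetical, and the sample values are the ones used elsewhere in this thread. Note that join needs sorted input, so the result comes out in sorted key order rather than in File 1's original order.

```shell
# Hypothetical scratch directory and file names, used only for this illustration.
tmpdir=$(mktemp -d)
printf '%s\n' '1 0.3' '2 0.1' '3 0.4' '4 0.8' | sort > "$tmpdir/f1.sorted"
printf '%s\n' '2 0.7' '4 0.2' '6 0.5' '8 0.9' | sort > "$tmpdir/f2.sorted"

# -a 1          also print lines of file 1 that have no partner in file 2
# -e 0          fill the missing file-2 value with 0 for those lines
# -o 0,1.2,2.2  output the join key, then column 2 of each file
joined=$(join -a 1 -e 0 -o 0,1.2,2.2 "$tmpdir/f1.sorted" "$tmpdir/f2.sorted" |
    awk '{ printf "%s %.1f\n", $1, $2 + $3 }')

printf '%s\n' "$joined"
rm -rf "$tmpdir"
```

Keys that exist only in File 2 (6 and 8 here) are dropped, because only -a 1 is given.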

RavinderSingh13 · Accepted Answer · 2019-03-05 15:30:06Z

3

Could you please try the following (if you are OK with awk)?

awk 'FNR==NR{a[$1]=$2;next} {$2=$1 in a?$2+a[$1]:$2} 1' Input_file2  Input_file1

If you want the sums formatted to one decimal place and the output aligned in columns, try the following.

awk 'FNR==NR{a[$1]=$2;next} $1 in a{$2=sprintf("%.01f",$2+a[$1])} 1' Input_file2  Input_file1 | column -t

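Or, as per Ed's comment, we need not check $1 in a, so I am removing it from the code. An unset a[$1] evaluates to 0 in the addition, so unmatched rows keep their value; every row is now formatted to one decimal place.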

awk 'FNR==NR{a[$1]=$2;next} {$2=sprintf("%.01f",$2+a[$1])} 1' Input_file2  Input_file1 | column -t
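
For readers new to the FNR==NR idiom used above, here is an expanded, commented sketch of the same two-pass approach. It assumes the sample data shown elsewhere in this thread; the scratch directory and the array name add are illustrative choices, not part of the original answer.

```shell
# Recreate the sample data in a scratch directory (hypothetical location).
tmpdir=$(mktemp -d)
printf '%s\n' '1 0.3' '2 0.1' '3 0.4' '4 0.8' > "$tmpdir/Input_file1"
printf '%s\n' '2 0.7' '4 0.2' '6 0.5' '8 0.9' > "$tmpdir/Input_file2"

result=$(awk '
    # FNR restarts at 1 for each file while NR keeps counting, so
    # FNR == NR holds only while the first file (Input_file2) is read.
    FNR == NR { add[$1] = $2; next }

    # Second file (Input_file1): add the stored value when the key matches.
    ($1 in add) { $2 = sprintf("%.1f", $2 + add[$1]) }

    # Print every Input_file1 line, modified or not.
    { print }
' "$tmpdir/Input_file2" "$tmpdir/Input_file1")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

The file holding the values to add is listed first so that its lookup table is complete before any line of Input_file1 is processed.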

edited Mar 5, 2019 at 15:30

answered Mar 5, 2019 at 12:08

RavinderSingh13


3 Comments

RavinderSingh13 Over a year ago

@Blaisem, glad that it helped you. Happy learning and sharing on this GREAT site, SO.

Blaisem Over a year ago

May I also ask whether this can be used for more than 2 files? Say, to combine the common rows from File 1, File 2, and File 3 and output File 4? Basically the same task as this one, but with one more input file to add to File 1. I can also submit a new question if that's easier.

RavinderSingh13 Over a year ago

@Blaisem, IMHO it would be better to post a new question; otherwise people will not understand what was changed, why it was changed, and what was posted originally (total confusion). A new question that shows your efforts should be good, I believe. Cheers.
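
As a rough sketch of how the same idea might stretch to more than two files (this is not from the answer itself): assume File 1 is always passed last, and every earlier file contributes to a running sum per key. The contents of file3 below are hypothetical, invented only for illustration.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' '1 0.3' '2 0.1' '3 0.4' '4 0.8' > "$tmpdir/file1"
printf '%s\n' '2 0.7' '4 0.2' '6 0.5' '8 0.9' > "$tmpdir/file2"
# Hypothetical third input, made up for this example.
printf '%s\n' '1 1.0' '4 0.5' > "$tmpdir/file3"

multi=$(awk '
    # Every file except the last argument feeds the running sums.
    FILENAME != ARGV[ARGC - 1] { add[$1] += $2; next }

    # Last file (File 1): add the accumulated sum when the key was seen.
    ($1 in add) { $2 = sprintf("%.1f", $2 + add[$1]) }

    { print }
' "$tmpdir/file2" "$tmpdir/file3" "$tmpdir/file1")

printf '%s\n' "$multi"
rm -rf "$tmpdir"
```

The test on ARGV[ARGC - 1] assumes the last command-line argument is a file name and not a var=value assignment.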

stack0114106 · Answer · 2019-03-05 19:19:52Z

0

Using pipelined awk commands:

$ awk ' $(NF+1)=FILENAME ' blaisem2.txt blaisem1.txt | 
        awk ' { a[$1]+=$2; $2=sprintf("%.01f",a[$1]); print } ' | 
             awk ' /blaisem1.txt/ && NF-- '
1 0.3
2 0.8
3 0.4
4 1.0

$

where the files are

$ cat blaisem1.txt
1  0.3
2  0.1
3  0.4
4  0.8

$ cat blaisem2.txt
2  0.7
4  0.2
6  0.5
8  0.9

$

It can be shortened further to two awk commands:

$ awk ' $(NF+1)=FILENAME ' blaisem2.txt blaisem1.txt | 
    awk ' { a[$1]+=$2; $2=sprintf("%.01f",a[$1]); } /blaisem1.txt/ { NF--; print } '
1 0.3
2 0.8
3 0.4
4 1.0

$
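
To make the pipeline above easier to follow, this sketch captures the output of each stage separately, using the same blaisem1.txt and blaisem2.txt contents. The scratch directory is hypothetical. The last stage here deliberately swaps the NF-- trick for an explicit print of the first two fields, since the effect of decrementing NF is not the same in every awk implementation.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' '1  0.3' '2  0.1' '3  0.4' '4  0.8' > "$tmpdir/blaisem1.txt"
printf '%s\n' '2  0.7' '4  0.2' '6  0.5' '8  0.9' > "$tmpdir/blaisem2.txt"

# Stage 1: tag every line with the name of the file it came from.
# The assignment is used as a pattern; a non-empty string is true, so each line prints.
stage1=$(cd "$tmpdir" && awk '$(NF+1) = FILENAME' blaisem2.txt blaisem1.txt)

# Stage 2: keep a running sum per key and overwrite column 2 with it.
stage2=$(printf '%s\n' "$stage1" |
    awk '{ a[$1] += $2; $2 = sprintf("%.1f", a[$1]); print }')

# Stage 3 (portable substitute for NF--): keep only File 1 rows, drop the tag.
final=$(printf '%s\n' "$stage2" |
    awk '$3 == "blaisem1.txt" { print $1, $2 }')

printf '%s\n' "$stage1" "---" "$stage2" "---" "$final"
rm -rf "$tmpdir"
```

Because blaisem2.txt is read first, its values are already in the running sum by the time the matching blaisem1.txt rows arrive, which is why only those rows carry the combined totals.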

edited Mar 5, 2019 at 19:19

answered Mar 5, 2019 at 19:12

stack0114106
