File and text processing using bash

Question

I have two text files, hash_only.txt and final_output.txt. hash_only.txt looks like this:

193548
401125
401275

final_output.txt looks like this:

193548      1199687744  5698758206701808640
193548      1216464960  5698758206761818112
193548      1216464960  5698758206778417152
193548      4236691520  5698758206778945280
401125      2138607488  5698762375908890880
401125       863932288  5698762375909423360
401125      3884158848  5698762375910044160
401125      2609483648  5698762375911032320

I am trying to write a loop that does the following:

for i in `cat hash_only.txt` ;
do
    for j in `cat final_output.txt` ;
            do
                    if [ $i -eq $j ]
                    then
                            echo $i $j      
                    fi
            done
 done;

For each value in hash_only.txt (193548, 401125, etc.), I want to extract columns 2 and 3 from final_output.txt wherever column 1 matches that value, and write them to a file named print_193548, print_401125, etc.

How do I do that? In the code above I need to put something inside the then part, but I can't figure out what, since I am not very proficient in bash.

Edit:

I have now modified my script to look like this:

for i in `cat hash_only.txt` ;
do
        for j in `cat final_output.txt` ;
                do
                        if [ $i -eq $j ]
                        then
                                gawk 'FNR==NR
                                        { hash[$1]  
                                          next 
                                        }
                                       $1 in hash  { 
                                        print $2,$3 >> "print_"$1; 
                                }' hash_only.txt final_output.txt
                        fi
                done
done;

It is not creating any files named print_[0-9]*. I can't understand why not.

So you want to create a bunch of files, right? One for each distinct value in the first file?
– Ray Toal, Commented Jun 14, 2012 at 5:21
The gawk command will do the whole job. The if...else and the for loops can be deleted.
– kev, Commented Jun 14, 2012 at 6:21
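One likely reason the edited script creates no files (a sketch, assuming standard awk parsing rules) is the line break between `FNR==NR` and its opening brace. A newline ends an awk rule, so `FNR==NR` alone becomes a pattern with the default print action, and the brace block after it runs for every line of both files. Because that block ends in `next`, the `$1 in hash` rule is never reached. The file names keys.txt and data.txt and the `broken_`/`fixed_` prefixes below are made up for the demo.

```shell
# Demo files (hypothetical names) in a scratch directory.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '%s\n' 193548 401125 > keys.txt
printf '%s\n' '193548 11 12' '401125 21 22' > data.txt

# Broken layout: the newline after FNR==NR ends that rule, so the brace
# block below it is unconditional and its `next` skips the last rule.
awk 'FNR==NR
     { hash[$1]
       next }
     $1 in hash { print $2, $3 > ("broken_" $1) }' keys.txt data.txt > broken_stdout.txt

# Fixed layout: the opening brace sits on the same line as its pattern.
awk 'FNR==NR { hash[$1]; next }
     $1 in hash { print $2, $3 > ("fixed_" $1) }' keys.txt data.txt
```

The broken variant only echoes the lines of the first file to standard output, while the fixed variant writes fixed_193548 and fixed_401125.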

Vijay · Accepted Answer · 2012-06-14 05:48:50Z

2

Try this:

nawk 'FNR==NR{a[$0];next}($1 in a){print $2,$3>$1}' hash_only.txt  final_output.txt

This creates one file per matching key, named after the first field (for example 193548), and stores columns 2 and 3 in it as you requested.

edited Jun 14, 2012 at 5:48

answered Jun 14, 2012 at 5:40


1 Comment

Dennis Williamson Over a year ago

You can omit the parentheses.

kev · Accepted Answer · 2012-06-16 05:17:33Z

1

awk '
FNR==NR {
    hash[$1]
    next
}
$1 in hash {
    # Parenthesize the file name: unparenthesized concatenation after > is ambiguous in some awks.
    printf("%s\t%s\n", $2, $3) > ("print_" $1)
}' hash_only.txt final_output.txt

What magic: my solution is almost identical to peter's.
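To see why `FNR==NR` selects only the first file, here is a small trace (a sketch using shortened sample data; trace.txt is a made-up output name). NR counts records across all input files, while FNR restarts at 1 for each file, so the two are equal only while the first file is being read.

```shell
# Scratch directory with a cut-down copy of the sample data.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '%s\n' 193548 401125 401275 > hash_only.txt
printf '%s\n' '193548 1199687744 5698758206701808640' \
              '401125 2138607488 5698762375908890880' > final_output.txt

# Print the file name, NR, FNR, and which branch FNR==NR would take.
awk '{ print FILENAME, NR, FNR, (FNR==NR ? "first-file" : "later-file") }' \
    hash_only.txt final_output.txt > trace.txt
cat trace.txt
```

One caveat: if the first file is empty, `FNR==NR` is also true while the second file is read, so the idiom assumes hash_only.txt has at least one line.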

edited Jun 16, 2012 at 5:17

answered Jun 14, 2012 at 5:36


4 Comments

liv2hak Over a year ago

Are you suggesting that I add this code after the if/then part of my code? I tried it and it doesn't seem to work. It just printed out a series of values.

kev Over a year ago

Copy and paste it into your terminal. It'll create two files (print_193548, print_401125) in the current directory.

Dennis Williamson Over a year ago

The >> should be > (it works a bit differently in AWK than the shell).
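The difference can be shown with a short experiment (a sketch; the output file names are made up). In the shell, every `>` reopens and truncates the file. In awk, `>` truncates only when the file is first opened during the run, and later prints to the same name keep appending. awk's `>>` never truncates, so rerunning a script that uses it accumulates output from earlier runs.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Shell: each `>` truncates, so only the last line survives.
for n in 1 2 3; do echo "line $n" > shell_out.txt; done

# awk: `>` truncates once per run, then appends within that run.
awk 'BEGIN { for (n = 1; n <= 3; n++) print "line " n > "awk_out.txt" }'

# awk: `>>` never truncates, so two runs leave two lines behind.
awk 'BEGIN { print "run" >> "awk_append.txt" }'
awk 'BEGIN { print "run" >> "awk_append.txt" }'

# Show the line counts of the three files.
grep -c '' shell_out.txt awk_out.txt awk_append.txt
```

With `>` in the awk answers above, rerunning the command replaces each print_ file instead of adding duplicate rows to it.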

liv2hak Over a year ago

Thanks very much, it worked. But I want to adjust the spacing between $2 and $3 so that I can give this as input to gnuplot. Any ideas? I have tried '\t', " ", etc., which don't seem to work.
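Two possible ways to control the spacing are sketched below. The question does not say what layout gnuplot needs, so both are assumptions, and the `tab_` and `fixed_` prefixes are made up. Note that a single-quoted '\t' written inside a single-quoted awk program ends the shell quoting early, which may be why that attempt failed. Passing the separator with `-v OFS` avoids the problem.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '%s\n' 193548 > hash_only.txt
printf '%s\n' '193548 1199687744 5698758206701808640' \
              '193548 863932288 5698762375909423360' > final_output.txt

# Option 1: set the output field separator to a tab; print $2, $3 uses it.
awk -v OFS='\t' 'FNR==NR { hash[$1]; next }
     $1 in hash { print $2, $3 > ("tab_" $1) }' hash_only.txt final_output.txt

# Option 2: right-aligned fixed-width columns. %s keeps the 19-digit values
# as text; %d could lose precision because awk numbers are floating point.
awk 'FNR==NR { hash[$1]; next }
     $1 in hash { printf "%12s %20s\n", $2, $3 > ("fixed_" $1) }' hash_only.txt final_output.txt
```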

mle · Accepted Answer · 2019-03-26 12:38:31Z

-1

while read -r FNAME; do
    grep "${FNAME}" final_output.txt | awk '{$1="";}1' > "print_${FNAME}"
done < hash_only.txt
find ./print_* -type f -size 0 -delete

$ ls ./print_??????
./print_193548  ./print_401125

$ cat ./print_193548
 1199687744 5698758206701808640
 1216464960 5698758206761818112
 1216464960 5698758206778417152
 4236691520 5698758206778945280

$ cat ./print_401125
 2138607488 5698762375908890880
 863932288 5698762375909423360
 3884158848 5698762375910044160
 2609483648 5698762375911032320
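Note that `grep` in the loop above matches the key anywhere on the line, not only in column 1. A shell-only variant that compares just the first column might look like the sketch below. It is closer to what the question's `then` branch was meant to do. The third data row is invented to show a key appearing in the wrong column, and calling `grep` once per line is slow on large files, where the awk answers are the better fit.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '%s\n' 193548 401125 401275 > hash_only.txt
printf '%s\n' '193548 1199687744 5698758206701808640' \
              '401125 863932288 5698762375909423360' \
              '999999 401125 193548' > final_output.txt

# read splits each line on whitespace into the three columns.
while read -r key col2 col3; do
    [ -n "$key" ] || continue                  # skip blank lines
    # -x matches whole lines, -F treats the key as a fixed string.
    if grep -qxF -- "$key" hash_only.txt; then
        # >> appends, so remove old print_ files before rerunning.
        printf '%s %s\n' "$col2" "$col3" >> "print_$key"
    fi
done < final_output.txt
```

Keys with no matching rows, such as 401275 here, produce no file, so no cleanup of empty files is needed.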

edited Mar 26, 2019 at 12:38


answered Mar 26, 2019 at 11:54

Valentin
