I have two text files, hash_only.txt and final_output.txt. hash_only.txt looks like this:

193548
401125
401275

final_output.txt looks like this:

193548      1199687744  5698758206701808640
193548      1216464960  5698758206761818112
193548      1216464960  5698758206778417152
193548      4236691520  5698758206778945280
401125      2138607488  5698762375908890880
401125       863932288  5698762375909423360
401125      3884158848  5698762375910044160
401125      2609483648  5698762375911032320

I am trying to write a loop that does the following:

for i in `cat hash_only.txt` ;
do
    for j in `cat final_output.txt` ;
            do
                    if [ $i -eq $j ]
                    then
                            echo $i $j      
                    fi
            done
 done;

For each value in hash_only.txt (193548, 401125, etc.), I want to extract columns 2 and 3 from final_output.txt wherever column 1 matches that value, and write them to print_193548, print_401125, etc.

How do I do that? I need to put some code inside the then part of the loop above, but I can't figure out what, since I am not very proficient in bash.

Edit:

I have now modified my script to look like this:

for i in `cat hash_only.txt` ;
do
        for j in `cat final_output.txt` ;
                do
                        if [ $i -eq $j ]
                        then
                                gawk 'FNR==NR
                                        { hash[$1]  
                                          next 
                                        }
                                       $1 in hash  { 
                                        print $2,$3 >> "print_"$1; 
                                }' hash_only.txt final_output.txt
                        fi
                done
done;

It is not creating any files named print_[0-9]*. I can't understand why not.

  • So you want to create a bunch of files, right? One for each distinct value in the first file? Commented Jun 14, 2012 at 5:21
  • Yes, that is exactly what I want. Commented Jun 14, 2012 at 5:29
  • The gawk command will do the whole job on its own. The if...else and for loops can be deleted. Commented Jun 14, 2012 at 6:21

3 Answers

2

Try this:

nawk 'FNR==NR{a[$0];next}($1 in a){print $2,$3>$1}' hash_only.txt  final_output.txt 

This creates one file per matching key, named after the first field (for example, 193548), and writes columns 2 and 3 of every matching line to it.
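The FNR==NR test in the command above is what separates the two input files: NR counts records across all files, while FNR restarts at 1 for each file, so the two are equal only while the first file is being read. The following is a minimal sketch of that behaviour; the file names first.txt and second.txt and their contents are made up for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so nothing in the current directory is touched.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Two tiny, hypothetical input files.
printf 'a\nb\n' > first.txt
printf 'c\nd\n' > second.txt

# Print the file name, the global record number (NR) and the per-file
# record number (FNR), plus whether the FNR==NR test would be true.
counters=$(awk '{ print FILENAME, NR, FNR, (FNR == NR ? "first file" : "later file") }' first.txt second.txt)
printf '%s\n' "$counters"
```

Only the records of first.txt are labelled "first file", which is why the rule guarded by FNR==NR can load the keys and then skip the rest of the program with next.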


1 Comment

You can omit the parentheses.
1
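awk '
# First file: remember every hash as an array key.
FNR==NR {
    hash[$1]
    next
}
# Second file: write columns 2 and 3 to a per-hash file.
$1 in hash {
    printf("%s\t%s\n", $2, $3) > ("print_" $1)
}' hash_only.txt final_output.txt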

Funnily enough, my solution is almost identical to peter's.
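One detail that differs from the script in the question: here FNR==NR and its opening brace sit on the same line. In awk a newline after a pattern ends the rule, so a pattern alone on a line gets the default action (print the record), and the brace block on the next line becomes a separate rule that runs for every record. The sketch below contrasts the two layouts; keys.txt, data.txt and their contents are hypothetical stand-ins for the real files.

```shell
#!/bin/sh
# Work in a scratch directory so nothing in the current directory is touched.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Hypothetical sample data: two keys and two matching rows.
printf '193548\n401125\n' > keys.txt
printf '193548 10 20\n401125 30 40\n' > data.txt

# Pattern and brace on separate lines: FNR==NR becomes a rule of its own
# (default action: print), and the block with "next" runs for every record,
# so the last rule is never reached.
broken=$(awk 'FNR==NR
{ seen[$1]; next }
$1 in seen { print $2, $3 }' keys.txt data.txt)

# Pattern and brace on the same line: the block only runs for the first file.
fixed=$(awk 'FNR==NR { seen[$1]; next }
$1 in seen { print $2, $3 }' keys.txt data.txt)

printf 'broken:\n%s\nfixed:\n%s\n' "$broken" "$fixed"
```

The broken variant merely echoes the keys and never writes any matching rows, which would be consistent with a script that prints a series of values but creates no print_ files.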

4 Comments

Are you suggesting that I add this code after the if/then part of my code? I tried it and it doesn't seem to work. It just printed out a series of values.
Copy and paste it into your terminal. It'll create two files (print_193548, print_401125) in the current directory.
The >> should be > (it works a bit differently in AWK than the shell).
Thanks very much, it worked. But I want to adjust the spacing between $2 and $3 so that I can give this as input to gnuplot. Any ideas? I have tried '\t', " ", etc., but they don't seem to work.
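On the spacing question in the last comment, two approaches may help; both are sketches under assumptions rather than a tested gnuplot recipe. The comma in print $2, $3 inserts the output field separator OFS, which can be set to a tab with -v OFS='\t', and printf with field widths gives fixed-width columns. Note that awk string literals take double quotes; a single-quoted '\t' inside a script that is itself wrapped in single quotes ends the shell quoting early, which may be why that attempt failed. The sample line below is taken from final_output.txt in the question, and the widths 12 and 20 are arbitrary choices.

```shell
#!/bin/sh
# Work in a scratch directory so nothing in the current directory is touched.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# One key and one matching row, copied from the question's sample data.
printf '193548\n' > hash_only.txt
printf '193548      1199687744  5698758206701808640\n' > final_output.txt

# Variant 1: tab-separated output, controlled through OFS.
awk -v OFS='\t' 'FNR==NR { hash[$1]; next }
$1 in hash { print $2, $3 > ("print_" $1) }' hash_only.txt final_output.txt

# Variant 2: right-aligned, fixed-width columns via printf (widths are arbitrary).
aligned=$(awk '{ printf "%12s %20s\n", $2, $3 }' final_output.txt)

cat print_193548
printf '%s\n' "$aligned"
```

As far as I know, gnuplot splits data columns on whitespace by default, so either form should be readable with something like plot "print_193548" using 1:2.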
-1
while read -r FNAME; do grep "^${FNAME}[[:space:]]" final_output.txt | awk '{$1="";}1' > "print_${FNAME}"; done < hash_only.txt; find ./print_* -type f -size 0 -delete
$ ls ./print_??????
./print_193548  ./print_401125
$ cat ./print_193548
 1199687744 5698758206701808640
 1216464960 5698758206761818112
 1216464960 5698758206778417152
 4236691520 5698758206778945280
$ cat ./print_401125
 2138607488 5698762375908890880
 863932288 5698762375909423360
 3884158848 5698762375910044160
 2609483648 5698762375911032320
