
I have file 1:

A1  1  NA
A1  2  NA
A1  3  NA
A1  4  A
A1  5  G
A1  6  T
A1  7  NA
A1  8  NA
A1  9  NA
A2  1  NA
A2  2  NA
A2  3  T
A2  4  NA

And file 2:

A1  4  A
A1  5  B
A1  6  T
A2  3  T

I want to replace the rows of file 1 whose first two columns match a row in file 2 (here A1 4, A1 5, A1 6 and A2 3) with the corresponding rows from file 2.

Expected output, written to a new file, file3:

A1  1  NA
A1  2  NA
A1  3  NA
A1  4  A
A1  5  B
A1  6  T
A1  7  NA
A1  8  NA
A1  9  NA
A2  1  NA
A2  2  NA
A2  3  T
A2  4  NA

I want to do this on Linux. I searched online but could not find a good answer.

  • Did you try awk? Commented Mar 26, 2018 at 7:58
  • I tried this: diff file2 file1 Commented Mar 26, 2018 at 8:00
  • Are the line numbers always contiguous in both files, and are the values to be replaced always all the values in file 2? Commented Mar 26, 2018 at 8:02
  • Yes, the line numbers are always contiguous in both files, and the values to be replaced are always all the values in file 2. Commented Mar 26, 2018 at 8:03
  • The question is unclear, please add the expected output that would be produced with the above inputs, and any attempts you have made Commented Mar 26, 2018 at 11:39

3 Answers


Another alternative, with awk:

$ awk 'NR==FNR{a[$1,$2]=$0;next} ($1,$2) in a {$0=a[$1,$2]}1' file2 file1
A1  1  NA
A1  2  NA
A1  3  NA
A1  4  A
A1  5  B
A1  6  T
A1  7  NA
A1  8  NA
A1  9  NA
A2  1  NA
A2  2  NA
A2  3  T
A2  4  NA
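
For readers unfamiliar with the NR==FNR idiom, here is the same command spread over several lines with comments. This is only a sketch: the scratch directory, the shortened sample data and the array name repl are assumptions made for illustration, and the result is written to file3 as the question asks.

```shell
# Work in a scratch directory so that real files are not touched.
demo=$(mktemp -d)

cat > "$demo/file1" <<'EOF'
A1  4  A
A1  5  G
A1  6  T
A1  7  NA
A2  3  T
EOF

cat > "$demo/file2" <<'EOF'
A1  5  B
A2  3  T
EOF

awk '
  # First file on the command line (file2): NR equals FNR only here.
  # Remember each whole line, keyed by columns 1 and 2.
  NR == FNR { repl[$1, $2] = $0; next }

  # Second file (file1): swap in the stored line when the key matches.
  ($1, $2) in repl { $0 = repl[$1, $2] }

  # Print every file1 line, replaced or not, in its original order.
  { print }
' "$demo/file2" "$demo/file1" > "$demo/file3"

cat "$demo/file3"
```

Because only file2 is held in memory and file1 is streamed once, this should cope with a few hundred thousand rows. One caveat: if file2 is empty, NR==FNR is also true while file1 is being read, so nothing is printed.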

3 Comments

I also tried this, but got no result: cat file2 file1 | awk '!seen[$1,$2]++' | sort -k 1V,1 -k 2n,2
@RKG: That's a bit strange, on my system both commands give the same output
Actually, file 1 has about 300,000 (3 lakh) rows and 3 columns, and file 2 has about 50,000 rows and 3 columns, so it is taking a while. I hope it gives me the result.

Using join:

join -a 2 file2 file1 | cut -d ' ' -f -2

Where file1 is the original file and file2 is the file with the replacement fields.


Edit: The question's requirements have been changed since posting this; it originally asked for joining two files with two columns each. For the new format, this awk script works:

cat file1 file2 | awk '
  BEGIN { OFS = "  " }
  { rows[$1 OFS $2] = $3 }
  END { for (r in rows) print(r, rows[r]) }
' | sort -V >file3

Output using the files specified in the question:

$ cat file3
A1  1  NA
A1  2  NA
A1  3  NA
A1  4  A
A1  5  B
A1  6  T
A1  7  NA
A1  8  NA
A1  9  NA
A2  1  NA
A2  2  NA
A2  3  T
A2  4  NA

6 Comments

No, I want to replace rows in file 1 only, using the matching row values from file 2.
I just edited in an example of its output. Does it not work for what you're trying to do?
No, this does not give me the result; it just concatenates the files.
Now that the input data is defined to have another field in front of the numbers, the above join command doesn't work anymore. One way is to replace the first delimiting space pair with for example '+' and then use the command, but I don't know how to do it without temporary files.
I agree, join doesn't fit the new problem too well. I edited my answer with an awk solution, and it should be more scalable since minor changes let you configure the output separator or the number of columns to join by. It isn't as elegant as join though, in my opinion.
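
The "minor changes" mentioned in the last comment could look roughly like the sketch below. The variable nkeys (how many leading columns form the key), the scratch directory and the shortened sample data are assumptions added for illustration and are not part of the original answer; sort -V is assumed to be available, as it is in GNU coreutils.

```shell
# Scratch directory with a reduced copy of the sample data.
demo=$(mktemp -d)
printf '%s\n' 'A1  4  A' 'A1  5  G' 'A1  6  T' 'A2  3  T' > "$demo/file1"
printf '%s\n' 'A1  5  B' > "$demo/file2"

# nkeys = number of leading columns that identify a row (assumed name).
# OFS   = separator used both inside the key and in the output.
cat "$demo/file1" "$demo/file2" | awk -v nkeys=2 -v OFS='  ' '
  {
    # Build the key from the first nkeys columns.
    key = $1
    for (i = 2; i <= nkeys; i++) key = key OFS $i

    # Everything after the key columns is the value.
    val = $(nkeys + 1)
    for (i = nkeys + 2; i <= NF; i++) val = val OFS $i

    # Lines read later (file2) overwrite earlier ones (file1).
    rows[key] = val
  }
  END { for (k in rows) print k, rows[k] }
' | sort -V > "$demo/file3"

cat "$demo/file3"
```

Note that this merge keeps any file2 row whose key does not occur in file1, and that it holds every row in memory and relies on sort to restore the order. If only matching rows should be replaced, a lookup that streams file1 is the closer fit.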

If you wrote the data to the file using a structure, you can use the same structure to retrieve the nth entry in a loop and replace it accordingly.

Otherwise, if you entered the data manually, you could scan the file past (n-1) '\n' characters and read the data that follows.
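
One possible reading of the second suggestion, addressing a row by its position in the file rather than by its first two columns, is sketched below with sed and awk. The values of n and new, the scratch directory and the sample data are hypothetical and chosen only for illustration.

```shell
# Scratch directory with a three-line sample.
demo=$(mktemp -d)
printf '%s\n' 'A1  4  A' 'A1  5  G' 'A1  6  T' > "$demo/file1"

n=2                 # hypothetical line number to look at and replace
new='A1  5  B'      # hypothetical replacement text

# Print only line n; sed quits as soon as that line has been printed.
sed -n "${n}{p;q;}" "$demo/file1"

# Copy the file, substituting line n with the new text.
awk -v n="$n" -v new="$new" 'NR == n { $0 = new } { print }' \
  "$demo/file1" > "$demo/file3"

cat "$demo/file3"
```

This only works when the positions are known in advance. For the files in the question, matching on the first two columns is more robust, because a row in file 2 does not sit at the same line number as its counterpart in file 1.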
