I have a tab-delimited file, File A, like this:
establishment_of_protein_localization_to_endoplasmic_reticulum GO:0072599
lipid_oxidation GO:0034440
endocytic_vesicle_lumen GO:0071682
monocarboxylic_acid_metabolic_process GO:0032787
protein_transmembrane_transport GO:0071806
cellular_response_to_topologically_incorrect_protein GO:0035967
preribosome GO:0030684
negative_regulation_of_hematopoietic_progenitor_cell_differentiation GO:1901533
and a second file structured like this:
font-family: Helvetica;
font-size: 10.86px;
font-weight: 700;
text-anchor: middle;
fill: #000000;
stroke: none;">
GO:0072599
</text>
<text x="509.10" y="-243.88"
style="
font-family: Helvetica;
font-size: 10.72px;
font-weight: 700;
text-anchor: middle;
fill: #000000;
stroke: none;">
GO:0034440
</text>
I want to use awk or sed to match the second column of File A against the second file, and replace each matching string in the second file with the corresponding first-column value from File A. The output should essentially look like this:
font-family: Helvetica;
font-size: 10.86px;
font-weight: 700;
text-anchor: middle;
fill: #000000;
stroke: none;">
establishment_of_protein_localization_to_endoplasmic_reticulum
</text>
<text x="509.10" y="-243.88"
style="
font-family: Helvetica;
font-size: 10.72px;
font-weight: 700;
text-anchor: middle;
fill: #000000;
stroke: none;">
lipid_oxidation
</text>
That is, each GO ID in the second file is replaced by the name paired with it in the first file. I tried using this command:
#!/bin/bash
awk 'NR==FNR{a[$2]=$1;next}{$1=a[$1\2];}1' input.csv
However, it replaces more than just the strings that appear in column 2 of File A.
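For reference, here is a minimal sketch of one way the replacement could be done with awk. It assumes each GO ID sits alone on its own line in the second file, as in the sample above. The file names fileA.tsv, fileB.svg and fileB.renamed.svg are hypothetical, and the two printf lines only create small sample inputs. Unlike the attempt above, it reads both files and only changes lines whose whole content is a known ID, instead of assigning to $1 on every line.

```shell
#!/bin/bash
# Hypothetical sample inputs (assumed names): fileA.tsv holds name<TAB>GO id,
# fileB.svg stands in for the second file.
printf 'lipid_oxidation\tGO:0034440\npreribosome\tGO:0030684\n' > fileA.tsv
printf 'stroke: none;">\nGO:0034440\n</text>\nfill: #000000;\n' > fileB.svg

awk -F'\t' '
    # First file: build a lookup table from GO id (column 2) to name (column 1).
    NR == FNR { name[$2] = $1; next }
    # Second file: compare the whole line, minus surrounding whitespace, to the table.
    {
        key = $0
        gsub(/^[ \t]+|[ \t\r]+$/, "", key)
        if (key in name) print name[key]; else print   # other lines pass through unchanged
    }
' fileA.tsv fileB.svg > fileB.renamed.svg

cat fileB.renamed.svg
```

Because non-matching lines are printed as they are and no field is reassigned, the spacing of the rest of the second file should be left alone. If an ID could share a line with other text, this whole-line comparison would not match it and a different approach would be needed.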