I have a file (usearch.txt) with entries that look like this:
0 AM158981
0 AM158980
0 AM158982
etc.
I want to replace the accession numbers in this file (AM158981, etc.) with the bacterial names that correspond to them, which are in a second file (acs.txt):
AM158981 Brucella pinnipedialis Brucellaceae
AM158980 Brucella suis Brucellaceae
AM158982 Brucella ceti Brucellaceae
etc.
My plan was to build a dictionary from the second file (accession number as the key, names as the value), then read the first file, use the dictionary to replace the accession numbers, and save the result to a new file (done.txt):
#! /usr/bin/env python
import re

# Creates a dictionary for accession numbers
fname = r"acs.txt"
namer = {}
for line in open(fname):
    acs, name = line.split(" ",1)
    namer[acs] = str(name)

infilename = "usearch.txt"
outfilename = "done.txt"
regex = re.compile(r'\d+\s(\w+)')

with open(infilename, 'r') as infile, open(outfilename, 'w') as outfile:
    for line in infile:
        x = regex.sub(r'\1', namer(name), line)
        outfile.write(x)
I get this error when I run this script:
Traceback (most recent call last):
  File "nameit.py", line 21, in <module>
    x = regex.sub(r'\1', namer(name), line)
TypeError: 'dict' object is not callable
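As far as I can tell, the message comes from writing namer(name) with parentheses, which asks Python to call the dictionary like a function. A minimal sketch of the difference, using a one-entry dictionary typed in by hand (the variables call_error and looked_up are only for illustration):

```python
# One-entry stand-in for the dictionary built from acs.txt.
namer = {"AM158981": "Brucella pinnipedialis Brucellaceae"}

# Parentheses try to call the dict, which raises TypeError.
try:
    namer("AM158981")
    call_error = None
except TypeError as exc:
    call_error = str(exc)

# Square brackets look the key up.
looked_up = namer["AM158981"]

print(call_error)
print(looked_up)
```

So the lookup itself probably needs square brackets. Separately, a compiled pattern's sub() takes the replacement first and the string second, with an optional count third, so the three arguments in my call may not line up the way I intended either.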
Ideally, my "done.txt" file would look like this:
0 Brucella pinnipedialis Brucellaceae
0 Brucella suis Brucellaceae
0 Brucella ceti Brucellaceae
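To make the goal concrete, here is a minimal sketch of the replacement I am after. It assumes the lookup can be done inside a replacement function passed to sub(). The helper names build_names and rename_line are hypothetical, and the sample lines are inlined as lists instead of being read from acs.txt and usearch.txt:

```python
import re

def build_names(acs_lines):
    """Map each accession number to the rest of its line."""
    names = {}
    for line in acs_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        acs, name = line.split(" ", 1)
        names[acs] = name
    return names

# Group 1: leading number plus whitespace; group 2: accession number.
pattern = re.compile(r'^(\d+\s+)(\w+)')

def rename_line(line, names):
    """Swap the accession number for its name, if it is known."""
    def repl(match):
        acs = match.group(2)
        # Unknown accession numbers are left unchanged (an assumption).
        return match.group(1) + names.get(acs, acs)
    return pattern.sub(repl, line)

# Inlined sample data standing in for the two files.
acs_lines = [
    "AM158981 Brucella pinnipedialis Brucellaceae\n",
    "AM158980 Brucella suis Brucellaceae\n",
    "AM158982 Brucella ceti Brucellaceae\n",
]
usearch_lines = ["0 AM158981\n", "0 AM158980\n", "0 AM158982\n"]

names = build_names(acs_lines)
result = [rename_line(line, names) for line in usearch_lines]
print("".join(result), end="")
```

With real files, the two lists would be replaced by the open file objects, and each renamed line would be written to done.txt instead of collected in a list.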