
I have a file (usearch.txt) with entries that look like this:

0 AM158981
0 AM158980
0 AM158982
etc.

I want to replace the accession numbers in this file (AM158981, etc.) with the corresponding bacterial names, which are in a second file (acs.txt):

AM158981 Brucella pinnipedialis Brucellaceae
AM158980 Brucella suis Brucellaceae
AM158982 Brucella ceti Brucellaceae
etc.

My plan was to build a dictionary from the second file (accession number as the key, name as the value), then read the first file, use the dictionary to replace each accession number, and save the result to a new file (done.txt):

#! /usr/bin/env python
import re
# Creates a dictionary for accession numbers

fname = r"acs.txt"

namer = {}
for line in open(fname):
        acs, name = line.split(" ",1)
        namer[acs] = str(name)

infilename = "usearch.txt"
outfilename = "done.txt"

regex = re.compile(r'\d+\s(\w+)')

with open(infilename, 'r') as infile, open(outfilename, 'w') as outfile:
    for line in infile:
        x = regex.sub(r'\1', namer(name), line)

        outfile.write(x) 

I get this error when I run this script:

Traceback (most recent call last):
  File "nameit.py", line 21, in <module>
    x = regex.sub(r'\1', namer(name), line)
TypeError: 'dict' object is not callable

Ideally, my "done.txt" file would look like this:

0 Brucella pinnipedialis Brucellaceae
0 Brucella suis Brucellaceae
0 Brucella ceti Brucellaceae

2 Comments

  • dict access is namer[name] (May 30, 2013 at 17:59)
  • I changed it to namer[name] and got this: TypeError: 'dict' object is not callable (May 30, 2013 at 18:00)

1 Answer

You're calling namer as if it were a function:

x = regex.sub(r'\1', namer(name), line)

You want to replace the parentheses with brackets to access the element with the key name:

x = regex.sub(r'\1', namer[name], line)
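
To see the difference in isolation, here is a minimal sketch. The one-entry dictionary is made up from the sample data in the question, and call_error and looked_up are names chosen only for this illustration:

```python
# A one-entry stand-in for the namer dictionary built from acs.txt.
namer = {"AM158981": "Brucella pinnipedialis Brucellaceae"}
name = "AM158981"

# Parentheses try to call the dict like a function, which raises TypeError.
try:
    namer(name)
    call_error = None
except TypeError as exc:
    call_error = str(exc)

# Square brackets look up the value stored under the key.
looked_up = namer[name]

print(call_error)
print(looked_up)
```

The first print shows the same message as the traceback in the question; the second shows the name stored for that accession number.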

Note that you'll also need to extract the accession number from each line; otherwise you'll look up the same key over and over:

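with open(infilename, 'r') as infile, open(outfilename, 'w') as outfile:
    for line in infile:
        # Need to get the ID for the bacteria in question. If we don't, everything
        # will end up with the same name in our output file.
        _, name = line.split(" ", 1)

        # Strip the newline character
        name = name.strip()

        # The compiled pattern's sub() takes (repl, string), so passing the line
        # as a third argument fails. Swap the ID for its name with str.replace
        # instead; rstrip() drops the newline kept in the dictionary value.
        x = line.replace(name, namer[name].rstrip())
        outfile.write(x)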
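
For reference, here is one way the whole task could be written without a regex. This is only a sketch under a few assumptions: both files are space-separated as in the samples, lines that don't have exactly two fields are passed through, and accession numbers missing from acs.txt are left unchanged. The function names load_names, rename_line, and rename_file are made up for this example.

```python
def load_names(path):
    """Map each accession number to its name, one 'ACCESSION name...' per line."""
    names = {}
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            accession, name = line.split(" ", 1)
            names[accession] = name
    return names


def rename_line(line, names):
    """Return the line with its accession number swapped for the name."""
    parts = line.split()
    if len(parts) != 2:
        return line.rstrip("\n")  # leave unexpected lines as they are
    prefix, accession = parts
    # Assumption: accession numbers not in the dictionary are kept unchanged.
    return "{} {}".format(prefix, names.get(accession, accession))


def rename_file(in_path, out_path, names):
    """Write a copy of in_path with every accession number replaced."""
    with open(in_path) as infile, open(out_path, "w") as outfile:
        for line in infile:
            outfile.write(rename_line(line, names) + "\n")
```

With the files from the question, you would call names = load_names("acs.txt") and then rename_file("usearch.txt", "done.txt", names). Using names.get() rather than names[...] means a missing accession number shows up unchanged in done.txt instead of stopping the script with a KeyError.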

22 Comments

I changed it to namer[name] and got this: TypeError: 'dict' object is not callable.
@Jen What's the full error? It should give you the line it doesn't like.
@Jen Also, what happens if you print name right before the regex.sub?
Are you sure you're modifying the right file? Did you install it and use a copy?
@Jen Right now, you only set name to a value when you loop through your dictionary file. When you start looping through the input file to write to the output file, name is never changed. So if name was something like foo, every line in the output file would use the foo value from your namer dictionary. I edited my answer to show how you can get the name - see if it makes sense.