
I am a somewhat experienced Java programmer who is re-implementing some code in Python as a way of learning the language. The issue I am having is that a function returns an empty list when I pass in global variables, but returns the intended result when literals are passed in. The function returns a list of words of the specified length that start with the string passed in. For example:

print getNGramBeginsWords("ha", 5)

returns

['HAAFS', 'HAARS', 'HABIT', 'HABUS', 'HACEK', 'HACKS', 'HADAL', 'HADED', 'HADES',
 'HADJI', 'HADST', 'HAEMS', 'HAETS', 'HAFIZ', 'HAFTS', 'HAHAS', 'HAIKA', 'HAIKS',
 'HAIKU', 'HAILS', 'HAINT', 'HAIRS', 'HAIRY', 'HAJES', 'HAJIS', 'HAJJI', 'HAKES', 
 'HAKIM', 'HAKUS', 'HALAL', 'HALED', 'HALER', 'HALES', 'HALID', 'HALLO', 'HALLS',
 'HALMA','HALMS', 'HALON', 'HALOS', 'HALTS', 'HALVA', 'HALVE', 'HAMAL', 'HAMES',
 'HAMMY', 'HAMZA', 'HANCE', 'HANDS', 'HANDY', 'HANGS', 'HANKS', 'HANKY', 'HANSA', 
 'HANSE', 'HANTS', 'HAOLE', 'HAPAX', 'HAPLY', 'HAPPY', 'HARDS', 'HARDY', 'HARED',
 'HAREM', 'HARES', 'HARKS', 'HARLS', 'HARMS', 'HARPS', 'HARPY', 'HARRY', 'HARSH', 
 'HARTS', 'HASPS', 'HASTE', 'HASTY', 'HATCH', 'HATED', 'HATER', 'HATES', 'HAUGH', 
 'HAULM', 'HAULS', 'HAUNT', 'HAUTE', 'HAVEN', 'HAVER', 'HAVES', 'HAVOC', 'HAWED', 
 'HAWKS', 'HAWSE', 'HAYED', 'HAYER', 'HAYEY', 'HAZAN', 'HAZED', 'HAZEL', 'HAZER', 
 'HAZES']

as it should. However,

print inputString
print numLetters
print getNGramBeginsWords(inputString, numLetters)

returns

ha
5
[]

inputString and numLetters are global variables, which I have seen called "dangerous," though I don't know why, and I was wondering whether they could be the cause of this oddity. Even passing local copies of the global variables as arguments doesn't help. Perhaps I need to use the "global" keyword somewhere, although from my research it appears you don't need the "global" keyword unless you are changing the global variable. Any suggestion or help would be appreciated. On the off chance that it is an issue with the function, here it is:

def getNGramBeginsWords(nGram, length):
    dict = open('/home/will/workspace/Genie/src/resources/TWL06.txt', 'r')
    nGram = nGram.upper()
    words = []
    for line in dict:
        if(len(line)>0):
            if(len(nGram)>len(line.strip()) | len(line.strip())!= length):
                continue
            s = line.strip()[:len(nGram)]
            if(s == nGram and len(line.strip()) == length):
                words.append(line.strip())
    return words
  • If you print type(numLetters) before calling getNGramBeginsWords, do you get <type 'int'> or <type 'str'>? I suspect you get the latter, so that for each line you compare the length of the line to the string "5". (Side note: dict is the name of a type in Python. You can reuse it as a local variable name, but it makes for a surprise when you later try to create a dict by calling dict. Maybe use stream for the variable name instead :-) ) Commented Jul 17, 2013 at 20:13
  • Are you sure it is | that you want in if(len(nGram)>len(line.strip()) | len(line.strip())!= length)? In Python, | is bitwise OR and or is Boolean OR. Commented Jul 17, 2013 at 20:14
  • Also, don't forget to close the dict file. Try using a with open(..., 'r') as dict: statement. Commented Jul 17, 2013 at 20:36

1 Answer

tl;dr: Global variables have nothing to do with this; you are almost certainly passing a string instead of an int as your length parameter. Your code also has a lot of redundancies.
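
To make the string-versus-int point concrete, here is a minimal sketch. The names num_letters and word are made up for illustration, and print is written in function form so the snippet also runs on Python 3 (the question itself uses Python 2 print statements).

```python
# Hypothetical values: a length that arrived as text, and one word from the list.
num_letters = "5"
word = "HABIT"

# An int is never equal to a str, so this comparison is False for every word
# and nothing is ever appended to the result list.
print(len(word) == num_letters)

# Converting the text to an int first makes the comparison behave as intended.
print(len(word) == int(num_letters))
```

The first comparison is False and the second is True, which matches the symptom in the question: the same call works with the literal 5 and returns [] when the length is the string "5".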

Your code has a number of obvious problems, both stylistic and substantive:

def getNGramBeginsWords(nGram, length):
    # dict is the name of a builtin type, which you are confusingly shadowing
    # dict = open('/home/will/workspace/Genie/src/resources/TWL06.txt', 'r')
    wlist = open('/home/will/workspace/Genie/src/resources/TWL06.txt', 'r')
    nGram = nGram.upper()
    words = []
    for line in wlist:
        # an empty string evaluates to False in a boolean context; also no need for those brackets
        stripline = line.strip().upper() # you keep doing this; I added the upper here.
        # you don't need this if, because you immediately test length
        #if stripline: #I know I changed this, but you only refer to the stripped version below 
        # pipe | is bitwise OR. I bet you don't want that
        if len(nGram)>len(stripline) or len(stripline)!= length:
            continue
        # s = stripline[:len(nGram)] #you only use this once
        # you don't need to check that stripline is of length again; you already did that
        # also, you can just use `startswith` instead of slicing (not `endswith`: the slice takes a prefix)
        if stripline.startswith(nGram):
            words.append(stripline)
    return words
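
As an aside on the | in the original condition: the sketch below, with made-up lengths, shows how Python parses it. Because | binds more tightly than the comparison operators, the expression becomes a chained comparison, which behaves like an and rather than an or.

```python
# Hypothetical lengths chosen for illustration: a 2-letter n-gram,
# a 7-letter line, and a requested length of 5.
ngram_len, line_len, length = 2, 7, 5

# Parsed as: ngram_len > (line_len | line_len) != length
# x | x is just x, so this is the chained comparison
# (ngram_len > line_len) and (line_len != length).
with_pipe = ngram_len > line_len | line_len != length

# The same test written out explicitly with `and`.
chained = (ngram_len > line_len) and (line_len != length)

# What was presumably intended: skip the line if either test is true.
with_or = ngram_len > line_len or line_len != length

print(with_pipe)
print(chained)
print(with_or)
```

With these values the | version and the explicit and version are both False, while the or version is True. So the original first test does not skip a 7-letter line when 5 letters were asked for; only the later length check inside the second if filters it out.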

And, without comments:

def getNGramBeginsWords(nGram, length):
    wlist = open('/home/will/workspace/Genie/src/resources/TWL06.txt', 'r')
    nGram = nGram.upper()
    words = []
    for line in wlist:
        stripline = line.strip().upper()
        if len(nGram)>len(stripline) or len(stripline)!= length:
            continue
        if stripline.startswith(nGram):
            words.append(stripline)
    return words

Merging the two adjacent ifs:

def getNGramBeginsWords(nGram, length):
    wlist = open('/home/will/workspace/Genie/src/resources/TWL06.txt', 'r')
    nGram = nGram.upper()
    words = []
    for line in wlist:
        stripline = line.strip().upper()
        # the two ifs are merged into one condition;
        # this also renders the comparison of nGram and stripline lengths redundant
        if (len(stripline) == length) and stripline.startswith(nGram):
            words.append(stripline)
    return words

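Now, let's look at this last version. Funnily enough, you never actually perform a numeric operation on length; you only compare it for equality, and comparing an int to a string is simply False rather than an error, which is why a string length silently gives you an empty list. Given that length is supposed to be a number, you might like to force it to a number; if it can't be converted, you'll get an exception.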

def getNGramBeginsWords(nGram, length):
    wlist = open('/home/will/workspace/Genie/src/resources/TWL06.txt', 'r')
    nGram = nGram.upper()
    words = []
    length = int(length) # force to an int; raises ValueError if it can't be converted
    # or, if you prefer to reject anything that isn't already an int, do this instead:
    # assert isinstance(length, int)
    for line in wlist:
        stripline = line.strip().upper()
        if (len(stripline) == length) and stripline.startswith(nGram):
            words.append(stripline)
    return words
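
A small sketch of what that conversion does with good and bad input. The helper name coerce_length is hypothetical and only wraps the built-in int(); print is in function form so this also runs on Python 3.

```python
def coerce_length(value):
    """Hypothetical helper: convert `value` to an int or raise ValueError."""
    return int(value)

# Text that looks like a number is converted.
print(coerce_length("5"))

# A value that is already an int passes through unchanged.
print(coerce_length(5))

try:
    coerce_length("five")
except ValueError as exc:
    # int() refuses text that is not a number, so the mistake surfaces
    # immediately instead of producing a silently empty result.
    print("rejected: %s" % exc)
```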

And finally, you never actually close the file explicitly, so it stays open until the interpreter gets around to cleaning it up. It's better to use the with construct to close it automatically:

def getNGramBeginsWords(nGram, length):
    with open('/home/will/workspace/Genie/src/resources/TWL06.txt', 'r') as wlist:
        nGram = nGram.upper()
        words = []
        length = int(length) # force to an int; raises ValueError if it can't be converted
        # or, if you prefer to reject anything that isn't already an int, do this instead:
        # assert isinstance(length, int)
        for line in wlist:
            stripline = line.strip().upper()
            # use `startswith` instead of the slice
            if (len(stripline) == length) and stripline.startswith(nGram):
                words.append(stripline)
        return words
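
For trying the logic without the TWL06.txt file, here is a sketch that takes any iterable of lines instead of opening a fixed path. The function name ngram_begins_words, the lines parameter, and the sample words are assumptions made for illustration. It uses startswith, which matches the prefix slice line.strip()[:len(nGram)] in the question.

```python
import io

def ngram_begins_words(lines, ngram, length):
    """Return the words in `lines` that have `length` letters and start with `ngram`."""
    ngram = ngram.upper()
    length = int(length)  # accept "5" as well as 5; raises ValueError otherwise
    words = []
    for line in lines:
        word = line.strip().upper()
        if len(word) == length and word.startswith(ngram):
            words.append(word)
    return words

# A tiny in-memory stand-in for the word list file (made-up sample data).
sample = io.StringIO(u"HABIT\nHAPPY\nHAT\nSHAPE\nhazel\n")
print(ngram_begins_words(sample, "ha", "5"))
```

With this sample, HABIT, HAPPY and HAZEL are returned; HAT has the wrong length and SHAPE contains "HA" but does not start with it. To use a real file, pass an open file object: with open(path) as wlist: ngram_begins_words(wlist, "ha", 5).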

1 Comment

Thanks, I made most of the changes you recommended, aside from "stripline.endswith(nGram)"; I wanted it to check the beginning of the word instead (startswith). The issue was that I was assigning a value to numLetters from raw_input() without casting it to an integer first; I was too used to Scanner.nextInt() from Java. Sorry for all of the silly mistakes, and thanks for helping me fix them!
