The code is meant to take a file as input, convert all letters to lowercase, and strip any non-alphabetic characters. It should then print how many times each word occurs in the file.
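#!/usr/bin/python
import sys

def main(argv):
    # Open the file named by the first command-line argument.
    try:
        tf = open(argv[1], "r")
    except IOError:
        # tf is never assigned when open() fails, so report the path instead.
        print("The file %s was not found" % argv[1])
        sys.exit(1)
    data = tf.read()
    tf.close()
    # Strings are immutable: lower() and replace() return new strings.
    data = data.lower()
    # Turn hyphens and newlines into spaces so adjacent words stay separate.
    data = data.replace("-", " ").replace("\n", " ")
    validLetters = " abcdefghijklmnopqrstuvwxyz"
    # Keep only lowercase letters and spaces.
    cleanData = ''.join([i for i in data if i in validLetters])
    frequency = {}
    words = cleanData.split()
    for x in words:
        if x in frequency:
            frequency[x] = frequency[x] + 1
        else:
            # First occurrence: start the count at 1.
            frequency[x] = 1
    print(sorted(frequency.values()))

if __name__ == "__main__":
    main(sys.argv)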
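For comparison, the cleaning and counting steps described above could be sketched with collections.Counter from the standard library. This is only one possible approach, not the script's own method: the function name count_words and the sample string are hypothetical, and the sketch assumes that a hyphen separates words while every other non-letter is simply dropped, as in the script.

```python
import string
from collections import Counter

def count_words(text):
    """Return a Counter mapping each lowercase word to its number of occurrences.

    count_words is a hypothetical helper name used only for this sketch.
    """
    # Lowercase first, then treat hyphens as word separators.
    text = text.lower().replace("-", " ")
    # Keep ASCII letters and whitespace; drop digits and punctuation.
    cleaned = "".join(
        ch for ch in text if ch in string.ascii_lowercase or ch.isspace()
    )
    # split() with no arguments splits on any run of whitespace, including newlines.
    return Counter(cleaned.split())

# Made-up sample text for illustration.
counts = count_words("I Was A Teenage Hacker\nA hacker-in-training was I")
for word in sorted(counts):
    print(word, counts[word])
```

Counter behaves like a dictionary whose missing keys count as zero, so no has_key check or else branch is needed.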
This is what I get on the command line:
$ python -m py_compile q1_word_count.py drake.txt
File "drake.txt", line 1
I Was A Teenage Hacker
^
SyntaxError: invalid syntax
"I Was A Teenage Hacker" is the first line of the text file.
python q1_word_count.py drake.txt
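The SyntaxError above is consistent with how python -m py_compile works: it treats every file name on the command line as Python source to compile, so drake.txt is parsed as code rather than passed to the script as an argument. A minimal sketch that reproduces this with the py_compile module follows; the helper name compiles_as_python and the temporary file names are hypothetical, and the file contents are stand-ins for the real files.

```python
import os
import py_compile
import shutil
import tempfile

def compiles_as_python(path):
    """Return True if py_compile accepts the file at `path` as Python source.

    compiles_as_python is a hypothetical helper name used only for this sketch.
    """
    out_dir = tempfile.mkdtemp()  # throwaway location for the compiled output
    try:
        # doraise=True makes compile() raise instead of printing the error.
        py_compile.compile(
            path, cfile=os.path.join(out_dir, "check.pyc"), doraise=True
        )
        return True
    except py_compile.PyCompileError:
        return False
    finally:
        shutil.rmtree(out_dir, ignore_errors=True)

# Build two stand-in files: a plain-text file and a trivial script.
work_dir = tempfile.mkdtemp()
text_path = os.path.join(work_dir, "drake.txt")
script_path = os.path.join(work_dir, "script.py")
with open(text_path, "w") as f:
    f.write("I Was A Teenage Hacker\n")
with open(script_path, "w") as f:
    f.write("print('ok')\n")

text_ok = compiles_as_python(text_path)
script_ok = compiles_as_python(script_path)
print("text file compiles:", text_ok)
print("script compiles:", script_ok)
shutil.rmtree(work_dir, ignore_errors=True)
```

Running the script directly with the interpreter, and giving the text file as its argument, avoids compiling the text file at all.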