I have some Python code that reads a file and pushes its lines onto a list. I then put the items of this list on a queue and use threading to process them, say 20 items at a time. After processing, I save the results to a new file. The lines in the new file end up in a different order from the original file. For example, given this input:
1 a
2 b
3 c
4 a
5 d
But the output looks like:
2 aa
1 ba
4 aa
5 da
3 ca
Is there any way to preserve the original order? Here is my code:
import threading,Queue,time,sys
class eSS(threading.Thread):
    def __init__(self,queue):
        threading.Thread.__init__(self)
        self.queue = queue
        self.lock = threading.Lock()
    def ess(self,email,code,suggested,comment,reason,dlx_score):
        #do something
    def run(self):
        while True:
            info = self.queue.get()
            infolist = info.split('\t')
            email = infolist[1]
            code = infolist[2]
            suggested = infolist[3]
            comment = infolist[4]
            reason = infolist[5]
            dlx_score = (0 if infolist[6] == 'NULL' else int(infolist[6]))
            g.write(info + '\t' + self.ess(email,code,suggested,comment,reason,dlx_score) +'\r\n')
            self.queue.task_done()
if __name__ == "__main__":
    queue = Queue.Queue()
    filename = sys.argv[1]
    #Define number of threads
    threads = 20
    f = open(filename,'r')
    g = open(filename+'.eSS','w')
    lines = f.read().splitlines()
    f.close()
    start = time.time()
    for i in range(threads):
        t = eSS(queue)
        t.setDaemon(True)
        t.start()
    for line in lines:
        queue.put(line)
    queue.join()
    print time.time()-start
    g.close()
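One possible way to keep the original order, sketched here rather than taken from the code above: tag each line with its position before queueing it, have each worker store its result in a list slot at that position, and write the file only after `queue.join()` returns. The sketch assumes Python 3 (so the module is `queue`, not `Queue`), and `process`, `worker`, and `process_in_order` are hypothetical names; `process` merely stands in for the real `ess` method.

```python
import queue
import threading


def process(line):
    # Hypothetical stand-in for the real per-line work (the `ess` method).
    return line.upper()


def worker(tasks, results):
    # Each task carries its original position, so the result can be
    # stored in the matching slot no matter which thread finishes first.
    while True:
        index, line = tasks.get()
        try:
            results[index] = line + '\t' + process(line)
        finally:
            # Always mark the task done so tasks.join() can return.
            tasks.task_done()


def process_in_order(lines, num_threads=20):
    tasks = queue.Queue()
    results = [None] * len(lines)  # one slot per input line
    for _ in range(num_threads):
        threading.Thread(target=worker, args=(tasks, results),
                         daemon=True).start()
    for index, line in enumerate(lines):
        tasks.put((index, line))
    tasks.join()  # wait until every line has been processed
    return results
```

The returned list can then be written out in a single loop from the main thread, which also removes the need for several threads to write to the same file handle. Note that if `process` raises, that worker thread exits and its slot stays `None`, so real code would probably want to catch and record errors per line.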
g isn't in scope in the run method. Also, as Daniel alluded to, do you really need threads? Even ignoring the out-of-order information, does this actually run any faster than just reading and writing sequentially?
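One way to check that empirically is to time a plain sequential loop against a threaded version on the same input. The following is a minimal sketch assuming Python 3; `process`, `run_sequential`, `run_threaded`, and `timed` are hypothetical names, and `process` is only a placeholder for the real work. It uses `concurrent.futures.ThreadPoolExecutor`, whose `map` method yields results in the order of its inputs, so both versions preserve the original line order.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def process(line):
    # Hypothetical placeholder for the real per-line work.
    return line + 'a'


def run_sequential(lines):
    # Plain loop: results are naturally in input order.
    return [process(line) for line in lines]


def run_threaded(lines, num_threads=20):
    # Executor.map returns results in the order of the inputs,
    # regardless of which worker thread finishes first.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(process, lines))


def timed(func, *args):
    # Return the function's result together with the elapsed seconds.
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    lines = [str(i) for i in range(1000)]
    seq_result, seq_seconds = timed(run_sequential, lines)
    thr_result, thr_seconds = timed(run_threaded, lines)
    assert seq_result == thr_result
    print("sequential: %.4fs, threaded: %.4fs" % (seq_seconds, thr_seconds))
```

If the per-line work is CPU-bound pure Python, threads in CPython are unlikely to help because of the global interpreter lock; they tend to pay off mainly when each item waits on I/O such as a network call.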