Python Multithreading Missing Data
Solution 1:
Python's threading module won't run your Python code on several cores at once: because of the global interpreter lock (http://wiki.python.org/moin/GlobalInterpreterLock), only one thread in a CPython process executes Python bytecode at a time. If you really want to take advantage of multiple cores, you should use multiprocessing (http://docs.python.org/library/multiprocessing.html) instead.
Also, you seem to be accessing a file simultaneously from several threads:
with open(self.error_log, 'a') as err_log_f:
    err_log_f.write("{0},{1},{2}\n".format(idx, url, resp.code))
This is really bad, AFAIK. If two threads try to write to the same file at the same time, or almost at the same time (keep in mind, they're not truly running in parallel), the behavior tends to be undefined. Imagine one thread writing while another has just closed the file...
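Anyway, you would need a third queue to handle writing to the file: the worker threads put their log lines on it, and a single thread takes them off and does the writing.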
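To make the queue idea concrete, here is a minimal sketch, assuming Python 3. The names log_writer, run_workers and _STOP are hypothetical (they are not from the question's code), and the worker body is a stand-in that logs a placeholder 404 instead of fetching a URL. Only the writer thread ever opens the file.

```python
import queue
import threading

_STOP = object()  # hypothetical sentinel: tells the writer thread to exit


def log_writer(log_queue, path):
    """Take lines off log_queue and append them to path.

    This is the only thread that touches the file.
    """
    with open(path, "a") as log_file:
        while True:
            line = log_queue.get()
            if line is _STOP:
                break
            log_file.write(line)


def run_workers(path, n_workers=4, items_per_worker=50):
    """Start the writer plus n_workers producers, then shut down cleanly."""
    log_queue = queue.Queue()
    writer = threading.Thread(target=log_writer, args=(log_queue, path))
    writer.start()

    def worker(worker_id):
        for idx in range(items_per_worker):
            # Stand-in for the real work (fetching a URL); 404 is a placeholder.
            log_queue.put("{0},{1},{2}\n".format(worker_id, idx, 404))

    workers = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()

    # Every worker has finished, so nothing else will be queued after the sentinel.
    log_queue.put(_STOP)
    writer.join()
```

Calling run_workers("errors.log") should leave one complete line per logged item in the file. The sentinel is queued only after all workers have joined, so the writer cannot exit while lines are still on their way.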
Solution 2:
At first glance this looks like a race condition, since many threads are trying to write to the log file at the same time. See this question for some pointers on how to lock a file for writing (so only one thread can access it at a time).
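As a rough sketch of the "one thread at a time" idea, assuming Python 3 and that all the writers live in the same process: a threading.Lock shared by the threads is enough there. Note that this is an in-process lock, not the OS-level file locking the linked question may describe. The helper name log_error is hypothetical.

```python
import threading

log_lock = threading.Lock()  # shared by every thread that writes to the log


def log_error(path, idx, url, code):
    """Append one line to the log; the lock admits one writer at a time."""
    with log_lock:
        with open(path, "a") as err_log_f:
            err_log_f.write("{0},{1},{2}\n".format(idx, url, code))
```

Each thread would then call log_error(self.error_log, idx, url, resp.code) instead of opening the file itself.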