
Python Multithreading Missing Data

I am working on a Python script to check whether a URL is working. The script writes the URL and response code to a log file. To speed up the check, I am using threading and qu

Solution 1:

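Because of the global interpreter lock (GIL, http://wiki.python.org/moin/GlobalInterpreterLock), only one thread in Python's threading module executes Python bytecode at a time, so threads do not run in parallel across cores. If you really want to take advantage of multiple cores, use multiprocessing (http://docs.python.org/library/multiprocessing.html). Threads can still overlap while they wait on network I/O, though, because the GIL is released during blocking calls.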

Also, you seem to be accessing a file simultaneously from several threads:

with open(self.error_log, 'a') as err_log_f:
    err_log_f.write("{0},{1},{2}\n".format(idx, url, resp.code))

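This is really bad AFAIK. If two threads try to write to the same file at the same time, or almost at the same time, the result is unpredictable: lines can be interleaved or lost. Keep in mind that even though the threads do not execute Python code in parallel, the interpreter can switch between them in the middle of an operation. Imagine one thread writing while another has just closed the file...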

Anyway you would need a third queue to handle writing to the file.
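The queue-based approach suggested above could look roughly like the following. This is only a minimal sketch in Python 3, not the asker's actual script: the names `log_writer`, `run_checks`, and the `None` sentinel are hypothetical, and the worker here just forwards precomputed `(idx, url, code)` tuples instead of fetching URLs. The point is that only one thread ever touches the file.

```python
import queue
import threading


def log_writer(log_queue, path):
    """Append records from log_queue to path until a None sentinel arrives."""
    with open(path, "a") as log_f:
        while True:
            record = log_queue.get()
            if record is None:  # hypothetical sentinel meaning "no more records"
                break
            idx, url, code = record
            log_f.write("{0},{1},{2}\n".format(idx, url, code))


def run_checks(path, results, n_workers=4):
    """Fan results out to worker threads; a single writer thread owns the file."""
    log_queue = queue.Queue()
    writer = threading.Thread(target=log_writer, args=(log_queue, path))
    writer.start()

    def worker(chunk):
        # A real worker would fetch each URL here; this one only forwards
        # the precomputed (idx, url, code) tuples to the writer.
        for record in chunk:
            log_queue.put(record)

    workers = [
        threading.Thread(target=worker, args=(results[i::n_workers],))
        for i in range(n_workers)
    ]
    for t in workers:
        t.start()
    for t in workers:
        t.join()

    log_queue.put(None)  # all workers are done, tell the writer to stop
    writer.join()
```

Because `queue.Queue` is thread-safe, the workers never need to coordinate with each other; they only hand records to the writer.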

Solution 2:

At first glance this looks like a race condition, since many threads are trying to write to the log file at the same time. See this question for some pointers on how to lock a file for writing (so only one thread can access it at a time).
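Within a single process, one simple way to let only one thread write at a time is a shared `threading.Lock`. The sketch below assumes Python 3; `log_lock` and `write_result` are hypothetical names, not taken from the original script, and this guards against other threads only, not other processes.

```python
import threading

# One lock shared by every thread that writes to the log file.
log_lock = threading.Lock()


def write_result(path, idx, url, code):
    """Append one result line; the lock ensures one writer at a time."""
    line = "{0},{1},{2}\n".format(idx, url, code)
    with log_lock:
        with open(path, "a") as log_f:
            log_f.write(line)
```

Each worker thread would call `write_result(...)` instead of opening the file itself, so the open, write, and close sequence is never interleaved between threads.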
