Why Does My Multi-threaded Code Take More Time Than Single-threaded Code?

May 27, 2024

This is my code:

    import re, threading

    class key_value:
        def __init__(self, filename='a.txt'):
            self.filename = filename

        def __getitem__(self, key):
            file = ope

Solution 1:

The application is I/O-bound, not CPU-bound, and all of its I/O goes to a single file, so multi-threading is not going to help.

Also, as noted, 1,000 threads is not going to be productive. Try smaller numbers, e.g. 2 to 4; a popular rule of thumb is to go up to around 2 × the number of cores. Raising the thread count too high makes the overhead of thread management slow the application down significantly.
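One way to keep the thread count small is a fixed-size pool that reuses its threads. The following is a minimal sketch, not the asker's code: `suggested_workers` and `run_bounded` are hypothetical helper names, and the squaring task is only a placeholder workload.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def suggested_workers(factor=2):
    """Return a small worker count: factor x CPU cores, never less than 1."""
    cores = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, factor * cores)


def run_bounded(task, items, max_workers=None):
    """Run task(item) for every item on a fixed-size pool of threads.

    The pool reuses its threads, so 1,000 items do not mean 1,000 threads.
    """
    if max_workers is None:
        max_workers = suggested_workers()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as the inputs.
        return list(pool.map(task, items))


# Placeholder workload: 1,000 small jobs handled by only 4 threads.
squares = run_bounded(lambda n: n * n, range(1000), max_workers=4)
```

Whether 2, 4, or 2 × cores works best depends on the workload, so it is worth timing a few values rather than trusting the rule of thumb.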

Solution 2:

Multi-threading is useful and efficient when you have to access different resources (files, network, user interface...) at the same time. In your code, it seems to me that you access only one resource, a file, so multi-threading is less efficient.
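To illustrate the case where threads do pay off, here is a small sketch in which each thread waits on a different, independent resource. The resources are simulated with `time.sleep`, and the names `fetch`, `fetch_all`, and `timed` are made up for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(resource, delay=0.05):
    """Stand-in for a blocking call to one independent resource."""
    time.sleep(delay)  # sleeping releases the GIL, as blocking I/O does
    return f"{resource}: done"


def fetch_all(resources):
    """Wait on every resource at once, one thread per resource."""
    with ThreadPoolExecutor(max_workers=len(resources)) as pool:
        return list(pool.map(fetch, resources))


def timed(func, *args):
    """Return (result, elapsed seconds) for one call of func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


resources = ["file", "network", "user interface"]  # hypothetical names
sequential, t_seq = timed(lambda: [fetch(r) for r in resources])
threaded, t_thr = timed(fetch_all, resources)
print(f"sequential: {t_seq:.2f}s, threaded: {t_thr:.2f}s")
```

Because the waits are independent, the threaded version can overlap them. When every thread queues up for the same file, there is little to overlap.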

Solution 3:

I didn't read your code in detail, but test it on a multi-core computer; you will probably see an improvement.

Solution 4:

One reason is that accessing one file at a time can actually be much faster than accessing multiple files at the same time, due to read overhead. (The disk has a limited cache, and it is always best to read a file as a stream from beginning to end.)

In any case, the bottleneck is the disk, and the more threads you have asking for resources, the worse it gets.
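A sketch of the "read it as a stream" idea: scan the file once, front to back, keep the result in memory, and answer later lookups from there instead of reopening the file. The question's file format is not shown, so the `key=value` line layout and the `load_pairs` helper below are assumptions made for illustration.

```python
import os
import tempfile


def load_pairs(filename):
    """Read the whole file once, front to back, into a dict."""
    pairs = {}
    with open(filename, encoding="utf-8") as handle:
        for line in handle:  # sequential streaming read
            key, sep, value = line.rstrip("\n").partition("=")
            if sep:  # skip lines without the assumed "=" separator
                pairs[key] = value
    return pairs


# Demo on a throwaway file; the "key=value" layout is an assumption.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("a=1\nb=2\nnot a pair\nc=3\n")
    demo_path = tmp.name

try:
    table = load_pairs(demo_path)
finally:
    os.remove(demo_path)  # clean up the temporary file
```

After the single pass, each lookup is a dictionary access and does not touch the disk at all.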

Solution 5:

David Beazley has done an excellent investigation of this phenomenon here. This is a video of the talk. In short, your threads are battling each other to send and respond to signals in order to acquire the GIL. And no, this does not happen only to CPU-bound threads; I/O-bound threads suffer from the same problem.
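A quick way to observe GIL contention yourself is to time a pure-Python loop run twice in a row against the same loop run in two threads. This is a simplified sketch that covers only the CPU-bound case and assumes a standard CPython build with the GIL enabled; the function names and the loop size are arbitrary choices for this example.

```python
import threading
import time


def countdown(n):
    """Pure-Python CPU-bound loop; it needs the GIL for every step."""
    while n > 0:
        n -= 1


def run_sequential(n):
    """Time two countdowns executed one after the other."""
    start = time.perf_counter()
    countdown(n)
    countdown(n)
    return time.perf_counter() - start


def run_threaded(n):
    """Time the same two countdowns executed in two threads."""
    threads = [threading.Thread(target=countdown, args=(n,)) for _ in range(2)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


N = 2_000_000  # arbitrary loop size
print(f"sequential: {run_sequential(N):.3f}s")
print(f"threaded:   {run_threaded(N):.3f}s")
```

On a GIL-enabled interpreter the two threads take turns rather than run in parallel, so the threaded timing should not be expected to halve. The exact numbers vary with the Python version and the machine.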
