
What Am I Missing In Python-multiprocessing/multithreading?

I am creating, multiplying, and then summing all elements of two big matrices in NumPy. I do this a few hundred times with two methods: a plain loop, and with the help of the multiprocessing module.

Solution 1:

There is always some overhead (synchronization, data preparation, data copies, and so on).

But: given a good setup, your matrix-vector and vector-vector operations in NumPy are already multithreaded, using BLAS (the de facto standard used everywhere, including NumPy, MATLAB, and probably TensorFlow's CPU backend; there are different implementations, though).

So if BLAS is able to occupy all your cores (easier with big dimensions), you are only seeing the overhead.
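You can check which BLAS your NumPy build links against and confirm that a single large matrix product is dispatched to it; a quick sketch (the sizes are arbitrary):

```python
import numpy as np

# Show which BLAS implementation this NumPy build links against
# (OpenBLAS, MKL, Accelerate, ...). Output depends on the install.
np.__config__.show()

# A single large matmul already runs multithreaded inside BLAS --
# no Python-level parallelism needed.
n = 1000
a = np.random.rand(n, n)
b = np.random.rand(n, n)
c = a @ b          # dispatched to BLAS (dgemm)
total = c.sum()    # the "sum all elements" step from the question
```

Watching CPU usage while this runs (or setting an env var like `OPENBLAS_NUM_THREADS`, if your build uses OpenBLAS) makes the internal threading visible.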

And yes, TensorFlow at its core is implemented in at least one of C/C++/Fortran, plus BLAS for its CPU backend and CUDA libraries when targeting the GPU. This also means that the core algorithms, such as gradient calculations and optimization steps, should never need external parallelization (in 99.9% of all use cases).
