
Should I Send Data In Chunks, Or Send It All At Once?

I have Python code that sends data (a rather large file) to a socket. Should I divide it into 1 KB chunks, or would just conn.sendall(file.read()) be acceptable?

Solution 1:

It will make little difference to the sending operation. (I assume you are using a TCP socket for the purposes of this discussion.)

When you attempt to send 1K, the kernel will take that 1K, copy it into kernel TCP buffers, and return success (and probably begin sending to the peer at the same time). You then send another 1K and the same thing happens. Eventually, if the file is large enough and either the network can't send it fast enough or the receiver can't drain it fast enough, the kernel buffer space used by your data will reach some internal limit, and your process will be blocked until the receiver drains enough data. (This limit can often be pretty high with TCP: depending on the OS, you may be able to send a megabyte or two without ever hitting it.)
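If you are curious how large that kernel limit is on your machine, a rough way to look is to ask the socket for its send buffer size. This is only a sketch: send_buffer_size is a helper name made up for this example, and the number reported by SO_SNDBUF is an approximation, since some systems adjust the buffer dynamically or report a value that includes bookkeeping overhead.

```python
import socket


def send_buffer_size(sock):
    """Return the kernel send buffer size (in bytes) reported for `sock`.

    Hypothetical helper for illustration; the value is OS-dependent.
    """
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


if __name__ == "__main__":
    # An unconnected TCP socket is enough to query the default setting.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("SO_SNDBUF:", send_buffer_size(s), "bytes")
```

Whatever the figure turns out to be, the behaviour is the same: once that much data is queued and unacknowledged, a blocking send waits.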

If you try to send in one shot, pretty much the same thing will happen: data will be transferred from your buffer into kernel buffers until/unless some limit is reached. At that point, your process will be blocked until data is drained by the receiver (and so forth).

However, with the first mechanism, you can send a file of any size without using undue amounts of memory -- your in-memory buffer (not including the kernel TCP buffers) only needs to be 1K long. With the conn.sendall(file.read()) approach, file.read() will read the entire file into your program's memory before anything is sent. If you attempt that with a truly giant file (say 40 GB or something), that might take more memory than you have, even including swap space.

So, as a general purpose mechanism, I would definitely favor the first approach. For modern architectures, I would use a larger buffer size than 1K though. The exact number probably isn't too critical; but you could choose something that will fit several disk blocks at once, say, 256K.
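A minimal sketch of the chunked approach, assuming conn is an already-connected, blocking TCP socket. The function name send_file_in_chunks and the constant CHUNK_SIZE are made up for this example; the 256K figure is just the suggestion from above, not a tuned value.

```python
import socket

# 256 KiB per read, as suggested above; the exact value is not critical.
CHUNK_SIZE = 256 * 1024


def send_file_in_chunks(conn, path, chunk_size=CHUNK_SIZE):
    """Send the file at `path` over `conn`, holding at most `chunk_size`
    bytes of file data in memory at a time. Returns the number of bytes sent.

    Hypothetical helper for illustration; `conn` is assumed to be a
    connected, blocking socket.
    """
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            # sendall() keeps calling send() until the whole chunk has been
            # handed to the kernel, blocking while the kernel buffers are full.
            conn.sendall(chunk)
            total += len(chunk)
    return total
```

You would call it as send_file_in_chunks(conn, "bigfile.bin") in place of conn.sendall(file.read()). Memory use stays at roughly one chunk no matter how large the file is.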
