Elephant in the Cloud

Today I came across an interesting issue with the cloud. It's called LFN, a.k.a. the Elephant. LFN stands for Long Fat Network, the TCP/IP term for a network path with high bandwidth and high latency. You can google it for the interesting technical details. What it means for a cloud user is simply that he or she is going to wait for ages before a fat file gets uploaded to the cloud server. It does not really matter whether you have 2 Mbps or 20 Mbps. TCP can only have one window of unacknowledged data in flight per round trip, so the latency in the network prevents it from transferring the file at the maximum available bandwidth.
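To put a rough number on this, here is a minimal sketch of the window/round-trip limit. The 64 KB window (the largest possible without TCP window scaling) and the 200 ms round trip are illustrative assumptions, not figures from my tests, and the function names are made up for this example.

```python
def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-connection TCP throughput, in bits per second.

    At most one window of data can be unacknowledged per round trip.
    """
    return window_bytes * 8 / rtt_seconds


def bandwidth_delay_product_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Window size (in bytes) needed to keep a link of this speed full."""
    return bandwidth_bps * rtt_seconds / 8


# Assumed values, for illustration only.
window = 65535   # bytes, max TCP window without window scaling
rtt = 0.2        # seconds, a hypothetical round trip to a distant data centre

cap = max_tcp_throughput_bps(window, rtt)
needed = bandwidth_delay_product_bytes(20e6, rtt)
print(f"Throughput cap: {cap / 1e6:.2f} Mbps")
print(f"Window needed to fill 20 Mbps: {needed:.0f} bytes")
```

Under these assumptions a single connection tops out at roughly 2.6 Mbps no matter how fast the link is, and a real transfer can do much worse once packet loss and protocol overhead come into play.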

And this is a serious issue for all cloud users! I tested it across several cloud service providers: Amazon, Rackspace and Opsource. Over my 4 Mbps connection, I could not get more than 100 kbps while transferring over scp.

This file has the relevant test info for copying speeds on various cloud operators.

So, what's the solution?

Solution#1: Simple and Quick

  • Split the files into smaller chunks. I used hjsplit, which is available for Windows as well as Linux. e.g. hjsplit filename
  • Use a multi-threaded FTP client. CuteFTP supports multi-threading and has a 30-day evaluation version. GoFTP is another choice. FileZilla is my choice any day :).
  • With such a client, you transfer multiple chunks in parallel. Each connection gets its own TCP window, so the overall throughput increases.
  • On the server side, run hjsplit -j to join the files back together after all of the chunks have been uploaded. e.g. hjsplit -j filename.001
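The split, parallel transfer and join steps can also be sketched in a few lines of Python. This is a stand-in for illustration, not how hjsplit itself works: the function names, the 10 MB default chunk size and the `upload_one` callable (whatever uploads a single file in your setup) are all assumptions. Only the `.001`, `.002` naming follows the hjsplit example above.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def split_file(path, chunk_size=10 * 1024 * 1024):
    """Split `path` into numbered parts: name.001, name.002, ..."""
    src = Path(path)
    parts = []
    with src.open("rb") as f:
        index = 1
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            part = src.with_name(f"{src.name}.{index:03d}")
            part.write_bytes(chunk)
            parts.append(part)
            index += 1
    return parts


def join_files(first_part, out_path):
    """Concatenate name.001, name.002, ... into `out_path`, in order."""
    first = Path(first_part)
    index = 1
    with Path(out_path).open("wb") as out:
        while True:
            # first.stem drops the ".001" suffix, leaving the original name.
            part = first.with_name(f"{first.stem}.{index:03d}")
            if not part.exists():
                break
            out.write(part.read_bytes())
            index += 1


def upload_in_parallel(parts, upload_one, workers=4):
    """Call the caller-supplied `upload_one(part)` on several parts at once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(upload_one, parts))
```

On the client you would call split_file and then upload_in_parallel with your own upload function. On the server, join_files("filename.001", "filename") rebuilds the original once every chunk has arrived.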

Solution#2: Technically engaging, Robust and long term - more appropriate for corporate environments with continuing needs for heavy file transfer

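  • Change the TCP/IP settings on the file servers to allow a larger TCP window (window scaling and bigger send/receive buffers), so that more data can be in flight per round trip.
  • Use an FTP server and client built on the UDT protocol (UDP-based Data Transfer).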
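As a small taste of the TCP tuning side: on a long fat network, the setting that usually matters is how much data a connection may keep in flight, which is bounded by its socket buffers. The sketch below requests larger buffers for a single socket from Python. The 500,000-byte figure assumes a 20 Mbps link with a 200 ms round trip (20 Mbps × 0.2 s ÷ 8) and is purely illustrative. The function names are made up for this example.

```python
import socket


def make_socket_with_buffers(buffer_bytes: int) -> socket.socket:
    """Create a TCP socket and request larger send/receive buffers.

    Set the sizes before connecting so they can influence the window
    that gets negotiated for the connection.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buffer_bytes)
    return sock


def effective_buffers(sock: socket.socket) -> tuple:
    """Return the (send, receive) buffer sizes the OS actually granted."""
    return (
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
    )


# Assumed target: roughly the bandwidth-delay product of 20 Mbps at 200 ms.
sock = make_socket_with_buffers(500_000)
print(effective_buffers(sock))
sock.close()
```

The operating system is free to adjust or cap the request, which is why the sketch reads the values back. On Linux the ceilings come from system-wide settings such as net.core.rmem_max and net.core.wmem_max, so a server-side change is usually needed as well.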

How to achieve the above two solutions is a post for another time.