Copying large files from Linux to Windows

In my spare time, I was doing some OpenBMC Yocto builds on my Linux machines, and decided I wanted to copy these files over to my Windows PC. Little did I know how complicated this could be.

Building a BMC image with Yocto and the OpenBMC framework involves a huge number of files – 3,922,917 files taking up 120.4GB, to be precise – and the build can take hours to complete. Luckily, I have an 8-core, 16-thread Ryzen 1700X CPU in my desktop at home, so the build goes fairly quickly. Even though it’s a first-generation AMD Ryzen chip, having so many threads at my disposal makes it run circles around the same build on my Surface Pro laptop. The Surface does have a 4-core, 8-thread Intel Kaby Lake Refresh CPU, but Windows is my daily driver on it, which means I do my builds using Ubuntu on VirtualBox and need to dedicate some CPU resources to keeping Windows alive in the background.

So, it occurred to me to copy the openbmc folder from my Linux desktop over to my Windows laptop, so I could study the build contents while I am on the road traveling. Sometimes when you’re alone at night in a hotel, there’s not much else to do (besides watching TV), so I thought it could be a fun thing to occupy time. Now, there are probably a hundred ways of getting files from one machine to another, so this wouldn’t be particularly interesting except for the fact that I chose some wrong paths, and had to puzzle it out. So here goes.

Being a bit of a lazy person, I thought that using Dropbox was an easy choice. I use it all the time on Windows, having upgraded recently from the Basic plan with 2GB to Dropbox Plus with 2TB of storage. A cursory review indicated that Dropbox was supported on Ubuntu, but I learned that the support is poor. Attempts to install the application from the Ubuntu Software library seemed to succeed, but the application would not launch. Following the directions at https://help.dropbox.com/installs-integrations/desktop/linux-commands and using the Dropbox CLI also failed. I would see errors like this:

[Image: Dropbox failures]

My final attempt with Dropbox involved trying to use their web interface via Firefox. But the upload speed was terrible. It seemed like it would take a week to upload the 120GB folder. So, I ended up giving up on this.

“Plan B” involved copying the openbmc folder directly over to a Windows hard drive. Linux and Windows are in separate partitions on my desktop PC, and Linux can access the NTFS-formatted Windows hard drives. So, I just did a drag and drop, and it started up nicely. But, things went downhill from there:

[Image: Copy from Linux to Windows hard drive NTFS]

Alas, this too looked like it was going to take days and days, so I gave up on it as well. It was really strange: when the copy operation started up, it would run at more than 15MB/s, but then it would gradually decline in speed until it got down into the 300kB/s to 400kB/s range. The longer I ran it, the slower the transfer. At that rate, it might never complete.

I tried the same experiment with an NTFS-formatted USB stick, but it exhibited the same behavior as copying to the hard drive. Still too slow.
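To put "days and days" into numbers, here is a rough back-of-the-envelope sketch. It assumes decimal units (1GB = 1,000,000kB) and a constant transfer rate, which is optimistic since the observed rate kept falling. The helper name eta_days is made up for this illustration.

```shell
#!/bin/sh
# Estimate how many days a copy takes at a constant rate.
# Usage: eta_days <size-in-GB> <rate-in-kB/s>
# Assumption: decimal units, so 1 GB = 1,000,000 kB.
eta_days() {
    awk -v gb="$1" -v kbps="$2" \
        'BEGIN { printf "%.1f\n", gb * 1e6 / kbps / 86400 }'
}

eta_days 120.4 350      # 120.4GB at roughly 350kB/s (the slow phase)
eta_days 120.4 15000    # 120.4GB at 15MB/s (the initial burst)
```

At roughly 350kB/s the whole folder would need about four days, versus a couple of hours if the initial 15MB/s had held.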

Astute readers will have figured out my problem by now. Being lazy, I was simply trying to copy the unaltered folder from one file system to another. The sheer number of files (almost 4,000,000) and the total size of 120GB were slowing the copy down, since each file has to be created individually on the destination filesystem. And although I get a sense of satisfaction watching progress bars and indicators on my computer slowly creep towards completion (watching Yocto builds in progress is particularly gratifying), it was time to be more rigorous. So I compressed all four million files in the folder into one .tar.gz file, reducing the size from 120.4GB to 42.1GB. This sped up the copies from my Linux partition to Dropbox, the NTFS-formatted hard drive, and the USB stick tremendously.
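For reference, packing a folder into a single .tar.gz can be sketched as below. This is a minimal example, not necessarily the exact command I ran. The function name pack_dir and the paths in the usage comment are assumptions for illustration.

```shell
#!/bin/sh
# Pack a directory into one gzip-compressed tar archive.
# Usage: pack_dir <absolute-source-directory> <absolute-archive-path>
pack_dir() {
    src=$1
    out=$2
    # -C switches to the parent directory first, so the archive stores
    # relative paths ("openbmc/...") rather than absolute ones.
    # -c create, -z compress with gzip, -f write to the named file.
    tar -C "$(dirname "$src")" -czf "$out" "$(basename "$src")"
}

# Hypothetical usage with the folder from this post:
# pack_dir "$HOME/openbmc" "$HOME/openbmc.tar.gz"
```

One large archive means the destination filesystem only has to create a single file, instead of millions of small ones.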

But all of the above methods share the same indirection, which slows overall progress: once the compressed file is uploaded or copied, it still needs to be downloaded onto my Windows PC. After thinking about it a while, I decided that using a direct SSH connection between the two machines would probably be one of the fastest mechanisms.

To do this, I used PuTTY on my PC – I already had this set up some time ago. This gave me access to the PuTTY Secure Copy (PSCP) executable.

On my Ubuntu desktop machine, I installed the OpenSSH server and checked that it was running with the following commands:

$ sudo apt update

$ sudo apt install openssh-server

$ sudo systemctl status ssh

And the Ubuntu machine is now ready to respond to SSH requests to download files:

[Image: sudo systemctl status ssh]

It’s now easy to initiate the file transfer using PSCP.exe from the PuTTY installation:

[Image: SSH PSCP]
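As a rough illustration of what such a PSCP invocation looks like, the sketch below only prints a command line in PSCP's scp-style form (pscp [user@]host:source target); it does not run the transfer. The user name, IP address, and both paths are hypothetical placeholders, not the values from my setup.

```shell
#!/bin/sh
# Print (not run) a PSCP command line that pulls one file from the Linux box.
# All four values are hypothetical placeholders; substitute your own.
REMOTE_USER="user"
REMOTE_HOST="192.168.1.50"
REMOTE_FILE="/home/user/openbmc.tar.gz"
LOCAL_DIR='C:\Users\user\Downloads'

pscp_command() {
    # PSCP takes the scp-style form: pscp [user@]host:source target
    printf 'pscp.exe %s@%s:%s %s\n' \
        "$REMOTE_USER" "$REMOTE_HOST" "$REMOTE_FILE" "$LOCAL_DIR"
}

pscp_command
```

The printed line is what would be typed into a Windows Command Prompt opened in the PuTTY installation folder; PSCP then prompts for the Linux account's password.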

This process only took about three hours, much faster than the others.

Live and learn. Sometimes you have to try a lot of different things to gain experience.

There are a few other things I learned from playing with this:

Compressing the 120GB openbmc folder on Linux (Ubuntu 16.04 LTS) took ~ 45 minutes. The resulting .tar.gz file took up 42GB.

Copying the openbmc.tar.gz file from my Linux hard drive to my Windows partition hard drive took ~ 12 minutes.

Copying the openbmc.tar.gz file from my Linux hard drive to an NTFS-formatted 256GB USB3 flash stick took ~ 1 hour.

On Windows, copying the 42GB openbmc.tar.gz file from the USB stick to my boot SSD took ~ 3 minutes.

Uncompressing and extracting the openbmc.tar.gz file using 7-Zip on my Windows laptop took upwards of 2 hours. It is a two-step process, because a .tar.gz (or .tgz) file is really two formats: .tar is the archive, and .gz is the compression. The first step decompresses, and the second step extracts the archive. Quite tedious. I’ll look into better ways to do this in the future. Luckily, I have a 1TB boot SSD on my Surface Pro 6; otherwise I would have run out of space at some point during my experiments.
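One possible way around the two-step process is to let tar decompress and extract in a single pass, which also avoids writing the intermediate .tar file to disk. This is a sketch I have not timed against 7-Zip. It assumes a tar command is available on the Windows side (for example via WSL, Git Bash, or the tar.exe bundled with recent Windows 10 releases), and the function name unpack_archive is made up for illustration.

```shell
#!/bin/sh
# Decompress and extract a .tar.gz in one pass, without an intermediate .tar.
# Usage: unpack_archive <archive.tar.gz> <destination-directory>
unpack_archive() {
    archive=$1
    dest=$2
    mkdir -p "$dest"
    # -x extract, -z gunzip on the fly, -f read the named archive,
    # -C unpack into the destination directory.
    tar -xzf "$archive" -C "$dest"
}

# Hypothetical usage:
# unpack_archive openbmc.tar.gz ./extracted
```

The function wrapper needs a POSIX shell, but the underlying tar -xzf invocation is the same when typed directly at a command prompt.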

As you can see, I did end up circling back and doing the direct copies of the compressed file (instead of the uncompressed folder) to the Windows hard drive and the USB stick, with some good results. As you might expect, copying directly to local media is faster than using SSH over my LAN between the two machines.
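The timings above translate into average throughput as sketched below. The figures are approximate: the sketch assumes decimal units (1GB = 1000MB), uses the 42.1GB archive size for every transfer, and assumes the three-hour PSCP run moved that same archive. The helper name rate_mb_per_s is hypothetical.

```shell
#!/bin/sh
# Average throughput in MB/s for a transfer of <GB> that took <minutes>.
# Assumption: decimal units, so 1 GB = 1000 MB.
rate_mb_per_s() {
    awk -v gb="$1" -v min="$2" \
        'BEGIN { printf "%.1f\n", gb * 1000 / (min * 60) }'
}

rate_mb_per_s 42.1 12     # Linux hard drive -> Windows partition hard drive
rate_mb_per_s 42.1 60     # Linux hard drive -> USB3 flash stick
rate_mb_per_s 42.1 3      # USB stick -> boot SSD (on Windows)
rate_mb_per_s 42.1 180    # PSCP over the LAN
```

That works out to roughly 58MB/s, 12MB/s, 234MB/s, and 4MB/s respectively, so the LAN transfer was the slowest of the four by a wide margin.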

Doing the same thing with Dropbox, however, did not yield such great results. Again using the Firefox web interface, it ran for about an hour, and then this happened:

[Image: Dropbox machine check]

A nasty machine check error! Maybe I’ll research this in a later blog post.

Alan Sguigna