Avoiding Being “Bit”ten: Bandwidth Issues With Cloud Computing Backups

As attorneys consider using cloud computing for file backup, the time required to restore files after a disaster may [unpleasantly] surprise a law firm. Backups in cloud storage may take days to download, depending on the speed of the internet connection and the amount of data. Anticipating the potential download times, and creating a plan, may help a law firm avoid unexpected problems should a disaster occur. (It also confirms that off-site cloud storage should be combined with local backups to minimize downtime and law firm disruption in the event of a catastrophic data loss.)

What Is the Issue? Large Amounts of Data Accumulate Over Time

Massive data storage capacity is now cheap, plentiful, and presumed. Terabyte-and-beyond hard drives and “infinite”-capacity “cloud” drives are readily available. Over time, even a small firm can easily accumulate gigabytes of data, and even terabytes are no longer abnormal. So what is a gigabyte or a terabyte in a practical sense?

Today, the basic increments of data storage, in order from small to large, are:

Unit             Bytes Per Unit       Equivalent To
kilobyte [kB]    1,000                1,000 bytes
megabyte [MB]    1,000,000            1,000 kilobytes
gigabyte [GB]    1,000,000,000        1,000 megabytes
terabyte [TB]    1,000,000,000,000    1,000 gigabytes

This table shows the progressive magnitude of basic units of data. The lesson: gigabytes and terabytes represent massive stores of data, despite the humdrum or casual way these quantities are discussed. A terabyte is a lot of data: a trillion bytes, to be exact. Even the smaller gigabyte is large. (For reference, a standard DVD holds about 4.7GB of data. One terabyte equals the data on over 210 DVDs!) Thus, in a practical sense, storing these large units of data is a formidable task simply because of the sheer magnitude of the data sets. Before turning, however, to moving these data sets to and from off-site cloud storage, a brief discussion of bits and bytes is necessary.

Distinguishing Bits and Bytes

Distinguishing between the similar-sounding bit and byte is important to fully understand data transfer. Bits and bytes are frequently, but incorrectly, used interchangeably. An analogy may help to grasp the important distinction between the two. Think of a bit as an individual letter and a byte as a collection of letters (or a word). For our purposes, one byte (word) generally consists of 8 bits (letters). Bits and bytes are especially important in discussing online backup because 1) hard drive capacity and data storage are often quoted in byte increments while 2) bandwidth (internet connection capacity or “speed”) is generally quoted in bit increments. [You may want to re-read this because it is very important.] While you can convert between the two, knowing that a bit is smaller than a byte (one-eighth the size) is essential.

Bandwidth Describes the Capacity of an Internet Connection

A simple metaphor may help you understand Internet bandwidth. Bandwidth describes the relative capacity to transmit data and is somewhat similar to gallons per minute describing the flow of water to a faucet. Five gallons per minute describes the capacity of a faucet, or how much water can flow in a given period of time. Similarly, Internet bandwidth describes the amount of data that can “flow” through the Internet connection in a given period of time. Typically, Internet bandwidth is measured in kilobits or megabits per second.[FN1] Note the emphasis on the term bits: the measure of bandwidth is typically in bits-per-second, not bytes-per-second.

So, how long does it take to transfer a file? Assume a small, 1.5 megabyte, word processing file and assume a 1.5 megabit-per-second internet connection. How long will the transfer take? One second?

This is not a trick question; the question illustrates an important concept. No, one second is not correct. Under “perfect” conditions, it takes at least 8 seconds to move the file because you are moving 1.5 megabytes of data through a 1.5 megabit-per-second connection, and each byte is 8 bits.[FN2]

How did I get 8 seconds? A 1.5 megabit-per-second connection converts to approximately 0.187 megabytes-per-second (more easily expressed as about 187 kilobytes-per-second). To make the conversion, take 1,500,000 bits-per-second (1.5 megabits, the quoted bandwidth) and divide by 8 (the number of bits per byte); the result is 187,500 bytes-per-second. Divide the 187,500 bytes by 1,000 to get 187.5 kilobytes-per-second, or by 1,000,000 to get 0.1875 megabytes-per-second. Now, the file size is 1.5 megabytes. We need to transfer 1.5 megabytes of data through a connection at about 0.187 megabytes-per-second (remember, we already converted the bandwidth from megabits to megabytes). So, 1.5 megabytes divided by 0.1875 megabytes-per-second equals 8 seconds.
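For readers who prefer to check the arithmetic themselves, the conversion above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical (chosen for this example), and the sketch assumes decimal units (1 megabit = 1,000,000 bits) and the “perfect” conditions described above.

```python
BITS_PER_BYTE = 8  # one byte is 8 bits, as discussed above


def megabits_to_megabytes_per_second(megabits_per_second):
    """Convert a quoted bandwidth in megabits/s to megabytes/s."""
    return megabits_per_second / BITS_PER_BYTE


def transfer_seconds(file_megabytes, megabits_per_second):
    """Best-case seconds to move a file (ignores overhead and throttling)."""
    file_megabits = file_megabytes * BITS_PER_BYTE
    return file_megabits / megabits_per_second


print(megabits_to_megabytes_per_second(1.5))  # 0.1875
print(transfer_seconds(1.5, 1.5))             # 8.0
```

The second function converts the file size to bits rather than converting the bandwidth to bytes; either direction gives the same answer, as long as both numbers end up in the same unit.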

For the 1.5 megabyte file size in the illustration, the difference in transfer time is fairly minor; for gigabytes or terabytes of data (look at the table above), the difference in transfer times can be startling (see below) due to the high orders of magnitude.

A Hypothetical Disaster Data Recovery Scenario

Using the above information, assume a law firm has accumulated 50GB (gigabytes) of data with a cloud-based backup provider. The firm subscribes to a 1.5 megabit-per-second internet connection. The hard drive on the local server fails (with no reasonable possibility of recovery). The law firm breathes a sigh of relief: thank goodness for the online backup. The relief is warranted because the local server data, presumably, can be recovered from the cloud backup. However ...

Let’s look at the numbers (remember the order of magnitude comment above). Assume the law firm has 50GB (50,000,000,000 bytes) of data. Converting from bytes to bits, we get 50,000,000,000 bytes * 8 bits which equals 400,000,000,000 bits. Now, even assuming a hypothetically “perfect” internet connection at 1,500,000 bits-per-second (1.5 megabits), we get:

  • 400,000,000,000 file bits to transfer / 1,500,000 bits-per-second via the internet connection = 266,666.67 seconds OR
  • 266,666.67 seconds / 60 seconds per minute = 4,444.44 minutes OR
  • 4,444.44 minutes / 60 minutes per hour = 74.07 hours OR
  • 74.07 hours / 24 hours per day = 3.09 days

Surprised? Shocked? And this example assumes, for simplicity, “perfect” conditions which are impossible to achieve in real life. [See FN3 for even more startling issues.] Even if we increase the bandwidth to 10 megabits-per-second by buying more bandwidth from the internet service provider, a rough calculation still shows approximately 11.1 hours required to download the 50GB of files.[FN4]
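The same calculation can be sketched in Python so that other data sizes and connection speeds can be tried. This is a rough sketch under stated assumptions, not a substitute for testing an actual restore: the function names are hypothetical, units are decimal (1GB = 1,000,000,000 bytes), and the optional efficiency parameter is an assumed fudge factor for the fraction of quoted bandwidth actually achieved (1.0 means the “perfect” conditions used above).

```python
def restore_seconds(data_gigabytes, megabits_per_second, efficiency=1.0):
    """Estimated seconds to download a backup of the given size.

    efficiency is an assumed fraction (0 < efficiency <= 1) of the quoted
    bandwidth that is actually available; 1.0 models "perfect" conditions.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be greater than 0 and at most 1")
    total_bits = data_gigabytes * 1_000_000_000 * 8
    bits_per_second = megabits_per_second * 1_000_000 * efficiency
    return total_bits / bits_per_second


def describe(seconds):
    """Express a duration in hours and days, rounded to two decimals."""
    hours = seconds / 3600
    days = hours / 24
    return f"{hours:.2f} hours ({days:.2f} days)"


print(describe(restore_seconds(50, 1.5)))  # 74.07 hours (3.09 days)
print(describe(restore_seconds(50, 10)))   # 11.11 hours (0.46 days)
```

Passing an efficiency below 1.0 (for example, 0.5 to assume only half the quoted bandwidth is usable) lengthens the estimate proportionally, which is closer to what FN3 describes.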

Furthermore, the download time alone also does not necessarily mean full recovery to the point where you are back-in-business. Time would be needed to re-build the server (for example, replacing the hard drive, restoring the operating system, and restoring programs) and to manage the downloads. Thus, even a fairly small 50GB storage requirement could result in days of downtime.[FN5]

The Take-Away: Disaster Recovery from Cloud Backups Should Take Download Time into Account

A disaster recovery plan should take into account bandwidth issues if cloud computing backups are used. As the amount of data stored in the cloud grows, the importance of planning also grows since restoration time will increase significantly. The analysis above also indicates that a more tenable solution might involve both a local backup (for example, an external hard drive) and a remote [cloud] backup. The dual-mode backups create some redundancy and potentially minimize downtime by allowing use of the local drive for most of the restoration (and supplementing with the online documents as needed). In any case, the lawyer exploring cloud backups should be aware of these issues so he or she can plan accordingly.
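For planning purposes, the calculation can also be run in reverse: given how much downtime the firm can tolerate, how much sustained bandwidth would a full cloud restore require? The Python sketch below is illustrative only. The function name is hypothetical, the 8-hour window is an assumed example rather than a recommendation, and the result is a best-case figure that ignores the overhead and throttling discussed in the footnotes.

```python
def required_megabits_per_second(data_gigabytes, max_hours):
    """Sustained bandwidth (megabits/s) needed to download data in max_hours."""
    if max_hours <= 0:
        raise ValueError("max_hours must be positive")
    total_bits = data_gigabytes * 1_000_000_000 * 8
    bits_per_second = total_bits / (max_hours * 3600)
    return bits_per_second / 1_000_000


# Assumed example: restore 50GB within a single 8-hour business day.
print(round(required_megabits_per_second(50, 8), 1))  # 13.9
```

If the required figure is well above what the firm's connection realistically delivers, that is a signal to rely on a local backup for the bulk of any restoration.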

New Development Addendum: Shipping Initial Backup on Hard Drive

MozyProCloud, apparently to mitigate the bandwidth issues on the front end, offers a hard drive backup program. Shipping the initial backup on a hard drive helps reduce the potentially extensive bandwidth (and time) requirements of that first upload. [See FN6] Lawyers should pay particular attention to the encryption scheme apparently employed by the backup: a double-encryption process with data secured first by the data owner’s chosen key and then also locked by Mozy’s key. See why this is important at Navigating the Fog of Cloud Computing; Cloud Computing: Who Holds the Encryption Keys? [And Why It May Matter to Lawyers]; and Storing Files in the Cloud: Storage-as-a-Service for Lawyers—Encryption.

Footnotes

FN1—Some refer to bandwidth as the “speed” of the connection. Technically, this is not really accurate because the speed is influenced by many factors not necessarily related to the bandwidth.

FN2—The explanation is simplified for laypersons. Technically, the full 1.5 megabits would not be available for file transfer, even under perfect conditions, because some bandwidth is consumed by the overhead required to maintain the connection and is lost to line noise, collisions, network congestion, intentional bandwidth throttling, and other issues.

FN3—While I tried to keep the analysis simple, one cannot overlook the fact that the illustration is a best-case scenario because bandwidth available for transfer will never match the quoted maximums (1.5Mb-per-second is the quoted bandwidth, not the actual). Furthermore, the internet service provider and/or the cloud provider may throttle bandwidth to avoid over-taxing the system; they generally are not going to allow you to dominate the bandwidth to the detriment of others. Thus, realistically, even the 1.5Mb-per-second connection will cap out at much lower rates. I regularly see sub-225 kilobits-per-second during file transfers even though I have a 10Mb-per-second connection. At a 225 kilobit-per-second rate, the 50GB download in the illustration would take more than 20 days to complete!

FN4—While I used my own spreadsheet to calculate the figures used here, several online bandwidth calculators allow you to easily estimate bandwidth scenarios. See, for example, Bandwidth Calculator or Data Transfer Speed Calculator.

FN5—The scenario is for illustration purposes. In actuality, one might be able to download the remote files while the server is restored/re-built. Further, the cloud backup might be better suited to restoring after isolated, inadvertent file deletions or to partial restores; I do not mean to imply that a restore must be all-or-nothing. The point, however, is not to quibble over potential mitigation but to illustrate a real factor in cloud backups: a full restore may take significantly more time to complete than one might assume.

FN6—Lucas Mearian, Mozy Ships Hard Drives to Cloud Backup Customers, CIO Magazine, (Sept. 23, 2011) available at http://www.cio.com/article/690326/Mozy_Ships_Hard_Drives_to_Cloud_Backup_Customers?source=CIONLE_nlt_insider_2011-09-26&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+cio%2Ffeed%2Fdrilldowntopic%2F3028+%28CIO.com+-+Data+Center%29.

Original Publication: 07 April 2011
Revised: 26 September 2011