A little over a year ago I attended a Cloud conference sponsored by a major university. On the panel were CIO-class folk from large finance and health care companies, education, and state government. The panel was talking about what was keeping them from going to the Cloud. They gave the usual answers: concerns about security, performance and availability, and the potential for loss of control. But the biggest concern was security. How could they convince their stakeholders that their data was secure? To these people, security meant ensuring the confidentiality, integrity and availability of their data.
- Confidentiality means that only those people who are supposed to see their data can see it.
- Integrity means that only authorized processes are allowed to modify data and only in very specific ways. For example, it means that the transaction I send to the Cloud arrives unchanged at the service provider, and the response comes back to me unmodified. It means data stored in my archive hasn’t been changed while it is just sitting there for years.
- Availability means that the data is accessible when needed.
Some of the panelists also talked about specific Cloud Service Providers they had approached with their concerns, and about those CSPs’ responses and promises. One mentioned that they liked the ease of use and low cost of Amazon, at least for some applications. That comment garnered some fairly negative reactions from the others on the panel, including, “Would you trust your IT to a book seller?” That generated a good laugh from the panel and audience.
However, that answer may now be “yes.” Earlier this month, Amazon announced a new encryption feature that allows you to encrypt data at rest stored in Amazon S3. Amazon S3 is “Amazon Simple Storage Service,” designed to make web-scale computing easier for developers. “Amazon S3 provides a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, secure, fast, inexpensive infrastructure that Amazon uses to run its own global network of web sites. The service aims to maximize benefits of scale and to pass those benefits on to developers.” There are not many businesses that need web services on a larger scale than what Amazon uses in its own business.
Amazon S3 allows the storage of arbitrary “objects.” These might be documents, database files, images, music or movies, software, or anything else you can store on a computer. Each individual object can be up to 5 terabytes in size and can have up to 2 kilobytes of metadata that describes the object. The owner assigns each object to a “bucket.” Each bucket belongs to an Amazon Web Services (AWS) account.
This is the Cloud. You pay for what you use. The pricing is fairly complex, based on the amount of storage you are using, the amount of data that is moving out of Amazon S3, and the number of transactions per day. A back-of-the-envelope calculation says you could have a lot of activity against a total of 1 terabyte of data for less than $200 per month, with full redundancy.
Amazon S3 supports four different access control mechanisms that allow you to control who can access your data as well as how, when, and where they can access it.
- Identity and Access Management policies let you give different individuals different access rights.
- Access Control Lists allow you to selectively grant certain permissions on specific stored objects.
- Bucket Policies allow finer access control to individual objects within a single bucket.
- Query String Authentication allows you to share data objects through URLs that are time-limited.
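To make the last item concrete, here is a minimal sketch of how a time-limited, signed URL can work in general. It is not Amazon's actual signing format: the key, the `Expires` and `Signature` parameter names, the message layout, and the example URL are all assumptions chosen for illustration. The idea is that the owner signs the path and an expiry time with a secret, and the service refuses any request that has expired or whose signature does not match.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical secret known only to the data owner and the storage service.
SECRET_KEY = b"hypothetical-secret-key"


def sign_url(base_url: str, expires_at: int, key: bytes = SECRET_KEY) -> str:
    """Append an expiry timestamp and an HMAC-SHA256 signature to base_url."""
    path = urlparse(base_url).path
    message = f"GET\n{path}\n{expires_at}".encode()
    signature = hmac.new(key, message, hashlib.sha256).hexdigest()
    query = urlencode({"Expires": expires_at, "Signature": signature})
    return f"{base_url}?{query}"


def verify_url(url: str, now: int, key: bytes = SECRET_KEY) -> bool:
    """Accept the URL only if it has not expired and the signature matches."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    try:
        expires_at = int(query["Expires"][0])
        signature = query["Signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires_at:
        return False  # the link has timed out
    message = f"GET\n{parsed.path}\n{expires_at}".encode()
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(signature.encode(), expected.encode())
```

Because the expiry time is part of the signed message, someone who receives the link cannot extend its life by editing the timestamp; the signature would no longer match.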
Amazon S3 uses checksums stored with the data to periodically verify the integrity of your data. If Amazon S3 detects data corruption, it automatically repairs it using redundant data.
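The general technique is easy to sketch. The example below is an illustration under assumptions, not a description of Amazon's internals: it assumes SHA-256 as the checksum, keeps several replicas of an object as plain byte strings, and uses hypothetical function names. Any replica whose checksum no longer matches the stored value is replaced with a copy that still verifies.

```python
import hashlib
from typing import List


def checksum(data: bytes) -> str:
    """Checksum stored alongside the object (SHA-256 is an assumption here)."""
    return hashlib.sha256(data).hexdigest()


def scrub(replicas: List[bytes], stored_checksum: str) -> List[bytes]:
    """Return the replicas with every corrupted copy replaced by a healthy one."""
    healthy = next((r for r in replicas if checksum(r) == stored_checksum), None)
    if healthy is None:
        # Redundancy only helps while at least one good copy survives.
        raise ValueError("all replicas are corrupted; cannot repair")
    return [r if checksum(r) == stored_checksum else healthy for r in replicas]
```

Running `scrub` periodically over every stored object is what turns a passive checksum into active protection for data that sits untouched for years.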
In terms of availability and disaster recovery, it will be hard to beat what Amazon S3 can provide: 99.99% availability over a year. More importantly, Amazon S3 is designed to provide 99.999999999% durability and survive the concurrent loss of data in two facilities.
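It helps to turn those percentages into everyday numbers. The short calculation below assumes that the availability figure is a fraction of the minutes in a 365-day year, and that the durability figure is the chance that any one object survives a given year; the function names are mine, and the 10,000-object example is only an illustration.

```python
from decimal import Decimal

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(availability: str) -> Decimal:
    """Minutes per year the service may be unreachable at this availability."""
    return (1 - Decimal(availability)) * MINUTES_PER_YEAR


def expected_annual_losses(object_count: int, durability: str) -> Decimal:
    """Expected number of objects lost per year at this per-object durability."""
    return (1 - Decimal(durability)) * object_count


downtime = allowed_downtime_minutes("0.9999")
losses = expected_annual_losses(10_000, "0.99999999999")
```

Under those assumptions, 99.99% availability allows a little under 53 minutes of downtime per year, and storing 10,000 objects at eleven nines of durability means you would expect to lose a single object about once every 10 million years.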
What is left to worry about is confidentiality. Enter Amazon S3 Encryption.
Amazon S3 Encryption has two options: server-side encryption, which is managed by Amazon, and client-side encryption, which is managed by you. In either case, you can use SSL encryption to protect data being uploaded to or downloaded from Amazon S3. SSL (HTTPS) is the same encryption you use for your on-line banking and other secure on-line applications.
The server-side encryption uses AES-256. Also known as Rijndael, AES is a block cipher encryption standard adopted by the U.S. and other governments. It has been analyzed extensively and is now used widely worldwide, including in defense applications. AES is one of the most popular algorithms used in symmetric-key cryptography. “AES” is often followed by a number, as in “AES-256”, which indicates the length of the key in bits. The longer the key, the harder it is to break the encryption without the key. Each object has a unique key, and these keys are themselves encrypted with a master key, which is periodically changed. This rekeying means that someone without current valid credentials will not be able to access an object using information obtained before the rekeying.
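The key hierarchy described above, a unique key per object wrapped by a master key that is changed from time to time, can be sketched in a few lines. This is a sketch of the structure only, and it rests on loud assumptions: the cipher here is a toy keystream built from SHA-256 so that the example runs with nothing but the standard library, it is a stand-in for AES-256 and must not be used to protect real data, and every function and field name is hypothetical. How Amazon actually performs its rekeying is not something this example claims to show.

```python
import hashlib
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Stand-in for AES-256 only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """XOR the plaintext with a keystream; a fresh random nonce is prepended."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    """Reverse toy_encrypt. A wrong key yields garbage rather than an error."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


def store_object(master_key: bytes, data: bytes) -> dict:
    """Encrypt data under its own 256-bit key, then wrap that key."""
    object_key = secrets.token_bytes(32)
    return {
        "wrapped_key": toy_encrypt(master_key, object_key),
        "ciphertext": toy_encrypt(object_key, data),
    }


def read_object(master_key: bytes, record: dict) -> bytes:
    """Unwrap the object key with the master key, then decrypt the data."""
    object_key = toy_decrypt(master_key, record["wrapped_key"])
    return toy_decrypt(object_key, record["ciphertext"])


def rotate_master_key(old_master: bytes, new_master: bytes, record: dict) -> dict:
    """Re-wrap the object key under a new master key; the data is untouched."""
    object_key = toy_decrypt(old_master, record["wrapped_key"])
    return {
        "wrapped_key": toy_encrypt(new_master, object_key),
        "ciphertext": record["ciphertext"],
    }
```

The design point is that changing the master key only requires re-wrapping the small per-object keys, not re-encrypting terabytes of stored data. A real implementation would use AES-256 from a vetted cryptographic library, ideally in an authenticated mode, instead of the toy cipher.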
For many applications, this server-side encryption for data at rest, coupled with SSL for data in motion, will be sufficient. However, some certifications and perhaps your company’s policy require that you manage the encryption keys. While there is no reason not to trust Amazon’s employees, at least a few of their estimated 33,700 employees would have access to those keys and could, either maliciously or accidentally, capture and decrypt your data. For those cases, Amazon S3 offers client-side encryption. In this case, you create and manage the keys. Amazon never sees unencrypted data and never has access to the keys. You could safely store your most valuable proprietary information in a Cloud managed by your biggest and most evil competitor.
The downside is that you must manage the keys yourself. Key management is not an easy task. It requires that you have the right processes and procedures and have vetted the right employees to manage them. You must make sure that the keys never escape to someone who shouldn’t have them. Just “giving” a key to an employee for their legitimate use is complicated because you have to protect the key throughout that transportation process. Mess this part up and you risk giving someone keys to data they should not be able to access.
More importantly, you must make sure you never lose a key. No matter what happens (storage failure, accidental or deliberate action by an employee or contractor, building failure), you must never lose the keys. Lose the keys, lose the data.
In the early days of World War II, as Winston Churchill was approaching Paris on his last visit just before the Germans took the city, he remarked that it was sad to see the center of Paris burning. What he actually saw was the many plumes of black smoke from all the embassies and French government offices burning their papers. If the data had been stored electronically and encrypted, it would have taken only a few seconds to destroy the keys and the data would have been rendered useless, with no environmental impact. If you decide to no longer use Amazon S3 for your data storage, copy the data to your new storage infrastructure and destroy the old keys. No one can recover that data. You can just walk away from it. It does not matter if Amazon deletes some or all of the data, just assigns the storage space to a new customer without wiping the disk, or tries to sell the drives on eBay. Again, I have no expectation that Amazon would act irresponsibly, but without the keys it really does not matter.
Amazon S3 encryption should allow you to solve many of your data security problems, inexpensively. Most privacy laws around the world do not count data as really being lost if it was encrypted. You should be able to, for example, use Amazon S3 encryption as part of a compliant solution for HIPAA (personal health) data. I doubt you could put together a total solution that would be PCI compliant (credit, debit, and ATM cards). However, it might be possible to use it as part of specific process steps.
Make sure you thoroughly test the performance of any Cloud implementation, including Amazon S3, before you put it into production.
The last word:
This is yet another example of how fast the Cloud is evolving. Cloud Service Providers and tool vendors are improving their products and creating new offerings at an astounding rate. A rating of a product or company that is more than six months old may actually be useless. If you avoided putting an application in the Cloud last year because the Cloud was not ready, you might want to look again.
Keep your sense of humor.