Cloud storage and on-premises storage are both great options for storing, serving, and protecting people data – but there are four key differences that you should be aware of before making a choice: availability, data protection, performance, and compliance. The key word is different – not better, not worse, just different.
To set the stage: by cloud storage, I mean hosted object stores like Amazon S3, IBM SmartCloud Enterprise, and AT&T Synaptic Storage. By on-premises storage, I mean all file-oriented storage, including NAS, clustered NAS, unified storage, and object storage. Although there are clear technological differences among these types of on-premises storage, many of those differences are erased when deployment is considered.
1. Cloud storage is more highly available than many on-premises storage deployments, with much less complexity
Cloud storage is an increasingly attractive choice for people data, as prices continue to come down and the services mature. The economics of paying a consumption bill versus buying physical infrastructure and then providing space, power and bandwidth are understood. But what else needs to be considered? Cloud storage availability can often exceed a typical customer environment because of the inherent replication of each object across multiple physical machines, and typically across multiple sites. This provides continued availability in the event of disk, array or site failure. For example, Amazon’s S3 cloud storage does the following:
The cloud storage methodology of object replication provides availability across all the failure scenarios mentioned above with nearly instant failover. In a failure state, the network simply re-routes the application (e.g. the Oxygen client) to an identical object replica in a different location.
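The failover behavior described above can be illustrated with a minimal sketch. In a real cloud service the re-routing happens inside the provider's network, not in the client; the version below moves that logic into client code purely to make the idea visible. Every name here (ReplicaUnavailable, read_with_failover, make_replica, the site labels) is hypothetical and not part of any vendor's API.

```python
from typing import Callable, Dict, Sequence


class ReplicaUnavailable(Exception):
    """Raised when a replica (disk, array, or site) cannot serve a request."""


def read_with_failover(key: str, replicas: Sequence[Callable[[str], bytes]]) -> bytes:
    """Return the object from the first replica that responds."""
    last_error = None
    for fetch in replicas:
        try:
            return fetch(key)
        except ReplicaUnavailable as err:
            last_error = err  # remember the failure and try the next replica
    raise ReplicaUnavailable(f"no replica could serve {key!r}") from last_error


def make_replica(name: str, objects: Dict[str, bytes], healthy: bool = True):
    """Build a stand-in for one storage site holding identical object copies."""
    def fetch(key: str) -> bytes:
        if not healthy:
            raise ReplicaUnavailable(f"{name} is down")
        return objects[key]
    return fetch


# Two sites hold the same object; the first one is simulated as failed.
objects = {"report.xlsx": b"quarterly numbers"}
replicas = [
    make_replica("site-a", objects, healthy=False),
    make_replica("site-b", objects),
]
data = read_with_failover("report.xlsx", replicas)
print(data)
```

Because every replica holds an identical copy of the object, the caller gets the same bytes whichever site answers, which is why failover can be nearly instant.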
On-premises storage still offers many benefits as costs continue to decrease and advanced functionality is more and more common in even lower end products. On-premises storage can be configured for very high degrees of availability, but this is often at the array level. Multi-array and multi-site availability with on-premises storage still requires a great deal of expertise and cost.
Why? Multi-site availability obviously requires a second data center located in a different geography and all the cost and complexity it entails. It also requires redundant hardware. NAS will require equal or nearly equal capacity to support data replication. Object storage, while able to support replication at a more granular level, is similarly constrained. Multi-array availability is more straightforward with object storage and clustered NAS, but still requires the proper amount of physical “heads” (storage and metadata controllers) and proper network configuration. Conventional NAS relies on in-the-box redundancy and typically has to fall back on the recovery of snapshots when an array fails, which clearly degrades the failover time.
Beyond the technology, customers often don’t know how to classify people data (is it mission critical? business critical? neither?). In most businesses, non-mission critical data doesn’t get continuous replication within a facility, let alone across facilities. This reinforces the reliance on data protection techniques like RAID mirroring or backup, which are either inefficient or require significant recovery time, making them unsuitable for availability. The simplified availability techniques of cloud storage are a significant advantage.
2. On-premises storage still has superior flexibility over data protection and recoverability, but the right application can work with cloud storage to create a different and powerful protection profile
On-premises storage gives the educated customer a huge range of options around service level and degrees of data protection. From higher-end file storage like IBM SONAS, which can house many CPUs, lots of RAM, and hundreds of solid-state disks, to lower-end arrays housed in a single box, it is now common to apply different classes of storage media and protection schemes, even within a single system, to address different types of applications and types of people data. For example, the IBM Storwize V7000 Unified can have a mix of storage media internally and move data around those storage tiers by policy, even across different arrays.
The flexibility of on-premises storage, which supports file system replication, tiering, snapshots, and backup, at increasingly granular levels, enables IT organizations to create a high degree of data protection, disaster recovery and point-in-time recovery (also called business continuity).
By contrast, with cloud storage, granular data recoverability is the customer’s responsibility. While data durability, by virtue of having at least three immutable copies of each object with regular background error checking, can be very high (“11 nines” is often quoted), cloud storage does not promise the recoverability of older versions of data, or accidentally deleted data. For example, AT&T Synaptic Storage as a Service:
This is an area where the application or the technology that overlays cloud storage can help. For example, because Oxygen stores a new object for each version of data, we support user recovery of an unlimited number of versions of data, as well as “soft deleted” data. This eliminates the need for traditional backup for people data recovery (business continuity) and the associated IT overhead.
While the complementary pairing of an application like Oxygen with cloud storage produces a new, efficient type of data protection profile, the same can be applied to on-premises storage with similar results.
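The idea of storing a new object for each version, combined with soft deletion, can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not Oxygen's actual implementation: the class name VersionedStore and its methods are hypothetical, and a real system would write each version as an immutable object in cloud or on-premises storage.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class VersionedStore:
    """Keeps every version of each key; deletes only hide the key."""
    _versions: Dict[str, List[bytes]] = field(default_factory=dict)
    _deleted: Set[str] = field(default_factory=set)

    def put(self, key: str, data: bytes) -> int:
        """Append a new immutable version and return its 1-based number."""
        self._versions.setdefault(key, []).append(data)
        self._deleted.discard(key)  # writing again makes the key visible
        return len(self._versions[key])

    def get(self, key: str, version: Optional[int] = None) -> bytes:
        """Return the latest version, or a specific older one."""
        if key in self._deleted or key not in self._versions:
            raise KeyError(key)
        history = self._versions[key]
        if version is None:
            return history[-1]
        if not 1 <= version <= len(history):
            raise IndexError(f"{key!r} has no version {version}")
        return history[version - 1]

    def delete(self, key: str) -> None:
        """Soft delete: hide the key but keep all of its versions."""
        if key not in self._versions:
            raise KeyError(key)
        self._deleted.add(key)

    def undelete(self, key: str) -> None:
        """Make a soft-deleted key visible again."""
        self._deleted.discard(key)


store = VersionedStore()
store.put("plan.docx", b"draft")
store.put("plan.docx", b"final")
first = store.get("plan.docx", version=1)   # recover the older version
store.delete("plan.docx")                   # accidental delete
store.undelete("plan.docx")                 # user-driven recovery
latest = store.get("plan.docx")
```

Since no write ever overwrites an earlier object, recovering an old version or an accidentally deleted file is a lookup, not a restore from backup.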
3. On-premises storage offers significantly better performance, but cloud storage locality is closing the gap
A foremost benefit of on-premises storage is data placement. Data locality is still important for many applications, particularly ones that deal with large files like media or Excel files used for detailed financial analysis. Storage on the local area network is considerably faster than cloud storage. On-premises network storage can easily reach 40MB/s for reads and writes over the LAN, whereas cloud storage is severely limited by network bandwidth, particularly in upload (write) scenarios.
Cloud storage makes up some of the difference by being available in nearly every major geography, all without a customer having to worry about putting storage hardware in a place with adequate power, cooling and bandwidth. The increasingly global presence of cloud storage can help support data locality for improved performance, particularly with read operations, which in many businesses make up the majority of people data interaction.
In-device storage is significantly faster than either on-premises or cloud storage. With standard in-device SSD, 200MB/s of read/write and microsecond response time is now possible. With this in mind, a modern people data technology must be able to take advantage of in-device storage when necessary and on-premises or cloud storage when lower performance is acceptable or shared access is required. Oxygen begins to strike this complex balance by making a file available in-device when a person needs it (selective sync) and keeping that marked file updated automatically if it is subsequently changed on another device or by another individual. Files that are not needed don’t take up capacity in the device, instead leveraging the larger capacity on-premises or cloud storage. Oxygen users can even configure certain files to use on-premises storage based on performance needs (via creating specific spaces tied to on-premises storage), and other files to use cloud storage.
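To put the throughput figures quoted in this section side by side, here is a small back-of-the-envelope calculation. The 40MB/s and 200MB/s rates come from the text above; the 1,000MB file size is a hypothetical example, and the function name transfer_seconds is made up for illustration. No cloud figure is included because the article does not give one; it depends on the customer's network bandwidth.

```python
def transfer_seconds(size_mb: float, throughput_mb_per_s: float) -> float:
    """Idealized transfer time: size divided by sustained throughput."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_mb / throughput_mb_per_s


FILE_MB = 1000  # hypothetical large media file, about 1GB

RATES_MB_PER_S = {
    "on-premises storage over the LAN": 40,   # figure quoted in the text
    "in-device SSD": 200,                     # figure quoted in the text
}

for label, rate in RATES_MB_PER_S.items():
    print(f"{label}: {transfer_seconds(FILE_MB, rate):.0f} s")
# on-premises storage over the LAN: 25 s
# in-device SSD: 5 s
```

This ignores latency, protocol overhead, and contention, so real numbers will be worse, but it shows why keeping a frequently used large file in-device can matter.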
4. Perceived and real compliance gaps are being addressed quickly, but the right applications are needed for both cloud storage and on-premises storage
One of the main objections to cloud storage is compliance. Certainly, on-premises storage offers the flexibility of exact location, tailored security architecture, specific governance criteria and screening and training of administrators. There’s also the comfort of being able to walk over to the data center and pull the plug if necessary. But those that haven’t looked at cloud storage for a year or two should look again.
Cloud storage is maturing rapidly in the area of compliance, with leading vendors obtaining deeper and deeper certifications, particularly around standards of governance and administration. Those that question the rigor of SSAE-16 or ISO 27001 guidelines for control should compare their own internal controls: the standards are quite comprehensive. The global presence of cloud storage also closes some gaps with data locality.
In either case, a people data technology like Oxygen can complete the picture by integrating with corporate authentication services for user access control and supporting end-to-end encryption for data control, including within the storage, to protect data privacy and confidentiality. These capabilities certainly fill gaps for cloud storage, but can also address gaps for on-premises storage, which in many cases is easily compromised if the network is hacked.
Oxygen is the only product that lets you use either cloud storage, on-premises storage, or both, to better fit your variety of use cases
As long as people have a need for a range of storage availability, protection, performance, and compliance, there will be a need for both on-premises storage and cloud storage. The choice isn’t as simple as economics.
Fortunately, Oxygen doesn’t force you to choose.
Instead, we let you leverage cloud storage for scenarios where you need inherent high availability and data protection, along with reasonable performance and compliance. We let you leverage on-premises storage where performance is critical, where availability and data protection processes are already in place and need to be maintained, and where compliance rules are stricter.
Uniquely, Oxygen enables you to leverage a mix of both to address every company’s mix of individual access, internal and external shared access, and internal and external file sharing, across a range of applications from the common (Microsoft Office) to the proprietary (media).