The cost of data distribution

From large tracking systems such as airline reservation and e-commerce platforms to a child's baseball card collection, databases are collections of information organized for efficient storage and retrieval. A database can be as simple as the file cabinet at your neighborhood barbershop or as complex and computerized as the databases powering Google's search engine.

Originally, what we know today as local databases were purely local to the building that hosted their home server; their data was distributed only through direct or LAN (Local Area Network) connections to clients. As companies grew beyond the limits of a single building, the notion of a LAN expanded to encompass secure private intranet connections between several of a company's buildings, whether or not they sat in the same state or country. Consequently, the term local database evolved to include centralized databases stored in a single building that serve the operational needs of several satellite clients, which may or may not be located in the same region as the home server. Yet, with the increasing need for speed of access, recovery, and security, many companies have elected to use LAN- and/or WAN (Wide Area Network)-interconnected servers to store interrelated databases that independently host fragments of the global data under a single architecture known as a distributed database system.

While both architectures are very dependable, the choice between them will, for the most part, depend on the needs and size of the organization.

For arrangements where all the computers accessing the data sit in the same building, a local database deployed in a client–server architecture is usually the best implementation. All data is stored on a local server that clients access over the LAN, which guarantees faster, more secure, and more cost-efficient access to the data. For disaster recovery under this option, the company should invest in backing up the data either in the cloud or to a secondary backup server, preferably off site, to preserve the integrity of the data in the event of an unforeseen disaster.
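The local-server-plus-off-site-backup pattern can be sketched in a few lines. This is a minimal illustration, not a production setup: it uses SQLite as a stand-in for the LAN server's database, and the "off-site" backup is simply a second path (all file names here are hypothetical).

```python
import pathlib
import shutil
import sqlite3
import tempfile

# Hypothetical paths: in a real deployment the primary database lives on
# the LAN server and the backup on an off-site machine or cloud bucket.
base = pathlib.Path(tempfile.mkdtemp())
primary = base / "primary.db"
backup = base / "backup.db"

# Server side: store data in the central local database.
con = sqlite3.connect(primary)
con.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("INSERT INTO customers (name) VALUES (?)", ("Ada",))
con.commit()
con.close()

# Disaster-recovery step: copy the database file to a secondary location.
shutil.copyfile(primary, backup)

# Restore check: the backup is a complete, independent copy of the data.
restored = sqlite3.connect(backup)
rows = restored.execute("SELECT name FROM customers").fetchall()
restored.close()
print(rows)  # [('Ada',)]
```

In practice the copy step would run on a schedule and ship the file over the network, but the principle is the same: the backup must live somewhere a disaster at the primary site cannot reach.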

If the computers are spread around the country or even worldwide, the better approach is the distributed option. A distributed database allows a user with the proper permissions to access data stored at different locations for live transactions, global reporting, or remote management. The ability to store several copies or fragments of the data on different computers around the world enables highly efficient disaster recovery plans that seamlessly keep the business running even when a server crashes or an entire location shuts down.
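The fragment-and-replicate idea behind that resilience can be shown with a toy sketch. The node names, hashing scheme, and in-memory "stores" below are all illustrative assumptions; real distributed databases use far more sophisticated placement and consensus protocols.

```python
import hashlib

# Hypothetical three-node cluster; the names are illustrative only.
NODES = ["us-east", "eu-west", "ap-south"]

def node_for(key: str, offset: int = 0) -> str:
    """Map a key to a node by hashing; offset=1 picks the replica node."""
    h = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    return NODES[(h + offset) % len(NODES)]

# Each node holds only its fragment of the global data.
stores = {n: {} for n in NODES}

def put(key: str, value: str) -> None:
    # Write to a primary fragment and one replica, so losing any
    # single node still leaves a full copy of the record available.
    stores[node_for(key)][key] = value
    stores[node_for(key, offset=1)][key] = value

def get(key: str, down: frozenset = frozenset()) -> str:
    # Try the primary node first, then fall back to the replica.
    for offset in (0, 1):
        node = node_for(key, offset)
        if node not in down:
            return stores[node][key]
    raise KeyError(key)

put("order:1001", "widget x3")
# Even with the record's primary node down, the replica answers the read.
print(get("order:1001", down=frozenset({node_for("order:1001")})))
```

The read succeeds despite the simulated outage, which is exactly the seamless-failover property the distributed option buys.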

With cost, convenience, security, and availability driving the decision between local and distributed architectures, many companies have turned to major storage vendors such as Microsoft, Google, AT&T, and Amazon for Storage as a Service (STaaS). The STaaS model essentially lets a business rent virtual storage for its needs, shifting the burden of maintaining, updating, and backing up the data from the business to the vendor for a predetermined usage fee.
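The pay-for-what-you-store economics of that model can be sketched as follows. The rate, bucket contents, and billing formula are assumptions for illustration only; real vendors layer on tiers, request charges, and egress fees.

```python
# Hypothetical STaaS usage-fee model: the vendor bills a flat rate per
# gigabyte stored per month. The rate below is an assumed figure.
RATE_PER_GB_MONTH = 0.023  # USD, illustrative only

bucket = {}  # stands in for the vendor-hosted object store

def put_object(key: str, data: bytes) -> None:
    """Upload an object to the rented storage."""
    bucket[key] = data

def monthly_fee() -> float:
    """Charge for total bytes stored, converted to gigabytes."""
    total_bytes = sum(len(v) for v in bucket.values())
    return total_bytes / 1024**3 * RATE_PER_GB_MONTH

put_object("backups/2024-01.tar", b"\0" * (512 * 1024**2))  # 512 MB
print(round(monthly_fee(), 4))  # 0.0115
```

The business pays the fee; everything below the `put_object` call, from disk maintenance to replication, becomes the vendor's problem.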

With today’s exponential data growth, would STaaS be the silver bullet for our data storage needs? Or a source of security fears?
