Friday, November 7, 2008

Data Explosion

Data explosion is the term for the increase in stored data that occurs when using multidimensional database systems. The amount of data stored in these systems is often a multiple of the size of the raw data loaded into them from the existing operational databases. Hence, the data undergoes an “explosion” to several (or many) times its original size.
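
To make the idea concrete, here is a minimal sketch of how precomputed aggregates can outnumber the raw rows. It assumes a cube in which every combination of hierarchy levels is aggregated and every cell is populated, so it gives an upper bound rather than what a real system would store. The dimensions, member counts and row count are hypothetical and chosen only for illustration.

```python
from math import prod

# Hypothetical dimensions: hierarchy level name -> number of distinct members.
DIMENSIONS = {
    "time": {"day": 365, "month": 12, "quarter": 4},
    "product": {"sku": 1000, "category": 20},
    "store": {"store": 50, "region": 5},
}

# Hypothetical number of fact rows loaded from the operational databases.
RAW_ROWS = 2_000_000


def cuboid_count(dimensions):
    """Number of distinct group-by combinations.

    Each dimension contributes its hierarchy levels plus one implicit
    'all' level (the dimension fully aggregated away).
    """
    return prod(len(levels) + 1 for levels in dimensions.values())


def max_cell_count(dimensions):
    """Upper bound on stored cells if every aggregate is precomputed
    and every cell is populated (a fully dense cube)."""
    return prod(sum(levels.values()) + 1 for levels in dimensions.values())


def explosion_factor(dimensions, raw_rows):
    """Ratio of the worst-case cell count to the raw row count."""
    return max_cell_count(dimensions) / raw_rows


if __name__ == "__main__":
    print("group-by combinations:", cuboid_count(DIMENSIONS))
    print("worst-case cells:", max_cell_count(DIMENSIONS))
    print("explosion factor: %.1fx" % explosion_factor(DIMENSIONS, RAW_ROWS))
```

Real cubes are usually sparse and rarely precompute every aggregate, so the actual factor depends on the data and on which aggregates the system chooses to store.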

Data has become the driving force behind businesses today and is therefore a highly valued asset. As more business moves to the internet, where huge volumes of data are shared and gathered every second, generating and storing that data will eventually push a data warehouse to critical mass.
Today's data warehouses are typically implemented using multidimensional databases, which are data-aggregating systems. In other words, these databases combine data from a variety of data sources. Multidimensional databases also support networks, hierarchies, arrays and other data formats that may be difficult to model in Structured Query Language (SQL). They also offer a high degree of flexibility in the definition of their dimensions, units, and unit relationships, regardless of data format. Because they are built to handle very large volumes of data, these databases are expected to cope with imminent data explosions.

Data duplication is a practice in many data warehouses: the company maintains several backup copies of important and critical data, or implements data mirroring, as added assurance against data loss in case of an unforeseen physical database failure. Disaster recovery plans rely on duplicated copies kept in an alternate location. Another use of data duplication is in application development and testing environments, where several clones of the original production database are kept.
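
The effect of duplication on storage is simple arithmetic, sketched below. The function name, the default copy counts and the 500 GB figure are assumptions for illustration only; the sketch also assumes every copy is a full-size, uncompressed duplicate.

```python
def total_footprint_gb(primary_gb, mirrors=1, backups=2, test_clones=1):
    """Total storage needed when every copy is a full duplicate.

    All parameters are hypothetical: one primary database plus its
    mirrors, backup copies, and development/testing clones.
    """
    copies = 1 + mirrors + backups + test_clones
    return primary_gb * copies


if __name__ == "__main__":
    # A hypothetical 500 GB production database with the default copy counts.
    print(total_footprint_gb(500), "GB in total")
```

Incremental backups, compression or deduplication would reduce this total, so the result is best read as a worst case.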

Despite the reality of data explosion in this internet age, there are several ways to handle it. New storage technologies and comprehensive software applications have made handling data explosion easier by providing effective mechanisms for creating, collecting and storing all kinds of data.
There are many ways to manage data. It can be stored in structured relational databases or in semi-structured file systems, as with email files. It can also be stored as unstructured fixed content, as with documents and graphics files.
The explosive growth of data across industries has brought about software solutions such as customer relationship management (CRM), enterprise resource planning (ERP) and other mission-critical applications, which capture, create and process exponentially increasing volumes of data to keep a business profitable against its competitors. Most companies depend on high availability of data, preferably 24 hours a day, seven days a week, 365 days a year. They also try to implement fast network connections to handle data sharing and transmission. In today's setting, it would be difficult to find an organization, whether in healthcare, pharmaceuticals, insurance, financial services, telecommunications, retail, manufacturing or any other industry, that does not gather and use large quantities of data for its business decision making.

One sector that needs large volumes of data, and may be a candidate for data explosion, is the federal government, where millions of tax returns are filed into the Internal Revenue Service systems annually. With each succeeding year, the data grows along with new registrants.
Dealing with data explosions means spending a lot of money. A company needs to purchase high-powered computer systems with large-capacity hard disks and ample random access memory. The network infrastructure should also be able to handle sending and receiving large volumes of data, with transmissions that may occur very frequently.