Saturday, December 13, 2008

Data Modeling

Data Modeling is a method used to define and analyze data requirements needed to support the business functions of an enterprise. These data requirements are recorded as a conceptual data model with associated data definitions. Data modeling defines the relationships between data elements and structures.

Data modeling can be used for a wide array of purposes. It explores data-oriented structures without considering any specific application in which the data will be used. In effect, it produces a conceptual definition of each entity and its real-life counterpart, where an entity is anything of interest to the organization implementing the database.

Data models are the products of data modeling. In general, there are three styles of data model: the conceptual data model, the logical data model and the physical data model.

The conceptual data model is often called the domain model. It describes the semantics of a business organization: it consists of entity classes, which represent things of significance to the organization, and the relationships among them. Relationships are defined as assertions about associations between pairs of entity classes. The conceptual data model is commonly used to explore domain concepts with project stakeholders. Conceptual models may be created to explore high-level static business structures and concepts, but they can also be used as precursors or alternatives to logical data models.

The logical data model is used to explore domain concepts and other key areas such as relationships and domain problems. A logical data model can be defined for the scope of a single project or for the whole enterprise. It describes semantics in terms of a particular data manipulation method; such descriptions include tables, columns, object-oriented classes, XML tags and many other things. The logical data model depicts the logical entity types, the data attributes that describe those entities and the relationships among the entities.
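
As a rough illustration, the entity types, attributes and relationships of a logical model can be written down as plain Python objects. This is only a sketch: the class names (`Attribute`, `EntityType`, `Relationship`) and the Customer/Order example are assumptions made for illustration, not part of any standard notation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    name: str
    data_type: str  # logical type, not a vendor-specific column type


@dataclass
class EntityType:
    name: str
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class Relationship:
    name: str
    source: EntityType
    target: EntityType
    cardinality: str  # for example "1:N"


# Hypothetical entity types for illustration only.
customer = EntityType("Customer", [Attribute("name", "text"), Attribute("email", "text")])
order = EntityType("Order", [Attribute("placed_on", "date")])
places = Relationship("places", customer, order, "1:N")


def describe(rel: Relationship) -> str:
    """Render a relationship as a short, readable sentence."""
    return f"{rel.source.name} {rel.name} {rel.target.name} ({rel.cardinality})"


summary = describe(places)
print(summary)
```

Nothing here says how the data is stored; that decision belongs to the physical model.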

The physical data model is used in the design of the internal database schema. This design defines the data tables, the data columns of those tables and the relationships among the tables. The physical data model is also concerned with the physical means by which data is stored. This storage aspect covers concerns such as hard disk partitioning, CPU usage optimization, the creation of tablespaces and others.
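
A small physical schema might look like the following sketch, which uses Python's built-in `sqlite3` module with an in-memory database. The table and column names (`customer`, `customer_order`) and the index are hypothetical, and SQLite is chosen only because it ships with Python; other database systems have their own storage options.

```python
import sqlite3

# In-memory database, so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Tables, columns and the relationship between them (a foreign key),
# plus an index, which is a purely physical design decision.
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    placed_on   TEXT NOT NULL
);
CREATE INDEX idx_order_customer ON customer_order(customer_id);
""")

# List the tables that now exist in the schema.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)
```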

Application developers need to understand the fundamentals of data modeling so that their applications can be optimized. It should be noted that the tasks involved in data modeling may be performed iteratively. These tasks include identifying entity types, identifying attributes, applying naming conventions, identifying relationships, applying data model patterns, assigning keys, normalizing to reduce data redundancy and denormalizing to improve performance.
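
The last two tasks can be sketched with plain Python data. In this hypothetical example, a flat list of order rows repeats the customer name on every row; normalizing moves the name into its own structure keyed by `customer_id`, and denormalizing joins it back for read convenience. All names and values are made up for illustration.

```python
# Denormalized rows: the customer name is repeated on every order.
flat_orders = [
    {"order_id": 1, "customer_id": 10, "customer_name": "Ana", "toy": "Kite"},
    {"order_id": 2, "customer_id": 10, "customer_name": "Ana", "toy": "Robot"},
    {"order_id": 3, "customer_id": 11, "customer_name": "Ben", "toy": "Kite"},
]

# Normalize: store each customer name once, keyed by customer_id.
customers = {row["customer_id"]: row["customer_name"] for row in flat_orders}
orders = [
    {"order_id": r["order_id"], "customer_id": r["customer_id"], "toy": r["toy"]}
    for r in flat_orders
]

# Denormalize: join the name back onto each order, trading
# redundancy for fewer lookups at read time.
rejoined = [dict(o, customer_name=customers[o["customer_id"]]) for o in orders]

print(customers)
print(rejoined == flat_orders)
```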

Data modeling also focuses on the structure of the data within a domain. This structure is specified in a dedicated grammar, that is, in an artificial language designed for that domain. As always, the description of the data structure never mentions a specific implementation, such as a particular vendor's database management system.

Sometimes, different data modelers produce different data models for the same domain, which can lead to confusion. The difference may stem from different levels of abstraction in the data models. This can be overcome by adopting generic data modeling methods.

For instance, generic data modeling could take advantage of generic patterns in a business organization. An example is the concept of a Party, which includes Persons and Organizations. A generic data model for this entity may be easier to implement without creating conflicts along the way.
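
One way to picture the Party pattern is as a shared base type with Person and Organization as specializations. The sketch below is an assumption-laden illustration: the attribute names (`party_id`, `display_name`, `birth_year`, `registration_no`) and the helper `contact_label` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Party:
    """Generic entity: anything the business can deal with."""
    party_id: int
    display_name: str


@dataclass
class Person(Party):
    birth_year: int


@dataclass
class Organization(Party):
    registration_no: str


def contact_label(party: Party) -> str:
    """Works for any Party, whether a Person or an Organization."""
    return f"{party.party_id}: {party.display_name}"


parties = [Person(1, "Ana Cruz", 1980), Organization(2, "Toy Depot", "REG-001")]
labels = [contact_label(p) for p in parties]
print(labels)
```

Code that only needs a name or an identifier can depend on `Party` alone, which is the point of modeling at the generic level.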

Data Model

A data model is a logical map that represents the inherent properties of the data independent of software, hardware, or machine performance considerations.
The model shows data elements grouped into records, as well as the associations among those records.
Since the data model is the basis for data implementation regardless of software or hardware platform, it should describe the data in an abstract manner, without details specific to any hardware or software such as bit manipulation or the addition of indexes.
There are two generally accepted meanings of the term data model. The first is a data model theory: a formal description of the data's structure and use, stated without heavy technical terms related to information technology. The second is a data model instance: the application of a data model theory to create a model that meets the requirements of some application, such as one used in a business enterprise.
The structural part of a data model theory refers to the collection of data structures from which the data is built. These data structures represent entities and objects in the database model. For instance, the data model may be that of a business enterprise that sells toys. The real-life things of interest would include customers, company staff and, of course, the toy items. Since the database that keeps the records of these things cannot understand the real meaning of customers, company staff and toy items, a data representation of these real-life things must be created.
The integrity part of a data model refers to the collection of rules that constrain the data structures so that structural integrity can be achieved. It formally defines an extensive set of rules, and their consistent application, so that the data can be used for its intended purpose. Techniques are defined for how to maintain data in the data resource and to ensure that the data consistently contains values that are faithful to their source and accurate at their destination. This ensures that data always has data value integrity, data structure integrity, data retention integrity, and data derivation integrity.
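
As a hedged sketch of integrity rules in practice, the following uses Python's built-in `sqlite3` module to declare two constraints on a hypothetical toy store schema: a `CHECK` rule (prices cannot be negative) and a foreign key (a sale must refer to an existing toy). The tables and the helper `try_insert` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE toy ("
    " toy_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " price REAL NOT NULL CHECK (price >= 0))"
)
conn.execute(
    "CREATE TABLE sale ("
    " sale_id INTEGER PRIMARY KEY,"
    " toy_id INTEGER NOT NULL REFERENCES toy(toy_id))"
)
conn.execute("INSERT INTO toy (toy_id, name, price) VALUES (1, 'Kite', 4.5)")


def try_insert(sql, params):
    """Return True if the row was accepted, False if a rule rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


ok_sale = try_insert("INSERT INTO sale (toy_id) VALUES (?)", (1,))
bad_price = try_insert("INSERT INTO toy (name, price) VALUES (?, ?)", ("Robot", -1.0))
orphan_sale = try_insert("INSERT INTO sale (toy_id) VALUES (?)", (99,))
print(ok_sale, bad_price, orphan_sale)
```

The rules live in the schema rather than in each application, so every program that writes to the database is held to the same constraints.
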
The manipulation part of a data model refers to the collection of operators that can be applied to the data structures. These operations include querying and updating data within the database. This is important because not all data may be altered or deleted. The manipulation part works hand in hand with the integrity part so that the data model yields a high-quality database for its data consumers.
As an example, let us take the relational model. Its structural part is based on a modified concept of the mathematical relation: data is represented as an n-ary relation, which is a subset of the Cartesian product of n domains. The integrity part is expressed in first-order logic, and the manipulation part consists of the relational algebra as well as the tuple and domain calculi.
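
The idea of a relation as a subset of a Cartesian product can be sketched with Python sets. The two domains, the `purchases` relation and the helper functions `select` and `project` are hypothetical; they mimic the selection and projection operators of the relational algebra rather than reproduce any particular database system.

```python
from itertools import product

# Two small domains (assumed for illustration).
names = {"Ana", "Ben"}
toys = {"Kite", "Robot"}

# The Cartesian product holds every possible (name, toy) pair.
cartesian = set(product(names, toys))

# A binary relation is some subset of that product.
purchases = {("Ana", "Kite"), ("Ana", "Robot"), ("Ben", "Kite")}
assert purchases <= cartesian


def select(relation, predicate):
    """Selection: keep only the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(t)}


def project(relation, indexes):
    """Projection: keep only the chosen positions of each tuple."""
    return {tuple(t[i] for i in indexes) for t in relation}


# Who bought a kite? Select on the toy, then project onto the name.
kite_buyers = project(select(purchases, lambda t: t[1] == "Kite"), [0])
print(sorted(kite_buyers))
```
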
The process of defining a data model is extremely important in any database implementation because a single data model serves as the basis for a wide variety of data implementations. Hence, any database management system, such as Access, Oracle or MySQL, can implement and maintain a database based on that one data model.