ETL explained
ETL stands for Extract, Transform and Load: the
processes that move data out of multiple sources, reformat and
cleanse it, and load it into another file, a database, a data mart or a data
warehouse for analysis, or onto another system.
We all know that there is valuable data lying around
throughout our systems that would be very useful if it could be reused in
another program.
The only problem is that the data lies in all sorts of formats that cannot
be readily used by other applications.
To solve the problem, you can use extract, transform and load (ETL)
software, which reads data from its source, cleans it up and
formats it uniformly, and then writes it to a target format where it can be
used.
The data used in ETL processes can come from any source: a flat file, a
mainframe application, an ERP application, a CRM tool, an Excel spreadsheet, an
extraction program, anything really.
Extracting the data
Extraction can be done via a variety of methods. Often, the environment or
program in which the data is currently held will have an export function that
can be used to get the data into a format that can be easily transformed and processed.
There are also specialized tools available to take data from a database
environment.
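As a minimal sketch of the export step, the snippet below dumps one database table to CSV text using Python's standard sqlite3 and csv modules. The in-memory database, the customers table and the extract_table_to_csv helper are assumptions made up for illustration; a real source system would have its own export function or driver.

```python
import csv
import io
import sqlite3


def extract_table_to_csv(conn, table, out):
    """Write every row of `table` to `out` as CSV, header row first."""
    # Table names cannot be bound as SQL parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    cursor = conn.execute(f"SELECT * FROM {quoted}")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)


# Hypothetical source system: an in-memory database with a made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ann Lee"), (2, "Bob Ray")],
)

buffer = io.StringIO()
extract_table_to_csv(conn, "customers", buffer)
extracted = buffer.getvalue()
print(extracted)
```

Writing to a file-like object rather than a fixed path keeps the helper usable with a real file, a network stream or, as here, an in-memory buffer.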
After extraction, the data is transformed, or modified, depending on the
specific business logic involved so that it can be sent to the target data
store.
There are a variety of ways to perform the transformation, and the work
involved varies. The data may require reformatting only, but most ETL
operations also involve cleansing the data to remove duplicates and enforce
consistency.
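The cleansing step can be sketched as follows, assuming records are simple dictionaries with name and email fields. The field names and the normalization rules (collapsed whitespace, title-cased names, lowercased emails) are illustrative assumptions, not a fixed standard.

```python
def cleanse(records):
    """Normalize each record, then drop records that become identical."""
    seen = set()
    cleaned = []
    for record in records:
        normalized = {
            # Collapse runs of whitespace and use consistent capitalization.
            "name": " ".join(record["name"].split()).title(),
            "email": record["email"].strip().lower(),
        }
        key = (normalized["name"], normalized["email"])
        if key not in seen:  # keep only the first copy of each record
            seen.add(key)
            cleaned.append(normalized)
    return cleaned


raw = [
    {"name": "  ann   lee ", "email": "Ann@Example.com"},
    {"name": "Ann Lee", "email": "ann@example.com "},
    {"name": "bob ray", "email": "bob@example.com"},
]
cleaned = cleanse(raw)
print(cleaned)
```

Normalizing before comparing matters here: the first two records only turn out to be duplicates once spacing and letter case are made consistent.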
In addition, the ETL process could involve transforming from a fixed-record
format to a variable one, or vice versa, standardizing name and address fields,
verifying telephone numbers or expanding records with additional fields
containing demographic information or data from other systems.
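To illustrate the fixed-record to variable-format case, here is a rough sketch that slices fixed-width lines into fields and writes them out as CSV, standardizing the phone field along the way. The LAYOUT column offsets, the field names and the 10-digit phone rule are all assumptions chosen for the example.

```python
import csv
import io
import re

# Hypothetical layout: (field name, start offset, end offset) per record.
LAYOUT = [("name", 0, 10), ("city", 10, 18), ("phone", 18, 32)]


def parse_fixed(line):
    """Slice one fixed-width line into a dict of stripped field values."""
    return {name: line[start:end].strip() for name, start, end in LAYOUT}


def standardize_phone(raw):
    """Keep digits only; return "" unless exactly 10 remain (assumed rule)."""
    digits = re.sub(r"\D", "", raw)
    return digits if len(digits) == 10 else ""


def fixed_to_csv(lines):
    """Convert fixed-width lines to CSV text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=[field[0] for field in LAYOUT], lineterminator="\n"
    )
    writer.writeheader()
    for line in lines:
        record = parse_fixed(line)
        record["phone"] = standardize_phone(record["phone"])
        writer.writerow(record)
    return out.getvalue()


# Build sample lines by padding each field to its assumed width.
fixed_lines = [
    "Ann Lee".ljust(10) + "Boston".ljust(8) + "(617) 555-0101".ljust(14),
    "Bob Ray".ljust(10) + "Austin".ljust(8) + "512.555.0102".ljust(14),
]
converted = fixed_to_csv(fixed_lines)
print(converted)
```

Going the other way, from a variable format back to fixed records, would use the same layout table to pad each field to its width.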
The transformation occurs when the data from each source is mapped, cleansed
and reconciled so it all can be tied together.
After reconciliation, the data is transported and loaded into the data
warehouse for analysis.
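The load step might look like the sketch below, which uses an in-memory SQLite database as a stand-in for the data warehouse. The customers table, its columns and the choice to skip rows whose email is already present are assumptions for illustration only.

```python
import sqlite3


def load(conn, rows):
    """Insert cleaned rows into the target table in one transaction."""
    with conn:  # commits on success, rolls back if a statement fails
        conn.execute(
            "CREATE TABLE IF NOT EXISTS customers (name TEXT, email TEXT UNIQUE)"
        )
        # OR IGNORE skips rows whose email is already in the target.
        conn.executemany(
            "INSERT OR IGNORE INTO customers (name, email) VALUES (:name, :email)",
            rows,
        )


rows = [
    {"name": "Ann Lee", "email": "ann@example.com"},
    {"name": "Bob Ray", "email": "bob@example.com"},
]

warehouse = sqlite3.connect(":memory:")  # stand-in for the real target store
load(warehouse, rows)
load(warehouse, rows)  # running the load again adds no duplicate rows
row_count = warehouse.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(row_count)
```

Wrapping the inserts in a single transaction means a failed load leaves the target unchanged rather than half filled.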
Online data transformation
There are many tools available that help in the ETL process.
All of them, however, mandate an investment in software that needs to be
installed on your computer systems. The online functions available here can be
very useful if you cannot, or do not want to, install any software on your
computer. You will still need to extract the information from your existing
environment into a file, but in many instances the transformation process can be
done online.
Since turning their paper documents into digital data, companies have been focused on finding the most effective means of
managing their databases. ETL processes and software tools are among the leading solutions in use today.
ETL (Extract, Transform and Load) is a collection of data processes used to simplify data migration to and from
different environments at the enterprise level.
ETL Process
As a data management solution, the ETL process is implemented for specific data management tasks: fixing improperly formatted data,
rapidly generating databases from other resources, handling customer data on FTP servers, or managing simple XML files.
Although the ETL process will differ in each individual case, it essentially involves three stages.
The first is data extraction, which can be done via database APIs, proprietary code or flat files. In any case, the extracted data
can come from any source and, hence, arrive in different forms that cannot be integrated into other environments as they are.
The inconsistent data is therefore put into a single format in which it can be easily transformed and processed.
The next step is aimed at resolving the data disparities. At this point, the extracted data gets processed and transformed into the form
your end target requires. This involves data modifications that can vary from removing duplicates and simple reformatting
to standardizing information fields and validating data.
After the data is cleansed and transformed, the last step of the process transports and loads the data into the end target database
for analysis, or onto another system.
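The three stages described above can be tied together in a short end-to-end sketch. It assumes the source has already been exported as CSV text and uses an in-memory SQLite database as the end target; the contacts table, the field names and the deduplicate-by-email rule are hypothetical.

```python
import csv
import io
import sqlite3


def extract(source_text):
    """Stage 1: read the exported CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(source_text)))


def transform(rows):
    """Stage 2: normalize emails and drop duplicate or empty ones."""
    seen, result = set(), []
    for row in rows:
        email = row["email"].strip().lower()
        if email and email not in seen:
            seen.add(email)
            result.append((row["name"].strip(), email))
    return result


def load(conn, rows):
    """Stage 3: write the transformed rows to the target table."""
    with conn:  # one transaction for the whole batch
        conn.execute("CREATE TABLE IF NOT EXISTS contacts (name TEXT, email TEXT)")
        conn.executemany("INSERT INTO contacts VALUES (?, ?)", rows)


source = (
    "name,email\n"
    "Ann Lee,ANN@example.com\n"
    "Ann Lee,ann@example.com\n"
    "Bob Ray, bob@example.com\n"
)
target = sqlite3.connect(":memory:")
load(target, transform(extract(source)))
loaded = target.execute("SELECT name, email FROM contacts ORDER BY name").fetchall()
print(loaded)
```

Keeping each stage as its own function makes it possible to swap one out, for example a different source format, without touching the other two.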
ETL with iConv
Powerful ETL software tools are easily accessible, but they
need to be installed on your computer systems. The online functions available here
remove that requirement and can be very useful in the ETL process.
Although you will still need to extract the information from your existing environment into a file,
in many instances the transformation itself can be done online,
without writing any transformation code or installing anything.
These iConv utilities can help to transform or convert your data and essentially expedite the ETL process.
Copyright ©2005 iConv.com. All
rights reserved.