The Data Staging Area:
The data staging area is often the most complex part of the architecture. It involves:
• Extraction (E)
• Transformation (T)
• Load (L)
• Indexing
ETL tools can be used, or custom scripts for extraction, transformation and load can be implemented.
Extraction:
Extraction means reading and understanding the source data and copying the data needed for the data warehouse into the staging area for further manipulation, i.e. transformation.
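As a rough illustration of extraction, the sketch below copies only the needed columns and rows from a source database into a staging table. It is a minimal example, not a prescribed implementation: the table names `orders` and `stg_orders`, the columns, and the function `extract_to_staging` are all hypothetical, and two in-memory SQLite databases stand in for the source system and the staging area.

```python
import sqlite3

def extract_to_staging(source, staging, since):
    """Copy only the columns and rows the warehouse needs into staging."""
    staging.execute(
        "CREATE TABLE IF NOT EXISTS stg_orders ("
        "order_id INTEGER, customer TEXT, amount REAL, order_date TEXT)"
    )
    # Read from the source; the internal_note column is deliberately left out.
    rows = source.execute(
        "SELECT order_id, customer, amount, order_date "
        "FROM orders WHERE order_date >= ?",
        (since,),
    ).fetchall()
    staging.executemany("INSERT INTO stg_orders VALUES (?, ?, ?, ?)", rows)
    staging.commit()
    return len(rows)

# Hypothetical source system with one column the warehouse does not need.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL, "
    "order_date TEXT, internal_note TEXT)"
)
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Alice", 10.0, "2023-12-30", "old"),
        (2, "Bob", 25.5, "2024-01-02", "rush"),
        (3, "Carol", 7.25, "2024-01-05", ""),
    ],
)

staging = sqlite3.connect(":memory:")
copied = extract_to_staging(source, staging, "2024-01-01")
print(copied, "rows copied to staging")
```

The staged rows are left untouched at this point; any conversion or cleaning happens later, in the transformation step.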
Transformation involves:
• Data conversion/transformation (specifying transformation rules to convert to a common data format and common terms/semantics)
• Data cleaning/cleansing
– Data scrubbing (using domain-specific knowledge, e.g. postal addresses, to check the data)
– Data auditing (discovering suspicious patterns and violations of stated rules)
• Combining data from multiple sources
• Assigning warehouse (surrogate) keys
• Data aggregation
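The transformation steps listed above can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the two source systems (`crm_rows`, `shop_rows`), the field names, the five-digit postal code rule and the rule that amounts must not be negative. A real staging area would normally work on database tables or an ETL tool rather than in-memory lists.

```python
import re
from collections import defaultdict
from datetime import datetime

# Hypothetical staged rows from two sources with different conventions.
crm_rows = [
    {"customer": "Alice", "country": "Deutschland", "postal": "10115",
     "amount": "20.00", "date": "02.01.2024"},
    {"customer": "Bob", "country": "Germany", "postal": "1011",
     "amount": "5.00", "date": "03.01.2024"},
]
shop_rows = [
    {"customer": "Alice", "country": "DE", "postal": "10115",
     "amount": "30.50", "date": "2024-01-05"},
    {"customer": "Carol", "country": "DE", "postal": "80331",
     "amount": "-4.00", "date": "2024-01-06"},
]

COUNTRY_CODES = {"deutschland": "DE", "germany": "DE", "de": "DE"}
POSTAL_RULE = re.compile(r"\d{5}")  # assumed rule: postal codes have 5 digits

def convert(row, date_format):
    """Conversion: bring one row into the common format and common terms."""
    return {
        "customer": row["customer"],
        "country": COUNTRY_CODES[row["country"].lower()],
        "postal": row["postal"],
        "amount": float(row["amount"]),
        "date": datetime.strptime(row["date"], date_format).date().isoformat(),
    }

def clean(rows):
    """Cleaning: scrub with domain knowledge, audit against stated rules."""
    accepted, rejected = [], []
    for row in rows:
        postal_ok = POSTAL_RULE.fullmatch(row["postal"]) is not None  # scrubbing
        amount_ok = row["amount"] >= 0  # auditing (assumed rule)
        (accepted if postal_ok and amount_ok else rejected).append(row)
    return accepted, rejected

def assign_surrogate_keys(rows):
    """Give each customer a warehouse key independent of any source key."""
    keys = {}
    for row in rows:
        row["customer_key"] = keys.setdefault(row["customer"], len(keys) + 1)
    return keys

def aggregate(rows):
    """Aggregation: total amount per surrogate customer key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer_key"]] += row["amount"]
    return dict(totals)

# Combine both sources after converting each with its own date format.
combined = ([convert(r, "%d.%m.%Y") for r in crm_rows]
            + [convert(r, "%Y-%m-%d") for r in shop_rows])
accepted, rejected = clean(combined)
keys = assign_surrogate_keys(accepted)
totals = aggregate(accepted)
print(keys, totals, len(rejected))
```

Rejected rows are kept in a separate list rather than dropped, so that they can be inspected or corrected before loading.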