Sunday, 22 February 2015

Chapter 8 – Accessing Organizational Information – Data Warehouse


What is a Data Warehouse?

Ø  Defined in many different ways, but not rigorously

-  A decision support database that is maintained separately from the organization’s operational database.

-  A consistent database source that brings together information from multiple sources for decision support queries.

-  Supports information processing by providing a solid platform of consolidated, historical data for analysis.

History of Data Warehousing

Ø  In the 1990s, executives became less concerned with day-to-day business operations and more concerned with overall business functions

Ø  The data warehouse provided the ability to support decision making without disrupting day-to-day operations, because:

-  Operational information is mainly current; it does not include the history needed for better decision making

-  Operational information often has quality issues

-  Without information history, it is difficult to tell how and why things change over time

Data warehouse fundamentals

Ø  Data warehouse – A logical collection of information – gathered from many different operational databases – that supports business analysis activities and decision-making tasks

Ø  The primary purpose of a data warehouse is to combine information throughout an organization into a single repository for decision-making purposes – data warehouses support only analytical processing

Data warehouse model

Ø  Extraction, transformation and loading (ETL) – A process that extracts information from internal and external databases, transforms the information using a common set of enterprise definitions, and loads the information into a data warehouse.

Ø  The data warehouse then sends subsets of the information to data marts.

Ø  Data mart – contains a subset of data warehouse information.
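
The ETL flow and the data mart described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source systems, their field names, the `warehouse` and `retail_mart` tables and the "retail" department are all hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3

# Hypothetical operational sources that record the same kind of fact
# with different field names and units (assumed for illustration).
sales_system = [
    {"cust": "Acme Corp", "amount_usd": "1200.50", "dept": "retail"},
    {"cust": "Globex", "amount_usd": "300.00", "dept": "wholesale"},
]
billing_system = [
    {"customer_name": "Initech", "total_cents": 45000, "division": "retail"},
]

def extract():
    """Extract: read raw rows from every source, tagged with their origin."""
    for row in sales_system:
        yield "sales", row
    for row in billing_system:
        yield "billing", row

def transform(source, row):
    """Transform: map source-specific fields onto one common definition."""
    if source == "sales":
        return (row["cust"], float(row["amount_usd"]), row["dept"])
    return (row["customer_name"], row["total_cents"] / 100, row["division"])

def load(conn, rows):
    """Load: write the standardized rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE warehouse (customer TEXT, amount REAL, department TEXT)"
    )
    conn.executemany("INSERT INTO warehouse VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(conn, [transform(source, row) for source, row in extract()])

# Data mart: a subset of the warehouse, here only the retail department.
conn.execute(
    "CREATE TABLE retail_mart AS "
    "SELECT * FROM warehouse WHERE department = 'retail'"
)

warehouse_rows = conn.execute(
    "SELECT customer, amount FROM warehouse ORDER BY customer"
).fetchall()
retail_rows = conn.execute(
    "SELECT customer, amount FROM retail_mart ORDER BY customer"
).fetchall()
```

After the load, `warehouse_rows` holds all three customers in one common format, while `retail_rows` holds only the retail subset that the data mart received.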




Multidimensional Analysis and Data Mining

Ø  A relational database contains information in a series of two-dimensional tables.

Ø  In a data warehouse and data mart, information is multidimensional: it contains layers of columns and rows

-  Dimension – A particular attribute of information




Ø  Cube – common term for the representation of multidimensional information





Ø  Once a cube of information is created, users can begin to slice and dice the cube to drill down into the information.

Ø  Users can analyze information in a number of different ways and with a number of different dimensions.
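
As a rough sketch of slicing and dicing, the cube below is just a Python dictionary whose keys are (product, region, month) coordinates. The three dimensions, the sales figures and the helper names `slice_cube`, `dice_cube` and `totals_by` are assumptions made for illustration, not part of any particular tool.

```python
from collections import defaultdict

# Hypothetical cube: each cell holds units sold, addressed by three dimensions.
DIMENSIONS = ("product", "region", "month")
cube = {
    ("Laptop", "East", "Jan"): 10,
    ("Laptop", "West", "Jan"): 7,
    ("Laptop", "East", "Feb"): 12,
    ("Phone", "East", "Jan"): 20,
    ("Phone", "West", "Feb"): 15,
}

def slice_cube(cube, dimension, value):
    """Slice: fix one dimension to a single value."""
    i = DIMENSIONS.index(dimension)
    return {key: units for key, units in cube.items() if key[i] == value}

def dice_cube(cube, **allowed):
    """Dice: keep cells whose coordinates fall in the allowed sets
    on several dimensions at once."""
    return {
        key: units
        for key, units in cube.items()
        if all(key[DIMENSIONS.index(d)] in values for d, values in allowed.items())
    }

def totals_by(cube, dimension):
    """Total units per value of one dimension, summing over the others."""
    i = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for key, units in cube.items():
        totals[key[i]] += units
    return dict(totals)

east = slice_cube(cube, "region", "East")
laptops_jan = dice_cube(cube, product={"Laptop"}, month={"Jan"})
by_product = totals_by(cube, "product")
```

Here `east` is one slice of the cube, `laptops_jan` is a smaller sub-cube, and `by_product` shows the same information summarized along a single dimension.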

Ø  Data Mining – the process of analyzing data to extract information not offered by the raw data alone. Also known as “knowledge discovery” – computer-assisted tools and techniques for sifting through and analyzing vast data stores in order to find trends, patterns and correlations that can guide decision making and increase understanding

Ø  To perform data mining users need data-mining tools

-  Data-mining tool – uses a variety of techniques to find patterns and relationships in large volumes of information. E.g., retailers can use knowledge of these patterns to improve the placement of items in the layout of a mail-order catalog page or Web page.
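
One simple pattern a retailer might look for is which items are often ordered together. The sketch below counts item pairs across orders; it is a minimal illustration of one such technique, not how any specific data-mining tool works, and the orders and the `min_support` threshold are made up.

```python
from collections import Counter
from itertools import combinations

# Hypothetical mail-order data: each order is the set of items bought together.
orders = [
    {"tent", "sleeping bag", "lantern"},
    {"tent", "sleeping bag"},
    {"lantern", "batteries"},
    {"tent", "sleeping bag", "batteries"},
]

def frequent_pairs(orders, min_support):
    """Return item pairs that appear together in at least min_support orders."""
    counts = Counter()
    for order in orders:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(order), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

pairs = frequent_pairs(orders, min_support=2)
```

With this data only the pair ("sleeping bag", "tent") reaches the threshold, which would suggest placing those two items near each other on a catalog page.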

Information Cleansing or Scrubbing

Ø  An organization must maintain high-quality data in the data warehouse

Ø  Information cleansing or scrubbing – A process that weeds out and fixes or discards inconsistent, incorrect or incomplete information

Ø  Occurs first during the ETL process and second on the information once it is in the data warehouse

Ø  Contact information in an operational system

Ø  Standardizing customer name from operational systems

Ø  Information cleansing activities

-  Missing Records or Attributes
-  Redundant Records
-  Missing Keys or Other Required Data
-  Erroneous Relationships or References
-  Inaccurate Data

Ø  Accurate and complete information
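
The cleansing activities listed above can be sketched as a set of checks applied to each record. Everything here is an assumption for illustration: the customer fields, the set of valid region ids and the rule that a negative age is inaccurate. The sketch only discards bad records and notes the reason; a real process might fix some of them instead.

```python
# Hypothetical customer records, each with one kind of problem noted.
customers = [
    {"id": 1, "name": "Acme Corp", "region_id": 10, "age_years": 12},
    {"id": 1, "name": "Acme Corp", "region_id": 10, "age_years": 12},  # redundant record
    {"id": None, "name": "Globex", "region_id": 10, "age_years": 5},   # missing key
    {"id": 3, "name": "", "region_id": 20, "age_years": 8},            # missing attribute
    {"id": 4, "name": "Initech", "region_id": 99, "age_years": 3},     # unknown region
    {"id": 5, "name": "Umbrella", "region_id": 20, "age_years": -4},   # inaccurate value
    {"id": 6, "name": "Hooli", "region_id": 20, "age_years": 7},
]
valid_region_ids = {10, 20}  # assumed reference table of known regions

def cleanse(records, valid_region_ids):
    """Split records into clean ones and rejected ones with a reason."""
    clean, rejected, seen_ids = [], [], set()
    for rec in records:
        if rec["id"] is None:
            rejected.append((rec, "missing key"))
        elif rec["id"] in seen_ids:
            rejected.append((rec, "redundant record"))
        elif not rec["name"]:
            rejected.append((rec, "missing attribute"))
        elif rec["region_id"] not in valid_region_ids:
            rejected.append((rec, "erroneous reference"))
        elif rec["age_years"] < 0:
            rejected.append((rec, "inaccurate data"))
        else:
            seen_ids.add(rec["id"])
            clean.append(rec)
    return clean, rejected

clean, rejected = cleanse(customers, valid_region_ids)
clean_ids = [rec["id"] for rec in clean]
reasons = [reason for _, reason in rejected]
```

Only records 1 and 6 pass every check; the other five are set aside with the reason they failed, so they can be reviewed rather than loaded into the warehouse.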



Business Intelligence

Ø  Business Intelligence – refers to applications and technologies that are used to gather, provide access to, and analyze data and information to support decision-making efforts

Ø  These systems illustrate business intelligence in areas such as customer profiling, customer support, market research, market segmentation, product profitability, statistical analysis, and inventory and distribution analysis

Ø  E.g., Excel, Access
