Data Warehouse Interview Questions: Difference between metadata and data dictionary

Q.What is the difference between metadata and a data dictionary?
A.Metadata describes data; it is ‘data about data’. It records how, when, and by whom a given piece of data was collected, and in what format. It is essential for understanding the information stored in data warehouses and XML-based web applications.

A data dictionary is a file that holds the basic definitions of a database. It lists the files that are available in the database, the number of records in each file, and information about the fields.

A data dictionary is a repository that stores information about the database. Metadata is data that defines other data. Hence, a data dictionary is itself a form of metadata: it describes the contents and structure of the database.
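
As a rough illustration of the idea, the sketch below builds a tiny data dictionary (table names, record counts, and field information) from a database's own catalog. It assumes Python's built-in sqlite3 module; the employee table and the describe_database helper are made up for this example.

```python
import sqlite3

# Hypothetical in-memory database with one small table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)
conn.executemany(
    "INSERT INTO employee (name, city) VALUES (?, ?)",
    [("Asha", "Pune"), ("Ravi", "Delhi")],
)


def describe_database(conn):
    """Return {table: {"row_count": n, "columns": [(name, type), ...]}}."""
    dictionary = {}
    # sqlite_master is SQLite's catalog of schema objects (its own metadata).
    tables = [
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    ]
    for table in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, default, pk) per column.
        columns = [
            (col[1], col[2]) for col in conn.execute(f"PRAGMA table_info({table})")
        ]
        row_count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        dictionary[table] = {"row_count": row_count, "columns": columns}
    return dictionary


data_dictionary = describe_database(conn)
```

Here data_dictionary is metadata in the sense used above: it says nothing about any particular employee, only about how the employee data is organized.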

Q.Describe the various methods of loading Dimension tables.
A.The following are the methods of loading dimension tables:

Conventional Load:
In this method, the data is checked against all the table constraints before it is loaded.

Direct Load or Faster Load:
As the name suggests, the data is loaded directly without checking the constraints. The data is checked against the table constraints later, and bad data is not indexed.

The methods to load dimension tables:

Conventional load: The data is checked against the table constraints before loading.
Direct or faster load: The data is loaded directly, without checking any constraints.
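
The difference between the two methods can be sketched in plain Python. This is only a toy model, not how any particular loader works: the constraints (a unique key and a non-empty name) and all function names are assumptions made for the example.

```python
def violates_constraints(row, seen_keys):
    """Hypothetical constraints: the key must be unique and the name non-empty."""
    return row["key"] in seen_keys or not row["name"]


def conventional_load(rows):
    """Check each row against the constraints BEFORE it enters the table."""
    table, rejected, seen = [], [], set()
    for row in rows:
        if violates_constraints(row, seen):
            rejected.append(row)
        else:
            table.append(row)
            seen.add(row["key"])
    return table, rejected


def direct_load(rows):
    """Load everything first, then check constraints and index only good rows."""
    table = list(rows)  # bulk load with no checks
    index, bad_positions = {}, []
    for position, row in enumerate(table):
        if violates_constraints(row, index):
            bad_positions.append(position)  # bad data is left out of the index
        else:
            index[row["key"]] = position
    return table, index, bad_positions


rows = [
    {"key": 1, "name": "A"},
    {"key": 1, "name": "B"},  # duplicate key
    {"key": 2, "name": ""},   # empty name
]
loaded, rejected = conventional_load(rows)
table, index, bad_positions = direct_load(rows)
```

With the sample rows, the conventional load keeps only the first row, while the direct load keeps all three in the table but indexes only the first.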

Q.What is the difference between OLAP and data warehouse?
A.The following are the differences between OLAP and data warehousing:

Data Warehouse

Data from different data sources is stored in a relational database for end-user analysis. The data is organized in summarized, aggregated, non-volatile, and subject-oriented form. The warehouse supports the analysis of data but does not by itself provide online analysis.

Online Analytical Processing

Analytical queries are used to analyze and evaluate the data in the data warehouse. Data aggregation and summarization are used to organize the data in multidimensional models. OLAP gives data analysts speed and flexibility for online data analysis in a real-time environment.

A data warehouse serves as a repository of historical data that can be used for analysis. OLAP is Online Analytical Processing, which can be used to analyze and evaluate the data in a warehouse. The warehouse holds data coming from varied sources. An OLAP tool helps to organize the data in the warehouse using multidimensional models.

Q.Describe the foreign key columns in fact table and dimension table.
A.The primary keys of the entity tables are the foreign keys of the dimension tables. The primary keys of the dimension tables are the foreign keys of the fact tables.

A foreign key in a fact table references a dimension table. A dimension table, on the other hand, is the referenced table: it receives foreign key references from one or more tables.

Q.What is cube grouping?
A.A cube group is a set of similar cubes built by Transformer. Each cube group is related to a single level in one dimension of the model. Cube groups are generally used to create smaller cubes based on the data at that level of the dimension.

Q.Define the term slowly changing dimensions (SCD).
A.In some warehousing tools, the slowly changing dimension target operator is one of the SQL warehousing operators that can be used in a mining flow or a data flow. The SCD technique is applied when an attribute of a record varies over time.

SCDs are dimensions whose data changes very slowly. An example is the city of an employee, which changes only rarely. When it does change, the row in the dimension can be replaced completely without keeping any trace of the old record (Type 1), a new row can be inserted (Type 2), or the change can be tracked in the existing row (Type 3).
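
The first two options can be sketched as follows, using the employee city example. This is a minimal illustration on in-memory rows, not a warehouse implementation; the column names (valid_from, valid_to) and function names are assumptions chosen for the example.

```python
from datetime import date


def apply_overwrite(dimension, employee_id, new_city):
    """Replace the value in place; the old city is lost."""
    for row in dimension:
        if row["employee_id"] == employee_id:
            row["city"] = new_city


def apply_new_row(dimension, employee_id, new_city, change_date):
    """Close the current row and insert a new one, so history is kept."""
    for row in dimension:
        if row["employee_id"] == employee_id and row["valid_to"] is None:
            if row["city"] == new_city:
                return  # nothing changed
            row["valid_to"] = change_date  # expire the old version
    dimension.append(
        {
            "employee_id": employee_id,
            "city": new_city,
            "valid_from": change_date,
            "valid_to": None,  # None marks the current version
        }
    )


overwritten = [{"employee_id": 1, "city": "Pune"}]
apply_overwrite(overwritten, 1, "Delhi")

history = [
    {"employee_id": 1, "city": "Pune", "valid_from": date(2010, 1, 1), "valid_to": None}
]
apply_new_row(history, 1, "Delhi", date(2012, 3, 26))
```

After these calls, overwritten holds a single row with the new city, while history holds two rows: the expired Pune row and the current Delhi row.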

Q.What is a Star Schema?
A.The star schema is the simplest data warehousing schema. It consists of fact tables that refer to any number of dimension tables. It can be regarded as a special case of the snowflake schema.

A star schema comprises fact and dimension tables. The fact table contains the facts, that is, the actual data; it usually stores numerical data in multiple columns and many rows. Dimension tables contain the descriptive attributes. The fact table in a star schema has foreign key references to the dimension tables.
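
A small star schema might look like the sketch below, written with Python's built-in sqlite3 module. The table and column names (fact_sales, dim_product, dim_store) and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- Dimension tables: descriptive attributes, one row per product or store.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, city TEXT);

    -- Fact table: numeric measures plus a foreign key to each dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id INTEGER REFERENCES dim_store(store_id),
        amount REAL
    );

    INSERT INTO dim_product VALUES (1, 'Pen'), (2, 'Book');
    INSERT INTO dim_store VALUES (10, 'Pune'), (20, 'Delhi');
    INSERT INTO fact_sales VALUES (1, 10, 5.0), (2, 10, 20.0), (1, 20, 7.5);
    """
)

# A typical query joins the fact table to a dimension and aggregates a measure.
sales_by_city = conn.execute(
    """
    SELECT s.city, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_store AS s ON f.store_id = s.store_id
    GROUP BY s.city
    ORDER BY s.city
    """
).fetchall()
```

Because the fact table's columns reference the dimension tables' primary keys, inserting a sale for a product that does not exist in dim_product is rejected.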

Q.What are the differences between star and snowflake schemas?
A.Star Schema: A de-normalized design in which one fact table is associated with several dimension tables. It resembles a star.

Snowflake Schema: A star schema to which normalization principles have been applied is known as a snowflake schema. Dimension tables are associated with sub-dimension tables.

Q.Explain the use of lookup tables and Aggregate tables.
A.A lookup table is used when the data warehouse is updated. The lookup is placed on the fact table or warehouse target and is based on the primary key of the target; depending on the lookup condition, the update lets through only new records or updated records.

Aggregate tables, often implemented as materialized views, contain summarized data. For example, to generate sales reports on a weekly, monthly, or yearly basis instead of a daily basis, the date values are aggregated into week values, week values into month values, and month values into year values. The @aggregate function is used to perform this process.

An aggregate table contains a summarized view of the data. Lookup tables, using the primary key of the target, allow records to be updated based on the lookup condition.
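
The roll-up behind an aggregate table can be sketched in a few lines of Python. This is an illustration of the summarization idea only, not of the @aggregate function mentioned above; the sample figures and function names are made up.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales rows: (day, amount).
daily_sales = [
    (date(2012, 1, 5), 100.0),
    (date(2012, 1, 20), 50.0),
    (date(2012, 2, 3), 75.0),
]


def build_monthly_aggregate(daily_rows):
    """Summarize daily rows into one total per (year, month)."""
    totals = defaultdict(float)
    for day, amount in daily_rows:
        totals[(day.year, day.month)] += amount
    return dict(totals)


def build_yearly_aggregate(monthly_totals):
    """Summarize monthly totals into one total per year."""
    totals = defaultdict(float)
    for (year, _month), amount in monthly_totals.items():
        totals[year] += amount
    return dict(totals)


monthly = build_monthly_aggregate(daily_sales)
yearly = build_yearly_aggregate(monthly)
```

A monthly report can then read the three-row daily table's two-row summary instead of scanning every daily record, which is the point of keeping aggregate tables.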

HowToGetSoftwareJob

Monday, 26 March 2012
