HowToGetSoftwareJob

Monday, 26 March 2012

Teradata Data Warehousing Interview Questions Part 2

Q.How do you handle decimal places while importing a flat file into Informatica?
A.While importing the flat file definition, just specify the scale for the numeric datatype. In the mapping, the flat file source supports only the Number datatype (no Decimal or Integer). The Source Qualifier associated with that source will have a Decimal datatype for that Number port of the source.
Source -> Number datatype port -> Source Qualifier -> Decimal datatype. Integer is not supported; hence Decimal takes care of it.

Q.What is a parameter file?
A.When you start a workflow, you can optionally enter the directory and name of a parameter file. The Informatica Server runs the workflow using the parameters in the file you specify.
For UNIX shell users, enclose the parameter file name in single quotes:
-paramfile '$PMRootDir/myfile.txt'

For Windows command prompt users, the parameter file name cannot have beginning or trailing spaces. If the name includes spaces, enclose the file name in double quotes:
-paramfile "$PMRootDir\my file.txt"

When you write a pmcmd command that includes a parameter file located on another machine, use the backslash (\) with the dollar sign ($). This ensures that the machine where the variable is defined expands the server variable.
pmcmd startworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 -f east -w wSalesAvg -paramfile '\$PMRootDir/myfile.txt'
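
For reference, a parameter file is a plain text file of bracketed section headings followed by name=value pairs. A minimal sketch is shown below; the session name and the parameter names and values are invented placeholders (only the east folder and wSalesAvg workflow come from the pmcmd example above), not taken from a real project:
[east.WF:wSalesAvg.ST:s_SalesAvg]
$$StartDate=03/01/2012
$DBConnection_Source=Oracle_Sales
$PMSessionLogFile=$PMRootDir/SessLogs/s_SalesAvg.log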

Q.What is aggregate cache in aggregator transformation?
A.When you run a workflow that uses an Aggregator transformation, the Informatica Server creates index and data caches in memory to process the transformation. If the Informatica Server requires more space, it stores overflow values in cache files.

Q.Why do you use repository connectivity?
A.Each time you edit or schedule a session, the Informatica Server communicates directly with the repository to check whether the session and users are valid. All the metadata of sessions and mappings is stored in the repository.

Q.Briefly explain the Versioning Concept in Power Center 7.1.
A.When you create a version of a folder referenced by shortcuts, all shortcuts continue to reference their original object in the original version. They do not automatically update to the current folder version. For example, if you have a shortcut to a source definition in the Marketing folder, version 1.0.0, and you then create a new folder version, 1.5.0, the shortcut continues to point to the source definition in version 1.0.0. Maintaining versions of shared folders can result in shortcuts pointing to different versions of the folder. Though shortcuts to different versions do not affect the server, they might prove more difficult to maintain. To avoid this, you can recreate shortcuts pointing to earlier versions, but this solution is not practical for much-used objects. Therefore, when possible, do not version folders referenced by shortcuts.

Q.What is source qualifier transformation?
A.When you add a relational or a flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier represents the rows that the Informatica Server reads when it executes a session. You can use it to:
1.Join data originating from the same source database. You can join two or more tables with primary-foreign key relationships by linking the sources to one Source Qualifier.
2.Filter records when the Informatica Server reads source data. If you include a filter condition, the Informatica Server adds a WHERE clause to the default query.
3.Specify an outer join rather than the default inner join. If you include a user-defined join, the Informatica Server replaces the join information specified by the metadata in the SQL query.
4.Specify sorted ports. If you specify a number for sorted ports, the Informatica Server adds an ORDER BY clause to the default SQL query.
5.Select only distinct values from the source. If you choose Select Distinct, the Informatica Server adds a SELECT DISTINCT statement to the default SQL query.
6.Create a custom query to issue a special SELECT statement for the Informatica Server to read source data. For example, you might use a custom query to perform aggregate calculations or execute a stored procedure.
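
To make the list above concrete, here is a rough sketch of the kind of SQL the Source Qualifier could end up issuing when a filter condition and sorted ports are set (the table and column names are invented for illustration):
SELECT ORDERS.ORDER_ID, ORDERS.CUSTOMER_ID, ORDERS.ORDER_AMT
FROM ORDERS
WHERE ORDERS.ORDER_AMT > 0
ORDER BY ORDERS.ORDER_ID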

Q.What is a source qualifier?
A.When you add a relational or a flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier represents the rows that the Informatica Server reads when it executes a session.

Q.What is incremental aggregation?
A.When using incremental aggregation, you apply captured changes in the source to aggregate calculations in a session. If the source changes only incrementally and you can capture changes, you can configure the session to process only those changes. This allows the Informatica Server to update your target incrementally, rather than forcing it to process the entire source and recalculate the same calculations each time you run the session.

Q.If your workflow is running slow in Informatica, where do you start troubleshooting and what are the steps you follow?
A.When the workflow is running slowly, you have to find the bottlenecks in this order: target, source, mapping, session, system.

Q.What is the exact use of the 'Online' and 'Offline' server connect options while defining a workflow in the Workflow Monitor? (Context: the system hangs with the 'Online' server connect option; Informatica is installed on a personal laptop.)
A.When the repository is up and the PMSERVER is also up, the Workflow Monitor will always be connected online. When the PMSERVER is down and the repository is still up, we will be prompted for an offline connection, with which we can just monitor the workflows.

Q.Explain perform recovery.
A.When the Informatica Server starts a recovery session, it reads the OPB_SRVR_RECOVERY table and notes the row ID of the last row committed to the target database.The Informatica Server then reads all sources again and starts processing from the next row ID. For example, if the Informatica Server commits 10,000 rows before the session fails, when you run recovery, the Informatica Server bypasses the rows up to 10,000 and starts loading with row 10,001.
By default, Perform Recovery is disabled in the Informatica Server setup. You must enable Recovery in the Informatica Server setup before you run a session so the Informatica Server can create and/or write entries in the OPB_SRVR_RECOVERY table.

Q.What is a common data source for the central enterprise data warehouse?
A.ODS => Operational Data Store

Q.What is EXPLAIN and how does it work?
A.The EXPLAIN facility is a Teradata extension that provides you with an "English" translation of the steps chosen by the optimizer to execute an SQL statement. It may be used on any valid Teradata SQL statement by prefacing it with the keyword EXPLAIN. The following is an example:
EXPLAIN SELECT last_name, first_name FROM employees;
EXPLAIN parses the SQL statement but does not execute it. This provides the designer with an "execution strategy". The execution strategy shows what the optimizer does, but not why it chose those steps. The EXPLAIN facility is used to analyze all joins and complex queries.
 
Q.What is optimization and performance tuning and how does it really work in practical projects?
A.Performance tuning and optimization of a query involves collecting statistics on join columns, avoiding cross-product joins, selecting an appropriate primary index (to avoid skew in storage), and using secondary indexes. Avoiding NUSIs (non-unique secondary indexes) is advisable.
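
As a small illustration of two of these points, collecting statistics on a join column and choosing the primary index at table creation might look roughly like this (the database, table, and column names are invented):
COLLECT STATISTICS ON sales_db.orders COLUMN (customer_id);

CREATE TABLE sales_db.orders_new
( order_id     INTEGER,
  customer_id  INTEGER,
  order_amt    DECIMAL(10,2)
)
PRIMARY INDEX (customer_id);   /* chosen to spread rows evenly across the AMPs */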

Q.What is the difference between Global temporary tables and Volatile temporary tables?
A.Global Temporary Tables (GTT) -
1.When they are created, the table definition goes into the Data Dictionary.
2.When materialized, the data goes into temp space.
3.That is why the data is active only until the session ends, while the definition remains until it is dropped with a DROP TABLE statement (use DROP TABLE ALL if dropping from another session).
4.You can collect statistics on a GTT.

Volatile Temporary Tables (VTT) -
1.The table definition is stored in the system cache.
2.The data is stored in spool space.
3.Both the data and the table definition are active only until the session ends.
4.You cannot collect statistics on a VTT. (Sample DDL for both is shown below.)
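
A minimal sketch of the DDL for both (the database, table, and column names are invented; ON COMMIT PRESERVE ROWS keeps the inserted rows for the rest of the session):
CREATE GLOBAL TEMPORARY TABLE work_db.gtt_orders
( order_id  INTEGER,
  order_amt DECIMAL(10,2)
)
ON COMMIT PRESERVE ROWS;   /* definition stays in the Data Dictionary */

CREATE VOLATILE TABLE vt_orders
( order_id  INTEGER,
  order_amt DECIMAL(10,2)
)
ON COMMIT PRESERVE ROWS;   /* definition and data vanish when the session ends */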

Q.How does Teradata make sure that there are no duplicate rows being inserted when it is a SET table?
A.Teradata will redirect the newly inserted row, as per its PI, to the target AMP (on the basis of its row hash value). If it finds the same row hash value on that AMP (a hash synonym), it starts comparing the whole row to find out whether it is a duplicate. If it is a duplicate, it silently skips it without throwing any error.
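
For reference, a rough sketch of the DDL involved (names are invented): the SET keyword is what triggers the duplicate-row check described above, while MULTISET allows duplicate rows:
CREATE SET TABLE hr_db.emp_set
( emp_id   INTEGER,
  emp_name VARCHAR(50)
)
PRIMARY INDEX (emp_id);

CREATE MULTISET TABLE hr_db.emp_multi
( emp_id   INTEGER,
  emp_name VARCHAR(50)
)
PRIMARY INDEX (emp_id);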

Q.How many error tables are there in FastLoad and what are their significance/use?
A.FastLoad uses 2 error tables:
1.Error table 1: where the format of the data is not correct.
2.Error table 2: violations of the UPI (unique primary index).

MultiLoad also uses 2 error tables (ET and UV), 1 work table and 1 log table:
1.ET TABLE - Data errors
MultiLoad uses the ET table, also called the Acquisition Phase error table, to store data errors found during the acquisition phase of a MultiLoad import task.

2.UV TABLE - UPI violations
MultiLoad uses the UV table, also called the Application Phase error table, to store data errors found during the application phase of a MultiLoad import or delete task.

3.WORK TABLE - WT
MultiLoad loads the selected records into the work table.

4.LOG TABLE
A log table maintains a record of all checkpoints related to the load job; it is essential/mandatory to specify a log table in an MLoad job. This table is useful in case the job aborts or restarts for any reason.
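
As a rough illustration of where the two FastLoad error tables are named, a minimal FastLoad script might look like the sketch below (the logon string, database, table, and file names are placeholders, not real ones):
LOGON tdpid/username,password;
/* the two error tables named here hold format errors and UPI violations */
BEGIN LOADING sales_db.orders_stg
   ERRORFILES sales_db.orders_err1, sales_db.orders_err2
   CHECKPOINT 100000;
SET RECORD VARTEXT ",";
DEFINE order_id  (VARCHAR(10)),
       order_amt (VARCHAR(12))
   FILE = orders.txt;
INSERT INTO sales_db.orders_stg
   VALUES (:order_id, :order_amt);
END LOADING;
LOGOFF;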

Q.What are the enhanced features in Teradata V2R5 and V2R6?
A.V2R6 included the replication feature, in which a copy of the database is available on another system. This means V2R6 provides additional data protection compared to V2R5 in case the data on one system is lost.

Q.What are the default privileges which get assigned?
A.The GRANT option for any particular table depends on the privileges of the user. If it is an admin user, you can grant privileges at any point of time. The default roles associated with newly created tables depend on the schema in which they are created.

Q.What are cliques? What is a Vdisk and how does it communicate with physical data storage at the time of data retrieval through the AMP?
A.A clique is a set of Teradata nodes that share a common set of disk arrays. Cabling a subset of nodes to the same disk arrays creates a clique. Each AMP vproc must have access to an array controller, which in turn accesses the physical disks. AMP vprocs are associated with one or more ranks (or mirrored pairs) of data. The total disk space associated with an AMP is called a vdisk. A vdisk may have up to three ranks. Hence the vdisk communicates with physical storage through the array controllers.

Q.What is Basic Teradata Query language?
A.BTEQ (Basic Teradata Query) allows us to write SQL statements along with BTEQ commands. We can use BTEQ for importing, exporting and reporting purposes. BTEQ commands start with a dot (.) and can be terminated with a semicolon (;), though the semicolon is not mandatory for them. SQL statements do not start with a dot, but the semicolon (;) is compulsory to terminate them. BTEQ will treat anything written without a leading dot as an SQL statement and requires a (;) to terminate it.
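
A minimal sketch of a BTEQ script showing the dot-command versus SQL distinction (the logon string and export file name are placeholders; the employees table is reused from the EXPLAIN example above):
.LOGON tdpid/username,password
SELECT last_name, first_name FROM employees;
.EXPORT REPORT FILE = emp_report.txt
SELECT last_name, first_name FROM employees ORDER BY last_name;
.EXPORT RESET
.LOGOFF
.QUIT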

Q.How many of Codd's rules are satisfied by the Teradata database?
A.All 12 of Codd's rules are satisfied by the Teradata database.
