HowToGetSoftwareJob

HowToGetSoftwareJob
HowToGetSoftwareJob Google Hangout

Saturday, 17 March 2012

Data warehouse interview questions

Hi all,

I need interview questions for a data warehouse analyst/developer
position. If you have something to share with me, I would appreciate that.
Thanks...

23 Replies

Hi Guys,
Thanks for your input.
I think cheating is very harsh word. Let's be sympathetic to this
guy who is seeking answers to all DW related terms. I think even
if somebody put all the answers here he will not be able to go too
far, as he has to really work hard to understand so many things
because without hands-on experience its very difficult to cheat,
so he will have to burn lot of midnight oil, so I appreciate him
getting ready for that. At some point of time everybody (including
me)is a cheater as in the beginning even in the US you (or your
consulting company)have to show 5 or more years of experience to
place you in the market. So I request all fellow member of this
board let's take it easy.


Thanks
_______________________________________________________________________


I totally understand that it is hard to break into a professional. That's
why I suggested the poster to try and answer the questions first. You tell
me what you think the answers are and I will tell you where you get it
wrong, if the answers are wrong. He can actually learn and benefit from
trying.

Personally, I have worked with so many so-called DW experts who know nothing
but some theories and buzz words, and it is very frustrating. Take my
previous job for example, this guy is such a good talker and he convinced
management to allow him to architect the DW model. It turned out to be a big
mess and the model is useless. Now he left behind a mess for us to clean up,
and he moved on to work for 7-11. I feel sorry for the folks at 7-11. They
are going to find out in the hard way.

Also, at my previous job, we have 4 Teradata Certified Masters, the highest
honor from NCR. Three of them got the certification by memorization, and
they cannot even set up primary index properly. They kept making mistakes
and someone else have to clean up their mess. Now, do you think it is fair
to the people who have to work with these "under-qualified but over-sold"
so-called experts?

I love to help people, but the person seeking help has to show some effort
first.
______________________________________________________________________

Nic,

I cannot change the American society. The cheating industry goes under the
name: EXAM PREP. Starts even in grade school, where the teachers are
helping students to get higher scores so they can get higher bonuses. The
SAT, the LSAT, etc.

Look at the exam prep industry around MCSE and other network admin
certifications. May even be a billion dollar industy.

It took a year for me to get my MCDBA. I am an experienced database
professional, and studied a lot for the exams.

Beginners can get an MCDBA for $6,000 in a boot camp in 2 weeks. That
devalues my MCDBA, I come under the suspicion.

What can you do? America is obcessed with tests and exams, and the cheating
industry is legal.

Some students on these forums proudly announce how smoothly they cheated
themselves through an exam. Then complain that nobody wants to give them a
job.

_______________________________________________________________________

Hi,

The list of details about the interview questions are excellent and has done a good job in collecting those info=2E

Now coming to the poster who wants this questions and answers=2E
If you clearly look at the list, you cannot answer those questions=2E

Any of our DWH guys would definitely be helpful to you only after you have done your first cycle of study on the topics=2E

Remembers, the list contains only topics=2E You can have study in a manner so that you can under the topics=2E If you start studying on those topics, you will then come to know that all are not answerable=2E Coz, each topic itself can be a book to read=2E

As friends told, you have to do your job of learning=2E Even if you do not have experience, not a problem, have atleast a sound book knowledge and apply them at home for practice=2E While doing them, you will find issues which you can post=2E I feel most of them can be answered=2E

Well, regarding the cheating=2E Everyone of us have put our way of meaning=2E Collectively if you see, all does fit into the context=2E

So Mr Poster, do your job, but yes, you can rely on if you have any issues that you face while learning=2E
_______________________________________________________________________

Data Modelling
Design level principles:
Avoid snow flake
pipleline analysis - snapshots
child level fact and parent level dim (order -order item modeling)
partitioning
surrogate keys

dimension level principles:
time & Date dimension
conform dim
slowly changing dim
junk
mini
hierarchies - flat and bridge table

fact level principles:
lowest granularity
calculated fields - views
do not mix fact granularities
avoid null fks

Performance

SQL performance:
Construction of query
Select clause Hints Indexes: index_ffs, parallel index etc
Operators used for effective indexes
Syntax of the joins
Sub queries/sorts - unions /outer joins
Table physical storage parameters
Analyze - statistics
Access mechanism : scanning /hashing/using indexes etc

Partitioning the tables
Constraints

OLTP Tuning
OLAP tuning
ETL Tuning

Tool specific questions like Informatica

1. Logic For Incremental extract
2. Type of Look up and Look up cache
3. ETL Performance
4. Report Performance
5. SQL Joiner and SQL Qualifier
6. Disabling/Enabling indexes through mapping
7. Previous row logic
8. Informatica metadata repository
9. Difference between Informatica power mart and power centre
10. Informatica partitioning - How it works?
11. Update strategy in SIL mapping - How it works?
12. New row_WID for dim
13. Avoid full load logic
14. Audit Trail Logic
15. Product Attribute Model
16. W_MAP_DIM_M Table
17. How to treat 4 types of fields - Number, text, Date, Currency
18. How to Remove footprints of failed incremental extract
19. Email after Session failure
20. Session Parameter File
21. Diff between OLTP and OLAP
22. Informatica features
23. Features of Repository Manager
24. What are the sources supported by Informatica?
25. External data sources : W_ETL_DATASRC_S & S_ETL_DATASRC
26. W_PARAM_G Table
27. W_ETL_RUN_S
28. How to use Debugger
29. Auxiliary Logic
30. Hierarchical SQL
31. What is RestartNewBatch
32. DATASOURCE_NUM_ID
33. Aggregate Tables
34. Hierarchy Tables - Flattened Hierarchy & Helper Table
35. Subset dimension Table
36. Slowly Changing dimension
37. List of Values
38. Session Configuration
39. Incremental Logic Based on Date Filter
40. Partitioning on Flag
41. Bitmap Index
42. Parallel Hinting
43. New Separate I_IMAGE & R_IMAGE Tables
44. Session Parameters for performance - Data & Index Cache
45. Optimise R_IMAGE only for relevant records
46. Tuning SQL override
47. Aggregate aware
48. What is rule based and cost based optimiser and how it works
49. Employee - Position model at Volume analytics
50. Gather my stats
_____________________________________________________________________

Instead of asking people to give you the answers so that you can memorize
them, which IMHO, is really cheating your prospective employers, I think you
should try and answer these questions yourself first. Then if you come
across something that you are not sure, you can post your answers and
someone will correct you.

In this way, you can truly learn something and benefit from it.
_____________________________________________________________________

Getting answers to all data warehouse related terms should not be a big
task, if he opens any Kimball or DW book then he will get all the answers.

In my view IT industry is so dynamic that everyday there are new problems
and your have to show miracles everyday so fakes can not work more than
couple of months. So nobody should be overly worried about fakes getting on
to projects. In the US hiring and firing policy is very quick so clean up is
very fast.

I totally agree with your frustration due to bad experience, but we are not
saints and perfect. Things happen and many of them we have no control over
so let's keeping going.

I would request the person who wants all answer to show some efforts and let
others know he is serious & equally working hard to get into this field and
I welcome all hard working persons.

I also agree with you that nobody should be allowed to take advantage of our
habit of providing help. Past experience (specially bad ones) teach us many
things also.

________________________________________________________________________

Actually, by going thru books to find the answers, you can still benefit
from the process. Understanding the theories set the foundation. Then you
build from this foundation with *hand-on* experience (this is where most
people on the field are lack of).

I am a team lead in my previous and current jobs, and I am teaching my team
members and junior developers every day. I really do not mind doing that.
But I like to teach them to *think* and not to *memorize*. After spending
some time to try and figure it out by yourself first, then I point you to
the right direction, you can truly learn.... like a lightbulb pops above
your head.

Again, I don't mind to help someone out with the answers after he/she showed
the effort. But I DEMAND that person to be honest at the interview. Don't
ever claim to have years of experience because you have the answers to the
interview questions. While it is true that the employer can fire you if you
are "under-par", but that could be six months later, and someone will have
to clean up his/her mess. I sure don't want to be the one to clean up
someone else's mistakes. I have better things to spend my time.
_____________________________________________________________________

People can buy whatever certification they want. It is really up to the
hiring managers to hire truly qualified people. The real problem is that
most hiring managers are incompetent themselves. All they know are the buzz
words and the popular theories, and when the applicants can "talk the talk",
the managers fail to recognize if the applicants can "walk the walk".

Many companies are learning it the hard way now and they begin to interview
applicants with the "team"....to compensate the incompetence of the
managers. The downside is that, if his/her existing team is not very good,
they still cannot weed out the bad ones.

Good managers will not hire anyone based on certification. I don't care if
you have 900 certifications, but if you do not have real life working
experience or success stories, I will not hire you..... well, unless there
is an entry-level position open.

There is simply no shortcut. Everybody has to earn his/her wing. Period!
____________________________________________________________________

The following are the topics.
You cannot termn them as interview questions. But it consists of topics that one needs to be sure of.

Data Modelling
Design level principles:
Avoid snow flake
pipleline analysis - snapshots
child level fact and parent level dim (order -order item modeling)
partitioning
surrogate keys

dimension level principles:
time & Date dimension
conform dim
slowly changing dim
junk
mini
hierarchies - flat and bridge table

fact level principles:
lowest granularity
calculated fields - views
do not mix fact granularities
avoid null fks

Performance

SQL performance:
Construction of query
Select clause Hints Indexes: index_ffs, parallel index etc
Operators used for effective indexes
Syntax of the joins
Sub queries/sorts - unions /outer joins
Table physical storage parameters
Analyze - statistics
Access mechanism : scanning /hashing/using indexes etc

Partitioning the tables
Constraints

OLTP Tuning
OLAP tuning
ETL Tuning

Tool specific questions like Informatica

1. Logic For Incremental extract
2. Type of Look up and Look up cache
3. ETL Performance
4. Report Performance
5. SQL Joiner and SQL Qualifier
6. Disabling/Enabling indexes through mapping
7. Previous row logic
8. Informatica metadata repository
9. Difference between Informatica power mart and power centre
10. Informatica partitioning - How it works?
11. Update strategy in SIL mapping - How it works?
12. New row_WID for dim
13. Avoid full load logic
14. Audit Trail Logic
15. Product Attribute Model
16. W_MAP_DIM_M Table
17. How to treat 4 types of fields - Number, text, Date, Currency
18. How to Remove footprints of failed incremental extract
19. Email after Session failure
20. Session Parameter File
21. Diff between OLTP and OLAP
22. Informatica features
23. Features of Repository Manager
24. What are the sources supported by Informatica?
25. External data sources : W_ETL_DATASRC_S & S_ETL_DATASRC
26. W_PARAM_G Table
27. W_ETL_RUN_S
28. How to use Debugger
29. Auxiliary Logic
30. Hierarchical SQL
31. What is RestartNewBatch
32. DATASOURCE_NUM_ID
33. Aggregate Tables
34. Hierarchy Tables - Flattened Hierarchy & Helper Table
35. Subset dimension Table
36. Slowly Changing dimension
37. List of Values
38. Session Configuration
39. Incremental Logic Based on Date Filter
40. Partitioning on Flag
41. Bitmap Index
42. Parallel Hinting
43. New Separate I_IMAGE & R_IMAGE Tables
44. Session Parameters for performance - Data & Index Cache
45. Optimise R_IMAGE only for relevant records
46. Tuning SQL override
47. Aggregate aware
48. What is rule based and cost based optimiser and how it works
49. Employee - Position model at Volume analytics
50. Gather my stats

__________________________________________________________________

                                            


       

No comments:

Post a Comment