ETL Testing for Technocrats

Share on FacebookShare on LinkedInTweet about this on TwitterGoogle+


ETL is a significant word in the world of data-warehouse implementations. As the full form of ETL (Extract-Transform-Load) suggests, it is a process which is commonly used in migrating data from source to destination at various stages in Information Management industry.

One can relate this process with commercial tools like Informatica, SSIS etc that are used to develop mappings to implement ETL process. Due to complex data migration and transformation, accurate and efficient testing of the data loaded at destination is becoming increasingly important these days.

ETL testing is a concept which can be applied to different tools and databases in information management industry. The objective of ETL testing is to assure that the data that has been loaded from source to destination after business transformation is accurate. It also involves the verification of data at various middle stages that are being used between a source and destination.

Across the tools and databases the following are two documents that will always be two hands of an ETL Tester. But it also important that the following two document should be in complete state before starting ETL testing. Continuous change in the below two documents will lead to inaccurate testing and re-work.

  • ETL mapping sheets (are developed by ETL developers to create ETL mappings )
  • DB schema of Source, Target and any middle stage that is in between.

An ETL mapping sheets contains all the information of source and destination tables including each and every column and their look-up in reference tables.

An ETL Testers needs to be comfortable with SQL queries as ETL Testing may involve writing big queries with multiple joins to validate data at any stage of ETL. ETL mapping sheets provide a significant help while writing queries for data verification. DB schema should also be kept handy to verify any detail in mapping sheets.