SQL Server Intergration Service (SSIS) Architecture
This is an ETL tool a product of MicroSoft.
Integration services is a platform used in building high-performance data integration and workflow solutions, this includes extraction, transformation and loading process for data warehousing.
The data can be processed from various locations and various formats (source locations) and saved to a centralized repository as a Data Warehouse/Data Mart (destination).
Graphical tools and wizards too are included for building and debugging packages.
This is a component of SQL Server and is a descendant of Data Transformation Services-(DTS) that formed a part of SQL Server 7.0. From an end-user perception, DTS and SSIS look similar, but they are different. SSIS has been fully written from the beginning- a new enterprise ETL product and left behind many limitations of DTS. The differences among DTS and SSIS are quite large, and the internal SSIS architecture and DTS too is very different. It has separated the data Flow Engine from the control Flow Engine or SSIS Runtime Engine; designed to attain a higher degree of parallelism and progress the all over performance. Refer the diagram.
SSIS Architecture comprises of the two major components mentioned below
SSIS Runtime Engine
The SSIS runtime engine manages the control flow of a package. This saves the outlay of the packages, runs packages and supports logging, breakpoints, connections, configuration, and transactions. The Run-time engine is a parallel control flow engine which coordinates the implementation of tasks or units of work within SSIS and handles the engine threads which carry out the given tasks.
SSIS runtime engine implements the tasks within a package in a sequential manner. If a runtime engine come across a data flow task within a package while execution, it creates a data flow pipeline and allows that data flow task run in a pipeline.
The Integration service- run on windows is not similar to the SSIS runtime service or engine. It is not necessary only if the design and execution integration service packages are required. This service of windows can be started to handle SSIS packages, such as to connect to multiple SSIS servers, start/stop package in a remote area or locally, control the package store, import or export packages etc.
SSIS Data Flow Engine/Pipeline
SSIS Data Flow engine or Data Flow Pipeline or transformation pipeline engine handles the data flow from the data sources, with transformations on to the target systems. While the execution of Data Flow Task, the SSIS data flow engine extracts the data from one or more sources of data, applies the required transformations and delivers it to one or more destination locations.
The Data flow engine works by buffering. It extracts the data from the source and stores the same in a buffer (memory structure) and applies the transformations in a buffer or memory itself instead of processing it row wise. Processing is much faster without physically copy/stage the data at every step of the data integration; the data flow machine manipulates the data while transferring from the source to the target system or location.
Both the above mentioned components are greatly influenced by the external to SSIS conditions, like network bandwidth and communication with the external systems like database servers, FTP servers or mailing servers.