Javatpoint Azure Data Factory — 2021

Azure Data Factory (ADF) is Microsoft's cloud-native data integration service. It enables businesses to build ETL/ELT pipelines that orchestrate data movement and transformation at scale. The name "factory" is fitting: just as a manufacturing plant turns raw materials into finished goods, ADF takes raw data from disparate sources and turns it into data that is ready for analysis.

A pipeline is a logical grouping of activities that perform a task together. For example, a single pipeline might contain one activity that copies data from a SQL database and a second activity that runs a Databricks notebook to analyze that data.
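To make the grouping concrete, here is a minimal Python sketch that models the two-activity example as a pipeline-shaped dictionary and works out a valid run order. The activity names (`CopyFromSql`, `AnalyzeInDatabricks`), the pipeline name, and the helper functions are hypothetical; the `dependsOn` shape follows the layout ADF pipeline JSON generally uses, but treat it as an illustration rather than a complete definition that could be deployed.

```python
import json


def build_pipeline(name, activities):
    """Wrap a list of activity dicts in a pipeline-shaped dict."""
    return {"name": name, "properties": {"activities": activities}}


# Hypothetical activity names, for illustration only.
copy_activity = {
    "name": "CopyFromSql",
    "type": "Copy",
    "dependsOn": [],
}
notebook_activity = {
    "name": "AnalyzeInDatabricks",
    "type": "DatabricksNotebook",
    # Run only after the copy step has succeeded.
    "dependsOn": [
        {"activity": "CopyFromSql", "dependencyConditions": ["Succeeded"]}
    ],
}

pipeline = build_pipeline("CopyThenAnalyze", [copy_activity, notebook_activity])


def run_order(pipeline):
    """Return activity names in an order that respects dependsOn."""
    pending = {
        a["name"]: {d["activity"] for d in a["dependsOn"]}
        for a in pipeline["properties"]["activities"]
    }
    ordered = []
    while pending:
        # An activity is ready once all of its dependencies have been ordered.
        ready = sorted(n for n, deps in pending.items() if deps <= set(ordered))
        if not ready:
            raise ValueError("circular dependency between activities")
        ordered.extend(ready)
        for n in ready:
            del pending[n]
    return ordered


print(json.dumps(pipeline, indent=2))
print(run_order(pipeline))
```

The sketch only orders the activities; in ADF itself the service schedules and runs them.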

A healthcare provider with multiple clinics needed to consolidate patient data into a central Azure Data Warehouse. They built an ADF pipeline with two key components: one for full refreshes and another for incremental loads based on last-updated timestamps. This architecture ensured timely analytics while minimizing data transfer volumes.
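The incremental-load idea can be sketched in a few lines of Python. This is not the provider's actual pipeline: the record layout, the `last_updated` field name, and the `incremental_rows` helper are assumptions made for illustration. The sketch keeps a "watermark" (the latest timestamp already loaded) and picks up only rows changed after it. A full refresh is the same call with the earliest possible watermark.

```python
from datetime import datetime


def incremental_rows(rows, watermark):
    """Return rows changed after the watermark, plus the new watermark."""
    changed = [r for r in rows if r["last_updated"] > watermark]
    # If nothing changed, keep the old watermark.
    new_watermark = max((r["last_updated"] for r in changed), default=watermark)
    return changed, new_watermark


# Made-up sample records standing in for a clinic's source table.
source_rows = [
    {"patient_id": 1, "last_updated": datetime(2021, 3, 1, 9, 0)},
    {"patient_id": 2, "last_updated": datetime(2021, 3, 2, 14, 30)},
    {"patient_id": 3, "last_updated": datetime(2021, 3, 3, 8, 15)},
]

# Full refresh: start from the earliest possible watermark.
full_load, watermark = incremental_rows(source_rows, datetime.min)

# Later, one record is updated at the source.
source_rows[0]["last_updated"] = datetime(2021, 3, 4, 10, 0)

# Incremental load: only the changed record is transferred.
delta_load, watermark = incremental_rows(source_rows, watermark)

print(len(full_load), [r["patient_id"] for r in delta_load], watermark)
```

Because only rows newer than the stored watermark are read, each incremental run moves a small fraction of the data that a full refresh would.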

    {
      "name": "CopyFromBlobToSql",
      "type": "Copy",
      "typeProperties": {
        "source": { "type": "BlobSource", "recursive": true },
        "sink": { "type": "SqlSink", "writeBatchSize": 1000 }
      },
      "inputs": [
        { "referenceName": "BlobDataset", "type": "DatasetReference" }
      ],
      "outputs": [
        { "referenceName": "SqlDataset", "type": "DatasetReference" }
      ]
    }

Clean, aggregate, and reshape the raw data using visual Mapping Data Flows or external compute services like Azure Databricks and Synapse Analytics.

Integration Runtime (IR): This is the compute infrastructure used by ADF to provide data movement and activity execution capabilities across different network environments.

| Transformation | Purpose |
|---|---|
| Source | Reads from a dataset (JSON, Parquet, CSV). |
| Filter | Removes rows based on a condition (e.g., `Price > 100`). |
| Derived Column | Creates new columns or modifies existing ones (e.g., `Total = Price * Quantity`). |
| Aggregate | Groups rows and computes sum, avg, min, max. |
| Join | Combines two streams (Inner, Left Outer, Full Outer). |
| Sink | Writes transformed data to the destination (Delta Lake, SQL, ADLS). |
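As a rough analogy, the transformations in the table can be mimicked in plain Python on in-memory rows. This is not how Mapping Data Flows run (they are designed visually and run on managed compute); the sample rows, column names, and product lookup below are invented for illustration. The filter and derived-column expressions mirror the examples in the table.

```python
from collections import defaultdict

# Source: made-up rows standing in for a dataset read from CSV/JSON/Parquet.
orders = [
    {"order_id": 1, "product_id": "A", "price": 150.0, "quantity": 2},
    {"order_id": 2, "product_id": "B", "price": 80.0, "quantity": 5},
    {"order_id": 3, "product_id": "A", "price": 120.0, "quantity": 1},
]
# Second stream for the join: a hypothetical product lookup.
products = {"A": "Widget", "B": "Gadget"}

# Filter: keep rows where Price > 100.
filtered = [o for o in orders if o["price"] > 100]

# Derived Column: Total = Price * Quantity.
derived = [{**o, "total": o["price"] * o["quantity"]} for o in filtered]

# Join (inner): attach the product name; rows without a match are dropped.
joined = [
    {**o, "product_name": products[o["product_id"]]}
    for o in derived
    if o["product_id"] in products
]

# Aggregate: group by product name and sum the totals.
totals = defaultdict(float)
for row in joined:
    totals[row["product_name"]] += row["total"]

# Sink: here the "destination" is just a dict printed to the console.
result = dict(totals)
print(result)
```

The join is applied before the aggregate here so that grouping can use the product name; in a real data flow the order of transformations is chosen to suit the data.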