Airflow XCom Exclusive

If you use a custom cloud backend, set an Object Lifecycle Management policy on your S3/GCS bucket to automatically delete XCom files after 14 or 30 days to control cloud storage costs.

5. Summary Cheat Sheet

| | Standard XComs | Exclusive Custom Backend XComs |
|-------|--------------|--------------|
| Storage Location | Airflow Metadata DB (airflow.db) | External Cloud Storage (S3, GCS, Azure) |
| Size Limit | Depends on the database; small in practice (for example, about 64 KB for a MySQL BLOB column) | Virtually unlimited (gigabyte scale) |
| Performance Impact | High risk of DB bloat and UI sluggishness | Minimal impact on DB transactional performance, since only a reference is stored |
| Best Used For | Metadata, operational flags, small string IDs | Pandas DataFrames, large JSON strings, heavy logs |

For enterprise data pipelines, storing large payloads in the metadata database is a significant anti-pattern. Airflow provides a feature to override this behavior: custom XCom backends.
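As a rough sketch of what such a backend does internally, the helpers below write a payload to external storage and keep only a short reference string for the metadata database. Everything named here is an assumption for illustration: the `xcom-ref://` prefix, the `offload` and `resolve` functions, and a local directory standing in for an S3 or GCS bucket. In a real custom backend this logic would sit in the `serialize_value` and `deserialize_value` overrides of a `BaseXCom` subclass, enabled through the `xcom_backend` option in the `[core]` section of the Airflow configuration.

```python
import json
import tempfile
import uuid
from pathlib import Path

# Hypothetical marker identifying values that live outside the metadata DB.
REF_PREFIX = "xcom-ref://"


def offload(value, storage_dir):
    """Write value to external storage and return a short reference string."""
    path = Path(storage_dir) / f"{uuid.uuid4().hex}.json"
    path.write_text(json.dumps(value))
    return REF_PREFIX + str(path)


def resolve(stored):
    """Return the original value for a reference; pass other values through."""
    if isinstance(stored, str) and stored.startswith(REF_PREFIX):
        return json.loads(Path(stored[len(REF_PREFIX):]).read_text())
    return stored


if __name__ == "__main__":
    # A temporary directory stands in for the bucket in this demo.
    with tempfile.TemporaryDirectory() as bucket:
        ref = offload({"rows": list(range(1000))}, bucket)
        print(ref)  # only this short string would reach the database
        print(len(resolve(ref)["rows"]))
```

Small values that are not references pass through `resolve` unchanged, so a backend built this way could still keep flags and IDs in the database.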

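```python
from airflow.models.xcom import BaseXCom


class ExclusiveXCom(BaseXCom):
    # Each tuple appears to be (dag_id, producer task_id, consumer task_id),
    # mapped to the XCom keys that consumer is allowed to pull.
    ALLOWED_PULLS = {
        ("dag_etl", "extract", "load"): ["rows_count"],
        ("dag_etl", "transform", "report"): ["aggregated_metrics"],
    }
```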

```python
from airflow.decorators import task


@task
def get_exclusive_token():
    return "secret-token-123"


@task
def process_data(token):
    print(f"Using token: {token}")


# Inside a DAG definition: Airflow handles the XCom exchange automatically
token = get_exclusive_token()
process_data(token)
```

Explicit Key Management

In practice, an XCom Exclusive looks like this: a task generates a unique resource identifier (e.g., an S3 key, a BigQuery table name, a Spark application ID). It pushes only that string. Downstream tasks then use that string to interact with the external system directly.
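The pattern described above might look like the following minimal sketch. The function names, the `orders.csv` file, and the use of a local directory in place of S3 are all assumptions made for illustration; under the TaskFlow API each function would carry the `@task` decorator, and the returned string is the only value that would pass through XCom.

```python
import csv
import tempfile
from pathlib import Path


def extract(output_dir):
    """Write the full dataset externally and return only its location."""
    path = Path(output_dir) / "orders.csv"
    with path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["order_id", "amount"])
        writer.writerows([[1, 10.0], [2, 32.5]])
    return str(path)  # the identifier, not the data


def load(path):
    """Re-open the dataset from its external location and total the amounts."""
    with open(path, newline="") as f:
        return sum(float(row["amount"]) for row in csv.DictReader(f))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as storage:
        location = extract(storage)
        print(location)
        print(load(location))
```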


The primary way to handle these communications is through the xcom_pull() method.
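A minimal sketch of that call, paired with its `xcom_push()` counterpart, is shown here. It assumes both functions are used as `python_callable` for a `PythonOperator`, where Airflow supplies the task instance as `ti`; the task id `extract` and the key `rows_count` are illustrative names, and the value 1250 is made up.

```python
def extract(ti):
    """Producer: push a small value under an explicit key."""
    ti.xcom_push(key="rows_count", value=1250)


def load(ti):
    """Consumer: pull that value by producer task id and key."""
    rows = ti.xcom_pull(task_ids="extract", key="rows_count")
    print(f"extract reported {rows} rows")
    return rows
```

Naming the key explicitly keeps the consumer from depending on the producer's default `return_value` entry.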

The TaskFlow API drastically reduces boilerplate code while maintaining identical underlying database operations.

3. Custom XCom Backends: Breaking the Database Constraint

| Issue | Consequence |
|-------|--------------|
| DB becomes bottleneck | Many large XComs slow down scheduler |
| Not designed for streaming | Only final values, not incremental |
| No automatic cleanup (unless configured) | XCom rows accumulate |
| Cross-DAG XCom is fragile | Requires manual conf passing |

XCom data accumulates rapidly, leading to performance bottlenecks. Implement a maintenance DAG that runs weekly to purge expired or non-essential XCom rows directly from the metadata database using the SecretKeeper pattern or standard SQLAlchemy cleanup tasks.
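One way the SQLAlchemy variant of such a cleanup task could be written is sketched below, assuming Airflow 2.x, where tasks can open a session on the metadata database. The function names and the 30-day retention default are hypothetical. The Airflow imports are placed inside the function so the cutoff logic can be exercised without an Airflow installation.

```python
from datetime import datetime, timedelta, timezone


def xcom_cutoff(now, retention_days=30):
    """Return the timestamp before which XCom rows count as expired."""
    return now - timedelta(days=retention_days)


def purge_old_xcoms(retention_days=30):
    """Delete XCom rows older than the cutoff; return the number removed."""
    # Imported lazily: these modules are only available inside Airflow 2.x.
    from airflow.models import XCom
    from airflow.utils.session import create_session

    cutoff = xcom_cutoff(datetime.now(timezone.utc), retention_days)
    with create_session() as session:  # commits when the block exits cleanly
        deleted = (
            session.query(XCom)
            .filter(XCom.timestamp < cutoff)
            .delete(synchronize_session=False)
        )
    return deleted
```

`purge_old_xcoms` could be wrapped in a `@task` inside a weekly DAG. On Airflow 2.3 and later, the built-in `airflow db clean` command with `--tables xcom` may be a simpler alternative.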

```python
from datetime import datetime

from airflow.decorators import dag, task


@dag(start_date=datetime(2026, 1, 1), schedule=None, catchup=False)
def exclusive_xcom_dag():
    @task
    def producer_task():
        # This data is exclusive to the consumer_task below
        return {"special_key": "exclusive_data_123"}

    @task
    def consumer_task(data):
        print(f"Received: {data['special_key']}")

    # The flow makes the dependency explicit and exclusive
    producer_data = producer_task()
    consumer_task(producer_data)


exclusive_xcom_dag()
```