This page describes the fundamental concepts you need to understand to effectively use the Matia platform.

Source

A source is any system from which you extract data, such as a database (PostgreSQL, MongoDB, etc.), a SaaS application (Salesforce, Google Analytics, Intercom, etc.), file storage (Amazon S3, Google Drive, SFTP, etc.), or, in a Reverse ETL integration, a data warehouse (Snowflake, Databricks, Google BigQuery, etc.). Matia supports a wide variety of sources, making it easy to centralize data from across your organization.

Destination

A destination is where Matia loads your data, including data warehouses, databases, or operational tools. Destinations can be targets for both ETL and Reverse ETL integrations. Matia ensures data lands where your teams need it, in the right format and on schedule.

Integration

An integration is a pipeline that moves data from a source to a destination. An integration is either ETL or Reverse ETL. Each integration has its own sync schedule, schema configuration, notification settings, and optional Post Run Actions.

Sync

A sync is a single run of an integration, during which Matia transfers data from the source to the destination according to the integration’s configuration.

Integration Triggers

Integration Triggers define how and when a Matia integration sync is initiated. You can choose to run integrations manually, on a fixed interval (with a customizable base time), or by specifying a cron expression for advanced scheduling flexibility. This allows you to align sync timing with business needs, whether you want predictable intervals or complex, custom schedules.
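To make the fixed-interval option concrete, here is a minimal sketch of how a base time and an interval could determine the next run. It is not Matia's scheduler: the function name `next_run` and the assumption that runs fall at the base time plus whole multiples of the interval are illustrative only.

```python
from datetime import datetime, timedelta


def next_run(base_time: datetime, interval: timedelta, now: datetime) -> datetime:
    """Return the first scheduled time strictly after `now`.

    Assumption: runs occur at base_time, base_time + interval,
    base_time + 2 * interval, and so on.
    """
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    if now < base_time:
        return base_time
    # Number of whole intervals that have already elapsed since the base time.
    elapsed_intervals = (now - base_time) // interval
    return base_time + (elapsed_intervals + 1) * interval


# Example: base time 02:00, one run every 6 hours, current time 09:30.
base = datetime(2024, 1, 1, 2, 0)
every_six_hours = timedelta(hours=6)
print(next_run(base, every_six_hours, datetime(2024, 1, 1, 9, 30)))
```

With these inputs the runs fall at 02:00, 08:00, 14:00, and 20:00, so the next run after 09:30 is 14:00 on the same day.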

Matia also supports event-based triggers, such as launching a sync whenever a specific dbt Cloud job finishes successfully. This lets you orchestrate data transformation and data movement together, so that syncs run at the right moment, whether on a schedule or in response to upstream events.

Integration Schema

An integration schema is the part of an integration’s configuration that defines which data is included in each sync. It contains the streams (tables), their fields (columns), the relationships between streams, whether specific fields need to be hashed, and the sync mode of each stream. Matia lets you enable or disable individual streams and fields so you move only the data you need.
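The pieces of a schema described above can be pictured as a small data structure. The sketch below is illustrative only: the class names `FieldConfig` and `StreamConfig`, the function `apply_schema`, and the choice of SHA-256 for hashing are assumptions, not Matia's actual configuration format or hashing method.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FieldConfig:
    name: str
    enabled: bool = True
    hashed: bool = False  # hash the value before loading it


@dataclass
class StreamConfig:
    name: str
    sync_mode: str
    enabled: bool = True
    fields: list = field(default_factory=list)  # list of FieldConfig


def apply_schema(stream: StreamConfig, record: dict) -> Optional[dict]:
    """Keep only enabled fields and hash the ones flagged as hashed."""
    if not stream.enabled:
        return None  # a disabled stream contributes no records
    out = {}
    for f in stream.fields:
        if not f.enabled or f.name not in record:
            continue
        value = record[f.name]
        if f.hashed:
            # SHA-256 is an assumption made for this example.
            value = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
        out[f.name] = value
    return out


users = StreamConfig(
    name="users",
    sync_mode="incremental",
    fields=[
        FieldConfig("id"),
        FieldConfig("email", hashed=True),
        FieldConfig("ssn", enabled=False),
    ],
)
print(apply_schema(users, {"id": 1, "email": "ada@example.com", "ssn": "000-00-0000"}))
```

In this example the `ssn` field is dropped because it is disabled, and `email` is replaced by a 64-character hexadecimal digest.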

Sync Mode

A sync mode determines how Matia moves data for each stream within an ETL integration. Sync modes define the strategy for extracting and loading data—such as Full Refresh (replacing all data each sync), Incremental (syncing only new or changed records), or Append Only (adding new records without modifying existing ones). The available sync modes depend on the specific connector and the capabilities of the underlying source and destination systems.
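As an illustration of the Incremental mode, the sketch below selects only rows that changed since the previous sync. It assumes a cursor-based approach with a hypothetical `updated_at` field; the actual mechanism depends on the connector and may differ.

```python
def incremental_extract(rows, cursor_field, last_cursor):
    """Return the rows changed since `last_cursor`, plus the new cursor value.

    Assumption: each row carries a comparable cursor value (for example an
    ISO 8601 timestamp string) that increases whenever the row changes.
    """
    changed = [
        row for row in rows
        if last_cursor is None or row[cursor_field] > last_cursor
    ]
    # If nothing changed, keep the previous cursor.
    new_cursor = max((row[cursor_field] for row in changed), default=last_cursor)
    return changed, new_cursor


source_rows = [
    {"id": 1, "updated_at": "2024-01-01T00:00:00"},
    {"id": 2, "updated_at": "2024-01-03T00:00:00"},
    {"id": 3, "updated_at": "2024-01-05T00:00:00"},
]
changed, cursor = incremental_extract(source_rows, "updated_at", "2024-01-02T00:00:00")
print([row["id"] for row in changed], cursor)
```

A Full Refresh, by contrast, would ignore the cursor and reload every row on each sync.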

For Reverse ETL integrations, sync modes are defined at the integration level rather than per stream. Common Reverse ETL sync modes include Upsert (insert new records and update existing ones), Update Only (modify existing records without adding new ones), Append Only (add new records without updates), and Full Refresh (replace all destination data each sync). Selecting the right sync mode ensures data is moved efficiently and accurately, tailored to your business needs and the requirements of each integration.
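The difference between the four Reverse ETL sync modes is easiest to see on a small keyed dataset. The following is a simplified sketch, not Matia's implementation: the function `apply_sync_mode`, the mode strings, and the use of an `id` key to match records are assumptions made for illustration.

```python
MODES = {"upsert", "update_only", "append_only", "full_refresh"}


def apply_sync_mode(destination, records, mode, key="id"):
    """Return the destination state after syncing `records` in the given mode.

    `destination` maps a key value to the record stored under that key.
    """
    if mode not in MODES:
        raise ValueError(f"unknown sync mode: {mode}")
    incoming = {record[key]: record for record in records}
    if mode == "full_refresh":
        return incoming  # replace all destination data
    result = dict(destination)
    for record_key, record in incoming.items():
        exists = record_key in result
        if mode == "upsert":
            result[record_key] = record  # insert new, update existing
        elif mode == "update_only" and exists:
            result[record_key] = record  # never adds new records
        elif mode == "append_only" and not exists:
            result[record_key] = record  # never modifies existing records
    return result


destination = {1: {"id": 1, "plan": "free"}}
records = [{"id": 1, "plan": "pro"}, {"id": 2, "plan": "free"}]
for mode in sorted(MODES):
    print(mode, apply_sync_mode(destination, records, mode))
```

Here Upsert and Full Refresh produce the same result only because the incoming records cover every existing key; a Full Refresh would also remove destination records that are absent from the incoming data.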

Integration Schema Change

Integration schema changes refer to any modifications in the structure of your data as defined in an integration’s schema. This can include adding or removing streams (tables), changing field (column) names or data types, or updating the relationships between streams. Matia automatically detects and surfaces these changes, helping you stay informed and ensuring your integrations remain accurate and reliable as your source data evolves.

Matia gives you control over how each integration responds to schema changes. You can choose to automatically enable all schema changes—adding data from new schemas, tables, and columns as they appear. Alternatively, you can opt to only allow data from new columns while ignoring new schemas and tables, or you can choose to ignore all schema changes entirely. This flexibility ensures your data pipelines stay resilient and aligned with your organization’s data governance preferences.
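The three options above amount to a simple decision rule. The sketch below expresses it in code; the enum `SchemaChangePolicy`, its member names, and the function `should_sync_new_object` are hypothetical names chosen for this example and do not correspond to Matia settings.

```python
from enum import Enum


class SchemaChangePolicy(Enum):
    ALLOW_ALL = "allow_all"          # new schemas, tables, and columns
    ALLOW_COLUMNS = "allow_columns"  # new columns only
    IGNORE_ALL = "ignore_all"        # no schema changes


def should_sync_new_object(policy: SchemaChangePolicy, kind: str) -> bool:
    """Decide whether a newly detected object is added to the integration.

    `kind` is one of "schema", "table", or "column".
    """
    if kind not in {"schema", "table", "column"}:
        raise ValueError(f"unknown object kind: {kind}")
    if policy is SchemaChangePolicy.ALLOW_ALL:
        return True
    if policy is SchemaChangePolicy.ALLOW_COLUMNS:
        return kind == "column"
    return False


for policy in SchemaChangePolicy:
    decisions = {kind: should_sync_new_object(policy, kind) for kind in ("schema", "table", "column")}
    print(policy.value, decisions)
```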

ETL Emitted Records and Committed Records

Within an integration sync run, Emitted Records are the total number of records Matia detects and processes from a source stream for transfer, while Committed Records are those actually written and confirmed in the destination stream. The distinction shows both what Matia prepared to move and what ultimately arrived at the destination. Because Matia reports these metrics per stream, you can quickly spot discrepancies between the two counts and identify which stream to investigate.
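As a small worked example of comparing the two metrics, the sketch below flags streams whose committed count differs from the emitted count. The function name and the input shape (a mapping from stream name to a pair of counts) are assumptions; how you obtain the counts from Matia is not shown.

```python
def find_discrepancies(stream_counts):
    """Return {stream: emitted - committed} for streams where the counts differ.

    `stream_counts` maps a stream name to an (emitted, committed) pair.
    """
    return {
        name: emitted - committed
        for name, (emitted, committed) in stream_counts.items()
        if emitted != committed
    }


counts = {
    "orders": (1000, 1000),
    "users": (250, 245),
}
print(find_discrepancies(counts))
```

Only `users` is reported here, with a gap of 5 records between what was emitted and what was committed.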

Historical Re-Sync

A Historical Re-Sync is the process of reloading all historical data from your source into the destination for a given integration or stream. This operation is typically used to recover from data integrity issues, correct past errors, or ensure the destination reflects the current state of the source when incremental syncs alone are insufficient. During a historical re-sync, Matia extracts and processes the entire dataset—overwriting or replacing existing records in the destination as needed—according to your integration’s configuration.

Historical re-syncs can be triggered at the integration or stream level, depending on your needs and the connector’s capabilities. While essential for maintaining data accuracy, historical re-syncs may take longer to complete than incremental syncs, especially for large datasets, and can temporarily pause regular incremental updates until the process is finished.

Post Run Actions

Post Run Actions are automated tasks that Matia executes after each integration sync run, allowing you to seamlessly trigger downstream workflows. For example, you can configure Matia to run a dbt Cloud job after every sync, ensuring your data transformations always operate on the latest data. This automation reduces manual effort, maintains data freshness, and helps keep your analytics and business processes in sync with your data pipelines.

Reverse ETL Models

Reverse ETL models define the structured datasets prepared for activation in operational tools. These models specify which warehouse data to extract, how to transform it, and what business logic to apply before syncing to destinations like CRMs or ad platforms. Matia models support granular control, enabling you to filter, aggregate, or enrich data to meet destination requirements.

Reverse ETL Record Statuses

During a Reverse ETL sync, Matia classifies each processed record to provide transparency and actionable insight into your data delivery. Successful Records are those that Matia pushes to the destination without error, confirming that your data has landed as intended. Rejected Records are records that the destination system refuses—often due to issues like data formatting, missing fields, or invalid associations—and for each rejected record, Matia presents the actual API request and response, giving you full visibility into what was sent and why it failed. Invalid Records are identified by Matia as unsuitable for syncing before they ever reach the destination, typically due to validation failures or transformation errors within Matia itself.

By surfacing these statuses and providing detailed context at the record level, Matia empowers you to quickly troubleshoot issues, maintain high data quality, and ensure operational systems receive only clean, actionable data.
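The three statuses correspond to two checkpoints: validation before sending, and the destination's response after sending. The sketch below illustrates that flow under stated assumptions: the validation rule, the fake destination, and the use of HTTP-style status codes are all hypothetical and stand in for whatever the real destination API returns.

```python
from collections import Counter


def classify_record(record, validate, send):
    """Return "invalid", "rejected", or "successful" for one record."""
    if not validate(record):
        return "invalid"  # failed local validation; never sent to the destination
    status_code = send(record)
    if 200 <= status_code < 300:
        return "successful"
    return "rejected"  # the destination refused the record


# Hypothetical validation rule: the record must have a non-empty email.
def has_email(record):
    return bool(record.get("email"))


# Fake destination: accepts anything that looks like an email address.
def fake_destination(record):
    return 200 if "@" in record["email"] else 422


records = [
    {"email": "ada@example.com"},
    {"email": "not-an-email"},
    {"email": ""},
]
summary = Counter(classify_record(r, has_email, fake_destination) for r in records)
print(summary)
```

In this example the first record is successful, the second is rejected by the fake destination, and the third is invalid because it fails validation before any request is made.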