
How the Matillion ETL Tool and Matillion Data Loader Work

What is Matillion?
Matillion is a cloud-native platform that provides an easy-to-use, scalable, and powerful solution for data integration, transformation, and orchestration. It allows organizations to quickly and easily move and transform data from various sources into cloud-based data warehouses, such as Amazon Redshift, Google BigQuery, and Snowflake.
Matillion claims to be an all-encompassing SaaS solution and cloud data integration platform that handles the entire data integration process, from acquisition and ingestion through to transformation. It offers a low-code approach that uses the native compute power of popular cloud data warehouses. In other words, Matillion transforms your data using the compute resources of your own cloud environment. Matillion has two main products: Matillion ETL and Matillion Data Loader. Matillion Data Loader is strictly used for moving data, whereas Matillion ETL, the flagship product, is a more robust ETL solution. Both offer a detailed level of configuration for advanced users.
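To make the idea of using the warehouse's own compute concrete, here is a minimal sketch of the load-then-transform pattern. It is not Matillion code: sqlite3 stands in for a cloud data warehouse, and the table names and the run_elt function are assumptions made up for illustration.

```python
import sqlite3


def run_elt(rows):
    """Load raw rows, then transform them with SQL inside the 'warehouse'."""
    # Assumption: an in-memory SQLite database plays the role of the warehouse.
    conn = sqlite3.connect(":memory:")

    # Load: land the raw rows untouched.
    conn.execute("CREATE TABLE raw_orders (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", rows)

    # Transform: the aggregation runs as SQL in the database engine,
    # not in the integration tool's own process.
    conn.execute(
        "CREATE TABLE customer_totals AS "
        "SELECT customer, SUM(amount) AS total "
        "FROM raw_orders GROUP BY customer"
    )

    result = conn.execute(
        "SELECT customer, total FROM customer_totals ORDER BY customer"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(run_elt([("acme", 10.0), ("acme", 5.0), ("globex", 7.5)]))
```

The point of the pattern is that the integration tool only issues SQL, so the heavy lifting scales with the warehouse rather than with the tool.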
Matillion Features and Capabilities
Matillion provides a wide range of features and capabilities to support data integration, transformation, and orchestration.
Some of its key features and capabilities include:
- Cloud-Native Architecture: Matillion is designed for cloud-based data integration and transformation, providing a highly scalable and flexible platform that can easily integrate with popular cloud data warehouses, including Amazon Redshift, Google BigQuery, and Snowflake.
- Low-Code/No-Code Interface: Matillion's intuitive drag-and-drop interface allows users to easily create data transformation workflows without requiring complex programming skills. This enables business users to quickly and easily create and manage ETL processes.
- Pre-Built Connectors: Matillion provides pre-built connectors for a wide range of data sources, including structured and unstructured data, databases, and cloud-based applications. This enables users to easily integrate data from various sources into their data warehouse.
- Data Transformation: Matillion provides a wide range of data transformation capabilities, including data mapping, data cleansing, data enrichment, and data aggregation. It also supports advanced data quality and validation checks, enabling users to ensure data accuracy and consistency.
- Job Scheduling: Matillion's job scheduling capabilities enable users to schedule ETL processes at specific times or intervals. This allows for automated data integration and transformation, reducing manual effort and increasing efficiency.
- Data Lineage: Matillion provides data lineage tracking, allowing users to trace the origin of data and the transformations it has undergone. This enables users to ensure data accuracy and integrity, as well as comply with data governance regulations.
- Metadata Management: Matillion's metadata management capabilities provide users with a central repository for managing metadata, allowing for easy tracking of data lineage, data quality, and other important information.
Overall, Matillion provides a powerful and flexible platform for data integration and transformation, with a wide range of features and capabilities that can help organizations to quickly and easily move and transform data in the cloud.
What is Matillion Data Loader?
Matillion Data Loader is a free solution that Matillion offers to help extract data from your source systems and load it into your target destination (i.e. your data warehouse). Since Matillion Data Loader is free, it comes with fewer features. For instance, you can’t perform any transformations. Currently, Matillion Data Loader only supports around 35 connectors and does not appear to have support for Delta Lake or Azure Synapse as a destination. However, it does support Snowflake, Amazon Redshift, and Google BigQuery. Matillion Data Loader provides a simple wizard to build data pipelines for extracting and loading your data. Matillion Data Loader doesn’t offer any extra developer tools or any automation capabilities. It’s really just a point-to-point tool for moving data.
Here's how Matillion Data Loader works:
- Data Source Selection: Matillion Data Loader allows users to select the data source they want to load data from. It provides pre-built connectors for popular data sources, such as Amazon S3, FTP, SFTP, and databases, as well as custom integrations for other data sources.
- Data Mapping: Once the data source is selected, Matillion Data Loader allows users to map the data fields to the target data warehouse schema. It provides an intuitive interface for data mapping, allowing users to easily map source data fields to target database columns.
- Data Validation and Error Handling: Matillion Data Loader provides advanced features for data validation and error handling, allowing users to ensure data accuracy and consistency. It performs data validation checks during the data loading process and automatically handles errors, ensuring that only valid data is loaded into the target system.
- Data Loading: Once the data mapping and validation are complete, Matillion Data Loader loads the data into the target data warehouse. It provides a simple and intuitive interface for data loading, allowing users to quickly and easily load data into their target system.
- Monitoring and Alerting: Matillion Data Loader provides job monitoring and alerting features, allowing users to monitor the status of data loading jobs and receive alerts if any issues arise.
Matillion Data Loader provides a simple and intuitive platform for data loading into cloud-based data warehouses, with advanced features for data validation and error handling. Its pre-built connectors and intuitive interface make it a popular choice for organizations looking to quickly and easily load data into their target system.
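The steps above (map, validate, load, report) can be sketched in a few lines. This is only an illustration of the general pattern, not how Matillion Data Loader is implemented: the field mapping, the validation rule, the contacts table, and sqlite3 as the target are all assumptions.

```python
import sqlite3

# Hypothetical mapping from source field names to target column names.
FIELD_MAP = {"Email Address": "email", "Full Name": "name"}


def map_record(record):
    """Rename source fields to target columns, dropping unmapped fields."""
    return {target: record.get(source) for source, target in FIELD_MAP.items()}


def is_valid(row):
    """A deliberately simple check: both values present and email contains '@'."""
    return bool(row["name"]) and bool(row["email"]) and "@" in row["email"]


def load(records, conn):
    """Map, validate, and load records; return (loaded, rejected) counts."""
    conn.execute("CREATE TABLE IF NOT EXISTS contacts (email TEXT, name TEXT)")
    loaded = rejected = 0
    for record in records:
        row = map_record(record)
        if is_valid(row):
            conn.execute("INSERT INTO contacts VALUES (:email, :name)", row)
            loaded += 1
        else:
            # Invalid rows are counted rather than loaded, so a monitoring
            # step could alert on a high rejection count.
            rejected += 1
    conn.commit()
    return loaded, rejected


if __name__ == "__main__":
    source = [
        {"Email Address": "ada@example.com", "Full Name": "Ada"},
        {"Email Address": "not-an-email", "Full Name": "Bob"},
    ]
    print(load(source, sqlite3.connect(":memory:")))
```

Returning the loaded and rejected counts is one simple way to feed the monitoring and alerting step described in the last bullet.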
The Problems with Matillion
Matillion has positioned itself as a SaaS solution, but in reality, Matillion is an iPaaS (integration platform as a service) solution. It supplies you with a platform to enable data integration, but it is up to you to handle all of the nitty-gritty details and set everything up. Matillion’s UI is very intuitive and user-friendly and it can be relatively simple to create various data pipelines and orchestrate ELT jobs. Although Matillion is an ELT platform, it should really be thought of more as a data orchestration platform. That is to say that Matillion enables you to coordinate the execution and monitoring of your data pipelines and workflows.
The problem with this approach is that it does not scale well. Depending on your data ecosystem, setting up these data pipelines quickly creates complex workflows. Even worse, you have to maintain those workflows and address errors when they inevitably occur. ETL and ELT solutions are meant to free up engineers' time, but if your engineers still spend that time maintaining pipelines, only now without writing code, you have to wonder how much value you are deriving from the solution.
Matillion Alternatives: Why Fivetran + dbt is a Better Solution
Whereas Matillion is an iPaaS solution that supplies you with a data integration platform to build ELT pipelines and orchestrate them, Fivetran is a fully managed SaaS platform that manages all of that for you. Fivetran currently supports over 150 different connections for various data sources and around ten different destinations like Snowflake, Azure Synapse, Google BigQuery, Amazon Redshift, Databricks, etc.
Unlike Matillion, Fivetran does not have a graphical UI that forces you to build and connect your various data pipelines and map out all of your workflows. Fivetran simply connects to your data source and you tell it where to load your data. You don’t have to worry about the entire data orchestration aspect or any of the common factors that break data pipelines like random errors, schema changes, execution order, changes in data models, etc.
Similar to Matillion, Fivetran is an ELT solution, but it extracts data from more sources and loads data to more destinations. However, aside from the fact that Fivetran handles all of the data orchestration for you, it is important to note that Fivetran does not currently have any transformation abilities. To be specific, Fivetran only handles the “E” (extract) and “L” (load) aspects of ELT. It doesn’t offer built-in transformations like Matillion.
Thanks to dbt, this is not a problem. If you are not familiar with dbt, it is a transformation tool that leverages SQL. It is extremely efficient at transforming data that is already loaded into your warehouse for analytics purposes. Strictly speaking, dbt gives you the ability to create reusable data models. Better yet, if your data models depend on one another, a change to one model flows through to the models built on top of it. If you are not using dbt today, you will likely end up building something similar internally down the line. Pairing Fivetran and dbt creates a flexible solution that is more efficient than Matillion.
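The dependency behavior described here can be illustrated with a small sketch. This is not dbt's API (in dbt, dependencies are declared with ref() inside SQL models); it only shows the underlying idea of a dependency graph, using Python's standard graphlib module. The model names and both helper functions are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical models: each key depends on the models in its set.
MODELS = {
    "stg_orders": set(),
    "stg_customers": set(),
    "orders_enriched": {"stg_orders", "stg_customers"},
    "revenue_by_customer": {"orders_enriched"},
}


def build_order(models):
    """Return an order in which every model is built after its dependencies."""
    return list(TopologicalSorter(models).static_order())


def downstream_of(changed, models):
    """Return every model that directly or indirectly depends on `changed`."""
    affected = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for model, deps in models.items():
            if current in deps and model not in affected:
                affected.add(model)
                frontier.append(model)
    return affected


if __name__ == "__main__":
    print(build_order(MODELS))
    print(sorted(downstream_of("stg_orders", MODELS)))
```

Because the graph is explicit, a change to a staging model tells you exactly which downstream models need to be rebuilt and in what order.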
Why Can’t I Use Matillion for Reverse ETL?
ELT solutions like Fivetran and Matillion read from the source and write to the warehouse. Reverse ETL solutions like Hightouch do the opposite: they read from the warehouse and write back to operational systems, such as the source applications themselves. The two processes are completely different. Matillion does read and write in both directions for some connectors, but these capabilities are limited because Matillion specializes in ELT, not Reverse ETL.
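A minimal sketch of the Reverse ETL direction, under stated assumptions: sqlite3 stands in for the warehouse, the customer_totals table and payload shape are invented for illustration, and the send callable is a placeholder for whatever client a real operational tool would provide. It is not how Hightouch or Matillion implement this.

```python
import sqlite3


def read_segment(conn, min_total):
    """Read the warehouse rows that should be synced outward."""
    cursor = conn.execute(
        "SELECT email, total FROM customer_totals "
        "WHERE total >= ? ORDER BY email",
        (min_total,),
    )
    return cursor.fetchall()


def to_payloads(rows):
    """Shape warehouse rows into records a hypothetical CRM might accept."""
    return [
        {"email": email, "properties": {"lifetime_value": total}}
        for email, total in rows
    ]


def sync(conn, send, min_total=100.0):
    """Pass each payload to `send` and return how many were sent."""
    payloads = to_payloads(read_segment(conn, min_total))
    for payload in payloads:
        send(payload)
    return len(payloads)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer_totals (email TEXT, total REAL)")
    conn.executemany(
        "INSERT INTO customer_totals VALUES (?, ?)",
        [("a@example.com", 150.0), ("b@example.com", 20.0)],
    )
    sent = []
    print(sync(conn, sent.append), sent)
```

Note that the data flows in the opposite direction from the ELT sketches: the warehouse is the input and an operational tool is the output.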
Matillion is a data integration tool that places a heavy workload on your team, and it is not the industry-standard data integration tool. A modern ELT product like Fivetran and a dedicated transformation tool like dbt provide the best basis for creating a modern data stack. If you are using ETL/ELT, Reverse ETL is the missing piece in your architecture to activate your data.
This alternative ETL approach provides the best way to create actionable insights. At the end of the day, cloud data warehouses help power business intelligence and analytics, but they do little to leverage and democratize that data for day-to-day operations to improve the overall customer experience. This is the exact reason Reverse ETL is so valuable.