American Student Assistance allocates tens of millions annually to digital experiences, with 70% of its budget dedicated to areas like educational exploration. Data is essential in helping the organization stay competitive in the higher education landscape.
The data team is fundamental to the nonprofit’s mission in the education sector, as it is responsible for capturing, measuring, and analyzing data to uncover insights and assess outcomes. The team partners with Archetype Consulting to build and maintain their data and analytics environment.
The nonprofit’s marketing team leverages data from their advertising campaigns across platforms like Instagram, Meta, Snapchat, and TikTok. The team tracks ad data across all social media platforms to gain deeper insights into customer acquisition strategies, including analyzing metrics like cost-per-click (CPC) and customer acquisition cost (CAC).
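For readers unfamiliar with these two metrics, here is a quick illustration of how they are computed. The numbers are made up for the example and are not ASA’s actual campaign figures.

```python
# Illustration of the two acquisition metrics named above, with hypothetical
# campaign numbers (not ASA's actual data).
def cost_per_click(spend: float, clicks: int) -> float:
    """Total ad spend divided by the number of clicks the ads received."""
    return spend / clicks

def customer_acquisition_cost(spend: float, customers_acquired: int) -> float:
    """Total ad spend divided by the number of customers actually acquired."""
    return spend / customers_acquired

spend = 12_500.00   # total spend for one campaign (hypothetical)
clicks = 48_000
customers = 950

print(f"CPC: ${cost_per_click(spend, clicks):.2f}")                # ~$0.26 per click
print(f"CAC: ${customer_acquisition_cost(spend, customers):.2f}")  # ~$13.16 per customer
```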
Their product team analyzes product health data for these digital programs to answer questions like:
- What activities are users doing on their platform?
- How long are they spending on these activities?
- Which tasks in the application do users complete most often?
- What’s the value of these user activities for their partners?
Challenges
For the last few years, the data team had used Fivetran to handle data ingestion and to trigger transformations in Snowflake via SQL procedures, despite rising Fivetran licensing costs. That process broke down when Fivetran stopped supporting SQL transformations. After onboarding dbt to replace that functionality, it became harder to understand where pipelines started and ended between Fivetran and dbt, and how to orchestrate them properly.
Additionally, the team struggled to deploy new solutions to production environments easily. They had five main challenges with their original setup:
- Lack of visibility into data pipelines: Data gathered from partnerships, school districts, government statistics, and digital data assets lived in isolated source files and data silos, which increased the time spent investigating where problems originated.
- Too many data tools: They had separate tools for data ingestion, transformation, warehousing, and BI, and needed yet another tool for orchestration, which was painful to coordinate across so many products. They needed a tool that could orchestrate dbt Cloud jobs and trigger Tableau data extracts once the dbt Cloud job was done.
- Fivetran was too expensive: Fivetran prices had grown to a multiple of their Snowflake costs and were set to continue rising as their data operations increased.
- Limited flexibility in Fivetran: With Fivetran, they had only one option for how data from each source was structured in the data warehouse. This created a data reconciliation challenge with the digital marketing acquisition team that was never solved, because their system presents data at a different granularity.
- Limited SDLC: There was no CI/CD strategy for production deployments. New development was done manually by cloning data and pipelines from production in Snowflake and then adjusting SQL queries to point to the right environment schema. This process required tedious quality reviews and was slow to execute.
Solution
When the data team began looking at alternative tools to manage this process, they narrowed the selection to Rivery and Matillion. The team had a good understanding of what they were looking for in their next tool:
- Seamless connection to their data sources.
- Simple to use for all data profiles, from their single data engineer to their team of data analysts.
- Highly secure to meet the requirements of an organization that manages PII data from students.
- Significantly less expensive.
They chose to try Rivery over Matillion because they preferred a native SaaS option with no overhead for infrastructure management. After the team conducted a successful POC using Facebook Ads data, Rivery became the clear choice.
Migrating from Fivetran to Rivery
Before migrating from Fivetran to Rivery, the team at Archetype Consulting sat down with the nonprofit’s data team to define their migration priorities based on an effort vs value matrix.
The first area they decided to focus on was social media ads.
Once you get four social media sources into the data warehouse, you get an idea of what the whole universe of campaign spending will look like.
– Mark Cardarelli, Senior Director at Archetype Consulting
After that, they migrated the application data that the product team relied on for decision-making. They securely replicated data from their AWS infrastructure, connecting a source PostgreSQL database managed on Amazon RDS to their target Snowflake data warehouse through an SSH tunnel running on their Amazon EC2 instance.
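As a rough illustration of that connection path, the sketch below shows how a batch of rows might travel from Postgres on RDS, through an SSH tunnel on an EC2 bastion, into a Snowflake table. Rivery manages this path itself; every hostname, credential, table, and column name here is a placeholder rather than the nonprofit’s actual configuration.

```python
# Sketch of the replication path: RDS Postgres -> SSH tunnel on EC2 -> Snowflake.
# All connection details and table/column names are placeholders.
import psycopg2
import snowflake.connector
from sshtunnel import SSHTunnelForwarder

with SSHTunnelForwarder(
    ("ec2-bastion.example.com", 22),                          # EC2 instance hosting the tunnel
    ssh_username="tunnel_user",
    ssh_pkey="/path/to/private_key.pem",
    remote_bind_address=("app-db.rds.amazonaws.com", 5432),   # RDS endpoint
) as tunnel:
    # Read a batch of application rows through the tunnel.
    pg = psycopg2.connect(
        host="127.0.0.1",
        port=tunnel.local_bind_port,
        dbname="app",
        user="replication_user",
        password="...",
    )
    with pg, pg.cursor() as cur:
        cur.execute("SELECT id, activity, duration_seconds FROM user_activity")
        rows = cur.fetchall()

# Land the batch in the target Snowflake table.
sf = snowflake.connector.connect(
    account="example_account", user="loader", password="...",
    warehouse="LOAD_WH", database="ANALYTICS", schema="RAW",
)
with sf.cursor() as cur:
    cur.executemany(
        "INSERT INTO user_activity (id, activity, duration_seconds) VALUES (%s, %s, %s)",
        rows,
    )
sf.close()
```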
The migration process involved using a source-to-target mapping file to identify the affected dbt data models, tables, and specific field names to modify. While this required adjusting some of their dbt transformation models, they found that being able to select exactly which tables they needed from the source, as opposed to working with a predefined data model generated by Fivetran, was more helpful: the resulting data model is streamlined and doesn’t include tables or logic they don’t need.
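To make the idea of a source-to-target mapping file concrete, here is a minimal sketch of how such a file could be used to flag dbt models that still reference the old Fivetran-era names. The file name, column headers, and models/ directory are assumptions for illustration, not the team’s actual artifacts.

```python
# Illustrative only: report which dbt model files still reference old
# table/column names, driven by a source-to-target mapping file.
import csv
from pathlib import Path

MAPPING_FILE = "source_to_target_mapping.csv"   # assumed columns: old_name,new_name
MODELS_DIR = Path("models")                     # standard dbt project layout

with open(MAPPING_FILE, newline="") as f:
    mapping = {row["old_name"]: row["new_name"] for row in csv.DictReader(f)}

# Flag every model file that still mentions an old identifier so it can be
# updated to the names of the tables/columns now loaded by the new pipeline.
for sql_file in MODELS_DIR.rglob("*.sql"):
    text = sql_file.read_text()
    stale = [old for old in mapping if old in text]
    if stale:
        renames = ", ".join(f"{old} -> {mapping[old]}" for old in stale)
        print(f"{sql_file}: update {renames}")
```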
The switch from Fivetran to Rivery took the team a month per business domain, which included building out new functionality they couldn’t implement earlier due to product limitations (such as orchestration).
The process they followed was to first set up new ingestion pipelines for the different sources, which took just a few hours. They then spent most of their time adjusting the transformation logic and evaluating new options, including the possibility of leveraging existing data model template kits from within Rivery.
Results
- Centralized visibility and control of pipelines: Gained control over their pipelines and ran them according to their data engineering best practices. They could now ingest only the data they wanted and control their own dimension and fact table definitions and incremental loads. This cut the time it took to validate data with the business from days to hours, since they could point to data in the source system and see it the same way in Snowflake. They could also maintain a historical table for data reconciliation when needed, rather than depending on the Fivetran black box for a history resync.
- Single tool: Orchestrated the entire process in a single platform: ingestion, triggering transformations in dbt Cloud, and finally executing Tableau extract refreshes (a minimal sketch of the equivalent API calls follows these results). Seeing the end-to-end process timing and troubleshooting it in one place was huge. It also created cost savings by avoiding unnecessary Tableau extract refresh schedules that incur additional Snowflake compute costs.
- Cut their cost in half: Their Fivetran bill was quickly adding up, and they weren’t able to justify spending that much money on a tool that handled only one part of the process. By switching to Rivery, with its pricing model that distinguishes between databases and APIs, they cut their costs in half. With Rivery, they were able to achieve more with fewer resources.
Rivery gives our data engineering team much more flexibility for all of our data sources, even non-native sources, and allows for much more collaboration and efficiency within our warehouse, and the price point allows us to scale without compromising our data strategy.
– Alan White, Senior Director of Data Analytics at American Student Assistance
- Data development life cycle management: Received specific error notifications instead of the unnecessary Fivetran warnings that had created more noise for their team to sort through. With Rivery’s built-in environments and deployment settings, the team could easily develop in dev, push to QA for testing, and then promote to production as a single package that includes all dependent data pipelines, automatically adjusted to each environment using variables (e.g., to manage schema names or credentials).
- Freed up data engineering time: For a growing but lean team, the Rivery migration made engineering much more efficient; the team now spends significantly less time preparing data and more time on dashboard development and analysis for the business to leverage.
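As referenced in the single-tool result above, here is a minimal sketch of the three steps the platform now sequences: ingestion lands the data, a dbt Cloud job rebuilds the models, and only then does a Tableau extract refresh run. Rivery handles this orchestration natively; the sketch simply shows equivalent calls against the public dbt Cloud and Tableau REST APIs, with account IDs, job IDs, tokens, and URLs as placeholders.

```python
# Illustrative orchestration: wait for a dbt Cloud job to finish, then refresh
# a Tableau extract. IDs, tokens, and server URLs below are placeholders.
import time
import requests

DBT_ACCOUNT = "12345"
DBT_JOB = "67890"
DBT_TOKEN = "dbt-cloud-token"
DBT_API = f"https://cloud.getdbt.com/api/v2/accounts/{DBT_ACCOUNT}"

TABLEAU_SERVER = "https://tableau.example.com/api/3.19"
TABLEAU_AUTH_TOKEN = "tableau-session-token"   # obtained from a prior sign-in call
SITE_ID = "site-id"
DATASOURCE_ID = "datasource-id"

# 1. (Ingestion has already landed the raw data in Snowflake.)

# 2. Trigger the dbt Cloud job that rebuilds the transformation models,
#    then poll until the run completes (status 10 = success in dbt Cloud).
run = requests.post(
    f"{DBT_API}/jobs/{DBT_JOB}/run/",
    headers={"Authorization": f"Token {DBT_TOKEN}"},
    json={"cause": "Triggered after ingestion"},
).json()["data"]

while True:
    run = requests.get(
        f"{DBT_API}/runs/{run['id']}/",
        headers={"Authorization": f"Token {DBT_TOKEN}"},
    ).json()["data"]
    if run["status"] == 10:          # success
        break
    if run["status"] in (20, 30):    # error / cancelled
        raise RuntimeError("dbt Cloud job failed")
    time.sleep(30)

# 3. Refresh the Tableau extract only after the models are rebuilt, so the
#    extract never runs against stale or half-transformed data.
requests.post(
    f"{TABLEAU_SERVER}/sites/{SITE_ID}/datasources/{DATASOURCE_ID}/refresh",
    headers={"X-Tableau-Auth": TABLEAU_AUTH_TOKEN, "Content-Type": "application/json"},
    json={},
)
```

Sequencing the extract refresh behind the dbt run, rather than on its own schedule, is what avoids the redundant Tableau refreshes that were previously incurring extra Snowflake compute.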