Ariel Pohoryles
OCT 24, 2024
5 min read
Ingest data using Rivery

We are happy to announce the launch of a key new integration made simple by Rivery: using Snowflake as a Source for any target data lake or cloud data warehouse!

This new integration enables data engineers and analysts alike to easily replicate or migrate data out of Snowflake. But why would one want to ELT/ETL data out of Snowflake in the first place?

We heard you loud and clear: with over 100 requests for this Snowflake integration, there is no doubt it will quickly bring value to many businesses.

We have already seen strong adoption of Rivery's Google BigQuery as Source and Redshift as Source integrations, confirming the rising need to replicate data from warehouses. Even though warehouses are typically used as targets, there are more and more use cases where they are treated as a data source.

The primary use cases for replicating Snowflake data:

Syncing data between multiple data warehouses

It’s not uncommon these days for organizations to use multiple data warehouses and to need to sync some datasets across them. These could be other data warehouses such as BigQuery, Databricks lakehouse, Amazon Redshift, or even just another Snowflake account. A few common scenarios for this case are:

  • Databricks is used by the centralized data platform team and Snowflake is used by specific business units: Thanks to Snowflake’s simplicity, many organizations opt to dedicate a Snowflake instance to a business unit that has fewer technical resources (e.g. an HR department). In this scenario, the centralized data platform team relies on Databricks for most of its data workloads and still needs to replicate HR data from Snowflake to Databricks to create 360-degree reports that serve more than the HR team alone.
  • BigQuery is used for AI and ML workloads while Snowflake is used for analytics workloads: Google Cloud’s strong support for AI and ML services, such as Vertex AI, has made it a good choice for data scientists running their AI and ML workflows on BigQuery. On the other hand, warehouses like Snowflake are extremely popular for analytics workflows where data needs to be modeled using a more consistent data transformation layer. This scenario requires syncing the raw/analytical data stored in Snowflake into BigQuery for AI processing.
  • External Snowflake datasets are shared with your organization and are needed in your warehouse: Snowflake makes it easy to share data externally via external data shares, which is great if all you want to do is consume the data as is. However, sometimes you need to replicate that data to your own warehouse for further integration with your own internal datasets to create deeper analysis.

Migrating from Snowflake to another data lake or data warehouse

  • Change is constant, especially in a digital-first world: Migrations are a big part of it, and sometimes there is a need to move data from Snowflake to another warehouse. These can be complex projects, but one way to speed them up is a lift-and-shift approach to your data using this new integration.

Activating Snowflake data in transactional databases or AI applications

  • Data activation or Reverse ETL is increasingly becoming a standard integration data engineers need to handle: After all, this is a very straightforward way to leverage the value of your data and close loops using data faster. Once your data is enriched in Snowflake, the next step is to push it back into an operational system such as your application’s PostgreSQL or Azure SQL database. This new integration makes this process extremely simple. Another activation increasing in popularity is syncing data into AI applications such as Amazon Bedrock or Amazon Q. In this scenario, Snowflake data is replicated into a dedicated Amazon S3 dataset, serving downstream AI applications.
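Conceptually, a reverse ETL sync like this boils down to reading enriched rows from the warehouse and upserting them into the operational database. The minimal sketch below illustrates that pattern using in-memory SQLite databases as stand-ins for Snowflake and PostgreSQL (in practice you would use the respective drivers, e.g. snowflake-connector-python and psycopg2, and a tool like Rivery handles this loop for you); the table and column names are hypothetical.

```python
import sqlite3

# Stand-ins for the real systems: "warehouse" plays the role of Snowflake,
# "appdb" the role of an operational PostgreSQL / Azure SQL database.
warehouse = sqlite3.connect(":memory:")
appdb = sqlite3.connect(":memory:")

# Hypothetical enriched table in the warehouse.
warehouse.executescript("""
    CREATE TABLE customer_scores (customer_id INTEGER, churn_risk REAL);
    INSERT INTO customer_scores VALUES (1, 0.12), (2, 0.87);
""")

# Matching table in the operational database.
appdb.execute(
    "CREATE TABLE customer_scores (customer_id INTEGER PRIMARY KEY, churn_risk REAL)"
)

# The core reverse-ETL loop: extract enriched rows, upsert them into the app DB
# so repeated syncs update existing customers instead of duplicating them.
rows = warehouse.execute(
    "SELECT customer_id, churn_risk FROM customer_scores"
).fetchall()
appdb.executemany(
    "INSERT INTO customer_scores (customer_id, churn_risk) VALUES (?, ?) "
    "ON CONFLICT(customer_id) DO UPDATE SET churn_risk = excluded.churn_risk",
    rows,
)
appdb.commit()

print(appdb.execute("SELECT COUNT(*) FROM customer_scores").fetchone()[0])  # 2
```

Running the loop again is a no-op for unchanged rows and an in-place update for changed ones, which is the behavior you want when the sync runs on a schedule.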

How does it work?

Integrating Snowflake data into any other database or data lake is no different from any other source-to-target integration in Rivery. The data pipeline is set up in just a few clicks: provide your connection credentials, select the data you want to replicate, and run it.
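As a rough mental model of those three inputs (credentials, data selection, run), you can picture a pipeline definition like the following. This is illustrative only; it is not Rivery's actual configuration format, and every name and value below is a placeholder.

```yaml
# Illustrative sketch only, not Rivery's actual configuration format.
source:
  type: snowflake
  connection:
    account: my_account          # your Snowflake account identifier
    user: my_user
    password: "****"
    warehouse: my_wh
  tables:                        # the data you want to replicate
    - ANALYTICS.PUBLIC.ORDERS
    - ANALYTICS.PUBLIC.CUSTOMERS
target:
  type: bigquery                 # or Databricks, Redshift, S3, PostgreSQL, ...
  dataset: replicated_snowflake
schedule: "0 * * * *"            # run hourly
```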

Here is a quick tour of how to do so:

If you are a data engineer or analyst using Snowflake, this could be the integration you have been waiting for! Go ahead and give it a try and don’t forget to let us know about your use case.
