Ariel Pohoryles
SEP 7, 2023

We’re excited to announce the launch of Google BigQuery as a Source! This new integration enables data engineers and analysts alike to easily replicate or migrate data out of Google BigQuery into any target data lake or cloud data warehouse.

So you might be asking yourself the following questions: Why should I extract data out of BigQuery to begin with? Isn’t BigQuery a classic target where you store data to serve other analytical applications? 

The short and simple answer is yes – BigQuery is a data warehouse. But data integration architectures have evolved and become more complex. The use cases for extracting data from Google BigQuery and loading it into other destinations are growing, and they demand more flexible data movement between systems. Let’s review the primary use cases to better understand how you can leverage this new integration.

Primary Use Cases for Google BigQuery as a Source 

Syncing data between multiple data warehouses 

Forward-thinking and fast-growing organizations are using multiple data warehouses. They need to sync datasets across data warehouses like Snowflake or lakehouses like Databricks. A few common scenarios for this case are:

  • BigQuery is used as the marketing data warehouse while the rest of the organization uses another warehouse: Google BigQuery has a native integration with Google Analytics data (as well as Google Ads), which makes it highly attractive for marketing teams. The issue is, if a marketing team uses BigQuery as its primary warehouse while the rest of the organization has standardized on Amazon Redshift or Databricks, data silos are created and seamless integration becomes imperative. To build a complete 360-degree customer analytics report and gain full visibility, some datasets still need to be constantly replicated from BigQuery to the destination warehouse. 
  • BigQuery is used for AutoML workloads while another warehouse is used for analytics workloads: Google Cloud’s strong ML services make BigQuery popular with data scientists running AutoML workflows. Meanwhile, warehouses like Snowflake are extremely popular for analytics workflows, where data needs to be modeled using a more consistent data transformation layer. This scenario requires syncing the ML-scored results back into the designated analytics warehouse. 
  • External BigQuery datasets are shared with your organization and are needed in your warehouse: BigQuery makes it easy to share data externally via Analytics Hub listings, which is great if all you want to do is consume the data as is. However, sometimes you need to replicate that data to your own warehouse for further integration with your own internal datasets to enable deeper analysis. 
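The syncing scenarios above usually come down to incremental replication: on each run, copy only the rows that changed since the last sync. A minimal sketch of that watermark logic, using plain Python dicts in place of actual warehouse tables (a real pipeline would issue SQL against BigQuery and the target warehouse instead):

```python
from datetime import datetime, timezone

def incremental_sync(source_rows, target_rows, watermark):
    """Copy source rows updated after `watermark` into the target.

    Returns the updated target rows and the new watermark to persist
    for the next run.
    """
    new_rows = [r for r in source_rows if r["updated_at"] > watermark]
    synced = target_rows + new_rows
    # Advance the watermark to the newest row we copied (or keep it as-is).
    new_watermark = max((r["updated_at"] for r in new_rows), default=watermark)
    return synced, new_watermark

# Example: only the row updated after the last watermark gets replicated.
source = [
    {"id": 1, "updated_at": datetime(2023, 9, 1, tzinfo=timezone.utc)},
    {"id": 2, "updated_at": datetime(2023, 9, 5, tzinfo=timezone.utc)},
]
target, wm = incremental_sync(
    source, [], watermark=datetime(2023, 9, 3, tzinfo=timezone.utc)
)
```

Managed integrations handle this bookkeeping (plus schema drift, retries, and backfills) for you, but the underlying idea is the same.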

Migrating from Google BigQuery to another data lake or data warehouse

Change is constant, and migrations are a big part of it. Sometimes you need to move your BigQuery data to another warehouse. These can be complex projects, but one way to speed them up is a lift-and-shift approach to your data, which this new integration makes possible.

Activating BigQuery data in transactional databases or business applications

Data activation, or Reverse ETL, is becoming a standard integration that data engineers need to manage. It’s a way to unlock the value of your data and close feedback loops faster. Once your data is enriched in BigQuery, the next step is to push it back into an operational system, such as your application’s PostgreSQL or Azure SQL database. This new integration makes that process extremely simple. 
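At its core, a Reverse ETL step writes enriched warehouse rows into an operational table, ideally as an idempotent upsert so reruns don't duplicate data. A hedged sketch of that write step, using SQLite as a stand-in for an operational database like PostgreSQL (the `customer_scores` table and column names are invented for the example):

```python
import sqlite3

# Imagine these rows were produced by an enrichment query in BigQuery:
# (customer_id, churn_score)
enriched_rows = [
    ("cust_001", 0.92),
    ("cust_002", 0.34),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_scores ("
    "customer_id TEXT PRIMARY KEY, churn_score REAL)"
)

# Upsert so a re-run of the sync overwrites scores instead of duplicating rows.
conn.executemany(
    "INSERT INTO customer_scores (customer_id, churn_score) VALUES (?, ?) "
    "ON CONFLICT(customer_id) DO UPDATE SET churn_score = excluded.churn_score",
    enriched_rows,
)
conn.commit()
```

A managed integration replaces the hand-written upsert with a configured pipeline, but the idempotent-write pattern is what keeps the operational table consistent across syncs.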

On this note, many of these use cases apply to other warehouses as well. This is why we already support extracting data out of Amazon Redshift and expect to add more “warehouses as a source” in the near future.

How it works 

Integrating Google BigQuery data into any other database or data lake is no different from any other source-to-target integration in Rivery. You can set up a data pipeline with just a few clicks. All you have to do is provide your credentials, select the data you want to replicate, and run it. 

Check out this video: 

If you are a data engineer or analyst using BigQuery, this could be the integration you have been waiting for! Go ahead and give it a try and don’t forget to let us know about your use case.

