Brandon Gubitosa
MAY 24, 2024
icon
5 min read
Don’t miss a thing!
You can unsubscribe anytime

You likely have data in Salesforce that needs further analysis in a cloud data warehouse like BigQuery.

So, you begin researching how to load your data from Salesforce into BigQuery as fast as possible.

You don’t want to worry about manually writing scripts and figuring out funky APIs just to later have a bug emerge in your data pipelines due to a change that you had no clue occurred.

Your time as a data analyst and engineer is precious, you can’t afford to build out custom data pipelines every time you need to integrate data from one source to another. 

Throughout this post, we’ll walk you through the two ways of integrating data from Salesforce to Google BigQuery, and why an automated way of doing so is preferred.

 

What is Salesforce?

Salesforce is a cloud-based customer relationship management (CRM) platform. It’s an all-in-one tool that manages your sales, marketing, and customer service data efficiently and safely. 

As a result, it removes the requirement for multiple systems, which streamlines workflows and increases productivity. 

Salesforce’s major features include:

  • Integrated marketing automation: Salesforce has sophisticated marketing automation capabilities, such as targeted campaigns and email marketing. 
  • Collaborative workspace: With Salesforce’s scalable architecture, your staff can effortlessly collaborate with support for multiple users and queries. 
  • Data sharing capabilities: Salesforce enables the sharing of information across the organization without requiring data duplication or relocation. 
  • Advanced security measures: Salesforce’s platform prioritizes data security with features such as role-based access control and multi-factor authentication. 

With its combined features and easy usage, Salesforce has become one of the world’s top CRM platforms. 

What is BigQuery?

BigQuery is a cloud-based data warehousing platform provided by Google Cloud. It’s a fully managed service for storing large datasets, presenting you with a powerful tool for data management. 

BigQuery’s major features Include:

  • Scalable data warehousing: BigQuery handles large datasets, allowing businesses to store and analyze huge volumes of data.
  • Strong security: BigQuery prioritizes data security with fine-grained access controls and encryption.
  • Quick data sharing: BigQuery enables seamless data sharing across teams to promote effortless collaboration. 
  • Advanced analytics: BigQuery supports numerous data analytics functions, such as SQL queries, geospatial analysis, and machine learning. 

BigQuery is an excellent place to store your data; it’s efficient, safe, and hassle-free, making it a superb choice for most companies. 

Why Integrate Data From Salesforce and BigQuery?

Integrating data from Salesforce and BigQuery can significantly enhance an organization’s ability to leverage its data for improved decision-making and operational efficiency. Here are some compelling reasons to integrate these two powerful platforms:

Comprehensive Data Analysis

Salesforce contains valuable customer relationship management (CRM) data, including sales, customer interactions, support tickets, and marketing activities. BigQuery, on the other hand, is a robust data warehousing and analytics platform capable of handling large datasets and complex queries. By integrating Salesforce data into BigQuery, organizations can perform comprehensive analyses that provide a holistic view of business operations and customer behaviors.

Enhanced Reporting and Business Intelligence (BI)

Combining Salesforce data with BigQuery allows for advanced reporting and visualization using BI tools such as Looker, Tableau, or Google Data Studio. This integration helps create detailed and insightful dashboards that drive better business decisions.

Data Enrichment

Integrating Salesforce data with BigQuery enables businesses to enrich their CRM data with additional datasets, such as web analytics, financial data, or external market data. This enrichment leads to more informed and nuanced insights.

Real-time Data Access

BigQuery supports real-time data processing. By integrating Salesforce data, businesses can perform real-time analytics and reporting, leading to faster and more agile decision-making.

Improved Forecasting and Predictive Analytics

With integrated data, businesses can build more accurate predictive models. For instance, combining sales data from Salesforce with historical performance data in BigQuery can enhance sales forecasting and customer churn prediction.

Streamlined Operations

Integration simplifies data workflows and reduces the need for manual data exports and imports. Automated data pipelines can keep Salesforce data in sync with BigQuery, ensuring data consistency.

Scalability and Performance

BigQuery’s serverless architecture and ability to handle large-scale data queries complement Salesforce’s robust CRM capabilities. This integration ensures optimal performance even as data volumes grow.

Cost Efficiency

Utilizing BigQuery for data analysis can be cost-effective, particularly for organizations already using Google Cloud. BigQuery’s pay-as-you-go model allows for efficient resource utilization and cost management.

The Manual Method of Integrating Salesforce Data with BigQuery 

This manual method of integrating Salesforce to BigQuery requires manual coding. You must employ developers to create custom scripts to extract, transform, and load data into a data warehouse or central repository. 

These developers need expertise in Python, Java, and SQL. 

Although manual coding can offer flexibility and control over the data, it can be expensive, time-consuming, and need ongoing maintenance. However, using Rivery to integrate Google Sheets into Snowflake is simple and requires no coding. 

Here’s how you do it:

Integrating Salesforce to BigQuery Using Rivery 

The integration from Salesforce to BigQuery is easy with Rivery, but there are a few things you’ll need: 

  1. An API token from your Salesforce account.
  2. A Google BigQuery account. 
  3. An active Rivery account. 

You can find the API token on the setup page of your Salesforce account. Search the option ‘reset my security token’ under the quick find box and check it. Once you create a token it will be visible. 

Then, copy it and use it in the ‘Salesforce Rivery Connection’. 

If you encounter any issues fetching the tokens, find more information here

Now you have these prerequisites ready—you’re ready for the next steps: 

Step 1: Establish a Salesforce Connection 

The first step is to establish a Salesforce Connection; thankfully, this is straightforward. 

Here are the steps:

  1. Insert a name for the connection into the connection name input. 
  2. Insert the user name of a user with access to the Salesforce account. 
  3. Insert the password of a given user. 
  4. Insert the Security Token from the Salesforce account in the Token input. 
  5. Click on the Test Connection. If it passes the test, you’re ready to work with Salesforce. 

You may also connect to a sandbox environment. If so, enter your sandbox environment username and the access token. 

Also, mark the “connect to a sandbox environment” checkbox. 

Step 2: Pull Data from Salesforce 

Now you’ve created a Salesforce connection, it’s time to pull data from Salesforce. Here’s the first step:

  1. Select ‘Create New River’ from the top right of the Rivery screen. 
  2. Choose ‘Data Source to Target’ as your river type. 
  3. Find the ‘General Info’ tab. Then, name your river and give it a description.
  4. Next, navigate to the ‘Source’ tab. 
  5. You’ll then see Salesforce in the list of data sources. Select it. 

There are three methods to pull data from Salesforce: 

  1. Bulk API: This is the preferred way to pull large amounts of data from Salesforce. However, it has a limit of 10,000 batches within a 24-hour sliding window. 
  2. SOAP API/SOQL: This method is a popular option, but it’s slower than Bulk API. 
  3. Metadata: You can use a Metadata report to extract data of an entity or several entities. This method is useful when comparing API to UI naming conventions. 

How to pull data from Salesforce:

  1. Choose your River mode. You have two options: Single or multi-tables. Single will load a single entity to your target, whereas multi-tables will load multiple entries. 
  2. In the mapping tab, select the entities to load. You can click the ‘edit’ button on the right to change individual entities’ table settings. 
  3. Add any filter to act as a WHERE clause for pulling data. It’s a good idea to pull data without Salesforce filters and filter it using the Logic in Rivery. 
  4. You should include deleted rows. This will include rows deleted by the Salesforce ‘soft-delete’ mechanism. In this case, the isDeleted field’s value will be correct.
  5. Select the entity to pull data for. Ensure you click on the input to see the list of available entities in your Salesforce account. 
  6. Use the extract method to pull the data incrementally or entirely. 
  • All: You can fetch all of the data in the table using chunks
  • Incremental: Gives you an option to run over a column in a table. 
  • Incremental field: Click the input to see the list of available columns in the chosen entity.
  • Incremental type: After choosing the incremental field, select what type of increment the selected incremental field is—either date, timestamp, or running number. 
  • Time period: Choose the time range of data to pull from the chosen entity. 
  1. Select the start date: Rivery will pull data with the selected incremental field later than your chosen start date. 
  2. Select the end date: If you want to pull data until the moment the river runs, leave the end date field empty.
  3. Include end value: If you don’t check the end value, Rivery will pull those records in the next run. 

Step 3: Set Google BigQuery as a Target 

Before you set Google BigQuery as the target, ensure you’ve signed up for the Google Cloud and have a console admin user. 

Here are the steps to set Google BigQuery as the target: 

Enable BigQuery API: 

  1. Go to APIs & Services on the Google Cloud Console menu pane and click on library
  2. Search for Google Cloud Storage JSON API and click on Enable
  3. Search for BigQuery API and click on Enable. 

Grant permission to the service account: 

  1. Sign in to Google Cloud Platform Console
  2. Go to IAM & Admin
  3. Click on +add
  4. Then, add the service account created by Rivery in the new window. 
  5. Choose the BigQuery admin role and click save. 

Configure your BigQuery in Rivery:

  1. Log into Rivery. 
  2. Create a new connection for your BigQuery:
  3. Go to Connections
  4. Select New Connections
  5. From the source list, choose Google BigQuery

Enter the credentials information for your Google Platform Service Account:

  1. Connection name. 
  2. Description (optional). 
  3. Project ID (can be found on the Google Cloud Platform Home Section). 
  4. Project Number (can be found on the Google Cloud Platform Home Section). 
  5. Region (Where your default bucket and database will be created). 

Click ‘Test Connection’ at the bottom to test. If successful, you have a working connection to Rivery. 

Moving Foward 

Integrating from Salesforce to BigQuery is a must-do for your business. It will streamline your data and produce better long-term results. 

The best news? Rivery will ensure this transition is seamless. You can ETL from Salesforce to BigQuery in a few clicks, with little to no code.

Minimize the firefighting.
Maximize ROI on pipelines.

icon icon