Setting Up Azure Blob Storage as a Target
  • 22 Jun 2020

Setting Up Azure Blob Storage as your Target in Rivery

Overview

This guide shows you how to set up Azure Blob Storage as a target in Rivery. It covers the prerequisites for loading data and the configuration of the Azure resources that allow Rivery to load data into your Azure Blob Storage.

Before you use this guide, make sure you have a valid Azure account with sufficient permissions to create or update resources in that account.

Rivery needs an Azure Blob Storage container to serve as a FileZone. You can also use the FileZone container or its objects as a base for Hadoop or Spark services run by Azure HDInsight or by your other services.

Note: For up-to-date documentation of Blob Storage operations, see the official Azure Blob Storage documentation.

Creating an Azure Blob Storage account and container:

If you already have an Azure Blob Storage account and container, you may skip to the next step.

Let's create a Blob Storage account and a container within the account:

  1. Click on + Create a Resource in the Azure Portal's main menu.

    setting-up-azure-blob-storage-as-a-target_mceclip4.png

  2. In the New window, choose Storage, and click Storage account - blob, file, table, queue.

    setting-up-azure-blob-storage-as-a-target_mceclip10.png

  3. Fill in the storage account form: enter a name, then choose your Subscription and the Resource Group you created earlier.

    It is important that the storage type (Account Kind) is set to Blob storage.

    Recommended: Use the "Hot" access tier to enable faster data loads and pulls.

    setting-up-azure-blob-storage-as-a-target_mceclip12.png

After creating a storage account, you must create a container and copy the access keys.

  1. Click on All Resources in the main menu.

  2. Search for your storage account name and click on it.

  3. In the Storage Account panel menu, under Blob Service click on Blobs.

  4. Click on + Container and enter a name for the new container.

  5. Go to Access Keys in the storage account menu.

  6. Copy one of your keys and save it in a safe place. You will use this key when creating the Azure Blob Storage connection in Rivery.
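
Before pasting the names and key into other tools, it can help to check them locally. The following is a minimal sketch, not part of Rivery or the Azure SDK: the function names are hypothetical, and the patterns encode Azure's published naming rules (account names of 3 to 24 lowercase letters and digits; container names of 3 to 63 lowercase letters, digits, and single hyphens). It also assembles the standard connection string from an account name and key.

```python
import re

# Azure naming rules (function and constant names below are illustrative only).
ACCOUNT_NAME_RE = re.compile(r"[a-z0-9]{3,24}")
# Must start and end with a letter or digit; no consecutive hyphens.
CONTAINER_NAME_RE = re.compile(r"[a-z0-9](?:-?[a-z0-9])*")


def is_valid_account_name(name: str) -> bool:
    """Storage account names: 3-24 lowercase letters and digits."""
    return ACCOUNT_NAME_RE.fullmatch(name) is not None


def is_valid_container_name(name: str) -> bool:
    """Container names: 3-63 lowercase letters, digits, and single hyphens."""
    return 3 <= len(name) <= 63 and CONTAINER_NAME_RE.fullmatch(name) is not None


def build_connection_string(account_name: str, account_key: str) -> str:
    """Assemble a connection string for the public Azure cloud endpoint."""
    if not is_valid_account_name(account_name):
        raise ValueError(f"invalid storage account name: {account_name!r}")
    return (
        "DefaultEndpointsProtocol=https;"
        f"AccountName={account_name};"
        f"AccountKey={account_key};"
        "EndpointSuffix=core.windows.net"
    )
```

For example, `is_valid_container_name("rivery-file-zone")` is accepted, while a name containing uppercase letters or a double hyphen is rejected.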

Optional: Adding a SAS Token (Shared Access Signature)

You can use a SAS token in place of the account key, but only for loads that use the copy command.

  1. Log in to the Azure Portal.
  2. In the Azure services bar, select Storage accounts, or search for Storage accounts in the search bar at the top of the page.
    azure_bar
  3. From the list of storage accounts, select the one you just configured.
  4. Under Settings, click on Shared access signature.
    sas
  5. Fill in the following settings for your SAS token:
    1. Allowed services: Blob
    2. Allowed resource types: Container and Object
    3. Allowed permissions: Read, Write, and List
    4. Start and expiry date/time. Once the token expires, you must repeat this process to generate a new one.
      azure_set_sas
  6. Click on Generate SAS and connection string.
  7. Your new SAS token will appear in the SAS token field.
    Copy it in full, including the leading '?sv=' part. You will use it when creating the connection.
    azure_generated_sas_token
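
The settings above end up as query parameters inside the token (`ss` for services, `srt` for resource types, `sp` for permissions, `se` for expiry). As a rough sketch, the hypothetical helper below parses a copied token and reports anything that does not match the settings in step 5. It is an illustration using only the Python standard library, not a Rivery or Azure utility, and the token in the usage note is made up.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs


def check_sas_token(token: str, now: datetime) -> list[str]:
    """Return a list of problems found in an account SAS token (empty if none)."""
    params = parse_qs(token.lstrip("?"))

    def first(key: str) -> str:
        # parse_qs maps each key to a list of values; take the first or "".
        return params.get(key, [""])[0]

    problems = []
    if "b" not in first("ss"):
        problems.append("Blob service is not allowed")
    for flag, label in (("c", "Container"), ("o", "Object")):
        if flag not in first("srt"):
            problems.append(f"{label} resource type is not allowed")
    for flag, label in (("r", "Read"), ("w", "Write"), ("l", "List")):
        if flag not in first("sp"):
            problems.append(f"{label} permission is missing")

    expiry = first("se")
    if not expiry:
        problems.append("Expiry is missing")
    else:
        expires_at = datetime.fromisoformat(expiry.replace("Z", "+00:00"))
        if expires_at.tzinfo is None:
            # Date-only expiry values carry no offset; treat them as UTC.
            expires_at = expires_at.replace(tzinfo=timezone.utc)
        if expires_at <= now:
            problems.append("Token has expired")

    if not first("sig"):
        problems.append("Signature is missing")
    return problems
```

Calling `check_sas_token("?sv=2019-12-12&ss=b&srt=co&sp=rwl&se=2030-01-01T00:00:00Z&sig=abc123", datetime.now(timezone.utc))` returns an empty list until the expiry date passes.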

Configure your Blob Storage as File Zone in Rivery

Next, set up Azure Blob Storage for data loads in the Rivery UI.

  1. Log in to Rivery.

  2. Set your container as the default container for the file zone in Rivery:

    1. In the main menu, go to Variables.

    2. Set the value of your {azure_file_zone} variable to the name of the blob storage container you created. The value is saved automatically when you click anywhere else on the screen.

    3. If you don't have the {azure_file_zone} variable, click on + Add Variable and create it with this name, using the name of the blob storage container you created as its value.

      setting-up-azure-blob-storage-as-a-target_image3.png

  3. Create a new connection for your Azure Blob Storage:

    1. Go to Connections.

    2. Click on New Connection.

    3. From the source list, under the Storage section, choose Azure Blob Storage.
      setting-up-azure-blob-storage-as-a-target_mceclip0.png

    4. Enter the connection details, using the credentials you created in Azure:

      • Blob Storage Account Name
      • Blob Storage Account Key
      • SAS token (optional)
      • Connection Name
      • Connection Description
    5. You can test your connection by clicking Test Connection.

    6. Make sure your connection has a name, and click Save.

azure_blob_connection

  4. You can now use this connection in any river that uses Azure Blob Storage as a source or target.
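
If you want to confirm the credentials outside Rivery before saving the connection, a check along these lines may help. This is only a sketch under assumptions: it relies on the `azure-storage-blob` Python package (version 12 or later) being installed, the function names are hypothetical, and it is not how Rivery's Test Connection is implemented. The `credential` argument can be either the account key or the SAS token string.

```python
def account_url(account_name: str) -> str:
    """Blob endpoint for a storage account in the public Azure cloud."""
    return f"https://{account_name}.blob.core.windows.net"


def container_is_reachable(account_name: str, credential: str, container_name: str) -> bool:
    """Return True if the container exists and the credential can read its properties."""
    # Imported lazily so that account_url() stays usable without the SDK installed.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient(
        account_url=account_url(account_name),
        credential=credential,  # account key or SAS token string
    )
    return service.get_container_client(container_name).exists()
```

A wrong key or an expired SAS token typically surfaces as an exception from the SDK rather than a `False` return value, so wrap the call in your own error handling if you script it.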

Azure Blob Storage as a Target

In a river that uses Azure Blob Storage as a target, you configure the following inputs:

setting-up-azure-blob-storage-as-a-target_mceclip32.png

  1. Choose the connection that you created. If you have only one connection, Rivery can select it for you automatically.

  2. In Bucket Name, enter the {azure_file_zone} variable that you created, or type the name of the container you created.

  3. In File Zone Path, enter the prefix within the container where Rivery will write the data files. The {river_name}_{river_id} variables shown are replaced with the actual river name and river ID after the river is saved.

  4. File Zone Folders Period Partitions controls how Rivery splits the data under the File Zone Path. You can split by the data insertion Day, by Day/Hour, or by Day/Hour/Minute. Rivery then writes the data files from your sources into folders that correspond to the partition you chose.
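
To make the path and partition settings concrete, here is a small sketch of how a target folder could be derived from a path template and an insertion time. The folder naming (`YYYYMMDD/HH/MM`), the partition keys, and the function name are assumptions made for illustration; the exact layout Rivery produces may differ.

```python
from datetime import datetime

# Hypothetical folder formats for the three partition options.
PARTITION_FORMATS = {
    "day": "%Y%m%d",
    "day_hour": "%Y%m%d/%H",
    "day_hour_minute": "%Y%m%d/%H/%M",
}


def file_zone_folder(
    path_template: str,
    river_name: str,
    river_id: str,
    partition: str,
    inserted_at: datetime,
) -> str:
    """Fill in the river variables and append the partition folders."""
    prefix = path_template.format(river_name=river_name, river_id=river_id)
    folders = inserted_at.strftime(PARTITION_FORMATS[partition])
    return f"{prefix.rstrip('/')}/{folders}"
```

With the template `rivery/{river_name}_{river_id}`, a river named `orders` with ID `5e1a`, and the `day_hour` option, data inserted on 22 June 2020 at 14:05 would land under `rivery/orders_5e1a/20200622/14` in this sketch.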

Conclusion

This guide showed you how to:

  • Set up your Azure resources for use with Rivery
  • Define your Blob Storage container as the default file zone in Rivery
  • Create a new connection to your Azure Blob Storage