Elesh Mistry
AUG 9, 2022
5 min read

Creating a single source of truth for all of your data has always been a difficult task.

Connecting to various data sources, such as databases or applications, with all the different schemas and connection methods gave birth to a plethora of SaaS ETL and data integration tools.

All of them help you move data from one place to another. But most of them also have a hidden horror, which should fill you with as much concern as a red balloon floating out of a manhole cover. 🎈🤡

Because these solutions are SaaS, when data is moved from a source system to a target system, it needs to flow through the vendor’s servers and be stored (however temporarily) on the vendor’s cloud architecture.

This process leads to a lot of important questions about your data movement:

  1. Where is the data temporarily stored?
  2. How long is the data stored?
  3. Who has access to the data?
  4. How can you be sure it was deleted?
  5. Can you limit access to the data using role-based access control?

But even with all the assurances vendors are likely to provide, this still raises big questions about data residency, and for very sensitive use cases it is probably not acceptable.

The antidote to these security issues should have been ELT – as the name states, you extract, load and then transform… but hold on there a second… <cue scary clown music>

ELT vendors almost always stage the data in a cloud bucket before loading it into your warehouse. Even if your data is encrypted in transit and at rest, it leaves the infrastructure under your control, passes through the vendor’s architecture, and only then lands in your warehouse.

There are many technical and business reasons for using a cloud bucket as temporary storage when moving data into a cloud warehouse. It’s faster and cheaper, since you’re not wasting warehouse compute power on ingestion, but we can’t ignore the downside of staging that data in your vendor’s infrastructure.
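To make the trade-off concrete, here’s a minimal sketch of the generic “stage in a bucket, then bulk-load” pattern, written in Python. The bucket, stage, table, and credentials are all hypothetical placeholders, and this illustrates the general flow rather than any particular vendor’s internal implementation. The security question boils down to one thing: whose bucket is used in step 2.

```python
# Minimal sketch of the generic "stage in a bucket, then bulk-load" ELT pattern.
# All names (bucket, key, stage, table, credentials) are hypothetical placeholders.
import csv
import io

import boto3                    # AWS SDK for Python
import snowflake.connector      # Snowflake Python connector

# 1. Extract: pretend these rows came from a source database or application API.
rows = [("1", "alice"), ("2", "bob")]

# 2. Stage: write a CSV file into a cloud bucket.
#    If this bucket belongs to the vendor, your data has left your control;
#    if it's *your* bucket, the staged files never leave your infrastructure.
buf = io.StringIO()
csv.writer(buf).writerows(rows)
boto3.client("s3").put_object(
    Bucket="my-own-staging-bucket",           # customer-owned bucket (placeholder)
    Key="staging/users/2022-08-09.csv",
    Body=buf.getvalue().encode("utf-8"),
)

# 3. Load: bulk-copy into the warehouse from an external stage over that bucket.
#    One parallel bulk load from object storage is faster and cheaper than
#    burning warehouse compute on row-by-row inserts.
conn = snowflake.connector.connect(user="...", password="...", account="...")
conn.cursor().execute("""
    COPY INTO analytics.users
    FROM @my_own_stage/staging/users/   -- external stage pointing at the same bucket
    FILE_FORMAT = (TYPE = CSV)
""")
```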

So where does this leave you?

If only there were an ELT vendor that let you extract data from your sources and move it to your warehouse without storing it in the vendor’s cloud storage. A vendor that let you select your own cloud bucket, so your precious data never leaves the infrastructure under your own control.

On a side note, a great side effect of running your source data through your own cloud storage is that you create a data lake / staging area before the warehouse insert even completes. Imagine that: building an integration and getting a data lake on your architecture as a natural by-product.

As you might have guessed by now there is such a vendor: Rivery.

Rivery adds an extra layer of security to the extract and load process by allowing you to bring your own Custom File Zone into the journey.

Rivery is a complete SaaS ELT tool with a self-managed staging layer. It lets you extract from a multitude of sources (databases, applications, files, and events) and move that data securely to your warehouse using your own architecture. No scary clowns included, just a self-managed data lake. 🙌

For more information on how to secure your data in transit, speak to a data expert today! 

