Reverse ETL, ELT, data activation
When I hear the varying terminologies for data integration, the first thing that crops up in my mind is the scarecrow from the Wizard of Oz, “This way, that way, all the way…” It perfectly epitomizes the rapidly evolving data industry with its ever-changing terminology.
Let’s take Reverse ETL. It’s based on the idea that a System of Record (SoR) can receive updates to data that was altered somewhere else. It used to be called a data synchronization process, but that doesn’t have the heft of an acronym. Both terms were born out of the expectation that a System of Record should house the most accurate representation of its data. In reality, that’s rarely the case.
To be fair, this is a changing trend as platforms respond to demand and provide more flexibility by offering miscellaneous fields or expandable objects. Reverse ETL has gained popularity as a result. It removes the need to bounce between a platform such as Salesforce and a data warehouse such as Snowflake to get to all the data required. With Reverse ETL, it’s possible to retrieve a calculated NPS score from the warehouse and store that value on the customer record in your CRM. This makes the end user’s experience much more seamless.
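The warehouse-to-CRM sync described above can be sketched in a few lines. This is a minimal illustration, not a real connector: the column names (`customer_id`, `nps_score`) and the CRM custom field (`NPS_Score__c`) are hypothetical placeholders, assuming a Salesforce-style schema.

```python
# Minimal Reverse ETL sketch: turn warehouse query results into
# per-record CRM update payloads. Field names are hypothetical.

def rows_to_crm_updates(rows):
    """Map warehouse result rows to CRM update payloads."""
    updates = []
    for row in rows:
        updates.append({
            "record_id": row["customer_id"],
            # Write the warehouse-derived score onto the CRM record.
            "fields": {"NPS_Score__c": row["nps_score"]},
        })
    return updates

# Example: two rows pulled from the warehouse's NPS model.
warehouse_rows = [
    {"customer_id": "0015x001", "nps_score": 72},
    {"customer_id": "0015x002", "nps_score": 31},
]

payloads = rows_to_crm_updates(warehouse_rows)
# Each payload would then be sent to the CRM's REST API,
# e.g. a PATCH against the record's endpoint.
```

A real Reverse ETL tool adds the parts elided here: incremental change detection, batching, rate limiting, and retries against the destination API.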
Not too long ago, we used printouts from varying systems and made do with what we had. That’s no longer the case. As the demand for evidence-based decisions has increased, so has our reliance on systems to create and maintain the data that fuels those decisions.
This imperative, combined with the transformative and expansive capabilities of the cloud, has exploded our expectations. We’ve discovered that we need to enable Reverse ETL everywhere. What’s required isn’t the old, batched, legacy ETL tooling that once got us through the dotcom bubble, but rather an entirely new way of thinking about how data ought to flow in the enterprise. Enter data activation.
What is data activation?
If Reverse ETL is data flowing up-river (pun intended), then we can think of data activation as the swelling of the Nile, providing sustenance to an entire geographic region. Data activation is the realization that a SoR can’t possibly manage all of the context and usage of a data entity; nor can it derive potential actions for given scenarios.
Take a “Customer” for example. We all know what a customer is, but the context by which we interact with customers varies according to our role or situation. A point of sale simply needs to know the payment method (cash or credit) and an address to deliver a refrigerator. A marketing system needs to know when you last purchased or your favorite color. A support system needs to establish whether you are a happy customer or not.
All these systems have specific purposes for interacting with customers, and as such have the potential to impact the customer in a SoR kind of way. The combined contexts and perspectives of all these systems (yet another term: insights) can inform the ideal action we should take to meet our objectives and goals, such as retaining customers and reducing churn.
Suppose an event triggers a sudden dip in NPS. This is time sensitive, and outreach should happen quickly to avoid churn. The data warehouse alone can’t make that happen; CRM systems are purpose-built for this precise motion. Additionally, the CRM system could automatically ask for feedback when the NPS changes, and respond using AI or other activated technologies. The possibilities are endless.
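The dip-triggers-outreach motion is, at its core, a simple rule. Here is one hedged way to express it; the threshold of 15 points and the action names are invented for illustration, since every team defines “time sensitive” differently.

```python
# Hypothetical rule for flagging a time-sensitive NPS dip: compare the
# latest score to the previous one and flag drops past a threshold.

def nps_dip_alert(previous, current, threshold=15):
    """Return an outreach action when NPS drops by `threshold` or more."""
    drop = previous - current
    if drop >= threshold:
        # In a real CRM this would enqueue a task or fire a workflow.
        return {"action": "trigger_outreach", "drop": drop}
    return {"action": "none", "drop": drop}

alert = nps_dip_alert(previous=68, current=41)   # drop of 27 -> outreach
steady = nps_dip_alert(previous=68, current=65)  # drop of 3  -> no action
```

The point isn’t the rule itself but where it runs: once the score lives on the CRM record, the CRM’s own automation can evaluate it the moment it changes, rather than waiting for the next warehouse batch.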
This is a basic example, but it demonstrates how we rely on data in ways that we have never been so acutely aware of before. We need data to be accessible and accurate, according to our contextual realities, and for us to be effective and efficient. Data activation is the application of the composite insight(s) that is the result of blended, constantly changing contexts.
Essentially, Reverse ETL is a point-to-point gap-fill, whereas data activation democratizes data and also takes the next step of enabling action.
What we really need
For action, we need insights to be automatic. This implies (contentious term coming) real-time data availability – everywhere. It isn’t that we are performing “real-time analytics”, but rather applying analytics in real time – taking action on insights quickly, which presupposes a certain responsiveness, or integration. Businesses know that this is a competitive advantage.
The varying platforms also need to encompass our perspectives, not just our data. The only way to do that within these systems is to integrate them. This form of system integration is focused on data, with insights built-in, making the most of all of our data, across all of our contexts and perspectives. Platform developers are taking note of these demands and are making their systems more “open”.
The resultant ease of programmatic integration has created the welcome byproduct of reducing the demand and burden on legacy ETL systems and teams. This has a further knock-on effect as well: reduced overall load and dependency on the monolithic data warehouse, while enabling a broader audience to take action. Work smarter, not harder, as they say.
That’s nice, but HOW?
The simplest, most obvious solutions tend to be the ones that stick. Data activation is typically implemented using APIs. Application Programming Interfaces have been around for a very long time (in computer terms), and their varying protocols and scope have been the topic of debate for just as long. REST APIs, however, have proven to be the preferred standard for system integration in our Internet-enabled world.
REST APIs offer a consistent, yet flexible way to interact with applications securely. As such, every application that makes use of the Internet or browser will likely have already embraced APIs to some extent. Facebook, Google, and even banks are making more and more data activities available to programmers and continue to evolve. Some even offer “bulk” interfaces in addition to an already extensive library.
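To make the REST-based integration concrete, here is a sketch of building an authenticated update request against a CRM record, using only the Python standard library. The base URL, endpoint path, token, and field name are all invented for illustration; real platforms document their own URLs, auth schemes, and payload shapes.

```python
import json
import urllib.request

# Sketch of a REST-style update to a CRM record. Endpoint and token
# are hypothetical; the request is built but deliberately not sent.

def build_crm_patch(base_url, record_id, fields, token):
    """Build (but do not send) an authenticated PATCH request."""
    body = json.dumps({"fields": fields}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/records/{record_id}",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

req = build_crm_patch("https://crm.example.com/api", "0015x001",
                      {"NPS_Score__c": 72}, "fake-token")
# urllib.request.urlopen(req) would perform the call in a real sync.
```

This uniformity is the practical payoff of the standardization described above: whether the destination is a CRM, a marketing tool, or a support desk, the integration is the same verb-plus-resource pattern with different endpoints.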
Composition of APIs is not a panacea any more than a data warehouse is, but it does open the door to a much more connected and insights-driven future. This interface standardization ultimately takes much of the complexity and time out of integrating systems. Because the systems are allowed to retain their business context (rather than duplicating it in ETL pipelines or the warehouse), that context can change with predictable consequences, which makes the whole enterprise more relevant and capable.
The Future is Nigh
So here we are, in the future and all of our data is integrated. Our time is spent contemplating strategy, analyzing insights and building value as the result. Maybe Reverse ETL isn’t becoming extinct. It’s possibly part of a growing expectation that data does need to be shared and that there shouldn’t be friction between systems.
Data activation tools are on the rise. What we should be considering and investing even more in now is DataOps. Not only are we going to need to compose and orchestrate data integrations, but we also need to maintain the rest of our data ecosystem (data operations and analytics) at the same time. We are demanding a lot from our data, and we’re only getting started. A platform that is flexible enough to integrate any systems, combined with the ability to orchestrate and monitor data beyond just a data warehouse – hybrid cloud included – is now imperative for companies that want to crush the competition.