CDP: Turn Scattered Data into Powerful Strategies

What is a CDP? (For technical teams)

This article provides an overview of what a Customer Data Platform (CDP) is for technical teams. By technical teams, we primarily mean Data and Software Engineers. The article briefly describes what a CDP is, how CDPs are categorized, and the main functionalities that engineers interact with in a CDP.

A Customer Data Platform (CDP) is a software solution designed to enable a unified view of the customer throughout the organization (often called a 360º view). It provides tools and mechanisms to perform identity resolution (ID resolution) between different customer contexts. In the context of CDPs, ID resolution means identifying and connecting the varied representations of the same customer profile across different sources, systems, and databases. This unified view allows the creation of highly detailed and accurate customer segmentations. These segmentations are the main reason CDPs are useful and valuable for business teams, enabling use cases such as Customer Journey Orchestration and decision-making tools like Marketing Mix Modeling.

Types of CDP: CDPs are categorized according to different taxonomies, ranging from two to ten categories. One of the most widely used taxonomies is the one proposed by the CDP Institute, which includes four categories: Data CDP, Analytics CDP, Campaign CDP, and Delivery CDP. These categories can be represented as concentric circles, with each category extending the previous one by adding new functionalities.

  • Data CDP: Contains the core functionalities of a CDP, offering data integration with external sources, ID resolution, and data export to other systems.
  • Analytics CDP: In addition to the features of the Data CDP, it includes analytical applications for customer data, such as building segmentations and, in some cases, basic machine learning applications.
  • Campaign CDP: Adds personalized treatments for individual profiles within segmentations, such as building customer orchestration journeys.
  • Delivery CDP: Combines all the previous functionalities and adds the ability to send messages directly to customers, eliminating the need for third-party software for this task.
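
The concentric relationship between the four categories can be sketched as cumulative feature sets. This is only an illustration: the feature labels paraphrase the list above, and the names `ADDED_FEATURES` and `cumulative_features` are hypothetical, not part of any CDP product or of the CDP Institute's material.

```python
# Features each category adds on top of the previous one
# (labels paraphrase the list above; they are illustrative only).
ADDED_FEATURES = [
    ("Data CDP", {"data integration", "ID resolution", "data export"}),
    ("Analytics CDP", {"segmentation", "basic machine learning"}),
    ("Campaign CDP", {"personalized treatments", "journey orchestration"}),
    ("Delivery CDP", {"message delivery"}),
]


def cumulative_features(added=ADDED_FEATURES):
    """Return, for each category, every feature it inherits plus its own."""
    result = {}
    so_far = set()
    for category, features in added:
        so_far = so_far | features
        result[category] = set(so_far)
    return result


CATEGORY_FEATURES = cumulative_features()
```

Each category's set is a strict superset of the previous one, which is what the concentric-circle picture expresses.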

Technical Teams in the CDP Context: Although the real value of a CDP is experienced by business and marketing teams, this would not be possible without the involvement of technical teams. CDPs cannot be deployed, maintained, or properly used without the contribution of technical professionals. The contributions of these teams can be divided into three categories: importing data into the CDP, configuring the ID resolution process, and exporting data from the CDP. In summary, the technical operation of a CDP can be described as a highly specialized ETL (Extract, Transform, Load) process.

Types of Customer Data Platforms (CDP)

Importing Data into the CDP

To obtain a unified view of customers, it is necessary to import data from different services or sources into the CDP. Most CDPs on the market offer high-level native connectors for various services, which usually only require authentication credentials and can be configured through a simple graphical interface. These connectors often support several ingestion modes, such as incremental, batch, or real-time ingestion. For more complex or unusual services, additional work may be required from technical teams, such as the use of Tag Management Systems (TMS). Some CDPs provide their own TMS solutions, which may involve writing JavaScript code, for instance.
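
To make the idea of incremental ingestion concrete, here is a minimal sketch in Python. It assumes each source row carries an `updated_at` timestamp and that the importer remembers the highest timestamp it has already loaded (a "watermark"). The function name `incremental_batch` and the row layout are hypothetical; real connectors hide this logic behind their configuration screens.

```python
from datetime import datetime


def incremental_batch(rows, last_watermark):
    """Return rows changed after the watermark, plus the new watermark."""
    new_rows = [row for row in rows if row["updated_at"] > last_watermark]
    # If nothing changed, the watermark stays where it was.
    new_watermark = max(
        (row["updated_at"] for row in new_rows), default=last_watermark
    )
    return new_rows, new_watermark


source_rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "updated_at": datetime(2024, 1, 9)},
]

# First run: only rows changed after January 3 are imported.
first_run, watermark = incremental_batch(source_rows, datetime(2024, 1, 3))
# Second run with the updated watermark: nothing new to import.
second_run, watermark = incremental_batch(source_rows, watermark)
```

A batch import, by contrast, would reload all of `source_rows` on every run, which is simpler but more expensive for large sources.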

ID Resolution Configuration

The ID resolution process varies from one CDP to another but is usually configured using SQL or similar code. The task consists of preparing the imported data and setting up the workflow that unifies the different sources into a single customer profile. A key step is creating a unique ID field that identifies each customer across all the databases. This ID is typically derived from fields already present in the imported data, such as emails or government identification numbers. When stronger IDs are not available, the profile's email is often used as a starting point for ID resolution. Below is a more concrete example of ID resolution in the Treasure Data CDP.
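
Before looking at that example, the core idea can be sketched independently of any vendor. The Python below is not Treasure Data's implementation: it is a minimal illustration, under the assumption that two records belong to the same profile whenever they share a non-empty value for one of the chosen keys. The function name `resolve_identities` and the sample records are hypothetical.

```python
from collections import defaultdict


def resolve_identities(records, keys):
    """Group record indexes that share at least one non-empty key value."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[max(root_a, root_b)] = min(root_a, root_b)

    first_seen = {}  # (key name, normalized value) -> first record index
    for idx, record in enumerate(records):
        for key in keys:
            value = record.get(key)
            if not value:
                continue
            token = (key, str(value).strip().lower())
            if token in first_seen:
                union(idx, first_seen[token])
            else:
                first_seen[token] = idx

    groups = defaultdict(list)
    for idx in range(len(records)):
        groups[find(idx)].append(idx)
    return sorted(groups.values())


records = [
    {"source": "crm", "email": "Ana@example.com", "document": "123"},
    {"source": "web", "email": "ana@example.com"},
    {"source": "pos", "document": "123", "email": None},
    {"source": "web", "email": "bob@example.com"},
]
profiles = resolve_identities(records, keys=["email", "document"])
```

Here the first three records end up in one profile: the web record matches the CRM record by email, and the point-of-sale record matches it by document number, even though the web and point-of-sale records share no key directly.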

CDP: ID Resolution Configuration

The initial part of the flow defines the keys that will be used in the ID resolution process: e-mail, document number, and name. Here, you indicate the fields that are solid enough to identify the same profile across several different databases.
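
As a rough illustration of what a key definition involves, the sketch below declares the three keys and filters out values that should never be used for matching. It is written in Python with hypothetical names (`KEYS`, `invalid_values`, `usable_key_values`) and does not reproduce Treasure Data's configuration syntax; the placeholder values treated as invalid are assumptions as well.

```python
# Hypothetical key definitions: each key lists values that must be ignored,
# because placeholders like "n/a" would wrongly merge unrelated profiles.
KEYS = [
    {"name": "email", "invalid_values": {"", "n/a", "test@test.com"}},
    {"name": "document_number", "invalid_values": {"", "00000000000"}},
    {"name": "name", "invalid_values": {"", "unknown"}},
]


def usable_key_values(record, keys=KEYS):
    """Return the normalized key values of a record that are safe to match on."""
    result = {}
    for key in keys:
        raw = record.get(key["name"])
        if raw is None:
            continue
        value = str(raw).strip().lower()
        if value not in key["invalid_values"]:
            result[key["name"]] = value
    return result


sample = usable_key_values(
    {"email": " Ana@Example.com ", "document_number": "00000000000", "name": "Unknown"}
)
```

In this sample only the email survives, since the document number and the name are placeholders.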

Identification of the tables to be unified

The next part of the flow lists the tables used in the unification process, that is, the tables that register profiles or contain any profile identification records. This is also where you specify, for each table, which columns hold the keys used in the unification.
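
The mapping between table columns and unification keys can be pictured as follows. This is a sketch in Python under assumed names: the tables `crm_customers` and `web_signups`, their columns, and the helper `extract_keys` are all hypothetical and are not taken from the Treasure Data flow.

```python
# Hypothetical table registry: for each table, which column feeds which key.
TABLES = [
    {
        "table": "crm_customers",
        "key_columns": {"email_address": "email", "doc_number": "document_number"},
    },
    {
        "table": "web_signups",
        "key_columns": {"user_email": "email", "full_name": "name"},
    },
]


def extract_keys(table_name, row, tables=TABLES):
    """Translate a row's columns into unification keys, skipping empty values."""
    for spec in tables:
        if spec["table"] == table_name:
            return {
                key: row[column]
                for column, key in spec["key_columns"].items()
                if row.get(column)
            }
    raise KeyError(f"table not registered for unification: {table_name}")


signup_keys = extract_keys(
    "web_signups", {"user_email": "a@b.com", "full_name": "", "plan": "pro"}
)
```

Columns that are not mapped to a key, such as `plan`, are simply ignored by the unification step.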

Definition of the construction of the unified ID

This section defines what some CDPs call the canonical ID. This is the ID that the resolution process seeks to create or identify: the one that identifies a profile across all the databases involved. There are different strategies for choosing which field or fields make up this ID. In the example above, the canonical ID is built from three different fields.
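
One simple strategy for building an ID from three fields is to normalize them and hash the result. The Python sketch below illustrates that strategy only; it is an assumption for the sake of the example, not a description of how Treasure Data or any other CDP computes its canonical ID, and the function name `canonical_id` and the field names are hypothetical.

```python
import hashlib


def canonical_id(profile, fields=("email", "document_number", "name")):
    """Build a deterministic ID from the normalized values of the given fields."""
    parts = [str(profile.get(field) or "").strip().lower() for field in fields]
    if not any(parts):
        raise ValueError("no key value available to build a canonical ID")
    # The unit separator keeps ("ab", "c") distinct from ("a", "bc").
    joined = "\x1f".join(parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


id_crm = canonical_id(
    {"email": "Ana@Example.com ", "document_number": "123", "name": "Ana Silva"}
)
id_web = canonical_id(
    {"email": "ana@example.com", "document_number": "123", "name": "ana silva"}
)
id_other = canonical_id(
    {"email": "bob@example.com", "document_number": "456", "name": "Bob"}
)
```

Note the trade-off: a hashed composite key only matches when all three fields agree after normalization, which is stricter than merging profiles that share any single key. Which behavior is appropriate depends on how reliable each field is in the source data.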