Data Activation - frequently asked questions



This article brings together a number of frequently asked questions about one of the hottest practices in digital marketing of late. Is your question not here? Don’t hesitate to contact me!


Data Activation, what do we mean by that?

By Data Activation we mean gathering (near-)real-time, actionable data about visitors to websites and other customer-facing platforms within an organization. The data is gathered, interpreted and used to create meaningful user segments, which form the basis of personalized communication. This communication can happen on-site or via a plethora of marketing channels.

Once the systems have been set up, the data flows as follows:

  1. A visitor enters the site or another platform

  2. The owner gathers data about the visitor’s engagements

  3. A data analyst interprets this data and segments visitors accordingly

  4. The strategist, marketer and data analyst form hypotheses about how to influence visitor behavior to improve commercial results

  5. The marketer uses the segments to target specific messages

  6. The user reacts (actively or passively)

  7. The data analyst interprets that reaction data and refines the segments accordingly

  8. The process repeats

This is a cyclical process in which user engagements help build increasingly accurate visitor profiles, making communication with them more precise and relevant. Going forward, optimizing these segments will increasingly be the work of self-learning algorithms.
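The cycle above can be sketched as a loop in code. The sketch below is a deliberately simple toy model in Python, using a plain dict as the visitor profile; none of these names correspond to a real DMP API.

```python
# A minimal sketch of the data-activation cycle; all names are illustrative.

def update_profile(profile, event):
    """Steps 1-2: fold a new engagement event into the visitor profile."""
    profile.setdefault("events", []).append(event)
    profile["page_views"] = profile.get("page_views", 0) + 1
    return profile

def assign_segment(profile):
    """Steps 3 and 7: a (deliberately trivial) segmentation rule,
    re-applied as new engagement data comes in."""
    return "engaged" if profile.get("page_views", 0) >= 3 else "casual"

profile = {}
for event in ["view:home", "view:product-a", "view:product-b"]:
    update_profile(profile, event)
```

A fresh profile starts in the "casual" segment and moves to "engaged" as engagements accumulate; in a real setup the rule in `assign_segment` would itself be refined in step 7.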


What data do we need to gather in order to make this a reality?

Data is gathered on a variety of levels:

  • Product interest - which products are users interested in, or better, which product features are they interested in? For this you need to gather all relevant product features.

  • Product pairings - which products are usually purchased together, and which product features usually appeal to the same people?

  • Basket analysis - which products end up in specific users’ baskets, and what conversion rate follows?

  • RFM - recency, frequency and monetary value of each visitor

  • Psychographic - which influences is the person susceptible to?

  • Demographic / Geographic / Sociographic - based on CRM data; this is not available from the browsing data gathered in the system.
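As an illustration of the RFM level, the sketch below computes recency, frequency and monetary value per visitor from a list of orders. The order data and field names are invented for this example; in practice the orders would come from the e-commerce backend.

```python
from datetime import date

# Invented example orders for the sketch.
orders = [
    {"visitor": "v1", "date": date(2024, 3, 1), "amount": 40.0},
    {"visitor": "v1", "date": date(2024, 5, 20), "amount": 25.0},
    {"visitor": "v2", "date": date(2024, 1, 10), "amount": 120.0},
]

def rfm(orders, today):
    """Recency (days since last order), frequency (order count) and
    monetary value (total spend) per visitor."""
    out = {}
    for o in orders:
        v = out.setdefault(o["visitor"],
                           {"recency": None, "frequency": 0, "monetary": 0.0})
        v["frequency"] += 1
        v["monetary"] += o["amount"]
        days = (today - o["date"]).days
        if v["recency"] is None or days < v["recency"]:
            v["recency"] = days
    return out

scores = rfm(orders, today=date(2024, 6, 1))
```

These three numbers per visitor are then typically bucketed into score bands (e.g. 1-5) before being used for segmentation.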

Not only does the advertiser need to gather engagement data about their visitors, they also need a solid product database in which all product features are stored and shared through the data layer on the client-facing platform.


What tooling do we need to implement?

There are several tooling options with which you can realize this ideal of personalized communication with your visitors and customers. They fall into three basic categories.

First you have the audience centers that in many cases are linked to advertising platforms. Examples are DoubleClick and Facebook, which let you store your data in their systems, segment your users and build audiences, target them within their network, and expand on these audiences by targeting similar profiles in the wider network. The advantage of these systems is that implementation is relatively quick and easy, and you pay only for what you target. The obvious drawback is that you are not fully in control of your audiences: you enrich their (and with that other advertisers’) profiles, and you can only target users within those networks, not anywhere else.

Secondly you have the type of Data Management Platform (DMP) that allows for your own data collection through a tag management system on your platform. This system gathers all relevant data from a dataLayer and stores it per unique user ID. From this, profiles can be defined and managed, and connectors to DSPs and other marketing platforms allow these audiences to be used in pretty much any digital marketing environment. Examples of this type of DMP are Tealium and Relay42. Implementation of this tooling can be a hairy and lengthy process, depending on the legacy systems that need to be connected and the structure of the company; six- to twelve-month projects are no exception.

Thirdly there is a standalone type of DMP that is fed with data from other systems, such as CRM databases or data warehouses. This type of data management platform is particularly suitable for merging first-party with third-party data sets. However, third-party datasets are not commonplace in all regions; in the Netherlands, for instance, both the number and the quality of these sets are rather low.


What does an implementation project team for a system like this look like?

Generally a project like this is guided by a marketing technology agency that has experience with these implementations and a good idea of how to run them most efficiently and avoid pitfalls. In that case the team at the agency could include the following roles:

Project manager: planning, milestones, organisation

Data strategist: data architecture, furnishing the DMP

Marketer: determining goals and KPIs with the client, furnishing the marketing systems

CRO specialist: A/B tests

Frontend developer: implementation of the tool on the platform

Data scientists: data analysis, segmentation, visualisation, creation of self-learning algorithms

On the client side:

Project manager / owner: runs the project within the organisation

Marketers: determine goals, reimagine the entire marketing landscape

IT: connects all data systems that need to be connected

Front-end development: responsible for making changes on the front end, placing tags, etc.

Data analysts: data analysis, segmentation


Collaborators - not core team, but affected

Content management: creation of the website / app content

Marketing Creatives: creation of marketing material

Tech suppliers: support of the implementation and the workings of technology supplied



Client exec board sponsor: represents the project on the advertiser’s board and makes sure the various teams keep prioritizing the project over others for its duration.

Optionally, a consultant project partner: the counterpart of the board sponsor at the agency.


What does a Data Activation Project typically look like?

A data activation project consists of several stories, starting with inventory and scoping and ending with the automation of the processes. An overview:

Story 1: Scoping

Activities: Goals & KPIs, Inventory of data streams, Digital marketing strategy
Lead Time: 10-20 days
Teams: Analytics, Marketing, IT
Deliverable: Project plan

Story 2: Architecture

Activities: Optimization of data gathering, Optimization of data management, Rollout of new systems and workflow
Lead Time: 20-240 days
Teams: Analytics, Marketing, IT
Deliverable: Consistent, relevant data

Story 3: Analytics

Activities: Analysis of gathered data, Clustering of customers and prospects, Hypotheses of behaviors
Lead Time: 10 days (recurring)
Teams: Analytics, Marketing, Data Science
Deliverable: Segments & hypotheses

Story 4: Rolling out and Testing

Activities: Rolling out targeted messages, A/B testing the variations, Analysis of the results, Hypotheses for improvement
Lead Time: 5-10 days (recurring)
Teams: Analytics, Marketing, Data Science
Deliverable: A/B tests & results

Story 5: Automation

Activities: Clustering, testing and optimization, Analysis of the results, Hypotheses for improvement of the self-learning algorithm
Lead Time: 5-10 days (recurring)
Teams: Analytics, Marketing, Data Science
Deliverable: Self-optimizing algorithms


What impact can I expect on the workflow of my marketing teams?


There will be a gradual change from your current way of working to a data- and technology-centered approach. Your marketing budget will shift over time from your current partners to the connected systems. Good practice is to start small with a number of enthusiastic team members: dip your toes in the water and get your first learnings. Celebrate successes as well as failures, and build on top of them. The changes will then be gradual, and there is time to train everyone in your department in the new way of working.


What type of data analyses will I need to be able to perform?

All data is aggregated under unique visitor IDs. By clustering IDs with similar characteristics, we create segments and target groups. We can, for example, cluster IDs with similar spending behavior, similar return behavior or similar product interest.

Some of these behaviors are registered directly and are straightforward to interpret: spending, return behavior, number of products purchased, and so on. Segmentation based on this data is relatively easy, as the lines between the different behaviors are clear. In data analysis we call this deterministic: profiles either do or do not fall into the segment, with no ambiguity possible.
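A deterministic segment is just a hard rule: a profile either matches it or it doesn’t. The hypothetical “big spender” rule below, with invented thresholds and field names, illustrates this.

```python
# Deterministic segmentation: membership is a yes/no rule, no probabilities.
# Thresholds and field names are invented for this sketch.
def is_big_spender(profile):
    return profile["total_spend"] >= 500 and profile["returns"] <= 1

profiles = [
    {"id": "a", "total_spend": 800, "returns": 0},
    {"id": "b", "total_spend": 200, "returns": 0},
    {"id": "c", "total_spend": 900, "returns": 4},
]
big_spenders = [p["id"] for p in profiles if is_big_spender(p)]
```

Only profile "a" satisfies both conditions; "b" spends too little and "c" returns too much, so neither ends up in the segment.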

Other segments are less explicit, because profiles are built up from a combination of different data points. In these cases statistical models calculate the probability that a user falls within a category; this analysis is aptly called probabilistic. For example, we may not have demographic data about whether a user is male or female, yet we would still like to deduce this from their behavior. Various combinations of product views, each with a weighting attached, may determine this for us.
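A minimal version of such a weighted model could look like this. The weights below are invented for illustration, not trained on real data: product-view counts are combined into a score, which the logistic function squashes into a probability between 0 and 1.

```python
import math

# Invented weights per product category; positive pushes the estimate toward
# "male", negative toward "female". A real model would learn these from data.
WEIGHTS = {"menswear": 1.2, "womenswear": -1.5, "electronics": 0.3}

def p_male(view_counts):
    """Probability estimate from weighted product views (logistic model)."""
    score = sum(WEIGHTS.get(cat, 0.0) * n for cat, n in view_counts.items())
    return 1 / (1 + math.exp(-score))

p = p_male({"menswear": 3, "electronics": 1})  # score 3.9 -> close to 1
```

With no views at all the score is zero and the model sensibly answers 0.5, i.e. “no idea”.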

Another example is interest groups. People never fall within only one interest group, so based on their behavior we may conclude that they fall within one group with a certain likelihood and within another group with a different likelihood. Specific interactions may weigh more heavily in the calculation (e.g. more recent or more frequent interactions, or certain combinations of touchpoints). In the end, many data points together paint the picture of a normal distribution per segment. Obviously, not all normal distributions look alike. Two examples are shown below: on the left a segment with a relatively low maximum probability and great variance, and on the right the opposite, a very homogeneous group with high probability and low variance. Generally, the larger the segment, the lower the probability; the smaller the segment, the higher the probability and the lower the variance.

Two normal distributions: on the left relatively heterogeneous, on the right relatively homogeneous.

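The contrast between the two shapes can be made concrete with invented membership probabilities per profile: a broad, heterogeneous segment has a lower mean probability and a higher variance than a small, homogeneous one.

```python
from statistics import mean, pvariance

# Invented membership probabilities per profile, one list per segment.
broad_segment = [0.35, 0.55, 0.25, 0.60, 0.40, 0.30]   # large, heterogeneous
narrow_segment = [0.88, 0.91, 0.90, 0.86]              # small, homogeneous

broad_mean, broad_var = mean(broad_segment), pvariance(broad_segment)
narrow_mean, narrow_var = mean(narrow_segment), pvariance(narrow_segment)
```

The narrow segment scores a higher mean and a lower variance, mirroring the right-hand distribution in the figure above.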



Once I have created the segments, what’s next?

As so often with data analysis, once you start, more questions will pop up, taking you ever deeper into the matter. User segments also shift over time, so you will not be done once you have created your first set: this is a continuous process. The value of the segments depends on the marketing value created with them. Even low-value segments can be useful, for instance to exclude from certain communication if that makes marketing sense. Based on users’ continued interactions, more and more meaningful segments can be made.

From there, the marketing department runs A/B tests with these segments to see which creatives work better for which segment, adds that information to the profiles, and further specifies the segments. This leads to ever more targeted communication. In addition, not only proven products should be tested, but also new and different types, in order to better understand their potential with each audience.

Finally, part of these actions can be automated: clustering visitors based on the similarity of traits is possible without manually defining all characteristics. This would, however, require specific segmentation systems that are not generally sold as part of a DMP.


How does Data Activation relate to….


…retargeting?

All first-party data, as recorded on your own platforms, is data about users who are already, to some extent, familiar with your brand and/or products. Targeting them therefore bears some similarity to retargeting. However, regular retargeting systems do not record the visitor’s full profile, so retargeting usually happens solely on the basis of a single page view. Data activation draws on a much richer profile and is therefore much more relevant and targeted. On top of that, most DSPs let you target so-called ‘similar audiences’ based on those profiles, reaching people who are not yet familiar with your brand but who, based on their profile within the DSP, are likely to become interested once they have been made aware.



…storytelling?

With the term ‘storytelling’ we refer to the practice of showing users a series of sequential messages that change and become increasingly specific based on the users’ interactions with them. For this, the marketer creates a matrix of their audiences, the products most relevant to those audiences, and the various stages in the users’ buying cycle: are they oblivious, aware, interested, ready to buy, or repeat buyers? The message can be adjusted accordingly and change every time the user does or does not proceed.
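Such a matrix can be sketched as a lookup table from (audience, buying stage) to a creative, with the stage advancing when the user interacts. All audiences, stages and messages below are hypothetical.

```python
# Hypothetical storytelling matrix: which message for which audience at
# which stage of the buying cycle.
STAGES = ["oblivious", "aware", "interested", "ready", "repeat"]
MATRIX = {
    ("runners", "aware"): "Meet the new trail shoe",
    ("runners", "interested"): "See the trail shoe in action",
    ("runners", "ready"): "10% off your first pair",
}

def next_message(audience, stage, progressed):
    """Advance one stage if the user interacted, then look up the creative."""
    i = STAGES.index(stage)
    if progressed:
        i = min(i + 1, len(STAGES) - 1)
    stage = STAGES[i]
    return stage, MATRIX.get((audience, stage), "generic brand message")
```

A user in the "aware" stage who interacts moves to "interested" and gets the next creative in the sequence; cells not covered by the matrix fall back to a generic message.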

Audience segments are created in the DMP and shared with the DSP through cookie syncing, and in the DSP network the ads are served as needed. Rich media and video are particularly useful for storytelling, as more complex interactions are possible within the ad format than just a click through to the advertiser’s website.

Without a DMP, storytelling is also possible, but as with retargeting, you are not building your own data set, and thus not getting smarter about your advertising with every campaign you run, which is a waste.