It’s easy to personalize offers to website visitors a retailer can identify from past visits. Effectively identifying the other site visitors is the hard part.

Bob Gaito, CEO, 4Cite

Consistent shopper identification is the key to marketing success. When done right, identifying consumers as they shop online makes people-based marketing easy—leading to stellar ROI and thriving customer relationships.

Unfortunately, it’s not often done right. The ad industry has moved beyond the billboard mentality of the past and adopted identification techniques that enable more personalized ad placements, but those techniques still rely heavily on imprecise methods and guesswork. Beyond this, ramped-up efforts by the browser giants to thwart third-party cookies, including Apple’s new Intelligent Tracking Prevention, further compromise their accuracy. It’s become clear that in this age of personalization, relying on the ad industry for people-based marketing will not get you where you need to be.

Marketing should be based on the highest-quality data available. First and foremost, your IT shop or vendor needs to identify consumers as they interact with your brand. Only through consistent identification can shopping interests and histories be effectively captured and used to inform person-based marketing. When identification fails, retailers treat repeat shoppers like strangers. This won’t fly with today’s consumers, who expect highly personalized, right-time, right-channel shopping experiences.

The methods you use for identification should capture data directly from your website, collecting personal signals associated with each individual as they shop. This first-party data is far superior to the third-party data typically used in the ad industry. Once you have effective data collection mechanisms in place, the more individuals shop, the more these signals multiply: email addresses, physical addresses, mobile phone numbers, device IDs, customer IDs, loyalty numbers and an ever-growing, ever-changing collection of cookies picked up in browsers. These signals can be linked to a persistent, person-based identifier that acts as a repository for relevant shopping data, such as browsing activity and purchase history.
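
To make the idea of a persistent, person-based identifier concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `IdentityStore` class, the `(kind, value)` signal format, and the rule that any shared signal links two records are all hypothetical, and a real system would need stricter merge rules (shared devices, for instance, can link the wrong people).

```python
import itertools


def normalize(kind, value):
    """Trim whitespace; lowercase emails so the same address always matches."""
    value = value.strip()
    return (kind, value.lower() if kind == "email" else value)


class IdentityStore:
    """Hypothetical in-memory map from observed signals to a person ID."""

    def __init__(self):
        self._person_for_signal = {}  # (kind, value) -> person ID
        self._events = {}             # person ID -> list of shopping events
        self._ids = itertools.count(1)

    def observe(self, signals, event=None):
        """Attribute one visit to a person and optionally record an event."""
        keys = [normalize(kind, value) for kind, value in signals]
        known = sorted({self._person_for_signal[k]
                        for k in keys if k in self._person_for_signal})
        if known:
            person_id = known[0]
            # The visit ties several IDs together: fold them into the oldest.
            for other in known[1:]:
                self._merge(other, person_id)
        else:
            person_id = next(self._ids)
            self._events[person_id] = []
        for k in keys:
            self._person_for_signal[k] = person_id
        if event is not None:
            self._events[person_id].append(event)
        return person_id

    def _merge(self, source, target):
        for key, pid in list(self._person_for_signal.items()):
            if pid == source:
                self._person_for_signal[key] = target
        self._events[target].extend(self._events.pop(source))

    def history(self, person_id):
        return list(self._events[person_id])


store = IdentityStore()
first = store.observe([("cookie", "c-123")], event="viewed boots")
second = store.observe([("email", "Pat@Example.com")], event="opened email")
# A later visit carries both signals, so the two records become one person.
third = store.observe([("cookie", "c-123"), ("email", "pat@example.com")],
                      event="bought boots")
```

In this toy run the cookie and the email address start out as two separate people and are folded into one record once a single visit carries both signals, which is the sense in which signals multiply around one identifier.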

Making captured data elements actionable and meaningful often takes multiple steps. For example, to comply with Google policies, when you identify a web visitor who lands on your site by clicking through an email, you must translate the email address to a surrogate ID, which your identification layer then translates back. Even with mastery of the various intricacies of data capture, your identification rate will be substantially less than 100%.
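
The surrogate ID round trip can be sketched as follows. This is one possible shape, not a statement of what any policy requires: the `SurrogateIdMap` class, the `sid` query parameter and the example URL are hypothetical names chosen for illustration, and the lookup table would live in a database rather than in memory.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit


class SurrogateIdMap:
    """Hypothetical two-way table between email addresses and opaque tokens."""

    def __init__(self):
        self._token_for_email = {}
        self._email_for_token = {}

    def token_for(self, email):
        key = email.strip().lower()
        if key not in self._token_for_email:
            # Random token: carries no information about the address itself.
            token = secrets.token_urlsafe(16)
            self._token_for_email[key] = token
            self._email_for_token[token] = key
        return self._token_for_email[key]

    def email_for(self, token):
        """Translate a token back; returns None for unknown tokens."""
        return self._email_for_token.get(token)


def build_link(base_url, token):
    """Put the surrogate ID, never the email address, in the email link."""
    return f"{base_url}?{urlencode({'sid': token})}"


def token_from_link(url):
    values = parse_qs(urlsplit(url).query).get("sid", [])
    return values[0] if values else None


ids = SurrogateIdMap()
link = build_link("https://shop.example/landing", ids.token_for("Pat@Example.com"))
# On click-through, the identification layer resolves the token server-side.
visitor_email = ids.email_for(token_from_link(link))
```

A random token is used here instead of a hash of the address so that the link reveals nothing on its own; only the party holding the table can translate it back.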

Let’s assume you are able to identify approximately 40% of your web visitors using first-party data collection. To identify the remaining 60%, you’ll need to use some of those third-party methods that aren’t as accurate. But the great thing is, with the first 40% identified effectively, methods to identify the remaining 60% will be much more accurate.

Say you want to determine the age of 1,000 different women, and you can use birth certificates to determine 40% of them. You develop a facial assessment tool to estimate age based on photographs of these women, then apply it to the remaining 60%. You can be fairly confident in your results, because your method is based on solid data and can be tested on the women whose ages you already know. Now suppose that instead of using birth certificates, you asked each woman her age. And suppose stereotypes are true, and many of them shave off a few years (or more). As a result, you develop your facial assessment tool based on less accurate data, and therefore the ages determined for the remaining 60% are less accurate.

Clearly the data you start with is important, but so are the methods themselves. In our example, the facial assessment tool would be a fairly strong method. But what if instead, the women’s ages were guesstimated based on the ages of their children? That would be stretching it. When identifying online shoppers, stretching it too far can lead to inaccuracies that reduce the impact of your messaging, or worse, result in delivery of the wrong message.
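
One way to tell whether a fallback method is stretching it is to score it against the visitors you have already identified with first-party data, just as the facial assessment tool can be tested on the women with birth certificates. The sketch below assumes a hypothetical `predict` function and made-up visit records; it only illustrates the check, not any particular identification method.

```python
def match_accuracy(predict, labeled_visits):
    """Share of labeled visits where predict(signals) returns the known ID.

    labeled_visits: iterable of (signals, true_person_id) pairs taken from
    visitors already identified with first-party data.
    """
    labeled_visits = list(labeled_visits)
    if not labeled_visits:
        raise ValueError("need at least one labeled visit")
    correct = sum(1 for signals, true_id in labeled_visits
                  if predict(signals) == true_id)
    return correct / len(labeled_visits)


# Hypothetical fallback method: guess the person from the IP address alone.
ip_guesses = {"198.51.100.7": "A", "198.51.100.8": "B", "198.51.100.9": "X"}


def predict_by_ip(signals):
    return ip_guesses.get(signals.get("ip"))


known_visits = [
    ({"ip": "198.51.100.7"}, "A"),
    ({"ip": "198.51.100.8"}, "B"),
    ({"ip": "198.51.100.9"}, "C"),  # the guess table is wrong here
    ({"ip": "198.51.100.7"}, "D"),  # shared IP: two people, one guess
]
accuracy = match_accuracy(predict_by_ip, known_visits)
```

If the known IDs themselves came from weak data, this score would be measured against the wrong answers, which is the point of the birth certificate comparison.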

For web visitors who can’t be identified using first-party data, a great option is to use a network of trusted sources that pool data, otherwise known as second-party data. It’s essential that the network use trusted sources, each of which employs strong identification techniques for first-party data. In these networks, identification signals are anonymously pooled and activated when a match is found for one of your unidentified web visitors.
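
As a rough illustration of pooling signals without exchanging them in the clear, the sketch below has each member contribute keyed hashes of its signals and looks up unidentified visitors the same way. The `SignalPool` class, the shared `NETWORK_KEY` and the `N-1` style network IDs are assumptions made for this example, not a description of how any actual network operates. Keyed hashing is also pseudonymization rather than true anonymity, so a real network would need further safeguards.

```python
import hashlib
import hmac

# Hypothetical secret shared by network members; a placeholder, not a real key.
NETWORK_KEY = b"demo-key-for-illustration-only"


def blind(kind, value):
    """Keyed hash of a signal so raw values never enter the pool."""
    message = f"{kind}:{value.strip().lower()}".encode("utf-8")
    return hmac.new(NETWORK_KEY, message, hashlib.sha256).hexdigest()


class SignalPool:
    """Hypothetical shared pool: blinded signal -> network-level person ID."""

    def __init__(self):
        self._pool = {}

    def contribute(self, network_id, signals):
        """A member adds the signals it has tied to one person."""
        for kind, value in signals:
            self._pool[blind(kind, value)] = network_id

    def match(self, signals):
        """Return the network ID for the first signal found, else None."""
        for kind, value in signals:
            hit = self._pool.get(blind(kind, value))
            if hit is not None:
                return hit
        return None

    def stored_keys(self):
        return set(self._pool)


pool = SignalPool()
# Another retailer in the network has already identified this shopper.
pool.contribute("N-1", [("device", "ABC-9"), ("email", "pat@example.com")])
# Your site sees only the device, and the pool supplies the match.
found = pool.match([("device", "abc-9")])
missing = pool.match([("device", "zzz-0")])
```

The quality of a match here is only as good as the first-party identification each member did before contributing, which is why the trustworthiness of the sources matters.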


When it comes to identifying that elusive 60%, using poor-quality data is simply not worth it. You’ll end up with an identity crisis that wastes money, lowers your marketing ROI, and possibly alienates consumers who receive the wrong message.


4Cite provides technology for personalizing email and website interactions with consumers.