Tamr Insights
AI-native MDM
Updated February 18, 2024 | Published August 4, 2023

Know Your Customer Programs and How AI Helps

Editor’s Note: This post was originally published in August 2023. We’ve updated the content to reflect the latest information and best practices so you can stay up to date with the most relevant insights on the topic.

As a financial services provider, you’ve likely invested in Know Your Customer (KYC) programs, whether for a traditional reason (risk assessment and regulatory compliance) or for a strategic growth reason (sell more or serve customers better). And they’re humming along, presumably.

Or are they?

If you don’t have clean data feeding your KYC programs, you can’t possibly have an accurate picture of your customers, one that’s trustworthy enough for making critical decisions.

Customer-related data, perhaps even more so than other data, faces an uphill battle in the “clean data wars.” It resides in many silos such as CRM systems, trading systems, and client reporting systems. And there’s the natural “drift” and disconnect that happens when so many different people create, modify, and update customer data as part of their daily jobs. 

Using traditional master data management (MDM) solutions to unify customer-related data is slow, particularly given the variety of customer-related data in different systems. Processes like data deduplication, records clustering, schema mapping, entity resolution, and data mastering at any kind of scale take too much time to execute properly because these steps still require extensive human involvement. Many processes require complex skills and intimate knowledge of the data. 

Further, with traditional MDM solutions, processes operate according to rules for data flow and logic that are developed top-down. Business people define these rules, which programmers must then interpret, code, and deploy. Whenever data changes, or the business adds new datasets, the whole cycle repeats, resulting in unacceptable latency for real-time, business-critical applications that store KYC data.

Breaking the Data Unification Logjam

As an example of this process, let’s take a look at a popular KYC application: fraud risk assessment.

A global financial institution needs to perform ongoing risk assessment on its customer list to ensure that they are all legitimate customers. The goal of these assessments is to ensure that 1) the customers are who they say they are (versus masquerading as a company with a similar-sounding name) and 2) they are not on a list of money launderers or terrorists. The institution’s customers are globally dispersed, including commercial businesses and various governments, both large and small.

To get that real-world view of its customers, the institution needs to develop a deduplicated and up-to-date list of customers. Here are the hoops the financial institution has to jump through to get there:

Data Ingestion: Ingest the data from the various siloed data sources.

Schema Mapping: Align the ingested data to a canonical schema. Wade through the various datasets for differently named fields that describe the same thing, such as “Org Name,” “Name,” or “Organization.” For example, one dataset may have multiple columns such as “Org Name,” “Alt Org Name,” and “Alias” (e.g., the company’s stock symbol), and others may have duplicate columns, which happens frequently. Resolve and map data into a unified attribute called, for example, Organization Name. Integrate this attribute into your schema and repeat for other attributes as necessary.

Mastering: At this point, the data has been mapped to a unified schema and into the desired standardized or normalized format. It’s fairly clean, but it still has duplicate rows. In the mastering phase, knowledgeable people must define which records may be duplicates and label data according to business rules to create clusters of like records.

Validation: This step involves people going through similar clusters of records and making sure they are being clustered appropriately. They add more labels and other metadata to make the records searchable and (hopefully) findable down the road.

Golden Records: Finally, data experts create a single, canonical, validated record that describes the entity, based on the institution’s business needs and desired view of the data.
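
The manual steps above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor’s implementation: the alias table, the sample rows, the 0.85 similarity threshold, and the “longest name wins” survivorship rule are all assumptions made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical alias table: source column names -> unified attribute.
COLUMN_ALIASES = {
    "org name": "organization_name",
    "name": "organization_name",
    "organization": "organization_name",
}

def map_schema(row):
    """Rename known source columns to the unified attribute (first non-empty value wins)."""
    unified = {}
    for column, value in row.items():
        target = COLUMN_ALIASES.get(column.strip().lower())
        if target and value and target not in unified:
            unified[target] = value.strip()
    return unified

def similar(a, b, threshold=0.85):
    """Crude name similarity; the threshold stands in for a business rule."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

def cluster(records):
    """Greedy clustering: join the first cluster whose first member looks similar."""
    clusters = []
    for record in records:
        for group in clusters:
            if similar(record["organization_name"], group[0]["organization_name"]):
                group.append(record)
                break
        else:
            clusters.append([record])
    return clusters

def golden_record(group):
    """Assumed survivorship rule: keep the longest (most complete) name."""
    return max(group, key=lambda r: len(r["organization_name"]))

raw_rows = [
    {"Org Name": "Acme Holdings Inc"},
    {"Name": "Acme Holdings Inc."},
    {"Organization": "Globex Corporation"},
]
mapped = [map_schema(row) for row in raw_rows]
clusters = cluster(mapped)
golden = [golden_record(group) for group in clusters]
```

With these three sample rows, the two Acme variants land in one cluster and Globex in another. Every threshold and rule in the sketch is something a person had to decide and maintain by hand, which is exactly the burden the steps above describe.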

Sounds relatively simple, right?

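Now, imagine that you have one million records, or more, and that you’re going through them in Excel spreadsheets. Because finding duplicates means comparing each record against every other record, the work grows with the square of the record count. At this point, you have an n-squared issue on your hands that you can’t possibly address in a reasonable period of time. Not so simple at all.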
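
To put a rough number on that n-squared growth: checking every record against every other one takes n(n − 1)/2 comparisons. The short calculation below shows how quickly that adds up; the one-second-per-comparison review time is purely an illustrative assumption.

```python
def pairwise_comparisons(n):
    """Number of unordered record pairs among n records: n * (n - 1) / 2."""
    return n * (n - 1) // 2

for n in (1_000, 100_000, 1_000_000):
    print(f"{n:>9,} records -> {pairwise_comparisons(n):>15,} candidate pairs")

# Assumption for illustration only: one second of human attention per pair.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
years = pairwise_comparisons(1_000_000) / SECONDS_PER_YEAR
print(f"About {years:,.0f} years of nonstop review for one million records")
```

One million records produce roughly 500 billion candidate pairs, which is why exhaustive manual comparison is not an option at this scale.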

AI-native MDM to the Rescue

By automating data unification with an AI-native approach, the financial institution described above can break the logjam in accessing and maintaining the up-to-date, real-world picture of its customers.

Tamr’s AI-native MDM uses machine learning (ML) combined with human escalation to automate about 95% of the work, invoking knowledgeable humans only to resolve outliers or cases with high levels of uncertainty. Here’s how Tamr simplifies the time-intensive activities above:

Data Ingestion: Tamr works seamlessly with cloud data lakes, data warehouses, database systems, and flat files containing your most important, cross-source customer data. In addition, Tamr can integrate directly with operational systems using an event-based integration model so that, as data changes, it is sent to Tamr in real time.

Schema Mapping: Tamr can take your defined customer data schema, import it, and create links between the columns from your input datasets and unified-attribute target columns. The system looks not just at the column name but also at any associated metadata or descriptions as well as the actual data within those columns. Tamr can thus deduce that numbers separated by dashes are probably a phone or fax number, words with an @ sign are probably email addresses, and so on. If you provide examples from your first two datasets, Tamr’s AI/ML models can use them to automate the mapping for the rest of your datasets in your KYC project.

Mastering: Here, Tamr really shines with its AI-powered, human-refined approach. With B2B and B2C customer mastering data products that have been proven over the course of engagements with globally recognized organizations, Tamr delivers golden records in days or weeks, not the months or years typical of traditional rules-based MDM solutions. Further, SMEs only need to provide a handful of responses; AI does the rest. A process that could have been excruciating and prolonged is now very easy, automated, and, most important, scalable across the KYC project.

Validation: With Tamr, as much as the AI accelerates the process, humans remain in control. Tamr provides your data team and data owners with a simple curation interface for inspecting record clusters, filtering them according to Tamr-generated confidence metrics, and reviewing and refining them to meet your accuracy requirements.

Golden Records: Tamr expedites creation of golden records to any level of granularity, from required fields to desired metrics.

Real-time APIs: Tamr enables real-time access to mastered views of customer entities. Using the “search before create” workflow, Tamr looks for potential matches within the system while the data is still in motion. If a matching record exists, Tamr updates the information. If a record does not exist, Tamr creates a new one.
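
The “search before create” idea can be illustrated with a small in-memory sketch. Everything here is hypothetical: the `CustomerIndex` class, its `upsert` method, the normalization rule, and the exact-match lookup stand in for whatever matching a real system performs. None of it is Tamr’s API.

```python
import re

class CustomerIndex:
    """Hypothetical in-memory store keyed by a normalized organization name."""

    def __init__(self):
        self._records = {}
        self._next_id = 1

    @staticmethod
    def _key(name):
        # Assumed normalization: lowercase, drop punctuation, collapse whitespace.
        cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
        return " ".join(cleaned.split())

    def upsert(self, incoming):
        """Search first; update the match if one exists, otherwise create a record."""
        key = self._key(incoming["organization_name"])
        existing = self._records.get(key)
        if existing is not None:
            existing.update(incoming)  # merge new fields into the matched record
            return existing["id"], "updated"
        record = {"id": self._next_id, **incoming}
        self._records[key] = record
        self._next_id += 1
        return record["id"], "created"

index = CustomerIndex()
first = index.upsert({"organization_name": "Acme Holdings, Inc.", "country": "US"})
second = index.upsert({"organization_name": "ACME Holdings Inc", "phone": "555-0100"})
third = index.upsert({"organization_name": "Globex Corporation"})
```

In this sketch the second call matches the first Acme record and updates it rather than creating a duplicate, while Globex gets a new record. A production system would replace the exact-key lookup with probabilistic matching, but the control flow stays the same.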

Thanks to Tamr’s AI-native approach, you can let your data speak for itself, without the additional, time-consuming and potentially confusing layer of business rules and extensive, error-prone human involvement.
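
As one concrete illustration of letting the data speak for itself, the content-based hints mentioned in the schema mapping step (dashes suggest a phone number, an @ sign suggests an email address) can be approximated with simple pattern checks. This is a toy sketch under stated assumptions, not how Tamr’s models work: the regular expressions, the `guess_attribute` helper, and the 80% agreement cutoff are all invented for the example.

```python
import re

# Assumed patterns for illustration; real data needs far more robust checks.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?\d{1,4}(?:-\d{2,4}){1,4}$"),
}

def guess_attribute(values, min_share=0.8):
    """Return the attribute whose pattern matches at least min_share of non-empty values."""
    non_empty = [v.strip() for v in values if v and v.strip()]
    if not non_empty:
        return None
    for attribute, pattern in PATTERNS.items():
        hits = sum(1 for v in non_empty if pattern.match(v))
        if hits / len(non_empty) >= min_share:
            return attribute
    return None

# Column names here are deliberately unhelpful; only the contents are inspected.
columns = {
    "contact_1": ["ann@example.com", "bob@example.org", ""],
    "contact_2": ["555-0100", "617-555-0199", "1-800-555-0123"],
    "notes": ["prefers email", "call after 5pm"],
}
guesses = {name: guess_attribute(values) for name, values in columns.items()}
```

Here `contact_1` is guessed to be an email column and `contact_2` a phone column from their contents alone, while `notes` is left unmapped. A learned model generalizes this idea by combining such content signals with column names and metadata.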

It’s About Time

KYC truly is in the eye of the beholder – and it’s different for every company. A financial institution may have very strict requirements and processes for risk assessment and apply lots of rules to address these criteria. A retail company with a Customer360 program might need a deduplicated “golden record” with detailed profiling of their customers or maybe just their highest value customers.

But the basic approach is generally the same. And common threads are time and effort.

An AI-native approach to mastering customer data can save enormous time by reducing the amount of manual effort up front and speeding up the availability of clean, current, and correctly classified data to business analysts or applications, which, in turn, improves analytical and operational outcomes.

Below are some of the results we’ve seen from applying this approach to the spectrum of data unification activities in various KYC programs:

  • One financial services firm ingested and profiled 35 large data sources with 3.7 million rows of data to produce 325,000 clusters of customer records, all in less than six months.
  • Another financial services institution gained the ability to onboard a new system from landing data to mastery in just five to seven days and to create golden records in two days. 

Whether you’re an established KYC professional or new to the concept, Tamr can support you throughout your MDM journey by helping you assess what data you have and improve it before using it for analytical, operational, or compliance purposes. Download our ebook to learn more about how Tamr can accelerate your KYC program’s success across the MDM journey.
