Tokenization is used for securing sensitive data, such as a credit card number, by exchanging it for non-sensitive data - a token. Tokenization is an excellent data security strategy that, unfortunately, only a few companies take advantage of. Perhaps its lack of adoption is because many believe tokenization is the same as encryption. I hope to dispel that myth and help you feel more acquainted with one of the best security strategies for credit card data and Payment Card Industry Data Security Standard (PCI DSS) scope reduction: tokenization.
See also: PCI Experts Say Reduce Your Scope
The whole point of tokenization is to limit the usage and storage of plain-text sensitive data to as few places in your environment as possible. According to Alex Pezold, CEO at TokenEx (a cloud-based credit card tokenization and data vaulting service), tokens can be used to replace any sensitive or non-sensitive data set (from protected health information (PHI) to automated clearing house data), but the most popular data for tokenization remains primary account number (PAN) data or credit card numbers.
Tokenization comes in two flavors: reversible and irreversible. Reversible tokens can be mapped back to one or more pieces of data. This can be accomplished using strong cryptography, where a cryptographic key rather than the original data is stored, or by using a look-up in a data vault. Irreversible tokens make it impossible for any party to recreate the original value from the token. Irreversible tokens can be either authenticatable or non-authenticatable. Authenticatable irreversible tokens are created mathematically through a one-way function and can be used to verify that a known PAN was used to generate them. Non-authenticatable irreversible tokens cannot be associated with a specific PAN.
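The two flavors can be sketched in a few lines of Python. This is an illustrative toy, not a production design: the dictionary stands in for a hardened data vault, and `SECRET_KEY` is a made-up value, not a real key-management practice.

```python
import hashlib
import hmac
import secrets

# Reversible token: a random value mapped back to the PAN in a vault.
vault = {}

def reversible_token(pan: str) -> str:
    token = secrets.token_hex(8)     # random value, no relation to the PAN
    vault[token] = pan               # only the vault can reverse the mapping
    return token

# Authenticatable irreversible token: a keyed one-way function (HMAC).
# The token cannot be reversed, but a known PAN can be verified against it.
SECRET_KEY = b"hypothetical-demo-key"  # illustrative only

def irreversible_token(pan: str) -> str:
    return hmac.new(SECRET_KEY, pan.encode(), hashlib.sha256).hexdigest()

pan = "4111111111111111"
t1 = reversible_token(pan)
t2 = irreversible_token(pan)

assert vault[t1] == pan               # reversible: vault lookup recovers the PAN
assert irreversible_token(pan) == t2  # authenticatable: same PAN, same token
```

Note the asymmetry: the reversible token is meaningless on its own and depends entirely on the vault, while the irreversible token can only ever confirm a PAN you already know.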
The best part of a correctly implemented tokenization system is that merchants never see customer credit card information. They only see tokens, which are essentially useless strings of information.
Now that you understand the basics of tokenization, let’s run through five steps that explain what happens to credit cards from the point of swipe until the payment process is completed.
See also: SecurityMetrics PCI Guide
Pezold says there are two types of token formats: format preserving and non-format preserving.
Format-preserving tokens maintain the look and feel of the original payment card data. For example:
Payment Card Number: 4111 1111 1111 1111
Format Preserving Token: 4111 8765 2345 1111
Non-format-preserving tokens don’t resemble the original data and could include both alphabetic and numeric characters. For example:
Payment Card Number: 4111 1111 1111 1111
Non-format Preserving Token: 25c92e17-80f6-415f-9d65-7395a32u0223
According to Pezold, most organizations use format preserving tokens to avoid causing validation issues with existing applications and business processes.
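Both formats can be sketched in Python. This is a toy illustration (the format-preserving scheme here simply randomizes the middle digits, one common convention; real solutions vary), not any particular vendor's method.

```python
import secrets
import uuid

def format_preserving_token(pan: str) -> str:
    # Keep the first 4 and last 4 digits and randomize the middle, so the
    # token still "looks like" a card number to downstream validation logic.
    middle = "".join(secrets.choice("0123456789") for _ in range(len(pan) - 8))
    return pan[:4] + middle + pan[-4:]

def non_format_preserving_token(pan: str) -> str:
    # A random UUID bears no resemblance to the original PAN.
    return str(uuid.uuid4())

pan = "4111111111111111"
fp = format_preserving_token(pan)
nfp = non_format_preserving_token(pan)

assert len(fp) == len(pan) and fp[:4] == "4111" and fp[-4:] == "1111"
assert len(nfp) == 36  # UUID string: 32 hex characters plus 4 hyphens
```

The format-preserving token slots into existing 16-digit card fields unchanged, which is exactly why most organizations prefer it.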
As mentioned before, the data vault is the keystone of the tokenization process. So, what happens if a merchant processes billions of transactions each year? How does that affect the data vault? To answer that, you must understand the difference between single-use tokens and multi-use tokens.
A single-use token is typically used to represent a single transaction, and processes much faster than multi-use tokens. Pezold says if you plan to use single-use tokens, expect your data vault to grow exponentially over time.
“Every time a repeat customer purchases something, a new token will be created in the vault. Because of this, single-use tokens are far more likely to cause a token collision scenario than multi-use tokens.”
A token collision scenario is when two identical tokens are generated, but actually represent two different pieces of data. (This is why validation of previously existing tokens in the token generation process is crucial.)
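The collision check described above can be sketched as a generate-and-verify loop. This is an assumed minimal design: generate a random candidate, and only accept it if no existing vault entry already uses that value.

```python
import secrets

# Toy vault with one token already issued.
vault = {"12345678": "4111111111111111"}

def generate_token(pan: str, digits: int = 8) -> str:
    # Retry until the candidate does not collide with an existing token,
    # so two different PANs can never end up sharing one token value.
    while True:
        candidate = "".join(secrets.choice("0123456789") for _ in range(digits))
        if candidate not in vault:
            vault[candidate] = pan
            return candidate

token = generate_token("5555444433332222")
assert token != "12345678"             # cannot reuse an issued token
assert vault[token] == "5555444433332222"
```

With short tokens and a vault that grows by one entry per transaction, the odds of a random candidate hitting an existing token rise steadily, which is why single-use schemes are the more collision-prone.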
A multi-use token always represents the same card number and may be used for multiple transactions. Every time a payment card is entered into a payment system, the same token is generated and used.
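The multi-use behavior amounts to a lookup before generation: if the card has been seen before, return its existing token instead of minting a new one. A minimal sketch, assuming plain dictionaries in place of a real vault:

```python
import secrets

pan_to_token = {}  # multi-use mapping: PAN -> its one token
token_to_pan = {}  # the vault itself: token -> PAN

def multi_use_token(pan: str) -> str:
    # Repeat purchases with the same card always yield the same token,
    # so the vault gains at most one entry per distinct card.
    if pan in pan_to_token:
        return pan_to_token[pan]
    token = secrets.token_hex(8)
    pan_to_token[pan] = token
    token_to_pan[token] = pan
    return token

t1 = multi_use_token("4111111111111111")
t2 = multi_use_token("4111111111111111")
assert t1 == t2                # same card, same token
assert len(token_to_pan) == 1  # no new vault entry on the repeat purchase
```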
“The two most common benefits of multi-use tokens include reducing data vault bloat and data analytics,” says Pezold. “Other benefits more specific to the payments space include recurring payment support and loyalty tracking.”
The question of whether to use single-use or multi-use tokens is dependent on 1) an organization’s need for retaining tokens and 2) plans for storage expansion.
The short answer? No. Although some tokens (reversible cryptographic tokens) are created using cryptographic functions, tokenization and encryption are very different technologies with distinct pros and cons. To understand the ongoing tokenization vs. encryption debate, you must first understand that encryption is reversible by design, while tokenization, in its strongest form, is irreversible.
Pezold explains the difference well. “Irreversible tokens have no mathematical relationship to the original data point. In other words, you cannot mathematically reverse-engineer the token value to get back to the original data point. Irreversible tokens, in our humble opinion, are the only true types of tokens.”
Encryption (P2PE, etc.), on the other hand, maintains a mathematical relationship to the original data point, which means an encryption method is only as good as its algorithm strength. If a hacker cracks the algorithm, they can decrypt every encrypted value. Not to mention the precarious security of encryption keys themselves, which are vulnerable to exposure, especially in large environments.
Says Pezold, “considering the astronomical rate by which computing power is multiplying–it’s just a matter of time before encryption mechanisms are invalidated.”
So, is tokenization more secure than encryption?
There’s no simple answer to that question because there are applications for both tokenization and encryption. Encryption is great for transmission of sensitive data. But because tokenization can’t be exploited through computer algorithms or mathematical formulas, some argue it makes a better overall data security solution, especially where payment card data is involved. Even if tokenized payment card data is stolen, it would remain secure as long as the data vault was protected.
According to the PCI DSS, “Tokenization solutions do not eliminate the need to maintain and validate PCI DSS compliance, but they may simplify a merchant’s validation efforts by reducing the number of system components for which PCI DSS requirements apply. Storing tokens instead of PANs is one alternative that can help to reduce the amount of cardholder data in the environment.”
Since credit cards aren’t available outside of the original point of capture and the data vault, risk (and therefore scope) is dramatically reduced with credit card tokenization. However, how tokenization reduces a company’s individual scope completely depends on how a company’s technology and business processes interact with payment card data.
Learn other ways to reduce PCI DSS scope.
Technically, the elements of the tokenization system (like the card vault and de-tokenization) are part of the cardholder data environment and therefore in scope for PCI requirements. But if the card vault is handled by a third party, it’s out of scope for the business taking the payment cards. All the business must do is ensure their tokenization vendor is approved through the PCI SSC, and that they protect tokenization systems and processes with strong security controls.
Pezold says the risks of tokenization are “pretty few and far between,” but names cross-domain tokenization, token commingling, and multiple tokenization solutions as three risky situations.
1. With cross-domain tokenization, businesses request the ability to tokenize data across all of their customers in a single data vault. “This scenario creates a situation where a token for one merchant can be used across all merchants in that vault–essentially making a token a credit card. For service providers with multiple merchant customers in particular, each customer must have their own data vault so a cross-domain scenario is not introduced,” says Pezold.
2. He goes on to explain the challenges behind data commingling, which essentially means an organization stores both card data and tokens. “Organizations that opt for a phased approach to tokenizing data can actually end up storing payment card data as well as tokens in their databases. This can create a challenge with some token schemes, as it makes it nearly impossible to determine what is a token and what is a payment card number.”
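The commingling problem is concrete with format-preserving tokens. The usual heuristic for spotting a card number in a database is the Luhn checksum, sketched below, but a 16-digit token has roughly a 1-in-10 chance of coincidentally passing it, so the check cannot reliably separate tokens from PANs.

```python
def luhn_valid(number: str) -> bool:
    # Standard Luhn checksum used to sanity-check card numbers:
    # double every second digit from the right, subtract 9 if the
    # doubled digit exceeds 9, and require the total to end in 0.
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

# A real PAN passes the Luhn check...
assert luhn_valid("4111111111111111")
# ...but a format-preserving token may pass it too, purely by chance,
# so a database mixing both cannot be cleanly triaged this way.
```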
The PCI DSS asks merchants to prove they do not store payment card data within their environment. Data commingling makes meeting that requirement nearly impossible.
3. Similar to data commingling, some companies elect to use multiple tokenization solutions, which could lead to some unique card processing challenges.
“In the event you have tokens from multiple providers present, with no business logic around which tokens can be used with which service provider, there exists an opportunity for the merchant to try to use the wrong token to process a transaction. In other words, the merchant uses a token from Company B to try to process a transaction through Company A. Unfortunately for the merchant, it’s not going to work,” says Pezold.
All things considered, if a merchant implements tokenization correctly, the risks associated remain quite limited.
Any business environment handling sensitive data should use tokenization to reduce risk and secure data. Companies ranging from e-commerce startups to multinational Fortune 500 companies utilize tokenization in mobile environments, throughout call centers, for file batching, and more.
Those hosting their own tokenization platform in-house must plan for additional architecture and storage beyond the initial cost of implementation, especially if they use single-use tokenization. With single-use tokenization, every transaction equates to a newly generated token, which strains storage and slows lookup response times. For those using a cloud-based provider, more storage will still cost you, but your infrastructure can remain the same.
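A quick simulation makes the storage difference concrete. The numbers here are invented for illustration: 10,000 transactions spread across 1,000 repeat customers, each reusing one card.

```python
import random

random.seed(1)

# Hypothetical workload: 10,000 transactions from 1,000 distinct cards.
cards = [f"card-{i}" for i in range(1000)]
transactions = [random.choice(cards) for _ in range(10_000)]

single_use_vault = []  # single-use: one new token per transaction
multi_use_vault = {}   # multi-use: one token per distinct card

for i, card in enumerate(transactions):
    single_use_vault.append((f"tok-{i}", card))
    multi_use_vault.setdefault(card, f"tok-{card}")

assert len(single_use_vault) == 10_000  # grows with transaction volume
assert len(multi_use_vault) <= 1_000    # capped by the number of distinct cards
```

The single-use vault grows linearly with transaction volume forever; the multi-use vault plateaus once the customer base stops growing, which is the "data vault bloat" trade-off Pezold describes.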
Before you even talk to a tokenization provider, you need to understand where sensitive data exists in your environment. The best way to do this is through a card flow diagram. With this diagram, you can answer questions like, what technologies/people/software store, handle, maintain, and transfer credit card data? Then you have to answer the even bigger question: will you roll tokenization out all at once or in phases across the different acceptance channels?
PCI tokenization as a technology is pretty straightforward. The hard part is selecting the right partner. First, you must decide whether a third party or your payment processor is right for you. Using your payment processor as your tokenization solution provider limits you to processing only with that processor, but you may be able to work out a good implementation deal. Third-party PCI tokenization solutions, on the other hand, make it simple to work with multiple payment processors. Ultimately, you’ve got to decide what’s right for your unique environment and business model.
It’s essential to carefully evaluate a provider before jumping headfirst into a new tokenization solution. Perform a risk assessment when selecting a tokenization service provider to ensure you’re contracting with a secure entity. Make sure that your service provider is PCI DSS compliant, and follow up with them each year to verify their compliance.
For more information about tokenization products and how they could affect your PCI DSS requirements, check out tokenization guidance from the PCI SSC.