What are Encryption and Tokenization?
Encryption and tokenization are complementary tools used by systems that need to protect sensitive information, especially in secure payments and private transactions. Encryption uses keys to scramble data so that it cannot be read while it is transmitted across networks or held in storage; by contrast, tokenization replaces sensitive data with randomly generated strings. Encrypted data can be converted back to its original state by anyone who has access to the key (or, for weak schemes, to sufficient computing power to ‘crack’ the code), while a token cannot be mathematically converted back to its underlying value: the original data can only be retrieved from the system that issued the token.
The types of data protected by both encryption and tokenization include:
- Personally identifiable information (PII), like name and social security number
- Cardholder data (CHD) like card numbers and expiration dates
- Protected Health Information (PHI), such as medical test results
- Other personal data, which is legally defined differently around the world through statutes like GDPR, CCPA, LGPD, and the Australian Privacy Act
While they are closely connected, encryption and tokenization deliver complementary benefits, not competing ones.
How Does Encryption Work? What is it For?
Encryption works very much like the codes you might see in a spy movie: the plain text of a communication is scrambled according to a shared algorithm, so that only someone who has the key can unscramble it back to its original state. A very simple algorithm would be to replace the letter a with the number 1, the letter b with the number 2, and so on through 26 for z, so that the word hello is represented as 8.5.12.12.15.
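The letter-to-number scheme described above can be sketched in a few lines of Python. The function names `encode` and `decode` are made up for illustration, and this toy cipher offers no real security.

```python
import string

# Map each lowercase letter to its 1-based position in the alphabet (a=1 ... z=26).
LETTER_TO_NUMBER = {
    letter: index for index, letter in enumerate(string.ascii_lowercase, start=1)
}
NUMBER_TO_LETTER = {index: letter for letter, index in LETTER_TO_NUMBER.items()}


def encode(word: str) -> str:
    """Replace each letter with its alphabet position, separated by dots."""
    return ".".join(str(LETTER_TO_NUMBER[letter]) for letter in word.lower())


def decode(cipher_text: str) -> str:
    """Reverse the substitution by looking each number up in the same table."""
    return "".join(NUMBER_TO_LETTER[int(number)] for number in cipher_text.split("."))


print(encode("hello"))         # 8.5.12.12.15
print(decode("8.5.12.12.15"))  # hello
```

Here the lookup table plays the role of the key: anyone who knows it (or guesses it) can read the message.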
The strength of encryption is determined by the algorithm used and by the length and secrecy of the key. A well-designed algorithm with a long key produces encryption that is harder to break. Although the algorithms used to secure data on the web are far harder to break than the example above, that example shows one of the downsides of encryption: in theory only someone with the key should be able to read the encrypted message, but in practice encryption schemes can be defeated, whether by stealing the key or, for weak algorithms and short keys, by applying enough computing power to work it out. Encryption makes it much harder to uncover the information concealed within encrypted data, but it does not guarantee that the information cannot be exposed.
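To illustrate what "applying enough computing power" means, the sketch below brute-forces a toy shift cipher by trying every possible key. The cipher, the helper names, and the expected word are all assumptions made for illustration. Modern algorithms such as AES have key spaces far too large for this approach, which is why attackers usually go after the key itself.

```python
import string

ALPHABET = string.ascii_lowercase


def shift(text: str, key: int) -> str:
    """Shift every lowercase letter by `key` positions, wrapping around at z."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + key) % 26] if ch in ALPHABET else ch
        for ch in text
    )


def brute_force(cipher_text: str, known_word: str) -> list[tuple[int, str]]:
    """Try all 26 keys and keep the candidates containing a word we expect to see."""
    candidates = []
    for key in range(26):
        plain = shift(cipher_text, -key)
        if known_word in plain:
            candidates.append((key, plain))
    return candidates


secret = shift("pay the invoice today", 3)
print(secret)
print(brute_force(secret, "invoice"))

# Key-space sizes: 26 keys for the toy cipher versus 2**128 for a 128-bit key.
print(26, 2**128)
```

The loop finishes instantly because there are only 26 keys to test; the same exhaustive search against a 128-bit key is not feasible with today's hardware.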
Therefore, encryption works great for:
- Data that needs to be preserved in its original format for future use and reversibility (medical records, PII)
- Data that needs to be searched and analyzed directly without decryption using techniques like homomorphic encryption or searchable encryption, preserving privacy while enabling data utilization
- Securing backups of employee records or cloud storage
How Does Tokenization Work? What is it For?
Tokenization works more like a cloakroom ticket: when you drop off your coat, you receive a ticket, which can be used to retrieve your property at a later date. Similarly, in a tokenized environment, sensitive data is exchanged for a different string (generally randomly-generated characters); the sensitive data is stored in a secure vault, and can only be retrieved by presenting the token.
Unlike encrypted data, tokenized data alone cannot be used to uncover the raw data because there is no implicit relationship between the characters in the token and those of the underlying data. Instead, an attacker would need both to obtain the token and to authenticate with the token vault, which greatly reduces the potential for a breach when the vault is secured according to recognized standards such as those published by NIST.
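The cloakroom-ticket idea can be sketched as a small in-memory vault. This is a minimal illustration under stated assumptions, not how any particular provider implements its service: the `TokenVault` class, the `tok_` prefix, and the API-key check are all hypothetical, and a real vault would be a hardened, access-controlled system rather than a Python dictionary.

```python
import secrets


class TokenVault:
    """Toy in-memory vault (hypothetical); real vaults are hardened services."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}         # token -> sensitive value
        self._authorized_keys: set[str] = set()  # keys allowed to detokenize

    def issue_api_key(self) -> str:
        key = secrets.token_hex(16)
        self._authorized_keys.add(key)
        return key

    def tokenize(self, sensitive_value: str) -> str:
        # The token is random, so it carries no information about the value.
        token = "tok_" + secrets.token_urlsafe(16)
        self._store[token] = sensitive_value
        return token

    def detokenize(self, token: str, api_key: str) -> str:
        # Holding the token is not enough: the caller must also authenticate.
        if api_key not in self._authorized_keys:
            raise PermissionError("caller is not authorized to detokenize")
        return self._store[token]


vault = TokenVault()
api_key = vault.issue_api_key()
token = vault.tokenize("4111 1111 1111 1111")  # a widely used test card number

print(token)                             # random string, different on every run
print(vault.detokenize(token, api_key))  # original value, only with valid credentials
```

Calling `detokenize` with the token but without a valid key raises `PermissionError`, which mirrors the point above: a stolen token is useless without access to the vault.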
Therefore, tokenization works great for:
- Quickly and efficiently accessing data as it doesn't require intensive encryption/decryption processes for each read/write operation
- Data that doesn't need to be reversed to its original form (e.g., credit card numbers for billing purposes)
- Sharing sensitive data with third parties without exposing the original information because the token substitutes the data
How do Encryption and Tokenization Secure Payments?
When a buyer wants to securely pay for something electronically, they embark on a tricky security journey with the seller. First, they open a Checkout page, into which they type their Cardholder Data; that CHD is then transmitted across the Internet to the seller; the seller then passes that data to a Payment Service Provider (PSP), which confirms the deal, taking money from the buyer’s account and moving it to the seller’s account; the seller then transmits a confirmation back to the buyer and starts the process of delivering whatever was purchased.
There are so many places this could go wrong if it weren’t for both encryption and tokenization playing their parts.
Effectively all online vendors follow the same process. All communications between the buyer and the seller are encrypted using TLS (the successor to SSL, and still often called by that name). Without going into the details of the public/private key asymmetric encryption that TLS uses to establish the connection, suffice it to say that when the little padlock appears in the address bar, the connection is encrypted. Once the seller has received the CHD, communication with the PSP is also encrypted. Depending on the level of their PCI-DSS compliance certification, the vendor may store much of the CHD, as well as customer PII, in their own databases; this data will also be encrypted. That said, vendors cannot store all CHD: for instance, the Card Verification Value (CVV) must never be stored by the vendor once the transaction has been authorized.
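For readers curious about what sits behind the padlock, here is a rough sketch of how a client might open an encrypted connection using Python's standard `ssl` module, which implements TLS (the modern successor to SSL). The function names are made up for this example, and the hostname is whatever the caller supplies; nothing here reflects a specific vendor's setup.

```python
import socket
import ssl
from typing import Optional


def build_client_context() -> ssl.SSLContext:
    """Create a TLS client context that verifies the server's certificate."""
    context = ssl.create_default_context()            # uses the system's trusted CAs
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse legacy protocol versions
    return context


def negotiated_tls_version(hostname: str, port: int = 443) -> Optional[str]:
    """Connect to a caller-supplied host and report the negotiated protocol version."""
    context = build_client_context()
    with socket.create_connection((hostname, port), timeout=10) as raw_socket:
        with context.wrap_socket(raw_socket, server_hostname=hostname) as tls_socket:
            return tls_socket.version()


context = build_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # certificate checks are enforced
print(context.check_hostname)                    # hostname must match the certificate
```

The network call is kept inside `negotiated_tls_version` so the snippet runs offline; calling it with a real hostname would perform the handshake that a browser carries out before showing the padlock.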
Advanced vendors know to add tokenization to this trail, owing to one uncomfortable reality: if their own data storage is hacked, and the hackers are able to decrypt the information they find, all that CHD and PII can be harvested and sold. So they use a tokenization service, such as the one provided by Basis Theory, to place all that sensitive data into a secure vault, and store only otherwise-meaningless tokens in their own systems. Now they are protected both by encryption - which secures data in motion - and tokenization - which secures data at rest.
What Are the Benefits of Adding Tokenization to Encryption?
The benefits of combining tokenization with encryption are two-fold, reducing the risk of data theft while also reducing the cost and complexity of compliance.
Because CHD and PII are no longer physically present in the merchant’s database, they cannot be stolen from it and sold if that database is breached. The raw data is also much harder to reach through social engineering, because it is generally accessed only as needed by systems that are part of a business process and is not accessible to employees.
Merchants using tokenization know that they have secured their customers’ personal data, and they are able to more easily meet compliance requirements like PCI DSS. They can additionally accelerate the design and execution of new products and business flows because they reduce the need for complex and time-consuming compliance procedures that would stifle innovation. For instance, by storing CHD in a secure vault, accessed with tokens, sellers can opt to execute a multi-PSP payments process, allowing them to select the provider that offers both the highest likelihood of a successful transaction and the most favorable terms.
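A multi-PSP process can be as simple as trying providers in priority order until one approves the charge. The sketch below is illustrative only: the provider functions `psp_alpha` and `psp_beta`, the `ChargeResult` type, and the token value are hypothetical stand-ins, and in a real integration the vault provider would typically swap the token for the card data when forwarding the request to each PSP.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class ChargeResult:
    psp_name: str
    approved: bool


# Hypothetical stand-ins for real PSP integrations.
# Each one receives a token from the merchant, never the raw cardholder data.
def psp_alpha(token: str, amount_cents: int) -> ChargeResult:
    return ChargeResult("alpha", approved=False)  # simulate a decline


def psp_beta(token: str, amount_cents: int) -> ChargeResult:
    return ChargeResult("beta", approved=True)    # simulate an approval


Provider = Callable[[str, int], ChargeResult]


def cascade_charge(
    token: str, amount_cents: int, providers: Sequence[Provider]
) -> Optional[ChargeResult]:
    """Try each provider in priority order and stop at the first approval."""
    for charge in providers:
        result = charge(token, amount_cents)
        if result.approved:
            return result
    return None  # every provider declined


result = cascade_charge("tok_example", 2500, [psp_alpha, psp_beta])
print(result)
```

Because the merchant's code only ever handles the token, reordering or adding providers does not change where the cardholder data lives.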
Setting Up Tokenization
Merchants can use payment tokens provided by their PSP, which is adept at setting up and leveraging advanced encryption systems, from TLS to AES and everything in between. But this ties the merchant to one payment provider and doesn’t include the ability to tokenize other PII.
Tokenization is best tackled by working through a provider like Basis Theory, which makes it simple to store, access, and control sensitive data. This can help reduce the stress and strain of attempting to reach PCI-DSS Level One, and enable a shift away from the restrictions of a single PSP like Stripe, and toward a multi-PSP, cascading payments-oriented approach.