A simple tutorial on the basics of encrypting and decrypting column data in SQL Server
If you're a database administrator, it is more than likely that sensitive data is stored within your database. Examples of such data include social security numbers, passwords, medical history, private government data, etc. When it comes to sensitive data of any kind, it is very important to ensure its confidentiality, which means protecting the data from being accessed (or viewed) by unauthorized entities. This is why it is important to understand the basics of encryption and how to utilize encryption algorithms and methods to protect your data. The purpose of this entry is to demonstrate a basic method of data encryption and decryption in SQL Server (applicable to versions 2008 and above).
In a nutshell, Encryption is the process of converting data (in this case, referred to as clear text) into a form which cannot be easily understood by anyone except authorized parties, called ciphertext. On the other hand, Decryption is taking this ciphertext and converting it back to its original clear text.
There are many ways to encrypt data using SQL Server. In fact, Microsoft has a diagram that describes the hierarchy of the various approaches one can take to encrypt data. This tutorial will demonstrate an example of how to encrypt data using a symmetric key secured by a certificate and a master key. Additionally, it will explain the different layers of SQL Server's Encryption Hierarchy and its components.
Common SQL Server encryption methods begin with the Service Master Key (SMK), which is essentially the root of the encryption hierarchy. The SMK is a symmetric key encrypted using the Windows Operating System Level Data Protection API (DPAPI) and the local machine key.
The SMK can only be opened by the Windows service account under which it was created or anyone with access to both the service account name and its password. There is only one Service Master Key per instance of SQL Server and is automatically generated when it is needed to encrypt another key. Therefore, the use of this key is essentially automated and there is no need for this key to be generated by the database administrator.
The next level of the encryption hierarchy is the database level. The Database Master Key (DMK) is a database level Symmetric key that is unique to each database within the SQL Server instance. This key is encrypted using the Service Master Key and a password (which is required in SQL Server, but not in Azure SQL Database). The DMK is not automatically generated and must be created by the database administrator. Let's say in my SQL Server instance, I have a database called "DataEncr". To create a DMK, I will execute the following command:
USE [DataEncr]
CREATE MASTER KEY
ENCRYPTION BY PASSWORD = 'mpwd';
The first line indicates the name of the database for which the key will be created and will be used for each command The second line indicates that we are creating a master key. The last line indicates we are encrypting the key with the password 'mpwd' (this is just an example password; the real password must be complex enough to follow Window's Password Policy).
The primary role of the DMK is to protect the private keys of certificates and asymmetric keys that are present in the database for which the Master Key was created. I will explain more about these in detail in the next session.
The next level of the encryption hierarchy is data level encryption, which is handled by a Symmetric Key. This is the private key that will be used to encrypt clear text and/or decrypt ciphertext.
When analyzing the SQL Server encryption hierarchy, there are many possible ways a database admin can construct and encrypt his or her symmetric key with varying complexity. For example, the admin can choose to encrypt the symmetric key with simple methods such as a password or another symmetric key, or with complex methods such as multiple asymmetric keys, each with a different cryptographic algorithm. Generally, the more complex the key's encryption algorithm is setup, the more secure the encrypted data is. However, the performance of encryption and decryption tends to decrease as the complexity increases.
In the end, no single algorithm is ideal for all situations. However, a symmetric key must be encrypted by at least one method. To take advantage of a Database Master Key we created, it is best to use a certificate and/or asymmetric key to encrypt the symmetric key, as the DMK adds extra granularity to both. In this entry, I will do so using a certificate.
A Certificate is a digitally signed data level security object. It is used as a means of Public Key Cryptography (PKI), a form of message secrecy in which a user creates a public key and a private key pair as a means of data encryption and decryption, otherwise known as an Asymmetric Key. In this case, the certificate contains the public key and the symmetric key we will create is the private key. Together, they create the public and private key pair.
To implement my certificate, I will use the following statement:
USE [DataEncr]
CREATE CERTIFICATE certKey
WITH SUBJECT = 'Database Encryption Key',
EXPIRY_DATE = '20201231';
The first line creates the certificate and gives it the name "certKey." The second line creates the certificate's subject, which is a brief meaningful description of the certificate. Finally, the third line line sets up the expiration date of the certificate. After the provided date, the certificate can no longer be used to encrypt or decrypt data. The date can be provided in any format and is always in UTC. The value I provided will make the certificate expire 12/31/2020. However, if a date is not provided, the certificate will automatically expire one year after its creation.
Another method that utilizes PKI is the CREATE ASYMMETRIC KEY command, which also creates a public-private key pair. However, this command does not have the ability to provide meta data like an expiration date. There are more differences, but they are beyond the scope of this entry.
Now that we have finished creating the certificate, we can now create our symmetric key. I will do so with the following command:
USE [DataEncr]
CREATE SYMMETRIC KEY symKey
WITH ALGORITHM = AES_128
ENCRYPTION BY CERTIFICATE certKeys;
The first line creates the symmetric key, which I give the name "symKey." The second line indicates the encryption algorithm that will be used to encrypt data. The algorithm I am using in my example is the Advanced Encryption Standard (AES) symmetric key algorithm with a key size of 128 bits. The AES algorithm has been proven to be a reliable algorithm and can be implemented with larger key sizes using the AES_192, and AES_256 algorithms. I will not go into detail explaining the differences between the three, but the AES_128 algorithm is efficient enough to ensure data security (beginning with SQL Server 2016, all other symmetric key algorithms offered by SQL Server besides these are deprecated). Finally, the last line indicates the mechanism in which we use to encrypt the symmetric key. In this case, it will be my certificate "cerKeys."
With this we now have our symmetric key and we can begin using it to encrypt data.
As mentioned before, there are many ways that data can be encrypted. Along with choosing encryption mechanisms, the database admin must also decide the level in which to encrypt the data. One method is called Transparent Data Encryption (TDE) or encrypting data at rest. When enabled, all data that is inserted and retrieved in the database overall is implicitly encrypted and decrypted with a certificate and database encryption key. While not all encryption structures support TDE, my encryption scheme allows for TDE to be implemented.
The method I will be implementing is called Column Data Encryption which allows me to explicitly encrypt and decrypt particular columns of data in any table of my choosing.
Before any data can be encrypted with this method, we must first open (or decrypt) the newly created symmetric key to make it available for use. This is done by executing the following command:
USE [DataEncr]
OPEN SYMMETRIC KEY symKey
DECRYPTION BY CERTIFICATE certKey
The first line indicates that we are opening our newly created symmetric key "symKey". The second line indicates that we are decrypting it with our certificate "certKey." Take note that the key can only be correctly decrypted using the same convention we used to encrypt it, which in my implementation was indeed "certKey".
Now that the key is open we can begin to encrypt and store data. It is important to note that when data is encrypted, it is always represented as a binary value. Therefore, in order to store encrypted data, the column where the ciphertext is being stored must be of BINARY() or VARBINARY() datatype. The length of the value depends on how many bytes the stored data is expected to be. As you will see, encrypted data usually results in a long binary string. To be safe from data truncation and variable data length, I recommend declaring the datatype to be of long length such as VARBINARY(200).
As an example, let's say our database contains a table called Person which contains the basic information on a person, including his or her social security number (or SSN). Here is how our table may be constructed:
| # | Name | Data Type |
|---|---|---|
| 1 | ID | int |
| 2 | Last_Name | varchar(20) |
| 3 | First_Name | varchar(20) |
| 4 | SSN | varbinary(200) |
Of all the columns in this table, the SSN is the most sensitive, and thus we wish to encrypt the data added and/or contained in it. This is done with the function EncryptByKey(). The following is how I will implement the function:
EncryptByKey(Key_GUID('symKey'), @clear_text);
The function EncryptByKey() takes two mandatory arguments. The first is the Globally Unique Identifier (GUID) of the symmetric key you are using to encrypt data. As I have done, you can obtain this ID by using the function Key_GUID(), which takes the name of your symmetric key as its parameter and returns the GUID. The second argument takes the cleartext that will be encrypted by the key. The function then returns a binary value, which is the encrypted cleartext.
The following INSERT query demonstrates how to insert a new row with an encrypted SSN into the table:
INSERT INTO Person (ID, Last_Name, First_Name, SSN) VALUES (1, "John Doe", EncryptByKey(Key_GUID('symKey'), "123456789"))
The query will insert the first two values normally, but will encrypt the SSN string.
Let's say instead your table has an SSN field with a data type varchar(9) that has previously stored social security numbers and wish to encrypt the data in all rows. First you need to create a new field of a binary() or varbinary() datatype (let's call it SSN_encr). Second, and execute the following update query to update all rows:
UPDATE Person SET SSN_encr = EncryptByKey(Key_GUID('symKey'), SSN)
Then, delete the old column containing the cleartext and rename the "SSN_encr" field to "SSN".
If the social security number was successfully encrypted, using a SELECT statement on the resulting table should display as follows:
| ID | Last_Name | First_Name | SSN |
|---|---|---|---|
| 1 | Doe | John | <Binary Data> |
| 2 | Doe | Jane | <Binary Data> |
As you can see, the rows in the SSN field now display <Binary Data> which is shorthand for it's actual encrypted data's binary representation. Here's an example of what the actual data might look like:
0x00F7AE42D8203E4EB248C1252455A217010000009B0552940829CC11B10CDA30E6D2C7386FB8D0FAAA4478D815B6256B33E2340E5D6DBDE3580A4D50A82805FCA96CB376
As you see, even for such short cleartext like a 9 character datum, the resulting ciphertext appears as a very long hexadecimal string, which in fact, is shorthand for binary (hence, the field is of binary() datatype). Just to note, the substring before the six 0's is related to the key and is the same for all data encrypted by this key. The remaining substring after the six 0's is the ciphertext of the original string.
Now say we want to retrieve the original value of the SSN data that we encrypted. We must decrypt the ciphertext back into cleartext. To do so, we must make sure to open the same symmetric key that we used to encrypt the data and use a decryption function. Note that if a symmetric key hasn't been explicitly closed or if the current SQL Server session has not ended, the key should still be open.
An example of a decryption function is the DecryptbyKey() function. The function is implemented as follows:
DecryptbyKey(@cipher_text)
The only mandatory parameter this function takes is the ciphertext you are looking to decrypt. The symmetric key used to decrypt the data is implied after the OPEN SYMMETRIC KEY command is executed.
As an example, let's use the function in a SELECT statement to retrieve the decrypted SSN values in our Person table. To do so, I will implement the command as follows command:
SELECT ID, Last_Name, First_Name, COVERT( VARCHAR(9), DecryptbyKey(SSN) ) AS 'Decr_SSN' FROM Person
Since the DecryptbyKey() function returns a binary value, it is important to use the CONVERT() function to convert the decrypted data to its original datatype. In this case, I will use VARCHAR() in which the result will be my original social security string.
The result of my SELECT statement would be as follows:
| ID | Last_Name | First_Name | Decr_SSN |
|---|---|---|---|
| 1 | Doe | John | 123456789 |
| 1 | Doe | Jane | 987654321 |
Once you finish encrypting or decryption data, it is important to close the symmetric key, or else it will remain open throughout the rest of the user's session (i.e. between the time the user logs in and logs out of SQL server). To do so, use the CLOSE SYMMETRIC KEY command. To close my symmetric key, I will implement the command as follows:
CLOSE SYMMETRIC KEY symKey
After executing this command, the symmetic key "symKey" will be closed and will not be able encrypt or decrypt data until it is opened again.
Alternatively, there is a shorthand way to decrypt our data without the need to explicitly open or close our symmetric key. This can be done using the using the DecryptbyKeyAutoCert() function. This function works the same way as DecryptbyKey(), except that the information pertaining the to the key is passed as parameters into the function. With the key's information provided, the function implicitly calls OPEN SYMMETRIC KEY and CLOSE SYMMETRIC KEY. Particularly, this function can only decrypt data that was encrypted by a symmetric key using a certificate, which is indeed our case.
The mandatory parameters for the function are:
DecryptbyKeyAutoCert(@cert_id, N'@password, @cipher_text )
The first parameter is the ID of the certification that was used to encrypt the symmetric key. This can be obtained by using the CERT_ID() function with the certification name as the parameter. The second parameter is the password used to encrypt the certificate. The password must be provided as an NVARCHAR() datatype, hense the 'N' character before the password string. If a password was never used however, the parameter takes 'NULL'. The last parameter is the ciphertext we with to decrypt.
For our SSN data, I will implement the function using a SELECT statement as follows:
SELECT COVERT( VARCHAR(9), DecryptbyKeyAutoCert(cert_id('certKey'), NULL', SSN ) ) FROM Person
This will return a column with all encrypted social security numbers. DecryptbyKeyAutoCert() also returns data of BINARY() datatype which is why I used the CONVERT() function.
Summary
From this tutorial, you should now have a basic idea as to how to encrypt your data to protect its confidentiality in SQL Server. Specifically, I have encrypted individual column data using a symmetric key, certification, and database master key. Below is a summary of the SQL Server commands and functions used in this tutorial:
-- Reference Database
USE DataEncr
/* Setting up our encryption hierarchy */
-- Creating a Master Key
CREATE MASTER KEY
ENCRYPTION BY PASSWORD = 'mpwd';
-- Creating a certificate
CREATE CERTIFICATE certKey
WITH SUBJECT = 'Database Encryption Key',
EXPIRY_DATE = '20201231';
-- Creating a Symmetric Key
CREATE SYMMETRIC KEY symKey
WITH ALGORITHM = AES_128
ENCRYPTION BY CERTIFICATE certKey;
-- Opening the Symmetric Key
OPEN SYMMETRIC KEY symKey
DECRYPTION BY CERTIFICATE certKey
/* Encrypting Data */
-- Inserting encrypted social security number into the Person table
-- Using EncryptByKey() function
INSERT INTO Person (ID, Last_Name, First_Name, SSN) VALUES (1, "John Doe", EncryptByKey(Key_GUID('symKey'), "123456789"))
-- ALTERNATIVELY - if an unencrypted column with data is already present
-- Create new binary column
ALTER TABLE Person ADD SSN_encr VARBINARY(200);
-- Encrypt social security numbers and place in new column
UPDATE Person SET SSN_encr = EncryptByKey(Key_GUID('symKey'), SSN);
-- Delete old column
ALTER TABLE Person DROP COLUMN SSN;
-- Rename New column
EXEC sp_rename 'Person', 'SSN', 'SSN_encr';
/* Decrypting and retrieving Data With DecryptbyKey() (Key Must Be Open)*/
SELECT COVERT( VARCHAR(9), DecryptbyKey(SSN) ) FROM Person;
/* Close Symmetric Key */
CLOSE SYMMETRIC KEY symKey
/*ALTERNATIVE - Decrypt and retrieving Data With DecryptbyKeyAutoCert() (No need to open key; Decryption with Certificate only) */
SELECT COVERT( VARCHAR(9), DecryptbyKeyAutoCert(cert_id('certKey'), NULL', SSN ) ) FROM Person;
To enable additional users to utilize a symmetric key and decrypt data, they must have specific permissions to access the key. This is especially required for usernames that are used to access data in web application.
For example, the following queries allow for SQL Server username 'User' to be able to encrypt and decrypt data when accessed in a web application:
GRANT VIEW DEFINITION ON CERTIFICATE::certKey TO User
GRANT VIEW DEFINITION ON SYMMETRIC KEY::symKey TO User
GRANT CONTROL ON CERTIFICATE::certKey TO User
Here are some links to Microsoft's official SQL Server documentation for a more in depth analysis on data encryption and the Transact-SQL functions mentioned in this entry: