— by Lori Brown @SQLSupahStah
In my last blog post I showed how to implement TDE encryption on a database and the performance impact of doing so. https://sqlrx.wordpress.com/2015/06/25/sql-server-tde-encryption-and-query-performance/ Today’s post will show how to implement encryption at the column level and the performance impact as well as a way to help make queries using encrypted columns faster.
When should data be encrypted?
Data is usually encrypted when a database holds sensitive information. The common example of sensitive data is credit card numbers (CCN) or in the US a Social Security Number (SSN). However, with identity theft on the rise many companies are finding it necessary to protect all of their customer’s personally identifiable information (PII) such as name, address, phone number, etc. Basically any information held in a database that can be used to identify or find an individual is a good candidate for encryption.
How to encrypt columns of a table
To demonstrate, I created a database for my ABCompany then grabbed some data out of an AdventureWorks database to create a fictional Employee table that contains EmpID, FirstName, LastName, Address, City, StateProv, PostalCode and SSN. Notice all the PII info that we will need to encrypt. And finally I created a couple of indexes as well. The table holds 8,688 records so it is not big but should work for this demo.
USE [ABCompany]
GO
CREATE TABLE [dbo].[Employees](
[EmpID] [int] IDENTITY(1,1) NOT NULL, [FirstName] [varchar](60) NULL, [LastName] [varchar](60) NULL, [Address] [varchar](60) NULL, [City] [varchar](30) NULL, [StateProv] [varchar](3) NULL, [PostalCode] [varchar](15) NULL, [SSN] [varchar](12) NULL) ON [PRIMARY]
GO
CREATE CLUSTERED INDEX [CIX_Employees] ON [dbo].[Employees]
([EmpID]) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_Employees_Lname_Fname] ON [dbo].[Employees]
([LastName], [FirstName]
) ON [PRIMARY]
GO
Demo query
For this demo I am going to use a simple query and will set STATISTICS IO ON and STATISTICS TIME ON as well as have the execution plan displayed after each run of the query.
SELECT EmpID, FirstName, LastName, StateProv, SSN
FROM Employees
WHERE LastName = ‘Bell’
Here are the results for this query with no encryption on columns:
SQL Server parse and compile time:
CPU time = 10 ms, elapsed time = 10 ms.
(29 row(s) affected)
Table ‘Employees’. Scan count 1, logical reads 70, physical reads 0
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 83 ms.
Encryption
Now let’s begin the process to encrypt PII columns. We have to create a database master key, a certificate and a symmetric key with passwords. All have to be opened and the database master key and certificate should be backed up.
— Create Database Master Key
USE ABCompany
GO
CREATE MASTER KEY ENCRYPTION BY PASSWORD = ‘Pa$$w0rd’
GO
— Have to open the master key before you can do anything else.
OPEN MASTER KEY DECRYPTION BY PASSWORD = ‘Pa$$w0rd’
GO
— Backup DMK
BACKUP MASTER KEY
TO FILE = ‘C:\KeyBkup\ExportedMK.dmk’
ENCRYPTION BY PASSWORD = ‘Sq1&x!’;
GO
— Create cert for column encryption
CREATE CERTIFICATE ABCoCert
WITH SUBJECT = ‘Cert for Column Level Encryption’;
GO
— Backup cert
BACKUP CERTIFICATE ABCoCert
TO FILE = ‘C:\KeyBkup\Certbkup.cer’
WITH PRIVATE KEY
(FILE = ‘C:\KeyBkup\Certbkup.pvk’,
ENCRYPTION BY PASSWORD = ‘St40ngPw6!’);
GO
— Create symmetric key
CREATE SYMMETRIC KEY ABCoSymKey
WITH ALGORITHM = AES_256
ENCRYPTION BY CERTIFICATE ABCoCert
GO
— Grant control permission to a user or a role so the key and certificate can be used
GRANT CONTROL ON CERTIFICATE::[ABCoCert] TO [Domain\User]
GRANT VIEW DEFINITION ON SYMMETRIC KEY::[ABCoSymKey] TO [Domain\User]
GO
— Open the symmetric key
USE ABCompany
GO
OPEN SYMMETRIC KEY ABCoSymKey DECRYPTION
BY CERTIFICATE ABCoCert
GO
This sets the stage to encrypt things. Next we have to add columns that will hold encrypted data. The new columns must be the VARBINARY data type. You can do this either in SSMS or through T-SQL. I actually use a hybrid approach. I will generally add the new columns which by default will be created at the end of the list of columns already present. Once data from the target columns are encrypted into the new ones I will then drop the original unencrypted column and then rename the new encrypted column to the original column’s name. However this will make it so that the new columns will not be in the same ordinal position which if application or stored procedure code uses anything like “SELECT * FROM Employees” or like “…ORDER BY 3,2,5” then code is likely to break because the columns will not be in the expected order any more. Some people may want to put their columns back in the original order which is more complicated.
— Add new columns for encryption
USE ABCompany
GO
ALTER TABLE dbo.Employees ADD FName varbinary(128) NULL
GO
ALTER TABLE dbo.Employees ADD LName varbinary(128) NULL
GO
ALTER TABLE dbo.Employees ADD Add1 varbinary(128) NULL
GO
ALTER TABLE dbo.Employees ADD cty varbinary(128) NULL
GO
ALTER TABLE dbo.Employees ADD st varbinary(128) NULL
GO
ALTER TABLE dbo.Employees ADD zip varbinary(128) NULL
GO
ALTER TABLE dbo.Employees ADD ss varbinary(128) NULL
GO
After the new columns are added, I will generally use SSMS to reorder the columns. Just right click on the table then go to Design and move the columns into the order that you want them. Yes this does simply drop and recreate the table but it does it for you including dropping and recreating column defaults, indexes, foreign keys and anything else on the table. If you don’t want to have SSMS do it all then you can make it generate the code to do all the work at a later time. I also tend to change the column order before adding encrypted data to the new columns so that it is having to work on as little data as possible. You will have to change the column order when users are not in the database so plan an outage.
Next we encrypt all PII data by pushing data from the unencrypted columns and using ENCRYPTBYKEY to encrypt and update the new varbinary columns. I found that this step is what can take quite a while. The last time I did this exercise I was working on a 10 million record table and encrypted about the same number of columns as in this example. It took over 6 hours to complete. So, plan on lots of time to do this if you have a large table.
— Encrypt PII columns
UPDATE Employees
SET FName = ENCRYPTBYKEY(KEY_GUID(‘ABCoSymKey’),CONVERT(varbinary, FirstName) )
GO
UPDATE Employees
SET LName = ENCRYPTBYKEY(KEY_GUID(‘ABCoSymKey’),CONVERT(varbinary, LastName) )
GO
UPDATE Employees
SET Add1 = ENCRYPTBYKEY(KEY_GUID(‘ABCoSymKey’),CONVERT(varbinary, Address) )
GO
UPDATE Employees
SET cty = ENCRYPTBYKEY(KEY_GUID(‘ABCoSymKey’),CONVERT(varbinary, City) )
GO
UPDATE Employees
SET st = ENCRYPTBYKEY(KEY_GUID(‘ABCoSymKey’),CONVERT(varbinary, StateProv) )
GO
UPDATE Employees
SET zip = ENCRYPTBYKEY(KEY_GUID(‘ABCoSymKey’),CONVERT(varbinary, PostalCode) )
GO
UPDATE Employees
SET ss = ENCRYPTBYKEY(KEY_GUID(‘ABCoSymKey’),CONVERT(varbinary, SSN) )
GO
Now it’s time to drop the old unencrypted columns and rename the new encrypted columns to the original name. Be aware that if you drop a column that is part of an existing index, you will have to either remove the column from the index or you will have to drop the index first before you can drop and rename the column. In my case I have to drop my non-clustered index then I can rename things. I really am not going to need those columns to be part of an index any longer since indexing an encrypted column will not help query performance.
DROP INDEX [IX_Employees_Lname_Fname] ON [dbo].[Employees]
GO
ALTER TABLE [dbo].[Employees] DROP COLUMN [FirstName]
GO
EXECUTE sp_rename N’dbo.Employees.FName’, N’FirstName’, ‘COLUMN’
GO
ALTER TABLE [dbo].[Employees] DROP COLUMN [LastName]
GO
EXECUTE sp_rename N’dbo.Employees.LName’, N’LastName’, ‘COLUMN’
GO
ALTER TABLE [dbo].[Employees] DROP COLUMN [Address]
GO
EXECUTE sp_rename N’dbo.Employees.Add1′, N’Address’, ‘COLUMN’
GO
ALTER TABLE [dbo].[Employees] DROP COLUMN [City]
GO
EXECUTE sp_rename N’dbo.Employees.cty’, N’City’, ‘COLUMN’
GO
ALTER TABLE [dbo].[Employees] DROP COLUMN [StateProv]
GO
EXECUTE sp_rename N’dbo.Employees.st’, N’StateProv’, ‘COLUMN’
GO
ALTER TABLE [dbo].[Employees] DROP COLUMN [PostalCode]
GO
EXECUTE sp_rename N’dbo.Employees.zip’, N’PostalCode’, ‘COLUMN’
GO
ALTER TABLE [dbo].[Employees] DROP COLUMN [SSN]
GO
EXECUTE sp_rename N’dbo.Employees.ss’, N’SSN’, ‘COLUMN’
GO
Demo Query – with encryption
The query being used to test performance now has to change to be able to display encrypted data. We have to use the DECRYPTBYKEY function to decrypt data especially so that we can search for values. If we don’t decrypt then searching for LastName = ‘Bell’ will return 0 records. The query now looks like this:
OPEN SYMMETRIC KEY ABCoSymKey DECRYPTION
BY CERTIFICATE ABCoCert
GO
SELECT EmpID,
CONVERT(VARCHAR(60),DECRYPTBYKEY(FirstName)),
CONVERT(VARCHAR(60),DECRYPTBYKEY(LastName)),
CONVERT(VARCHAR(3),DECRYPTBYKEY(StateProv)),
CONVERT(VARCHAR(12),DECRYPTBYKEY(SSN))
FROM Employees
WHERE CONVERT(VARCHAR(60),DECRYPTBYKEY(LastName)) = ‘Bell’
Don’t forget … Before running any query that encrypts or decrypts you have to open the symmetric key. I did this before running the above query. Here are the performance results of the query with encrypted columns and no indexes.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 3 ms.
(29 row(s) affected)
Table ‘Employees’. Scan count 1, logical reads 872, physical reads 0
SQL Server Execution Times:
CPU time = 32 ms, elapsed time = 107 ms.
The logical reads increased by 800 reads and the Execution time for both CPU and elapsed went up. This could be more efficient if it had a better index to use. The clustered index was used to scan the table. However searches are commonly performed on last name and\or first name and those columns are now encrypted and not good candidates for indexes. In my case in working with the 10 million record table, the query response went from 200 ms to 3 minutes to get my query to return results. This was unacceptable!! What to do????……
Partial plain-text value solution (PPV)
In researching this problem I came across the partial plain-text value solution that worked really well. Partial plain-text means to take part of the value in the encrypted column(s), obfuscating the value by hashing it in some way and placing the obfuscated value into a searchable unencrypted column that can be indexed. Searches would be done using the searchable column and will make performance better.
Add a column to hold the searchable value.
— This column will be used for name searches
ALTER TABLE dbo.Employees ADD NSrch varchar(50) NULL
GO
Add a stored procedure or function that is used when a new value is put into the column. In my case I made a random string that can be used to make the partial values virtually unrecognizable.
CREATE PROCEDURE dbo.StrRand
@RandStr VARCHAR(44) OUTPUT
AS
DECLARE @Len INT
DECLARE @PoolLength INT
DECLARE @LoopCount INT
DECLARE @CharPool VARCHAR(200)
SET @Len = RAND() * 20 + 15 — Set length here – min length 15, max length 35
— define allowable characters – easy to read this way and easy to omit easily confused chars
— can be a string of any characters
SET @CharPool = ‘abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ’ — Alpha characters only
SET @PoolLength = LEN(@CharPool)
SET @LoopCount = 0
SET @RandStr = ”
— loop through grabbing random chars and adding to the random string until it reaches max length
WHILE (@LoopCount < @Len) BEGIN
SET @RandStr = @RandStr + SUBSTRING(@Charpool, CONVERT(INT,RAND()*@PoolLength), 1)
SET @LoopCount = @LoopCount + 1
END
GO
Now do a mass update using the code below of the NSrch column to create a column that can be used in searches. In this case I have decided to concatenate the first 3 characters of the last name + the random string + the last 3 characters of the first name as my searchable value. A record that holds info for the employee Julia Bell would have a searchable value that would not be encrypted that looks like this – BeldewAqpqbxBRfjsjdSBASJul. I know…I know…This is done using a dreaded cursor but it works for a mass update. I ran this a few times to get all the records in the table set up with a searchable column.
OPEN SYMMETRIC KEY ABCoSymKey DECRYPTION
BY CERTIFICATE ABCoCert
GO
CREATE TABLE #namework (empid INT, namesearchable VARCHAR(50))
DECLARE @col1 INT
DECLARE @col2 VARCHAR(60), @col3 VARCHAR(60), @strrand VARCHAR(50)
DECLARE srchcur CURSOR FOR
SELECT TOP 1000 EmpID, — set TOP x number for the number of records to work on
LEFT(CONVERT(VARCHAR(60),DECRYPTBYKEY(FirstName)),3),
LEFT(CONVERT(VARCHAR(60),DECRYPTBYKEY(LastName)),3)
FROM Employees
WHERE NSrch IS NULL
OPEN srchcur
WHILE (1=1)
BEGIN
EXECUTE StrRand @RandStr = @strrand OUT
FETCH NEXT FROM srchcur INTO @col1, @col2, @col3
IF @@FETCH_STATUS < 0 BREAK
IF LEN(@col2) < 3 –check to see if FirstName is less than 3 characters, if so then add a space
SET @col2 = @col2+SPACE(1)
IF LEN(@col3) < 3 –check to see if LastName is less than 3 characters, if so then add a space
SET @col3 = @col3+SPACE(1)
SET @strrand = @col3+@strrand+@col2
INSERT INTO #namework VALUES (@col1, @strrand)
END
CLOSE srchcur
DEALLOCATE srchcur
UPDATE t1
SET t1.NSrch = t2.namesearchable
FROM Employees t1
JOIN #namework t2
ON t1.EmpID = t2.empid
DROP TABLE #namework
GO
And finally, add an index to the NSrch column.
— add index to NSrch column to make searches on column faster
CREATE NONCLUSTERED INDEX IX_Employees_NSrch ON [Employees](NSrch)
GO
Here is what the NSrch column looks like with the encrypted first and last name columns…
Demo Query – with encryption and searchable column
The query that was originally looking for employees with the last name = Bell has to be changed a bit to use the searchable column in the WHERE clause.
OPEN SYMMETRIC KEY ABCoSymKey DECRYPTION
BY CERTIFICATE ABCoCert
GO
SELECT EmpID,
CONVERT(VARCHAR(100),DECRYPTBYKEY(FirstName)) AS FirstName,
CONVERT(VARCHAR(100),DECRYPTBYKEY(LastName)) AS LastName,
CONVERT(VARCHAR(100),DECRYPTBYKEY(StateProv)) AS StateProv,
CONVERT(VARCHAR(100),DECRYPTBYKEY(SSN)) AS SSN
FROM Employees
WHERE NSrch like ‘Bel%’
Here are the performance results of the query with encrypted columns and using the index on the NSrch column.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
(29 row(s) affected)
Table ‘Employees’. Scan count 1, logical reads 100, physical reads 0
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 95 ms.
Performance is much better! It is not as good as when there was no encryption but that is to be expected. The index on the NSrch column was used and less work was performed which is important. I put together a table of all the query runs so it is easier to see. On my 10 million record table that was encrypted using a searchable column in the query made response time go from 3 minutes back to a little over 320 ms. On larger tables using an unencrypted searchable column in query WHERE clauses will be very important.
Changes to code because of encryption
Your application or database code would have to be changed to decrypt encrypted data and encrypt new data anywhere that encrypted columns are used in code. Additionally the symmetric key must be opened and as a best practice should be closed when sessions are finished querying data.
— Open symmetric key
OPEN SYMMETRIC KEY ABCoSymKey DECRYPTION BY CERTIFICATE ABCoCert
— Decrypt
CONVERT(VARCHAR(50),DECRYPTBYKEY(<<encrypted column>>))
— Encrypt
ENCRYPTBYKEY(KEY_GUID(‘ABCoSymKey’),CONVERT(varbinary, <<encrypted column>>))
— Close symmetric key
CLOSE SYMMETRIC KEY ABCoSymKey
Clean up
As a last thing I always like to give the statements that remove keys should you ever want to.
— Remove encryption keys
USE ABCompany
GO
CLOSE SYMMETRIC KEY ABCoSymKey
GO
DROP SYMMETRIC KEY ABCoSymKey
GO
DROP CERTIFICATE ABCoCert
GO
DROP MASTER KEY
GO
Original PPV reference – https://wateroxconsulting.com/archives/optimizing-protected-indexes/
For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRxSupport@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. We love to talk tech with anyone in our SQL family!
Session expired
Please log in again. The login page will open in a new tab. After logging in you can close it and return to this page.
Instead of creating a new column and doing the encryption, can we not do that on the existing column?
UPDATE Employees
SET FName = ENCRYPTBYKEY(KEY_GUID(‘ABCoSymKey’),CONVERT(varbinary, FirstName) )