SpyglassMTG Blog

  • Blog
  • How To Automate Data Masking in Azure Synapse

How To Automate Data Masking in Azure Synapse

How To Automate Data Masking in Azure Synapse

What is data masking?

From Microsoft:

“Azure SQL Database, Azure SQL Managed Instance, and Azure Synapse Analytics support dynamic data masking. Dynamic data masking limits sensitive data exposure by masking it to nonprivileged users.

Dynamic data masking helps prevent unauthorized access to sensitive data by enabling customers to designate how much of the sensitive data to reveal with minimal effect on the application layer. It's a policy-based security feature that hides the sensitive data in the result set of a query over designated database fields, while the data in the database isn't changed.”

So, you can apply various masks on top of your data, without changing your data to protect viewing of sensitive data.

Data Masking occurs at the table and column level and a user can only see unmasked data if they are in an Admin role or have been granted access to view unmasked data. This is not data encryption, so the data at rest is still in its original state, but the display of it is masked.

This makes masking a great feature for reporting, since the masking can be done at the database level, instead of in every report. However, note that masking does not completely protect the data. If a user puts a masked column in the where clause, the where clause will still work the same as if the column were not masked. For example, suppose the salary column is masked and a user runs the query “Select name, salary where salary = 100000.00” and the result is:

Name                 Salary

John Smith       0

Even though John Smith’s salary appears 0, the person that wrote the query knows that his salary is really $100,000.

The following table shows how data can be masked:

 adam1

How does it all work?

To create a mask on a column, you would use the following syntax:

ALTER TABLE {table name]

ALTER COLUMN [column name} ADD MASKED WITH (FUNCTION = '{masking function from above}');

Examples:

ALTER TABLE Data.Membership

ALTER COLUMN LastName ADD MASKED WITH (FUNCTION = 'partial(2,"xxxx",0)');

 

ALTER TABLE Data.Membership

ALTER COLUMN LastName VARCHAR(100) MASKED WITH (FUNCTION = 'default()');

 

To remove the data mask, use this command:

ALTER TABLE {table name}

ALTER COLUMN {column name} DROP MASKED;

You can now GRANT and REVOKE access to masked data via:

GRANT UNMASK TO {user or group};

 

REVOKE UNMASK TO {user or group};

Making it a data driven process

We have enhanced this functionality by creating a masking meta data table that will contain table and column names as well as the masking rules to use on any given column. This allows organizations to add and remove columns from being masked and can adjust how a column is masked. This table is processed by a stored procedure to dynamically affect tables/columns in the meta data table.

This is the table:

CREATE TABLE [dbo].[MASKING_META_DATA]

(

              [Data_Source] [nvarchar](50) NULL,

              [Table_Name] [nvarchar](50) NULL,

              [Field_Name] [nvarchar](500) NULL,

              [MaskRule] [nvarchar](100) NULL,

              [Is_Sensitive] [int] NOT NULL,

              [Filter] [nvarchar](2) NULL,

              [ModifiedDate] [datetime] NULL

)

Entries are added to indicate a column should be masked or unmasked:

 adam2

 

The above entries in the table will cause the data masking pipeline to mark the table/columns as masked using the MaskRule column as the masking method. If the Is_Sensitive flag is set to 1, the column will be masked; if its set to zero, it will be unmasked.

A stored procedure in the SQL dedicated pool called SP_MASK is invoked by a data pipeline on a frequent basis (weekly to start) and will mask or unmask entries contained in the masking meta data table. If nothing changes in the meta data table, no masking rules will change.

This is the source code of the stored procedure that will loop through entries in the meta data table and will mask/unmask the data. It will not perform any security functions but that could easily be added as additional columns in the meta data table for AD (Active Directory) Groups, users, etc.Note, some additional tables will be needed for logging, etc. (or you can skip the logging).

CREATE PROC [dbo].[SP_MASK] AS

BEGIN

select *, row_number()

OVER(ORDER by Field_Name,Table_Name) CH into temp from [MASKING_META_DATA]

declare @sql nvarchar(max)

declare @maskrule varchar(max)

declare @maskflag varchar(5)

declare @sql_unmask nvarchar(max)

declare @n int select @n = 1

declare @IsToBemask varchar(10) select @IsToBemask = 'Yes'

declare @cnt int select @cnt = count(*) from temp

declare @source varchar(100)

declare @table varchar(100)

declare @col varchar(500)

 

while (@cnt >= @n)

BEGIN

BEGIN

TRY

select @source = Data_Source , @table = Table_Name ,@col = Field_Name from temp

where ch = @n

select @maskflag = [Is_Sensitive], @maskrule = [Maskrule]

from [temp]

where [Table_Name] = @table

and [Field_Name] = @col

IF @maskflag = 0 and @IsToBemask = 'Yes'

BEGIN

select @sql = 'ALTER TABLE '+@source+'.['+@table+']

ALTER COLUMN '+@col+' MASKED WITH (FUNCTION = '''+@maskrule+''')'

EXEC(@sql)

 

Update [MASKING_META_DATA] set [Is_Sensitive] = 1 ,[modifieddate] = getdate() where [Table_Name] = @table and [Field_Name] = @col

END

END TRY

BEGIN CATCH

INSERT INTO ETL.MaskingErrorLog SELECT 1,getdate(),ERROR_PROCEDURE(),@table,ERROR_MESSAGE()

END CATCH

SET @N = @N+1

END

drop table temp

END

GO

Summary

Data masking is a very powerful tool in the security toolbox and is easy to implement and maintain, especially in the dynamic manner described. We have implemented this at several clients to follow security needs and regulations. For higher levels of compliance, like EU (European Union) GDPR (General Data Protection Regulation), we would implement a deeper encryption strategy, but that discussion is for another day! If you do not want to go full steam with the stored procedure and meta data table, you can still easily maintain data masking via SQL scripts that you can execute as needed. Enjoy and happy masking! If you have any questions or need help with your masking, contact us today!