Mask PII
In this recipe, we'll show you how to mask personally identifiable information (PII) captured in your application logs.
Guide setup
This guide assumes you've created a Datable.io account.
This recipe is best implemented after standardizing your log data, and makes a great complement to the standardize logs recipe.
Code Sample
We've noticed that some of our log data includes PII: a payment service is logging credit card information. To stay compliant with regulations, we need to mask that information before it reaches our data infrastructure.
First, we create a new pipeline, or a new transformation step if we're adding to an existing pipeline.
You will see the following pre-populated code in your transform step:
/***
* You have access to the following inputs:
* - `metadata`: { timestamp, datatype }
* -> datatype is a string, and can be 'logs' or 'traces'
* - `record`: { resource, body, ... }
*/
// These are the key attributes of an opentelemetry formatted record
const { attributes, resource, body } = record
const { timestamp, datatype } = metadata
// Here we only allow records tagged as 'logs' to pass through.
if (datatype !== 'logs') return null
Pass through unrelated data
Next, we want to narrow down our data stream to just the service that is emitting sensitive information. All other records are forwarded to the next step.
if (record.resource['service.name'] !== "paymentService") return record
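To see how this guard behaves, here is a minimal standalone sketch that wraps it in a function. The `passThroughUnrelated` name and the sample records are hypothetical, and the record shape is assumed from the destructuring in the pre-populated code. In Datable the check runs directly in the transform step.

```javascript
// Hypothetical wrapper for illustration only.
// Returns the record unchanged when it should skip masking, or undefined
// when it belongs to the payment service and needs further processing.
function passThroughUnrelated(record) {
  // Optional chaining guards against records with no resource (an assumption;
  // the recipe's one-liner expects resource to always be present).
  if (record.resource?.['service.name'] !== 'paymentService') return record
  return undefined
}

// Sample records (made up for this sketch).
const checkoutLog = { resource: { 'service.name': 'checkoutService' }, body: 'cart updated' }
const paymentLog = { resource: { 'service.name': 'paymentService' }, body: 'charge ok' }

console.log(passThroughUnrelated(checkoutLog) === checkoutLog) // true: forwarded as-is
console.log(passThroughUnrelated(paymentLog)) // undefined: continues to masking
```

Returning the original record object, not a copy, keeps unrelated logs byte-for-byte identical as they move to the next step.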
Identify and mask credit card information
Now that we've passed irrelevant logs forward, we can write logic against the payment service to mask the credit card information.
if (!regex.creditCard.test(record.body)) return record;
record.body = record.body.replace(regex.creditCard, match => {
return 'x'.repeat(match.length - 4) + match.slice(-4);
});
return record
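The snippet relies on a `regex.creditCard` pattern that isn't defined in this recipe. As a rough sketch of the masking step in isolation, the example below substitutes a hypothetical pattern, `CARD_PATTERN`, that matches 13 to 16 digits optionally separated by spaces or hyphens. It is an assumption for illustration, not Datable's built-in pattern. The `maskCardNumbers` helper is likewise hypothetical. The pattern uses the global flag so every card number in a message is masked, and the sketch skips the separate `test` call because `replace` leaves a string untouched when nothing matches.

```javascript
// Hypothetical pattern (an assumption, not Datable's regex.creditCard):
// 13 to 16 digits, each optionally followed by a single space or hyphen.
const CARD_PATTERN = /\b(?:\d[ -]?){12,15}\d\b/g

// Replace every character of each match with 'x', except the last four.
function maskCardNumbers(text) {
  return text.replace(CARD_PATTERN, match =>
    'x'.repeat(match.length - 4) + match.slice(-4)
  )
}

console.log(maskCardNumbers('charged card 4111111111111111 for $20'))
// charged card xxxxxxxxxxxx1111 for $20
console.log(maskCardNumbers('no card here'))
// no card here
```

If you keep a `test` guard alongside a global pattern, be aware that `test` advances `lastIndex` on global regexes, so repeated calls against the same pattern object can give inconsistent results.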
Credit card numbers embedded as text in log messages are now masked down to their last four digits, which stays within the PCI DSS limit on how much of a card number may be displayed.
Datable makes it easy to perform any transformation on your data with pure JavaScript. Check out our recipes on tail-based sampling for log data to start reducing your cloud costs.