How to mask sensitive data in Mule 4 using DataWeave?

As software integration architects and developers, we are entrusted with data by our companies and clients. This data typically ranges from non-sensitive to top-secret. The sensitivity of a piece of data often determines which security practices a development team uses when building an integration solution.

Functional and non-functional requirements can also drive the decision of whether to expose all, part, or none of a sensitive field while the data is in transit or at rest. One way to achieve this is through masking.

Mule 4 offers the following ways to mask sensitive fields:

  1. DataWeave Functions

  2. JSON Logger - Mask fields

  3. Anypoint Security (RTF) - Tokenization Service

This article is the third in a series that focuses on security around Mule applications. The previous article focused on validating the data integrity of a downloaded file. If you are interested in the previous post, feel free to check it out here. This article focuses on keeping data confidential by using masking functions in DataWeave.

Let's begin by introducing two DataWeave functions that can be of use, and then two scenarios in which we can use them.

Introduction to Mask and Replace DataWeave Functions

  • Mask: DataWeave offers the mask function, part of the dw::util::Values module. It replaces the value of a selected field with a masked value wherever that field appears in an object or array.

  • Replace: DataWeave offers the replace function, part of the dw::Core module. It replaces the portion (substring) of a String that matches a regular expression or a literal String with another String.
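
To make the two functions concrete before the use cases, here is a minimal sketch of each. The person variable and its values are hypothetical and are not the payload used later in this post. The script assumes a Mule runtime in which the dw::util::Values module is available.

```dataweave
%dw 2.0
import * from dw::util::Values
output application/json

// Hypothetical sample record, used only for illustration
var person = {
  name: "Jane Doe",
  ssn: "123-45-6789"
}
---
{
  // mask: replace the value of every "ssn" field with a fixed string
  masked: person mask field("ssn") with "***-**-****",
  // replace: hide the first five digits and keep the last four visible
  partial: person.ssn replace /^\d{3}-\d{2}/ with "***-**"
}
```

With this input, masked should contain the record with its ssn value fully hidden, and partial should be the string "***-**-6789". mask is the better fit when a whole field must be hidden wherever it appears, while replace suits cases where part of the value has to stay readable.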

Now let's look at two use cases where we can see these functions in action.

Use Case: Mask United States Social Security Number (SSN)

The Social Security number is a primary identification number in the United States. It is highly sensitive: if someone gets hold of a person's number, they can do significant damage to that person's identity. The following example shows how to mask an SSN.

Example Payload:

The demonstration in this post uses the following payload.

 "firstName" : "Staci",
 "middleInitial" : "A",
 "lastName" : "Cane",
 "dateOfBirth" : "02/03/1999",
 "ssn" : "000-00-0000"
 "firstName": "Temi",
 "middleInitial": "O",
 "lastName": "Bukola",
 "dateOfBirth" : "10/11/20