{"id":617,"date":"2020-03-03T15:59:13","date_gmt":"2020-03-03T15:59:13","guid":{"rendered":"https:\/\/i-spark.nl\/uncategorized\/zo-voorkom-je-de-verwerking-van-persoonsgegevens-deel-1-2-toolonafhankelijk-blacklisten-en-whitelisten\/"},"modified":"2026-01-02T11:18:17","modified_gmt":"2026-01-02T11:18:17","slug":"this-is-how-you-prevent-the-processing-of-personal-data-part-1-2-tool-independent-blacklists-and-whitelists","status":"publish","type":"post","link":"https:\/\/i-spark.nl\/en\/blog\/this-is-how-you-prevent-the-processing-of-personal-data-part-1-2-tool-independent-blacklists-and-whitelists\/","title":{"rendered":"How to prevent the processing of personal data (part 1\/2)"},"content":{"rendered":"<p><a href=\"https:\/\/www.marketingfacts.nl\/berichten\/dataverzameling-en-privacy-blacklisten-en-whitelisten-in-google-analytics\"><em>Published by Marketingfacts<\/em><\/a><\/p>\n<p><strong>Read part 2 <a href=\"https:\/\/i-spark.nl\/en\/blog\/this-is-how-you-prevent-privacy-sensitive-data-collection-part-2-2-blacklisting-and-whitelisting-in-google-analytics\/\">here<\/a>!<\/strong><\/p>\n<h2>Tool-independent blacklists and whitelists<\/h2>\n<p><span style=\"font-weight: 400;\">Since 25 May 2018, the General Data Protection Regulation (GDPR; AVG in Dutch) has applied. Within the GDPR, &#8216;personal data&#8217; is defined broadly, namely as &#8216;any information relating to an identified or identifiable natural person&#8217;. This covers data that relates directly to a person as well as data that, in combination with other data, can be traced back to that person.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Personal data may only be processed for the purpose for which it was obtained. This is called &#8216;purpose limitation&#8217;. To prevent organizations from collecting personal data that they cannot properly justify, they must implement the principles of privacy by design and privacy by default. 
Privacy by design means that an organization ensures that personal data is properly protected when designing products and services. Privacy by default means that an organization must take technical and organizational measures to ensure that, as a standard, it only processes personal data that is necessary for the specific purpose it wants to achieve. This means, for example, that it is not permitted to ask for more data than is necessary to subscribe to a newsletter.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">There is a chance, however, that you may process personal data unlawfully, without you being aware of this. For example, you use tools or scripts on your website that measure which URL a visitor is on. This URL can contain an email address, for example, if the customer visits your website via a customer e-mail.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In this two-part blog, I would therefore like to show you what measures you can take to prevent the unintentional collection and processing of personal data on your website as much as possible. In doing so, personal data is overwritten, not deleted &#8211; so if certain data is collected unintentionally, you can easily recognize it and trace its source.\u00a0\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In part 1, I introduce the &#8216;PII prevention matrix&#8217; and show you how you can use Google Tag Manager to blacklist and whitelist personal data in a tool-independent way &#8211; i.e. not for every script or tool again, but only once for all scripts and tools. The developed measures are intended for technical web analysts who do not shy away from using Google Tag Manager and JavaScript. I also use regular expressions (regex) to describe patterns.<\/span><\/p>\n<h3><b>The PII-prevention matrix<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">I classify the measures you can take to prevent the processing of personal data according to two classifications. 
Together, they form a matrix which I call the PII prevention matrix &#8211; here the abbreviation PII refers to the term &#8216;Personally Identifiable Information&#8217;, which is more or less equivalent to the term &#8216;personal data&#8217;. The classifications are: <\/span><\/p>\n<p><b>1) Tool-independent vs. tool-dependent.<\/b><\/p>\n<p><span style=\"font-weight: 400;\">In a tool-dependent solution, the personal data must be replaced separately within each tool. The order in which the events take place is as follows:\u00a0<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Data, such as a URL, becomes available<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Scripts and\/or tools are loaded<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Each script and\/or tool replaces the personal data<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Each script and\/or tool processes the data that has been stripped of personal data<\/span><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400;\">Because the tool-dependent solution requires step 3 to be performed once for every script or tool, it is rather inefficient. In addition, some scripts and\/or tools do not even offer the possibility to edit data such as URLs before they process it. The tool-independent solution addresses both problems: it sits above the individual tools and therefore only needs to be applied once. 
The order in which the events take place is as follows:\u00a0<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Data, such as a URL, becomes available<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Any personal data within the available data is replaced<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Scripts and\/or tools are loaded<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Each script and\/or tool processes the data that has been stripped of personal data\u00a0<\/span><\/li>\n<\/ol>\n<p><b>2) Blacklisting vs. whitelisting.<\/b><\/p>\n<p><span style=\"font-weight: 400;\">With blacklisting, you define a list of data that may not be processed. Any data that is not on the blacklist may be processed. Whitelists work the other way around: you define a list of data that may be processed. A particular item of data may only be processed if it is on the list. 
This makes whitelists stricter than blacklists.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The classifications above result in a matrix of 4 solutions to prevent the processing of personal data:<\/span><\/p>\n<table style=\"height: 138px;\" width=\"707\">\n<tbody>\n<tr>\n<td><span style=\"font-weight: 400;\">\u00a0<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Blacklisting<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Whitelisting<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">Tool-independent<\/span><\/td>\n<td><span style=\"font-weight: 400;\">This blog (part 1\/2)<\/span><\/td>\n<td><span style=\"font-weight: 400;\">This blog (part 1\/2)<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">Tool-dependent<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Part 2\/2<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Part 2\/2<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><span style=\"font-weight: 400;\"><br \/>In the rest of this blog, I will go deeper into the tool-independent blacklisting and whitelisting of personal data using Google Tag Manager. In the next blog, I will show you how to apply blacklisting and whitelisting in Google Analytics.<br \/><\/span><\/p>\n<h3><b>Replacing personal data in a tool-independent way<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">Most scripts on a page have access to the metadata of a web page &#8211; think of the URL and page title. This metadata is therefore the most suitable place to strip personal data in a tool-independent way, that is, before the data becomes available to other scripts. This metadata regularly contains personal data, intentionally or unintentionally. Think, for example, of an e-mail address that is sent as a query parameter to the landing page of a customer e-mail. 
Or a postal code that is sent as a query parameter from a comparison site.<\/span><\/p>\n<p><b>Tool-independent blacklisting &#8211; how do I do that?<\/b><\/p>\n<p><span style=\"font-weight: 400;\">With tool-independent blacklisting, you specify the regular expressions that describe the personal data (the blacklist). Next, you check whether these patterns occur in the data you want to process. If they do, you replace the substrings that match these patterns. You do this before the data is processed by other scripts (tool-independent).<\/span><\/p>\n<p><span style=\"font-weight: 400;\">As an illustration, I would like to show you how to replace the values of the URL parameters &#8220;foo&#8221; and &#8220;bar&#8221; with &#8220;[REDACTED]&#8221; and e-mail addresses with &#8220;[REDACTED_EMAIL]&#8221;. The URL \u201c<\/span><span style=\"font-weight: 400;\">https:\/\/www.domein.nl?foo=waarde&amp;bar=waarde&amp;email=siemon@i-spark.nl<\/span><span style=\"font-weight: 400;\">&amp;foobar=waarde\u201d then becomes \u201c<\/span><span style=\"font-weight: 400;\">https:\/\/www.domein.nl?foo=[REDACTED]&amp;bar=[REDACTED]&amp;email=[REDACTED_EMAIL<\/span><span style=\"font-weight: 400;\">]&amp;foobar=waarde\u201d. Below I explain step by step how you can achieve this using Google Tag Manager:<\/span><\/p>\n<ol>\n<li><b> Define your blacklist.<\/b><\/li>\n<\/ol>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Create a new variable of the type &#8220;Custom JavaScript macro&#8221; and call it &#8220;PII&#8221;. See the example below.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Within the new variable, define an array containing an object for each type of personal data. 
In our example, two types of personal data are involved, namely blacklisted parameters and e-mail addresses.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Give each of these objects 3 keys: &#8216;name&#8217;, &#8216;regex&#8217; and &#8216;replacement&#8217;. For the name key, enter a string describing the type of personal data; this is mainly for your own reference. For the regex key, enter the regular expression that this type of personal data matches. For the replacement key, enter the string that replaces the match. A &#8220;$1&#8221; in the replacement refers to the first capturing group, so the parameter name is kept and only its value is replaced.\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Return the defined array. <\/span><\/li>\n<\/ul>\n<table>\n<tbody>\n<tr>\n<td><span style=\"font-weight: 400;\">function(){<\/span><span style=\"font-weight: 400;\">\u00a0<\/span> <span style=\"font-weight: 400;\">var piiRegex = [{<\/span><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span> <span style=\"font-weight: 400;\"> \u00a0 \u00a0 name: 'BLACKLISTED PARAMETER',<\/span><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span> <span style=\"font-weight: 400;\"> \u00a0 \u00a0 regex: \/([?&amp;](?:foo|bar)=)[^&amp;$#]+\/gi,<\/span><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span> <span style=\"font-weight: 400;\"> \u00a0 \u00a0 replacement: '$1[REDACTED]'<\/span><span style=\"font-weight: 400;\">\u00a0<\/span> <span style=\"font-weight: 400;\">},{<\/span><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span> <span style=\"font-weight: 400;\"> \u00a0 \u00a0 name: 'EMAIL',<\/span>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span> <span style=\"font-weight: 400;\"> \u00a0 \u00a0 regex:\u00a0 \u00a0 \/(([a-zA-Z0-9_\\-\\.]+)(@|%40)([a-zA-Z0-9_\\-\\.]+)\\.([a-zA-Z]{2,5}))\/gi,<\/span><\/p>\n<p><span 
style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span> <span style=\"font-weight: 400;\"> \u00a0 \u00a0 replacement: '[REDACTED_EMAIL]'<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0<\/span> <span style=\"font-weight: 400;\">}];<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0<\/span> <span style=\"font-weight: 400;\">return piiRegex;<\/span><\/p>\n<p><span style=\"font-weight: 400;\">}<\/span><\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><span style=\"font-weight: 400;\">\u00a0<\/span><\/p>\n<ol start=\"2\">\n<li><b> Create a new function that replaces personal data.<\/b><\/li>\n<\/ol>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Create a new variable of the type &#8216;Custom JavaScript macro&#8217; and call it &#8216;return redactData function&#8217;. Use the JavaScript below.<\/span><\/li>\n<\/ul>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">The variable returns a function with 2 parameters: the 1st parameter is a data string, the 2nd parameter is the previously defined blacklist we called &#8220;PII&#8221;. For each type of personal data in the blacklist, the function replaces every match of the regular expression in the data string with the corresponding value of the replacement key. Once all patterns have been applied, the function returns the redacted data string.<\/span><\/li>\n<\/ul>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">By placing the function in a separate variable, you can call it from any tag. Suppose you read the values of form fields and want to send them to Google Analytics. To strip personal data from these values before sending them to Google Analytics, you can call the newly defined function from a tag. 
To do so, give the data string and the defined blacklist &#8220;PII&#8221; as arguments. This will then look like this:\u00a0<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">{{return redactData function}}(datastring, {{PII}})<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0<\/span><\/p>\n<table>\n<tbody>\n<tr>\n<td><span style=\"font-weight: 400;\">function(){<\/span><span style=\"font-weight: 400;\">\u00a0\u00a0return function(data, PII){<\/span><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0for (var i = 0; i &lt; PII.length; i++){<\/span><span style=\"font-weight: 400;\">\u00a0\u00a0<\/span> <span style=\"font-weight: 400;\">data = data.replace(PII[i].regex, PII[i].replacement);<\/span><span style=\"font-weight: 400;\">};<\/span><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0return data;<\/span><span style=\"font-weight: 400;\">\u00a0\u00a0}<\/span>\n<p><span style=\"font-weight: 400;\">}<\/span><\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<ol start=\"3\">\n<li><b> Create a tag that replaces personal data.<\/b><\/li>\n<\/ol>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Create a new tag of the type &#8216;Custom HTML&#8217;. 
Use the JavaScript below.\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">The tag (a) reads the title and URL of the webpage on which the tag is loaded, (b) replaces the previously defined personal data in the title and URL of the webpage, using the function mentioned above (c) adjusts the URL in the browser and\/or replaces the title of the webpage, if personal data are present, and (d) sends a &#8216;piiRedacted&#8217; event to the dataLayer, together with the new URL.<\/span><\/li>\n<\/ul>\n<table>\n<tbody>\n<tr>\n<td><span style=\"font-weight: 400;\">&lt;script&gt;<\/span><span style=\"font-weight: 400;\">(function(){<\/span><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span> <span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 <\/span><span style=\"font-weight: 400;\">\u00a0\u00a0var PII = {{PII}};<\/span><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">\u00a0\u00a0var URL = {{Page URL}};<\/span><span style=\"font-weight: 400;\">\u00a0\u00a0var newURL = {{return redactData function}}(URL, PII);<\/span>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0if (newURL !== URL) {<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0window.history.replaceState({}, document.title, newURL)<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0}<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0var title = document.title;<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0var newTitle = {{return redactData function}}(title, PII);<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0if (newTitle !== title) {<\/span><\/p>\n<p><span style=\"font-weight: 400;\">document.title = newTitle;<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0}<\/span><\/p>\n<p><span style=\"font-weight: 
400;\">\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0window.dataLayer = window.dataLayer || [];<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0window.dataLayer.push({<\/span><\/p>\n<p><span style=\"font-weight: 400;\">'event': 'piiRedacted',<\/span><\/p>\n<p><span style=\"font-weight: 400;\">'Page URL': newURL<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0});<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">})();<\/span><\/p>\n<p><span style=\"font-weight: 400;\">&lt;\/script&gt;<\/span><\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><span style=\"font-weight: 400;\">\u00a0<\/span><\/p>\n<ol start=\"4\">\n<li><b> Create a trigger based on the &#8220;piiRedacted&#8221; event.<\/b><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400;\">Create a new trigger of type &#8216;Custom Event&#8217; based on the &#8216;piiRedacted&#8217; event. This event marks the moment at which all actions in the Custom HTML tag above have been executed &#8211; from this moment on, the URL and title of the webpage no longer contain any personal data that matches the blacklist.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0<\/span><\/p>\n<ol start=\"5\">\n<li><b> Replace the existing &#8216;All Pages&#8217; trigger with the new &#8216;piiRedacted&#8217; trigger.\u00a0<\/b><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400;\">To ensure that the tags of other scripts are not fired until the URL and page title have been stripped of personal data, the &#8216;All Pages&#8217; trigger on existing tags should be replaced by the new trigger based on the &#8216;piiRedacted&#8217; event. The Custom HTML tag from step 3 itself must keep a page view trigger such as &#8216;All Pages&#8217;, otherwise the &#8216;piiRedacted&#8217; event is never pushed.<\/span><\/p>\n<h3><b>Tool-independent whitelists &#8211; how do I do that?<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">With tool-independent whitelisting, you define the data that is not personal data (the whitelist). 
Next, you turn them into blacklist patterns because you want to replace values that are not present in the whitelist. An example will make this clear.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">As an illustration, I want to replace the value of all URL parameters with &#8220;[REDACTED]&#8221;, except for the parameters &#8220;foo&#8221; and &#8220;bar&#8221; &#8211; this is my whitelist. Specifically, this means that the URL &#8220;https:\/\/www.domein.nl?foo=waarde&amp;bar=waarde&amp;email=siemon@i-spark.nl&amp;foobar=waarde\u201d will be replaced by \u201chttps:\/\/www.domein.nl?foo=waarde&amp;bar=waarde&amp;email=[REDACTED]&amp;foobar=[REDACTED]\u201d. Below, I explain step by step how you can achieve this using Google Tag Manager:<\/span><\/p>\n<ol>\n<li><b> Define your blacklist\u00a0<\/b><\/li>\n<\/ol>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Turn your whitelist into a blacklist. The goal is to replace all parameters except those in the whitelist.\u00a0 So specify &#8216;all parameters except the whitelist&#8217; in a regular expression. Similarly, you can specify a pattern that matches every word except those of a whitelist.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Create a new variable of the type &#8220;Custom JavaScript macro&#8221; and call it &#8220;PII&#8221;.\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Within the new variable, define an array containing an object with 3 keys: &#8216;name&#8217;, &#8216;regex&#8217; and &#8216;replacement&#8217;. For the name key, enter a string describing the type of data if it does not appear in the whitelist. In this case, I use &#8216;NON-WHITELISTED PARAMETER&#8217;. For the regex key, specify the regular expression of the type of data that does not appear in the whitelist &#8211; in this case all parameters except &#8216;foo&#8217; and &#8216;bar&#8217;. 
For the replacement key, specify the string with which the personal data is to be replaced. A &#8220;$&#8221; followed by a number refers to the capturing group with that number; the text matched by that group is kept in the replacement.\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Return the defined array. <\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">\u00a0<\/span><\/p>\n<table>\n<tbody>\n<tr>\n<td><span style=\"font-weight: 400;\">function(){<\/span><span style=\"font-weight: 400;\">\u00a0<\/span> <span style=\"font-weight: 400;\">var piiRegex = [{<\/span><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span> <span style=\"font-weight: 400;\">name: 'NON-WHITELISTED PARAMETER',<\/span><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span> <span style=\"font-weight: 400;\">regex: \/([?&amp;](?!((foo|bar)=))[^=]+=)([^&amp;$#]+)\/gi,<\/span><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span> <span style=\"font-weight: 400;\">replacement: '$1[REDACTED]'<\/span><span style=\"font-weight: 400;\">\u00a0<\/span> <span style=\"font-weight: 400;\">}];<\/span><span style=\"font-weight: 400;\">\u00a0<\/span>\n<p><span style=\"font-weight: 400;\">\u00a0<\/span> <span style=\"font-weight: 400;\">return piiRegex;<\/span><\/p>\n<p><span style=\"font-weight: 400;\">}<\/span><\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><span style=\"font-weight: 400;\">The regular expression is now more complicated because it uses a negative lookahead. 
For the enthusiasts, I will explain the regular expression with the negative lookahead bit by bit:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">([?&amp;](?!((foo|bar)=))[^=]+=)<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">First capturing group: match a &#8220;?&#8221; or &#8220;&amp;&#8221; that is not followed by &#8220;foo=&#8221; or &#8220;bar=&#8221;, then the parameter name up to and including &#8220;=&#8221;.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">[?&amp;]<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Match a &#8216;?&#8217; or &#8216;&amp;&#8217;.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">(?!regex)<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Negative lookahead: match the above only if it is not followed <\/span><span style=\"font-weight: 400;\">by the regular expression.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">((foo|bar)=)<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">The string &#8220;foo&#8221; or &#8220;bar&#8221; followed by &#8220;=&#8221;.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">&#8220;foo|bar&#8221; is your whitelist!<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">[^=]+=<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Match one or more characters other than &#8220;=&#8221;, followed by &#8220;=&#8221;. This is the parameter name.\u00a0<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">([^&amp;$#]+)<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Last capturing group (the parameter value): match one or more characters other than &#8220;&amp;&#8221;, &#8220;$&#8221; or &#8220;#&#8221;. Note that inside a character class &#8220;$&#8221; is a literal dollar sign, not an end-of-string anchor.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">The &#8216;g&#8217; means &#8216;global&#8217;. 
In other words, search (and replace) all matches within the string instead of just the first match.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">The &#8216;i&#8217; indicates that the regular expression is case-insensitive. <\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">The use of the above regular expression means that the URL <\/span><span style=\"font-weight: 400;\"><br \/><\/span><span style=\"font-weight: 400;\">\u201c<\/span><span style=\"font-weight: 400;\">https:\/\/www.domein.nl?foo=waarde&amp;bar=waarde&amp;email=siemon@i-spark.nl<\/span><span style=\"font-weight: 400;\">&amp;foobar=waarde\u201d gives two matches, namely \u201c&amp;email=siemon@i-spark.nl\u201d and \u201c&amp;foobar=waarde\u201d. After all, 2 parameters do not match the whitelist within the negative lookahead, namely &#8220;email&#8221; and &#8220;foobar&#8221;. I only want to replace the parameter value with &#8220;[REDACTED]&#8221;, so the 1st capturing group of each match &#8211; &#8220;&amp;email=&#8221; and &#8220;&amp;foobar=&#8221; &#8211; must be preserved. I do this by replacing the full regex matches with &#8220;$1[REDACTED]&#8221;.<\/span><\/p>\n<p><b>Steps 2 to 5 remain the same for tool-independent whitelisting as described above for tool-independent blacklisting.<\/b><\/p>\n<p><b>Takeaways<\/b><\/p>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">The GDPR (AVG) uses a broad definition of &#8216;personal data&#8217;. This calls for measures that prevent the processing of personal data as much as possible. A combination of blacklisting and whitelisting is recommended.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">The &#8216;replace&#8217; function in combination with a regular expression makes it easy to replace personal data. 
This method can be applied tool-independently before other scripts are loaded and is therefore very efficient.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">The tool-independent solution using regular expressions supports both blacklists and whitelists. For whitelisting, you can use a negative lookahead.<\/span><\/li>\n<\/ul>\n<p>Would you like help implementing the solutions described above? Or would you like advice on the best solution for your business? <strong><a href=\"https:\/\/i-spark.nl\/en\/contact-us\/\">Get in touch!<\/a><\/strong><\/p>\n<p><strong>Read part 2 <a href=\"https:\/\/i-spark.nl\/en\/blog\/this-is-how-you-prevent-privacy-sensitive-data-collection-part-2-2-blacklisting-and-whitelisting-in-google-analytics\/\">here<\/a>!<\/strong><\/p>\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Published by Marketingfacts Read part 2 here! Tool-independent blacklists and whitelists Since 25 May 2018, the General Data Protection Regulation (GDPR; AVG in Dutch) has applied. Within the GDPR, &#8216;personal data&#8217; is defined broadly, namely as &#8216;any information relating to an identified or identifiable natural person&#8217;. 
This refers to data that directly relates to [&hellip;]<\/p>\n","protected":false},"author":4,"featured_media":8496,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[8],"tags":[309,311,307,301,308,310],"class_list":["post-617","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog","tag-blacklists","tag-gdpr","tag-personal-data","tag-privacy-2","tag-too-independent","tag-whitelists"],"acf":[],"_links":{"self":[{"href":"https:\/\/i-spark.nl\/en\/wp-json\/wp\/v2\/posts\/617","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/i-spark.nl\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/i-spark.nl\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/i-spark.nl\/en\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/i-spark.nl\/en\/wp-json\/wp\/v2\/comments?post=617"}],"version-history":[{"count":13,"href":"https:\/\/i-spark.nl\/en\/wp-json\/wp\/v2\/posts\/617\/revisions"}],"predecessor-version":[{"id":10181,"href":"https:\/\/i-spark.nl\/en\/wp-json\/wp\/v2\/posts\/617\/revisions\/10181"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/i-spark.nl\/en\/wp-json\/wp\/v2\/media\/8496"}],"wp:attachment":[{"href":"https:\/\/i-spark.nl\/en\/wp-json\/wp\/v2\/media?parent=617"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/i-spark.nl\/en\/wp-json\/wp\/v2\/categories?post=617"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/i-spark.nl\/en\/wp-json\/wp\/v2\/tags?post=617"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}