How a fast-growing fintech improved GDPR compliance with Atlan in hours, not months
At a Glance
- Tide, a UK-based digital financial institution with nearly 500,000 small business customers, sought to improve their compliance with GDPR’s Right to Erasure, commonly known as the “Right to be forgotten”.
- After adopting Atlan as their metadata platform, Tide’s data and legal teams collaborated to define personally identifiable information in order to propagate those definitions and tags across their data estate.
- Tide used Atlan Playbooks (rule-based bulk automations) to automatically identify, tag, and secure personal data, turning a 50-day manual process into mere hours of work.
Tide, a mobile-first financial platform based in the UK, offers fast, intuitive service to small business customers. Data is essential to Tide, having supported its incredible growth to now nearly 500,000 customers in just eight years. But in financial services, data acutely presents risk and demands careful and fastidious protection of sensitive financial information. These risks only increase as enforcement of GDPR increases, with nine-figure fines levied against offending businesses in just the past few years.
Recognizing the immense opportunities presented by data, Tide’s CEO, Oliver Prill, recruited Hendrik Brackmann to build a data science team. “The ambition at that point wasn’t so much to build a data organization. It was about where we could use machine learning at Tide,” Hendrik shared, “but it quickly became clear that you can’t realize that if you don’t have a data platform.”
The journey toward data maturity was a daunting one. Initially reporting into the Finance team at Tide, the data platform team consisted of just two employees. It became Hendrik’s responsibility not just to develop an advanced data science team, but to choose the right data platform technology, and to propose, build, and scale data and reporting teams.
“We looked very deeply into how our organization should look,” said Hendrik. “We made a lot of changes, from splitting roles between analytics engineers and analysts, to starting a data governance team.” And along with personnel growth and a more mature support model to sustain Tide’s growth, Hendrik ensured that his team was aligned to business needs, delivering transformational solutions like a transaction monitoring system, support for revenue identification, and machine learning-powered risk scoring.
In just four years, Hendrik grew the function to a team of 67 across data engineering, analytics, data science, and governance. It was during this time of extreme growth that Hendrik recognized room for improvement: “We grew very quickly, and we noticed we weren’t as efficient as we thought.”
While Tide’s data team had matured by leaps and bounds, as a regulated entity, compliance was a high priority that demanded huge effort and attention. “The legal team rarely spoke with the engineering functions. It was a bit isolated,” Hendrik said.
Early Days of Data Governance
Recognizing that collaboration between legal and technical teams had to improve, Hendrik began searching for a data governance expert. He met Michal Szymanski, who would become Tide’s Data Governance Manager. “The initial idea was to hire Michal as a bridge to the privacy function,” Hendrik remarked.
Michal joined Tide as a one-man team. “My scope of responsibilities increased a lot,” said Michal. “I had to deal with a vast array of challenges, starting from understanding where data governance could help in such an organization.” He began by trying to understand his stakeholders’ needs. “I had to start by interviewing many people across different business areas to understand what they needed.”
Founded in 2016, Tide had little of the technical debt or legacy technology that typically burdens traditional financial services organizations. Their data stack consisted of dbt, Airflow, and Snowflake, with Looker downstream as their Business Intelligence (BI) layer. While Tide had invested in the right technology, Michal learned that his colleagues found it difficult to understand how data traveled across their stack.
Hendrik saw this challenge as an opportunity for growth.
We wanted to embed data protection and privacy into our working processes, rather than discussing it at the end of projects.
By combining Michal’s new governance function, an understanding of data lineage, and common definitions of data, they could achieve the collaboration they had been missing.
Hendrik and Michal began searching for a solution. Summarizing the path forward, Michal explained, “We needed to have a platform where we could put all such interesting information to help users navigate the data that we have. So my first task was to identify a data catalog.”
Adding a Context Layer
After a thorough evaluation of the market, Hendrik and Michal chose Atlan as their data catalog.
[Atlan] integrated seamlessly with all of our tools, and we felt it was very easy to use.
Starting with a few key problem statements, Tide implemented Atlan to improve data discovery, visibility, and governance in the short term, and to democratize data access and understanding in the long run. To begin, Hendrik ensured that Atlan was properly integrated with their data stack and was capturing all relevant metadata.
With Atlan, technical and non-technical users could find the right data asset for their needs, quickly and intuitively, reducing the time it once took to find, explore, and use data across tools like Snowflake, Looker, and dbt. Using Atlan’s data glossary and metrics, Tide began to enjoy better context surrounding their data domains, which set the stage for standardizing classifications of sensitive data like personally identifiable information. And finally, Atlan’s automated lineage added transparency so Hendrik’s team could understand where data came from, how it was transformed throughout the data pipeline, and where it was ultimately consumed, something they couldn’t do before.
Tide grew to use Atlan to support a wide array of users and business units, from Legal and Privacy, to Data Science, Engineering, Governance, and BI colleagues. With improved context, higher trust in data, and democratized access to Tide’s data, Hendrik began to consider new use cases: “We were looking to identify how we could drive process efficiencies in our analytics and engineering teams.”
With a 360-degree view of their data estate, the stage was set for Hendrik’s team to build broader, more mission-critical solutions.
The GDPR Challenge
After using Atlan to better understand their data estate, Hendrik’s team was ready to support a critical use case.
“Like every company, we need to be compliant with GDPR,” said Michal. And a key component of GDPR compliance is the right to erasure, more commonly known as the “Right to be forgotten”, which gives Tide’s customers across the European Union and the UK the right to ask for their personal data to be deleted.
Tide’s data team understood these obligations well, but the process of compliance was difficult.
Our production support team had a script, and each time someone wanted to delete data, they would go through our back-end databases and delete personal data fields.
And while the support team’s script handled a significant amount of data deletion, manual effort was needed to find and delete data that persisted elsewhere, in secondary systems that had local projections of the personal data fields. Michal explained, “The process was not capturing data from all the new sources that kept appearing in the organization, just the key data source.”
Complicating this challenge was a lack of shared definitions of personal data, with differing opinions on what constituted personally identifiable information across organizations from Legal to IT. This meant that completing the “Right to be forgotten” process involved frequently re-litigating definitions.
While Tide was doing its best to comply with GDPR, the compliance process only took more time and effort as its technology stack and architecture grew more complicated, new products and services were introduced, and its customer base grew.
Automating this process became a priority. In an ideal world, when a customer exercised their right to be forgotten, a single click of a button would automatically identify and delete or archive all data about the customer in accordance with GDPR. Immense manual effort, and the risk of delays or human error, would be eliminated.
That’s exactly what Hendrik set his team to do.
Driving Common Understanding
Before pouring resources into solving the problem, Hendrik and Michal needed to justify the effort to their colleagues. “It required detail to be presented to senior leaders in order to decide that we’d invest time and money in solving such a problem,” said Michal. “That was important, because no one really wants to invest unless it means some increase of revenue or cost savings. We said we can avoid fines and we can make sure that the company is handling personal data at a high level.”
The case was so strong that solving the problem became a team OKR. With their goal in hand, Hendrik asked his team to understand the problem in greater detail: “The very first step was to identify where we had this kind of data, then identifying ownership.”
In his role as a bridge between the data team and its business counterparts, Michal worked with the Legal team to determine what did or didn’t constitute personal data. And to ensure the teams were collaborating smoothly, Hendrik established a cross-functional working group. “It’s just getting the right people in a room and then getting them to talk,” said Hendrik. “Our biggest contribution was bringing people together and keeping them focused.”
By bringing technical teams and domain experts together, Hendrik ensured every voice was heard and that his team remained focused on collaboratively delivering value, rather than on arcane technical concepts. Recalling an example of how strongly the team collaborated, Hendrik shared, “We had our privacy lawyer on the call when we discussed architecture. He could answer any questions that would come up directly.”
With these definitions in hand, Hendrik and Michal began comparing them against existing documentation and processes. “There were a couple of places where different people were trying to list personal data. So the front end team did this, and the back end team did that. Some product managers did the same, and they weren’t consistent,” Michal explained.
Further, while his colleagues had command of their data, they often had trouble communicating the data’s definitions, a key part of good data governance. Oftentimes, column names would serve as definitions. “In many cases, it was not precise enough,” said Michal.
With clear misalignment, Tide needed more precise documentation and process. Atlan presented a straightforward way to solve this challenge. Hendrik’s team would take what they learned from their research (including new definitions of personal data, opportunities for improvement, and owners of data) and document it once and for all in their catalog.
We said: Okay, our source of truth for personal data is Atlan. We were blessed by Legal. Everyone, from now on, could start to understand personal data.
From 50 Days to 5 Hours
With their data estate integrated with and made navigable by Atlan, Tide used automated lineage to quickly and easily determine where personally identifiable data lived, and how it moved through their architecture. Starting by identifying the columns and tables where personal data persisted, the team then used Atlan to track it downstream.
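Conceptually, tracking personal data downstream is a graph traversal over column-level lineage. The following is a minimal sketch of that idea, assuming lineage is available as a plain mapping from each column to the columns derived from it. The column names and the `downstream_columns` helper are hypothetical illustrations and do not represent Atlan's API or Tide's actual schema.

```python
from collections import deque


def downstream_columns(lineage, sources):
    """Return every column reachable downstream of the given source columns."""
    seen = set(sources)
    queue = deque(sources)
    while queue:
        column = queue.popleft()
        for child in lineage.get(column, ()):
            if child not in seen:  # the seen set also guards against cycles
                seen.add(child)
                queue.append(child)
    return seen - set(sources)


# Hypothetical lineage: each key feeds the columns listed as its value.
lineage = {
    "backend.customers.email": ["snowflake.raw.customers.email"],
    "snowflake.raw.customers.email": ["snowflake.analytics.dim_customer.email"],
    "snowflake.analytics.dim_customer.email": ["looker.customer_dashboard.email"],
    "backend.orders.amount": ["snowflake.raw.orders.amount"],
}

affected = downstream_columns(lineage, ["backend.customers.email"])
```

Starting from one known personal data column, `affected` collects every derived column that would also need tagging or deletion, while unrelated columns such as the order amount are left out.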
Michal explained just how useful lineage was to the team: “This was very useful. It showed us how much data we have in our data warehouse, and then we could also extrapolate this to the upstream sources of Snowflake. We knew we had it in Snowflake because it’s coming from this and this database. So we informed the teams that they had a lot of personal data and we needed to come up with a design.”
Next, Hendrik’s team decided to properly tag personally identifiable data, and add their newly determined definitions. Assets stored in Snowflake, like account numbers, email addresses, phone numbers, and more, would be searchable, but properly secured and masked in the Atlan UI.
While valuable, the manual effort involved was daunting. Michal explained, “People would have to go into the databases and try to translate my list of personal data elements. There were 31 elements to find in our databases, and we have more than 100 schemas, each with between 10 to 20 tables. So it would be a lot of work to identify it.”
Making assumptions about which schemas might contain personally identifiable information could save time, but this wasn’t an option. The risk involved meant Michal and his team had to be precise, searching and tagging location-by-location, or it would prove costly.
If we were very diligent and did it for every schema, then it would probably be half a day for each schema. So half a day, 100 times.
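Michal's estimate can be written out as simple arithmetic. The sketch below uses only the figures quoted above, plus two labeled assumptions for the comparison: an 8-hour working day, and the 5 hours from the section heading as the automated effort.

```python
# Figures quoted by Michal.
schemas = 100                        # "more than 100 schemas", taken as 100
tables_per_schema = (10, 20)         # "between 10 to 20 tables" each
days_per_schema = 0.5                # "half a day for each schema"

manual_days = schemas * days_per_schema
table_range = tuple(schemas * n for n in tables_per_schema)

# Assumptions for the comparison, not stated in the source.
hours_per_working_day = 8            # assumed working day
automated_hours = 5                  # taken from the section heading

speedup = manual_days * hours_per_working_day / automated_hours
```

Under these assumptions the manual approach covers between 1,000 and 2,000 tables over 50 working days, roughly 80 times the effort of the automated route.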
After discussing this scope with the Atlan professional services team, Michal learned about Playbooks, a feature unique to Atlan. Instead of spending 50 days manually identifying and then tagging personally identifiable information, Tide could use Playbooks to identify, tag, and then classify the data in a single, automated workflow.
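To illustrate the general idea of rule-based bulk tagging, here is a minimal sketch that matches column names against patterns and assigns a tag on each match. It is not Atlan's Playbooks API: the rules, tag names, column names, and the `tag_columns` helper are all hypothetical, and matching on names alone is a simplifying assumption.

```python
import re

# Hypothetical rules: a column-name pattern and the tag to apply on a match.
RULES = [
    (re.compile(r"e?mail", re.IGNORECASE), "PII:Email"),
    (re.compile(r"phone|mobile", re.IGNORECASE), "PII:Phone"),
    (re.compile(r"account_?(number|no)", re.IGNORECASE), "PII:AccountNumber"),
]


def tag_columns(columns):
    """Map each fully qualified column to the sorted tags its column name triggers."""
    tagged = {}
    for column in columns:
        name = column.rsplit(".", 1)[-1]  # match on the column name only
        tags = sorted({tag for pattern, tag in RULES if pattern.search(name)})
        if tags:
            tagged[column] = tags
    return tagged


# Hypothetical columns from a warehouse schema.
columns = [
    "analytics.dim_customer.email_address",
    "analytics.dim_customer.phone_number",
    "analytics.fct_payment.account_number",
    "analytics.fct_payment.amount",
]

result = tag_columns(columns)
```

Because the rules are declared once and applied to every column in bulk, adding a new schema does not add manual search work. Only the rule list needs review when definitions of personal data change.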
Hendrik’s team was ready to spend 50 days of effort on a task that would make clear improvements to Tide’s risk profile. But after integrating their data estate with Atlan and driving consensus on definitions, they used Playbooks’ automation to accomplish their goal in mere hours. Michal explained, “It was mostly a few hours to discuss what we needed.”
After saving nearly 50 days of work, Tide can now make further improvements to their process, far sooner than expected.
In the months to come, the team is building a microservices-based orchestrator to handle requests from customers about their personal data. It will then be enhanced to anonymize data in accordance with GDPR standards for de-identification and Tide’s data retention obligations as a regulated business. Here, too, Atlan has helped. Tide’s engineers can build these solutions more quickly by referencing the information and lineage made possible by Hendrik’s team and Atlan.
I would say I got great assistance from the Atlan team, who were with me on the whole journey. I would have never thought of Playbooks. It was suggested in the right way for the right use case.
As for Hendrik, his team’s accomplishments mean the realization of his vision from the very beginning of his time at Tide. “Over the last year, we’ve managed to move closer to the business. Being able to create this kind of organizational change is something that I feel very proud of.”
With a significant win for his team in hand, enabled by the right technology and guided by the right strategy, Hendrik shared his advice for fellow data leaders. “Focus on business value, and the actual value you’re generating for your organization rather than finding a process everyone in the industry follows and adopting the same thing. Don’t try to do governance everywhere. Identify what data sets are relevant to you, and focus on those ends.”