Applications and solutions that leverage deep learning and AI are proliferating for numerous verticals, especially as the cost of related computing continues to decline.
The hype around AI is especially evident in the world of cybersecurity, where it has become a “silver bullet” that nearly every firm promises to deliver.
Despite AI’s popularity in cybersecurity for decades, success has eluded the industry: most solutions still struggle to deliver significant value.
In this article, we share the guiding philosophy for how we built our Semantic Intelligence solution and how we leverage AI to provide tangible security value to our clients.
While AI expertise is a key strength and differentiator for Concentric, what sets us apart is that data security is in our DNA.
Unlike newcomers in the field who see every challenge as a nail for the proverbial AI hammer, our team has decades of lived experience confronting and overcoming real-world cybersecurity challenges.
We’ve been thinking about the data security problem for quite some time, long before Concentric’s founding back in 2018.
In building our solution, we focused on a few core guiding principles:
- First-principles thinking. Focus on solving the challenges at hand, setting aside conventional wisdom about what is or isn’t possible and avoiding unnecessary complexity. It starts with truly understanding the real pain points of practitioners and identifying where you can add tangible value.
- Relentless focus on precision. Any product that produces novel insights must confront the issue of false positives. We had a strict requirement that anything we built keep false positives to a minimum (in other words, that it be “high-precision”). This is critical in a field like cybersecurity, which is chronically understaffed. Practitioners cannot afford to chase alerts that turn out to be benign.
- Enterprise-grade scale. Whatever we built had to scale to enterprise data volumes and be robust enough to support enterprise environments. A corollary is that the chosen technology stack should be amenable to being operationalized in a cost-sustainable manner. There is a history of AI startups ignoring this at their peril, releasing seemingly impressive demos only to fold once data volumes grow large.
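To make the precision principle above concrete, here is a minimal sketch of how precision is computed from alert outcomes. The function name and the alert counts are hypothetical and chosen only for illustration; they are not figures from any real deployment.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of raised alerts that turned out to be real issues."""
    flagged = true_positives + false_positives
    if flagged == 0:
        # Assumption: with no alerts raised, report 0.0 instead of leaving it undefined.
        return 0.0
    return true_positives / flagged


# Hypothetical example: two tools that both surface the same 50 real issues.
noisy_tool = precision(true_positives=50, false_positives=450)
precise_tool = precision(true_positives=50, false_positives=5)

print(f"noisy tool precision:   {noisy_tool:.2f}")
print(f"precise tool precision: {precise_tool:.2f}")
```

Both hypothetical tools find the same real issues, but an analyst using the first one spends most of their time on alerts that turn out to be benign.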
Arrival of Transformers
Back in 2017, when we were trying various approaches to address data-security risk, we were unsatisfied with the AI methods available at the time. Traditional NLP approaches such as topic modeling (my PhD is in computational neuroscience topic modeling) came up short because of issues with accuracy, specifically around false positives.
On the other hand, deep learning had made great strides in NLP, especially with techniques such as Recurrent Neural Networks (RNNs) for language modeling. However, these models were computationally very expensive to deploy, and we were not convinced we could use them to deliver practical client value sustainably at enterprise scale.
Later that year, a new deep learning architecture called the Transformer was introduced, and it turned out to be revolutionary. We were able to prototype solutions that convinced us of its effectiveness, so we set out to build Concentric. Combined with transfer learning, it gave us the starting point to build a real solution to address data security challenges.
Over the years, we have built on this foundation to develop state-of-the-art intellectual property and advance what’s possible in the field.
Enter Concentric MIND™
A great advantage of AI – if done right – is the virtuous cycle that allows you to start with simple models built on small datasets that you constantly refine over time. This is exactly the approach we took, bootstrapping efforts with a bunch of documents from our own personal collections. We had examples of hundreds of different document types from which to start. Over the years, that approach has paid real dividends.
Concentric MIND™ is a realization of the flywheel that pulls all the data together. Today, we have amassed massive amounts of labeled and curated enterprise data, possibly one of the largest such curated collections in the world.
Harnessing the power of a massive community
As more organizations partner with us to manage their data risk, MIND™ only gets better as it learns.
How so? Feedback and input from customers.
Not only does feedback find its way into our solution via our customer success team, but as customers continue to work with the UI, the model learns and improves on the fly.
Let’s say a customer looks at a form that’s classified as a tax form, but they know it should be classified as another type of document. With a single click, that customer updates the category. As soon as they hit submit in the UI, a series of back-end processes cascades the correction through Concentric’s models. If, minutes later, another customer were to submit a document of the exact same type, it would now be classified correctly.
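The feedback loop described above can be illustrated with a toy sketch. This is not Concentric’s implementation: the `FeedbackClassifier` class, the three-dimensional vectors, and the nearest-neighbor rule are all assumptions made for illustration. The idea it shows is only that a single corrected example can change how a later, similar document is classified.

```python
import math
from typing import List, Optional, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class FeedbackClassifier:
    """Hypothetical nearest-neighbor classifier over document vectors."""

    def __init__(self) -> None:
        self.examples: List[Tuple[List[float], str]] = []

    def add_example(self, vector: List[float], label: str) -> None:
        # A user correction is stored as one more labeled example.
        self.examples.append((list(vector), label))

    def classify(self, vector: List[float]) -> Optional[str]:
        if not self.examples:
            return None
        # Return the label of the most similar stored example.
        _, label = max(self.examples, key=lambda ex: cosine(vector, ex[0]))
        return label


clf = FeedbackClassifier()
clf.add_example([1.0, 0.0, 0.2], "tax form")
clf.add_example([0.0, 1.0, 0.1], "contract")

doc = [0.8, 0.1, 0.9]           # made-up vector for the misclassified form
before = clf.classify(doc)       # closest stored example is the tax form

clf.add_example(doc, "invoice")  # the customer's one-click correction

similar_doc = [0.82, 0.12, 0.88]  # another customer's near-identical document
after = clf.classify(similar_doc)
print(before, "->", after)
```

A production system would involve far more than appending an example, but the sketch captures why a correction from one customer can benefit the next.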
And bear in mind: Concentric does not store any client-specific information. Essentially, MIND™ only reads the data and then forgets it. A document’s content resides only in our system’s memory while it is being processed, not on disk. Once the document’s content is processed through deep learning models to generate a numerical representation, the content is discarded. Think of it as a one-way transformation.
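As a rough illustration of the “read, transform, forget” flow described above, the sketch below turns a document into a fixed-length vector and keeps only that vector. The hashed bag-of-words `embed` function is a deliberately simple stand-in for a deep learning model, and every name here (`embed`, `process_document`, `DIMENSIONS`) is hypothetical. The sketch shows the shape of the data flow only and makes no claim about the security properties of a real system.

```python
import hashlib
from typing import Dict, List

DIMENSIONS = 8  # hypothetical vector size, kept tiny for readability


def embed(text: str) -> List[float]:
    """Stand-in for a deep learning model: hash each token into a bucket."""
    vector = [0.0] * DIMENSIONS
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vector[digest[0] % DIMENSIONS] += 1.0
    return vector


def process_document(content: str) -> Dict[str, List[float]]:
    """Compute the numerical representation in memory and return only that."""
    vector = embed(content)
    # The returned record holds the vector only; the text is not written
    # anywhere and goes out of scope once this function returns.
    return {"vector": vector}


record = process_document("Form 1040 tax return for 2023")
print(sorted(record.keys()))
print(len(record["vector"]))
```

Because word order is lost and different tokens can land in the same bucket, many different texts map to the same vector in this toy version, which is the intuition behind calling the transformation one-way.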
Our algorithm is just one way in which Concentric excels. We believe that being the best at what we do is table stakes. What sets us apart is that we built MIND™ to solve problems as a data security company.
To see firsthand — with your own data — how you can quickly and easily deploy Concentric MIND™ to identify, classify and remediate your data, book a demo today.