semantha Pushes the Boundaries of AI-Based NLP with Snowflake and Accenture

Interpreting meaning within millions of words was one of many unsolved business challenges, until now. Discover how Snowflake’s Data Cloud, with Accenture’s support, is helping the semantic processing platform semantha reach its full potential.
semantha is a semantic processing platform that understands and processes human language—at astonishing scale and speed. What started as research into speech processing and AI became a startup consisting of data scientists, businesspeople, AI researchers, and software architects.
The platform pushes the boundaries of natural language processing (NLP) with a diverse, rapidly expanding (and almost limitless) range of use cases that extends from automotive manufacturing and chemicals to HR, insurance, legal, finance, and government. To drive document-driven processes, semantha goes beyond simply comparing text: it understands the meaning within words in more than 50 languages simultaneously. It also includes specific models for language combinations.
“Everything where there’s too much text and not enough time, I’d say that’s 80% of the work for companies in these sectors,” said Dr. Sven Körner, Managing Director and Founder at semantha. “We transform unstructured data, such as text, images, and videos, into semantic fingerprints. From there, we can process information not unlike how humans do. The difference between semantha and humans is semantha processes data in seconds instead of months.”
Finding the right partner
To help semantha reach its full potential with its existing 25-plus customers in a proof of concept (PoC), Körner needed a new kind of data platform partner that could match the necessary speed and scale. “We read about the Snowflake for Startups program online and were then nudged towards it in the process of building a common solution,” said Körner. “We were super-quick to get everything ready for the build from our side, and the Snowflake team acted just as fast.”
Körner added: “Snowflake gave us loads of support as well as access to product previews and domain expertise from account executives—pretty much everything. It’s been a great partnership.”
In a discussion with semantha’s business partner, Accenture, Körner and his team identified a PoC use case that would bring the strengths of the solution to life: ease of use and adaptability, speed, and scalability. Semantic text understanding can dramatically accelerate and improve environmental, social, and governance (ESG) reporting, a process that helps stakeholders understand how organizations manage risks and opportunities related to sustainability issues.
“Lots of our clients want to tap into the wealth of contracts, toxicological reports and other important evidence on their sustainability footprint,” said Dr. Kate Sikavica, head of ESG Measurement in Accenture’s sustainability technology practice. “And they need to be quick to comply with recent changes in European legislation. There is no way to analyze all the documents manually. Snowflake is the place to organize those documents, and semantha is a powerful tool to turn them into valuable KPIs for company steering and compliance with regulations.”
Körner clarified: “The PoC needed a combination of Accenture’s domain expertise with tech-savviness, Snowflake’s technical abilities and data expertise, and our ability to make sense of unstructured data.”
With deep knowledge of ESG processes and legal frameworks, Accenture was crucial to developing the PoC. As a Snowflake partner, it was another natural choice.
Building a prototype at lightning speed
Based on a very smooth integration of semantha with Snowflake, the team built a native model using an open-source UI tool at lightning speed. “We built a prototype on Snowflake and Streamlit within a week,” said Körner. “That’s unprecedented. We did this with real customer data. We integrated the API for connected apps and can run as a native app in Snowflake.”
“Before this project, I had basic knowledge in Snowflake and I’m a good Python developer. With just a few example documents and KPIs from our reporting group, and some support from the semantha team, I could build the basic user interface and visualization in a single week,” said Fynn Kölling, a Data Engineer in Accenture’s Data & AI team.
By helping companies connect, find, and extract unstructured data from distributed sources, such as messages, company disclosures, and external reports, semantha’s ESG use case has incredible potential. Business users can scrape data from multiple locations and sources, such as a subsidiary database on another continent or a complicated legal framework in another language.
In this way, the platform can help companies save the time, effort, and resources they spend meeting ESG reporting requirements, a task that is currently mostly manual. Showing clients where their reporting gaps lie would also lower risk and increase speed and accuracy, helping them avoid huge penalties and reputational damage for non-compliance.
“Our plan is to assist clients in their journey towards a sustainable future. For this, they need an easy, scalable way to find ESG-related content in their own internal sources, like work contracts, policies, and other unstructured documents. The easier the integration of the tool, the better for our clients. This tool will allow us to concentrate on content and user experience instead of IT setup,” said Sikavica.
Data processing in a single platform for complete peace of mind
The principle for this use case applies across any business with document-driven processes. For automotive manufacturers preparing a request for information (RFI) report, for example, the ability to scan documents with hundreds of pages in seconds instead of weeks is transformative.
HR teams can use the platform to scan for desired skills across countless CVs, even if the exact terminology isn’t used. And for any company, it can streamline how its users map KPIs to complicated legal frameworks. Körner said: “Basically, anywhere there is too much data, not enough info. It’s like having a magic fairy on your shoulder.”
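The article does not describe how semantha computes its semantic fingerprints, so the following is only a toy sketch of the general idea behind matching by meaning rather than exact wording. It assumes each text has already been turned into a numeric vector; the three-dimensional vectors, the example phrases, and the function names are all hypothetical and hand-made for illustration, not output from semantha or any real model.

```python
import math

# Hypothetical, hand-made "fingerprints": toy 3-dimensional vectors standing in
# for whatever representation a real semantic model would produce.
TOY_FINGERPRINTS = {
    "experienced in Python programming": [0.9, 0.1, 0.0],
    "five years writing software in Python": [0.8, 0.2, 0.1],
    "certified forklift operator": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_meaning(query, candidates, fingerprints=TOY_FINGERPRINTS):
    """Sort candidate texts by vector similarity to the query, best first."""
    q = fingerprints[query]
    scored = [(c, cosine_similarity(q, fingerprints[c])) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    ranking = rank_by_meaning(
        "experienced in Python programming",
        ["certified forklift operator", "five years writing software in Python"],
    )
    for text, score in ranking:
        print(f"{score:.2f}  {text}")
```

In this sketch, the two Python-related phrases use different wording but sit close together in the toy vector space, so the second phrase ranks above the unrelated one. A production system would obtain the vectors from a trained model rather than from a hand-written table.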
semantha’s wide-ranging use case potential means that scalability, data accessibility, security, and speed are all high on the priority list. That’s why Snowflake’s platform was a natural choice: it enables critical workloads, including seamless data collaboration, both within an organization and across its global ecosystem.
With this partnership, semantha provides access to the data for any business that’s on Snowflake. Data never leaves the Data Cloud, so businesses can securely share and access governed data, tools, applications, other technologies, and data services—while preserving privacy.
Connecting business with IT
semantha is architected on the Snowflake platform as a managed or connected application. Every Snowflake customer could run it out of the box in minutes, and with an API key, non-customers could use it, too. The next step is to make it a native app for every Snowflake customer to access as easily as possible.
“semantha’s uniqueness is its ability to bridge technical and non-technical skill sets,” said Körner. “This is also its biggest benefit. It’s for everybody working with too much text, which is the whole workforce.”
Körner added: “It works out of the box. No training is needed. And it integrates directly with our customer’s data, which is where Snowflake plays a fundamental role. It solves real customer problems quickly. Being quick with results and changes is the most important part.”
Endless possibilities
For Körner and his team, finding Snowflake came at exactly the right moment. He said: “Without Snowflake, we wouldn’t have [gotten] close to what we achieved in the same timeframe. And this is just the first of many use cases. The possibilities are endless, as are the potential business connections through Snowflake.”
Snowflake Startup Challenge news: semantha is one of 10 semi-finalists in the 2023 competition! Learn more about the Startup Challenge Top 10 semi-finalists in our blog, and stay tuned for the Top 3 announcement in May.