AI for business: AI automation with transformers – Mantium

By Carolyne Pelletier

May 12, 2022   ·   4 min read

The world of AI for business and enterprise has progressed by leaps and bounds in the past ten years. Much of that transformation began with convolutional neural networks (CNNs). CNNs are likely a part of your day-to-day life, even if you aren’t aware of them: they are the deep learning models used for almost every computer vision task in enterprise and business settings. The advent of CNNs brought us image analysis in healthcare, among many other advances in vision tasks. With CNNs, AI is becoming more affordable and higher performing:

Since 2018, the cost to train an image classification system has decreased by 63.6%, while training times have improved by 94.4%.

Artificial Intelligence Index Report 2022

Transformer models accelerate AI advancement.

The advances resulting from transformer models are proving as revolutionary as those that resulted from CNNs. Transformer models already exceed human performance on basic reading comprehension benchmarks. However, when it comes to more complex linguistic tasks, AI systems still do not achieve human performance. For example, humans still outperform AI systems on the abductive natural language inference (aNLI) task. The difference is narrowing, though: in 2021, humans performed only one percentage point better than AI systems on aNLI, down from nine percentage points in 2019.

Transformer models are bringing AI to an inflection point, just as CNNs did in 2012. If transformers continue on the same trajectory, the benefits for text-based AI will be very similar to the advantages CNNs brought to image tasks.

With the recent introduction of transformer models, a revolution in the world of NLP will take place.

In the world of AI, what are transformers? 

A transformer is a type of neural network architecture that has rapidly gained popularity. That popularity is due to the improvements enterprises see in efficiency and accuracy on natural language processing (NLP) tasks. Interest in transformers first took off after Google researchers reported on a new technique that used the concept of “attention” to translate between languages. At a high level, attention is a mathematical description of how things (e.g., words) relate to, complement, and modify each other. The researchers introduced the technique in their seminal 2017 paper, “Attention Is All You Need”. The paper showed how a transformer neural network could translate between English and French more accurately than other neural nets, and in only a quarter of the training time.
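To make “attention” a little more concrete, here is a minimal sketch of scaled dot-product attention, the core operation described in “Attention Is All You Need”, written in plain Python. The three words and their two-dimensional vectors are made-up numbers chosen purely for illustration, and the function names are our own; a real transformer learns much larger vectors and adds learned projections and multiple attention heads on top of this.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # How strongly this query matches every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is the weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical toy vectors for three words (illustrative numbers only).
tokens = ["the", "cat", "sat"]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Self-attention: every word attends to every word, including itself.
outputs, weights = attention(vectors, vectors, vectors)
for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row shows how much one word “attends” to every word in the sentence. The weights in a row always sum to 1, so each output vector is a blend of the other words' vectors, weighted by relevance.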

Transformers have improved many enterprise technologies.

You might not realize it, but transformers have increased efficiency and ease of use for many technologies we all use regularly. Google search uses transformers to process approximately 6.9 billion search queries per day. The popular messaging application WhatsApp uses transformers to process 100 million messages per day. Drug discovery processes use transformers to organize biomedical research and explore novel drugs and therapeutics. Transformers also support content creation, helping to write blog posts, social media posts, and more. As you can see, several industries use transformers each day to bring their products and solutions to market.

Transformers are better than previous models for many reasons, including:

  • They are more computationally efficient than previous models, and at the same time, computing hardware keeps getting better.
  • Because they are so efficient, transformers have an unprecedented capacity to take advantage of pre-training, which lets them learn powerful representations of language from large amounts of unlabeled text.
  • Pre-trained transformer models can then be adapted to a wide range of NLP tasks.
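Why can transformers learn from unlabeled text at all? The trick is that the text supplies its own labels. The sketch below shows two common ways of turning a raw sentence into training examples: predicting the next word, and predicting a hidden word. It is a simplified illustration under our own assumptions (whitespace tokenization, a hypothetical `[MASK]` placeholder, and every position masked once in turn), not the exact recipe of any particular model.

```python
def next_token_pairs(tokens):
    """Next-word objective: predict each token from the tokens before it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def masked_pairs(tokens, mask_token="[MASK]"):
    """Masked-word objective: hide one token and predict it from both sides.

    Simplification: each position is masked once in turn, whereas real
    pre-training typically masks a random subset of tokens.
    """
    pairs = []
    for i, target in enumerate(tokens):
        masked = tokens[:i] + [mask_token] + tokens[i + 1:]
        pairs.append((masked, target))
    return pairs


# Any raw sentence works; no human labeling is needed.
sentence = "transformers learn language from unlabeled text".split()

causal = next_token_pairs(sentence)
masked = masked_pairs(sentence)

print(causal[0])  # (context so far, next word to predict)
print(masked[2])  # (sentence with one word hidden, the hidden word)
```

A single six-word sentence already yields eleven training examples here, which hints at how much signal a model can extract from billions of sentences of ordinary text.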

Transformers can perform and improve AI and NLP business tasks and streamline enterprise workflows.

Representation Model (also known as Encoders)
Good at capturing meaning and form of language.
  • NLP tasks: Text classification; Extractive question answering; Named entity recognition
  • Enterprise use cases: Support ticket routing for CRMs; Analysis of support tickets for CRMs; Find answers to queries performed on a company’s knowledge source

Generation Model (also known as Decoders)
Good at generating coherent text mimicking human-level writing.
  • Examples: GPT-3; Transformer XL
  • NLP tasks: Text Generation
  • Enterprise use cases: Help editors rewrite unedited articles to use house style and tone (see “Build a Twitter Bot with Mantium, Tweepy, and Heroku”)

Sequence-to-sequence Model (also known as Encoder-Decoders)
Good at tasks that require a transformation on text.
  • Examples: T5; Marian
  • NLP tasks: Translation; Generative; Question Answering; Summarization
  • Enterprise use cases: Automatically convert speech to text and store in a CRM; Summarize customer service conversations to be stored in a CRM (see “Building an SMS chatbot with Twilio”)


Carolyne Pelletier
Carolyne Pelletier is a Senior NLP Engineer at Mantium, where she helps enterprises gain efficiencies and streamline workflows leveraging state-of-the-art language models with the Mantium platform. She holds a Master’s degree from Mila - Quebec AI Institute in Computer Science, specializing in machine learning. Carolyne is passionate about mitigating the potentially harmful effects of AI and is part of the Mantium-Mila-IBM collaboration working to correct gender bias in text.
