Transformers in AI: The Cool Story of Google’s Genius Scientists

Think about the best conversations you’ve had with your friends, often unplanned and sometimes in the weirdest places. What if one of those chats changed the world? That’s what happened back in 2017 when two smart dudes at Google, Ashish Vaswani and Jakob Uszkoreit, had a game-changing chat about artificial intelligence (AI) in a hallway. Let’s dive into their awesome journey.

A Big Idea Begins to Grow

Vaswani and Uszkoreit were chatting about how to make Google Translate better. They landed on a concept called “self-attention.” Think of the alien language in the movie “Arrival”: the aliens pack a whole sentence into a single symbol, so humans have to decode the entire idea at once rather than piece by piece.

Back then, AI translation was like reading a book word by word, in order. This new “self-attention” idea was like reading a whole paragraph at once to get the full picture, with every word weighed against every other word. That could mean better translation and a better grasp of context, which got them pretty stoked.
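To make the “whole paragraph at once” idea a bit more concrete, here is a tiny sketch of self-attention in plain Python. It is a toy, not the paper's implementation: the three words and their two-number vectors are made up for illustration, and the sketch skips the learned weight matrices a real transformer uses to build its queries, keys, and values.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Blend every token vector with all the others in one pass.

    Simplification (an assumption of this sketch): queries, keys, and
    values are the raw input vectors, with no learned projections.
    """
    dim = len(vectors[0])
    weights, outputs = [], []
    for query in vectors:
        # Score this token against every token in the sentence, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        row = softmax(scores)
        weights.append(row)
        # New representation: a weighted average of all token vectors.
        outputs.append(
            [sum(w * v[i] for w, v in zip(row, vectors)) for i in range(dim)]
        )
    return weights, outputs

# Hypothetical two-number "embeddings" for a three-word sentence.
tokens = ["river", "bank", "money"]
vectors = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]

weights, outputs = self_attention(vectors)
for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row shows how much one word “pays attention” to every word in the sentence. The point is that nothing is read left to right: every word gets compared with every other word in the same step.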

A Chance Team-Up

While they were talking, Noam Shazeer, a long-time Google employee, happened to overhear them. He was looking for new ideas and was kind of bored with the old ways of doing things. So, he decided to join them, and their chance meeting resulted in the “transformer” model. They later explained all about it in their cool paper called “Attention Is All You Need.”

“Attention Is All You Need” — A Game-Changer

In June 2017, they published the “Attention Is All You Need” paper, and it was a big deal. It kicked off a new phase in AI known as generative AI. The transformer wasn’t just used for Google Search and Translate; it also became the foundation of big language models like ChatGPT and Bard. It even reached beyond language and was used for image creation, writing code, and interpreting DNA.

The Transformer’s Super Flexibility

What made the transformer really cool was how many different ways it could be used. Vaswani, who loved music, was stoked to find out that the transformer could even create classical piano music. It could take in different types of inputs, like sentences, musical notes, images, or even parts of proteins, and work with all of them.

The Start of a New Era

The transformer was a result of many years of hard work by scientists from Google, DeepMind, Meta (formerly Facebook), and universities from all around the world. In 2017, a team of scientists — Vaswani, Uszkoreit, Shazeer, and others — made a huge breakthrough in how we process language using AI.

The Transformer Creators Move On

Even though Google was a big name in AI, the scientists who created the transformer weren’t content to stay there. They wanted to push their ideas further and felt that Google’s way of doing things was holding them back. So, all eight of the paper’s authors eventually left, most of them to start their own thing.

The Impact of the Transformer

After the transformer was published, a lot of people started using it in all sorts of fields. It was so big that even if no one made any more advances in AI, we could still spend ages integrating what we’ve learned from the transformer into new products. This showed just how big an impact the transformer made.

Wrapping It Up

The transformer has been a massive deal in AI, as big as when we first got the internet or smartphones. Its creators didn’t let themselves be held back by Google and started their own companies to further develop AI. As AI continues to grow, the transformer remains at the heart of all the cool new stuff, inspiring a whole new generation of entrepreneurs to create amazing things.

Frequently Asked Questions

1. What is the “transformer” in artificial intelligence?

A transformer is a type of model used in machine learning, particularly for processing language data. It was introduced by Google researchers in a 2017 paper called “Attention Is All You Need.” The model is unique because it relies on a mechanism called “self-attention” (often just “attention”), which lets it consider the full context of a word by looking at all the words in the sentence at once.

2. Who were the scientists behind the transformer model?

The transformer model was the brainchild of the eight authors of “Attention Is All You Need,” who did the work at Google: Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Łukasz Kaiser, and Illia Polosukhin. Their collaboration led to a significant breakthrough in the field of natural language processing.

3. How is the transformer model used?

The transformer model is used in a wide range of AI applications. It grew out of efforts to improve Google Translate and has since become the basis of large language models like ChatGPT and Bard. Its uses extend beyond language, influencing areas like image generation, code creation, and even DNA interpretation.

4. Why did the creators of the transformer model leave Google?

The creators of the transformer model left Google because they felt that its structure was hindering risk-taking and rapid product launches. They were eager to bring their ideas to market more quickly and without constraints, which led them to start their own AI-powered companies.

5. What is the impact of the transformer model on the field of AI?

The transformer model had a profound impact on AI, marking the beginning of a new era in artificial intelligence called generative AI. Its influence extended to various industries, and it continues to be a vital part of cutting-edge AI applications, driving the development of new products and technologies.

Source: Financial Times
