Li Yanhong: How AI Works At Baidu

Summary: AI technology has become a must-discuss topic for Li Yanhong as well as a focus for Baidu. On the afternoon of September 8th in London, Baidu’s founder Li Yanhong shared his insights into artificial intelligence.


According to Li Yanhong, Newton, Darwin and Hawking are Cambridge-bred geniuses who have had a profound impact on him. However, the famous poet Xu Zhimo and the father of AI, Alan Turing, influenced him the most.

“AI is not a new concept,” Li Yanhong stated. “As early as the 60s, the term artificial intelligence already existed. But it wasn’t until the last decade that we finally realized just how important it is.”

“Generally speaking, we have gone through three stages of the Internet era. The first stage was the PC Internet era, which lasted for about 15 years. The second stage was the so-called mobile Internet era, with a growth period of about four or five years. And now we are embracing the third stage, which is the AI era. Every stage has its own characteristics, and each therefore develops at a different speed.”

Li Yanhong believes the three stages have the following characteristics:

The PC Internet era relied heavily on how quickly software could be updated. It usually took traditional software companies at least six months to push an update to their users, while Internet companies could ship updates at any time.

About five years ago, we started to enter the mobile Internet era. By then, fast software updates were no longer an edge; mobile apps had emerged, and the companies that built their own ecosystems were the ones that won the mobile Internet era. That’s one of the reasons why Baidu wanted to do O2O business from the very beginning.

In the third stage, the AI era, sound and images become more natural ways of communicating, making voice interaction and image recognition more convenient ways to provide services.

Li Yanhong cited many cases in his speech to illustrate the application of voice recognition and AI technologies in search engines and the financial sector. “We have pretty accurate voice recognition technologies now. Baidu’s voice recognition function is 97% accurate.”

Finally, he stressed that everyone will be involved, to a greater or lesser extent, in the application of AI. Whether it’s IT, manufacturing, finance, education, healthcare, tourism or logistics, every one of these industries will be disrupted by AI and will face challenges.

What follows is the transcript of Li Yanhong’s speech at the Cambridge Celebrity Forum:

It is my honor to have this opportunity to be here delivering this speech to you. I am thrilled. I am here not just to give a speech, but also to inspire people the way Newton, Darwin and Hawking inspired me.

And of course, the famous Chinese poet Xu Zhimo influenced me the most. His line “gently I go, as I gently come” is known by most Chinese people. We all remember this poem, which he wrote in Cambridge.

But I am not going to talk about Xu Zhimo today. In my opinion, the most inspiring genius of all is Alan Turing, for he is the founder of modern computer science and AI. That’s why I think the University of Cambridge has a special charm. As many of you know, I spent 8 years in the U.S., where I got my master’s degree in computer science. I later worked on Wall Street and in Silicon Valley. In 1999, I went back to China and founded Baidu.

Three stages of the Internet

In my opinion, the Internet has changed a great deal over the course of 16 years. In short, we have gone through two stages and are currently witnessing a third. The first stage was the PC Internet era, which lasted for 15 years. The second stage was the mobile Internet era, with a growth period of around 5 years. And now we have the AI era, the third stage. Every era has its own features and a different iteration speed.

For example, I believe the PC Internet era relied heavily on the speed of updating software. I grew up in that era. It usually took traditional software companies at least 6 months to update their software. For some companies, it might take one or two years. So many people got the impression that a software update simply takes six months. But the development of the Internet changed everything.

Back in 1997, I was working for an Internet company, Infoseek. We started to realize that Internet companies were very different from software companies, especially in terms of iteration speed.

At Internet companies, we update our software constantly. We might ship an update every day. Once we update the code on our servers, every user can enjoy the new services immediately. This is very different from traditional software companies, which roll out software packages. It would take 6 months for their users to get the newest update. But at an Internet company, you can update your servers and software at any time.

For instance, Baidu updates its service several times a day, which amounts to a constant upgrade. But that’s something users cannot easily notice, since the search engine itself doesn’t change much. The search engine still works the same way. Looking back, being able to update our software any time we want is exactly what has set us apart from the crowd over the last 16 years. That’s also the reason why most traditional companies are lagging behind Internet companies. This is the PC Internet era.

Building an ecosystem is the key to success in the mobile Internet era

About five years ago, the world entered the mobile Internet era, which is the second stage I talked about. In this era, the game changer is no longer how fast you update your software. As a matter of fact, in this era companies might only update once a month, or once every two or three months. The timing is irregular. So what exactly is the key to success in the mobile Internet era? In my opinion, it’s building your own ecosystem.

But why didn’t we need such a thing in the PC era? In the PC Internet era, everything followed a global standard. We had HTTP and HTML. The only thing you needed to focus on was the technology itself, since everything else was ready for you to use. All websites were open. All you needed was a link to access any information.

But things were very different in the mobile era. Content was poured into numerous apps. With these apps, you no longer just possess information, but also have the capacity to do so much more. The information on a website is standard information. But on mobile, you can even make apps that allow you to trade on your smartphone. Why do we see different behavioral patterns in the PC and mobile eras? Because companies like Baidu need to both serve users’ search demand and provide transaction services at a large scale.

In the PC era, users had access to information via Baidu’s products. But in the mobile era, users expected something more, and we naturally did more for them. At Baidu, we wanted to be more competitive while users were using our service. We wanted users to be able to purchase a service as soon as they found it in the search results. We wanted to enable users to make a reservation directly after searching for it, freeing them from the trouble of switching between apps and webpages.

That’s very different from what we had back in the PC era. So how did we achieve it? The answer is building an ecosystem. But building an ecosystem doesn’t mean just integrating all the apps together. It took us years to finish that process.

A few years ago, we tried to convert our PC search service into a more mobile-friendly version. In the beginning, we were just trying to change the interface to fit smaller screens and to adapt to slower network speeds and expensive data fees. But that was far from enough. We needed to bring about more changes. So we started to invest in specific vertical sectors such as education, healthcare, automobiles, tourism and catering. Thanks to these investments, we are now able to provide relevant services no matter what keywords users enter. Meanwhile, we are also trying to establish cooperative relationships with giants in multiple vertical sectors so as to provide our users with the best search experience.

Of course, Baidu Map is also one of our focuses. In China today, you can easily access a hotel reservation service once you open Baidu Map. Besides that, you can also purchase group-buying deals for dining on Baidu’s Nuomi. As a matter of fact, we received a great many orders after we first opened the hotel reservation service, most of which were instant bookings.

By contrast, in the past you usually needed to book a hotel a few days or weeks in advance. But in the mobile Internet era, people can just open Baidu Map after they arrive and make a reservation based on their location.

This is a major change. The mobile Internet era will continue to bring more possibilities to Internet companies like us. Thus, we have been rolling out apps suited to the mobile Internet and cooperating with industry leaders in many vertical sectors to bring our users the best possible O2O experience.

Consumers look for services, and some services are available online while others are offline. We provide the offline services that many important industries need. You can tell from this that we no longer rely on a standard Internet ecosystem. We are now building a brand new ecosystem based on our own needs. We need to ensure a good cooperative relationship with vertical suppliers and make sure our users can purchase goods smoothly via our apps.

That’s the second stage, which was based on the development of the mobile Internet era.

The opening of the AI era

Starting from this year, we are entering a new age. That’s the Internet era powered by AI technology.

Without a doubt, the AI era is different from the PC and mobile eras, which you can tell from the search function. Nowadays you can search not only with keywords, but also with images and sound. That’s something very different. For the general public, using voice or a camera to express themselves seems to be easier. We noticed the demand and we tried to satisfy it. And the key lies in AI.

We have pretty accurate voice recognition technologies now. Baidu’s voice recognition function is 97% accurate. What does this 97% mean? It means it even exceeds humans’ capacity to recognize speech. In other words, voice recognition technologies are already very mature and are ready to be applied in many areas. And search will certainly be a great scenario for such technology.

When we moved from the PC era to the mobile era, people started to realize that a keyboard is not a natural way to express oneself. My generation grew up with computers and laptops, so we are used to keyboards. But after smartphones became popular, a screen replaced both the keyboard and the mouse. In the beginning, people thought the design was stupid: slow and inaccurate. But now whenever I see my kids using their smartphones, everything seems so smooth and natural. That’s because tapping on a smartphone screen is more natural than using a keyboard.

In the third stage, the AI era, sound and images become more natural ways of communicating, making voice interaction and image recognition more convenient ways to provide services.

Then we also have image recognition technology. When you see a plant and don’t really know what it is, you can just take a photo of it and the machine will identify it. A similar application can also be used on people. When you see someone and don’t really know who he or she is, you can photograph that person and the machine will tell you. What makes such a function possible? Clearly, it’s AI. The applications of AI are very broad. As early as the 60s, the term artificial intelligence already existed. But it wasn’t until the last decade that we finally realized just how important it is, since we now have cheaper and more powerful computing. And we also have more data than in the past.

AI technology, empowered by cheaper computing power and big data, is penetrating our everyday lives. In the past six years, Baidu has invested a great deal in AI technologies, especially in deep learning.

So how does AI help Baidu work better?

On September 1st, we held the annual Baidu World Conference, where we launched Baidu Brain. That’s the engine of Baidu’s AI technology. Baidu Brain is linked to the core of AI, which includes voice and image recognition, natural language understanding and user profiling.

We have already mentioned voice and image recognition, so now let’s talk about natural language processing. Natural language processing is essential, since the machine has to understand not only the users’ personality but also their needs in order to understand what they are thinking. This requires natural language processing technology, which is a very unusual area. We have also added user profiling to our system, since we have a lot of data to draw on.

We have data on users, such as behavioral patterns, search patterns and geographic information, which allows us to understand our users better. Thanks to this data, we are also able to meet users’ demands. The AI technology behind it is very important to our work and provides great assistance to us. With AI technology, we can augment our existing services: search, Baidu Map, Baidu Tieba and so on. In fact, it can also provide many services to developers to help them build up their own advantages.

Recently our sales team has tried many new things. Many companies do telephone sales. Telephone salespeople are typically not highly paid, and the turnover rate is particularly high. So companies always have to train new salespeople in sales skills and in how to talk to customers. The best salespeople might perform ten times better than a new hire. Companies usually summarize the sales techniques and have new staff memorize them, so it takes new staff a lot of time to learn these skills.

We have implemented a new system to help new salespeople learn these skills. When a salesperson calls a potential customer, a voice recognition system runs alongside the call. Whatever the customer says or asks, the system recognizes it and shows the salesperson the best way to reply. This used to be impossible, but voice recognition technology has made it a reality. New salespeople no longer need lengthy training to perform as well as seasoned ones. That’s the power of voice recognition. Just imagine what it could bring to different industries around the world.
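The reply-suggestion step described above can be illustrated with a minimal sketch. It assumes the customer's speech has already been transcribed to text by a separate recognizer, and it uses simple keyword overlap against a small playbook. The playbook entries, the function names and the matching rule are all hypothetical illustrations and are not a description of Baidu's actual system.

```python
import re

# Hypothetical playbook: trigger keywords mapped to a suggested reply.
PLAYBOOK = [
    ({"price", "cost", "expensive", "budget"},
     "Walk through the pricing tiers and offer a trial period."),
    ({"competitor", "alternative", "compare"},
     "Highlight the features that differ most from alternatives."),
    ({"later", "busy", "time"},
     "Offer to schedule a short follow-up call at a convenient time."),
]
FALLBACK = "Ask an open question to learn more about the customer's needs."


def tokenize(text):
    """Lowercase the transcript and split it into a set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def suggest_reply(transcript):
    """Return the playbook reply whose keywords overlap the transcript most."""
    words = tokenize(transcript)
    best_reply, best_score = FALLBACK, 0
    for keywords, reply in PLAYBOOK:
        score = len(words & keywords)
        if score > best_score:
            best_reply, best_score = reply, score
    return best_reply


if __name__ == "__main__":
    print(suggest_reply("That sounds too expensive for our budget"))
```

A production system would presumably rank replies with a learned model rather than keyword overlap; the sketch only shows the shape of the loop: recognize, match, suggest.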

Besides voice recognition, we have also found other possibilities. We have our own financial services and Internet finance business, which can use image recognition technology to identify faces. With that, a student loan process can be completed within seconds, since we can verify the student’s identity against the photo on the ID card. Our technologies have made this feasible. We also integrated this function into our advertising system. One of the sectors that benefited from it was the education sector. We know these education organizations well and we know the income level of their graduates.

We teamed up with these education organizations and provided loans to prospective students who needed them. The education organizations were satisfied with our cooperation, since they were able to find more students and earn more. The students were also satisfied, since they got the money to go to school without having to worry about their financial situation. It is beneficial for us too, because we profit from the loan interest. All of this can be attributed to the development of technologies that allow us to evaluate students’ creditworthiness. Natural language understanding is another trend in the IT sector.

User profiling technology will also provide support for many industries, such as the marketing sector. In June last year, Legendary Pictures asked us for help promoting the upcoming movie Warcraft in the Chinese market. We then used our user mining technologies to bring the movie to a wider audience. We divided our users into three groups: the first group is the fans who would go to watch the movie without any promotion; the second group is the undecided; the third is people who would not go to the cinema for that movie in any case.

Our job was to identify these users and persuade the undecided ones to watch the movie. With user profiling technology, we were able to find the undecided group and launch a promotion campaign accordingly. Originally we expected a 5% rise in income from this marketing method, but the result was astonishing: a 200% increase. We know our users well.
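The three-way split described above can be sketched as a simple thresholding step. The sketch assumes each user already has an estimated interest score between 0 and 1. The score, the thresholds and all names here are hypothetical, and the speech does not say how Baidu actually scored or grouped its users.

```python
from dataclasses import dataclass


@dataclass
class User:
    user_id: str
    interest_score: float  # hypothetical estimate in [0, 1] of interest in the movie


def segment(user, low=0.3, high=0.7):
    """Assign a user to one of three groups using assumed thresholds."""
    if user.interest_score >= high:
        return "fan"        # would watch without any promotion
    if user.interest_score < low:
        return "unlikely"   # would not go in any case
    return "undecided"      # the group worth promoting to


def campaign_targets(users, low=0.3, high=0.7):
    """Return the IDs of undecided users, the only group the campaign addresses."""
    return [u.user_id for u in users if segment(u, low, high) == "undecided"]


if __name__ == "__main__":
    audience = [User("a", 0.9), User("b", 0.5), User("c", 0.1)]
    print(campaign_targets(audience))
```

The design point is that promotion spend goes only to the middle group, since fans need no persuasion and the third group is unlikely to respond.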

We know who they are, what they like and their income level. Through such analysis, we can achieve many things. Currently we are only beginning to explore this technology. We have cooperated with a few partners on it. Once industries become aware of the power of user profiling, voice recognition, image recognition and natural language understanding, they will realize that the potential of related applications is immense.

Everyone will be involved, to a greater or lesser extent, in the application of AI. Whether it’s IT, manufacturing, finance, education, healthcare, tourism or logistics, every one of these industries will be disrupted by AI and will face challenges. Everyone, and every industry, should consider the possibilities and challenges AI will bring.



[The article is published and edited with authorization from the author @TMTpost; please credit the source and include a hyperlink when reproducing it.]

Translated by Garrett Lee (Senior Translator at PAGE TO PAGE), working for TMTpost.



