The Future of Search Engines Doesn't Lie in Speech or Image Search, Says Sogou's CEO
Abstract: What will search engines and input software look like in the future? What’s Sogou’s next goal? How will Sogou apply artificial intelligence technology to its search engine and input software? Are speech search and image search the end?
Undoubtedly, artificial intelligence is one of the hottest areas in which all the major Chinese internet companies are trying to make a difference, however far each has progressed and whatever the future trend turns out to be.
During the 2016 Sogou Core Partners Conference, Wang Xiaochuan, CEO of Sogou Technology, delivered a speech titled “Sogou’s Path of Artificial Intelligence” and shared with the audience Sogou’s strategic development plan at this stage. In the speech, Wang maintained that mankind’s ultimate goal for artificial intelligence has always been to have machines or robots communicate with people in natural language, and that this is also the ultimate goal of Sogou.
Wang predicted that there would be two systems in the future world of artificial intelligence: the virtual world and machine intelligence, with the core lying in “getting machines involved in the decision-making process of human beings”. Artificial intelligence meets two great needs of human society: on the one hand, it gives human beings the sense of being better; on the other, they can turn to machines for help in the decision-making process.
Speaking of Sogou’s path of artificial intelligence, Wang said that there is a natural relationship between artificial intelligence and search engines, and that the search engine will become the shining star of the era of artificial intelligence.
“While artificial intelligence is based on big data, cloud computing and machine learning, the search engine has always been the core carrier of technologies in these three areas.”
In his opinion, the future of search engines lies not in speech search or image search, but rather in “Q&A”. In the era of artificial intelligence, users can put their questions to a search engine through keywords, and the search engine can predict what they mean and present all possible answers accordingly.
“Whether it’s speech or image search, the key is for machines to first understand what you really mean and then provide users with more diverse ways of expression accordingly.”
Therefore, Wang revealed in his speech that Sogou’s next big goal is to change the current keyword-based search engine system into a Q&A-based one. In fact, Sogou’s system can already provide direct answers to 5 to 10 per cent of questions.
Speaking of changes at Sogou, one of the major search engines in China, Wang revealed that speech and image search are still the most important features this year, and that Sogou is prepared to upgrade its technology and provide a better and more sophisticated speech and image search service for its large number of users.
“When we talk about artificial intelligence and the Q&A-based model, Input Software is always the best gateway,” Wang said in closing. He believes that Sogou’s Input Software can help Sogou collect a large amount of data and therefore gain an advantage in the area of artificial intelligence.
The following is the full transcript of Wang Xiaochuan’s speech:
In this speech, I shall not talk about the current business of Sogou. Instead, I’d like to share with you Sogou’s path of artificial intelligence and give you a sense of Sogou’s strategy at this stage.
After AlphaGo defeated Lee Sedol for the third time, we gave all Sogou employees a day off to celebrate. AlphaGo’s victory ushered in a new era: the era of artificial intelligence. I started to study artificial intelligence back when I was still a freshman in college, yet it was not until 2016 that a major breakthrough was made in this area. How come? Not only has the technology itself improved, but the attitude of people and of the market towards artificial intelligence has also changed significantly.
Before 2016, artificial intelligence technology was studied only in colleges and universities; today, however, the number of enterprises studying it has doubled. Indeed, artificial intelligence technology only became applicable in 2016.
The virtual world and machine intelligence are the two main systems artificial intelligence will revolve around in the future
My prediction is that there will be two systems in the future world of artificial intelligence: the virtual world and machine intelligence. When I say virtual world, I don’t simply mean headsets. Instead, I want to talk about areas Sogou isn’t good at, such as games, fiction, music and video, because progress in these cultural and creative industries is bringing us all into a virtual world.
Artificial intelligence meets two great needs of human society: on the one hand, it gives human beings the sense of being better; on the other, they can turn to machines for help in the decision-making process. Therefore, the significance of artificial intelligence lies not in recognition and processing, but in getting machines involved in people’s decision-making process.
At the beginning of this year, some journalists asked me whether Sogou was going to shift its focus to artificial intelligence. I said that wasn’t the case. While artificial intelligence is based on big data, cloud computing and machine learning, the search engine has always been the core carrier of technologies in these three areas. Therefore, we are not shifting our focus to artificial intelligence, but rather evolving towards it. That’s why I say the search engine will become the shining star of the era of artificial intelligence. So you might ask: What does the future look like? What’s the future of the search engine? The future of artificial intelligence? Why is the search engine the shining star of the era of intelligence?
What does artificial intelligence look like from the perspective of science fiction or of scientists? In the 1960s, artificial intelligence was thought of as a machine that can talk in such a way that a person can’t tell whether he or she is talking to a machine or a real person. There are also many novels and films depicting machines that can talk like human beings, such as Baymax and the robots in Interstellar.
In Asimov’s short story The Last Question, he depicted an “ultimate robot” that people use to control the earth, and even the Milky Way. That robot can answer all of humanity’s questions except one: What’s the origin of the universe? Near the end of the story, the robot figures out the answer: let there be light. The robot then re-creates the universe. For most people, therefore, artificial intelligence means something like a machine that can think and communicate with human beings.
What will search engines look like in the future?
What will search engines look like in the future? Is speech search the future? I don’t think so. There isn’t much added value in speech search. Maybe you can have the system recognize different accents, for fun, but nothing else. Is image search the future? Indeed, image search and personalized recommendation systems have developed quickly over the past two years. For me, though, the future of search engines lies in neither of them.
What then? My answer is “Q&A”. Today, using a search engine is more like typing in a few keywords and getting back a few answers to choose from. However, we are not satisfied with the current user experience. Although we might be able to add some personalized features, we still don’t know whether we have given you the answers you want.
For example, if I say to Hong Tao (Note: Vice President of Sogou Technology) “sogou”, or “sogou and revenue”, he certainly won’t know what I am asking. Only when I ask him in natural language, such as “What does our revenue in Q3 look like?”, can he know what I am asking and give me the right answer. However, over 97 per cent of search requests are keyword-based.
When I first got to know search engines in 1999, most people tended to query them in full sentences. The first search request was: am I beautiful? However, machines weren’t smart enough at that time, so how could they answer such questions? That’s why search engines gradually became keyword-based. Yet search engines will gradually evolve in the future. At that point, search engines will give users only one answer. This also applies to Google.
By then, users will also need direct answers from search engines, so search engines should be able to give users ten direct answers, or even one specific answer, while users will also get to interact with search engines more naturally.
The biggest contribution of Steve Jobs was to allow people to communicate with machines in a more natural way. In the past, we used a keyboard and mouse. Later, smartphones were invented and allowed us to interact with machines with our fingers, which completely transformed the smartphone industry. The next innovation won’t occur in speech recognition, but in language understanding and Q&A. Apple was the first company to release a Q&A system, and all the other major tech giants, including Facebook, Google and Amazon, are following suit.
Why wasn’t Google the first? I believe Steve Jobs really wanted to see a Q&A system on the market before he passed away, so he decided to integrate Siri, an immature version of a Q&A system, into the iPhone 4S. He passed away the day after the system was released.
However, if you’ve used Siri, you might have found that the user experience of Apple’s Q&A system isn’t good enough. I believe Google will catch up, since Google has much more experience and, as fundamentally a search engine company, counts Q&A among its main focuses. The next goal of Sogou is to transform the current keyword-based system into a Q&A-based one. In fact, our system is already able to provide direct answers to 5 to 10 per cent of questions. This must be the next change at Sogou.
Fundamentally, our goal at Sogou is to make it easier to acquire and express information. To make it easier to express information, we develop speech search and image search; to make it easier to acquire information, we endeavor to develop a truly accurate Q&A-based system and provide users with direct answers.
Another “core weapon” of Sogou is Input Software. In fact, it is also one of our major tools for making it easier to express information. How will Sogou Input Software evolve?
Sogou Input Software & artificial intelligence
Can input software be counted as advanced technology? When a search engine company develops input software, it tends to integrate cutting-edge technologies into it.
Sogou Input Software was born as a product of big data. In 2006, we didn’t yet use the term “big data” to describe it, but we would crawl the entire internet, collect the words we needed and build our word bank. We would also analyze the frequency of each word as well as the grammatical structure. This is exactly what we now call “big data”.
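As a rough illustration of the word-bank idea described above, the following minimal Python sketch counts word frequencies across a handful of crawled pages and ranks input candidates by frequency. The function names, the sample pages and the simple regular-expression tokenizer are all assumptions made for illustration; the speech does not describe Sogou's actual pipeline, and Chinese text would need a proper word segmenter rather than this English-only tokenizer.

```python
import re
from collections import Counter


def build_word_bank(pages, min_count=2):
    """Count how often each word appears across crawled pages.

    Words seen fewer than `min_count` times are dropped as noise.
    (Hypothetical helper; the threshold is an illustrative choice.)
    """
    counts = Counter()
    for text in pages:
        # Naive English tokenizer: lowercase runs of letters/apostrophes.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return {word: n for word, n in counts.items() if n >= min_count}


def suggest(prefix, word_bank, limit=3):
    """Return candidates starting with `prefix`, most frequent first."""
    candidates = [w for w in word_bank if w.startswith(prefix)]
    # Sort by descending frequency, then alphabetically for stable ties.
    return sorted(candidates, key=lambda w: (-word_bank[w], w))[:limit]


if __name__ == "__main__":
    pages = [
        "search engine search",
        "search input engine",
        "input software",
    ]
    bank = build_word_bank(pages)
    print(bank)
    print(suggest("s", bank))
```

The point of the sketch is only that candidate ranking falls out of raw frequency counts: the more text is crawled, the better the ordering of suggestions becomes.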
Sogou Input Software is also a product of cloud computing. If you’ve ever used Sogou Input Software, you might have noticed that when you type, a small cloud slowly floats up. What does it mean? Well, when we found that local computing was not enough to assist users as they typed letters and words, we decided to upload input requests to a cloud server. Since the cloud server has greater computing and storage capacity, users find it easier to input letters and words.
In addition, Sogou Input Software is a product of artificial intelligence. Speech recognition has become very popular this year. I believe that as artificial intelligence becomes more mature, it will be easier for users to input letters and words with this function. Moreover, Sogou Input Software will be able to identify letters and words in pictures.
But is that the end of the application of artificial intelligence in input software? Today, Sogou Input Software receives over 190 million speech input requests per day, more than all of its rivals combined. In the process, we accumulate over 100,000 hours of speech data every day and build a gigantic speech bank to further improve our system’s accuracy and provide users with a better experience.
Although we might not have explained our efforts and improvements to the public, Sogou Input Software has indeed made it more convenient for users to input letters and words. Still, I don’t think this is what the future of input software looks like; it is only how things stand for now.
Sogou artificial intelligence: natural interaction & knowledge computing
So, what does the future look like, exactly? As a matter of fact, we have made a video to explain this, though the function we are going to release later this year is more amazing than what the video shows.
This video tries to introduce one basic concept: we are not merely highlighting speech and image search, but rather getting machines to understand what you really want to say and to provide varied search results accordingly, such as restaurant information, map locations, professional explanations, music or any other related content.
A few years ago, we had a discussion: is technological progress making people stronger or weaker? My answer is very simple: human beings are becoming stronger with technology. Without technology, we might, indeed, be weaker than ancient people; with technology, however, we are much stronger. Today, a 20-something can book an airplane ticket and travel to an island with internet technology.
The future of input software is not to become just a more effective tool, but rather to become an essential part of people. By then, input software will be able to understand what you are talking about and what you really mean. It is possible that machines will help you answer some of other people’s questions in the future. In other words, the future of input software is about finding a way to cooperate with people, and even merge with people, via artificial intelligence technology.
When we talk about artificial intelligence and the Q&A-based model, Input Software is always the best gateway. Many companies are developing Q&A systems, but none of them has found a mature scenario in which to deploy one. Sogou has not only sufficient technology, but also a broad user scenario. Therefore, we can also accumulate enough data and further upgrade our technology in the process. To some degree, we have a natural advantage in the future development of artificial intelligence.
To make it easier to express and acquire information, we have to do two things: first, we need to enable natural interaction between people and machines, so that machines can better understand people, whether through words, speech or images; second, we have to enable smarter knowledge computing. Only when machines are smarter at understanding language can our search engines and input software better meet people’s needs and assist people in their work and life.
That’s all, thank you.
[The article is published and edited with authorization from the author @TMTpost; please credit the source and include a hyperlink when reproducing it.]
Translated by Levin Feng (Senior Translator at PAGE TO PAGE), working for TMTpost.