The Story Behind The AI That Predicted The Winners Of "I Am A Singer"
Summary: Ai successfully predicted the top three contestants in the I Am A Singer Season 4 finale and, in the end, named the winner, Li Wen. It appears that AI is getting good at looking into people’s hearts.
On April 8th, the final of I Am A Singer Season 4 had an unexpected guest: Ai, an AI developed by Aliyun to predict the winners of the competition. Before the final, Ai predicted that Huang Zhilie had the highest chance of winning, and so failed to name the winner at that stage. It should be noted, however, that Ai correctly predicted the top three contestants.
After looking into the fundamental mechanism behind Ai, we at TMTpost came to believe that Ai’s algorithm is not merely a product of deep learning, but a more advanced algorithm built on top of it. To confirm this hypothesis, we consulted Aliyun’s chief AI scientist Min Wanli, and the answer was yes.
So why is Ai’s algorithm more advanced?
Aliyun hasn’t revealed much about Ai’s algorithm beyond stating that it is based on neural networks, social computing, emotion recognition and similar technologies. Even so, Ai appears able to perceive and process what is actually happening in current events and to understand human emotion.
To put it in perspective, three obstacles challenged Ai when predicting the outcome. First, the final of I Am A Singer was a contest among seven singers, entirely different from AlphaGo’s match against Li Shishi (Lee Sedol). Second, music, or rather a singer’s performance, is about the appreciation of art as well as emotional interaction. Different singing styles, ponticello and falsetto can produce totally different results, and those results are unpredictable. More importantly, singers improvise a lot during live shows, stepping outside the box and creating something new to suit the moment. Third, the final result was determined by votes from the Hunan Satellite Channel’s team, the audience watching at home on TV and the 500 public judges at the scene, as well as by the performance of the seven singers. The whole process was full of randomness.
In simple terms, AlphaGo’s deep learning algorithm, which is centered on the game of Go and draws on the kind of technology used in voice and image recognition, pursues a single, direct goal: winning the game. Ai, by contrast, had to handle multiple tasks at once, which forced Aliyun to choose a more advanced algorithm. Min Wanli told TMTpost that Ai is fundamentally based on decision-making optimization. “Making a decision requires many sets of input variables, and many of them come from deep learning optimization.”
Before I Am A Singer Season 4, Ai’s algorithm had already been adopted by the Transportation Bureau of Zhejiang Province to predict road conditions. Many may not know that the algorithms suited to city management and the macro economy are advanced multi-task algorithms. The Western world has long used such algorithms, including simulated annealing and genetic algorithms, to help manage cities and set macro policy. Min Wanli told TMTpost that Ai is based on an independently developed algorithm rather than existing ones.
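For readers unfamiliar with simulated annealing, the following is a minimal Python sketch of the general technique, not of Ai’s algorithm, which Min Wanli says is independently developed and which Aliyun has not disclosed. The two toy objectives, their weights, the cooling schedule and every function name here are assumptions chosen purely for illustration.

```python
import math
import random


def combined_cost(x, w_a=0.5, w_b=0.5):
    """Fold two toy objectives into one cost (weights are assumptions)."""
    objective_a = (x - 2.0) ** 2  # hypothetical target A prefers x = 2
    objective_b = (x - 4.0) ** 2  # hypothetical target B prefers x = 4
    return w_a * objective_a + w_b * objective_b


def simulated_annealing(cost, x0, steps=5000, t0=10.0, cooling=0.995, seed=0):
    """Minimize `cost` by random moves that are accepted less freely as it cools."""
    rng = random.Random(seed)
    x, fx = x0, cost(x0)
    best_x, best_f = x, fx
    temperature = t0
    for _ in range(steps):
        candidate = x + rng.uniform(-1.0, 1.0)
        fc = cost(candidate)
        delta = fc - fx
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            x, fx = candidate, fc
            if fx < best_f:
                best_x, best_f = x, fx
        temperature *= cooling
    return best_x, best_f


if __name__ == "__main__":
    print(simulated_annealing(combined_cost, x0=-10.0))
```

With equal weights the combined cost is lowest midway between the two targets, so the search should settle close to x = 3. Changing the weights shifts that compromise, which is the basic idea behind trading off several targets in one optimization.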
Min Wanli, father of Ai
So how did Ai achieve the functions of an advanced algorithm? To answer this question, we have to introduce the father of Aliyun’s Ai, Min Wanli, an AI scientist at Alibaba.
Admitted to the Teenage Talent Class of the University of Science and Technology of China at the age of 14, Min Wanli went to the U.S. at 19 to pursue a master’s degree in physics. In 2004, Min received his Ph.D. in statistics from the University of Chicago. After graduating, he worked as a researcher first at IBM Watson and later at Google. In 2013, he joined Aliyun’s AI team, the very team that developed Ai.
Min Wanli said that the experience he accumulated at IBM Watson helped him a lot. IBM was the first to put forward the intelligent city strategy, anticipating where the industry was heading. In 2005, after selling its PC hardware business to Lenovo, IBM began its transformation. In the process, IBM found that its biggest shortcoming was a lack of ability to process and analyze large amounts of data, so the company launched research in areas including mass data analysis, key information extraction, forecast modeling and machine learning. That gave Min Wanli the chance to work on real projects and gain first-hand experience.
Min Wanli later moved to Google, where he was in charge of research on advertisement targeting for mobile clients, a core technology in the Internet advertising industry. That research involves mass data analysis, which happens to be a very important component of Ai. The team had to analyze massive amounts of data and use machine learning to make advertisement targeting more accurate and so improve the view rate.
Optimizing the data analysis behind mobile advertisement targeting includes identifying the content users have viewed, identifying where users are (whether they are driving or sitting in a restaurant), and determining whether users like the advertisement. These variables, drawn from different situations, determine the final decision, much like the situation in which Ai needed to predict the winner of I Am A Singer Season 4.
Min Wanli has been doing R&D in machine learning and applied algorithms and holds several international patents in EEG analysis, high-dimensional data mining, stochastic process theory, time series analysis, network flows and related areas. In 2011, Min published his research on traffic flow prediction. It has been the most cited work in the field worldwide over the past five years.
In 2013, a headhunter approached Min Wanli and tried to convince him to join Aliyun. “In China there is a company that processes more data than Amazon, eBay and PayPal combined,” the headhunter said. “If you really like this field, that’s where you belong: Alibaba.” Eventually, Min Wanli decided to join Aliyun.
The development of Ai
Since 2012, Aliyun has been developing a mass data processing engine named MaxCompute (ODPS), which later became vital infrastructure for Ai.
ODPS is the single data processing platform backing more than 30 of Alibaba’s business departments. In the 2015 Sort Benchmark competition, ODPS finished the sorting task in just 377 seconds, breaking the previous record of 1,406 seconds set by Apache Spark and setting four world records. At present, MaxCompute can process 100 PB of data, roughly the equivalent of 100 million HD movies, in 6 hours.
It’s worth noting that ODPS’s real-time computing system StreamSQL, later renamed StreamCompute, can handle trillions of pieces of information, PB-level data and millions of queries per second (QPS). It serves as a real-time recommendation system that recommends merchandise based on users’ behavioral data (views, transactions, favorites lists and so on).
According to Aliyun, Ai learns ten thousand times faster than humans, which means a skill that takes a human ten thousand hours to master would take Ai only about an hour. Ai’s powerful learning capability is made possible by Aliyun’s big data analysis platforms such as MaxCompute and StreamCompute. “Aliyun’s big data analysis has been tested in real action. That’s what sets it apart from the crowd,” Min Wanli said. The platform has been put through its paces by thousands of engineers at Alibaba, including under the load of the Double Eleven shopping festival.
While developing MaxCompute, Aliyun’s AI team was also building AI algorithm systems in fields such as deep learning, emotion analysis on social media, semantic analysis and optimization algorithms. By 2015, Aliyun’s AI algorithms had been applied in different scenarios across Alibaba’s businesses and had proven to be mature. They were later packaged into modules and installed in MaxCompute.
“That means Aliyun’s AI modules target specific application scenarios instead of pursuing some abstract goals. We didn’t bury our heads in our work for four years just to make an AI that can predict the winner of a reality show. This outcome is the product of Alibaba’s entire business ecosystem,” Min Wanli told TMTpost.
Ai’s algorithm system
Aliyun started developing AI technology as early as 2012, and Ai had already achieved some results before I Am A Singer. For instance, it had helped photovoltaic solar plants estimate power generation efficiency so as to reduce energy consumption, and helped the Water Resources Regulation Bureau forecast reservoir levels in order to prevent accidents. It had also served as a customer service agent answering phone calls for financial organizations and helped AliMusic predict which singers had the potential to become stars.
Ai’s team includes engineers and scientists, as well as professionals from AliMusic who act as coaches to help it learn to appreciate music. Backed by AliMusic’s database, Ai learned to automatically identify and process the characteristics of music and evaluate it along different dimensions, including pitch, emotional power, the spectrogram and the fundamental frequency. By identifying and processing these elements, Ai became able to assess how much each contributes to a song’s popularity with the audience. So far, however, Ai has not learned every kind of music; Korean and other foreign songs, for example, are not yet covered.
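To make one of these dimensions concrete, here is a minimal Python sketch of estimating the fundamental frequency of a note by autocorrelation. It is a textbook approach, not Aliyun’s method, which the article does not describe. The function names, the synthetic sine tone standing in for a sung note, and the 80 to 1,000 Hz search range are all assumptions for illustration.

```python
import math


def synth_tone(freq_hz, sample_rate, n_samples):
    """Generate a pure sine tone as a stand-in for a recorded note."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


def estimate_f0(samples, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental frequency from the strongest autocorrelation lag."""
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = int(sample_rate / fmin)  # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


if __name__ == "__main__":
    tone = synth_tone(200.0, 8000, 2048)
    print(estimate_f0(tone, 8000))
```

A real singing voice carries harmonics, vibrato and background noise, so a production system would need a more robust estimator. The sketch only shows the kind of low-level signal that a feature such as fundamental frequency is derived from.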
So how did Ai predict the outcome? Ai examined the contestants’ previous performances and results for variables that might determine the outcome, then built a real-time model to make predictions. The model covered the songs, the singers themselves, the fans, the live atmosphere, the online discussion among viewers and more. All of these provided data for the machine learning process. Some of the data were static, while others changed as the live show unfolded and required real-time processing.
Min Wanli revealed that in Ai’s eyes, singers carry thousands of different labels. Li Wen, for example, is a female singer born in the 70s, a Chinese American living in the US, an iconic figure, sexy, with songs in the R&B and Soul genres and a record of singing at the Oscars. Many elements generated during the live show could affect the outcome, so Ai had to find the relevant variables, which include the genre, style and arrangement of the songs, the guest singers, and the performance of the dancers and the singers themselves. The data generated online by viewers also counted. Ai integrated all of these elements into a logic system and determined the outcome.
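The split between static and real-time data can be illustrated with a small Python sketch. It is only a toy under stated assumptions: the three features, their weights, the smoothing factor and the placeholder singers are all hypothetical, and Aliyun has not published how Ai actually combines its inputs.

```python
from dataclasses import dataclass

# Hypothetical weights; the article does not say how Ai weighs its inputs.
WEIGHTS = {"past_results": 0.4, "song_appeal": 0.3, "online_buzz": 0.3}


@dataclass
class Contestant:
    name: str
    past_results: float  # static: earlier-round results, scaled to 0..1
    song_appeal: float   # static: evaluation of the chosen song, 0..1
    online_buzz: float   # real-time: tone of online discussion, 0..1


def update_buzz(contestant, observation, alpha=0.5):
    """Blend a new real-time observation into the running buzz estimate."""
    contestant.online_buzz = ((1 - alpha) * contestant.online_buzz
                              + alpha * observation)


def score(contestant):
    """Weighted sum of the static and real-time features."""
    return (WEIGHTS["past_results"] * contestant.past_results
            + WEIGHTS["song_appeal"] * contestant.song_appeal
            + WEIGHTS["online_buzz"] * contestant.online_buzz)


def top_three(contestants):
    """Return the names of the three highest-scoring contestants."""
    ranked = sorted(contestants, key=score, reverse=True)
    return [c.name for c in ranked[:3]]


def demo_field():
    """Made-up contestants used only to exercise the functions above."""
    return [
        Contestant("Singer A", 0.8, 0.7, 0.5),
        Contestant("Singer B", 0.6, 0.7, 0.6),
        Contestant("Singer C", 0.7, 0.6, 0.4),
        Contestant("Singer D", 0.5, 0.5, 0.5),
    ]


if __name__ == "__main__":
    field = demo_field()
    print(top_three(field))
    update_buzz(field[2], 1.0)  # a surge of positive discussion for Singer C
    print(top_three(field))
```

In this toy setup the static features are fixed before the show, while `update_buzz` is called whenever new online data arrives, so the predicted ranking can shift during the broadcast.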
In short, Ai observed the elements that might influence the voting. It’s a process of understanding humans' preferences and thinking.
Different from Microsoft’s Xiao Bing
Microsoft’s Xiao Bing, an AI that also focuses on emotion-related algorithms, is quite different from Ai. According to Min Wanli, Xiao Bing can readily understand the situation when holding a Q&A conversation with a human and then build a linguistic model for analysis. Ai, by contrast, needed to understand music from seven different singers, something that goes beyond human language. This is where Ai differs.
Min Wanli added that the challenges Ai faced were extremely complicated. Sun Nan’s exit from the competition, for example, was completely random and sudden, something Ai could never have anticipated. Events like this created many obstacles for Ai in training. In a live show, anything can happen. However accurate Ai’s predictions turned out to be, the attempt itself was a success.
Beyond that, Ai doesn’t have a clear business model when compared with Xiao Bing. At present, Microsoft is trying to make Xiao Bing a fundamental AI infrastructure for the company’s products and services. Compared with the currently popular deep learning algorithms, Ai’s multi-target optimization algorithm doesn’t seem to have a clear commercial prospect.
“We have a project that studies the index of happiness and health. Ai’s ability to appreciate music, which is an art, is a sign of understanding happiness and health. Achieving that is already a major technological breakthrough,” Min Wanli said, hinting at Alibaba’s future strategy in terms of business model. “Furthermore, Ai’s ability can be applied in the business field easily, and its potential has been proven by its performance in fields such as traffic management and weather forecasting.”
Ai’s success in predicting the winners of I Am A Singer Season 4 gained the AI wide public attention. IBM even sent its regards to Alibaba before the competition started. More importantly, the event was a great challenge for the multi-target optimization algorithm and the very first show that such an AI has put on for a Chinese audience.
From Xiao Bing to Ai, the research and development of algorithms has been shifting toward the study of human emotions. In any case, it’s a sign of AI technology moving to the next level.
[This article is published and edited with authorization from the author @Wu NingChuan. Please credit the source and include a hyperlink when reproducing it.]
Translated by Garrett Lee (Senior Translator at PAGE TO PAGE), working for TMTpost.