How to Make Machines Think Like People: User Portraits, Personality and Chatbots
Recently, I bought a picture book for my three-year-old daughter, named "Can I recreate myself?" She couldn't put it down. The protagonist of this book is a child who is tired of his daily routine. He hopes to train a robot to take naps, eat and go to kindergarten on time in his place, so that he can play freely. So he buys the cheapest robot and takes it home for training. The first question he runs into is: how can a robot become him? So he tries to tell the robot all kinds of information about himself, including his name, age, height, weight, parents, brothers and pets, and even details such as "left-handed", "irritable" and "socks often get holes".
The author of this picture book is wonderfully imaginative, and he is thinking about the same thing we are. The story also tells us that the first step in making robots think like people is to know ourselves, because only then can we tell a robot how to be most like us. We discuss this problem from the following aspects:
1. Artificial intelligence and psychology
2. Personality classification and inference
3. How to make robots think like people?
For a long time, our team has been studying user portraits. What is a user portrait? Simply put, it means inferring a person's age, occupation and hobbies from the large amounts of data that users generate; it can also describe the daily routines and movement patterns of a group of people. This made us wonder: can we use these data to understand people's personalities and emotions more deeply? It is not easy. In the course of our research, however, we found that psychology has been considering these questions for thousands of years. Artificial intelligence and psychology have in fact intersected for a long time.
Two years ago, we began to visit well-known psychologists and professors to set up interdisciplinary cooperation and exchanges. The first problem we had to solve was personality: can people's personality be inferred from the data they generate?
Although the word personality is very common in daily life, it is not easy to define accurately and clearly, and even psychologists find it difficult to reach a consensus on its definition. The earliest definition of personality can be traced back more than 2,000 years, to around 400 BC. Hippocrates, an ancient Greek doctor, believed that the human body was composed of four kinds of body fluids: blood, phlegm, yellow bile and black bile. The balance of these four fluids determined a person's personality: black bile produced a melancholy personality, blood an optimistic one, yellow bile an impulsive and irritable one, and phlegm a calm, slow one. Hippocrates' theory of body fluids has since been rejected by modern medicine.
When we talked with psychologists, we found an interesting fact: in modern psychology, the definition of personality is closely tied to the use of language. Computer science also has a great deal of research on language, which we call "natural language understanding". In psychology, there is a concept called the "lexical hypothesis". According to this hypothesis, we do not need to observe and study all kinds of people in order to study personality; we can simply examine the personality-related words in human language. For example, if you introduce a friend to me, you may describe him in a long paragraph: "He likes talking very much, and he is a chatterbox every time he talks" and so on. This passage can be summarized in one word: talkative. Psychologists therefore decided to collect and sort these descriptive words. If the resulting set of words is small enough, it can serve as the basis for a classification system.
Based on these observations, Allport and Odbert, pioneers of personality theory, carried out a laborious and systematic survey of English words in 1936. By going through the dictionary, they found about 18,000 words, which they sorted into four categories, including personal characteristics, temporary moods or behaviors, and intelligence and talent. From these they further selected more than 4,000 words that describe personality. Although 4,000 seems small next to the whole vocabulary, it is still far too many to work with.
Imagine how much work it would take to score a person on 4,000 descriptive dimensions. Psychologists therefore wanted to reduce the list further. In the process, they found correlations between these words. For example, an extrovert is usually talkative, and a calm person is usually rational but may also be introverted. If we can identify these correlations, we can group the more than 4,000 words into a much smaller number of categories.
In the past twenty years, the definition of personality that has drawn the most attention and support from personality researchers is the Big Five personality theory. It comprises five highly condensed personality factors: extraversion, conscientiousness, neuroticism, agreeableness and openness. Each factor also has finer-grained facets (extraversion, for example, covers whether you often take part in activities, whether you are enthusiastic, and so on). With this scheme, the next time you introduce a friend you can describe him as "an extrovert, but not easy-going, and perhaps rather emotional": a simple yet comprehensive description.
Sorting these words and deriving a personality classification system from them is largely a data-driven process, which ties it closely to computer science. Can we then calculate users' Big Five personality traits automatically? This is in fact possible.
In traditional personality measurement, psychologists mostly use interviews and questionnaires, which require a great deal of manpower, money and time. The subjects are often limited to dozens or hundreds of people, so large-scale measurement is impossible. Psychology also has another method of personality measurement, called behavior measurement, which evaluates personality by observing individual behavior. Its theoretical basis is the consistency of human behavior in personality theory: because personality explains the stable individual differences between people, and differences in individual behavior are closely related to personality, it is possible to predict personality by observing behavior. Before computer technology became widespread, however, it was difficult for psychologists to collect rich enough behavioral data, so behavior measurement saw little use in traditional psychology.
In recent years, with the spread of the Internet, smartphones and all kinds of sensing devices, users' behavioral data have been collected on a large scale, and the growing use of artificial intelligence methods in user modeling has driven rapid progress in measuring personality from behavioral data, at the intersection of computer science and psychology. Our research goes one step further and proposes a "personality inference model", which uses heterogeneous social media data (such as avatars, published texts, emoji usage and social relationships) to predict the Big Five personality traits. For pictures, for example, we can compute a semantic representation and then classify the pictures into categories such as comics, selfies, group photos, animals and plants. Predicting personality from behavioral data with artificial intelligence methods requires questionnaire results from a small number of users as annotations. These labeled pairs of behavioral features and personality scores are then used to train a model that learns the mapping between them.
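The training step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the personality inference model itself: it swaps in a simple k-nearest-neighbour average as a stand-in learner, and the feature names, the numbers and the `predict_trait` helper are all invented for the example.

```python
from math import sqrt

# Hypothetical labelled users: (behaviour features, questionnaire score).
# Features, all invented: share of selfie avatars, emoji per post, posts per day.
# Label: extraversion on a 1-5 questionnaire scale.
LABELLED = [
    ((0.9, 0.2, 0.5), 2.0),
    ((0.8, 0.3, 0.4), 2.5),
    ((0.2, 1.5, 3.0), 4.5),
    ((0.1, 1.2, 2.5), 4.0),
]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_trait(features, labelled=LABELLED, k=2):
    """Average the questionnaire scores of the k most similar labelled users."""
    nearest = sorted(labelled, key=lambda item: distance(features, item[0]))[:k]
    return sum(score for _, score in nearest) / len(nearest)


# A new user who never filled in the questionnaire.
print(predict_trait((0.15, 1.3, 2.8)))
```

A real system would use far more users, richer features and a stronger learner; the point is only that questionnaire labels are needed for the training users, not for the new ones.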
In practice, we recruited a group of volunteers who provided their own data and also completed the questionnaire, which gave us both sides of the data. Once the model is trained, new users do not need to fill in the questionnaire; the model can calculate their personality automatically. This sounds abstract, but it is quite concrete. For example, we can calculate the relationship between the words users post and their personalities. The Big Five has five dimensions, and for each dimension we can find words that are strongly positively or negatively correlated with it. A person who often writes words such as "youth" and "self" in WeChat Moments posts is likely to be extraverted, while users who often write words such as "cannot" and "face" tend to have low extraversion scores. Other users write words that sound positive, such as "the times", "society" and "success", and we find that these people are more conscientious. Conversely, some people often write words such as "casually", "MengMeng" (cute) and "temper", and we find that their conscientiousness is relatively low. Low conscientiousness is not a derogatory label: in this model, people who care about results score higher on conscientiousness, while those who care about the process score lower. Both extremes have their advantages, and neither is better or worse.
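The word-level analysis above comes down to computing a correlation between how often each user writes a word and that user's score on one personality dimension. A minimal sketch follows, with invented counts and scores for five hypothetical users; the `pearson` helper is written out by hand so that the example has no dependencies.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented data: questionnaire extraversion scores of five users ...
extraversion = [1.0, 2.0, 3.0, 4.0, 5.0]
# ... and how often each of them used two words (same user order).
word_counts = {
    "youth": [0, 1, 2, 3, 4],
    "cannot": [4, 3, 2, 1, 0],
}

for word, counts in word_counts.items():
    print(word, pearson(counts, extraversion))
```

With this toy data "youth" comes out perfectly positively correlated with extraversion and "cannot" perfectly negatively correlated; real data would give weaker values somewhere between -1 and 1.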
We also computed the Pearson coefficient between Big Five scores and clusters of users' avatars, and show the clusters that correlate strongly, positively or negatively, with each trait (two pictures are selected for each cluster). This calculation reveals some interesting patterns: users with high extraversion scores like to use smiling faces, while users with low scores often hide their facial expressions or show only a side view in their avatars; users with high openness scores often use photos with friends as avatars, while users with low openness scores often use selfies.
Our experimental results show that the accuracy of personality prediction can reach 0.6 using avatars alone. We not only propose a targeted feature extraction strategy for each kind of behavioral data, but also combine the different kinds effectively through ensemble learning, which raises the accuracy of individual Big Five personality prediction above 0.75.
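The text does not spell out how the predictions from different kinds of data are combined. One common and simple ensemble rule is a weighted average of the per-source predictions, sketched below. The source names, the weights and the `ensemble_predict` helper are assumptions made for illustration only.

```python
def ensemble_predict(predictions, weights):
    """Weighted average of the trait scores predicted from each data source.

    Only sources present in `predictions` contribute, so a user with no
    avatar or no text can still be scored from the remaining sources.
    """
    total_weight = sum(weights[name] for name in predictions)
    weighted_sum = sum(predictions[name] * weights[name] for name in predictions)
    return weighted_sum / total_weight


# Hypothetical extraversion predictions from three separate models.
predictions = {"avatar": 3.0, "text": 4.0, "emoji": 5.0}
# Hypothetical weights, e.g. chosen from each model's validation accuracy.
weights = {"avatar": 1.0, "text": 2.0, "emoji": 1.0}

print(ensemble_predict(predictions, weights))  # (3*1 + 4*2 + 5*1) / 4 = 4.0
```

Averaging tends to help when the individual models make different mistakes, which is plausible here because avatars, text and emoji capture different aspects of behavior.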
Once we understand users, the next step is to use this knowledge to help robots think like people. One of the important behaviors that humans hope robots can master is chatting. Microsoft has also put forward the concept of "Conversations as a Platform", the idea that all human-computer interfaces will eventually become conversational interfaces.
I watched a TV episode two years ago that I still remember vividly: the first episode of the second season of the British series Black Mirror. It describes an artificial intelligence company that can synthesize a virtual person from someone's social media and online chat data, imitating the personality of the original so that it can hold conversations with his girlfriend. This seems like science fiction, but it is not far from reality. A news report published in June also mentioned that Kuyda, an entrepreneur from Russia, trained a chatbot on 8,000 of Roman's text messages in memory of her late friend Roman, and officially released it in May 2016.
Although the technology has taken a big step forward, even the best chatbot today cannot make people feel that it is a living person with a stable personality and emotions. This raises the question of how to make the language and behavior of robots more personalized.
With the popularity of social networks, language data tagged with user information has become easy to obtain. As in the news report mentioned above, if we have enough data about someone, it is possible to train a chatbot with the same personality. We can also train robots with the characteristics of a group of people, such as children, students or even poets, from that group's data. For example, can we collect all the work of modern poets and use it to train a robot that writes poems? This is feasible today, but as research goes deeper I believe we will eventually hit bottlenecks, such as how to give robots a more realistic human personality and emotions, and that will still require cooperation with psychologists.
In fact, ELIZA, the earliest chatbot, played the role of a psychological counselor. About 50 years ago, Joseph Weizenbaum, a researcher at MIT, created ELIZA. When chatting with users, ELIZA borrowed from the person-centered therapy proposed by the psychologist Rogers and emphasized a respectful and empathetic attitude in conversation. ELIZA does not volunteer new content; it keeps guiding users to talk as much as possible. This seemingly simple project achieved unexpected success, and its effect astonished users at the time. It gave rise to the term "ELIZA effect": the tendency to overestimate the abilities of a machine. The ELIZA effect is still very common. For example, when AlphaGo beat the top players, people felt that computers had acquired an intuition for Go and that artificial intelligence would soon surpass human beings. In fact, the programs behind AlphaGo were all written by people; the so-called intuition and the so-called intelligence are realized by programs.
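The way ELIZA keeps the user talking without adding content of its own can be illustrated with a few pattern rules that turn the user's statement back into a question. The rules and word list below are invented for this sketch and are not Weizenbaum's original script.

```python
import re

# Swap first-person words for second-person ones so the reply mirrors the user.
REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are", "mine": "yours"}

# (pattern, reply template) pairs, tried in order. Invented for illustration.
RULES = [
    (re.compile(r"\bi feel (.*)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.*)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (.*)", re.IGNORECASE), "Tell me more about your {0}."),
]


def reflect(fragment):
    """Rewrite a fragment from the user's point of view to the listener's."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(utterance):
    """Return a reply built only from the user's own words, or a neutral prompt."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    return "Please go on."


print(respond("I feel tired of my routine."))
# Why do you feel tired of your routine?
```

Every reply is either the user's own phrase handed back as a question or a neutral prompt, which is why such a small program can still feel attentive.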
Inspired by the ELIZA project, Microsoft Research Asia launched the DiPsy project. Its goal is to enable robots to chat with people and help them overcome psychological problems. In this project, we draw on cognitive behavioral therapy and mindfulness therapy, both commonly used in psychological counseling. What distinguishes DiPsy is that it guides the dialogue in a natural and effective way, so that users can speak freely. It will also study users' psychological processes and use data-driven methods to assess their psychological characteristics and mental disorders. We use cognitive behavioral therapy (CBT) or early intervention to change users' thinking and behavior in various therapeutic situations, and to help at-risk users alleviate and manage their psychological problems.
In the future, we expect this project to help solve practical social problems, such as psychological counseling for left-behind children in rural areas. At the Future Forum held not long ago, Shen Xiangyang, global executive vice president of Microsoft, said that three diseases closely related to the human brain should be tackled: childhood autism, middle-aged depression and Alzheimer's disease. I hope our technology can help achieve this. Of course, many of these research projects are still at an early stage and require a great deal of cooperation with scholars in other fields, including psychology, sociology and cognitive science. I hope to exchange ideas with more disciplines in the future and gain more research inspiration and innovation.
We hope that in the end, the machine can think like a human, providing not only help but also companionship when people need it. When you are lonely, at least one AI will accompany you.
Knowledge map:
Pearson coefficient: It is used to measure the linear correlation between two variables X and Y, and its value lies between -1 and 1. In natural science, this coefficient is widely used to measure the degree of correlation between two variables.
Ensemble learning: A machine learning method, which uses a series of learners to learn and some rules to integrate the learning results, so as to obtain better learning results than a single learner.