The University of Electro-Communications develops AI that expresses the texture of things in images with onomatopoeic (mimetic) words such as "smooth" | Today Nation News


On November 17, the University of Electro-Communications announced the development of an AI that can express the texture of things shown in images with onomatopoeia (mimetic words) such as "smooth". The group reports that it succeeded in getting machine learning to handle the ambiguity of onomatopoeia, whose associated sensations differ from person to person.

A research group led by Professor Maki Sakamoto of the University of Electro-Communications Graduate School of Information Science and Engineering and the Center for Advanced Artificial Intelligence asked 100 subjects to describe the texture of what appears in 1,946 images using onomatopoeia, and created a deep learning model from that data.

The model is a neural network, a structure modeled on human nerve cells. Specifically, the group adopted a deep convolutional neural network (DCNN), a more multi-layered variant of the convolutional neural networks that have attracted particular attention in the field of object recognition. A DCNN has the advantage of automatically detecting image features during the learning process, so it can be applied to things like texture, where "the point of view differs from person to person". However, AI is generally not good at learning ambiguous targets, so some ingenuity was required in the training method.
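The article does not describe the network's internals, but the feature detection it mentions rests on the convolution operation: a small learned kernel is slid over the image, responding strongly where a local pattern (an edge, a speckle, a texture element) appears. A minimal NumPy sketch, with a hand-set edge kernel standing in for the kernels a DCNN would learn on its own:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution: slide the kernel over the image and
    take a dot product at each position. Strong responses mark
    locations where the kernel's pattern appears."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

# Hypothetical 5x5 image patch with a vertical brightness edge
patch = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# Vertical-edge kernel; in a DCNN such kernels are learned from
# data during training, not set by hand as they are here
edge_kernel = np.array([[-1, 0, 1],
                        [-1, 0, 1],
                        [-1, 0, 1]], dtype=float)

response = conv2d(patch, edge_kernel)
print(response)  # largest values where the edge sits under the kernel
```

Stacking many such layers, each convolving the previous layer's responses, is what makes the network "deep" and lets it build texture descriptions from raw pixels.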


The research group therefore focused on onomatopoeia, which strongly exhibits "sound symbolism", a phenomenon in which phonology is associated with sensory impressions such as touch and vision. This property makes it easy to quantify a person's impressions. For the study, the group prepared 1,946 images classified into 10 material categories (fiber, glass, metal, plastic, water, leaves, leather, paper, stone, and wood) and collected 30,138 corresponding onomatopoeic responses by asking 100 subjects to view and describe the images. By training on multiple onomatopoeia as correct answers for a single image, the group was able to create a DCNN model that takes ambiguity into account. The model, which outputs onomatopoeia when given an image, is reported to have achieved an accuracy of about 80%.
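The article says multiple onomatopoeia were treated as correct answers for one image but does not detail the loss function. A common way to realize this is multi-label training: encode each image's set of labels as a multi-hot target and score each label independently with a sigmoid and binary cross-entropy, so several labels can be "correct" at once. A sketch under that assumption (the vocabulary and network outputs below are illustrative, not from the study):

```python
import numpy as np

# Hypothetical mini-vocabulary of onomatopoeic labels
# (the actual study collected 30,138 responses)
VOCAB = ["sube-sube", "tsuru-tsuru", "zara-zara", "fuwa-fuwa"]

def multi_hot(labels):
    """Encode the set of onomatopoeia given by subjects for one
    image as a multi-hot target vector."""
    t = np.zeros(len(VOCAB))
    for lab in labels:
        t[VOCAB.index(lab)] = 1.0
    return t

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(logits, target):
    """Binary cross-entropy summed over labels: each label is an
    independent yes/no question, which is what allows an image to
    have several correct onomatopoeia simultaneously."""
    p = sigmoid(logits)
    eps = 1e-9
    return -np.sum(target * np.log(p + eps)
                   + (1.0 - target) * np.log(1.0 - p + eps))

# Two subjects described the same image with different words;
# both count as correct answers in the target
target = multi_hot(["sube-sube", "tsuru-tsuru"])

# Hypothetical network outputs (logits) for the four labels
logits = np.array([2.0, 1.5, -2.0, -1.0])
print(bce_loss(logits, target))
```

Minimizing this loss pushes the network's scores up for every label any subject used and down for the rest, letting the model absorb person-to-person disagreement instead of forcing a single answer.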

"If a computer that can express textures the way humans do is realized, then in a future where humans and robots coexist, robots could, for example, convey textures to visually impaired people," the research group says.

