Chinese scientists have presented an algorithm that creates songs from recordings of speech. The neural network can also perform the reverse process. Details of the development and the test results have been published on arXiv.org.
A group of researchers from Tencent set out to address a typical problem in the development of other speech synthesis programs: the need to process large amounts of training data.
Previously, “musicalizing” a particular person's voice required processing a large number of that person's vocal samples. The Chinese developers' new algorithm makes do with just a speech recording as the sample, so the subject does not have to strain their vocal cords trying to sing the text.
The system is based on the DurIAN neural network, which was designed to synthesize realistic videos of a speaking presenter from text.
The algorithm was trained on a private dataset of 1.5 hours of singing recordings and 28 hours of speech. Its effectiveness was then tested on 14 volunteers. The most successful results have been published on the developers' website.
Earlier, a robot cat that has emotions and can dig through trash was introduced in China.