Developers have adapted the artificial intelligence (AI) model Stable Diffusion to create spectrograms that can be turned into audio or music clips from text.
Stable Diffusion is a text-to-image machine learning model developed by Stability AI that generates high-quality digital images from text.
Two developers, Seth Forsgren and Hayk Martiros, have created a project called ‘Riffusion’ that adapts this model to music. It generates spectrograms that can, in turn, be converted into audio clips.
As the creators of this project explain on their website, an audio spectrogram, or sonogram, is a visual representation of the frequency content of a sound clip. Riffusion generates these images from sets of text prompts entered by the user.
These sonograms have two axes: X represents time and Y represents frequency. The color of each pixel encodes the amplitude of the sound at that time and frequency. Torchaudio uses precisely this data to convert the image generated by Stable Diffusion into audio.
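To make the axes concrete, here is a minimal sketch of how a spectrogram can be computed from audio, going in the opposite direction to Riffusion's image-to-audio step. It is not Riffusion's code: the function name `magnitude_spectrogram`, the 8000 Hz sample rate, the 64-sample frame, and the 1000 Hz test tone are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import cmath
import math


def magnitude_spectrogram(samples, frame_size, hop):
    """Return a list of columns, one per time step (X axis).

    Each column holds the amplitude of frequency bins 0..frame_size//2
    (Y axis); the amplitude is what a pixel's color would encode.
    """
    # Hann window to reduce leakage between neighboring frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    columns = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        column = []
        for k in range(frame_size // 2 + 1):
            # Naive discrete Fourier transform of one frame at bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            column.append(abs(acc))
        columns.append(column)
    return columns


# Illustrative values only (assumptions, not Riffusion's settings).
SAMPLE_RATE = 8000
FRAME_SIZE = 64
HOP = 32

# A pure 1000 Hz tone as the test signal.
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(512)]
spec = magnitude_spectrogram(tone, FRAME_SIZE, HOP)

# Find the brightest "pixel" in the first column and map it back to Hz.
loudest_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = loudest_bin * SAMPLE_RATE / FRAME_SIZE
print(len(spec), "time steps,", len(spec[0]), "frequency bins, peak at", peak_hz, "Hz")
```

In this sketch the brightest row of every column sits at the bin corresponding to the test tone. A spectrogram image like this stores only amplitude, so turning it back into sound requires estimating the missing phase information, which is the kind of work delegated to an audio library.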
The Riffusion team says that it is possible not only to generate music from images and text, but also to combine, experiment with and merge styles.
The developers have pointed out that, with a powerful enough GPU, it is possible to generate sonograms measuring 512 x 512 pixels that correspond to five seconds of audio. However, endless variations can be produced from the same original image.
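The link between image width and clip length is simple arithmetic: each pixel column covers a fixed slice of time. The sketch below assumes a 10-millisecond step per column. That value is not stated in the article and is used here only because it yields roughly the five seconds the developers mention; the names `clip_duration_seconds` and `ASSUMED_HOP_MS` are hypothetical.

```python
def clip_duration_seconds(width_px, hop_ms):
    """Duration covered by a spectrogram with width_px time columns."""
    return width_px * hop_ms / 1000


# Assumption for illustration: each pixel column spans 10 ms of audio.
ASSUMED_HOP_MS = 10

duration = clip_duration_seconds(512, ASSUMED_HOP_MS)
print(f"{duration:.2f} seconds")
```

Under that assumption, a 512-pixel-wide image covers a little over five seconds, and a longer clip would need either a wider image or several images played in sequence.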
Riffusion's website currently includes a clip generator, as well as instructions and technical details for using this technology. Its code is also available in its repository on GitHub.
Source: Elcomercio