File:AI-generated audio featuring bossa nova music with electric guitar.ogg
AI-generated_audio_featuring_bossa_nova_music_with_electric_guitar.ogg (Ogg Vorbis sound file, length 15 s, 141 kbps, file size: 262 KB)
Captions
Summary
Description |
Demonstration of an algorithmically generated audio track featuring bossa nova music accompanied by electric guitar, created using Riffusion. Riffusion is an open-source, fine-tuned derivative of the Stable Diffusion image-generation diffusion model that has been retrained to generate images of audio spectrograms, which can then be converted into audio files. An audio spectrogram is a visual representation of an audio clip's frequency content over time. Because a spectrogram image stores only magnitudes, it can be converted back into audio via the inverse short-time Fourier transform, using the Griffin-Lim algorithm to approximate the missing phase during reconstruction. While the Stable Diffusion model was originally intended to generate visual images from a textual prompt, Riffusion was fine-tuned from Stable Diffusion v1.5 to instead generate spectrogram images from text prompts describing musical motifs; the fine-tuning was carried out on Nvidia A10G enterprise datacenter GPUs.
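The phase-approximation step described above can be illustrated with a minimal, standard-library-only sketch of Griffin-Lim. This is not the uploader's conversion script and not Riffusion's own code: the function names (`stft`, `istft`, `griffin_lim`), the Hann window, the naive DFT, and the parameters `n_fft`, `hop` and `n_iter` are all assumptions chosen for illustration. In practice one would more likely call a library routine such as `librosa.griffinlim` or `torchaudio.transforms.GriffinLim`.

```python
import cmath
import math
import random


def hann(n):
    # Periodic Hann window of length n (illustrative choice of window).
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def dft(frame):
    # Naive O(n^2) discrete Fourier transform; fine for tiny demo frames.
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    # Inverse DFT, keeping only the real part (the audio signal is real).
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def stft(signal, n_fft, hop):
    # Short-time Fourier transform: windowed DFT of overlapping frames.
    win = hann(n_fft)
    return [dft([signal[s + i] * win[i] for i in range(n_fft)])
            for s in range(0, len(signal) - n_fft + 1, hop)]


def istft(frames, n_fft, hop):
    # Least-squares inverse STFT: windowed overlap-add, normalized by
    # the sum of squared window values at each sample.
    win = hann(n_fft)
    length = n_fft + hop * (len(frames) - 1)
    out = [0.0] * length
    norm = [0.0] * length
    for j, spec in enumerate(frames):
        frame = idft(spec)
        for i in range(n_fft):
            out[j * hop + i] += frame[i] * win[i]
            norm[j * hop + i] += win[i] ** 2
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]


def griffin_lim(magnitudes, n_fft, hop, n_iter=30, seed=0):
    # Start from random phase, then alternate between the time domain and
    # the STFT domain, keeping the known magnitudes and the re-estimated phase.
    rng = random.Random(seed)
    spec = [[m * cmath.exp(2j * math.pi * rng.random()) for m in frame]
            for frame in magnitudes]
    for _ in range(n_iter):
        reprojected = stft(istft(spec, n_fft, hop), n_fft, hop)
        spec = [[m * (c / abs(c) if abs(c) > 1e-12 else 1.0)
                 for m, c in zip(mag_frame, cur_frame)]
                for mag_frame, cur_frame in zip(magnitudes, reprojected)]
    return istft(spec, n_fft, hop)
```

As a usage example, taking the magnitudes `[[abs(c) for c in frame] for frame in stft(signal, 16, 4)]` of a short test tone and passing them to `griffin_lim(magnitudes, 16, 4)` returns a real-valued signal whose STFT magnitudes approximate the input. A real spectrogram image would additionally need its pixel values mapped back to linear-frequency magnitudes first, which this sketch does not cover.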
The spectrograms were generated using the Riffusion Inference Server running the riffusion-model-v1 diffusion model, paired with the Riffusion App UI frontend. The following values were used:
This resulted in an output spectrogram image. The spectrograms were then converted to WAV audio using a Python script. |
Date | |
Source | Own work |
Author | Benlisquare |
Permission (Reusing this file) |
As the creator of the output images and audio, I release this file under the licence displayed within the template below.
The Stable Diffusion AI model is released under the CreativeML OpenRAIL-M License, which "does not impose any restrictions on reuse, distribution, commercialization, adaptation" as long as the model is not intentionally used to cause harm to individuals, for instance to deliberately mislead or deceive. As stipulated by the license, the authors of the AI models claim no rights over any generated image outputs.
The Riffusion v1 model, created by Seth Forsgren and Hayk Martiros, is released under the CreativeML OpenRAIL-M License and is a derivative model of the Stable Diffusion v1.5 model checkpoint.
The Riffusion Inference Server is released under an MIT License.
|
Licensing
- You are free:
- to share – to copy, distribute and transmit the work
- to remix – to adapt the work
- Under the following conditions:
- attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- share alike – If you remix, transform, or build upon the material, you must distribute your contributions under the same or compatible license as the original.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled GNU Free Documentation License. |
File history
 | Date/Time | Thumbnail | Dimensions | User | Comment |
---|---|---|---|---|---|
current | 22:22, 17 December 2022 | | 15 s (262 KB) | Benlisquare | {{Information |Description= Demonstration of an algorithmically-generated audio track featuring bossa nova music accompanied by electric guitar, created using [https://www.riffusion.com/about Riffusion], an open-source fine-tuned derivative of the Stable Diffusion image-generation diffusion model that has been retrained to generate images of audio spectrograms, which can then be converted into audio files. An audio spectrogram i... |
File usage on Commons
The following 2 pages use this file:
Transcode status
Format | Bitrate | Download | Status | Encode time |
---|---|---|---|---|
MP3 | 98 kbps | | Completed 22:22, 17 December 2022 | 1.0 s |
File usage on other wikis
The following other wikis use this file:
- Usage on el.wikipedia.org
- Usage on en.wikipedia.org
- Usage on nl.wikipedia.org
- Usage on ro.wikipedia.org
- Usage on uk.wikipedia.org
- Usage on uz.wikipedia.org
Metadata
Author | Benlisquare |
---|---|
Short title | Riffusion, prompt "bossa nova with electric guitar" |
Software used | Xiph.Org libVorbis I 20200704 (Reducing Environment) |