Vocal synthesis using sine

Revision as of 17:26, 6 August 2023 by Spherey (talk | contribs) (added a method to restore low frequency clarity in liams fft using tempo stretching in audacity as well as a guide to make the output sequence sound a lot like the original sound file)

One of the key limitations of Online Sequencer is that it cannot include audio from any outside source. One way around this is a script that reconstructs a sound out of 8 Bit Sine notes. The results often have enough fidelity to discern lyrics or the timbre of an instrument, depending on which FFT algorithm is used. This method creates a very large number of notes and requires a fast computer both to create and to play back the sequences. Simpler sounds work best, such as someone talking or a clear sound effect. More complex and busy sounds, such as a whole song with many parts and layers, usually don't work as well, but can still produce usable results.
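The general idea can be illustrated with a minimal Python sketch. It is not the code of either converter described on this page: it analyses a single chunk with a naive DFT, finds the loudest frequency, and maps it to the nearest note. The sample rate, chunk size, and function names are assumptions chosen for illustration, and a real converter would keep many frequencies per chunk and turn their magnitudes into note volumes.

```python
import cmath
import math

SAMPLE_RATE = 8000   # samples per second (illustrative value)
CHUNK_SIZE = 400     # samples per analysed chunk, so bins are 20 Hz apart


def dft_magnitudes(samples):
    """Return the magnitude of each frequency bin below the Nyquist limit."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2):
        total = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(total))
    return magnitudes


def strongest_frequency(samples, sample_rate):
    """Frequency in Hz of the loudest bin in one chunk."""
    magnitudes = dft_magnitudes(samples)
    peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    return peak_bin * sample_rate / len(samples)


def frequency_to_midi(freq_hz):
    """Nearest equal-tempered MIDI note number (A4 = 440 Hz = note 69)."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))


# A synthetic 440 Hz test tone stands in for one chunk of an audio file.
chunk = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(CHUNK_SIZE)]
peak_hz = strongest_frequency(chunk, SAMPLE_RATE)
print(peak_hz, frequency_to_midi(peak_hz))
```

Repeating this for every chunk, and keeping more than one peak per chunk, is what makes the note count grow so quickly.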

Liam's FFT

The easiest way to do an FFT conversion is with Liam's FFT tool. Upload a WAV file, choose a preset mode, and click GO. You can then either copy the notes and paste them into a sequence, or download the sequence file and drag and drop it into Online Sequencer. There are many advanced options for optimizing the output. This tool is based on the older fft.py script, which is now deprecated. With the correct settings applied, it can use multiple copies of the 8 Bit Sine instrument to reproduce sounds more clearly than Jacob's FFT, but it will generate hundreds of thousands of notes per minute.

Restoring Low Frequency Clarity

This algorithm is highly effective at replicating the mid-to-high frequencies of a sound file, but depending on the chunking frequency set in the Advanced Options panel, the lower frequencies tend to be grainy, noisy, and unclear. One way to mitigate this limitation is to time-stretch the sound file in an audio editor such as Audacity before converting it.

With the audio clip selected, open the Effect menu in the top menu bar, navigate to Pitch and Tempo (Audacity 3.3 and later), and choose Change Tempo. Set the Beats per Minute fields to "from 16 to 1". Another power-of-two ratio such as 8 to 1 or 32 to 1 also works; higher ratios give more frequency clarity but significantly inflate the note count. Make sure the high quality stretching checkbox is ticked, otherwise Audacity uses its default tempo-stretching algorithm, which is very low quality. The purpose of stretching the sound file is to give the FFT converter more data per chunk, which increases frequency accuracy while retaining time accuracy.

Return to the converter, open the time-stretched sound file, and divide both the minimum and maximum chunking frequency by the factor by which the audio was time-stretched (e.g. 16). Run Liam's FFT converter on the time-stretched file. Once it has finished, import the output *.sequence file into Online Sequencer and use the console to stretch the notes by a factor of 1/N, where N is the factor used to time-stretch the sound file. The resulting sequence should have clearer low frequencies and less graininess or noise.
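The arithmetic behind this trick can be sketched in Python. The sketch assumes that "chunking frequency" means chunks per second, that each chunk gets one un-padded FFT, and that Change Tempo leaves the pitch untouched. The function names, the 32 chunks-per-second starting value, and the (start, length) note pairs are hypothetical and are not taken from Liam's tool.

```python
def bin_spacing_hz(chunks_per_second):
    """Spacing between FFT bins when each chunk gets one un-padded FFT."""
    chunk_duration = 1.0 / chunks_per_second
    return 1.0 / chunk_duration


def stretched_settings(chunks_per_second, stretch_factor):
    """Chunking frequency to enter after slowing the audio by stretch_factor."""
    return chunks_per_second / stretch_factor


def restore_timing(notes, stretch_factor):
    """Scale (start, length) pairs by 1/N to undo the time stretch."""
    return [(start / stretch_factor, length / stretch_factor) for start, length in notes]


original_rate = 32.0      # hypothetical chunking frequency, in chunks per second
stretch = 16              # tempo changed "from 16 to 1"

new_rate = stretched_settings(original_rate, stretch)
print(bin_spacing_hz(original_rate))  # bin spacing before stretching
print(bin_spacing_hz(new_rate))       # bin spacing after stretching
print(restore_timing([(0.0, 16.0), (32.0, 16.0)], stretch))
```

Under these assumptions each chunk still covers the same slice of the original sound, but it contains N times as many samples, so the frequency bins sit N times closer together.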

*Additional Note: To make the output sequence sound more like the original audio file, try using a windowing function (e.g. Blackman), a high extra-detune value (e.g. 4800), a low minimum note volume (e.g. 0.0025), stereo (if needed), more microtones (e.g. 4 microtones), and perhaps a higher overall output volume (e.g. 4). These example settings produce a result that sounds rather close to the original sound file, but it will be very note-dense, so it's best to export the sequence as an audio file to listen to it without lag or cutouts.
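For readers unfamiliar with windowing, here is a small Python sketch of the classic Blackman window applied to one chunk. Tapering the chunk edges toward zero reduces spectral leakage in the FFT. Whether Liam's tool uses exactly these coefficients is an assumption, and the function names are illustrative.

```python
import math


def blackman_window(size):
    """Classic Blackman window coefficients for a chunk of `size` samples."""
    return [
        0.42
        - 0.5 * math.cos(2 * math.pi * n / (size - 1))
        + 0.08 * math.cos(4 * math.pi * n / (size - 1))
        for n in range(size)
    ]


def apply_window(samples, window):
    """Multiply each sample by its window coefficient before the FFT."""
    return [s * w for s, w in zip(samples, window)]


window = blackman_window(9)
tapered = apply_window([1.0] * 9, window)
print([round(w, 3) for w in tapered])
```

The coefficients fall to roughly zero at both ends of the chunk and peak at 1 in the middle, so the samples in the centre of each chunk dominate the analysis.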

Jacob's FFT

Below is a script that can be used by copying and pasting it into the browser console while on the sequencer. Press F12 to access the console, then click in the input box. Once you see a blinking cursor, paste the script in. It will ask you to upload an audio file; MP3, WAV, and OGG are accepted. The script should behave the same regardless of the signal level in your file, and it works with both mono and stereo files (the final output will be mono). It generates 8 Bit Sine notes corresponding to the frequencies present in your file, with a time resolution of one 1/16 grid step at whatever tempo you have set. Fewer notes are placed at a slower tempo, but time resolution suffers. For most purposes 110 BPM is fine, but it can help to match the sequence's BPM to that of the file you intend to upload (only if you are uploading a song), or to leave the BPM unchanged if you are incorporating the result into an existing sequence.
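The trade-off between tempo and time resolution can be estimated with a short Python sketch. It assumes that one 1/16 grid step is a sixteenth note, i.e. a quarter of a beat. That reading is not confirmed here, so `steps_per_beat` is kept as a parameter. The function names and the 10-second clip are hypothetical.

```python
def step_seconds(bpm, steps_per_beat=4):
    """Length of one grid step in seconds.

    steps_per_beat=4 is an assumption: it treats a 1/16 grid step as a
    sixteenth note, i.e. a quarter of one beat.
    """
    return 60.0 / bpm / steps_per_beat


def steps_needed(duration_seconds, bpm, steps_per_beat=4):
    """How many grid columns an audio clip of the given length occupies."""
    return round(duration_seconds / step_seconds(bpm, steps_per_beat))


for bpm in (55, 110, 220):
    print(bpm, round(step_seconds(bpm), 4), steps_needed(10.0, bpm))
```

Halving the tempo doubles the length of each step, so the same clip is covered by about half as many columns of notes.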

[Jacob's FFT Converter]

Restoring High Frequency Sounds

Using this algorithm limits you to the frequencies accessible to 8 Bit Sine, which often results in high frequencies being cut off. This makes the output sound muffled and certain syllables difficult to hear. It is possible to get around this limitation and reproduce high frequency sounds more accurately in the sequencer. To do this, first follow the instructions in the paragraph above as normal, except that you should use a BPM that is a multiple of 4. After that, select all of the notes and change them to 8 Bit Triangle. Open the console and run this command: setDetune(13,2400). This makes all 8 Bit Sine notes sound 2 octaves higher than they are.

Next, open your sound file in an audio editor such as Audacity. Slow the sound down to 25% of its original speed, and make sure that no setting which preserves the pitch of the sound is enabled. In Audacity, this can be done by clicking the drop-down menu next to the audio track and changing the "rate" to a quarter of its current value. The purpose of this is to bring high frequency data down into the range that the FFT Converter can detect. Return to the sequencer and set the sequence's tempo to a quarter of its current value. Run Jacob's FFT Converter in the console as before, this time importing the slowed-down version of the audio. Once it has finished, change the tempo back to normal. You should now have the original conversion in 8 Bit Triangle and a second conversion, containing more high frequencies, in 8 Bit Sine. They should completely overlap and sync up, creating a clearer sound. Adjust the volume and EQ of the Sine and Triangle instruments to taste.
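The numbers in this method fit together as follows, shown here as a small Python sketch. Playing audio at a quarter of its speed without pitch correction lowers every frequency by two octaves (2400 cents), which is exactly what the 2400-cent detune undoes. The function names, the 120 BPM tempo, and the 8000 Hz frequency are illustrative assumptions.

```python
import math


def cents_for_speed(speed_ratio):
    """Pitch change in cents when audio is resampled to play at speed_ratio."""
    return 1200.0 * math.log2(speed_ratio)


def slowed_frequency(freq_hz, speed_ratio):
    """Where a frequency lands after slowing the audio without pitch correction."""
    return freq_hz * speed_ratio


speed = 0.25                              # 25% of the original speed
shift = cents_for_speed(speed)            # pitch drop caused by slowing down
compensating_detune = -shift              # detune that undoes the drop
slow_tempo = 120 * speed                  # 120 BPM is a hypothetical example

print(shift, compensating_detune, slow_tempo)
print(slowed_frequency(8000.0, speed))
```

Choosing a BPM that is a multiple of 4 keeps the quarter-speed tempo a whole number, which is why the method asks for it.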

*Please note that this method is no longer necessary, as Liam's FFT can offer this level of quality with less effort.