Audio to MIDI Conversion

A question frequently asked by computer musicians is: how can I turn an audio file into a MIDI file?
It can seem like an attractive idea.
After all, MIDI files are far more compact than WAV files, or even compressed files such as MP3.

However, while converting a MIDI file to an audio file is simple, since that is the very purpose of the MIDI format, the reverse transformation is far more complex than our ears would have us believe.

One of the most frequently asked questions on the MP3 and MIDI Forum is, "How can I convert MP3 or WAV files to MIDI?" It seems like a good idea. After all, MIDI files are much smaller than huge WAV or even compressed MP3 files. Unfortunately, the nature of MIDI makes conversions from digital audio extremely limited at best.

Unlike MP3, WAV, and other digital audio files, MIDI files don't really contain recorded music. Instead, the music is stored as a series of numbers which tells a synthesizer how the music is to be played back. Yes, a MIDI file must be played on a synthesizer. You may not know it, but the sound card in your computer also contains a MIDI synthesizer.

Here's a simplified explanation of how MIDI works. To reproduce the sound of a piano playing a C note, the MIDI file, or sequence, contains digital information that says, "this is a piano sound." Another number says, "a note has been played," and other numbers convey information such as, "the note is middle C," "the key was struck very softly," "the note has now stopped," etc.

Musicians love MIDI because it's easy to edit the files. MIDI files aren't recorded, they are "sequenced." Most MIDI files are made by a musician playing on a synthesizer keyboard. Each instrument must be entered separately, but since MIDI can have multiple tracks, the resulting sequence sounds like the instruments are all playing simultaneously. The fact that MIDI sequences must be created one instrument at a time prevents almost all digital audio files from being converted.

There are several "pitch to MIDI" software programs such as WAV to MIDI Converter and Intelliscore that CAN convert a file to MIDI. The catch is that the audio file must be of a solo instrument. Finally, since MIDI must be played on a synthesizer, you can't record vocals or other sounds that are not available on the synthesizer. That's why you can't convert an MP3 file of a pop song to MIDI.
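
The "series of numbers" described above can be made concrete. The following is a minimal sketch, not a complete file writer: it builds the raw channel messages for "this is a piano sound", "middle C was struck softly" and "the note has stopped". The status byte values (0xC0, 0x90, 0x80) are the standard MIDI ones; the helper names, the choice of channel 0 and the velocity of 20 are assumptions made for illustration. A real .mid file would additionally wrap these messages in header and track chunks with timing information.

```python
# Standard MIDI status bytes: upper nibble = message type, lower nibble = channel.
NOTE_OFF = 0x80
NOTE_ON = 0x90
PROGRAM_CHANGE = 0xC0

MIDDLE_C = 60  # MIDI note number for middle C


def program_change(channel: int, program: int) -> bytes:
    """'This is a piano sound': select an instrument (program 0-127)."""
    return bytes([PROGRAM_CHANGE | channel, program])


def note_on(channel: int, note: int, velocity: int) -> bytes:
    """'A note has been played': which key, and how hard it was struck."""
    return bytes([NOTE_ON | channel, note, velocity])


def note_off(channel: int, note: int) -> bytes:
    """'The note has now stopped.'"""
    return bytes([NOTE_OFF | channel, note, 0])


messages = [
    program_change(0, 0),      # program 0 is Acoustic Grand Piano in General MIDI
    note_on(0, MIDDLE_C, 20),  # low velocity: the key was struck very softly
    note_off(0, MIDDLE_C),
]

for message in messages:
    print(message.hex(" "))
```

The whole gesture takes eight bytes, and none of them describes a waveform. That is why the file is so small, and also why it cannot hold a voice or any sound the synthesizer does not already know how to produce.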

Every couple of days, it seems that we field the 'how do I make a MIDI from a WAV?' question. And every couple of days, we say 'try Digital Ear'. Digital Ear can analyze a recorded solo performance in a WAV file (for example a singing human voice or a single musical instrument) and convert it to a standard MIDI file. If you think this sounds too good to be true, you should check out the 'Before' and 'After' demos: listen to the Violin WAV and then the Violin MIDI. More examples are available.

Best of all, perhaps, is the settings wizard. It will help you quickly and easily find the optimal settings for any musical instrument. Things couldn't be made any easier for you!

But that's not all! There are other programs that deserve your attention if Digital Ear doesn't float your boat: AKoff Music Composer, AmazingMIDI and TiMidity.

Please note that this is an old article. For a list of WAV to MIDI converters that often do the job, please check out our new WAV-MIDI page.

In general, you can't convert WAV into MIDI. These are completely different concepts. It's like asking: how can I convert a cake back into 'the separate operations of the baker' AND 'the original ingredients (eggs, sugar, butter, flour, etc.)'?

A MIDI file is a sequence of commands to control one or more pieces of equipment (synthesizers most of the time). These commands are not sounds; they are recorded operations to do something (mostly to generate sound).

A WAV file is sound. It is the recording of a sound wave. It is the mix of everything (instruments, voices, background noises) you could have heard at the moment of recording. Most of the information you would need for a MIDI file is lost. It is like the cake: once the cake is on your table, all data about the baking process is gone.

There is a continuous discussion about WAV-to-MID conversion done by computer software. Don't be confused by people who say it can be done or that it should be possible. You'll hear all kinds of academic twaddle in this respect, such as FFT, one of the most popular buzzwords (which, by the way, stands for Fast Fourier Transform), or some other kind of fancy gobbledygook. The problem is a lot harder than these theorists would like you to believe.

For people, some sounds register as music. We can like the sound of 50 musicians playing 50 instruments at the same time, because for us humans the notes played by these 50 musicians are related in some way. To us, it's music. To a computer it's just noise. Because of this relation between instruments that we humans hear in music, we can distinguish the separate instruments (or instrument groups such as the violins). Therefore we are able to 'translate' a piece of music into a MIDI file by listening to it. A computer program does not have that ability, that sense.
It cannot distinguish music from noise. To the computer program it's just sound, and we ask it to unravel that. If you'd like to know what that means, try to imagine the following: there are 50 musicians on stage, all wearing hearing protection so they can't hear each other. They all start playing a different piece of music at the same time. Do you have any idea how that sounds? It's still only those 50 musicians you liked so much before, but do you think you could make a MIDI file out of it this time?

I will not confuse you with all the technical details on which some folks base the claim that it is possible. Take this advice: just give them a nice, full-blown wave file of an orchestra and ask for a demonstration. Works all the time :-).

In very simple cases it is possible to convert a WAVE file into a MIDI file with more or less success. We're talking about a WAVE file in which you have ONE instrument playing ONE note at a time. The degree of success depends on the quality of the hardware and software you use and, of course, on the instrument you want to 'convert'. Instruments that allow less human influence make a conversion easier. For instance, you can hit a piano key with more or less 'velocity' and you can hold the key for a long or short time, but that's about it. When you play the saxophone, there's not just 'velocity' and 'hold'. There are many more human influences on the sound: the way you breathe, open the valves, hold your mouth, use your tongue, bite the reed or even add a little vocal sound to it. This makes converting the sound of a saxophone a lot more difficult than converting the sound of a piano.

If you feel like experimenting with WAV-to-MID conversion, you might like to try the following programs:

  • Sound2Midi
  • Autoscore from Wildcat

If you want to convert a MIDI file to a WAV file, there are 2 ways to do it.
If you want the WAVE file to sound exactly like the MIDI file does, then the only way to do this (and the easiest method overall) is to open an application that plays the MIDI file and another application that can record the piece as a WAVE file. Then hit 'record' in the WAVE application and 'play' in the MIDI player.

The other way is to get a program that uses its own sounds to generate the WAVE file directly. But that is also its downside. These programs use their own sounds, so a MIDI file that sounds good on your MIDI equipment may not sound good when such a program turns it into a WAVE file. Also, they are usually not XG or GS compatible.

"Some of this information was supplied by the official FAQ created and maintained by Kees van der Velden. An HTML version and download of the complete FAQ can be found at the excellent web site MIDI Papa's - the MIDI FAQ by CC maintained by Bomi"

What is Music Recognition?

In a few words, music recognition is the mathematical analysis of an audio signal (usually in WAV format) and its conversion into musical notation (usually in MIDI format). This is a very hard artificial intelligence problem. For comparison, the recognition of scanned text (OCR, Optical Character Recognition) is solved with about 95% accuracy, which is the average accuracy for programs of that class. Speech recognition programs already work with 70-80% accuracy, whereas music recognition systems work with 60-70% accuracy, and only for a single-voice melody (one note at a time). For polyphonic music the accuracy is even lower.

To create a MIDI file for a song recorded in WAV format, a musician must determine the pitch, velocity and duration of each note being played and record these parameters as a sequence of MIDI events. Music recognition software must do the same. Even for a song played on a single instrument this is not a simple task, because a WAV recording contains only waveform signals and no music-specific data.
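
For the easiest possible case, the three parameters named above can be estimated with very little code. The sketch below is an illustration under strong assumptions, not how any of the programs mentioned here work: the "recording" is a synthetic, noise-free sine tone, pitch is estimated with a plain autocorrelation search, and velocity is simply guessed from the peak level. All function names and the 8000 Hz sample rate are choices made for this example.

```python
import math

SAMPLE_RATE = 8000  # Hz; deliberately low so the pure-Python loops stay fast


def sine_tone(freq_hz, seconds, amplitude=0.5):
    """Synthesize a test signal standing in for a clean solo recording."""
    count = int(SAMPLE_RATE * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(count)]


def estimate_pitch(samples, fmin=80.0, fmax=1000.0):
    """Return the frequency whose period best matches the signal with itself."""
    min_lag = int(SAMPLE_RATE / fmax)
    max_lag = int(SAMPLE_RATE / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Autocorrelation: how well the signal lines up with a shifted copy.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return SAMPLE_RATE / best_lag


def freq_to_midi(freq_hz):
    """Nearest MIDI note number, with A = 440 Hz defined as note 69."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))


def describe_note(samples):
    """Pitch, a crude velocity guess from the peak level, duration in seconds."""
    peak = max(abs(s) for s in samples)
    velocity = max(1, min(127, round(peak * 127)))
    return freq_to_midi(estimate_pitch(samples)), velocity, len(samples) / SAMPLE_RATE


recording = sine_tone(440.0, 0.1)
print(describe_note(recording))
```

Even here the pitch estimate is only approximate, because the period must be a whole number of samples; it is the rounding to the nearest semitone that recovers the right note. Real instruments add overtones, vibrato, noise and overlapping notes, and each of those breaks one of the assumptions this sketch relies on.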

In the general case, the variety of musical timbres, harmonic constructions and transitions makes it impossible to create a mathematical algorithm that precisely reconstructs a score from an audio source. It is hard to recognize audio that contains many instruments, drums and percussion, or that has clipped signals, sounds of unstable pitch and background noise. However, in many cases AKoff Music Composer will produce MIDI material that represents the basic melody and chords of the recognized music. You can download and listen to source WAV files and the recognized MIDI results.
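
One reason polyphonic material is so hard can be shown with simple arithmetic. A pitched note contains energy at whole-number multiples of its fundamental frequency, and the notes of a chord share many of those multiples. The sketch below lists the near-coincident harmonics of two notes; the idealized exact harmonics, the equal-tempered tuning with A = 440 Hz, the six-harmonic limit and the 5 Hz tolerance are all assumptions made for illustration.

```python
def note_to_freq(note):
    """Equal-tempered frequency of a MIDI note number (A = 440 Hz is note 69)."""
    return 440.0 * 2 ** ((note - 69) / 12)


def harmonics(note, count=6):
    """Idealized harmonic series: whole-number multiples of the fundamental."""
    fundamental = note_to_freq(note)
    return [fundamental * k for k in range(1, count + 1)]


def shared_partials(note_a, note_b, tolerance_hz=5.0):
    """Pairs of harmonic numbers whose frequencies nearly coincide."""
    pairs = []
    for i, freq_a in enumerate(harmonics(note_a), start=1):
        for j, freq_b in enumerate(harmonics(note_b), start=1):
            if abs(freq_a - freq_b) <= tolerance_hz:
                pairs.append((i, j, round(freq_a, 1), round(freq_b, 1)))
    return pairs


# Middle C (60) against the G a fifth above it (67).
for pair in shared_partials(60, 67):
    print(pair)

# Middle C against the C one octave higher (72).
print(len(shared_partials(60, 72)), "coinciding harmonics an octave apart")
```

In a mixed recording, energy near one of these shared frequencies could belong to either note or to both, and an analysis program has to guess. With an octave the higher note can hide almost entirely inside the harmonics of the lower one.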

WAV and MIDI Formats

The difference between the WAV and MIDI formats lies in how they represent sound and music. WAV is a digital recording of any sound (including speech), while MIDI is essentially a sequence of notes (or MIDI events). The relationship is roughly the same as that between spoken speech and printed text.

WAV format
A WAV file is the recording of a sound wave. It is the mix of all the sounds (instruments, voices, background noises) you could have heard at the moment of recording. So you can record, for example, a human voice in WAV format, but you cannot edit an individual note or change an instrument in music recorded in a WAV file. The standard Windows PCM WAV format contains only Pulse Code Modulation data without compression. Uncompressed PCM keeps every recorded sample, so nothing is lost to compression.
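
To see what "only Pulse Code Modulation data" means in practice, the sketch below writes a tiny PCM WAV file in memory with Python's standard `wave` module and reads it back. The tone, the 8000 Hz sample rate and the function names are assumptions chosen for the example; the point is what the file turns out to contain.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000  # Hz, chosen only to keep the example small


def write_tone(n_frames=80, freq_hz=440.0):
    """Return the bytes of a mono 16-bit PCM WAV file holding a sine tone."""
    samples = [int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
               for i in range(n_frames)]
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack(f"<{n_frames}h", *samples))
    return buffer.getvalue()


def read_samples(wav_bytes):
    """Return (channels, bytes per sample, sample rate, list of sample values)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        frames = wav.readframes(wav.getnframes())
        values = list(struct.unpack(f"<{len(frames) // 2}h", frames))
        return wav.getnchannels(), wav.getsampwidth(), wav.getframerate(), values


wav_bytes = write_tone()
channels, width, rate, values = read_samples(wav_bytes)
print(channels, width, rate, len(values))
print(values[:8])  # amplitude readings only: no notes, no instruments
```

Apart from a short header giving the channel count, sample width and sample rate, the file is just a list of amplitude readings. Nothing in it says which note was played or on what instrument.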

There are many other formats for audio recording. They differ from one another in their compression algorithms and can be treated as a single group. Converting from one of these formats to another is very simple, and many sound editors can do it.

The following is a list of some audio formats with file extensions:

  • Standard Windows PCM waveform (.WAV)
  • Microsoft ADPCM waveform (.WAV)
  • MPEG Audio Layer II and Layer III (.MP2, .MP3)
  • RealAudio (.RA)
  • Sound Blaster voice file format (.VOC)
  • Apple AIFF format (.AIF, .SND)
  • WMA, VQF and many others.

MIDI format
MIDI (Musical Instrument Digital Interface) format is a sequence of commands to control one or more pieces of musical hardware or software such as synthesizers or sequencers. These commands are not sounds; they are instructions to do something (mostly to generate sound). For example: select Instrument #1 (Acoustic Grand Piano), play Note #60 (middle C, labeled C5 in some numbering schemes) with Velocity #127. So you cannot represent, for example, human speech in MIDI format, but you can edit any note or change any instrument in music recorded in a MIDI file.
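
The editability described above follows directly from the data being a list of instructions. The sketch below models a track as plain Python data and then transposes it and swaps the instrument. The `Note` and `Track` classes and both helper functions are hypothetical names invented for this example, not part of any MIDI library; program numbers are written 0-based, so Instrument #1 (Acoustic Grand Piano) is program 0 and Violin is program 40 in the General MIDI list.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Note:
    start_beat: float
    length_beats: float
    pitch: int     # MIDI note number, 0-127
    velocity: int  # 1-127


@dataclass(frozen=True)
class Track:
    program: int   # General MIDI program, 0-based (0 = Acoustic Grand Piano)
    notes: tuple


def transpose(track, semitones):
    """Shift every note by the same interval, clamped to the valid note range."""
    moved = tuple(
        replace(note, pitch=max(0, min(127, note.pitch + semitones)))
        for note in track.notes
    )
    return replace(track, notes=moved)


def change_instrument(track, program):
    """Keep the notes exactly as they are and play them with another sound."""
    return replace(track, program=program)


piano = Track(program=0, notes=(
    Note(0.0, 1.0, 60, 127),  # the example from the text: note 60, velocity 127
    Note(1.0, 1.0, 64, 100),
    Note(2.0, 2.0, 67, 100),
))

violin_up_a_tone = change_instrument(transpose(piano, 2), 40)
print(violin_up_a_tone.program, [note.pitch for note in violin_up_a_tone.notes])
```

Doing the same to a WAV recording would mean altering the waveform itself, which is a signal-processing problem rather than a simple edit.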

MIDI to WAV conversion
Music recorded in MIDI format can easily be transformed into WAV format. You can play the MIDI file in a suitable player and record the reproduced music in a sound editor. The WAV file will be larger than the same music represented in MIDI format. The quality of the music will be determined by the MIDI capabilities of your sound card and by the skill of the musician who created the source MIDI file. There are also programs that convert MIDI files into WAVE using only their own timbres for the MIDI instruments (wavetable synthesis).
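
The "own timbres" approach can be sketched in a few lines, which is a fair measure of how much easier this direction is. The example below is a toy renderer under stated assumptions: every instrument is replaced by a plain sine wave (a real converter would use sampled or wavetable instruments), notes are given as hypothetical `(start, duration, note, velocity)` tuples rather than read from a .mid file, and the output is a mono 16-bit WAV built with the standard `wave` module.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000  # Hz, kept low so the example runs quickly


def note_to_freq(note):
    """Equal-tempered frequency of a MIDI note number (A = 440 Hz is note 69)."""
    return 440.0 * 2 ** ((note - 69) / 12)


def render(notes, total_seconds):
    """Mix (start_s, duration_s, midi_note, velocity) tuples into 16-bit samples."""
    count = int(SAMPLE_RATE * total_seconds)
    mix = [0.0] * count
    for start, duration, note, velocity in notes:
        freq = note_to_freq(note)
        first = int(start * SAMPLE_RATE)
        last = min(count, int((start + duration) * SAMPLE_RATE))
        for i in range(first, last):
            t = (i - first) / SAMPLE_RATE
            # The "timbre" here is a bare sine wave scaled by the velocity.
            mix[i] += (velocity / 127) * math.sin(2 * math.pi * freq * t)
    peak = max(1.0, max(abs(s) for s in mix))  # scale down only if it would clip
    return [int(32767 * s / peak) for s in mix]


def to_wav_bytes(samples):
    """Wrap the samples in a mono 16-bit PCM WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buffer.getvalue()


# A C major chord: three notes sounding together for half a second.
chord = [(0.0, 0.5, 60, 100), (0.0, 0.5, 64, 100), (0.0, 0.5, 67, 100)]
samples = render(chord, 0.5)
wav_bytes = to_wav_bytes(samples)
print(len(chord), "note events ->", len(wav_bytes), "bytes of audio")
```

Three note events turn into several thousand bytes of audio, and in the result the three notes are summed into a single stream of samples. Separating them again is exactly the recognition problem described above.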

Converting back from WAV to MIDI is the music recognition problem, which so far has no fully accurate (100%) solution.