J. Chris Griffin
Recording Tech. for Non-Majors
04 November 2017
Waveforms and Musical Physics
Waveforms are representations of physical wave signals travelling through the air (in this case, sound waves). A wave is a periodically repeating disturbance, such as ocean waves rolling towards a beach, or x-rays. Sound waves specifically occur when something creates regular pressure differences (vibrations) in the air. Three things are needed for a sound to be created, travel, and be heard: a vibrating source, a medium for the sound wave to travel through, and a receiver. For example, a guitar string (the source) vibrates, thereby creating pressure waves. These waves travel through the air (the medium) towards your ear. Your eardrums (the receiver) vibrate in turn and create electrical impulses that are sent to the auditory centres of your brain. The reception of a sound can be delayed by 'storing' the waveform: a microphone and a computer can capture it and replay it later. A waveform should never be distorted or altered without a distinct artistic purpose.
A waveform is defined by two fundamental characteristics: its frequency and its amplitude. Amplitude is the height of the waveform from its centre line to its peak; for a sound wave, it corresponds to the size of the pressure variation in the air. A higher amplitude means the vibrations carry more energy and disturb more air, and are therefore perceived as louder. Amplitude, volume, and gain are closely related terms that are often used interchangeably. Frequency is the number of complete waves in a given time period (one second), measured in Hertz (Hz). For example, if a wave is travelling at 50 Hz, fifty complete cycles pass any given point every second. A lower frequency sounds 'bassier' and lower in pitch, whilst a higher frequency sounds higher pitched.
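The relationship between frequency and amplitude can be sketched in a few lines of code. This is a minimal illustration rather than production audio code; the 48 kHz sample rate is simply a common studio value:

```python
import math

def sine_wave(freq_hz, amplitude, duration_s, sample_rate=48000):
    """Generate samples of a sine wave: amplitude * sin(2*pi*f*t)."""
    n = int(duration_s * sample_rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]

# A 50 Hz wave completes fifty full cycles in one second.
samples = sine_wave(freq_hz=50, amplitude=1.0, duration_s=1.0)
peak = max(abs(s) for s in samples)  # the amplitude, middle to peak
```

Doubling `amplitude` makes the wave louder without changing its pitch; doubling `freq_hz` raises the pitch without changing its loudness.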
The general range of human hearing is 20 Hz to 20 kHz (20,000 Hz). Most healthy children and teenagers can hear up to 20 kHz; as people age, however, they have increasing trouble hearing high frequencies while still hearing low frequencies well. When ear damage occurs, the brain often compensates for the loss of volume in a certain range, so although permanent physical damage is present, you may not notice it after a while. Notching (illustrated below) is a fairly common form of ear damage in which one has a 'notch' in a certain frequency range and, as a result, hears that small range at a lower volume.
Octaves, harmonics, and timbre are all based on frequencies and the mix of frequencies in a waveform. An octave of a note is double or half its frequency: for example, an A4 note has a frequency of 440 Hz. One octave down from that is A3, at 220 Hz, and one octave up is A5, at 880 Hz. A harmonic is a whole-number multiple of a note's fundamental frequency. For example, if the fundamental is a low E2 (about 82 Hz), its harmonics fall at roughly 165 Hz, 247 Hz, 330 Hz, and so on; some of these land on octaves of E, while others (such as the third harmonic, near B) introduce different pitches. Timbre (pronounced tam-ber) is the balance of harmonic frequencies in a note, which defines the character of the sound. For example, an open E string on a Les Paul neck pickup sounds very different from an open E string on a Telecaster bridge pickup. Even though the notes played are exactly the same, the timbre is different, and therefore a different sound is perceived.
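The octave and harmonic relationships above are just arithmetic on frequencies, which a short sketch can make concrete (the note frequencies are standard A440 equal-temperament values, rounded):

```python
A4 = 440.0  # Hz, concert pitch

# Octaves: double or halve the frequency.
octaves = {"A3": A4 / 2, "A4": A4, "A5": A4 * 2}  # 220, 440, 880 Hz

# Harmonics: whole-number multiples of a fundamental.
E2 = 82.41  # Hz, roughly the low E string on a guitar
harmonics = [E2 * n for n in range(1, 6)]
# ~82 Hz (fundamental), ~165 Hz (one octave up), ~247 Hz (near B3, an
# octave plus a fifth up), ~330 Hz (two octaves), ~412 Hz (near G#4)
```

Notice that the even-numbered harmonics land on octaves of E, while the odd-numbered ones introduce new pitches; the balance between them is what shapes timbre.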
The Fletcher/Munson curves (shown above) illustrate a phenomenon in human hearing: at lower volumes, mid-range frequencies are perceived as louder, while higher and lower frequencies are perceived as softer. As volume is increased, the curves level out, and low, mid, and high frequencies (except for a small notch around 4 kHz) are perceived more equally. The principles behind the Fletcher/Munson curves are used in recording and mixing, taking into account how loud a song is meant to be listened to (e.g. Black Sabbath is made to be played loud, whereas Norah Jones may be made for softer listening) and equalising the frequencies until the result sounds pleasant to the human ear.
Gear and Equipment
There are two primary types of microphones: condenser microphones (also known as capacitor microphones) and dynamic microphones. Condenser microphones are the most common type of recording microphone. They use an active circuit, meaning the circuit inside the microphone requires a power input. Condenser microphones (diagram below) have a small gold-coated membrane (the diaphragm) and a charged brass disk (the backplate). Vibrations and changes in air pressure move the membrane; as it moves closer to the backplate, a positive voltage corresponding to the waveform is output, and as it moves away, a negative voltage is output. As a result of this mechanism, condenser microphones are very sensitive, clear, and precise. They also have an extended high-frequency response, although this can make cheaper condensers sound harsh at high frequencies. Their main drawbacks are that they are delicate (very loud sounds can collapse the capsule) and that, because their circuits are active, they require a power source, usually phantom power supplied through the microphone cable.
Another common type is the dynamic microphone (diagram below). These are mostly used on stage because of their reliability and durability, but they are also sometimes used in the studio. Much like a speaker in reverse, a dynamic microphone uses a membrane attached to a coil of wire, which moves back and forth around a magnet. As the coil moves in, a small positive voltage is created relative to the movement, and as it moves out, a small negative voltage is created. Because of this electromagnetic design, dynamic microphones need no power source (they are passive). They are robust and can handle very high sound pressure levels (very loud sounds), so they are popular for live performance, and they are efficient. Unfortunately, their durability and passive nature come at the cost of precision and frequency response (especially at high frequencies). Furthermore, some cheaper, lower-end models distort the original waveform.
Some other less common types of microphones include ribbon microphones, which are favourable for mic'ing guitar and bass amplifiers, and electret microphones, which are very small and cost-efficient, and are common in small electronic devices.
There are many factors to consider when using any sound equipment. The first and most important is that no system is one hundred percent efficient. Every component in a system has flaws, so one must understand those limitations and use them to one's advantage. For example, one might use a twangier microphone for recording harmonica, while using a microphone with a better bass response for recording a double bass.
Another important factor is the proximity effect. When a sound source is very close to a directional microphone, the bass response increases significantly; this is a consequence of how directional (pressure-gradient) capsules respond to nearby sources. When understood properly, this effect can also be used to one's advantage.
The final factors to consider with microphones are their pickup patterns and placement. A microphone's polar pattern describes its sensitivity to sound arriving from different angles around the capsule. Cardioid patterns (seen below) reject most sounds except those coming from directly in front of the microphone. This is useful for reducing feedback from monitors and when the main sound source comes from one direction. Omnidirectional patterns (seen below) pick up sound from all directions with equal clarity and volume; these are a good choice for choirs. Placement matters when trying to reduce feedback and unwanted noise: a microphone's pattern should always be considered, and microphones should be oriented so that they do not pick up sound from monitors.
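Idealised polar patterns have simple mathematical forms. The sketch below assumes the textbook cardioid formula (1 + cos θ)/2, which real microphones only approximate:

```python
import math

def cardioid(theta_deg):
    """Relative sensitivity of an ideal cardioid at angle theta (0 = on-axis)."""
    return (1 + math.cos(math.radians(theta_deg))) / 2

def omni(theta_deg):
    """An ideal omnidirectional pattern is equally sensitive at every angle."""
    return 1.0

front = cardioid(0)    # full sensitivity straight ahead
side = cardioid(90)    # half sensitivity at the sides
rear = cardioid(180)   # almost total rejection directly behind
```

This is why a cardioid vocal microphone pointed away from a wedge monitor picks up far less of the monitor's output, reducing the chance of feedback.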
After the microphone comes the microphone's amplifier (the pre-amp). Its job is to buffer and boost the microphone's signal, as the electrical signal coming directly from a microphone is extremely weak. There are two types of pre-amps: tube pre-amps and solid-state (transistor) pre-amps. They handle attack transients in very different ways, making a signal sound more or less punchy. Solid-state pre-amps reproduce attack transients more cleanly, so they excel when recording drums, where transients are very important. Tube pre-amps are often preferred for guitar and vocals, as they sound warmer and soften harsh plosive (P) and sibilant (S) sounds.
There are many types of speaker and monitor systems, but a few guidelines help in finding the right fit. The most important consideration is accuracy: one should look for a system that faithfully reproduces transients, timbre, and dynamic range with minimal distortion. Fletcher/Munson principles should be applied to match one's musical style and volume preferences (some speakers are made to sound good at low volume, while others sound better when cranked). A fair rule of thumb is that the more expensive a monitor is, the better it is; monitors are not usually marketed to the consumer market, so prices tend to reflect quality rather than marketing.
Unlike monitors, headphones are made for the mass market, and often price is not an indication of quality. Fletcher/Munson curves should also be applied when choosing headphones.
Finally in equipment, there are cables. Higher-quality cables are often made with oxygen-free copper. There are many types of cables; a few common ones are explained here. XLR cables are used by almost all microphones and consoles. They have three conductors, and the waveforms travelling down pins 2 and 3 are mirror images of each other. They must be wired carefully: if the two signal pins are wired out of phase, the mirrored signals will cancel each other out when combined. The next type is the ¼” cable, which comes in balanced and unbalanced forms; balanced cables have much better noise rejection than unbalanced ones. Another type is the TT jack and plug. These were originally used in telephone switchboards, so they are very robust; many professional patch bays use them because they are small yet reliable and strong. Lastly, MIDI is a low-level digital data interface between two devices, which allows for automation and syncing and is very useful for live DJ'ing. There are many, many more types of cables out there.
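The mirror-image trick that balanced XLR cables use can be demonstrated numerically. This is an idealised sketch: the "noise" here is assumed to hit both conductors identically, which real interference only approximately does:

```python
import math

def balanced_receive(pin2, pin3):
    """A balanced input subtracts pin 3 from pin 2, cancelling any noise
    that was picked up equally on both conductors along the cable run."""
    return [hot - cold for hot, cold in zip(pin2, pin3)]

n = 100
signal = [math.sin(2 * math.pi * i / n) for i in range(n)]
noise = [0.3] * n                                 # interference on both wires

pin2 = [s + nz for s, nz in zip(signal, noise)]   # signal + noise
pin3 = [-s + nz for s, nz in zip(signal, noise)]  # inverted signal + noise

out = balanced_receive(pin2, pin3)                # twice the signal, no noise
```

The same arithmetic shows why wiring matters: if one end of the cable accidentally sends the signal un-inverted on both pins, the subtraction cancels the signal itself instead of the noise.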
When processing waveforms, there are two main approaches: analogue and digital. Analogue processing was used for all recording up until the 1980s. It is a brute-force approach that is quite inefficient. An analogue system uses electrical equipment that processes waveforms so that the voltage is analogous to the waveform: at any point in the signal chain, the voltage can be measured and will relate directly to the original waveform. Unfortunately, most analogue storage media (such as tape) break down over time and can introduce distortion and other artefacts, breaking the rule of never altering a waveform without artistic purpose. In addition, because analogue processing is inefficient, the signal must be amplified many times, which can degrade quality and add unwanted artefacts. Despite these drawbacks, analogue processing is still used today (for example, the new Foo Fighters record was made entirely on tape), because in experienced hands an analogue system can subtly change the source audio in a very pleasing way.
Digital processing is the modern, computer-based approach to recording waveforms. Introduced in the 1980s, it uses algorithms to convert the waveform into a set of numbers that represents the original waveform but, unlike in analogue processing, is not itself analogous to it; the final output, however, is analogous to the original waveform. Digital processing uses two distinct devices to convert electrical signals to number sets and back: analogue-to-digital converters (A/D) and digital-to-analogue converters (D/A). Digital processing is extremely efficient compared to analogue processing, with virtually no signal loss, and digital media do not degrade or decay over time. Moreover, its mathematical nature means that complex alterations and effects which would be impossible in the analogue domain can be achieved by applying algorithms to the number sets. Both digital and analogue processing require exactly the same front-end gear.
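The A/D and D/A conversion described above amounts to sampling the waveform and rounding each sample to one of a fixed set of levels. The sketch below assumes a simple 16-bit scheme with samples normalised to the range -1..1; real converters add filtering and other refinements:

```python
import math

def adc(signal, bits=16):
    """Quantise samples in [-1, 1] to integer codes, as an
    analogue-to-digital converter does after sampling."""
    levels = 2 ** (bits - 1)          # 32768 half-levels for 16-bit
    return [round(s * (levels - 1)) for s in signal]

def dac(codes, bits=16):
    """Convert integer codes back to analogue-style values in [-1, 1]."""
    levels = 2 ** (bits - 1)
    return [c / (levels - 1) for c in codes]

rate = 44100                          # the CD sample rate
wave = [math.sin(2 * math.pi * 440 * i / rate) for i in range(rate // 100)]
restored = dac(adc(wave))
error = max(abs(a - b) for a, b in zip(wave, restored))
```

The round trip is not bit-perfect, but the quantisation error at 16 bits is tiny, which is why digital storage loses essentially nothing and never degrades further with age.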
When processing audio signals, effects are often used to alter the waveform.
EQ (equalisation) is an effect which is used to alter certain frequencies of a waveform using multiple filters covering bands of frequencies. EQ is used for many purposes. First and foremost, when using microphones to record, one must consider the fact that microphones are not like human ears, and cannot filter out what is sonically displeasing. For example, EQ can be used to make sibilants from a vocalist sound less harsh. There are also musical considerations when using EQ. Different genres may need to stand out in different ways, and different instruments may need a boost in certain frequency ranges to stand out in a band mix. For example, Metallica often cuts the mid-range frequencies in their guitars, while Eric Clapton boosts his guitar's mids, each to define and fit their musical genres and styles of playing. These purposes are divided into two main categories: Necessary EQ and Creative EQ.
Necessary EQ is EQ used to reduce feedback, to tune a system to accurately reproduce the original signal (by cutting or boosting certain frequencies), and to make up for inefficiencies and flaws in equipment. For example, when using a wedge monitor (speaker), typical vocal microphones are prone to feedback, so a graphic EQ is useful: the offending frequencies can simply be cut.
Creative EQ is not strictly necessary, per se, but is used to improve the sound aesthetically. With creative EQ the result is subjective, and it requires artistic skill and experience; it can make a recording sound bad just as easily as it can make it sound good.
EQ works by electrically changing the level of different frequencies, independent of the overall volume. Graphic EQs have 30 or more sliders which together cover the entire range from 20 Hz to 20 kHz; each slider controls one band of frequencies. Parametric (sweepable) EQs have knobs rather than sliders: frequencies around a movable centre point are boosted or cut, with frequencies closer to the centre affected more than those further away. A fully parametric EQ lets the user vary the width of the affected band, known as Q, while semi-parametric EQs have no Q control. A narrow Q sounds sharp and surgical, whereas a wide Q is less intrusive and more like a broad sweep, much like the sweep of a guitar wah-wah pedal.
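A parametric EQ band is commonly implemented digitally as a "peaking" biquad filter. The coefficient formulas below follow the widely used Audio EQ Cookbook (Robert Bristow-Johnson); this is one common digital approach, not the only way EQ is built:

```python
import math

def peaking_eq(f0, gain_db, Q, fs=48000):
    """Biquad coefficients for one peaking EQ band (audio-EQ-cookbook form).
    f0 = centre frequency, gain_db = boost/cut at f0, Q = bandwidth control."""
    A = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * Q)
    b = [1 + alpha * A, -2 * math.cos(w0), 1 - alpha * A]
    a = [1 + alpha / A, -2 * math.cos(w0), 1 - alpha / A]
    return b, a

def gain_at(f, b, a, fs=48000):
    """Magnitude of the filter's frequency response at f, in dB."""
    w = 2 * math.pi * f / fs
    num = sum(bk * complex(math.cos(-k * w), math.sin(-k * w))
              for k, bk in enumerate(b))
    den = sum(ak * complex(math.cos(-k * w), math.sin(-k * w))
              for k, ak in enumerate(a))
    return 20 * math.log10(abs(num / den))

b, a = peaking_eq(f0=1000, gain_db=6.0, Q=1.0)  # +6 dB centred at 1 kHz
```

Raising Q narrows the boosted band (more surgical), while lowering it widens the band into a gentle sweep, exactly as described above; frequencies far from the centre pass through at unity gain.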
The Fletcher/Munson curves are a very important consideration when applying EQ. As a result, the EQ of a song can often be dictated by its estimated listening volume, and therefore by its genre. A simple example of EQ is the 'loudness' button on a home or car stereo: an automatic EQ whose boost flattens out as the volume increases, reflecting the Fletcher/Munson curves.
Compression is another type of audio effect. To understand compressors, we must first understand dynamic range: the range of volume from the softest part of a signal to the loudest. As seen in the diagram below, dynamic range is not the same as maximum amplitude. As an example of dynamic range in action: if you are listening to a song on an airplane during take-off and have to keep adjusting the volume to hear it over the engines, the song has a high dynamic range. Listeners usually find a very large dynamic range displeasing. Compressors remove the need to keep adjusting the volume by reducing the dynamic range.
Compression occurs in two steps. First, the effect limits the dynamic range by bringing loud passages down in volume (soft parts are not brought up). Second, the volume of the track as a whole is increased. This causes the soft passages to be perceived as louder, while the loud parts have actually been compressed. Because the loud parts are attenuated and only a single stage of make-up gain is applied, this process usually causes no audible loss or degradation of sound quality.
Compressors have several parameters that must be set by the user. First, the threshold is adjusted: this tells the circuit the amplitude (on the level axis) at which compression should begin. Next, the ratio sets how strongly the parts above the threshold are reduced. Attack (on the time axis) specifies how long after the waveform crosses the threshold the compressor takes to reach full compression; a slower attack lets initial transients through, which can actually preserve or even accentuate a track's punch. The release (also on the time axis) controls how long the compressor keeps reducing gain after the signal falls back below the threshold. Lastly, the make-up gain dictates the final boost or cut applied to the output.
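The threshold, ratio, and make-up gain steps can be sketched as a simple static compressor. Real compressors smooth the level with attack and release envelopes; this sketch applies the compression instantly, sample by sample:

```python
def compress(samples, threshold, ratio, makeup_gain=1.0):
    """Static (instant attack/release) compressor sketch. Levels above
    `threshold` are reduced by `ratio`; the whole result is then
    multiplied by `makeup_gain` to bring the track back up."""
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            # The amount over the threshold is divided by the ratio.
            level = threshold + (level - threshold) / ratio
        out.append(makeup_gain * level * (1 if s >= 0 else -1))
    return out

quiet, loud = 0.1, 1.0
track = [quiet, -quiet, loud, -loud]
squeezed = compress(track, threshold=0.5, ratio=4.0, makeup_gain=1.5)
# Loud peaks: 0.5 + (1.0 - 0.5)/4 = 0.625, then * 1.5 = 0.9375
# Quiet parts: 0.1 * 1.5 = 0.15 -> the dynamic range has shrunk
```

After processing, the quiet parts sit higher and the loud parts lower: exactly the two-step attenuate-then-boost behaviour described above.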