Some Digital Audio Theory
Meloware’s Antique Phonograph Record Archive

   Sound is nothing more than waves of pressure and vibration which change over time. Acoustic recording technology took advantage of this fact by causing these pressure waves to vibrate a needle back and forth, or up and down, and then using this motion to cut into a soft material, such as wax. If we look at an old record under a large magnification we might see something like this:

You should notice the variation in the lines from nearly straight, to quite rippled, indicating the ranges of loudness and pitch. We could represent this recording as a graph in this way:

    The wavy line is what we call an analog of the recorded sound. The straight line is the position the groove would be at, if everything was perfectly silent. Above and below this straight line we have a range of free motion for the needle. We have also included the series of dots at the bottom of the graph to represent even increments of time.

    Never allow your recorded signal to hit the ‘ceiling’ or ‘floor’ of this allowed free range. Making the most use of the free range is very important for a good dynamic range of the recording, but allowing our audio to exceed this range will result in very disturbing distortion.
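To make the ‘ceiling’ and ‘floor’ concrete, here is a minimal Python sketch that counts the samples touching either limit. It assumes signed 16-bit samples, so the limits are 32767 and -32768. The names CEILING, FLOOR and count_clipped are our own illustration and do not come from any particular sound editor.

```python
# Assumption: signed 16-bit samples, the format used on audio CDs.
CEILING = 32767    # largest value a signed 16-bit sample can hold
FLOOR = -32768     # smallest value a signed 16-bit sample can hold


def count_clipped(samples, ceiling=CEILING, floor=FLOOR):
    """Return how many samples sit at the ceiling or the floor."""
    return sum(1 for s in samples if s >= ceiling or s <= floor)


quiet = [0, 1200, -900, 16000, -15000]
too_loud = [0, 32767, 32767, -32768, 5000]

print(count_clipped(quiet))      # 0
print(count_clipped(too_loud))   # 3
```

A count of zero does not prove a recording is clean, but any run of samples stuck at the limit is a strong hint that the signal was recorded too loud.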

Dynamic range is the difference between the loudest and quietest sounds. The unit of measurement is called the decibel (dB). If you examine the detailed record issue in the catalog of this archive, you will notice that we provide a pair of negative numbers in the listing. This is a measurement of how quiet the silent portions of the record are, at both the beginning and the end of the record, and is a good indication of the record’s condition. Zero is considered to be the loudest sound, and the negative decibel value is the quietest. An increase of 3 dB is a doubling of intensity, but will seem like only a noticeable increase to the ear. A typical new record from 1908 would have a dynamic range of around 20 dB. A modern audio CD has a range of 90 dB. Many people complain about old records being ‘scratchy’, when in fact they are objecting to their unfamiliarity with the normal noise floor for a record of that age.
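The decibel arithmetic can be sketched in a few lines of Python. The two conversion functions follow the standard definition of the decibel for intensity (power) ratios. The level_db function is only an assumed way of measuring a quiet passage, as an RMS level relative to a 16-bit full scale; we are not claiming it is the method used for the catalog figures.

```python
import math

FULL_SCALE = 32767  # assumption: loudest possible value of a signed 16-bit sample


def power_ratio_to_db(ratio):
    """Express a ratio of two intensities (powers) in decibels."""
    return 10.0 * math.log10(ratio)


def db_to_power_ratio(db):
    """Turn a decibel figure back into an intensity (power) ratio."""
    return 10.0 ** (db / 10.0)


def level_db(samples, full_scale=FULL_SCALE):
    """RMS level of a passage relative to full scale; 0 dB is the loudest."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")  # perfect digital silence
    # Sample values are amplitudes, and power goes as amplitude squared,
    # so the factor here is 20 rather than 10.
    return 20.0 * math.log10(rms / full_scale)


print(round(power_ratio_to_db(2.0), 2))       # 3.01
print(round(db_to_power_ratio(20.0)))         # 100
print(round(level_db([328, -328] * 50), 1))   # -40.0
```

Read this way, a 20 dB dynamic range means the loudest passage carries about 100 times the intensity of the noise floor.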

Let’s take a look at what we will normally expect to see with a typical sound recorder and editor. The first is an example of what an entire 3 minute recording might look like:

The white portion of this picture represents the recording. There is too much of it displayed in this window to really see what’s going on. Notice that the white signal never completely reaches the top or bottom of the window. This recording will not have any ‘clipped’ samples, which are the cause of severe distortion. We should also notice that most of the dynamic range is utilized well, but there is still a little room above and below, which would have allowed us to record it a bit louder. These free margins also allow us to boost certain frequencies later, in order to enhance clarity.

If we zoom in on a segment of time, we begin to see more detail.


Each white cluster is actually a sung word or a note played on the record. Notice the minimum band of white along the centerline. This is the surface noise of the record. This particular record is in very good condition. You can clearly see the separation between the notes and words, and the centerline is fairly thin. You will NOT see scattered vertical lines across this picture. Clicks and scratches appear as sudden, sharp vertical lines.

 At this level of magnification we can see how the volumes of sounds change with time. The long, broad areas are sustained sounds, while the narrow bursts are sounds of short duration. The larger an event appears vertically, the louder it is. These are known as volume envelopes. The term ‘attack’ is used to describe the time for a new sound to reach its maximum volume. The ‘sustain’ is the amount of time a sound maintains its volume, and the ‘decay’ is the time the sound takes to fade away. A drum hit will have a very rapid attack, a very short sustain, and a very rapid decay. A soft violin string may have a much longer attack, a pronounced sustain, and a graceful decay time. Clicks and pops will have extremely rapid attacks and decays, and virtually no sustain.
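A rough Python sketch of a volume envelope follows. It takes the loudest sample in each short block of time, which is roughly what an editor draws at this zoom level. The function names and the tiny made-up sample lists are hypothetical, chosen only to contrast a click with a slower, violin-like swell.

```python
def peak_envelope(samples, block_size):
    """Loudest absolute sample value in each consecutive block of samples."""
    return [max(abs(s) for s in samples[i:i + block_size])
            for i in range(0, len(samples), block_size)]


def attack_blocks(envelope):
    """Number of blocks the sound takes to reach its first maximum."""
    return envelope.index(max(envelope))


# Made-up data: a click, and a sound that swells, holds, then fades.
click = [0, 0, 9000, -200, 0, 0, 0, 0]
violin = [100, -300, 500, -800, 1200, -1500, 1500, -1400, 900, -400, 100, 0]

print(peak_envelope(click, 2))                   # [0, 9000, 0, 0]
print(peak_envelope(violin, 2))                  # [300, 800, 1500, 1500, 900, 100]
print(attack_blocks(peak_envelope(violin, 2)))   # 2
```

The click jumps to its peak and vanishes within a single block, while the swell shows a rise, a held maximum, and a gradual fall.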

Zooming in to even smaller time segments reveals the essence and pitch of the sounds themselves. It will be in this micro temporal world where most of our time is spent in restoration.

We now are looking at the complex vibrations which make up sounds. As a sound swings from its starting point, passes back and forth between its maximum and minimum ranges, and returns to its starting point, we refer to this as a complete period or cycle. The greater the range of the swing, the louder the sound is. The amount of time needed for a sound to complete a cycle is called the period, and the number of cycles completed each second is called the frequency. An increase in a sound’s number of cycles per second will raise the pitch of the sound. In the olden days we just called the frequency cycles per second, or just cycles. Thousands per second would be called kilocycles. Today we use the term hertz (Hz), pronounced "hurts".

 Notice that there is a pattern to these waves. Each cycle is not identical to the next, but in this case seems to appear as groups of three (one tall, two shorter...one tall, two shorter). My own term for one of these groups of three is a ‘phrase’. Looking for phrases can be very important later in restoration. If you are lucky, a phrase somewhere in the recording might be badly damaged, but the one just before, or immediately after it might be in perfect condition. It is often possible to copy the good one and paste it right over the bad.
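The copy-and-paste repair can be sketched in Python as a simple list operation. This is a minimal illustration, assuming the good and damaged phrases have exactly the same length in samples, which real recordings rarely grant us so neatly. The function name patch_phrase and the sample values are hypothetical.

```python
def patch_phrase(samples, good_start, bad_start, length):
    """Return a copy of samples with the damaged phrase replaced by a good one."""
    if min(good_start, bad_start) < 0 or max(good_start, bad_start) + length > len(samples):
        raise ValueError("phrase lies outside the recording")
    patched = list(samples)  # leave the original recording untouched
    patched[bad_start:bad_start + length] = samples[good_start:good_start + length]
    return patched


phrase = [0, 9, 0, -9, 0, 4, 0, -4, 0, 4, 0, -4]        # one tall, two shorter
damaged = [0, 9, 0, -9, 30, -30, 0, -4, 0, 4, 0, -4]    # a click in the middle
recording = phrase + damaged + phrase

repaired = patch_phrase(recording, good_start=0, bad_start=12, length=12)
print(repaired == phrase * 3)   # True
```

Working on a copy keeps the original transfer intact, so a repair that sounds wrong can simply be thrown away.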

 The digital world stores everything as numbers built from ones and zeros. So far, we have been looking at analog style charts to explain things. Fortunately, we will mostly stay with analog models while doing our work. We should, however, understand what is going on in the digital world. Decisions we make about our archive are affected by it.

Sound and time both exist as a continuously changing form. We think of seconds as single units of time, but seconds themselves can always be divided into smaller parts. Digital computers must, however, do everything in steps.

 When you make a digital recording, you must first decide how many steps per second you want to use, and how detailed a number you wish to store for each step. The first determines the highest frequency you will be able to record; the second defines the dynamic range of volume possible.

 The picture above shows a time segment so short that the individual samples are displayed. The smooth line connecting the small rectangles is only for our understanding and does not really exist. The numbered ruler at the bottom shows the count (or time) of each sample, and the position of the sample on the chart is the sample’s value (such as size or volume). When a digital recording is played, the series of numbers is sent at the defined rate to a converter, which turns them into an electrical voltage that is directly related to each sample’s value. It is this new electrical signal which will drive our speakers or headphones.

 Audio CDs are recorded at a sample rate of 44.1 kHz (44,100 samples per second), with each sample being 16 bits, which means that each sample can represent one of 65,536 unique levels. Now, our old records probably don’t need to be recorded this accurately, but it is a nice standard to hold to. The files produced at this standard have the greatest compatibility with various software, the storage media is affordable, and it would be very hard to argue that anyone was losing sounds and qualities from these old records which might otherwise be saved.
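The consequences of these two decisions can be worked out with a little Python arithmetic. The sketch below uses the standard rule that the highest recordable frequency is just under half the sample rate, and treats the dynamic range as the ratio between the largest and smallest step. That ratio is an idealized figure, somewhat higher than what is quoted for real equipment. The function names and the two-channel storage example are our own assumptions.

```python
import math


def highest_frequency_hz(sample_rate):
    """Upper frequency limit: just under half the sample rate."""
    return sample_rate / 2.0


def level_count(bits):
    """Number of distinct values a sample of the given size can hold."""
    return 2 ** bits


def ideal_dynamic_range_db(bits):
    """Ratio of the largest to the smallest step, in decibels (idealized)."""
    return 20.0 * math.log10(2 ** bits)


def bytes_per_minute(sample_rate, bits, channels):
    """Uncompressed storage needed for one minute of audio."""
    return sample_rate * (bits // 8) * channels * 60


print(highest_frequency_hz(44100))            # 22050.0
print(level_count(16))                        # 65536
print(round(ideal_dynamic_range_db(16), 1))   # 96.3
print(bytes_per_minute(44100, 16, 2))         # 10584000
```

At roughly ten megabytes per stereo minute, a three minute record side comes to a little over thirty megabytes before any compression.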
 
 
