Sound

Introduction

I made a sound file which contains wind, a meadowlark singing, a bell dinging and me saying “To be or not to be, that is the question”. I open it in Praat to the following screen; you can download it yourself from here if you want to follow along at home.

_images/sound-4sounds.png

Fig. 5 Four sounds

From http://soundbible.com/, 2190-Front-Desk-Bell and 2180-Meadowlark by Daniel Simon, 1234-Wind by Mike Koenig.

Questions

Which graph A through D corresponds to which sound?

The physics of sound

Sound is generally considered to have three physical attributes: frequency, strength or intensity, and quantity.

Frequency

Imagine that you pluck the string on a guitar. It will vibrate back and forth somewhat like the curve in the top part of Fig. 6:

_images/sound-String2Sine.png

Fig. 6 Vibrating string and the waveform it produces

Author’s image.

This oscillation is conventionally represented as in the bottom half of the figure, in which the displacement of the string from side to side is measured as the displacement of the wave from bottom to top, and time flows from left to right. Since the displacement repeats itself, it is said to oscillate. A single cycle of the oscillation is measured from one peak of the waveform to the next. The number of cycles that occur during some unit of time constitutes the frequency of the oscillation. The basic unit of frequency, the hertz (Hz), is defined as one cycle per second.

A simple illustration can be found in the next diagram of the graphs of two sine functions:

_images/sound-SineWaves.png

Fig. 7 Graph of two sine functions with different frequencies.

Author’s image.

The one marked with o’s, like beads on a necklace, completes an entire cycle in 0.628 s. The other wave, marked with x’s so that it looks like barbed wire, completes two cycles in this period.

As you can see, the shorter the cycle, the higher the frequency, which defines an inverse relation. Mathematically, where f stands for frequency and T for the time period, we have:

\[ f = \frac{1}{T} \]

You can calculate the frequency of the two examples as \(1/0.628 = 1.59\) Hz and, since the second wave completes a cycle in half the time, \(1/0.314 = 3.18\) Hz.
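If you would like to check the arithmetic yourself, here is a minimal Python sketch of the relation f = 1/T. The function name `frequency_from_period` is my own invention for illustration, and the periods are the ones read off Fig. 7.

```python
def frequency_from_period(period_s: float) -> float:
    """Return the frequency in hertz of a cycle lasting period_s seconds."""
    if period_s <= 0:
        raise ValueError("period must be positive")
    return 1.0 / period_s


# The 'o' wave takes 0.628 s per cycle. The 'x' wave fits two cycles
# into that same time, so each of its cycles lasts 0.628 / 2 = 0.314 s.
slow = frequency_from_period(0.628)
fast = frequency_from_period(0.628 / 2)

print(f"{slow:.2f} Hz")  # 1.59 Hz
print(f"{fast:.2f} Hz")  # 3.18 Hz
```

Halving the period doubles the frequency, which is the inverse relation at work.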

This is simple enough, and if this were all there was, then the world would be an acoustically rather dull place.

It turns out that not only can a guitar string vibrate at the frequency of the entire string, but segments of a string can vibrate independently. Fig. 8 depicts how this is possible. Each half of the string completes a cycle in half the time of the whole string:

_images/sound-VibratingSegments.png

Fig. 8 Vibrating segments of a string

Author’s image.

The result is that the halves have twice the frequency of the whole.

This can go on and on:

_images/sound-Harmonic_partials_on_strings.png

Fig. 9 Vibration and standing waves in a string

The vibration of the whole string is so fundamental that it has its own name, the fundamental frequency, and its own abbreviation, \(f_0\). The higher frequencies that are multiples of the fundamental (fractions of the string) are usually known as harmonics and are indicated with subscript integers, e.g. \(f_1, f_2\), etc.
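To make the multiples concrete, here is a small Python sketch that lists a fundamental together with the frequencies above it. The function name `harmonic_series` and the 110 Hz fundamental (roughly the open A string of a guitar) are illustrative assumptions of mine, not values taken from the figures.

```python
def harmonic_series(f0_hz: float, count: int) -> list[float]:
    """Return f0 followed by its whole-number multiples, count values in all."""
    return [f0_hz * n for n in range(1, count + 1)]


# Position 0 in the list plays the role of f_0, position 1 of f_1, and so on.
series = harmonic_series(110.0, 4)
print(series)  # [110.0, 220.0, 330.0, 440.0]
```

Each entry corresponds to the string vibrating in one more segment than the entry before it.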

Nature makes this deliciously complex by allowing the string to vibrate at all these frequencies at the same time. Fig. 10 displays the outcome of superimposing both frequencies on the string and the waveform, a phenomenon known as superposition:

_images/sound-Superposition.png

Fig. 10 Superposition of whole and half oscillations of a string

Author’s image.
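Superposition is nothing more than adding the displacements of the individual waves at each instant. The sketch below does this for a whole-string vibration plus a weaker half-string one. The function name `superpose`, the 1 Hz fundamental and the amplitudes are all assumptions chosen for illustration; they are not measured from Fig. 10.

```python
import math


def superpose(t: float, f0_hz: float, amplitudes: list[float]) -> float:
    """Sum sine waves at f0, 2*f0, 3*f0, ... evaluated at time t (seconds)."""
    return sum(
        a * math.sin(2 * math.pi * f0_hz * (k + 1) * t)
        for k, a in enumerate(amplitudes)
    )


f0 = 1.0           # illustrative fundamental: one cycle per second
amps = [1.0, 0.5]  # whole-string vibration plus a weaker half-string one

# Sample one full cycle of the fundamental at nine evenly spaced instants.
samples = [superpose(n / 8, f0, amps) for n in range(9)]
print([round(s, 3) for s in samples])
```

Plotting the samples at a finer resolution would give a lopsided wave like the one at the bottom of the figure, which still repeats once per cycle of the fundamental.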

Warning

The discussion of speech production has been moved to Auditory cortex.

Intensity

Intensity is the attribute of a sound that allows it to be ordered on a scale from quiet to loud. Sound intensity is also loosely called sound strength, and it is often conflated with sound pressure and sound power, which are related but distinct quantities. It is usually measured as the sound pressure level or SPL in decibels. Let’s unpack that.

A decibel is one tenth of a bel. A bel expresses the ratio of an absolute measure of power or intensity to a reference value. For sound pressure level, the reference value is the threshold of human hearing. The ratio is measured logarithmically, so that an increase of 10 dB means that intensity has increased 10 times, and an increase of 20 dB means that it has increased 100 times. The ‘thermometer’ below plots some common sounds on a decibel scale in terms of their loudness. There are many others that you can find by googling “dB scale hearing”:

_images/sound-dbLevels.png

Fig. 11 The intensity of some common sounds in dB.
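The logarithmic bookkeeping is easier to trust once you have tried it. Here is a minimal Python sketch that converts between intensity ratios and decibels. The function names are my own, and the reference value of 1.0 is just a stand-in for the threshold of hearing in whatever unit you are measuring.

```python
import math


def intensity_level_db(intensity: float, reference: float) -> float:
    """Level in decibels of an intensity relative to a reference intensity."""
    return 10.0 * math.log10(intensity / reference)


def intensity_ratio(level_db: float) -> float:
    """How many times the reference intensity a given decibel level is."""
    return 10.0 ** (level_db / 10.0)


print(intensity_level_db(10.0, 1.0))   # 10.0
print(intensity_level_db(100.0, 1.0))  # 20.0
print(intensity_ratio(30.0))           # 1000.0
```

Every additional 10 dB multiplies the intensity by ten. If you work with sound pressure rather than intensity, the multiplier in front of the logarithm is 20 instead of 10, because intensity is proportional to pressure squared.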

Quantity

The quantity of a sound is how long it lasts. It is measured in seconds. And that is all there is to say about it.

Summary

Table 8 Summary of physical attributes of sound
Attribute   Unit
Quantity    second
Frequency   hertz (cycles per second)
Intensity   decibel (dB SPL)

The psychology of sound

Even though sound is measured physically in three ways, it is perceived in at least six: pitch, loudness, phase, direction, distance, and timbre. We touch on each.

Pitch is the perception of a sound as being high or low. It derives from the sound’s frequency, so the two terms are often used interchangeably.

Other types of frequencies

Even though this chapter is concerned with the physics of sound, other domains can also be characterized in similar terms, so it is convenient to take them up now.

The spectrum of visible light

To get us started, do you know the answer to this question?

Question

What produces a rainbow?

As explained at the beginning of the fourteenth century by Theodoric of Freiberg and Kamāl al-Dīn al-Fārisī – independently – it is because droplets of water act like prisms that refract white light into its component frequencies, as seen in this single image taken from the magnificent animation in Wikipedia’s Prism article:

_images/sound-PrismSpectrum.png

Fig. 12 A triangular prism refracts white light into the colors of the rainbow

Another question:

Question

Which color has the lowest frequency and which, the highest?

One of the few things that I remember that I learned in high school was ROYGBIV, the acronym for the ordering of the colors of the rainbow as “Red, Orange, Yellow, Green, Blue, Indigo, Violet”. The image shows you why this is the order: the prism splits the colors by frequency, with the lowest frequency/longest wavelength red at the top and highest frequency/shortest wavelength violet at the bottom. Such a collection or continuum of frequencies is known as a spectrum. The image below paints a more traditional picture of the spectrum of visible light:

_images/sound-VisibleSpectrum.png

Fig. 13 A linear representation of the visible light spectrum.

The important thing to remember is that red has the lowest frequency and violet the highest.
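Frequency and wavelength are tied together by the speed of light: frequency equals speed divided by wavelength. The Python sketch below uses that relation to put rough numbers on the two ends of the spectrum. The function name `light_frequency_thz` is mine, and the wavelengths of 700 nm for red and 400 nm for violet are approximate values that I am assuming; the exact boundaries of visible light vary from source to source.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre


def light_frequency_thz(wavelength_nm: float) -> float:
    """Frequency in terahertz of light with the given vacuum wavelength."""
    wavelength_m = wavelength_nm * 1e-9
    return SPEED_OF_LIGHT_M_PER_S / wavelength_m / 1e12


# Approximate ends of the visible range (assumed values, see above).
print(round(light_frequency_thz(700)))  # 428 (red)
print(round(light_frequency_thz(400)))  # 749 (violet)
```

The longer wavelength gives the lower frequency, the same inverse relation that we saw between period and frequency for sound.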

Question

Can you explain the difference between the prefixes in ‘infra-red’ and ‘ultra-violet’?

Spatial frequency

Question

What is this an image of?

_images/sound-Dali2.png

It’s not what you think, which you will see by clicking on this link to Gala Contemplating the Mediterranean Sea which at Twenty Meters Becomes the Portrait of Abraham Lincoln - Homage to Rothko (Second Version).

You probably thought it was a corruption or pixelization of Abraham Lincoln’s face, like the one on a five-dollar bill. Actually, it’s a painting designed to … well, you get the idea from its title.

The effect is called the “Lincoln illusion” in an article by Leon Harmon and Bela Julesz in Science and in fact graces its cover, Science 180:4091, 15 June 1973, though its psychological term is block masking. In November of the same year, Scientific American published an article by Harmon on the same topic, The recognition of faces, whose on-line version shares a pixelization of the Mona Lisa.

Three years later, the Science article inspired the painter, Salvador Dalí, to produce the work with which we began this topic.

more My discussion is based on Michael Bach’s Face in blocks.

more See the article in Frontiers in Human Neuroscience, Marvels of illusion: illusion and perception in the art of Salvador Dali, for more insight on Dalí’s art.

more There has been a tremendous amount of subsequent research on this topic, but you might enjoy this short essay about an artist who has taken the technique in a different direction, Chuck Close and the Fragmented Image.

So what does all this have to do with frequency? Can you answer the question in the title of the following table?

Question

Table 9 Low or high frequency?
Grating 1 Grating 2
_images/sound-HiSpatFreq1.png _images/sound-LoSpatFreq1.png

It would help to answer the question if the two images could be converted to some representation of frequency that we are already familiar with. Imagine that the amount of light reflected in each grating could be measured. The black bars would have the lowest measure (they reflect little light), while the white gaps between them would have the highest measure (they reflect the most light). There may be a fuzzy transition between the two extremes. With this convention, the gratings can be turned into waves as so:

Table 10 Conversion of gratings to waves
Grating to wave 1 Grating to wave 2
_images/sound-HiSpatFreq2.png _images/sound-LoSpatFreq2.png
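The conversion from grating to wave can be sketched in a few lines of Python. Here a smooth cosine stands in for the fuzzy transition between black bars and white gaps. The function name `grating_luminance`, the 0-to-1 luminance scale and the choice of four cycles are assumptions made for illustration, not measurements of the gratings in the tables.

```python
import math


def grating_luminance(x: float, cycles: float) -> float:
    """Luminance (0 = black, 1 = white) at position x across the image.

    x runs from 0 (left edge) to 1 (right edge); cycles is the number of
    bar-plus-gap repetitions across the whole width.
    """
    return 0.5 + 0.5 * math.cos(2 * math.pi * cycles * x)


# Crude text rendering of one row of a 4-cycle grating, 32 samples wide.
# Each sample is taken at the middle of its slot; dark samples print as '#'.
row = "".join(
    "#" if grating_luminance((i + 0.5) / 32, 4) < 0.5 else " "
    for i in range(32)
)
print(row)
```

The printed row alternates between runs of blanks and runs of `#`, four dark bars in all, which is the grating recovered from its wave.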

Question

What is the frequency of each wave above?

I hope you tried to answer the question, but you can’t. Recall that we measured acoustic waves in cycles per second. Table 10 has plenty of cycles (extents from peak to peak), but … how would you time a static image in seconds?

The solution that researchers in psychophysics have come up with is to divide the visual field into degrees based on a standard distance from a viewer, usually six meters:

_images/sound-VisualFieldHoriz.png

Fig. 14 The diagram above depicts the normal horizontal field of vision, including the location of the blind spots for both eyes. In a normal person, the field of vision should span a total width of 190 degrees.

Table 11 Conversion of gratings to waves subtending a visual angle
Grating to wave 1 Grating to wave 2
_images/sound-HiSpatFreq3.png _images/sound-LoSpatFreq3.png

The result is that each image now has a ‘denominator’, the degrees that it subtends in the visual field - let us call it 60° in the disproportionate world of Table 11. Grating 1 has 7 cycles per 60°, or 7/60 = 0.12 cycles/degree. Grating 2 has 4 cycles per 60°, or 4/60 = 0.07 cycles/degree.
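The division is simple enough to check by hand, but here it is as a minimal Python sketch. The function name `cycles_per_degree` is my own; the cycle counts and the 60° angle are the ones used in the paragraph above.

```python
def cycles_per_degree(cycles: float, visual_angle_deg: float) -> float:
    """Spatial frequency of a grating spanning the given visual angle."""
    if visual_angle_deg <= 0:
        raise ValueError("visual angle must be positive")
    return cycles / visual_angle_deg


grating_1 = cycles_per_degree(7, 60)
grating_2 = cycles_per_degree(4, 60)

print(f"{grating_1:.2f} cycles/degree")  # 0.12 cycles/degree
print(f"{grating_2:.2f} cycles/degree")  # 0.07 cycles/degree
```

Degrees of visual angle play the role for a static image that seconds play for a sound.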

Question

Which one has the higher frequency?

Which leads back to what I was trying to get to with the Lincoln illusion:

Question

How is the left image transformed into the ones on its right?

_images/sound-LincolnIllusion.png

Answering the following question finally leads to why I bring all of this up:

Question

Describe the two images below in your own words. Can you analyze them in terms of spatial frequency?

_images/sound-MfromZ.png

But you won’t get to see the answer for a while.

more For more than you want to know about how spatial frequency is used to measure how well we see, see Visual Acuity by Michael Kalloniatis and Charles Luu.

References

  • Harmon, Leon D., and Bela Julesz. “Masking in Visual Recognition: Effects of Two-Dimensional Filtered Noise.” Science 180.4091 (1973): 1194.
  • Harmon, Leon D. “The Recognition of Faces.” Scientific American 229.5 (1973): 71–82.
  • Martinez-Conde, Susana, Dave Conley et al. “Marvels of Illusion: Illusion and Perception in the Art of Salvador Dali.” Frontiers in Human Neuroscience 9 (2015): 496.

Powerpoint and podcast

The next topic

Come to class having read Speech sounds and their articulation and answered the questions.


Last edited: Aug 28, 2019