This section concerns fundamental frequency (abbreviated f0 or F0), the acoustic basis of tone, pitch accent, and intonation. Practice calculating f0 directly from the waveform of a sound file in Praat. Learn how to use the Praat spectral slice (= power spectrum in AAP) function and a narrow-band spectrogram to analyze the components of a complex wave and measure f0.
This will give you the time (t1) and fundamental frequency (f0) at the cursor point. Now you can record F0 by simply choosing the relevant point in the pitch track and hitting F12 (for log 1) or Shift-F12 (for log 2). Praat allows you to automatically move your cursor to the F0 max or min in a selection. This script measures Pitch (Fundamental Frequency or F0), the standard deviation thereof, jitter, shimmer, and harmonics to noise ratio (HNR).
Before using Praat for sound analysis, we need to be clear about what information Praat can give us. Table 1.1 presents some major acoustic variables commonly used to analyze speech sounds. (See Figure 1.49 for a visual presentation of the variables.)
Figure 1.49
Table 1.1
If you want to extract a section of a sound (usually a single word or vowel) into a different Sound object for analysis, you can
- Select the section of sound by cursor
- Click File → Extract Selected Sound (preserve times)
This creates a new Sound object in the Objects window, containing just the selected part of the original sound.
A. Showing the Spectrogram
Normally, the waveform and spectrogram are displayed automatically when you select a file and click 'View and Edit', as in Figure 1.50.
Figure 1.50
B. Adjusting the Spectrogram Settings
The most important settings here are the window length and view range.
1) View range (Hz)
View range decides how much of the spectrum is shown. For speech, we normally set the range from 0 to 5,000 or 6,000 Hz, but for examining fricatives, we might need to set it as high as 15,000 Hz. For music, we may need to focus on the area from 100 to 2,000 Hz. (Revised from Styler, 2012)
You can adjust the View range by clicking 'Spectrum' → 'Spectrogram Settings'
Figure 1.51
2) Window length
Praat can display both broadband and narrowband spectrograms, depending on the window length. The shorter the window length, the larger the bandwidth (Bandwidth = 1.299 / window length). There is no clear-cut boundary between the two types. If the window length is around 3-5 ms (a bandwidth of roughly 260-430 Hz by this formula), the resulting spectrogram is called 'wideband' or 'broadband'. With a window length of around 20-30 ms (a bandwidth of roughly 43-65 Hz), the spectrogram is called 'narrowband'. Wideband spectrograms are used to observe formant structure, while narrowband spectrograms reveal harmonic structure (pitch information).
- A broadband spectrogram (window length: 0.005 s) is used to observe the formant structure of a sound; it is the default setting in Praat (see Figure 1.52).
Figure 1.52
- A narrowband spectrogram (window length: 0.025 s) can be used to look at the harmonic structure (F0 / pitch information) (see Figure 1.53).
Figure 1.53
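The relation between window length and bandwidth quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the constant 1.299 is taken from the text, and the function name is made up for illustration.

```python
# Bandwidth constant as quoted in the text: Bandwidth = 1.299 / window length.
BANDWIDTH_CONSTANT = 1.299


def spectrogram_bandwidth_hz(window_length_s: float) -> float:
    """Return the analysis bandwidth in Hz for a window length in seconds."""
    if window_length_s <= 0:
        raise ValueError("window length must be positive")
    return BANDWIDTH_CONSTANT / window_length_s


# The two window lengths used in this tutorial.
for label, window_s in [("broadband", 0.005), ("narrowband", 0.025)]:
    bandwidth = spectrogram_bandwidth_hz(window_s)
    print(f"{label}: window {window_s * 1000:.0f} ms -> bandwidth about {bandwidth:.0f} Hz")
```

A 0.005 s window gives a bandwidth of about 260 Hz, wide enough to smear individual harmonics together so that formants stand out. A 0.025 s window gives about 52 Hz, narrow enough to resolve the harmonics of most voices.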
You can adjust the window length by clicking 'Spectrum' → 'Spectrogram Settings' → set the 'Window Length' to 0.025s (or the narrowband window length of your choosing) → Click OK.
Figure 1.54
Now, you can see harmonics clearly in this narrowband spectrogram.
If you set the view range to roughly 0-500 Hz for speech in this narrowband spectrogram, the contours of the harmonics will follow the pitch contour of the voice. This gives you a sense of the pitch (F0) contour before you use the Praat pitch tracker for more precise measurement.
Figure 1.55
To return to a broadband spectrogram, click 'Spectrum' → 'Spectrogram Settings' → set the Window Length to 0.005 (or the broadband window length of your choosing) → click OK.
You will then be back at the default broadband spectrogram.
Before we illustrate how to measure pitch in Praat, let’s discuss what pitch is and what it is used for.
Pitch is a term used to refer to variations in fundamental frequency (F0), which serves as an important acoustic cue for tone, lexical stress, and intonation. For example, in Chinese, which is a tone language, each syllable or morpheme may have its own pitch.
A. Extracting information about pitch
- Display the pitch track: Pitch → Show pitch
- A blue line representing the pitch will now appear on the spectrogram. Place the cursor at the point of interest and read the blue number on the right side of the window.
- Alternatively, position the cursor in a stable middle part of the blue track, click 'Pitch', and select 'Get pitch'. A local pitch value will be displayed in a separate window.
Figure 1.56
Figure 1.57
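To get a feel for what a pitch tracker does, here is a much simplified autocorrelation estimate in plain Python. It is not Praat's own pitch algorithm, which is considerably more elaborate. The function name, the 75-500 Hz search range, and the synthetic test signal are all assumptions made for illustration.

```python
import math


def estimate_f0_autocorr(samples, sample_rate, f0_min=75.0, f0_max=500.0):
    """Estimate F0 (Hz) as sample_rate / lag, using the lag with the strongest
    self-similarity inside the candidate pitch range."""
    min_lag = int(sample_rate / f0_max)  # shortest period considered
    max_lag = int(sample_rate / f0_min)  # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Sum of products of the signal with a copy of itself shifted by `lag`.
        score = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Synthetic "voice": a 200 Hz fundamental plus a weaker second harmonic.
SAMPLE_RATE = 8000
TRUE_F0 = 200.0
signal = [
    math.sin(2 * math.pi * TRUE_F0 * n / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 2 * TRUE_F0 * n / SAMPLE_RATE)
    for n in range(400)  # 50 ms of signal
]

estimated_f0 = estimate_f0_autocorr(signal, SAMPLE_RATE)
print(f"Estimated F0: {estimated_f0:.1f} Hz")
```

The search is limited to lags that correspond to the chosen pitch range, which is why an unsuitable range can make a tracker lock onto the wrong period.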
B. Getting Maximum, Minimum, and Average pitch for a section of speech
- Select the portion of the sound for which you would like the Maximum, Minimum or Average Pitch
- Select the proper command for your task from the top menu: Pitch → Get pitch / Get maximum pitch / Get minimum pitch (with a selection, 'Get pitch' reports the average pitch of the selected portion)
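If you have exported a pitch track as a list of time and F0 pairs, the same three summary values can be computed by hand. The sketch below assumes a hypothetical list of `(time, f0)` frames in which unvoiced frames are stored as `None`; the values are invented for illustration and do not come from any recording.

```python
def pitch_summary(frames, start_s, end_s):
    """Return min, max and mean F0 (Hz) of the voiced frames between start_s and end_s.

    `frames` is a list of (time_in_seconds, f0_in_hz_or_None) pairs.
    Returns None if the selection contains no voiced frame.
    """
    voiced = [f0 for t, f0 in frames if start_s <= t <= end_s and f0 is not None]
    if not voiced:
        return None
    return {"min": min(voiced), "max": max(voiced), "mean": sum(voiced) / len(voiced)}


# Invented example track: one unvoiced frame at 0.12 s.
example_frames = [
    (0.10, 182.0),
    (0.11, 190.0),
    (0.12, None),
    (0.13, 205.0),
    (0.14, 198.0),
    (0.15, 176.0),
]

print(pitch_summary(example_frames, 0.10, 0.14))
```

Unvoiced frames are skipped rather than counted as zero, so they do not drag the average down.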
C. Improving the pitch contour by adjusting the pitch settings
Sometimes the blue pitch contour jumps up and down, doubling or halving the actual F0, and in many cases, especially where the speaker's voice is creaky, the pitch track drops out altogether. This usually happens because Praat’s default pitch range is not appropriate for the file you are analyzing. To make the pitch track more visible and better reflect the speaker's voice, you may need to adjust some of the pitch settings via Pitch → Pitch settings (see Figure 1.58).
Figure 1.58
The fundamental frequency of the voice (pitch) usually ranges from approximately 30–300 Hz, but this varies between speakers: typically, male speakers' pitch ranges from 50-180 Hz and female speakers' from 80-250 Hz, so we usually set the pitch range to 50-400 Hz for general use.
If you have a general sense of the speaker's actual range (e.g. from previous measurements), you can set the minimum to just under the speaker's lowest F0 and the maximum to just over their highest pitch excursion.
If the pitch contour is too low in the spectrogram, you can increase the maximum value of the pitch range (e.g. increase from 400 to 500 Hz); if the pitch contour is too high, you can decrease the maximum value of the pitch range (e.g. decrease from 400 to 300 Hz).
(This part is adapted from Stonham's lecture notes (p.13) that is available at http://stonham.dyndns.org/phonetics/handouts/prosod_hndt.pdf)
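The two ideas in this part, choosing a range just outside the speaker's measured F0 values and spotting doubling or halving errors, can be sketched in Python. Both function names, the 10 Hz margin, and the 0.15-octave tolerance are assumptions chosen for illustration, not Praat settings.

```python
import math


def suggest_pitch_range(measured_f0_hz, margin_hz=10.0):
    """Return (floor, ceiling) just outside the lowest and highest measured F0."""
    floor = max(min(measured_f0_hz) - margin_hz, 0.0)
    ceiling = max(measured_f0_hz) + margin_hz
    return floor, ceiling


def find_octave_jumps(f0_track_hz, tolerance_octaves=0.15):
    """Return indices of frames whose F0 is about one octave above or below
    the previous frame, the signature of pitch doubling or halving."""
    jumps = []
    for i in range(1, len(f0_track_hz)):
        octaves = math.log2(f0_track_hz[i] / f0_track_hz[i - 1])
        if abs(abs(octaves) - 1.0) <= tolerance_octaves:
            jumps.append(i)
    return jumps


# Invented track with one doubled frame (245 Hz) and one halved frame (60 Hz).
track = [120.0, 122.0, 245.0, 121.0, 119.0, 60.0, 118.0]
print(find_octave_jumps(track))                      # frames entering and leaving each error
print(suggest_pitch_range([110.0, 145.0, 180.0]))    # range from earlier measurements
```

Each tracking error is flagged twice, once on the jump away from the true contour and once on the jump back. If many frames are flagged, that is a hint that the pitch range needs narrowing.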
Remarks:
On the right side of the window, you can find the fundamental frequency (F0), which is marked in blue, while the frequency value marked in red on the left side is the formant frequency.
Figure 1.59
- Position the cursor in a stable middle part of the sound and do the following
- Go to 'Intensity' → select 'Get intensity'. A local intensity value will be displayed in a separate window.
Figure 1.60
Figure 1.61
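An intensity value in dB expresses the root-mean-square sound pressure relative to a reference pressure of 2e-5 Pa. The sketch below shows that calculation. It assumes the samples are calibrated sound pressures in pascals, which ordinary recordings usually are not, so treat the result as illustrative; the function name and the test tone are made up.

```python
import math

# Reference sound pressure for dB SPL, in pascals.
REFERENCE_PRESSURE_PA = 2e-5


def intensity_db_spl(pressure_samples_pa):
    """Return the intensity in dB SPL of samples given as sound pressure in Pa."""
    if not pressure_samples_pa:
        raise ValueError("need at least one sample")
    mean_square = sum(p * p for p in pressure_samples_pa) / len(pressure_samples_pa)
    return 10.0 * math.log10(mean_square / REFERENCE_PRESSURE_PA ** 2)


# Assumed test tone: 100 Hz sine with an RMS pressure of 0.02 Pa, 10 full periods.
SAMPLE_RATE = 8000
amplitude = 0.02 * math.sqrt(2)
tone = [amplitude * math.sin(2 * math.pi * 100 * n / SAMPLE_RATE) for n in range(800)]

print(f"{intensity_db_spl(tone):.1f} dB SPL")
```

An RMS pressure of 0.02 Pa is 1,000 times the reference pressure, which corresponds to 60 dB.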
Let’s discuss how to extract information about formant values.
- Position the cursor in a stable middle part of the sound.
- Go to 'Formant' and select 'Get first formant' (F1). The local first formant value will be displayed in a separate window.
- Do the same for the second formant (F2), third formant (F3), and fourth formant (F4).
Remarks:
- It’s more efficient to use 'Editor' → 'Formants' → 'Formant Listing', which will give you values for F1, F2, F3 and F4, along with the time point at which the measures were taken.
Figure 1.62
Figure 1.63
- Adjust the formant settings to make the measurements more accurate.
Go to 'Formant' and select 'Formant settings'.
- For adult male speakers, set the maximum formant (Hz) to 5,000 Hz
- For adult female speakers, set the maximum formant (Hz) to 5,500 Hz
- For children, set the maximum formant (Hz) to 8,000 Hz
Figure 1.64
ACOUSTIC MEASURES
Acoustical vocal parameters measure frequency, intensity (amplitude), perturbation (jitter & shimmer), and range. This type of measurement can provide valuable information regarding vocal fold movement as well as underlying vocal fold physiology and pathology. Acoustic measurement needs to be coupled with physiological and perceptual measures in order to provide an accurate differential diagnosis.
Acoustic Signs of Voice Problems
Fundamental Frequency
Fundamental frequency (F0) is the vibratory rate of the vocal folds. It is measured in hertz (Hz), also called cycles per second (cps). Average fundamental frequency during conversation ranges from 100 to 150 Hz for males and from 180 to 250 Hz for females. A variety of methods are available to measure F0, ranging from very simple to complex. Subjective measurements are less reliable than objective (quantitative) measurements. The phonational range is the range of frequencies (highest to lowest) that an individual can produce, which decreases with age. Colton & Casper state that 'this measurement reflects the physiological limits of the patient's voice'.
Amplitude
Measurement of vocal intensity is useful in documenting the dynamics of the voice. Mean intensity correlates with the perception of vocal loudness, and the variability of intensity would presumably correlate with a patient's loudness variations (Colton & Casper). Vocal intensity is measured in decibels (dB) of sound pressure level (SPL), which indicates the strength of vocal fold vibration. Coleman, Mabis, & Hinson (1977), cited in Colton & Casper, state that 'normal speakers should be able to produce minimum intensities of around 50 dB and maximum intensities of around 115 dB; intensities for males are slightly higher than for females'. Baken (1987), cited in Colton & Casper, states that 'everyday conversational speech may exhibit SPL's between 70 and 80 dB'.
Perturbation
Measurement of perturbation refers to the small, rapid, cycle-to-cycle changes in the period of the fundamental frequency (jitter) and in amplitude (shimmer) that occur during phonation. These changes reflect the slight differences of mass, tension, and biochemical characteristics of the vocal folds, as well as slight variations in their neural control. Perturbation correlates with perceived roughness or hoarseness in the voice (Colton & Casper).
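Cycle-to-cycle perturbation is commonly expressed as a relative measure: the mean absolute difference between consecutive cycles, divided by the mean value across cycles. The sketch below applies that definition to periods (jitter) and peak amplitudes (shimmer). The function name and the example cycle values are invented for illustration; clinical software may use other variants of these measures.

```python
def local_perturbation(cycle_values):
    """Mean absolute difference between consecutive cycles, divided by the mean.

    Pass cycle periods (seconds) to get local jitter, or cycle peak
    amplitudes to get local shimmer. The result is a ratio; multiply by 100 for %.
    """
    if len(cycle_values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(cycle_values, cycle_values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(cycle_values) / len(cycle_values)
    return mean_diff / mean_value


# Invented measurements for four consecutive glottal cycles.
periods_s = [0.0100, 0.0102, 0.0099, 0.0101]   # about 100 Hz
amplitudes = [0.80, 0.82, 0.79, 0.81]          # arbitrary linear units

print(f"jitter:  {100 * local_perturbation(periods_s):.2f} %")
print(f"shimmer: {100 * local_perturbation(amplitudes):.2f} %")
```

A perfectly periodic signal gives 0 for both measures; larger values mean less regular vibration from one cycle to the next.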
Signal-to-noise ratio (harmonics-to-noise ratio)
Colton & Casper state that, 'noise is random, aperiodic energy in the voice. Normal voices have low levels of noise, whereas abnormal voices show greater noise levels'. Harmonics-to-noise levels less than 1 would be expected in abnormal voices.
Vocal rise or fall time
Colton & Casper state that, 'the time it takes to produce a tone of full amplitude is referred to as rise time. The time it takes for the vocal folds to stop producing a tone is called fall time'. Some pathologies will affect the vocal rise or fall time, although sufficient research has not been performed to date. R. J. Baken states that, 'acoustically, different types of vocal attack are discriminable by the vocal rise time, among other things'. Koike (1967) in Baken proposes that 'the rise time associated with the softest vocal initiation that could be produced by patients with several different types of laryngeal pathology'.
Voice Tremor
Colton & Casper state that, 'tremor refers to a regular variation in the fundamental frequency or amplitude of the voice'. 'Tremors are usually associated with central nervous system dysfunction' (Colton & Casper).
Phonation Time
Phonation time can be assessed through the measurement of maximum phonation time and the s/z ratio. 'Maximum phonation time refers to the maximum time a subject can sustain a tone on one breath' (Colton & Casper). Normal males can sustain a tone for approximately 20 seconds, females for 15 seconds, and children for 10 seconds. Colton & Casper state that 'short maximum phonation times reflect inefficiency of the phonatory or respiratory system'. For the s/z ratio, 'a normal speaker would be expected to sustain both the voiceless /s/ and the voiced /z/ for approximately equal durations, resulting in a ratio of 1' (Colton & Casper). Colton & Casper state that 'in the presence of a disturbance of vocal vibratory behavior and/or ability to close the glottis, the duration of the sustained voicing of /z/ would be expected to suffer', which increases the s/z ratio. Eckel & Boone, cited in Colton & Casper, state that 'any s/z ratio greater than 1.4 may indicate a vocal pathology'.
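The s/z ratio is simple arithmetic, shown here as a small sketch. The 1.4 threshold is the figure quoted above; the function names and the example durations are invented for illustration, and the result is a screening indicator rather than a diagnosis.

```python
# Threshold quoted in the text (Eckel & Boone, cited in Colton & Casper).
S_Z_RATIO_THRESHOLD = 1.4


def s_z_ratio(s_duration_s, z_duration_s):
    """Return the ratio of the longest sustained /s/ to the longest sustained /z/."""
    if z_duration_s <= 0:
        raise ValueError("/z/ duration must be positive")
    return s_duration_s / z_duration_s


def exceeds_threshold(ratio, threshold=S_Z_RATIO_THRESHOLD):
    """True if the ratio is above the quoted threshold."""
    return ratio > threshold


# Invented example: /s/ held for 20 s, /z/ for 12.5 s.
ratio = s_z_ratio(20.0, 12.5)
print(f"s/z ratio = {ratio:.2f}, above threshold: {exceeds_threshold(ratio)}")
```

Equal durations give a ratio of 1; the ratio rises when the voiced /z/ cannot be sustained as long as the voiceless /s/.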
Voice Stoppages
Silences during phonation that are longer than normal or occur unexpectedly are considered abnormal (Colton & Casper).
Frequency Breaks
These can be seen as sudden shifts of fundamental frequency in either an upward or downward movement, often related to pitch breaks (Colton & Casper).
Normal Acoustics
Individuals can present with acoustical features which are so subtle that they appear normal.