Scientific Experiments
Passive Acoustic Monitoring
Northern Cardinals are seen all year round in the northeastern USA. The male in its striking bright red plumage remains silent all winter except for a few occasional short chucks. From late winter through spring and summer, his loud whistled songs are heard near forest edges and often even in city parks and gardens.
There is usually at least one pair of resident Northern Cardinals in the Home 'patch'. I have not located a nest yet, but in mid-to-late summer I see immature cardinals, easily identified by their gray bills and buffy plumage. By fall the young males get their bright red color and appear to be similar to the adults.
In 2021, from late March to mid-April, I made fourteen audio recordings of the vocalizations of the then-resident male Cardinal. He sang his whistled notes perched on a small tree near my kitchen window or from treetops in the woods behind my home. Typically, he repeated one song several times before switching to another song. From these one-minute-long recordings made when he was near the kitchen window, I identified seven distinctly different song types.
A year later, in mid-February of 2022, a male Cardinal began singing while perched on the intertwined branches inside of a deciduous shrub visible from my kitchen window. He was singing softly as if practicing in private, and his notes had an irregular pattern unlike the Cardinal recordings from the previous year. I wondered if this was a young bird getting ready for his first spring and started recording his voice. Over the next three weeks his songs improved, and by the first week of March he could reproduce at least one of the songs of the Cardinal heard in 2021.
The maturation of Northern Cardinal vocalization has been well studied in the laboratory, and there are several publications on the findings [1]. It is reported that young birds begin to produce soft warbled notes when about three weeks old. They remain quiet through the winter, then begin singing in early spring. By April they have the repertoire of a mature bird.
Studying song development in wild birds is more challenging. One report published in 1966 states that in the spring it takes first-year birds about a week to produce mature, adult-like songs [2].
As part of AvianActs work on Passive Acoustic Monitoring (PAM), and in an attempt to learn more about bird behaviors through their vocalizations, Hemant and I analyzed the Cardinal audio recordings made in the Home 'patch'. As described in this report, the difference between a mature and a first-year bird is distinct when these wild birds first begin singing in early spring. At this stage, their physical appearances are indistinguishable, but their voices give away their maturity level. This is an interesting application of PAM, and it differs from our work on the Wood Thrush, in which we could identify an individual bird by his songs (see Identifying Wood Thrushes by their Repertoires).
We named our resident Cardinal of 2021 BISHOP. We start by examining the oscillogram and spectrogram (Fig. 1a) of BISHOP's vocalization recorded on March 21, 2021 in my yard. It shows five repeats of a four-second-long song (Song1) with pauses of four seconds in between. The song comprises two distinct syllables: an upslur and a downslur. Two repeats of a loud, slow upslurred whistle syllable SL1 and six repeats of a fast downslur syllable SL2 are each represented by a thin line in the spectrogram. Examples of the two syllables are marked in Fig. 1b.
(a)
(b)
Fig. 1 (a) Oscillogram and spectrogram of a Northern Cardinal vocalization comprising five repeats of a song (Song1), (b) one example of each syllable in Song1, SL1 and SL2, outlined in the spectrogram
The vocalizations of BISHOP were regularly repeated patterns of varying songs. BISHOP's repertoire comprised several different songs. He repeated each song several times before switching to a different song. The audio recordings I made were about one minute long and captured different songs on different days. From the recordings made on thirteen different days, we identified seven different songs. Examples of songs with unique syllables are shown in Figs. 2, 3 and 4. All syllables are either upslurs or downslurs in the 1 to 5 kHz frequency range.
Fig. 2 Oscillogram and spectrogram of Song2 comprising two different downslur syllables SL3 and SL4
Fig. 3 Oscillogram and spectrogram of Song3 comprising very sharp upslurs of the type SL5 followed by four slower upslurs of the type SL6 falling in a lower frequency range
Fig. 4 Oscillogram and spectrogram of Song4 comprising one slow downslur SL7 followed by seven faster downslurs of the type SL8
Note the break in the syllable lines at around 2.5 to 3 kHz in Song4. The cardinal has two sound boxes, and this break occurs when he switches from one sound box to the other [3].
Even in early spring BISHOP appeared to have good control over his voice, faithfully repeating each song. We assumed that this was an older bird, in its second or a later year. Although the songs are reproduced well, a closer look reveals minor variations in syllables. These findings are described in Appendix A.
Now let us look at the vocalizations of the young Cardinal we named DEACON, in early spring of 2022. Nine recordings were made between February 12 and March 8. Fig. 5 shows the vocalization sequence of the first recording, made on February 12. Although each of the five songs comprises fast upslur syllables, the number of repeats, repeat period and volume change irregularly from song to song. DEACON's vocalizations on February 14 and 15 were similar.
In the recording made on February 19 (Fig. 6), the upslurs on the right side of the spectrogram are a little more regular. On February 26 (Fig. 7), there is clear improvement in the repeatability of the songs. By March 8 (Fig. 8), one pattern in his vocalization is very similar to BISHOP's song (Song1) shown in Fig. 1a. This is illustrated in Fig. 9.
Fig. 5 Oscillogram and spectrogram of the first audio recording of DEACON; February 12, 2022, recording time = 45 seconds
Fig. 6 Oscillogram and spectrogram of an audio recording of DEACON; February 19, 2022, recording time = 50 seconds
Fig. 7 Oscillogram and spectrogram of an audio recording of DEACON; February 26, 2022, recording time = 42 seconds
Fig. 8 Oscillogram and spectrogram of an audio recording of DEACON; March 8, 2022, recording time = 15 seconds
(a)
(b)
Fig. 9 (a) BISHOP's song (Song1) on March 21, 2021, (b) DEACON's song on March 8, 2022
By the middle of March, DEACON had moved away from the bush near my kitchen window. Only one further recording was made, in July, and it exhibited nicely repeated songs.
Our initial intention was to study the vocalizations of Northern Cardinals in the Home 'patch' and learn their different song types. Later we discovered that we could differentiate between a first-year and an older bird by listening to their vocalizations in early spring. It took DEACON over three weeks to begin producing BISHOP-like song patterns. The only record we found on wild birds, published in 1966, reports a learning period of about one week [2]. We realize that the learning rates of individuals may vary, and that the observations of this study may not strictly hold for all young Northern Cardinals.
We further analyzed the vocalizations of BISHOP, DEACON and of a Northern Cardinal in the Home 'patch' in 2023 who may be BISHOP, DEACON or some other bird. Since we do not know the history of the 2023 bird, we gave him a slightly elevated status and named him 'ARCHBISHOP'. We wanted to know the accuracy with which this species can reproduce syllables and songs. Our methodology and findings are described below:
Audio recordings were made with the Merlin App on an iPhone with a sampling rate of 44.1 kHz. The noise reduction process was the same as described in Identifying Wood Thrushes by their Repertoires Appendix A.
There is a distinct temporal separation between syllables in Northern Cardinal songs. By identifying all syllables of the same type in an audio recording, and knowing the exact location of these syllables in a spectrogram, we can study the temporal variation in syllables within a song and between different songs. For this analysis we used the "Template Detector" function in Raven Pro 2.0 BETA, software under development by Cornell University. In this application, we create a selection table annotating one or more examples of a syllable within an audio recording that serves as the "template". When the Template Detector function is activated in Raven Pro, it slides the template over the spectrogram under analysis. At each time position it multiplies the power densities in the annotated template feature by the power densities in the spectrogram at that position, sums those products, and divides the sum by a normalizing factor. The resulting TD_score value indicates how well the pixels of high power density in the annotated template feature match the pixels of high power density in the spectrogram at that position.
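The sliding multiply-and-sum step described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not Raven Pro's implementation: the function name template_scores, the toy spectrogram, and in particular the choice of normalizing factor (the product of the Euclidean norms of the template and of the spectrogram patch, so that a perfect match scores 1.0) are our assumptions.

```python
import math


def template_scores(spec, template):
    """Slide `template` along the time axis of `spec` and score each offset.

    Both arguments are lists of rows (one row per frequency bin), each row
    holding one power value per time frame. The score at each offset is the
    sum of element-wise products divided by an ASSUMED normalizing factor:
    the product of the Euclidean norms of the template and of the patch.
    """
    n_freq = len(template)
    if len(spec) != n_freq:
        raise ValueError("spec and template need the same number of frequency bins")
    t_len = len(template[0])
    s_len = len(spec[0])
    t_norm = math.sqrt(sum(v * v for row in template for v in row))
    scores = []
    for start in range(s_len - t_len + 1):
        dot = 0.0
        patch_sq = 0.0
        for f in range(n_freq):
            for k in range(t_len):
                p = spec[f][start + k]
                dot += p * template[f][k]
                patch_sq += p * p
        denom = t_norm * math.sqrt(patch_sq)
        scores.append(dot / denom if denom > 0 else 0.0)
    return scores


# Toy data (made up): 3 frequency bins x 10 time frames, with a rising
# "upslur" starting at frame 1 and another starting at frame 6.
spec = [
    [0, 1, 0, 0, 0, 0, 1, 0, 0, 0],  # lowest frequency bin
    [0, 0, 1, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 1, 0],  # highest frequency bin
]
template = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]

scores = template_scores(spec, template)
# Keep only offsets whose score reaches the 0.85 limit used in this report.
hits = [i for i, s in enumerate(scores) if s >= 0.85]
```

In this toy case the two frame offsets where the upslur begins are the only ones that reach the limit. With real spectrograms the scores fall between 0 and 1, which is why a lower limit is needed.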
We set a lower limit of 0.85 on the TD_score so that only syllables that matched the annotated syllables reasonably well were detected. We also chose to list the start and end time of each detected syllable and its lower and upper frequencies in the output table. The difference between the start times of two adjacent syllables, Δt, was computed and plotted against time. This gives a visual representation of the variation in time gaps between syllables in the same song and in the pauses between songs.
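The Δt computation can be sketched as follows. The function name syllable_gaps, the (start time, score) input layout, the example values and the 2-second threshold used to separate pauses between songs from gaps within a song are all our assumptions for illustration. They are not part of Raven Pro's output format.

```python
def syllable_gaps(detections, min_score=0.85, pause_threshold=2.0):
    """Compute gaps between successive detected syllable start times.

    detections: iterable of (start_time_s, score) pairs.
    Returns (within_song, between_songs), each a list of
    (start_time_s, gap_s) pairs, where gap_s is the time from that
    syllable's start to the start of the next detected syllable.
    pause_threshold is a hypothetical cutoff, in seconds.
    """
    onsets = sorted(t for t, score in detections if score >= min_score)
    within_song, between_songs = [], []
    for prev, cur in zip(onsets, onsets[1:]):
        gap = cur - prev
        bucket = between_songs if gap >= pause_threshold else within_song
        bucket.append((prev, gap))
    return within_song, between_songs


# Made-up detections: two three-syllable songs plus one weak match
# that falls below the score limit and is ignored.
detections = [
    (0.0, 0.93), (0.5, 0.91), (1.0, 0.88),
    (3.0, 0.40),
    (5.0, 0.95), (5.5, 0.90), (6.0, 0.89),
]
within_song, between_songs = syllable_gaps(detections)
```

Plotting the two lists in different colors against start time gives the kind of picture described above: a band of short gaps within songs and a separate band of longer pauses between songs.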
Fig. A1 (a) shows the oscillogram and spectrogram of ARCHBISHOP's vocalization recorded on March 16, 2023. There are eleven songs in a 70 s (1:10 minute) time interval, and all songs are of the same type. In this song a single syllable type is generally repeated three or four times. First, we selected one of the syllables at random as the template, shown in Fig. A1 (b). The detector function correctly identified all 43 syllables in the recording. The process was repeated with several different syllable examples in the template. The results were similar, although the TD_score values varied with the syllable selection because the locations of the high power density pixels differ from syllable to syllable. Fig. A2 shows the output of the Template Detector, with magenta rectangles indicating the syllables that matched the selected syllable with a TD_score > 0.85.
(a)
(b)
Fig. A1 (a) Oscillogram and spectrogram of an audio recording of ARCHBISHOP; March 16, 2023, recording time = 1:10 minute, (b) template with annotated syllable labeled #1.
Fig. A2 Screen shot of Template Detector output of the recording in Fig. A1. Magenta color rectangles indicate syllables that matched the selection with a TD_score > 0.85.
The computed value of Δt is plotted against time in Fig. A3. The time gap or pause between songs varied from 3.3 s to 5.4 s. The time gap between syllables within a song varied between 0.52 s and 0.77 s, a range of +/- 0.12 s. This indicates good control in the regularity of syllable production by ARCHBISHOP.
Fig. A3 Time gap (pause) between songs (red circles) and between syllables in the same song (blue circles) as a function of time in the recording of ARCHBISHOP shown in Fig. A1.
Fig. A4 Screen shot of Template Detector output of another ARCHBISHOP recording. Green color rectangles indicate the detected syllables that match the syllable labeled # 1.
Fig. A4 shows the Template Detector output for a second ARCHBISHOP recording, made on March 1, 2023. In this case, the song comprises two syllables. When we supplied only one annotated template of the slow downslur whistle, the faster syllable type was not detected. Similarly, with the fast downslur as a template, the slower whistles were not detected. The time gap between syllables of the fast type in this recording varied between 0.33 s and 0.39 s, a variation of +/- 0.03 s, which is tighter than in the recording in Fig. A1.
Now let us look at young DEACON's recording. We selected the syllable in the rectangle labeled '1' in Fig. A5 as the annotated template feature. Out of the 30 syllables that appear similar to the human eye, only 19 were detected. This result varied somewhat with the annotated template feature selection. However, it is apparent that the syllable repeatability of DEACON is not on par with that of a mature Cardinal.
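The +/- figures quoted for the two ARCHBISHOP recordings are half of the difference between the longest and shortest gap. The short sketch below reproduces that arithmetic from the reported gap ranges. The helper name spread and the relative-spread comparison (half-range divided by midpoint) are our own additions for illustration.

```python
def spread(shortest, longest):
    """Return (midpoint, half_range) of a gap interval, in seconds."""
    midpoint = (shortest + longest) / 2
    half_range = (longest - shortest) / 2
    return midpoint, half_range


# Gap ranges reported for the recordings in Fig. A1 and Fig. A4.
mid_a1, half_a1 = spread(0.52, 0.77)  # half-range about 0.125 s, quoted as +/- 0.12 s
mid_a4, half_a4 = spread(0.33, 0.39)  # half-range about 0.03 s

# Relative spread: half-range as a fraction of the typical gap.
rel_a1 = half_a1 / mid_a1
rel_a4 = half_a4 / mid_a4
```

Because the fast syllables in Fig. A4 are also more closely spaced, comparing the half-range relative to the typical gap is a fairer check. On these numbers the Fig. A4 recording remains the more regular of the two.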
Fig. A5 Screen shot of Template Detector output for a DEACON recording on March 2, 2022. Rectangles with a blue border indicate detected syllables.
Fig. A6 Screen shot of Template Detector output for a DEACON recording on March 8, 2022. All the slow downslur syllables are detected.
In the last recording of DEACON on March 8, 2022, the syllables were repeated well (Fig. A6), and all the syllables of each type in the spectrogram were detected. This is additional evidence of improvement in the quality of young DEACON's vocalization over time.
References:
1. Beecher, M. D. (1996). Birdsong learning in the laboratory and field. In Ecology and Evolution of Acoustic Communication in Birds (D. E. Kroodsma and E. H. Miller, Editors), Cornell University Press, Ithaca, NY, USA. pp. 61-78.
2. R.E. Lemon, D. Scott, "On the development of song in young Cardinals", Canadian Journal of Zoology, vol. 44, pp. 191-197, March 1966.
3. Donald Kroodsma, "Birdsong for the Curious Naturalist", p. 40. http://www.birdsongforthecurious.com/recording.php?page=53 ~ audio recording # 124