ISCAN Tutorial 2 – Formants

This tutorial walks through a particular case study in ISCAN: generating a dataset of formant measures. It replicates the analysis done with formant.py (https://github.com/MontrealCorpusTools/SPADE/blob/master/formant.py) in PolyglotDB, the Python library for which ISCAN provides a GUI.

If you completed Tutorial 1 – Sibilants, skip ahead to the ‘Enrichment’ section of the tutorial.

Logistics

Access

Before you can begin the tutorial, you will need to be able to log in to the ISCAN server via your web browser. To use ISCAN, you need a username and password from the administrator of the server. For now, the only ISCAN server is at McGill, so the first step is to contact Vanna (on Slack in the #iscan-help channel, or by email at savanna.willerton@mail.mcgill.ca) to request access. Vanna will send you a username and password.

To log in to the McGill ISCAN server, visit https://roquefort.linguistics.mcgill.ca in your web browser, press the ‘Log in’ button on the top right of the screen, and enter the username and password provided by Vanna.

Questions, Bugs, Suggestions

If at any point while using ISCAN you get stuck, have a question, encounter a bug (like a button which doesn’t work), or you see some way in which you believe the user interface could be improved to make usage more clear/smooth/straightforward/etc, then please see ISCAN – Getting Help and Giving Feedback.

Tutorial

ISCAN can help us go from a raw speech corpus to a data file (CSV) ready for further analysis. This process consists of a pipeline of four steps:

  1. Import the corpus into ISCAN, resulting in a database
  2. Enrich the database
  3. Query the database
  4. Export the results to a CSV file

We will see detailed instructions for each step below. In this particular case study, we will be generating a dataset of formant measures from a small subset of the ICE-Can corpus (named tutorial-yourusername in ISCAN).

Import

Select the ‘tutorial-yourUsername’ corpus from the dropdown menu at the ‘Corpora’ tab in the navigation bar and press the ‘import’ button to begin importing.

Results

At the end of this step we should have a structured database of linguistic objects (words, phones, discourses) in ISCAN, and our database should be ‘Running’ so that we can interact with it in the following steps.

Enrichment

At this step, we will enrich the dataset with all of the information needed to analyse formants, as we did for the sibilant tutorial. If you completed Tutorial 1, many of the enrichments below will already have been run. There is no need to run them again; simply check that they are all there and look out for those marked as formant specific.

To get started, navigate to the Corpus View by selecting the database you’ve imported under ‘Corpora’ in the navigation bar. In this view, you can see all of the currently available information for a single corpus. You can navigate between datasets using the ‘corpora’ drop down menu in the navigation bar.

Start by pressing the ‘create, run and edit enrichments’ button under the ‘Enrichment’ header on this page. This will bring you to the Enrichment View. This view is characterized by a large table on the left hand side which shows enrichments which have been created. This table should be populated with enrichments you added for the sibilant tutorial. If not, simply follow along adding all of the following enrichments.

Properties from a CSV

First let’s import information usually provided with a corpus as CSV files. Start by selecting the ‘Properties from a CSV’ button. In this view we select files from our computer which add properties to speakers and words. In general, one would have checked out the ICE-Can or tutorial corpus from the master SPADE Git repositories held on the McGill Roquefort server, and would therefore have the enrichment CSV files on their computer (see the private WordPress page Using Datasets: McGill-External Users for instructions on this). For the purposes of this tutorial, however, the enrichment files needed for our subset of the ICE-Can corpus (‘speaker_info.csv’ and ‘lexical_can_dialect_info.csv’, used in the steps below) are provided.

Save these files to your computer.

Speaker Enrichment

First let’s add speaker information. Start by filling in ‘Speaker’ as the name of this enrichment. For ‘Analysis’, select ‘Speaker CSV’ from the dropdown menu. Finally, upload the ICE-Can ‘speaker_info.csv’ file from your computer with the ‘choose file’ button and hit ‘Save Enrichment’. This will take you to the Enrichment View, where you press ‘Run’ on the new row of your table.

Lexicon Enrichment

Using the same steps, add a ‘Lexicon’ enrichment by using a ‘Lexicon CSV’ analysis with the ICE-Can ‘lexical_can_dialect_info.csv’ file, and run it.
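
Conceptually, a CSV enrichment matches each row of the file to an existing object (here, a speaker) and attaches the remaining columns to it as properties. The sketch below illustrates that idea in plain Python. It is not how ISCAN implements the step, and the column names, speaker names, and sample rows are hypothetical; the real ‘speaker_info.csv’ may look different.

```python
import csv
import io

# Hypothetical stand-in for a speaker CSV; the real file may use
# different column names and values.
SAMPLE_CSV = """name,gender,birth_year
S1A-001_A,F,1970
S1A-001_B,M,1965
"""


def read_properties(csv_text, key_column):
    """Map each value of key_column to a dict of the row's other columns."""
    properties = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row.pop(key_column)
        properties[key] = dict(row)
    return properties


def enrich(speakers, properties):
    """Return a copy of speakers with any matching CSV properties merged in."""
    enriched = {}
    for name, existing in speakers.items():
        merged = dict(existing)
        merged.update(properties.get(name, {}))
        enriched[name] = merged
    return enriched


# Speakers already in the (imaginary) database, with no properties yet.
speakers = {"S1A-001_A": {}, "S1A-001_B": {}, "S1A-002_A": {}}
speaker_properties = read_properties(SAMPLE_CSV, "name")
enriched_speakers = enrich(speakers, speaker_properties)
```

In this sketch, a speaker with no matching row in the CSV (‘S1A-002_A’) is simply left without the new properties. A Lexicon CSV works the same way in principle, keyed on the word rather than the speaker.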

Phone Subset

For the formant analysis, it is not necessary to encode ‘sibilants’ as we did in tutorial 1. However, we still need to encode some phone subsets.

Encode Syllabics

This step will specify the subset of phones that are syllable nuclei.

Add an enrichment named ‘syllabic’ by creating a new phone subset (the ‘Phone Subset’ steps are described in detail in the next section), then save it and run it.

Encode Stressed Vowels (FORMANT SPECIFIC)

Start by pressing the ‘Phone Subset’ button under the ‘Create New Enrichments’ header. Here we select and name subsets of phones. If we wish to search for stressed vowels we have two options for this corpus:

  1. For our subset of ICE-Can we have the option to press the pre-set button ‘Select stressed vowels’.
  2. For some datasets the preset button will not be available. In this case you may manually select a subset of phones of interest.

Then choose a name for the subset (in this case ‘stressed_vowels’) and click ‘Save subset’. This will return you to the Enrichment view where you will see the new enrichment in your table. In this view, press ‘Run’ under ‘Actions’.
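
If you need to select stressed vowels manually, it helps to know what pattern to look for in the phone labels. The sketch below assumes ARPABET-style labels in which vowels carry a trailing stress digit and ‘1’ marks primary stress. Both the labelling scheme and the reading of ‘stressed’ as primary stress only are assumptions made for illustration, and the phone inventory shown is made up.

```python
def select_stressed_vowels(phone_labels):
    """Return the distinct labels ending in '1' (assumed primary stress)."""
    return sorted(label for label in set(phone_labels) if label.endswith("1"))


# Made-up phone inventory using ARPABET-style labels.
phone_inventory = ["AA1", "AA0", "IY1", "IY2", "S", "T", "ER0", "UW1"]
stressed_vowels = select_stressed_vowels(phone_inventory)
```

With this inventory, the subset contains AA1, IY1 and UW1; unstressed vowels, secondary-stressed vowels and consonants are left out.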

Pauses

The next set of basic enrichments for our analysis encodes some annotation levels. For this we will use the buttons under the ‘Annotation levels’ header.

First, press the ‘Pauses’ button. The pauses page allows you to select the labels assigned to pauses in the particular corpus you are using. Encoding pauses is how we will define utterances.

Because pause labels are typically among the most common words in a corpus, this page displays the 25 most frequent labels. If the labels you are looking for do not appear, you can either increase the number of labels displayed or enter the label in the ‘Custom words’ box.

For our tutorial corpus, we just need to tick the box under ‘Include’ for the word marked . Once you have selected the pause word, enter the name of the enrichment (let’s say ‘Pauses’), and hit ‘Save enrichment’. Finally, hit ‘Run’ once more in the new row of your Enrichments table.

Utterances

For current purposes, we define an utterance as a stretch of speech bounded by pauses. So now we will specify the minimum duration of a pause that separates utterances (150 ms is typically a good default).

From the Enrichment View, again under the ‘Annotation levels’ header, select ‘Utterances’. From here, name the new enrichment ‘utterance’ and type 150 in the box next to ‘Utterance gap(ms)’, then hit ‘Save enrichment’ and ‘Run’ in the Enrichment View.
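
To make the definition concrete, here is a minimal sketch of grouping words into utterances with a 150 ms gap. It assumes the pause words have already been removed, so that pauses show up as silent gaps between consecutive words; the function name and the sample timings are hypothetical, and this is not ISCAN's own code.

```python
def group_into_utterances(words, min_pause=0.150):
    """Group (label, begin, end) word tuples into utterances.

    A new utterance starts whenever the gap between the end of one word
    and the beginning of the next is at least min_pause seconds.
    """
    utterances = []
    current = []
    previous_end = None
    for label, begin, end in sorted(words, key=lambda w: w[1]):
        if previous_end is not None and begin - previous_end >= min_pause:
            utterances.append(current)
            current = []
        current.append(label)
        previous_end = end
    if current:
        utterances.append(current)
    return utterances


# Made-up word timings in seconds: the only gap of 150 ms or more
# falls between "then" and "we".
words = [
    ("so", 0.00, 0.20),
    ("then", 0.25, 0.50),
    ("we", 0.90, 1.00),
    ("left", 1.02, 1.40),
]
utterances = group_into_utterances(words)
```

Here the 400 ms gap splits the words into two utterances, while the shorter gaps of 50 ms and 20 ms do not.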

Syllables

The final level we will encode is the syllables level. Again under the ‘Annotation levels’ header, press the ‘Syllables’ button. Similar to the last two, simply name the new enrichment ‘Syllables’, select Max Onset from the Algorithm dropdown menu, and Syllabics from the Phone Subset menu, and then hit ‘Save enrichment’. As usual, upon return to the Enrichment view, hit ‘Run’ on the new addition to the table.
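
The idea behind a maximal-onset algorithm is that consonants between two syllable nuclei are assigned to the onset of the following syllable for as long as the result is a permissible onset; whatever is left over becomes the coda of the preceding syllable. The sketch below is a simplified illustration of that principle, not ISCAN's implementation. The set of legal onsets is a small hand-written assumption; a real implementation would need to derive or supply a complete set.

```python
def syllabify(phones, syllabics, legal_onsets):
    """Split a list of phone labels into syllables using maximal onsets."""
    nuclei = [i for i, phone in enumerate(phones) if phone in syllabics]
    if not nuclei:
        return [list(phones)]

    starts = [0]  # word-initial consonants join the first syllable
    for prev, nxt in zip(nuclei, nuclei[1:]):
        cluster = phones[prev + 1:nxt]
        # Try the longest candidate onset first; an empty onset is always
        # allowed, so the loop stops at len(cluster) at the latest.
        for split in range(len(cluster) + 1):
            onset = tuple(cluster[split:])
            if not onset or onset in legal_onsets:
                break
        starts.append(prev + 1 + split)

    ends = starts[1:] + [len(phones)]  # word-final consonants join the last syllable
    return [list(phones[s:e]) for s, e in zip(starts, ends)]


# Hypothetical inputs for the word "extra".
syllabics = {"EH1", "AH0"}
legal_onsets = {("S", "T", "R"), ("T", "R"), ("S", "T"), ("S",), ("T",), ("R",), ("K",)}
syllables = syllabify(["EH1", "K", "S", "T", "R", "AH0"], syllabics, legal_onsets)
```

In this example the cluster K S T R is split after K, because ‘S T R’ is the longest legal onset, giving the syllables EH1 K and S T R AH0.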

Hierarchical Property

Next, from the Enrichment View press the ‘Hierarchical property’ button. This will bring you to a page with three drop down menus (Higher annotation, Lower annotation, Property type) where we can encode speech rates, number of syllables in a word, and phone position.

While adding each enrichment below, remember to choose an appropriate name for the enrichment, hit the ‘save enrichment’ button, and then click ‘Run’ in the Enrichment View.

Speech Rate

  1. From the Higher annotation menu, select utterance
  2. From the Lower annotation menu, select syllable
  3. From the Property type menu, select rate

Syllable Count 1 (Number of Syllables in a Word)

  1. From the Higher annotation menu, select word
  2. From the Lower annotation menu, select syllable
  3. From the Property type menu, select count

Syllable Count 2 (Number of Syllables in an Utterance)

  1. From the Higher annotation menu, select utterance
  2. From the Lower annotation menu, select syllable
  3. From the Property type menu, select count

Phone Count (Number of Phones per Word)

  1. From the Higher annotation menu, select word
  2. From the Lower annotation menu, select phone
  3. From the Property type menu, select count

Word Count (Number of Words in an Utterance)

  1. From the Higher annotation menu, select utterance
  2. From the Lower annotation menu, select word
  3. From the Property type menu, select count

Phone Position

  1. From the Higher annotation menu, select syllable
  2. From the Lower annotation menu, select phone
  3. From the Property type menu, select position
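
The three property types can be thought of as simple functions of a higher annotation and the lower annotations it contains. The sketch below shows one plausible reading of each; the function names are hypothetical, and the choice of 1-based positions and of a per-second rate are assumptions rather than documented ISCAN behaviour.

```python
def count(children):
    """Number of lower annotations inside one higher annotation."""
    return len(children)


def rate(children, parent_begin, parent_end):
    """Lower annotations per second over the higher annotation's duration."""
    return len(children) / (parent_end - parent_begin)


def positions(children):
    """Pair each lower annotation with its 1-based position in the parent."""
    return list(enumerate(children, start=1))


# Made-up utterance lasting 2 seconds and containing 6 syllables.
utterance_syllables = ["so", "then", "we", "left", "ear", "ly"]
speech_rate = rate(utterance_syllables, 0.0, 2.0)

# Made-up word "early" with two syllables.
syllables_in_word = count(["ear", "ly"])

# Made-up syllable containing three phones.
phone_positions = positions(["S", "T", "AA1"])
```

For these inputs the speech rate is 3 syllables per second, the word has 2 syllables, and the phones S, T and AA1 receive positions 1, 2 and 3.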

Stress from Word Property

To encode syllable stress, click the ‘Stress from word property’ button from the Enrichment View. From the Word property dropdown menu, select stresspattern. Name the enrichment ‘Stress pattern’, and as usual, click ‘Save enrichment’, which brings you back to the Enrichment View where you click ‘Run’ on the new row of your table.

Acoustics (FORMANT SPECIFIC)

Now we will compute vowel formants for all stressed syllables using an algorithm similar to FAVE.

For this last section, you will need a vowel prototype file. This one is also normally accessed after you’ve checked out the ICE-Can or tutorial corpus from the master SPADE Git repositories held on the McGill Roquefort server. Again, for the purposes of the tutorial, it is provided below. Please save the file to your computer.

Vowel Prototypes: ICECAN_prototypes

From the Enrichment View, press the ‘Acoustics’ button under the ‘Create new enrichments’ header. As usual, this will bring you to a new page. First let’s name this enrichment ‘Formant Acoustics’. Then, from the Analysis dropdown menu, select FAVE-style point formants. New boxes will appear. From the new Phone class menu, select stressed_vowels. Using the ‘Choose file’ button, upload the ICECAN_prototypes.csv file you saved. For Number of iterations, type 3 and for Duration Threshold type 50ms.

Finally, ensure the program is set to praat and hit the ‘Save enrichment’ button. Then click ‘Run’ from the Enrichment View.
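
The general FAVE-style idea is to measure each vowel token several times with different analysis settings, keep the candidate measurement closest to a prototype for that vowel, and then re-estimate the prototype from the chosen measurements and repeat (hence the number of iterations). The sketch below illustrates this under strong simplifying assumptions: only F1 and F2 are used, the covariance is treated as diagonal, and all numbers are invented. The actual analysis is run through Praat and may differ in every one of these details.

```python
import math


def distance(candidate, mean, variance):
    """Mahalanobis distance, simplified to a diagonal covariance matrix."""
    return math.sqrt(
        sum((c - m) ** 2 / v for c, m, v in zip(candidate, mean, variance))
    )


def pick_best(candidates, mean, variance):
    """Return the candidate measurement closest to the prototype."""
    return min(candidates, key=lambda c: distance(c, mean, variance))


def refine(tokens, mean, variance, iterations=3):
    """Choose one candidate per token, updating the prototype each round.

    tokens holds one list of candidate (F1, F2) tuples per vowel token,
    all belonging to the same vowel class.
    """
    chosen = []
    for _ in range(iterations):
        chosen = [pick_best(candidates, mean, variance) for candidates in tokens]
        n = len(chosen)
        dims = range(len(mean))
        mean = [sum(c[d] for c in chosen) / n for d in dims]
        # A floor of 1.0 on the variance avoids division by zero.
        variance = [
            max(sum((c[d] - mean[d]) ** 2 for c in chosen) / n, 1.0) for d in dims
        ]
    return chosen, mean


# Invented prototype and candidates for a single high front vowel.
prototype_mean = [300.0, 2300.0]
prototype_variance = [2500.0, 40000.0]
tokens = [
    [(310, 2250), (310, 1200)],  # second candidate has an implausible F2
    [(280, 2400), (600, 2400)],  # second candidate has an implausible F1
]
chosen, refined_mean = refine(tokens, prototype_mean, prototype_variance)
```

In this toy example the implausible candidates are rejected in every iteration, and the refined prototype settles on the mean of the two retained measurements.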

Results

Our dataset should now be encoded with further linguistic objects, and information about those objects. The table in the Enrichment page should be populated with ‘Stressed vowels’, ‘Speaker’, ‘Lexicon’, ‘Pauses’, Formants (F1, F2, etc.), and other enrichments. It may take time to run them all, but all enrichments should have a green check mark in the ‘Completed’ column of the table before moving on to the next step.

Query

The next step is to search the dataset to find a set of linguistic objects of interest. In our case, we’re looking for all stressed vowels, and we will get formants for each of these. Let’s see how to do this using the Query view.

First, return to the ‘tutorial-yourUsername’ Corpus Summary view, then navigate to the ‘Phones’ section and select New Query. This will take you to a new page, called the Query view, where we can put together and execute searches. In this view, there is a series of property categories which you can navigate through to add filters to your search. Under ‘Phone Properties’, there is a dropdown menu with search options. Select ‘stressed_vowels’. You may select ‘Add filter’ if you would like to see more options to narrow down your search.

The selected filter settings will be saved for further use. It will automatically be saved as ‘New phone query’, but let’s change that to something more memorable, say ‘ICE-Can Tutorial Formants’. When you are done, click the ‘Save and run query’ button. The search may take a while, especially for large datasets, but should not take more than a couple of minutes for this small subset of the ICE-Can corpus we’re using for the tutorials.

Results

We should now have a set of linguistic objects of interest to our research. In our case, this will be a long list including all of the stressed vowels in the corpus.

Export

Now that we have made our query and extracted the set of objects of interest, we’ll want to export this to a CSV file for later use and further analysis (e.g. in R, MATLAB, etc.).

Once you hit ‘Save and run query’, your search results will appear below the search window. Since we searched only for stressed vowels, a long list of phone tokens (one for every occurrence of a stressed vowel in the dataset) should now be visible. This list of objects may not be useful to our research without some further information, so let’s select what information will be visible in the resulting CSV file using the window next to the search view.

Here we may check all boxes which will be relevant to our later analysis to add these columns to our CSV file. The preview at the bottom of the page will be updated as we select new boxes:

  1. Under the PHONE header, select:

    • label – Adds the orthographic contents of an object as a column
    • begin – Adds the start of the object in time (seconds) as a column
    • end – Adds the end of the object in time (seconds) as a column
    • F1 – Specifies the frequency of Formant 1
    • F2 – Specifies the frequency of Formant 2
    • F3 – Specifies the frequency of Formant 3
    • B1 – Specifies the bandwidth of Formant 1
    • B2 – Specifies the bandwidth of Formant 2
    • B3 – Specifies the bandwidth of Formant 3
    • num_formants – Specifies the number of formants detected
  2. Under the SYLLABLE header, select:

    • stress – Specifies syllable stress
    • position_in_word – Specifies the syllable’s position in the word
  3. Under the WORD header, select:

    • label – Specifies a word
    • stress – Specifies word stress pattern
  4. Under the UTTERANCE header, select:

    • label – Specifies the utterance that the word came from
  5. Under the SPEAKER header, select:

    • name – Specifies the speaker
  6. Under the SOUND FILE header, select:

    • name – Specifies the sound file the object came from

Once you have checked all relevant boxes, select ‘Export to CSV’. Your results will be exported to a CSV file on your computer. The file name will be the name you gave the query followed by “export.csv”. In our case, the resulting file will be called “ICE-Can Tutorial Formants export.csv”.

Results

With the tutorial complete, we should now have a CSV file saved on our personal machine containing information about the set of objects we queried for and all other relevant information.
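
As a starting point for further analysis, the sketch below reads an export and computes the mean F1 and F2 for each vowel label using only the Python standard library. The column names (‘phone_label’, ‘phone_F1’, ‘phone_F2’) and the sample rows are assumptions made for illustration; check the header row of your own CSV file and adjust them as needed.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Invented sample standing in for the exported file. With a real export,
# pass open("ICE-Can Tutorial Formants export.csv", newline="") instead.
SAMPLE_EXPORT = """phone_label,phone_F1,phone_F2
IY1,300,2300
IY1,320,2200
AA1,700,1200
"""


def mean_formants(csv_file):
    """Return {vowel label: (mean F1, mean F2)} for an open CSV file."""
    values = defaultdict(lambda: {"F1": [], "F2": []})
    for row in csv.DictReader(csv_file):
        label = row["phone_label"]
        values[label]["F1"].append(float(row["phone_F1"]))
        values[label]["F2"].append(float(row["phone_F2"]))
    return {
        label: (mean(v["F1"]), mean(v["F2"])) for label, v in values.items()
    }


summary = mean_formants(io.StringIO(SAMPLE_EXPORT))
```

The same grouping could be extended to the speaker or word columns selected during export, for example to compare vowel means across speakers.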