Two interrelated open-source software packages have been developed in the context of the SPADE project: PolyglotDB and ISCAN.
Both packages enable “Integrated Speech Corpus ANalysis”: importing speech datasets into a common database format, enriching each database with standardized measures, finding relevant tokens, and exporting a data file. ISCAN depends on PolyglotDB, and is essentially a major enhancement to meet requirements of the SPADE project. For different use cases you may want to install just PolyglotDB, or the full ISCAN system.
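The four stages named above (import, enrich, find, export) can be pictured with a small toy sketch. This is not the PolyglotDB or ISCAN API: the Token class, the function names, and the sample rows are all hypothetical, and only the Python standard library is used. It is meant only to show the shape of the workflow, assuming a corpus reduced to (speaker, label, begin, end) rows.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class Token:
    """One annotated phone token in a hypothetical common format."""
    speaker: str
    label: str
    begin: float  # seconds
    end: float    # seconds
    measures: dict = field(default_factory=dict)


def import_corpus(rows):
    """Stage 1: convert corpus-specific rows into the common Token format."""
    return [Token(speaker, label, float(begin), float(end))
            for speaker, label, begin, end in rows]


def enrich_with_duration(tokens):
    """Stage 2: attach a standardized measure (here, duration) to each token."""
    for t in tokens:
        t.measures["duration"] = round(t.end - t.begin, 3)
    return tokens


def find_tokens(tokens, label):
    """Stage 3: select the tokens relevant to a study."""
    return [t for t in tokens if t.label == label]


def export_csv(tokens):
    """Stage 4: write the selected tokens out as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["speaker", "label", "begin", "duration"])
    for t in tokens:
        writer.writerow([t.speaker, t.label, t.begin, t.measures["duration"]])
    return buffer.getvalue()


# Hypothetical sample data: (speaker, phone label, begin, end), times in seconds.
raw_rows = [
    ("s01", "aa", "0.10", "0.25"),
    ("s01", "t", "0.25", "0.31"),
    ("s02", "aa", "1.00", "1.18"),
]

tokens = enrich_with_duration(import_corpus(raw_rows))
selected = find_tokens(tokens, "aa")
csv_text = export_csv(selected)
print(csv_text)
```

In the real packages these stages run against a database rather than in-memory lists; the readthedocs documentation for each package describes the actual calls.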
A high-level, non-technical description of the whole system is given in the 2019 ICPhS paper, and more details of the PolyglotDB side are described in an earlier Interspeech paper. The readthedocs documentation for each software package, linked to below, provides more detailed and technical information.
1. PolyglotDB
The polyglotdb package, developed since 2016, contains all core functionality for integrated speech corpus analysis for technically-skilled users.
Installation of PolyglotDB, currently tested on Ubuntu and OS X, allows the user to construct and query databases in a custom format (called “Polyglot”), via a Python API. Installation of PolyglotDB alone is only recommended for users working with data to which they have unrestricted access (e.g. their own data, or publicly-available corpora). While installation on a desktop is recommended, in practice we have found that PolyglotDB works on a modern laptop even for datasets of reasonable size (10-20 hours).
Documentation for PolyglotDB, including tutorials, is on readthedocs.
2. ISCAN
This is the main software package used in the SPADE project, developed since 2017. ISCAN adds significant extra functionality needed for the SPADE use case or other similar projects, including:
* Interaction via a browser-based GUI (in addition to the Python API)
* Browser GUI usable by non-technically-skilled users
* User permissions, for working with restricted datasets
* Management of multiple users (at different sites) and multiple databases
* Data inspection and correction (subject to permissions)
This package allows an “ISCAN server” to be set up on an Ubuntu machine: either a user’s local machine, or a web server that enables access by remote users. In practice, only the web server option is well-tested, and significant memory (12 GB+) is recommended to work with datasets containing 100-200+ hours of speech (total).
The base functionality is in the iscan package here; however, you probably should not install this directly, since significant extra configuration is needed.
Different configurations of an ISCAN server are possible for different projects. The iscan-spade-server configuration used for the SPADE project is here (with installation instructions); this is what users should install if they wish to use ISCAN to conduct large-scale phonetic studies as in SPADE.
Documentation for ISCAN, and for the iscan-spade-server in particular, is in progress at readthedocs. This site contains tutorials where you can try out ISCAN without actually installing an ISCAN server, by requesting a tutorial account for access to the server hosted at McGill.