September 7, 2017

Corpora in the Classroom

Two increasingly important domains in linguistics are the study of spontaneous speech and the analysis of large corpora of natural language data. Our Linguistics Department has professors and students who do both.

To improve the instructional infrastructure and scaffold undergraduate and graduate class assignments that teach relevant theory and research skills, we have developed a teaching resource called Corpora in the Classroom, on which hundreds of hours of recorded and digitized speech from 9 languages (so far) are archived and tagged with metadata.

This tool has been used in 7 or 8 sociolinguistics classes over the past 5 years, but we are hoping to expand its use to additional classes and areas. If you'd like to use this tool or contribute data to it, please have a look at the demo pages and then contact Naomi to discuss. (Sample assignments using this tool for a 1st-year course are available: see HWs 11 & 13.)

The project has been funded by internal ITIF and CRIF grants, Keren Rice's CRC funds, and SSHRC.
