The Zurich Centre for Linguistics, together with the Institute of Computational Linguistics and the Department of General Linguistics, is offering a small project to an advanced student who is interested in the interaction between computational linguistics and psycholinguistics. The student will conduct two pilot studies and review the relevant literature. The work comprises about 200 hours and will take place between approximately September 2012 and February 2013.
The project investigates the role of ambiguity, context, argument structure, fixedness and native-like use of collocations in language acquisition and natural language processing. The following questions will be addressed in the pilot studies:
- how formulaic is child language (L1)?
- how difficult is learner language (L2) for native speakers to process when non-native-like utterances are used, and how much ambiguity do such utterances introduce?
- to what extent can automatic parsers be used to model human addressees?
- how much does the discourse context disambiguate language, and how can computers learn from the context?
Pilot study 1: We will measure some aspects of L1 fixedness using the CHILDES corpus by investigating the distribution of n-grams, exact repetition versus productive use of modified variants, and creative use of syntactic operations.
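As a rough illustration of the n-gram and exact-repetition counts mentioned for pilot study 1, the following minimal Python sketch counts n-grams over a list of utterances and reports the share of n-gram tokens whose type occurs more than once. The function names, the whitespace tokenisation, the "repeated share" measure and the toy utterances are all assumptions made for this example; they are not taken from CHILDES or from the project plan.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(utterances, n):
    """Count n-grams over utterances (lowercased, split on whitespace)."""
    counts = Counter()
    for utterance in utterances:
        counts.update(ngrams(utterance.lower().split(), n))
    return counts


def repeated_share(counts):
    """Share of n-gram tokens whose type occurs more than once.

    A crude, assumed proxy for fixedness: 0.0 means every n-gram is
    unique, 1.0 means every n-gram recurs at least once.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / total


# Hypothetical toy utterances, not real corpus data.
toy_utterances = [
    "I want juice",
    "I want cookie",
    "want more juice",
    "I want juice",
]

bigram_counts = ngram_counts(toy_utterances, 2)
share = repeated_share(bigram_counts)

# Exact repetition of whole utterances, counted separately.
utterance_counts = Counter(u.lower() for u in toy_utterances)
```

On real data the tokenisation would need to respect the corpus transcription format, and productive variants of a pattern would require a comparison that goes beyond exact n-gram matches.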
Pilot study 2: We will use a corpus of learner English or learner German to assess similarities between human and automatic parsing. Are automatic parser error rates higher on L2 utterances than on their corrected counterparts? Are automatic parser error rates higher if permissible but unexpected syntactic operations and synonyms are used?
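One possible way to compare parser error rates on L2 utterances and their corrected counterparts is sketched below in Python. It assumes, purely for illustration, that each utterance comes with gold-standard and parser-predicted dependency heads (one head index per token) and that the error rate is the share of tokens with a wrong head. The announcement does not specify a parser, an annotation scheme or an error metric, so all names and the toy data here are hypothetical.

```python
def attachment_error_rate(gold_heads, predicted_heads):
    """Share of tokens whose predicted head differs from the gold head."""
    if len(gold_heads) != len(predicted_heads):
        raise ValueError("gold and predicted analyses differ in length")
    if not gold_heads:
        return 0.0
    errors = sum(g != p for g, p in zip(gold_heads, predicted_heads))
    return errors / len(gold_heads)


def corpus_error_rate(analyses):
    """Token-weighted error rate over (gold_heads, predicted_heads) pairs."""
    total_tokens = 0
    total_errors = 0
    for gold_heads, predicted_heads in analyses:
        rate = attachment_error_rate(gold_heads, predicted_heads)
        total_tokens += len(gold_heads)
        total_errors += round(rate * len(gold_heads))
    return total_errors / total_tokens if total_tokens else 0.0


# Hypothetical head indices (0 = root); not output of a real parser.
learner_analyses = [
    ([2, 0, 2, 3], [2, 0, 4, 3]),  # one wrong attachment out of four
    ([2, 0, 2], [2, 0, 2]),        # fully correct
]
corrected_analyses = [
    ([2, 0, 2, 3], [2, 0, 2, 3]),
    ([2, 0, 2], [2, 0, 2]),
]

learner_rate = corpus_error_rate(learner_analyses)
corrected_rate = corpus_error_rate(corrected_analyses)
```

A higher value of `learner_rate` than `corrected_rate` on real data would be consistent with the first question posed for this pilot study, although a proper comparison would also need a significance test and gold annotations for the learner utterances.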
Prerequisites: considerable experience in computational linguistics and basic knowledge of psycholinguistics are required, together with an active interest in the subject and the ability to program and to work independently.