Corpus linguistics I

This course is part of the programme
Master's Degree Programme Humanities Studies

Objectives and competences

Getting to know the methodological apparatus of corpus linguistics.
Learning about the history of corpus linguistics.
Training in the use of corpora for linguistic research.
Training in the evaluation of the results of corpus studies.
Learning to ask interesting linguistic questions.
Teaching students to perform corpus analysis independently.


To participate successfully in in-class discussions and to follow the lectures, students should first take the introductory linguistics courses. This course is related to other courses in the Linguistics curriculum.


The course begins with the historical background of corpus linguistics. It then covers how various types of corpora are compiled and tagged. The way a corpus is built determines how it can be used and which analytical tools can be applied to it. A major part of the course is devoted to language-specific lexical analyses and to sample corpus-based language descriptions, which are critically evaluated. Special attention is also given to methods and tools for storing and organizing processed language data, and to each student's individual development of a standardized tagged corpus.
Material discussed in lectures also forms the core of the recitations, where more time and care are devoted to the main issues. Emphasis is placed on practical knowledge and hands-on analysis of language data.

Intended learning outcomes

Students learn the foundations of corpus linguistics. They are able to read and evaluate corpus research and to identify new directions for research. They are able to conduct corpus research independently and know the benefits and problems of this approach.


  • Adam Kilgarriff: Word sketches
  • WordNet
  • Douglas Biber et al., 1998: Corpus Linguistics: Investigating Language Structure and Use. Cambridge: Cambridge University Press.
  • Corpus Linguistics Around the World, 2006. Amsterdam, New York: Rodopi.
  • Vojko Gorjanc, 2005: Uvod v korpusno jezikoslovje. Domžale: Izolit.
  • Graeme Kennedy, 1998: An Introduction to Corpus Linguistics. London: Longman.
  • Jezik in slovstvo, 2003, no. 3-4. Thematic issue: Jezikovne tehnologije za slovenščino.
  • Proceedings of the conferences Jezikovne tehnologije (za slovenščino). Ljubljana: Inštitut Jožef Stefan, 1996, 1998, 2000, 2002, 2004, 2006.


A term paper is mandatory for completion of this course. To be allowed to take the exam, the student must also present the term paper orally.

Lecturer's references

Associate Professor Boris Kern works on issues of contemporary lexicology and lexicography, word-formation and semantics. He is involved in the preparation of the Dictionary of Newer Slovene Vocabulary and the New Dictionary of the Slovene Language.

Selected publications:
KERN, Boris. Analiza besedotvornih sklopov glagola stopiti. Jezikoslovni zapiski 17/1 (2011).
KERN, Boris. Stopenjske tvorjenke iz glagolov čutnega zaznavanja. Družina v slovenskem jeziku, literaturi in kulturi: zbornik predavanj 47. SSJLK. Znanstvena založba Filozofske fakultete, Ljubljana, 2011, pp. 156–160.
KERN, Boris. Stopenjsko besedotvorje. Slavistična revija LVIII/3 (2010), pp. 335–348.