Corpus linguistics I

This course is part of the programme:
Master in SL studies - Linguistics

Objectives and competences

Getting to know the methodological apparatus of corpus linguistics.

Learning about the history of corpus linguistics.

Training in the use of corpora for linguistic research.

Training in the evaluation of the results of corpus studies.

Learning to ask interesting linguistic questions.

Teaching students to perform corpus analysis independently.


To participate successfully in in-class discussions and to follow the lectures, students should have completed the introductory linguistics courses. This course is related to other courses in the Linguistics curriculum.

Content (Syllabus outline)

The course opens with the historical background of corpus linguistics. It then covers how corpora of various types are built and tagged. The way a corpus is built determines how it can be used and which analytical tools can be applied to it. A major part of the course is devoted to language-specific lexical analyses and to sample language descriptions based on corpora, which are critically evaluated. Special attention is also given to methods and tools for storing and organizing processed language data, and to the individual development of a standardized tagged corpus.

Material discussed in lectures also forms the main part of the recitations, where more time and care are devoted to the main issues. Emphasis is placed on practical knowledge and hands-on analysis of language data.

Intended learning outcomes

Students learn the foundations of corpus linguistics. They are able to read and evaluate corpus research and to look for new research directions. They are able to conduct corpus research independently and know the benefits and problems of this approach.


Adam Kilgarriff: Word sketches


Douglas Biber et al., 1998: Corpus Linguistics. Investigating Language Structure and Use. Cambridge: Cambridge University Press.

Corpus Linguistics Around the World, 2006. Amsterdam, New York: Rodopi.

Vojko Gorjanc, 2005: Uvod v korpusno jezikoslovje. Domžale: Izolit.

Graeme Kennedy, 1998: An Introduction to Corpus Linguistics. London: Longman.

Jezik in slovstvo, 2003, no. 3-4. Thematic issue: Jezikovne tehnologije za slovenščino.

Proceedings of the Jezikovne tehnologije (za slovenščino) conferences. Ljubljana: Inštitut Jožef Stefan, 1996, 1998, 2000, 2002, 2004, 2006.


A term paper is mandatory for completing this course. To be admitted to the exam, the student must also present the term paper orally.

Lecturer's references

Boris Kern’s research focuses on modern lexicology and lexicography, word formation and semantics. He participates in the preparation of the Dictionary of Newer Standard Slovene Words and the New Dictionary of the Slovene Language.

Selected publications:

KERN, Boris. Analiza besedotvornih sklopov glagola stopiti. Jezikoslovni zapiski 17/1 (2011).

KERN, Boris. Stopenjske tvorjenke iz glagolov čutnega zaznavanja. Družina v slovenskem jeziku, literaturi in kulturi: zbornik predavanj 47. SSJLK. Znanstvena založba Filozofske fakultete, Ljubljana, 2011, pp. 156–160.

KERN, Boris. Stopenjsko besedotvorje. Slavistična revija LVIII/3 (2010), pp. 335–348.

  • Lectures: 15 hours
  • Exercises: 15 hours
  • Individual work: 60 hours

Course type: mandatory

Learning and teaching methods:

  • lectures
  • in-class discussion
  • term-paper workshop
  • oral presentation of the term paper
  • individual reading assignments
  • practical training with a computer