Objectives and competences

The course aims to give students an overview of language technology and related topics in information theory, of text corpora for the Slovenian language and the corresponding tools, and a basic understanding of the structure of web pages and the relevant markup languages, such as HTML and XML.

Students gain competence in evaluating electronic language resources and in preparing language-related reports for the web environment.

They learn a new approach to solving language problems, one made possible by the contemporary, web-based era.


The course does not require any special skills or knowledge beyond those covered by a future linguist's previous education. All that is needed is basic computer literacy, some experience in using web resources and, last but not least, a reasonable command of English.

Content (Syllabus outline)

  • Overview of the field of language technology
  • Basic web skills
  • Overview of markup languages such as HTML and XML
  • Text corpora and related tools, especially for the Slovenian language
  • Term paper in the form of a web page with statistical analysis of a chosen Slovenian or English fiction text, including its lemmatization and preparation of a dictionary of open-class words.

Intended learning outcomes

Students learn how to use a modern tool for text analysis and explore its potential for testing linguistic hypotheses. They understand the inner structure of simple and machine-generated web pages, and they get an overview of Slovenian language corpora and their use. Students learn how to produce a statistical description of a given text, including the preparation of a frequency dictionary of open-class words.
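As an illustration of the last outcome, the following is a minimal Python sketch of a frequency dictionary of open-class words. It is not the tool used in the course: the function name `frequency_dictionary`, the sample sentence, and the tiny `CLOSED_CLASS` word list are assumptions made for this example, and the lemmatization step required in the term paper is left out because it needs a dedicated external tool.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny list of English closed-class words
# (articles, conjunctions, prepositions, pronouns, auxiliaries).
# A real analysis would use a much fuller list or part-of-speech tags.
CLOSED_CLASS = {
    "the", "a", "an", "and", "or", "but",
    "of", "in", "on", "to", "it", "is", "was",
}


def frequency_dictionary(text, closed_class=CLOSED_CLASS):
    """Count word forms in `text`, skipping closed-class words.

    Tokens are lowercased runs of word characters; no lemmatization
    is performed, so inflected forms are counted separately.
    """
    tokens = re.findall(r"\w+", text.lower())
    return Counter(t for t in tokens if t not in closed_class)


sample = "The cat sat on the mat, and the cat slept."
freq = frequency_dictionary(sample)

# List the remaining words from most to least frequent.
for word, count in freq.most_common():
    print(word, count)
```

Because `\w` matches Unicode letters in Python 3, the same tokenization also handles Slovenian characters such as č, š and ž; only the closed-class list would need to be replaced.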


Term paper in the form of a web page and its presentation (60%), oral exam (40%).

Lecturer's references

Jernej Vičič studied computer and information science at the Faculty of Electrical Engineering and completed his studies at the newly created Faculty of Computer and Information Science.

In 1999 he received his BA degree with a thesis entitled Napredne grafične metode [Advanced Graphic Methods].

In 2002 he received his MA degree at the same faculty with a thesis entitled Avtomatsko prevajanje iz slovenskega v angleški jezik na osnovi statističnega strojnega prevajanja [Automatic Translation from Slovenian to English on the Basis of Statistical Machine Translation]. Under the supervision of Professor Sašo Divjak and Tomaž Erjavec, PhD, he investigated methods and algorithms for the statistical machine translation of natural languages. After obtaining his MA degree, he continued his research in the same field, focusing on training computers to translate natural languages, particularly related languages. In 2012, he defended his PhD thesis entitled Hitra postavitev prevajalnih sistemov na osnovi pravil za sorodne naravne jezike [Fast Implementation of Rule-Based Machine Translation Systems for Similar Natural Languages].

University course code: 1SI304

Year of study: 3

Semester: summer

Course principal:





  • Lectures: 30 hours
  • Exercises: 30 hours
  • Individual work: 60 hours

Course type: compulsory

Languages: Slovene

Learning and teaching methods:
  • lectures
  • conversation
  • problem solving
  • seminar
  • use of web tools and resources
  • creation of web pages
  • presentation of term paper