Artificial Intelligence for Data Analysis

Objectives and competences

AI-assisted data analysis is a process of discovering patterns and models, described by rules or other human-understandable representation formalisms. The most important step in this process is data mining, performed using methods, techniques and tools for the automated construction of patterns and models from data.

The course objectives are to (a) introduce the basics of data mining, (b) outline the process of knowledge discovery in databases and the CRISP-DM methodology, (c) present the methodology for result evaluation, (d) present selected data mining methods and techniques through cases relevant to engineering and management, and (e) equip the students with the skills for practical use of selected data mining tools.

The students will master the basics of data preprocessing, data mining and knowledge discovery and will be capable of using selected data mining tools and results evaluation methods in practice.

Prerequisites

Basic knowledge of mathematics, computer science and informatics is required.

Content

  1. Introduction
  2. Artificial Intelligence (AI) in a business environment
  3. Data analysis following the CRISP-DM methodology
  4. AI techniques for data analysis:
    - Analysis of tabular data
    - Heuristics for model and pattern construction
    - Quality of learned models and discovered patterns
    - Methodology for results evaluation
    - Text analysis
  5. AI use cases in business, ecology, industry, etc.
  6. Practical use of selected data analysis tools
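
As an illustration of the rule-based models and the methodology for results evaluation listed under item 4, the following minimal Python sketch learns a single-attribute rule (in the spirit of a one-rule classifier) and estimates its accuracy with k-fold cross-validation. The toy data, the attribute names and all function names are hypothetical and are not taken from the course materials; the tools and methods used in the course may differ.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: each row maps attribute names to values.
ROWS = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
LABELS = ["play", "stay", "play", "stay", "play", "stay"]


def learn_rule(rows, labels, attribute):
    """Map each value of one attribute to its most frequent class."""
    counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        counts[row[attribute]][label] += 1
    rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    # Fallback class for attribute values not seen during training.
    default = Counter(labels).most_common(1)[0][0]
    return rule, default


def predict(rule, default, row, attribute):
    """Apply the learned rule to one row."""
    return rule.get(row[attribute], default)


def cross_validated_accuracy(rows, labels, attribute, k=3):
    """Estimate accuracy by k-fold cross-validation (fold = index mod k)."""
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(rows)) if i % k != fold]
        test = [i for i in range(len(rows)) if i % k == fold]
        rule, default = learn_rule(
            [rows[i] for i in train], [labels[i] for i in train], attribute
        )
        correct += sum(
            predict(rule, default, rows[i], attribute) == labels[i] for i in test
        )
    return correct / len(rows)


if __name__ == "__main__":
    for attribute in ("outlook", "windy"):
        print(attribute, cross_validated_accuracy(ROWS, LABELS, attribute))
```

The learned rule is a human-readable mapping such as "if outlook is sunny then play", and the accuracy is always measured on examples that were held out from training. On this toy data the rule on "outlook" classifies every held-out example correctly, while the rule on "windy" does not.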

Intended learning outcomes

Knowledge and understanding:

Mastery of selected Artificial Intelligence methods and techniques for data analysis, the ability to preprocess data and to use selected data mining techniques in practice, and the ability to apply and interpret result evaluation methods.

Readings

Selected chapters from the following books:

  • D. Mladenić, N. Lavrač, M. Bohanec, S. Moyle (eds.): Data Mining and Decision Support: Integration and Collaboration. Kluwer, 2003. ISBN 1-4020-7388-7
  • I.H. Witten, E. Frank, M.A. Hall: Data Mining: Practical Machine Learning Tools and Techniques (Third Edition). Morgan Kaufmann, 2011. ISBN 978-0-12-374856-0
  • M. Berthold (ed.): Bisociative Knowledge Discovery. Springer, 2012. ISBN 978-3-642-31829-0
  • J. Fuernkranz, D. Gamberger, N. Lavrač: Foundations of Rule Learning. Springer, 2012. ISBN 978-3-540-75196-0

Assessment

Competence evaluation:
• The written exam evaluates basic knowledge of artificial intelligence for data analysis and of the knowledge discovery process following the CRISP-DM methodology.
• The seminar or project work and its oral defense evaluate practical competence in using the selected data analysis tools and methods for results evaluation.
Weighting: written exam 50%, seminar or project work 50%.

Lecturer's references

Prof. Dr. Nada Lavrač, full professor in the field of Computer Science
Principal education and research areas: knowledge technologies, artificial intelligence, machine learning, data mining and text mining, relational data mining and inductive logic programming, combining data mining and decision support, computational creativity, knowledge management, marketing and virtual enterprises, and applications of machine learning and data mining techniques in biomedicine, healthcare, life sciences, marketing and media analysis.
Professional career: employed at the Jožef Stefan Institute (IJS) since 1978; founder of the Department of Knowledge Technologies and its Head in 2014-2020; research councillor at IJS since 2002; full professor at the University of Nova Gorica and the Jožef Stefan International Postgraduate School since 2007; vice-president of ECCAI (European Coordination Committee for AI) in 1996-1998; member of the Slovenian AI Society SLAIS; ELLIS Board member in 2022-2024.

Publications and achievements: author of numerous scientific papers, author of four scientific monographs, editor of numerous books and proceedings, author of two outstanding scientific achievements (2011 and 2012), coordinator of two EU projects, Slovenian principal investigator of over ten EU projects worth over EUR 4 million.

Awards: 2022 Zois award for outstanding research achievements in machine learning; 2020 ELLIS Fellow in machine learning; 2013 Zois recognition award for important scientific contributions to intelligent data analysis; 2007 ECCAI/EURAI Fellow Award for pioneering research and advances in the field of Artificial Intelligence in Europe; 1998 Ambassador of Science of the Republic of Slovenia for outstanding research and contribution to international recognition of Slovenian science; 1986 National award for research excellence (Boris Kidrič Fund Award) for research in knowledge synthesis and qualitative modeling (system KARDIO for ECG diagnosis of cardiac arrhythmias, later published as the monograph Kardio: A Study in Deep and Qualitative Knowledge for Expert Systems, MIT Press, 1989, coauthor).

Prof. Dr. Aneta Trajanov (former Trajanov), associate professor in the field of computer science and informatics at the University of Nova Gorica and director of the Master's programme Management and Engineering, is an expert in the area of artificial intelligence. She completed her PhD on machine learning in 2010 at the Jožef Stefan International Postgraduate School. From 2005 until 2022 she was a researcher at the Department of Knowledge Technologies at the Jožef Stefan Institute. She completed her post-doc at the Ruđer Bošković Institute, Zagreb, Croatia, in 2015/2016. Her main research interests are machine learning and knowledge discovery from environmental data, decision support, inductive logic programming and equation discovery. She has worked on many European and national projects in the area of agroecology, where she applied different machine learning methods to the analysis of (agro)ecological data. Since November 2022 she has worked as Director of Artificial Intelligence at the company MarineXchange, which develops software for the cruise industry.

Selected bibliography

• Lavrač N., Džeroski, S.: Inductive Logic Programming: Techniques and Applications. Ellis Horwood, 1994.
• Lavrač N., Kavšek, B., Flach P. A., Todorovski, L.: Subgroup discovery with CN2-SD. Journal of Machine Learning Research, 5 (2004), 153-188.
• Železný F., Lavrač N.: Propositionalization-based relational subgroup discovery with RSD. Machine Learning, 62(1-2) (2006), 33-63.
• Fuernkranz J., Gamberger D., Lavrač N.: Foundations of Rule Learning. Springer 2012.
• Lavrač N., Podpečan V., Robnik-Šikonja M.: Representation Learning: Propositionalization and Embeddings. Springer 2021.
• Sandén, Taru, Wawra, Anna, Berthold, Helene, Miloczki, Julia, Schweinzer, Agnes, Gschmeidler, Brigitte, Spiegel, Heide, Debeljak, Marko, Trajanov, Aneta. TeaTime4Schools: using data mining techniques to model litter decomposition in Austrian urban school soils. Frontiers in ecology and evolution. 2021, vol. 9, pp. 703794-1-703794-9, illus. ISSN 2296-701X. DOI: 10.3389/fevo.2021.703794. [COBISS.SI-ID 68232707]
• Iannetta, Pietro, Debeljak, Marko, Trajanov, Aneta, et al. A multifunctional solution for wicked problems: value-chain wide facilitation of legumes cultivated at bioregional scales is necessary to address the climate-biodiversity-nutrition nexus. Frontiers in sustainable food systems. 2021, vol. 5, pp. 692137-1-692137-8. ISSN 2571-581X. DOI: 10.3389/fsufs.2021.692137. [COBISS.SI-ID 72049155]
• Wall, David P., Delgado, Antonio, O'Sullivan, Lilian, Creamer, Rachel, Trajanov, Aneta, Kuzmanovski, Vladimir, Henriksen, Christian B., Debeljak, Marko. A decision support model for assessing the water regulation and purification potential of agricultural soils across Europe. Frontiers in sustainable food systems. 2020, vol. 4, pp. 115-1-115-11. ISSN 2571-581X. DOI: 10.3389/fsufs.2020.00115. [COBISS.SI-ID 21854979]
• Sandén, Taru, Trajanov, Aneta, Spiegel, Heide, Kuzmanovski, Vladimir, Saby, Nicolas, Picaud, Calypso, Henriksen, Christian B. H., Debeljak, Marko. Development of an agricultural primary productivity decision support model: a case study in France. Frontiers in environmental science. 2019, vol. 7, pp. 58-1-58-13. ISSN 2296-665X. DOI: 10.3389/fenvs.2019.00058. [COBISS.SI-ID 32342311]
• Leeuwen, Jeroen P. Van, Debeljak, Marko, Kuzmanovski, Vladimir, Trajanov, Aneta, et al. Modeling of soil functions for assessing soil quality: soil biodiversity and habitat provisioning. Frontiers in environmental science. 2019, vol. 7, pp. 113-1-113-13. ISSN 2296-665X. DOI: 10.3389/fenvs.2019.00113. [COBISS.SI-ID 32581927]