OEKB100148 Data-driven AI


Type
Course with continuous assessment
Semester hours
3
Lecturer (contributors)
Weibel, Jean-Baptiste; Holzinger, Andreas; Schweng, Stefan
Organisation
Waldwachstum (Forest Growth)
Offered in semester
Summer semester 2026
Language of instruction
English

Course content

The module "Statistics and Data Driven AI" (6 ECTS) gives beginner students a hands-on introduction to data-driven AI in forestry. It is divided into two parts:

In the first part (3 ECTS), students explore how data can be gathered and used in forestry with practical examples, such as analysing forest data. They also get a simple and beginner-friendly introduction to the statistical programming language R.

In the second part (3 ECTS), the focus shifts to understanding basic ideas behind data-driven AI and machine learning. Again, forestry examples are used to make the content practical and relatable. Students also learn to work with Python, one of the most important programming languages in AI, at a beginner level.

By the end of the module, students will have a basic understanding of how data-driven AI and data analysis can support smart forestry, and will have had first experiences with R and Python.

Prerequisites (expected knowledge)

Students should be familiar with simple math concepts and have a general understanding of how to use a computer. A basic curiosity for nature, forestry, and digital tools is helpful. The course is designed to be beginner-friendly and guides students step by step through the content, preparing them for the digital transformation in smart forestry.

Learning objective

Students who have successfully completed the course understand the basics of probability theory and can describe the motivation and function of basic measures of descriptive statistics and error propagation. They are familiar with discrete and continuous random variables and their distributions and know the common data types and structures for representing random variables in computer systems. Students are familiar with various methods of inferential statistics (Gauss test (z-test), t-test, binomial test; analysis of variance (ANOVA); Tukey test; distribution tests, Shapiro-Wilk test, Levene test; chi-square goodness-of-fit and homogeneity tests; classical linear model; multiple linear model; non-linear regression; generalised linear models) as well as their modes of operation and areas of application; in particular, they know the prerequisites and limitations of the methods. Students are familiar with the functionality and syntax of the statistical programming language R.


Students understand one of the simplest regression models (polynomial curve fitting) and, through it, the underfitting/overfitting phenomenon. They learn how to handle a data set so that these phenomena can be detected and avoided: how to split it into training, validation and test sets, how to apply cross-validation, and how to merge and stratify a data set. For clustering problems, students can explain k-means and Gaussian Mixture Models (GMM) using their derivations and examples. Students know the basic principles of Bayesian Networks (BN), including prior and posterior probability distributions, causal dependencies, Conditional Probability Tables (CPT) and the Markov blanket. Decision trees for classification and regression problems are discussed using exercises, and the concepts of information gain and discriminatory ability are explained. Students are familiar with the most important metrics for classification problems (accuracy, confusion matrix, mutual information). They can evaluate the influence of the associated risk, especially with imbalanced data sets. Students are familiar with the functionality and syntax of Python, a programming language widely used in machine learning.
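
As an illustration of the first objective above, the following is a minimal sketch of polynomial curve fitting with a held-out validation set. It is not taken from the course material: the synthetic sine data, the noise level, the sample sizes, the chosen degrees and the helper names `make_data` and `rmse` are all assumptions made for this example. It only relies on NumPy's `polyfit` and `polyval`.

```python
import numpy as np


def make_data(n, rng):
    """Return n noisy samples of sin(2*pi*x) on [0, 1] (synthetic, illustrative data)."""
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n)
    return x, y


def rmse(coeffs, x, y):
    """Root-mean-square error of the polynomial with coefficients coeffs on (x, y)."""
    return float(np.sqrt(np.mean((np.polyval(coeffs, x) - y) ** 2)))


rng = np.random.default_rng(0)
x, y = make_data(30, rng)

# Shuffle the indices, then keep 20 points for training and 10 for validation.
idx = rng.permutation(len(x))
train, val = idx[:20], idx[20:]

results = {}
for degree in (0, 1, 3, 9):
    # Least-squares fit on the training points only.
    coeffs = np.polyfit(x[train], y[train], degree)
    results[degree] = (
        rmse(coeffs, x[train], y[train]),  # training error
        rmse(coeffs, x[val], y[val]),      # validation error
    )

for degree, (train_err, val_err) in results.items():
    print(f"degree {degree}: train RMSE {train_err:.3f}, validation RMSE {val_err:.3f}")
```

The training error cannot increase as the degree grows, because each higher-degree model contains the lower-degree ones. The validation error is the quantity to watch: a model that is too simple tends to show high error on both sets (underfitting), while a model that is too flexible can show low training error but higher validation error (overfitting).
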
Further information on the course, such as dates or examination details, can be found on the course page in BOKUonline.