Elements of Biostatistics and Bioinformatics

Objectives and competences

The objective of the course is to provide students with the following expertise:
• An improved understanding of the uni- and multivariate statistics applied to biological data;
• The capacity to visualize uni- and multivariate datasets and to assess univariate statistical significance;
• Critical thinking towards the statistics applied by other researchers in their work;
• The basics of NGS data handling.

Prerequisites

There are no formal prerequisites, but a basic knowledge of the Linux operating system and of coding is preferred.

Content

• Basics of R, usage of RStudio and command syntax;
• Data visualization using R, ggplot2 syntax and logic, and effective data visualization;
• Nomenclature, definition of variable types, distributions and measures of dispersion;
• Introduction to parametric and non-parametric methods;
• Differences among categorical variables (chi-squared test), effect size;
• Introduction to linear regression;
• Introduction to multivariate methods: PCA;
• Introduction to bioinformatics, file formats (.fastq, .fasta, .gff, .sam, .bam, etc.), use of the most common databases;
• NGS data assembly, read mapping, genome viewing using genome browsers, protein structure prediction using AlphaFold2.

Intended learning outcomes

The main outcome of the course is that students become able to use bioinformatics and biostatistics independently, which will be useful throughout the PhD for planning and performing experiments and for evaluating the results rigorously.

Readings

  • Whitlock, M. C. and Schluter, D. The Analysis of Biological Data. Greenwood Village, Colorado: Roberts and Company Publishers, 2009. ISBN 978-0-9815194-0-1
  • Statistical tools for high-throughput data analysis
  • Other online resources

Assessment

Evaluation consists of specific exercises on the statistical analysis of public datasets, assigned individually to each student, followed by a seminar with comments on the analysis performed.
Written part (80 %)
Oral part (20 %)

Lecturer's references

Prof. Alfonso Esposito is an Associate Professor in Genetics at the University of Enna "Kore". Over the last 7 years, he has been responsible for the bioinformatics and biostatistics procedures at the University of Trento and at ICGEB, also participating in the training of both master's and PhD students. He has given lectures and seminars on bioinformatics applied to microbial genomics and metagenomics on several occasions.

Bibliography:
• Renato Pedron, Alfonso Esposito, William Cozza, Massimo Paolazzi, Mario Cristofolini, Nicola Segata, Olivier Jousson. Microbiome characterization of alpine water springs for human consumption reveals site- and usage-specific microbial signatures (2022) Frontiers in Microbiology, IF 4.07, DOI: 10.3389/fmicb.2022.946460.
• Cristina Bez, Alfonso Esposito, Hang Dinh Thuy, Minh Nguyen Hong, Giampiero Valè, Danilo Licastro, Iris Bertani, Silvano Piazza, Vittorio Venturi. The rice foot rot pathogen Dickeya zeae alters the in-field plant microbiome (2021) Environmental Microbiology, IF 5.49, DOI: 10.1111/1462-2920.15726.
• Alfonso Esposito, Luigimaria Borruso, Jayne Rattray, Lorenzo Brusetti, Engy Ahmed. Taxonomic and functional insights into rock varnish microbiome using shotgun metagenomics (2019) FEMS Microbiology Ecology, 95:12, IF 4.09, DOI: 10.1093/femsec/fiz180.
• Alfonso Esposito, Arianna Pompilio, Clotilde Bettua, Valentina Crocetta, Elisabetta Giacobazzi, Ersilia Fiscarelli, Olivier Jousson, Giovanni Di Bonaventura. Evolution of Stenotrophomonas maltophilia in Cystic Fibrosis Lung over Chronic Infection: A Genomic and Phenotypic Population Study (2017) Frontiers in Microbiology, IF 4.07, DOI: 10.3389/fmicb.2017.01590.