ABSTRACT: This Mini-School is linked to the NITheCS research programme ‘Genomics, Bioinformatics, and Advanced Medicine’. The theme of the first two lectures, given by Prof Martin Bucher, is the algorithmics of genomic inference. With modern genome sequencing, massive datasets have become available. How to fully exploit them remains an active area of research. The final two lectures by Dr Japie Greeff deal with applications of AI and Machine Learning to problems in medicine.
Reconstructing Family Trees from Genomic Data: Basic Methods
The fundamentals of phylogenetic reconstruction are reviewed, including distance methods, parsimony, maximum likelihood, and Bayesian inference. For small datasets, these methods can be applied exhaustively to all possible trees to find the best solution or the most probable solutions.
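To give a sense of what "all possible trees" means in practice, the short Python sketch below counts the distinct unrooted, fully resolved (binary) tree topologies on n labelled species, using the standard formula (2n-5)!!. The function name is chosen here for illustration and is not taken from the lectures.

```python
def num_unrooted_binary_trees(n_taxa: int) -> int:
    """Count unrooted binary tree topologies on n labelled leaves: (2n-5)!!"""
    if n_taxa < 3:
        return 1  # with fewer than three leaves there is only one topology
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5 (the double factorial).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (4, 5, 10, 20):
    print(n, num_unrooted_binary_trees(n))
```

Four species give 3 topologies and ten species already give 2,027,025, which is why exhaustive evaluation is only practical for small datasets.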
Reconstructing Family Trees from Genomic Data: Methods for Big Data
The methods of the previous lecture are feasible for small datasets and always find the appropriately defined ‘best’ solution. These methods, however, are too slow for the large datasets now available. We discuss heuristic methods which, while not guaranteed always to return the best solution, often return the best solution or a good approximation thereof. We also discuss applications and future prospects.
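As one illustration of trading a guarantee of optimality for speed, the Python sketch below implements UPGMA, a classic greedy distance-based clustering method. It is chosen here only as a simple example and is not necessarily among the heuristics covered in the lecture. It repeatedly merges the two closest clusters instead of examining every tree; the function name, the input layout, and the example distances are all assumptions made for this sketch.

```python
from itertools import combinations


def upgma(names, dist):
    """Greedy UPGMA clustering.

    names: iterable of leaf labels.
    dist:  dict mapping a pair (a, b) of labels to their distance.
    Returns the tree as nested 2-tuples.
    """
    sizes = {name: 1 for name in names}  # cluster -> number of leaves
    d = {frozenset(pair): value for pair, value in dist.items()}
    while len(sizes) > 1:
        # Greedy step: pick the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        # Distance to the merged cluster is the size-weighted average.
        for c in sizes:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    (tree,) = sizes
    return tree


# Hypothetical pairwise distances between four species.
example_dist = {
    ("A", "B"): 2, ("A", "C"): 6, ("A", "D"): 10,
    ("B", "C"): 6, ("B", "D"): 10, ("C", "D"): 10,
}
print(upgma(["A", "B", "C", "D"], example_dist))
# prints ('D', ('C', ('A', 'B')))
```

The number of merges grows linearly with the number of species, in contrast to the explosive growth in the number of candidate trees, but the greedy choice at each step means the result need not be the best tree under any given criterion.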
Applied Artificial Intelligence: Using AI for a Systematic Literature Review
The applications of AI are rapidly expanding. Systematic literature reviews play a major role in the medical sciences and are extremely labour-intensive. We show how AI can simplify this task using semi-supervised learning. A small amount of Python knowledge is helpful.
Active Learning to Generate Models with Partially Labelled Data
We discuss a semi-supervised machine learning technique called Active Learning, in which a fully labelled dataset is not needed to create a classification model. Instead, the model is trained while the dataset is being labelled, so that labelling and training proceed together.
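The idea can be sketched with a deliberately small toy example in Python: pool-based active learning with uncertainty sampling for a one-dimensional threshold classifier. This is an illustration under stated assumptions, not the method used in the lecture. The function names, the `oracle` callback standing in for the human labeller, and the assumptions that the data are separable and that the two extreme points belong to different classes are all choices made for this sketch.

```python
def fit_boundary(labelled):
    """Fit a threshold halfway between the closest labelled 0 and 1 points."""
    negatives = [x for x, y in labelled.items() if y == 0]
    positives = [x for x, y in labelled.items() if y == 1]
    if not negatives or not positives:
        raise ValueError("need at least one labelled example of each class")
    return (max(negatives) + min(positives)) / 2


def active_learn_threshold(pool, oracle, budget):
    """Label at most `budget` extra points, always the most uncertain one."""
    pool = sorted(pool)
    # Seed with the two extremes (assumed to belong to different classes).
    labelled = {x: oracle(x) for x in (pool[0], pool[-1])}
    for _ in range(budget):
        boundary = fit_boundary(labelled)
        candidates = [x for x in pool if x not in labelled]
        if not candidates:
            break
        # Uncertainty sampling: query the point closest to the boundary.
        query = min(candidates, key=lambda x: abs(x - boundary))
        labelled[query] = oracle(query)  # the labeller answers one question
    return fit_boundary(labelled), labelled


# Hypothetical ground truth: items from 37 upwards are "relevant".
boundary, labelled = active_learn_threshold(
    range(100), oracle=lambda x: int(x >= 37), budget=10
)
print(boundary, len(labelled))  # prints 36.5 12
```

Here 12 labels out of 100 are enough to classify the whole pool correctly, because each query is spent where the current model is least certain. A realistic application, such as screening abstracts for a literature review, would replace the threshold with a text classifier, but the query-label-retrain loop has the same shape.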