My background is in machine learning and medical image analysis. Machine learning algorithms learn from examples in order to make predictions about novel data. For example, by learning from medical scans with annotated abnormalities, the algorithms can detect abnormalities in previously unseen patients. Below are some of the research topics I have worked on in the past or am working on right now.

Multiple instance learning (2011 – now)

Multiple instance learning (MIL) is a learning scenario where labels are available only for sets (bags) of samples (instances). For example, in medical imaging, labels can be available only for entire scans (bags), and not for regions of interest (instances). I have worked on MIL algorithms which are based on distances between bags, and on more fundamental aspects, such as the relationship of MIL to other learning settings, the trade-off between bag-level and instance-level performance, and the differences between real-life and artificial MIL datasets.
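
To make the bag and instance terminology concrete, here is a minimal sketch of a classifier based on distances between bags. The particular bag distance (mean of minimum instance distances), the nearest-neighbor rule, and the toy data are illustrative assumptions, not a description of any specific published method.

```python
import numpy as np

def bag_distance(bag_a, bag_b):
    """Mean, over instances in bag_a, of the Euclidean distance to the
    closest instance in bag_b. Note: this is not symmetric in general."""
    # Pairwise instance distances, shape (len(bag_a), len(bag_b)).
    diffs = bag_a[:, None, :] - bag_b[None, :, :]
    pairwise = np.sqrt((diffs ** 2).sum(axis=2))
    return pairwise.min(axis=1).mean()

def classify_bag(test_bag, train_bags, train_labels):
    """Assign the label of the nearest training bag (1-nearest-neighbor)."""
    distances = [bag_distance(test_bag, bag) for bag in train_bags]
    return train_labels[int(np.argmin(distances))]

# Hypothetical toy data: each bag is a scan, each row a region's 2-D feature vector.
# Only the bag labels are known; no region is labeled individually.
train_bags = [
    np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1]]),  # label 0: only normal-looking regions
    np.array([[0.0, 0.1], [0.2, 0.0], [3.0, 3.0]]),  # label 1: one region far from the rest
]
train_labels = [0, 1]

test_bag = np.array([[0.1, 0.1], [2.9, 3.1]])
predicted = classify_bag(test_bag, train_bags, train_labels)
print(predicted)
```

The classifier never sees instance labels: the decision is made entirely from how the set of regions in one scan compares to the sets in the labeled scans.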

Learning with similarities (2011 – now)

Some algorithms make predictions by defining distances or similarities between images. In several cases distances can be non-metric: similarities provided by experts, distances defined by transformations between objects, distances between sets, and so forth. Because many algorithms expect metric distances, the non-metric characteristics are often considered an artifact and are removed. I have worked on understanding when such non-metric characteristics can be informative and how to best use this information.
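
As an illustration of what "non-metric" means in practice, the following sketch checks a dissimilarity matrix for symmetry and counts violations of the triangle inequality. The function names and the small matrix are hypothetical and chosen only for this example.

```python
import numpy as np

def is_symmetric(D, tol=1e-9):
    """True if D[i, j] equals D[j, i] for all pairs, up to a tolerance."""
    return bool(np.allclose(D, D.T, atol=tol))

def triangle_violations(D, tol=1e-9):
    """Count ordered triples (i, j, k) with D[i, k] > D[i, j] + D[j, k]."""
    via = D[:, :, None] + D[None, :, :]   # via[i, j, k] = D[i, j] + D[j, k]
    direct = D[:, None, :]                # direct[i, 0, k] = D[i, k]
    return int((direct > via + tol).sum())

# Hypothetical dissimilarities between three objects. Objects 0 and 2 are
# each close to object 1, yet far from each other.
D = np.array([
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 1.0],
    [5.0, 1.0, 0.0],
])

print(is_symmetric(D))         # symmetric, so that property holds
print(triangle_violations(D))  # but the detour through object 1 is shorter
```

A nonzero count does not by itself say whether the violations are noise or signal; it only shows that the dissimilarities cannot be treated as a metric without modification.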

Transfer learning (2015 – now)

Most machine learning algorithms assume that the training and test data originate from the same distribution. However, this is not always the case for data acquired under different conditions, such as scans made with different scanners. In such cases, traditional machine learning algorithms can fail. I am currently working on creating algorithms which are robust to variations between scanners and between patient groups for segmentation tasks in brain MR images and for classification of COPD.
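
The failure mode can be sketched with synthetic data. In this toy example, which is an assumption-laden illustration rather than a method from my work, a one-dimensional "intensity" feature separates two classes, and a second "scanner" adds a constant offset. A threshold learned on the first scanner breaks on the second, while centering each domain with its own unlabeled mean restores performance. The offset model, the equal class balance in both domains, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_domain(offset, n_per_class=500):
    """Two Gaussian classes, 4 units apart, shifted by a scanner-specific offset."""
    x0 = rng.normal(0.0 + offset, 1.0, n_per_class)
    x1 = rng.normal(4.0 + offset, 1.0, n_per_class)
    x = np.concatenate([x0, x1])
    y = np.concatenate([np.zeros(n_per_class, dtype=int),
                        np.ones(n_per_class, dtype=int)])
    return x, y

def fit_threshold(x, y):
    """Decision threshold halfway between the two class means."""
    return 0.5 * (x[y == 0].mean() + x[y == 1].mean())

def accuracy(x, y, threshold):
    return float(((x > threshold).astype(int) == y).mean())

x_src, y_src = make_domain(offset=0.0)   # training scanner
x_tgt, y_tgt = make_domain(offset=4.0)   # test scanner with shifted intensities

# Train on the source scanner, apply unchanged to the target scanner.
acc_raw = accuracy(x_tgt, y_tgt, fit_threshold(x_src, y_src))

# Center each domain with its own mean; this uses no target labels.
x_src_c = x_src - x_src.mean()
x_tgt_c = x_tgt - x_tgt.mean()
acc_centered = accuracy(x_tgt_c, y_tgt, fit_threshold(x_src_c, y_src))

print(acc_raw, acc_centered)
```

Real scanner differences are rarely a simple offset, and class proportions often differ between patient groups, which is why such simple normalization is not sufficient in general.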

Classification of COPD (2014 – now)

I have previously used MIL classifiers to detect chronic obstructive pulmonary disease (COPD) in chest CT images in a single-center dataset. I am currently extending this work to also provide local abnormality scores (rather than only global patient labels), and improving the robustness of the method for multi-center datasets.

Crowdsourcing (2016 – now)

In medical imaging, annotations are often expensive to acquire and therefore scarce, motivating learning scenarios such as multiple instance learning and transfer learning. An alternative is to gather more annotations by outsourcing the task to the crowd, i.e. crowdsourcing. Although the crowd annotators do not have specialized training, several studies show that by combining their answers, good results can be achieved. I am investigating whether this is also the case for measuring airways in chest CT images.
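
One simple way to combine answers is sketched below, assuming each annotator provides a numeric measurement per airway. The median is used here as an illustrative choice because it is less sensitive to a single poor annotation than the mean; the numbers are made up and do not come from any study.

```python
import numpy as np

# Hypothetical measurements (in mm): rows are airways, columns are crowd
# annotators. np.nan marks an airway that an annotator skipped.
measurements = np.array([
    [2.1, 2.4, 2.2, 9.0],     # the last annotator is far off
    [3.0, 3.3, np.nan, 3.1],
    [1.5, 1.4, 1.6, 1.5],
])

# Combine per airway, ignoring missing answers.
combined_median = np.nanmedian(measurements, axis=1)
combined_mean = np.nanmean(measurements, axis=1)

print(combined_median)
print(combined_mean)
```

For the first airway the mean is pulled strongly toward the outlying answer, whereas the median stays close to what the other three annotators reported.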

For more information please see my publications.