Reading whole-slide images

The training and testing image data are provided as whole-slide images, stored in the Aperio .svs file format as multi-resolution pyramids. Each file contains multiple downsampled versions of the original image. Each image in the pyramid is stored as a series of tiles to facilitate rapid retrieval of subregions of the image.
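As a rough illustration of why the tiled pyramid layout makes subregion access fast, the sketch below computes the size of each pyramid level and the tiles that a requested region touches. The downsample factor of 4 per level, the number of levels, and the 256-pixel tile size are illustrative assumptions, not properties stated for these files; the function names are hypothetical.

```python
def pyramid_levels(width, height, factor=4, levels=3):
    """Return the (width, height) of each pyramid level, full resolution first.

    ASSUMPTION: every level is `factor` times smaller than the previous one.
    """
    return [(width // factor**i, height // factor**i) for i in range(levels)]


def tiles_covering(x, y, w, h, tile_size=256):
    """Return (column, row) indices of the tiles intersecting a region.

    The region has its top-left corner at (x, y) and size (w, h), all in
    pixels of the level being read. ASSUMPTION: square tiles of `tile_size`.
    """
    first_col, last_col = x // tile_size, (x + w - 1) // tile_size
    first_row, last_row = y // tile_size, (y + h - 1) // tile_size
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]
```

Under these assumptions, a 600 x 200 pixel region only requires decoding the handful of tiles it overlaps, rather than the whole level.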

OpenSlide is a C library that provides a simple interface for reading whole-slide images in different formats. Python and Java bindings are also available. A MATLAB interface to OpenSlide is available here and here.
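A minimal sketch of reading a slide with the OpenSlide Python bindings is given below. It assumes the openslide-python package and the OpenSlide C library are installed; the helper names and the default patch size are hypothetical. Note that read_region expects the location in level-0 (full-resolution) coordinates and the size in pixels of the requested level.

```python
def summarize_levels(slide):
    """List (level index, (width, height), downsample) for each pyramid level."""
    return [(i, slide.level_dimensions[i], slide.level_downsamples[i])
            for i in range(slide.level_count)]


def read_patch(slide, x, y, size=(512, 512), downsample=1.0):
    """Read an RGB patch whose top-left corner is (x, y) in level-0 coordinates."""
    level = slide.get_best_level_for_downsample(downsample)
    # read_region returns an RGBA PIL image; drop the alpha channel.
    return slide.read_region((x, y), level, size).convert("RGB")


def show_slide_info(path):
    """Open a slide file, print its pyramid layout and return one patch."""
    import openslide  # imported here so the helpers above work without it

    with openslide.OpenSlide(path) as slide:
        for level, dims, ds in summarize_levels(slide):
            print(f"level {level}: {dims[0]} x {dims[1]} (downsample {ds:.1f})")
        return read_patch(slide, 0, 0)
```

Calling show_slide_info with the path to one of the .svs files would print one line per pyramid level and return the top-left 512 x 512 patch at full resolution.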

Automated Slide Analysis Platform (ASAP) is an open-source platform for visualizing, annotating and automatically analyzing whole-slide histopathology images. ASAP is built on top of several well-developed open-source packages such as OpenSlide, Qt and OpenCV. You can download ASAP from GitHub.

Cytomine is a web application for collaborative analysis of multi-gigapixel images. It can read .svs files thanks to OpenSlide and provides a RESTful API for easily extracting image and annotation data.

You can also use Aperio (now Leica) ImageScope to visualize the whole-slide images (Windows only).

Staining unmixing and normalization

One of the major difficulties in histopathology image analysis is appearance variability. For example, when performing mitosis detection, many false positives can arise when the histopathology slide is overstained. This MATLAB code performs staining unmixing (separation of the hematoxylin and eosin stains) and appearance normalization. It is based on the method described in [1]. Some examples of staining normalization can be seen in the figure below.

[1] A method for normalizing histology slides for quantitative analysis, M Macenko, M Niethammer, JS Marron, D Borland, JT Woosley, X Guan, C Schmitt, NE Thomas, IEEE ISBI, 2009.

Example staining normalization. The top row shows the original images and the bottom row shows the same images after staining normalization.
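
For readers working in Python, the following NumPy sketch outlines the general approach of [1]: convert RGB values to optical density, estimate the two stain vectors from the robust angular extremes in the plane of the two principal directions, unmix the stain concentrations, and recompose the image with reference stain vectors. It is not the MATLAB code mentioned above. The reference stain matrix, the reference maximum concentrations, and the defaults io=240, alpha=1 and beta=0.15 are assumptions taken from values commonly seen in public implementations; the function names are hypothetical.

```python
import numpy as np

# ASSUMPTION: reference stain vectors (columns: hematoxylin, eosin) and
# reference maximum concentrations; not given in this document.
HE_REF = np.array([[0.5626, 0.2159],
                   [0.7201, 0.8012],
                   [0.4062, 0.5581]])
MAX_C_REF = np.array([1.9705, 1.0308])


def estimate_stain_matrix(od, alpha=1.0, beta=0.15):
    """Estimate a 3x2 matrix of unit-length stain vectors from (N, 3) optical densities."""
    # Discard nearly transparent pixels (low optical density in any channel).
    od_hat = od[np.all(od >= beta, axis=1)]
    # eigh returns eigenvalues in ascending order: take the two largest.
    _, eigvecs = np.linalg.eigh(np.cov(od_hat.T))
    plane = eigvecs[:, [2, 1]]
    if plane[:, 0].sum() < 0:  # make the principal direction point to positive OD
        plane[:, 0] *= -1
    # Project onto the plane and find the robust extreme angles.
    proj = od_hat @ plane
    phi = np.arctan2(proj[:, 1], proj[:, 0])
    lo, hi = np.percentile(phi, [alpha, 100 - alpha])
    v_lo = plane @ np.array([np.cos(lo), np.sin(lo)])
    v_hi = plane @ np.array([np.cos(hi), np.sin(hi)])
    # Heuristic: hematoxylin absorbs more in the red channel, so it goes first.
    if v_lo[0] > v_hi[0]:
        return np.stack([v_lo, v_hi], axis=1)
    return np.stack([v_hi, v_lo], axis=1)


def normalize_staining(img, io=240, alpha=1.0, beta=0.15):
    """Return a stain-normalized uint8 copy of an (H, W, 3) uint8 RGB image."""
    h, w, _ = img.shape
    od = -np.log((img.reshape(-1, 3).astype(np.float64) + 1) / io)
    he = estimate_stain_matrix(od, alpha, beta)
    # Unmix: solve od.T = he @ conc for the (2, N) stain concentrations.
    conc = np.linalg.lstsq(he, od.T, rcond=None)[0]
    # Rescale so the 99th percentile matches the reference concentrations.
    conc *= (MAX_C_REF / np.percentile(conc, 99, axis=1))[:, None]
    out = io * np.exp(-HE_REF @ conc)
    return np.clip(out.T, 0, 255).reshape(h, w, 3).astype(np.uint8)
```

The rows of the intermediate concentration matrix are the unmixed hematoxylin and eosin channels, so the same sketch can serve for stain separation alone. Image tiles that are almost entirely background may leave too few pixels above the beta threshold for a stable estimate and should be handled separately.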