Image Analysis: Enabling Data Powered Pathology
The rise of whole slide imaging has enabled a variety of computational tools that extract quantitative data from tissue samples on microscope slides. The practice of using computational algorithms to collect these data from whole slide images is known as “image analysis”. Image analysis has already had an impact in drug development, translational research, and patient diagnosis, and that impact is expected to keep growing. The Digital Pathology Association recently released a white paper on image analysis, which provides a comprehensive overview of this technology. We’ve put together this article as a brief summary of image analysis technology.
What does an image analysis algorithm look for?
The goal of image analysis is to translate an image of a stained tissue sample into quantitative data that can be used to support research or diagnosis. But what specific details within a whole slide image are these algorithms looking for? Often, an image analysis algorithm will identify and count individual cells of interest in a tissue sample, or collect information about the spatial arrangement of those cells. Other times, image analysis is used to calculate the area of other structures in a tissue sample, such as tumor regions or fat vacuoles.
An image analysis algorithm will look at a whole slide image and identify pixels that represent cells or tissue structures of interest. Then, the algorithm “segments” these significant areas of interest into parts, which can then be counted or measured in other ways. One common use of this type of image analysis is to measure protein expression in an immunohistochemistry (IHC) stain. The algorithm can count cells that display a biomarker associated with a protein of interest. Many image analysis algorithms go beyond the simple counting of cells, and incorporate details such as staining intensity and stain completeness into their quantitative output. Other factors, such as individual cell morphology and the spatial distribution of cells, can also play an important role in diagnosis and prediction.
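As a rough illustration of the identify, segment, and count steps described above, here is a minimal sketch in plain Python. It assumes a tiny grid of made-up stain intensities and a hypothetical `segment_and_count` helper that applies a fixed threshold and groups touching pixels into objects. Real whole slide images are far larger and production algorithms are considerably more sophisticated, so treat this as a toy model only.

```python
from collections import deque

def segment_and_count(image, threshold):
    """Group touching pixels at or above `threshold` into objects.

    Returns a list holding the pixel area of each segmented object.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Flood-fill one object using 4-connectivity (up, down, left, right).
            queue = deque([(r, c)])
            seen[r][c] = True
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas

# Hypothetical 5x6 grid of stain intensities (0 = no stain, 9 = strong stain).
toy_image = [
    [0, 0, 0, 0, 0, 0],
    [0, 8, 9, 0, 0, 0],
    [0, 7, 8, 0, 5, 0],
    [0, 0, 0, 0, 6, 0],
    [0, 0, 0, 0, 0, 0],
]

areas = segment_and_count(toy_image, threshold=5)
print("objects found:", len(areas))
print("area of each object (pixels):", areas)
```

Raising the threshold in this sketch drops the weakly stained object, which is one simple way to see how staining intensity feeds into the final count.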
Some more sophisticated image analysis algorithms also leverage the power of artificial intelligence (AI) to support data extraction. The term “artificial intelligence” is used very broadly, but there are several different classifications of AI algorithms. For this type of image analysis, supervised machine learning is common. Labelled images are used to “train” the machine learning algorithm to identify tissue structures that have some significance to research or diagnosis. Once trained, the algorithm can be applied to new images and identify those same types of structures on its own.
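The train-then-apply pattern of supervised learning can be sketched with a deliberately simple stand-in. The example below uses a nearest-centroid classifier, not the deep learning models used in practice, and it assumes each labelled region has already been reduced to two hypothetical numeric features. The labels "tumor" and "stroma" and all of the values are invented for illustration.

```python
import math

def train_centroids(samples):
    """Average the feature vectors for each label (the "training" step)."""
    sums, counts = {}, {}
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}

def predict(centroids, features):
    """Assign the label whose centroid is closest to `features`."""
    return min(centroids,
               key=lambda label: math.dist(centroids[label], features))

# Hypothetical labelled examples: (feature vector, label from an annotator).
labelled = [
    ((0.9, 0.8), "tumor"),
    ((0.8, 0.9), "tumor"),
    ((0.2, 0.1), "stroma"),
    ((0.1, 0.3), "stroma"),
]

model = train_centroids(labelled)

# Apply the trained model to new, unlabelled feature vectors.
print(predict(model, (0.85, 0.7)))
print(predict(model, (0.15, 0.2)))
```

The design point carries over to more powerful models: the quality of the labelled examples determines what the algorithm learns to recognize.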
Challenges of image analysis
Image analysis can support research and diagnosis by extracting many different types of data from whole slide images. These algorithms must be flexible and iterative to accommodate the diverse array of questions they aim to answer. In addition, algorithms must account for different types of tissue stains and variations in tissue quality, staining quality, and image quality. These details can present challenges to any given image analysis algorithm. There are a few key factors that a successful algorithm must balance:
• Sensitivity – The ability of an algorithm to correctly identify weakly stained cells
• Specificity – The ability of an algorithm to reject tissue artifacts that are not areas of interest
• Contour accuracy – The ability of an algorithm to correctly approximate the size and shape of a cell or tissue structure of interest
In addition, these algorithms must strike a reasonable balance between oversegmentation and undersegmentation of a whole slide image into significant parts. In other words, an algorithm must capture as much relevant data as possible while still being selective enough to filter out unnecessary information from an image.
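One way to put numbers on the three factors listed above is sketched below. It assumes the standard statistical definitions of sensitivity and specificity, which are broader than the stain-specific descriptions given here, and it uses the Dice coefficient as one possible measure of contour accuracy. The function names and all counts are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real cells of interest that the algorithm detected."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of non-target objects (e.g. artifacts) correctly rejected."""
    return true_neg / (true_neg + false_pos)

def dice(mask_a, mask_b):
    """Dice overlap between two sets of (row, col) pixel coordinates.

    1.0 means the outlines match exactly; 0.0 means no overlap.
    """
    if not mask_a and not mask_b:
        return 1.0
    return 2 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))

# Hypothetical counts from comparing algorithm output with annotations.
print("sensitivity:", sensitivity(true_pos=90, false_neg=10))
print("specificity:", specificity(true_neg=45, false_pos=5))

# Hypothetical pixel masks for one cell: algorithm outline vs. annotation.
algorithm_mask = {(0, 0), (0, 1), (1, 0), (1, 1)}
annotated_mask = {(0, 1), (1, 1), (0, 2), (1, 2)}
print("dice:", dice(algorithm_mask, annotated_mask))
```

Tracking these values together helps show the trade-off: a change that raises sensitivity often lowers specificity, and the reverse.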
For an algorithm to output reliable data from a tissue image, a staining threshold must be set as a baseline. Each stained tissue sample can contain variations in color, which makes it challenging to set a baseline that will work for all images. Color normalization tools can help align images to a more reliable standard. However, image analysis algorithms will need to remain iterative to maintain their flexibility. This often involves adjusting various parameters of an algorithm on a representative subset of images before applying it to all images needed to answer a research or diagnostic question.
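The idea of tuning a parameter on a representative subset before running the full analysis can be sketched as follows. This example assumes each image has already been reduced to a list of per-cell stain intensities and that a reference count of positive cells is available for each subset image. The `tune_threshold` helper, the candidate thresholds, and the data are all hypothetical.

```python
def count_positive(cell_intensities, threshold):
    """Count cells whose stain intensity is at or above the threshold."""
    return sum(1 for value in cell_intensities if value >= threshold)

def tune_threshold(subset, candidates):
    """Pick the candidate threshold with the smallest total counting error.

    `subset` is a list of (cell_intensities, reference_count) pairs.
    """
    def total_error(threshold):
        return sum(abs(count_positive(cells, threshold) - reference)
                   for cells, reference in subset)
    return min(candidates, key=total_error)

# Hypothetical representative subset: per-cell intensities and the
# reference number of positive cells for each image.
subset = [
    ([0.1, 0.4, 0.6, 0.9], 2),
    ([0.2, 0.5, 0.7, 0.8], 3),
    ([0.3, 0.35, 0.55, 0.95], 2),
]

best = tune_threshold(subset, candidates=[0.3, 0.5, 0.7])
print("selected threshold:", best)

# The selected threshold would then be applied to the remaining images.
remaining_image = [0.05, 0.45, 0.65, 0.75]
print("positive cells:", count_positive(remaining_image, best))
```

In practice more than one parameter is usually adjusted, and color normalization would typically be applied before any threshold is chosen.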
Finally, cost remains a challenge when transitioning to a digital workflow that incorporates whole slide scanning and image quantification, in both research and clinical environments. Purchasing the necessary equipment and training employees to use it can be expensive and time-consuming. The tissue-to-data workflow also needs to be adjusted to incorporate these additional steps, including quality control checkpoints throughout the process. Many researchers outsource this digital pathology work to contract research organizations (CROs) such as Reveal Biosciences, allowing them to benefit from a quantitative approach to research and diagnosis without bearing the upfront costs of implementing these technologies themselves.
The pathologist’s role in image analysis
Image analysis opens many exciting possibilities in the field of pathology, enabling a more objective, routine approach to data generation from microscope slides. However, pathologists will not be replaced by this technology. In the new digital workflow, the pathologist’s expertise is more important than ever. Image analysis algorithms are simply tools that pathologists can use to guide and support their conclusions. Pathologists have valuable lab expertise that is necessary to facilitate the journey from an uncut tissue sample to a stained microscope slide. High-quality tissue, stains, and images are the essential foundation for reliable image analysis data.
For some analyses, it is unnecessary to quantify an entire whole slide image. In many cases, a researcher or clinician may only be interested in quantifying select regions of interest within an image. The pathologist can leverage their expertise to help select the key regions of interest that need to be analyzed. A pathologist may also have external knowledge about a given study population that helps inform a reliable analysis. Finally, pathologists are critical to the development and assessment of new image analysis algorithms. They can provide a “ground truth” that defines what an algorithm should be looking for, and they can assess whether an existing image analysis algorithm correctly identifies the features it is designed to find.
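Restricting quantification to a selected region of interest can be sketched like this. The example assumes cells have already been detected and are represented as hypothetical `(x, y, is_positive)` tuples, and that the pathologist's region is a simple rectangle. Real annotations are usually free-form outlines, so this is only a simplified illustration.

```python
def percent_positive_in_roi(cells, roi):
    """Percentage of positive cells among the cells inside a rectangular ROI.

    `cells` is a list of (x, y, is_positive) tuples.
    `roi` is (x_min, y_min, x_max, y_max), with inclusive bounds.
    """
    x_min, y_min, x_max, y_max = roi
    inside = [is_positive for x, y, is_positive in cells
              if x_min <= x <= x_max and y_min <= y <= y_max]
    if not inside:
        return None  # no cells in the ROI, so the percentage is undefined
    return 100.0 * sum(inside) / len(inside)

# Hypothetical detected cells: (x, y, is_positive).
cells = [
    (1, 1, True),
    (2, 3, False),
    (3, 2, True),
    (8, 8, True),
    (9, 1, False),
]

# Hypothetical rectangular region selected by a pathologist.
roi = (0, 0, 4, 4)
result = percent_positive_in_roi(cells, roi)
print(f"positive cells in ROI: {result:.1f}%")
```

Cells outside the selected region are ignored entirely, so the choice of region directly shapes the reported value.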
Image analysis at Reveal Biosciences
Image analysis is poised to continue changing the way drug development, translational research, and patient diagnosis are conducted. Reveal specializes in developing deep learning-based models to generate quantitative data from whole slide images, which can be used for research and clinical applications. We incorporate multiple quality control checkpoints within our workflow to ensure the highest-quality images and data to support critical decisions in research and diagnosis.
Contact us to learn how we can apply these technologies to generate actionable, quantitative data for your research.
Aeffner, Famke, et al. “Introduction to Digital Image Analysis in Whole-Slide Imaging: A White Paper From the Digital Pathology Association.” Journal of Pathology Informatics 10.1 (2019): 9.