
Microsoft and Paige researchers developed Virchow2 and Virchow2G: second-generation foundation models for computational pathology

Pathological examination of tissue is central to diagnosing and treating cancer. Traditional glass histology slides viewed under a light microscope are gradually being replaced by digitized whole slide images (WSIs), a shift that is moving computational pathology from a primarily academic pursuit toward a routine clinical tool. Computational pathology pairs digital WSIs with artificial intelligence (AI) to aid in the diagnosis, characterization, and understanding of disease. Early efforts focused on clinical decision support tools that improve existing workflows, and the first AI pathology system received FDA approval in 2021. Driven by remarkable advances in computer vision, the image-centric branch of AI, more recent research aims to extract previously inaccessible signals from routine WSIs, such as predicting treatment and response.

Large-scale deep neural networks, often called foundation models, have been a key driver of these performance gains. Foundation models are trained with self-supervised learning, which requires no curated labels and can therefore exploit datasets orders of magnitude larger than those traditionally used in computational pathology. The embeddings (data representations) these models produce generalize across a wide variety of prediction tasks. This stands in stark contrast to existing task-specific computational pathology methods, which are trained on narrow subsets of pathology images and struggle to generalize across the enormous variation in tissue morphology and laboratory preparation. The value of generalizing from large datasets is even clearer for applications where data are too scarce to build bespoke models, such as detecting rare tumor types or less common tasks like predicting specific genomic alterations, clinical outcomes, or therapeutic responses. Trained on a sufficiently large corpus of digital WSIs, a single foundation model could support a broad range of clinically important tasks, including robust detection of common and rare cancers, subtyping, biomarker quantification, cell instance segmentation, event counting, and therapeutic response prediction.
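As a rough illustration of how such embeddings are reused downstream, the sketch below fits a lightweight linear probe on frozen tile embeddings for a hypothetical binary task (e.g., tumor vs. benign tiles). The embedding dimension, the synthetic data, and the task itself are placeholders for illustration, not details from the Virchow papers.

```python
# Minimal sketch: a linear probe on frozen foundation-model tile embeddings.
# The 2,560-dim size, tile counts, and labels below are illustrative stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import balanced_accuracy_score

rng = np.random.default_rng(0)
n_tiles, embed_dim = 1000, 2560                                         # hypothetical sizes
embeddings = rng.normal(size=(n_tiles, embed_dim)).astype(np.float32)   # stand-in for exported embeddings
labels = rng.integers(0, 2, size=n_tiles)                               # e.g. tumor vs. benign tile

X_train, X_test, y_train, y_test = train_test_split(
    embeddings, labels, test_size=0.2, random_state=0
)

# The foundation model itself stays frozen; only this small classifier is fit.
probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)
print("balanced accuracy:", balanced_accuracy_score(y_test, probe.predict(X_test)))
```

In practice the probe would be trained on real tile embeddings exported from the frozen foundation model; keeping the heavy encoder fixed is what makes one model reusable across many tasks with little labeled data.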

Scaling studies show that foundation model performance depends strongly on both dataset size and model size. In the natural image domain, modern foundation models, typically Vision Transformers (ViTs), are trained on datasets such as ImageNet, JFT-300M, and LVD-142M and reach hundreds of millions to billions of parameters. Despite the challenges of assembling large-scale pathology-specific datasets, pioneering work has trained foundation models of 28 to 307 million parameters on collections of 30,000 to 400,000 WSIs, paving the way for further advances in computational pathology.

Virchow, a pathology foundation model trained at million-slide scale, was developed by Paige and Microsoft Research. It is named after Rudolf Virchow, widely regarded as the father of modern pathology and the first to propose the theory of cellular pathology. The training data, provided by Memorial Sloan Kettering Cancer Center (MSKCC), comprises 1.5 million H&E-stained whole slide images from approximately 100,000 patients, 4 to 10 times more WSIs than previous pathology training datasets. It spans benign and malignant tissue across 17 high-level tissue types, with biopsies accounting for 63% and resections for 37% of the slides. Virchow itself is a ViT with 632 million parameters, trained with DINOv2, a self-supervised multi-view student-teacher algorithm. DINOv2 learns to embed tissue tiles by matching representations of global and local crops of the same tile; the resulting tile embeddings can then be aggregated across a slide, or across multiple slides, to drive a variety of downstream prediction tasks.
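The sketch below is a deliberately simplified illustration of the student-teacher, multi-crop idea behind DINOv2-style training: the student sees global and local crops of each tile, the teacher sees only global crops, the student is trained to match the teacher's output distribution, and the teacher is updated as an exponential moving average of the student. The toy backbone, crop logic, and hyperparameters are placeholders, and the full recipe (masked-image modeling, centering, regularizers, schedules) is omitted.

```python
# Highly simplified DINO/DINOv2-style student-teacher update (illustrative only).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy backbone standing in for a ViT encoder.
student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.GELU(), nn.Linear(256, 128))
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)           # the teacher is never updated by backprop
opt = torch.optim.AdamW(student.parameters(), lr=1e-4)

def dino_loss(student_out, teacher_out, temp_s=0.1, temp_t=0.04):
    # Cross-entropy between the teacher's and student's output distributions.
    t = F.softmax(teacher_out / temp_t, dim=-1).detach()
    s = F.log_softmax(student_out / temp_s, dim=-1)
    return -(t * s).sum(dim=-1).mean()

for step in range(10):                # toy loop on random stand-in "tissue tiles"
    tiles = torch.randn(8, 3, 64, 64)
    # Global and local crops of the same tile (naive resizes/crops for illustration).
    globals_ = [F.interpolate(tiles, size=32), F.interpolate(tiles.flip(-1), size=32)]
    locals_ = [F.interpolate(tiles[..., :40, :40], size=32) for _ in range(4)]

    with torch.no_grad():
        teacher_out = [teacher(g) for g in globals_]    # teacher sees only global crops
    student_out = [student(v) for v in globals_ + locals_]

    loss = torch.stack([dino_loss(s, t) for t in teacher_out for s in student_out]).mean()
    opt.zero_grad(); loss.backward(); opt.step()

    with torch.no_grad():             # exponential moving average update of the teacher
        for pt, ps in zip(teacher.parameters(), student.parameters()):
            pt.mul_(0.996).add_(ps, alpha=0.004)
```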

Virchow2 and Virchow2G are trained on the largest known digital pathology dataset: over 3.1 million whole slide images (2.4 PB of data) spanning more than 40 tissue types from 225,000 patients across 45 countries. At 632 million parameters, Virchow2 matches the size of the original Virchow model, while Virchow2G, at 1.85 billion parameters, is the largest pathology model created to date. To support this scale, the researchers propose domain-inspired modifications to the DINOv2 training algorithm, the de facto standard for self-supervised learning in computational pathology; these changes are crucial to reaching state-of-the-art performance on twelve tile-level tasks against the best-performing competing models. The results suggest that scaling parameters alone is not enough: domain adaptation, data scale, and model scale together contribute to the best overall performance.
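For a rough sense of where parameter counts like 632 million and 1.85 billion come from, the snippet below estimates the size of a standard ViT encoder from its width, depth, and MLP dimension, counting only attention and MLP weights per block. The two configurations are assumptions chosen to land near those scales, not the confirmed Virchow2 or Virchow2G architectures.

```python
# Rough ViT parameter-count estimate (ignores norms, biases, and projection heads).
# The configurations below are illustrative ViT-H/14-scale and ViT-G/14-scale guesses.
def approx_vit_params(width: int, depth: int, mlp_dim: int, patch: int = 14, channels: int = 3) -> int:
    per_block = 4 * width * width + 2 * width * mlp_dim   # attention (QKV + output) + MLP (up + down)
    patch_embed = channels * patch * patch * width        # convolutional patch embedding
    return depth * per_block + patch_embed

print(f"~ViT-H/14 scale: {approx_vit_params(1280, 32, 5120) / 1e6:.0f}M parameters")   # ≈ 630M
print(f"~ViT-G/14 scale: {approx_vit_params(1664, 48, 8192) / 1e9:.2f}B parameters")   # ≈ 1.84B
```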

In the study, the team evaluated the foundation models on twelve tasks chosen to capture the breadth of application areas in computational pathology. Early results indicate that Virchow2 and Virchow2G are especially strong at assessing fine-grained details of cell structure and shape, for example detecting cell division and predicting gene activity, tasks that likely benefit from better measurement of complex features such as the orientation and shape of the cell nucleus. This thorough evaluation should give the scientific and medical community confidence in the reliability of these models and optimism about the future of cancer diagnosis and treatment.


Check out the Paper and Blog. All credit for this research goes to the researchers of this project.



Dhanshree Shenwai is a Software Engineer with a good background in FinTech companies spanning Finance, Cards & Payments and Banking with a keen interest in AI applications. She is enthusiastic about exploring new technologies and advancements in today’s evolving world, making everyone’s life easier.