NucleusSegData: Cell Nucleus Segmentation Dataset for Fluorescence Microscopy Images

This is a dataset for cell nucleus segmentation in fluorescence microscopy images. We made it publicly available in the hope of accelerating microscopic image analysis research.

We used this dataset in our previous studies, which are published in the journals listed below. Please cite these studies if you use this dataset in your research.

  • C.F. Koyuncu, R. Cetin-Atalay, and C. Gunduz-Demir, “Object oriented segmentation of cell nuclei in fluorescence microscopy images,” Cytometry: Part A, 93A(10):1019-1028, 2018.
  • S. Arslan, T. Ersahin, R. Cetin-Atalay, and C. Gunduz-Demir, “Attributed relational graphs for cell nucleus segmentation in fluorescence microscopy images,” IEEE Transactions on Medical Imaging, 32(6):1121-1131, 2013.

NOTE: This dataset is provided for research purposes only. The authors accept no responsibility for any consequences of using this dataset.

The dataset contains 2661 cell nuclei in 37 fluorescence microscopy images. The cells were taken from the Huh7 and HepG2 liver cancer cell lines and stained with the nuclear dye Hoechst 33258. The images were acquired with a 20x microscope objective lens, and the image resolution was 768×1024 pixels. The nuclei in these images were manually annotated by our biologist collaborator.

The training set consists of the 785 nuclei in ten randomly selected images (five Huh7 and five HepG2 cell line images), and the nuclei in the remaining 27 images form the test sets. HepG2 cells tend to grow in more overlapping layers than Huh7 cells, which leads to more overlapping nuclei in the images of the HepG2 cell line. We therefore provide two test sets, one per cell line. The Huh7 test set includes 891 nuclei in 11 images, and the HepG2 test set includes 985 nuclei in 16 images.
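As a quick consistency check, the split sizes quoted above can be tabulated and summed. The sketch below is only an illustration: the dictionary keys and the `totals` helper are hypothetical names chosen here, and the counts are copied from the description rather than read from the files.

```python
# Split sizes copied from the description above, as (nuclei, images).
# The key names are illustrative, not file or folder names from the zip.
ORIGINAL_SPLITS = {
    "train": (785, 10),       # five Huh7 + five HepG2 images
    "test_huh7": (891, 11),
    "test_hepg2": (985, 16),
}


def totals(splits):
    """Return (total nuclei, total images) summed over all splits."""
    nuclei = sum(n for n, _ in splits.values())
    images = sum(i for _, i in splits.values())
    return nuclei, images


total_nuclei, total_images = totals(ORIGINAL_SPLITS)
print(total_nuclei, total_images)  # 2661 37
```

The totals match the 2661 nuclei and 37 images stated for the whole dataset.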

The original RGB images in these datasets and their annotations are provided as a single zip file.
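The internal layout of the zip file is not described here, so a reasonable first step after downloading is to take an inventory of its contents. The following is a minimal sketch using only the Python standard library; the function name `count_by_extension` is a hypothetical helper, and nothing about folder names or file formats inside the archive is assumed.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def count_by_extension(zip_path):
    """Count archive members by lowercase file extension, skipping directories.

    `zip_path` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_path) as archive:
        return Counter(
            PurePosixPath(info.filename).suffix.lower()
            for info in archive.infolist()
            if not info.is_dir()
        )
```

Calling `count_by_extension` with the path of the downloaded archive returns a mapping from extension to file count, which gives a rough idea of how images and annotations are stored before any loading code is written.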

The dataset is now extended! The images and annotations in this extended dataset are also provided as a single zip file.

This extended dataset contains 3329 cell nuclei in 61 images. Its training set contains 1126 nuclei in 25 images (ten Huh7 and 15 HepG2 cell line images). The Huh7 test set remains the same. The HepG2 test set now contains 1312 nuclei in 25 images. We first used this extended version in the following work.

  • G.N. Gunesli, C. Sokmensuer, and C. Gunduz-Demir, “AttentionBoost: Learning what to attend for gland segmentation in histopathological images by boosting fully convolutional networks,” IEEE Transactions on Medical Imaging, 39(12):4262-4273, 2020.
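The figures for the extended dataset can be cross-checked the same way. In this sketch the key names are again illustrative, and the Huh7 test set size (891 nuclei in 11 images) is carried over from the original description on the assumption that "remains the same" refers to that test set; the totals agree with the stated 3329 nuclei and 61 images under that reading.

```python
# Split sizes as (nuclei, images), copied from the text.
ORIGINAL = {"train": (785, 10), "test_huh7": (891, 11), "test_hepg2": (985, 16)}
EXTENDED = {"train": (1126, 25), "test_huh7": (891, 11), "test_hepg2": (1312, 25)}


def split_totals(splits):
    """Return (total nuclei, total images) summed over all splits."""
    return (
        sum(n for n, _ in splits.values()),
        sum(i for _, i in splits.values()),
    )


def split_growth(old, new):
    """Return per-split (added nuclei, added images) between two versions."""
    return {
        name: (new[name][0] - old[name][0], new[name][1] - old[name][1])
        for name in old
    }


print(split_totals(EXTENDED))            # (3329, 61)
print(split_growth(ORIGINAL, EXTENDED))
# {'train': (341, 15), 'test_huh7': (0, 0), 'test_hepg2': (327, 9)}
```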