Expert Systems With Applications 81 (2017) 223–243 

Contents lists available at ScienceDirect 

Expert Systems With Applications 

journal homepage: www.elsevier.com/locate/eswa 

Computational method for unsupervised segmentation of lymphoma histological images based on fuzzy 3-partition entropy and genetic algorithm

Thaína A. Azevedo Tosta a,∗, Paulo Rogério Faria b, Leandro Alves Neves c, Marcelo Zanchetta do Nascimento a

a Center of Mathematics, Computing and Cognition, Federal University of ABC, Av. dos Estados, 5001, 09210-580, Santo André, São Paulo, Brazil
b Department of Histology and Morphology, Institute of Biomedical Science, Federal University of Uberlândia, Av. Amazonas, S/N, 38405-320, Uberlândia, Minas Gerais, Brazil
c Department of Computer Science and Statistics, São Paulo State University, R. Cristóvão Colombo, 2265, 15054-000, São José do Rio Preto, São Paulo, Brazil

Article info

Article history: 

Received 18 September 2016 

Revised 28 January 2017 

Accepted 23 March 2017 

Available online 23 March 2017 

Keywords: 

Nuclear segmentation 

Histological images 

Lymphoma 

Fuzzy 3-partition 

Genetic algorithm 

Valley-emphasis 

Abstract

Non-Hodgkin lymphoma is the most common cancer of the lymphatic system and should be considered as a group of several closely related cancers, which can show differences in their growth patterns, their impact on the body and how they are treated. The diagnosis of the different types of neoplasia is made by a specialist through the analysis of histological images. However, these analyses are complex and the same case can lead to different understandings among pathologists, due to the exhaustive analysis of decisions, the time required and the presence of complex histological features. In this context, computational algorithms can be applied as tools to aid specialists through the application of segmentation methods to identify regions of interest that are essential for lymphoma diagnosis. In this paper, an unsupervised method for segmentation of nuclear components of neoplastic cells is proposed to analyze histological images of lymphoma stained with hematoxylin-eosin. The proposed method is based on the association among histogram equalization, Gaussian filter, fuzzy 3-partition entropy, genetic algorithm, morphological techniques and the valley-emphasis method in order to analyze neoplastic nuclear components, improve the contrast and illumination conditions, remove noise, split overlapping cells and refine contours. The results were evaluated through comparisons with those provided by a specialist and techniques available in the literature considering the metrics of accuracy, sensitivity, specificity and variation of information. The mean value of accuracy for the proposed method was 81.48%. Although the method obtained sensitivity rates between 41% and 51%, the accuracy values showed relevance when compared to those provided by other studies. Therefore, the novelties presented here may already encourage new studies with a more comprehensive overview of lymphoma segmentation.

© 2017 Elsevier Ltd. All rights reserved. 

1. Introduction

Lymphoma is a type of malignant disease that develops in cellular components called lymphocytes (Orlov et al., 2010). These cells represent one of the highest white blood cell populations responsible for the immunological defense of the body (Gartner & Hiatt, 2003). Lymphomas are divided into Hodgkin lymphomas (HL) and non-Hodgkin lymphomas (NHL), in accordance with their combinations of morphological, genetic and clinical features (Mauriño & Siqueira, 2011).

∗ Corresponding author.
E-mail addresses: tosta.thaina@gmail.com (T.A. Azevedo Tosta), paulo.faria@ufu.br (P.R. Faria), neves.leandro@gmail.com (L. Alves Neves), marcelo.zanchetta@gmail.com (M.Z.d. Nascimento).
http://dx.doi.org/10.1016/j.eswa.2017.03.051
0957-4174/© 2017 Elsevier Ltd. All rights reserved.

The 2016 report of the National Cancer Institute of Brazil estimates almost 11,000 new cases of NHL (INCA, 2016). Chronic lymphocytic leukemia (CLL), follicular lymphoma (FL) and mantle cell lymphoma (MCL) belong to the NHL class, which corresponds to 85% of the lymphomas (Lowry & Linch, 2013). The American Cancer Society estimates for 2017 that about 72,240 new cases will be diagnosed and about 20,140 people will die from this cancer in the United States (ACS, 2017). Thus, there is a high demand for diagnoses, whose analysis and detection remain a challenge for pathologists.

Tissue samples stained with hematoxylin-eosin (H&E) have been used by pathologists for analysis and identification of NHL cancer structures. These procedures are essential for disease monitoring and more efficient definitions for treatments (Orlov et al., 2010). However, visual evaluation is a complex task due to the significant time involved, its subjectivity and variability between pathologists (Oger, Belhomme, & Gurcan, 2012; Sertel, Lozanski, Shana'ah, & Gurcan, 2010b).

Histological samples can be analyzed by computational techniques and this procedure has provided advances in the support for diagnosis and prognosis of lymphomas. The computational strategies can improve the accuracy and efficiency of the detection of cells linked to NHL cancers (Belkacem-Boussaid, Samsi, Lozanski, & Gurcan, 2011; Sertel, Catalyurek, Lozanski, Shanaah, & Gurcan, 2010a) and pattern recognition (Orlov et al., 2010).

The segmentation of NHL structures is a crucial task in many clinical applications, because the subsequent stages, including feature extraction and classification, all rely heavily on the quality of this process. In this stage, techniques are applied in order to recognize the presence, distribution, size and morphological features useful for diagnosis (Haggerty, Wang, Dickinson, O'Malley, & Martin, 2014). However, such a task is complex due to feature variations, mainly when distinguishing nuclear regions (Irshad, Veillard, Roux, & Racoceanu, 2014).

In this context, this paper presents an unsupervised segmenta-

tion method to aid pathologists in the identification of neoplastic

nuclei of CLL, FL and MCL histological images. The proposed algo-

rithm was divided into steps of preprocessing, segmentation and

post-processing. In the preprocessing step, the histogram equaliza-

tion and Gaussian filter were applied to the RGB color model chan-

nels. A technique based on thresholding was developed as a result

of the combination between genetic algorithm (GA) and fuzzy 3-

partition entropy method. Finally, the valley-emphasis technique

and morphological operations of dilation and opening were ap-

plied in the post-processing step. The proposed method was tested

on a public dataset comprised of 12 images of CLL, 62 of FL and

99 of MCL, which were obtained with magnification of 20 ×. The

metrics of accuracy, sensitivity, specificity and variation of infor-

mation were applied for quantitative evaluations. The performance

of the proposed algorithm was compared to the results provided

by the mean-shift technique ( Comaniciu & Meer, 2002 ) and the

approaches proposed by de Oliveira et al. (2013) ; Phoulady, Gold-

gof, Hall, and Mouton (2016) ; Vahadane and Sethi (2013) ;

Wienert et al. (2012) and Paramanandam et al. (2016) . 
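The accuracy, sensitivity and specificity mentioned above can be illustrated with a minimal sketch that compares a predicted binary mask with a reference mask marked by a specialist. The function name `confusion_metrics` and the list-of-lists mask representation are assumptions made for illustration only; the paper's own metric definitions, including variation of information, appear in a later section and are not reproduced here.

```python
def confusion_metrics(predicted, reference):
    """Pixel-wise accuracy, sensitivity and specificity of two binary masks.

    Both masks are lists of rows holding 0 (background) or 1 (nucleus).
    This is a hypothetical helper, not code from the paper.
    """
    tp = tn = fp = fn = 0
    for p_row, r_row in zip(predicted, reference):
        for p, r in zip(p_row, r_row):
            if p and r:
                tp += 1          # nucleus pixel correctly detected
            elif p and not r:
                fp += 1          # background wrongly marked as nucleus
            elif r:
                fn += 1          # nucleus pixel missed
            else:
                tn += 1          # background correctly rejected
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }
```

A low sensitivity combined with a high accuracy, as reported in the abstract, is possible because background pixels usually outnumber nuclear pixels, so true negatives dominate the accuracy.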

1.1. Related works 

Several studies dedicated to the segmentation of NHL histolog-

ical images are presented in the literature. 

In the case of CLL images, the studies of Mohammed, Far, Nau-

gler, and Mohamed (2013a , 2013b ) presented nuclear, cellular and

cytoplasmic segmentation methods of normal and neoplastic lym-

phocytes. In Mohammed, Far, Naugler, and Mohamed (2013b) , the

authors employed Otsu thresholding, canny edge detector, morpho-

logical operations and removal of 1% of local minima of watershed

to reduce over and under segmentation errors. Further, the authors

presented in Mohammed et al. (2013a) , a segmentation method

based on pixel classification using support vector machine (SVM)

and K-means to reduce the feature set. 

For MCL images, Yang, Tuzel, Meer, and Foran (2008) developed a segmentation method for overlapping cells using L2 estimation (L2E), gradient vector flow (GVF) and the CIE LUV color model for contour extraction. High curvature points were identified by a method from the literature. The Canny edge detector was applied to detect inner edges, which correspond to candidate lines for separating the cells. These lines were analyzed through the Dijkstra algorithm to determine the best segmentation of the overlapping cells.
Studies related to the detection of CLL and MCL lesions are restricted to blood related images with different magnifications, such as 60× (Yang et al., 2008) and 100× (Mohammed et al., 2013b).

To segment follicular regions in FL images, different approaches were proposed, such as the methods of active contour model (Arora & Banerjee, 2013; Belkacem-Boussaid, Prescott, Lozanski, & Gurcan, 2010), region-based thresholding using curve evolution (Belkacem-Boussaid et al., 2011), thresholding based on mean brightness value (Zorman et al., 2007) and Otsu algorithm (Oger et al., 2012). To deal with significant color variations in the regions of interest (ROIs), Arora and Banerjee (2013) and Belkacem-Boussaid et al. (2010) applied a local energy function and active contour model to identify follicles from H&E stained tissue sections. In addition to the follicular regions segmentation, Arora and Banerjee (2013) also investigated the classification of the grades of FL, but the segmentation step was not evaluated. The authors of Belkacem-Boussaid et al. (2010) applied pre and post-processing for segmenting FL images. However, a limitation was noted in this approach consisting of the merging of follicles in the segmentation, which demands a new strategy for the separation of overlapping follicles.

The algorithm of Belkacem-Boussaid et al. (2011) stands out in the evaluation metrics of their different steps. For instance, the metrics of signal to noise ratio and texture contrast were applied to define the more adequate color channel. The preprocessing step was evaluated by the Haralick homogeneity metric, whereas overlapping follicles were analyzed by a concavity index. In Zorman et al. (2007), the authors used a pixel classification approach for follicles segmentation. The mean brightness values were considered as the threshold value in a pre-segmentation step. Different from the above described techniques, the method in Oger et al. (2012) proposed a segmentation of follicular regions on IHC images with registration of the identified regions on H&E images. However, the conformity metric does not reach satisfactory results due to identification of false positive regions.

Algorithms based on the Otsu thresholding method (Dimitropoulos, Michail, Koletsa, Kostopoulos, & Grammalidis, 2014; Michail et al., 2014), k-means (Oztan, Kong, Gurcan, & Yener, 2012; Sertel et al., 2009) and mean-shift clustering (Sertel et al., 2010a) were applied to detect centroblasts on FL images. After nuclear segmentation, Dimitropoulos et al. (2014) also investigated the extraction of morphological and textural features. Nucleoli detection and cytoplasm histogram analysis were used by applying the SVM classifier. This approach was also employed by Michail et al. (2014); however, intensity features were then classified by the linear discriminant analysis classifier. Furthermore, these studies applied empirical threshold values in the segmentation step for removal of red blood cells.

For FL grading, Oztan et al. (2012) presented a method for analyzing the more discriminant features among the cellular regions on FL images. For this purpose, features of graphs constructed from K-means segmentation results, information of intensity, texture and MBIR representations were combined, thus reaching the best results with the SVM classifier. The centroblast detection was also explored by the studies of Sertel et al. (2009) and Sertel et al. (2010a). However, Sertel et al. (2010a) considered the mean-shift method so as not to have to manually define the number of clusters, as demanded by the K-means method used by Sertel et al. (2009).

Considering segmentation methods for CLL, FL and MCL histological images, Table 1 summarizes the strengths and weaknesses of these techniques proposed in the literature.

A majority of the studies in Table 1 presents methods for detection and segmentation of FL due to its high incidence rate, which represents the second most common B-cell lymphoma



Table 1
Strengths and weaknesses of studies related to segmentation of the histological images stained with H&E.

| Ref. and lesion images | Segmentation method | Strengths | Weaknesses |
|---|---|---|---|
| Arora and Banerjee (2013). FL images. | Active contour model. | Robust segmentation to color variations. | Few training images for FL grading. |
| Belkacem-Boussaid et al. (2010). FL images. | Active contour model. | New filter for removal of noise and enhancement of follicle contours and robust method to low magnification images. | Manual selection of seed points of active contour model, unsatisfactory performance on low contrast images, limitation to identify nearby or small follicles individually. |
| Belkacem-Boussaid et al. (2011). FL images. | Region-based segmentation using curve evolution. | Use of pathological and biological features to define seed points and removal of false positives, new proposal for concavity detection. | Empirical definition of initial size of the curve, training with three cases to empirically define parameters, few images to the method evaluation, unsatisfactory performance for application on histological slides with low quality in their preparation and coloring processes. |
| Dimitropoulos et al. (2014). FL images. | Thresholding and Otsu. | Use of pathological properties to propose the hybrid classifier. | Empirical thresholding to remove red blood cells. |
| Dimitropoulos et al. (2016). FL images. | K-means and graph cut. | New evaluation criteria for splitting of elliptical components and overlapping nuclei. | Use of IHC and H&E images, training image set smaller than the test set. |
| Kong et al. (2011a). FL images. | Thresholding, efficient local Fourier transform, K-means and K-nn. | Proposal of a new color model. | Limitation to individually identify nuclei. |
| Kong et al. (2011b). FL images. | Thresholding, efficient local Fourier transform, K-means and K-nn. | Proposal of a new color model and a new method for splitting overlapping nuclei. | Parameter definition on five training images. |
| Luo et al. (2006). Blood images of MCL, Hairy Cell Leukemia and Plasma Cell Leukemia. | Otsu and watershed. | Use of feature invariant to rotation, translation and scale. | Manual definition of a threshold to feature selection. |
| Michail et al. (2014). FL images. | Thresholding and Otsu. | Use of pathological criteria. | Empirical definition of a threshold value for removal of red blood cells. |
| Mohammed et al. (2013b). Blood cell images. | Otsu, Canny edge detector and subtraction between cellular and nuclear segmentations. | Removal of 1% of local minima to deal with over and under segmentation errors. | Cells located at image center, illumination conditions given as uniform for Otsu thresholding. |
| Mohammed et al. (2013a). Blood cell images. | Otsu, K-means, SVM and subtraction between cellular and nuclear segmentations. | Over and under segmentation reduced by SVM. | Cells located at image center, few training images, limitation in cytoplasm segmentation and poor performance in overlapping lymphocytes. |
| Oger et al. (2012). FL images. | Otsu and intersection between R and B binary masks. | Use of stain properties for color channel selection. | Few images for the method evaluation and use of IHC images. |
| Oztan et al. (2012). FL images. | K-means. | Definition of high level descriptor of cytological components. | Empirical definition of relationship between graph vertices, limitation to application on overlapping cells. |
| Sertel et al. (2008a). FL images. | K-means. | Use of stain properties in feature extraction of follicles and of morphological and pathological features for centroblast detection. | Use of IHC and H&E images, with manual points definition for mapping between them. |
| Sertel et al. (2008b). FL images. | Thresholding and K-means. | Use of pathological criteria for definition of cytoplasm features. | Empirical thresholding to remove red blood cells and image background. |
| Sertel et al. (2009). Whole-slide images of FL. | Thresholding and K-means. | Use of the biological and pathological features for FL grading. | Training image set smaller than the test set, empirical definition of threshold value for removal of red blood cells and image background, use of texture feature on 40× magnification images. |
| Sertel et al. (2010a). FL images. | Mean-shift. | Use of biological criteria for definition of detected features. | Definition of probability functions of centroblasts using a small training set. |
| Sertel et al. (2010b). FL images. | Gaussian mixture modeling with parameters estimation using expectation maximization. | Parameter estimation of fast radial symmetry transform adequate to different cell sizes, automation for parameters definition. | Texture extraction on 40× magnification images. |
| Yang et al. (2008). Cases of MCL, CLL, FL and other lymphoma types. | L2E and GVF algorithms, detection of high curvature points, Canny edge detector and concave vertex graph. | High performance in a graph application, better quantitative results than watershed, use of pathological criteria for segmentation. | Empirical definition of parameters for high curvature points detection. |
| Zorman et al. (2007). FL images. | Thresholding. | New transform for follicle definition. | Empirical definition of color channel without contrast analysis. |

(Canellos, Lister, & Young, 2006). For this reason, a method for segmentation of nuclear structures of CLL and MCL abnormalities is still a major challenge in pathology.

The studies in Table 1 present some limitations addressed by this work. For instance, different from the methods of Belkacem-Boussaid et al. (2010) and Sertel et al. (2008a), the proposed method in this study does not require any user interaction.
A common difficulty in lymphoma image processing is the segmentation of overlapping cells, as indicated by the studies of Belkacem-Boussaid et al. (2010), Kong, Belkacem-Boussaid, and Gurcan (2011a), Mohammed et al. (2013b) and Oztan et al. (2012). This difficulty was addressed during the post-processing step, considering the valley-emphasis strategy. Moreover, the threshold values of Dimitropoulos et al. (2014),



 
Fig. 1. Schematic illustration of the proposed method for unsupervised segmentation of nuclear neoplastic structures in histological images of lymphoma.

 
Luo, Celenk, and Bejai (2006) , Sertel et al. (2009) and Sertel et al.

(2008b) were empirically defined and can present disadvantages

for practical applications. 

The methods described by Dimitropoulos, Barmpoutis, Ko-

letsa, Kostopoulos, and Grammalidis (2016) , Sertel et al. (2008a)

and Oger et al. (2012) for segmentation of lymphoma used histo-

logical samples stained with IHC and H&E. Using these approaches,

different types of images are required for the investigation and

segmentation of an abnormality. 

Some methods demand a training step, as the studies

of Dimitropoulos et al. (2016) , Mohammed et al. (2013b) and Sertel

et al. (2010a) . In these studies, the training step was composed

of a small number of images, and this can be a limiting fac-

tor in the representation of ROIs features. Moreover, the meth-

ods of Belkacem-Boussaid et al. (2011) and Oger et al. (2012) em-

ployed only a few images for the testing step. The works

of Belkacem-Boussaid et al. (2011) and Kong, Gurcan, and

Belkacem-Boussaid (2011b) considered, respectively, only three

and five images for defining the parameters of the algorithm,

which can be inadequate for application in image datasets. In

the approaches of Mohammed et al. (2013a , 2013b ), only im-

ages with ROIs in their central regions were investigated. Be-

sides, Mohammed et al. (2013a) assumed in their proposed ap-

proach uniform illumination of the images for Otsu application. 

In their segmentation methods, Sertel et al. (2009 , 2010b) used

texture features obtained from images with magnification of 40 ×.

Through this condition, these methods cannot reach high quality

results in lower magnifications, since high magnifications can lead

to more details of these features ( Gurcan et al., 2009 ). Belkacem-

Boussaid et al. (2010 , 2011) indicated limitations for applications

using low contrast images and low quality histological sample

preparation. These conditions can be found in images available on

public domain datasets. Moreover, the related studies used private

image datasets for the evaluation of their methods. Researchers

emphasize the importance of using public image datasets in order

to demonstrate robustness of new approaches ( Fuchs & Buhmann,

2011; Kothari, Phan, Stokes, & Wang, 2013; McCann, Ozolek, Castro,

Parvin, & Kovacevic, 2015; Tafavogh, Catchpoole, & Kennedy, 2014 ).

1.2. Contributions of this work 

In this work, an unsupervised computational algorithm is pro-

posed to segment nuclei from neoplastic cells located on CLL, FL

and MCL histological images. The main contributions are summa-

rized as follows: 

• A segmentation algorithm based on intrinsic features from nuclei of H&E histological lymphoma images with 20× magnification, which allows segmentation to be performed with less detail;
• Application of techniques for contrast and illumination enhancements, allowing the method to be used on images with different conditions;
• A novel unsupervised method of automatic threshold selection for noise removal (non-neoplastic regions) at the segmentation step, based on GA associated with the fuzzy 3-partition entropy technique;
• The evaluation of valley-emphasis thresholding and morphological operations in order to split identified cells and enhance the representativeness of nuclear shapes at the post-processing step;
• A performance evaluation using a public domain dataset, which is represented by variations commonly found in clinical practice;
• Contribution to the state of the art of lymphoma image processing methods for segmentation of neoplastic cells in CLL and MCL images, and new schemes for identifying centrocytes and centroblasts in cellular structures from FL.

This paper is organized as follows: the techniques and the dataset are described in Section 2. Section 3 discusses the obtained results in each step and presents the comparative evaluation of the proposed algorithm and other methods from the literature. Finally, the conclusion is presented in Section 4.

2. Materials and methods

In this section, the segmentation algorithm, the public domain dataset and the metrics applied to analyze the experiments are presented in detail.

2.1. Method overview

The proposed method consists of three steps: preprocessing, segmentation and post-processing. Fig. 1 shows the steps developed to analyze the nuclear components of lymphoma images.

The algorithms were developed using the MATLAB® language and the experiments were performed on a 1.7 GHz processor ultrabook (Acer M5-481T-6417) with 6 GB RAM. The deconvolution plugin from ImageJ (2016) was also considered to analyze the images from the deconvolution process.

2.2. Dataset

The lymphoma cases considered in this study were obtained from a Zeiss Axioscope microscope with 20× objective lens and an AxioCam MR5 CCD color camera. All images were obtained under the same equipment configurations, objective lens, camera and light source. These histological samples were stained with H&E and digitized using the RGB color model with 24 bits of quantization,



Fig. 2. Color channels separation from RGB color model of a subimage of the case sj-05-5269-R10_002 from CLL lesion (a), with neoplastic nuclei indicated by red arrows 

and normal nuclei indicated by blue arrows, which exemplifies the contrast differences of the R (b), G (c) and B (d) channels, and their respective histograms illustrated by 

(e), (f) and (g). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) 

available for download (Shamir, Orlov, Eckley, Macura, & Goldberg, 2008).

The described image dataset was evaluated applying manual segmentation of subsets of each class. The image dataset considered in this work is composed of 12, 62 and 99 images of CLL, FL and MCL, respectively. Each case contains almost 2000 cells, an amount similar to that used in studies of segmentation of cytological and histological images (Dimitropoulos et al., 2016; Gençtav, Aksoy, & Önder, 2012; Wang, Hu, Li, Liu, & Zhu, 2016). The lesions were manually marked by a specialist and automatically analyzed by the proposed method.

2.3. Preprocessing

The main purpose of the preprocessing step is to improve image representation for subsequent stages through contrast-enhancement and/or noise reduction (Gonzalez & Woods, 2000). There are different preprocessing techniques, such as the decomposition into each component of color models and the image normalization to standardize the image colors (Hoffman, Kothari, & Wang, 2014). In this work, the RGB, HSV, LAB, LUV, YCbCr and YIQ color models were applied for decomposition into their components. The deconvolution process (Ruifrok & Johnston, 2001) was also used to separate the two stain components of hematoxylin and eosin into different images. The entropy metric was used to choose the appropriate channel (see Section 2.6).
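As an illustration of entropy-based channel selection, the sketch below computes the Shannon entropy of the intensity histogram of each candidate channel and returns the channel with the largest value. Both choices are assumptions: the entropy metric actually used is defined in Section 2.6, and the names `channel_entropy` and `pick_channel` are hypothetical.

```python
import math


def channel_entropy(channel, levels=256):
    """Shannon entropy (in bits) of the intensity histogram of one channel.

    `channel` is a list of rows of integer intensities in [0, levels - 1].
    """
    hist = [0] * levels
    for row in channel:
        for value in row:
            hist[value] += 1
    n = sum(hist)
    # Empty bins are skipped because p * log2(p) tends to 0 as p tends to 0.
    return -sum((c / n) * math.log2(c / n) for c in hist if c > 0)


def pick_channel(channels):
    """Return the name of the channel with the highest histogram entropy.

    Assumption: a higher entropy is taken as a sign of a more informative
    channel; the paper's criterion may differ.
    """
    return max(channels, key=lambda name: channel_entropy(channels[name]))
```

For example, `pick_channel({"R": r, "G": g, "B": b})` would be called once per image with the three decomposed channels.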

Fig. 2(b), (c) and (d), respectively, show differences in contrast among the R, G and B channels of a CLL image (Fig. 2(a)). Histograms of the R, G and B color channels are illustrated by Fig. 2(e), (f) and (g), respectively. As can be observed in Fig. 2(f), the G channel intensity level distribution is more uniform than the distribution of the R and B channels, indicating more contrast and, consequently, a better representation of the image (Gonzalez & Woods, 2000). This procedure was also applied to analyze the other lesions, leading to the choice of the B and R channels for FL and MCL images, respectively.

The histogram equalization technique was applied in order to normalize the color distribution across slides with different staining and illumination conditions (Jothi & Rajam, 2016). Equalization is a method for contrast-enhancement using histogram information. This method redistributes the intensity levels so that the image histogram can have a uniform distribution (Gonzalez & Woods, 2000).

Given an image with $n = M \times N$ pixels, characterized by discrete values for the gray levels $r$, equalization uses a function of cumulative probability distribution as its transformation function, which is expressed by:

$$ g_r = T(f_r) = \sum_{i=0}^{r} p_f(f_i) = \sum_{i=0}^{r} \frac{m_i}{n}, \qquad (1) $$

where $m_i$ represents the frequency of gray level $i$ and $p_f(f_i)$ is the probability of the $i$-th gray level.
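The transformation in Eq. (1) can be sketched in a few lines. The function name `equalize` is hypothetical, and rescaling the cumulative probability to the range [0, levels - 1] is an assumption made here so that the output is again an intensity image; Eq. (1) itself yields values in [0, 1].

```python
def equalize(image, levels=256):
    """Histogram equalization of a gray-level image given as a list of rows.

    Implements the cumulative sum of Eq. (1) and rescales it to integer
    intensities (the rescaling step is an assumption of this sketch).
    """
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1          # m_i: frequency of gray level i
    n = sum(hist)                     # n = M x N pixels

    cdf = []
    running = 0
    for m_i in hist:
        running += m_i
        cdf.append(running / n)       # cumulative sum of m_i / n up to level r

    return [[round(cdf[value] * (levels - 1)) for value in row] for row in image]
```

With a tiny four-level image, `equalize([[0, 0], [1, 3]], levels=4)` maps the two darkest levels upwards because half of the pixels already sit at level 0.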

Then, a Gaussian filter was applied to reduce noise and smooth the lymphoma images. This technique consists of a convolution process that uses a mask characterized by its size and element distribution, resulting in the sum of products between its elements and the intensity values of the image. The element distribution of the mask is defined by a 2-dimensional Gaussian function, as given by:

$$ G(x, y) = \frac{1}{2\pi\sigma^{2}} \, e^{-\frac{x^{2} + y^{2}}{2\sigma^{2}}}, \qquad (2) $$

where $x$ and $y$ represent the image pixel coordinates and the standard deviation is expressed by $\sigma$.

In this paper, the parameter σ was investigated with assignments in the range [0.5, 5.0], and the different sizes of the mask were explored applying 3 × 3, 5 × 5 and 7 × 7 pixels.
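A minimal sketch of the Gaussian mask of Eq. (2) and of its application is given below for one of the explored settings (3 × 3 pixels, σ = 2). The function names are hypothetical, and two details are assumptions of this sketch rather than statements from the paper: the mask is normalized so that its elements sum to one, and image borders are handled by replicating the edge pixels.

```python
import math


def gaussian_kernel(size=3, sigma=2.0):
    """Build a size x size mask sampled from Eq. (2), centered at (0, 0)."""
    half = size // 2
    kernel = [
        [
            math.exp(-(x * x + y * y) / (2 * sigma * sigma))
            / (2 * math.pi * sigma * sigma)
            for x in range(-half, half + 1)
        ]
        for y in range(-half, half + 1)
    ]
    total = sum(map(sum, kernel))
    # Normalization keeps the mean intensity unchanged (assumption).
    return [[v / total for v in row] for row in kernel]


def smooth(image, kernel):
    """Sum of products between the mask and each pixel neighborhood."""
    height, width = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            acc = 0.0
            for dy in range(-half, half + 1):
                for dx in range(-half, half + 1):
                    rr = min(max(r + dy, 0), height - 1)   # replicate borders
                    cc = min(max(c + dx, 0), width - 1)
                    acc += kernel[dy + half][dx + half] * image[rr][cc]
            out[r][c] = acc
    return out
```

Because the Gaussian mask is symmetric, the sum of products computed here coincides with a convolution. Larger values of `size` or `sigma` spread the weights over more neighbors and therefore smooth more strongly.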

Fig. 3 shows that the amount of smoothing and noise reduction increases with the mask parameters. Fig. 3(a), (e), (i) and (m) show the R channel of the same image for a better visualization and comparison of results along the rows of this representation. Fig. 3(b), (c) and (d) illustrate the results applying masks with sizes 3 × 3, 5 × 5 and 7 × 7 pixels, respectively, and the variable σ assigned to 0.5. Fig. 3(f), (g) and (h) are the obtained results with σ = 2 and masks with sizes 3 × 3, 5 × 5 and 7 × 7 pixels, respectively. Fig. 3(j)–(l) and (n)–(p) represent the images obtained with σ values of 3.5 and 5, respectively.

It is noticeable that the mask size can change image sharpness (see Fig. 3(f) and (h)). The sharpness level can hinder the distinction between nuclei and cytoplasm regions on the segmentation step (see Fig. 3(g), (h), (k), (l), (o) and (p)). In this step, a mask with size of 3 × 3 pixels and sigma 2, represented by Fig. 3(f), was



Fig. 3. Example of the use of Gaussian filter on a subimage of the case sj-04-4525-R4_001 from MCL class: (a), (e), (i), (m) R color channel from the original image, application 

with mask sizes of 3 × 3 ((b), (f), (j), (n)), 5 × 5 ((c), (g), (k), (o)) and 7 × 7 ((d), (h), (l), (p)) pixels, and assignments of sigma to 0.5 ((b)–(d)), 2 ((f)–(h)), 3.5 ((j)–(l)) and 5 

((n)–(p)). 

 
applied since it yields better distinctions between nuclear regions

and their surrounding areas. 

Another way to improve the histological image quality is based

on color normalization. An investigation of the method proposed

by Macenko et al. (2009) is presented in Section 3.1.1 . 

2.4. Segmentation 

Proposing an unsupervised segmentation method is one of the most difficult tasks in digital image processing. Many techniques can be applied in this step, such as thresholding, region growing, graphs and watershed (Gonzalez & Woods, 2000). Thresholding is one of the main methods applied to histological images as an efficient tool for segmentation of different structures (Meijering, 2012; Oswal, Belle, Diegelmann, & Najarian, 2013; Smochină, Herghelegiu, & Manta, 2011). Thus, the approach chosen here was based on multilevel thresholding to distinguish nucleus, cytoplasm and background. The threshold values were defined by a GA combined with the fuzzy 3-partition entropy technique (Yin, Zhao, Wang, & Gong, 2014) in order to obtain an unsupervised method. Other optimization algorithms can also be adopted to find the multilevel thresholds. Therefore, the evolutionary methods of artificial bee colony (ABC) (Bose & Mali, 2016; Heris, 2015a), cuckoo search (CS) (Bhandari, Kumar, & Singh, 2015; Yang, 2009), differential evolution (DE) (Cuevas, Zaldívar, & Perez-Cisneros, 2016; Heris, 2015b), particle swarm optimization (PSO) (Biswas, 2014; Remamany, Chelliah, Chandrasekaran, & Subramanian, 2015) and wind driven optimization (WDO) (Bayraktar, 2013; Bayraktar, Komurcu, & Werner, 2010) were also investigated for this problem (see Section 3.2.1). Also, the cytoplasm areas were considered during the segmentation step in order to obtain better results of nuclear contours. Fig. 4 shows the main steps of the GA used to calculate the threshold values (Paulinas & Ušinskas, 2015).

Firstly, it is necessary to define the GA initial population composed of a set of individuals. Then, the histogram was calculated and normalized using the preprocessed image. In this stage, the population size was empirically chosen as 60 individuals, in which each individual was represented as a chromosome encoded as six values related to the normalized histogram intensity levels.

The fuzzy 3-partition entropy technique was applied to define the values of each individual. The S and Z functions, represented by Eqs. (3) and (4), were considered to quantify the membership degree of gray levels ($k$) to each investigated region. In this work, two pairs of these functions were necessary for the segmentation of the three analyzed classes (nucleus, cytoplasm and background)



Fig. 4. Flowchart of GA algorithm for threshold values definition (adapted 

from Kaushik, Singh, Singhal, and Singh, 2013 © 2013 Citeseer). 

 
Fig. 5. Representative graph of membership functions used by fuzzy 3-partition 

entropy method with indication of threshold values by black arrows (adapted 

from Yin et al., 2014 © 2014 Elsevier). 


(Yin et al., 2014).

$$
S(k, u, v, w) =
\begin{cases}
1, & k \le u \\
1 - \dfrac{(k - u)^2}{(w - u) \cdot (v - u)}, & u < k \le v \\
\dfrac{(k - w)^2}{(w - u) \cdot (w - v)}, & v < k \le w \\
0, & k > w
\end{cases}
\tag{3}
$$

$$
Z(k, u, v, w) = 1 - S(k, u, v, w). \tag{4}
$$

The functions given by S and Z were considered as membership functions, which classify the pixels into three fuzzy sets, referred to as the three analyzed classes ($M_n$ related to the membership degree to nuclear regions, $M_c$ to cytoplasm regions and $M_b$ to background). These functions are represented by Eqs. (5)–(7):

$$
M_n(k) = S(k, u_1, v_1, w_1), \tag{5}
$$

$$
M_c(k) =
\begin{cases}
Z(k, u_1, v_1, w_1), & k \le w_1, \\
S(k, u_2, v_2, w_2), & k > w_1,
\end{cases}
\tag{6}
$$

$$
M_b(k) = Z(k, u_2, v_2, w_2), \tag{7}
$$

where $k$ represents the brightness levels of the image, in this study $0 \le k \le 255$, and $u_1$, $v_1$, $w_1$, $u_2$, $v_2$ and $w_2$, where $0 \le u_1 < v_1 < w_1 < u_2 < v_2 < w_2 \le 255$, are parameters that determine the distribution of membership degrees of each intensity level, as shown in Fig. 5.
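As an illustration of Eqs. (3)–(7), the membership degrees can be sketched as follows. This is a minimal sketch that follows the equations as written; the function names and the example parameter values are hypothetical and do not come from the original implementation.

```python
def s_function(k, u, v, w):
    """S-shaped membership of Eq. (3); assumes u < v < w."""
    if k <= u:
        return 1.0
    if k <= v:
        return 1.0 - (k - u) ** 2 / ((w - u) * (v - u))
    if k <= w:
        return (k - w) ** 2 / ((w - u) * (w - v))
    return 0.0


def z_function(k, u, v, w):
    """Z-shaped membership of Eq. (4), the complement of S."""
    return 1.0 - s_function(k, u, v, w)


def memberships(k, params):
    """Return (M_n, M_c, M_b) of Eqs. (5)-(7) for gray level k."""
    u1, v1, w1, u2, v2, w2 = params
    m_n = s_function(k, u1, v1, w1)
    if k <= w1:
        m_c = z_function(k, u1, v1, w1)
    else:
        m_c = s_function(k, u2, v2, w2)
    m_b = z_function(k, u2, v2, w2)
    return m_n, m_c, m_b


# Illustrative parameters only (not values reported in the paper).
example_params = (40, 80, 120, 150, 190, 230)
```

Because $w_1 < u_2$, the three degrees sum to 1 for every gray level, so each intensity is fully distributed among nucleus, cytoplasm and background.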

In Fig. 5, the pixels with intensity levels between the parameter $u_1$ and the intersection point of the $M_n$ and $M_c$ curves have a higher membership degree with the dark set, which is a feature of the nuclear regions of lymphoma images. The values of pixels contained in the interval defined by the intersection points of the curves are associated with cytoplasm regions, and values in the interval between the intersection point of the $M_c$ and $M_b$ curves and the parameter $w_2$ are associated with background and other irrelevant information.

The values that represent each individual were given by the parameters of the membership functions and analyzed considering the entropy function, represented in Eq. (8):

$$
H(u_1, v_1, w_1, u_2, v_2, w_2) = -P_n \cdot \log(P_n) - P_c \cdot \log(P_c) - P_b \cdot \log(P_b), \tag{8}
$$

where the probability value ($P$) of each structure was obtained by:

$$
P_n = \sum_{k=0}^{255} h(k) \cdot M_n(k), \tag{9}
$$

$$
P_c = \sum_{k=0}^{255} h(k) \cdot M_c(k), \tag{10}
$$

$$
P_b = \sum_{k=0}^{255} h(k) \cdot M_b(k), \tag{11}
$$

where $h(\cdot)$ represents the normalized histogram of the image.

In this step, the $H$ value must be maximized, indicating a greater amount of extracted information and allowing the definition of the best combination of parameters ($u_1$, $v_1$, $w_1$, $u_2$, $v_2$, $w_2$). The threshold values are represented by black arrows at the intersection points of the $M_n$, $M_c$ and $M_b$ functions in Fig. 5. Such values can also be defined considering the membership degrees, since both of them correspond to a membership degree of 0.5, as can be seen on the Y-axis in Fig. 5 (Yin et al., 2014).
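The fitness of one individual (Eqs. (8)–(11)) and the two thresholds at membership degree 0.5 can be sketched as below. This is a sketch under stated assumptions: the natural logarithm is used because Eq. (8) does not state the base, terms with zero probability are skipped, and the closed form in `half_crossing` is obtained here by solving $S(k, u, v, w) = 0.5$ in Eq. (3). All function names are hypothetical.

```python
import math


def s_function(k, u, v, w):
    """S-shaped membership of Eq. (3); assumes u < v < w."""
    if k <= u:
        return 1.0
    if k <= v:
        return 1.0 - (k - u) ** 2 / ((w - u) * (v - u))
    if k <= w:
        return (k - w) ** 2 / ((w - u) * (w - v))
    return 0.0


def fuzzy_entropy(hist, params):
    """Eq. (8): entropy of the fuzzy 3-partition for a normalized histogram."""
    u1, v1, w1, u2, v2, w2 = params
    p_n = p_c = p_b = 0.0
    for k, h in enumerate(hist):
        s1 = s_function(k, u1, v1, w1)
        s2 = s_function(k, u2, v2, w2)
        p_n += h * s1                                # Eq. (9)
        p_c += h * ((1.0 - s1) if k <= w1 else s2)   # Eq. (10)
        p_b += h * (1.0 - s2)                        # Eq. (11)
    # Natural log and skipping empty classes are assumptions of this sketch.
    return -sum(p * math.log(p) for p in (p_n, p_c, p_b) if p > 0.0)


def half_crossing(u, v, w):
    """Gray level at which S(k, u, v, w) equals 0.5."""
    if v - u >= (w - u) / 2.0:
        return u + math.sqrt((w - u) * (v - u) / 2.0)
    return w - math.sqrt((w - u) * (w - v) / 2.0)


def thresholds(params):
    """Thresholds t1 (nucleus/cytoplasm) and t2 (cytoplasm/background)."""
    u1, v1, w1, u2, v2, w2 = params
    return half_crossing(u1, v1, w1), half_crossing(u2, v2, w2)
```

In a GA run, `fuzzy_entropy` would play the role of the fitness function to be maximized, and `thresholds` would be applied once to the best individual.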

The selection step must be applied to select individuals considered as adequate solutions. These individuals are maintained in the next generation through the elitism process, and they are used by the crossover and mutation steps (Paulinas & Ušinskas, 2015). In this step, 30% of the population was used, so 18 of the 60 individuals that compose each population were carried into each subsequent generation. This value was chosen to reach population diversity, leading to a higher amount of new individuals and, consequently, new solutions.

The crossover operation must be applied to the population in order to generate new individuals. This stage is repeated until the population size is complete. In this study, the crossover probability (Jianli & Baoqi, 2009) was set to 0.65. This decision was based on empirical tests of the crossover probability defined between 0.5 and 1.0.

The value of the crossover point (CP) (Jianli & Baoqi, 2009) was randomly selected among the parameters $u_1$, $v_1$, $w_1$, $u_2$, $v_2$ and $w_2$ for obtaining two descendants. Fig. 6 illustrates an example of crossover, in which two individuals were used to obtain two descendants based on the CP assignment to value 2.
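The selection and crossover steps described above can be sketched as follows. This is only an illustrative sketch: the way parents are paired, the handling of parents when crossover is not triggered, and the names `single_point_crossover` and `next_generation` are assumptions that the text does not specify. The sketch also does not repair offspring that violate the ordering $u_1 < v_1 < w_1 < u_2 < v_2 < w_2$.

```python
import random


def single_point_crossover(parent_a, parent_b, cp):
    """Swap the genes located after the crossover point cp."""
    child_a = parent_a[:cp] + parent_b[cp:]
    child_b = parent_b[:cp] + parent_a[cp:]
    return child_a, child_b


def next_generation(population, fitness, elite_fraction=0.30,
                    crossover_prob=0.65, rng=random):
    """Keep the best individuals (elitism) and refill by crossover."""
    ranked = sorted(population, key=fitness, reverse=True)
    n_elite = round(elite_fraction * len(population))  # 18 of 60 in the paper
    elite = ranked[:n_elite]
    new_population = [list(individual) for individual in elite]
    while len(new_population) < len(population):
        a, b = rng.sample(elite, 2)  # random pairing is an assumption
        if rng.random() < crossover_prob:
            cp = rng.randint(1, len(a) - 1)
            a, b = single_point_crossover(a, b, cp)
        new_population.extend([list(a), list(b)])
    return new_population[:len(population)]
```

For the case of Fig. 6 (CP = 2), `single_point_crossover` keeps the first two genes of each parent and exchanges the remaining four.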



Fig. 6. Example of crossover performed on two individuals for obtaining their off- 

spring. 

Fig. 7. Example of application of the Hammouche et al. (2008) algorithm for def- 

inition of stop criteria over two generations ((a) and (b)) using a histogram repre- 

sentation. 

 
Fig. 8. Example of binary dilation with the increase of objects area, indicated by 

red arrows, and filling holes, identified by blue arrows: binary image (a), its corre- 

spondent regions identified on the original image (b), the result of this operation 

represented by a binary image (d) and the original image (e) using a structuring 

element with disk distribution with radius 2 (c). (For interpretation of the refer- 

ences to color in this figure legend, the reader is referred to the web version of this 

article.) 

Fig. 9. Example of opening operation that allows to eliminate isolated pixels iden- 

tified by red and blue arrows: binary image (a), original image (b) and the result of 

opening represented by a binary image (d) and its mapping on the original image 

(e) obtained using a structuring element with size of 3 × 3 with a square distri- 

bution (c). (For interpretation of the references to color in this figure legend, the 

reader is referred to the web version of this article.) 

 

The mutation was applied to a subset after the crossover step.

In this study, mutation was applied with a 0.01 probability to gen-

erate similar solutions by new attribute values ( Lad, Agrawal, &

Pandya, 2014 ). 

Finally, the method proposed by Hammouche, Diaf, and

Siarry (2008) was applied as GA stop criteria. In this work, the ex-

ecution was interrupted when the average of intensity levels of the

analyzed structures remains the same over two consecutive gener-

ations. Fig. 7 represents an example of a sufficient condition of stop

criteria. In Fig. 7 (a), the result obtained in a generation is presented

and Fig. 7 (b) illustrates the result obtained in the subsequent gen-

eration. The average values between the classes 1 and 2 ( t 1 ), and

2 and 3 ( t 2 ) are also presented. Thus, considering that the aver-

age between the classes 2 and 3 was the same in both generations

(250), the method is finished. 

2.5. Post-processing 

To reduce the number of possible false positive regions produced by the segmentation step, it is necessary to use techniques capable of refining the segmented structures. Initially, the area was computed for each segmented object. Regions with areas smaller than 10 pixels were removed, since empirical tests showed that these regions represent small noise.
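The removal of small regions can be illustrated with a minimal sketch. It assumes a binary mask stored as a list of rows and 8-connectivity between pixels; the connectivity is not stated in the text, and the function name `remove_small_regions` is hypothetical.

```python
from collections import deque


def remove_small_regions(mask, min_area=10):
    """Keep only connected regions with at least min_area pixels."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    out = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one connected region.
            seen[r][c] = True
            queue = deque([(r, c)])
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(region) >= min_area:
                for y, x in region:
                    out[y][x] = 1
    return out
```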

The segmentation stage was also unable to remove intra-nuclear regions of the lymphoma images. These regions are characterized by intensity levels that are brighter than the nuclei that contain them. Therefore, the valley-emphasis technique (Ng, 2006) was applied to the regions identified by the segmentation step, mapped on the preprocessed images.

This technique considers the smallest probability of occurrence ($p_T$) of the intensity levels to determine a threshold value. Besides, the variance between classes is maximized, according to the second term in Eq. (12), as defined by Otsu (1979):

$$
T = \max \left\{ (1 - p_T) \cdot \left( \omega_1(T)\, \mu_1^2(T) + \omega_2(T)\, \mu_2^2(T) \right) \right\}, \tag{12}
$$

where $\omega$ represents the occurrence probability of the considered classes (nuclei and intra-nuclear regions) and $\mu$ corresponds to the average of intensity levels of each class.
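Eq. (12) can be illustrated with a short sketch that scans every candidate threshold of a histogram. It assumes that class 1 contains the levels up to and including the candidate threshold and that candidates leaving one class empty are skipped; the function name `valley_emphasis_threshold` is hypothetical.

```python
def valley_emphasis_threshold(hist):
    """Return the level T that maximizes the objective of Eq. (12)."""
    total = float(sum(hist))
    p = [count / total for count in hist]
    levels = len(p)
    best_t, best_score = 0, -1.0
    for t in range(levels - 1):
        w1 = sum(p[:t + 1])       # probability of class 1 (levels <= t)
        w2 = sum(p[t + 1:])       # probability of class 2 (levels > t)
        if w1 <= 0.0 or w2 <= 0.0:
            continue              # skip candidates with an empty class
        mu1 = sum(i * p[i] for i in range(t + 1)) / w1
        mu2 = sum(i * p[i] for i in range(t + 1, levels)) / w2
        # The (1 - p[t]) weight favors thresholds located in histogram valleys.
        score = (1.0 - p[t]) * (w1 * mu1 ** 2 + w2 * mu2 ** 2)
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```

On a bimodal histogram the weight $(1 - p_T)$ penalizes levels on the peaks, so the selected threshold falls in the valley between the two modes.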
Finally, operations of dilation and opening were applied to the image obtained from the valley-emphasis method. The dilation operation allows filling holes and increasing the objects' areas (Gonzalez & Woods, 2000; Pedrini & Schwartz, 2007), as indicated by red and blue arrows, respectively, in Fig. 8(b) and (e). Fig. 8 illustrates the execution of this operation with a structuring element with disk distribution and radius 2 (Fig. 8(c)). Fig. 8(a) presents the binary image, in which segmented regions are represented by black pixels, and Fig. 8(b) presents its mapping on the original image. The obtained binary results are shown in Fig. 8(d) and the identification in relation to the original image is presented in Fig. 8(e).

The opening operation was applied to eliminate isolated pixels and to merge small neighboring regions. Fig. 9 illustrates this operation using a binary image (Fig. 9(a)) and its corresponding identification on the original image (Fig. 9(b)). In this example, a 3 × 3 pixels structuring element with square distribution was used (Fig. 9(c)), obtaining objects represented by a binary image (Fig. 9(d)) and the original image (Fig. 9(e)). Red and blue arrows indicate differences between regions before and after its application.
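A minimal sketch of the two morphological operations is given below. It assumes binary masks stored as lists of rows, background outside the image borders, and a Euclidean definition of the disk structuring element, which may differ from the disk used in the original implementation. The helper names are hypothetical.

```python
def disk(radius):
    """Offsets of a Euclidean disk structuring element (assumed shape)."""
    return [(dy, dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if dy * dy + dx * dx <= radius * radius]


def square(size):
    """Offsets of a size x size square structuring element (size odd)."""
    half = size // 2
    return [(dy, dx)
            for dy in range(-half, half + 1)
            for dx in range(-half, half + 1)]


def dilate(mask, offsets):
    """Binary dilation: every foreground pixel stamps the structuring element."""
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if mask[r][c]:
                for dy, dx in offsets:
                    ny, nx = r + dy, c + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        out[ny][nx] = 1
    return out


def erode(mask, offsets):
    """Binary erosion: a pixel survives only if the whole element fits."""
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            out[r][c] = int(all(
                0 <= r + dy < height and 0 <= c + dx < width
                and mask[r + dy][c + dx]
                for dy, dx in offsets))
    return out


def opening(mask, offsets):
    """Opening: erosion followed by dilation with the same element."""
    return dilate(erode(mask, offsets), offsets)
```

In terms of the text, `dilate(mask, disk(2))` corresponds to the operation of Fig. 8 and `opening(mask, square(3))` to the operation of Fig. 9.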



Fig. 10. Relation between regions identified by a specialist ( R ) and computational 

techniques ( S ) (adapted from Oger et al., 2012 © 2012 Elsevier). 


Fig. 11. Example of application of the preprocessing methods on a CLL image: (a) original image, (b) extracted G channel and (c) application of histogram equalization and Gaussian filter. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 12. Example of application of histogram equalization and Gaussian filter on B 

channel from a FL image: (a) original image, (b) extracted B channel and (c) appli- 

cation of preprocessing techniques. (For interpretation of the references to color in 

this figure legend, the reader is referred to the web version of this article.) 

2.6. Evaluation methods

In the preprocessing step, a metric based on entropy was applied to define the appropriate color model and filters. This metric was applied considering that its low value indicates high homogeneity and, consequently, absence of noise in the images, which could represent problems for the segmentation (Tsai, Lee, & Matsuyama, 2008). In addition, minimum entropy also indicates effective performance of filters capable of handling illumination differences, which can also represent complications in the segmentation (Leong, Brady, & McGee, 2003). Thus, the low value of entropy corresponds to the desired condition for the results from the preprocessing stage, calculated by Eqs. (13) and (14) (Belkacem-Boussaid et al., 2010):

$$
E = \sum_{i=1}^{N} p_i \cdot \log_2(p_i), \tag{13}
$$

$$
p = \frac{H_t(I(p, q))}{\sum_{t=1}^{N} H_t(I(p, q))}, \tag{14}
$$

where $H$ represents the image histogram, $N$ is the total number of intensity levels, $t$ is the index that defines the considered intensity level, and $i$ is the index that demonstrates the histogram variation.
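Eqs. (13) and (14) can be sketched as below, following the sign convention of Eq. (13) as written (no leading minus), so that the metric takes non-positive values such as those reported in Table 2. Skipping empty histogram bins is an assumption of this sketch, and the function name `entropy_metric` is hypothetical.

```python
import math


def entropy_metric(hist):
    """Eq. (13) applied to a histogram of pixel counts per intensity level."""
    total = float(sum(hist))
    value = 0.0
    for count in hist:
        if count > 0:                 # empty bins contribute nothing
            p = count / total         # Eq. (14): normalized histogram entry
            value += p * math.log2(p)
    return value
```

For a perfectly uniform 256-level histogram this metric equals −8, which is its lowest possible value for 8-bit images.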

In the segmentation stage, there are several metrics that can be used to evaluate the quality of the segmented image (Estrada & Jepson, 2009; Ghose et al., 2012). In this study, four metrics were considered: accuracy, sensitivity, specificity (Insana, Meyers, & Grossman, 2000) and variation of information (Wu, Zhao, Luo, & Shi, 2015).

The evaluation of a segmentation method can be performed by calculating the overlapping regions of the segmented image and the regions of a reference image demarcated by a specialist (gold-standard). Then, the following parameters were considered: true positive ($T_P$), corresponding to the number of correctly detected pixels; true negative ($T_N$), which represents the number of correctly undetected pixels; false positive ($F_P$), which is the number of incorrectly detected pixels; and false negative ($F_N$), which indicates the number of incorrectly undetected pixels (Chang et al., 2014). Fig. 10 illustrates the relation between these concepts, gold-standard regions ($R$) and objects identified by unsupervised segmentation techniques ($S$).

The metrics of sensitivity ($Se$), specificity ($Sp$) and accuracy ($Ac$) are defined in Eqs. (15), (16) and (17), respectively. Sensitivity defines the amount of manually detected pixels that were also correctly segmented by the proposed algorithm. Specificity is determined to quantify the percentage of correctly defined true negatives (Oger et al., 2012). Accuracy corresponds to the hit rate of the proposed segmentation relative to the manual segmentation (Byrd, Zeng, & Chouikha, 2007).

$$
Se = \frac{T_P}{T_P + F_N}, \tag{15}
$$

$$
Sp = \frac{T_N}{T_N + F_P}, \tag{16}
$$

$$
Ac = \frac{T_P + T_N}{T_P + T_N + F_P + F_N}. \tag{17}
$$
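Eqs. (15)–(17) can be illustrated with a pixel-wise comparison of two binary masks. This is a minimal sketch that assumes masks of equal size stored as lists of rows, with 1 marking a detected pixel; the function names are hypothetical.

```python
def confusion_counts(reference, segmented):
    """Count T_P, T_N, F_P and F_N between gold-standard and result masks."""
    tp = tn = fp = fn = 0
    for ref_row, seg_row in zip(reference, segmented):
        for ref, seg in zip(ref_row, seg_row):
            if ref and seg:
                tp += 1
            elif not ref and not seg:
                tn += 1
            elif seg:
                fp += 1
            else:
                fn += 1
    return tp, tn, fp, fn


def segmentation_metrics(tp, tn, fp, fn):
    """Return (Se, Sp, Ac) as defined in Eqs. (15)-(17)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy
```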

The metric of variation of information ($VoI$) is defined to quantify the distance between $R$ and $S$, as demonstrated by Eq. (18), where $H(X)$ represents the entropy of $X$ and $I(X, Y)$ corresponds to the mutual information between $X$ and $Y$.

$$
VoI(R, S) = H(R) + H(S) - 2\, I(R, S) \tag{18}
$$

It should be noted that low values of $VoI$ are required to express more similarities between unsupervised and manual segmentations (Wu et al., 2015). Eqs. (15)–(17) indicate that high values obtained from these calculations represent better results of the unsupervised segmentation.
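Eq. (18) can be sketched for two label images as below. The sketch estimates the entropies and the mutual information from pixel-wise label frequencies and uses base-2 logarithms; the logarithm base is not stated in the text, so it is an assumption, as is the function name `variation_of_information`.

```python
import math
from collections import Counter


def variation_of_information(reference, segmented):
    """Eq. (18): VoI(R, S) = H(R) + H(S) - 2 I(R, S) from pixel labels."""
    pairs = [(r, s)
             for ref_row, seg_row in zip(reference, segmented)
             for r, s in zip(ref_row, seg_row)]
    n = float(len(pairs))
    joint = Counter(pairs)
    count_r = Counter(r for r, _ in pairs)
    count_s = Counter(s for _, s in pairs)
    h_r = -sum(c / n * math.log2(c / n) for c in count_r.values())
    h_s = -sum(c / n * math.log2(c / n) for c in count_s.values())
    mutual = sum(
        c / n * math.log2((c / n) / ((count_r[r] / n) * (count_s[s] / n)))
        for (r, s), c in joint.items())
    return h_r + h_s - 2.0 * mutual
```

Identical segmentations yield a value of 0, and the value grows as the two labelings share less information.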

3. Results and discussion

The results of each step and their limitations are presented in detail in this section. Furthermore, comparative results are discussed applying quantitative and qualitative analyses.

3.1. Preprocessing

Fig. 11(a), (b) and (c) show a CLL class original image, the extraction of its G color channel and the preprocessed image, respectively. Results of the preprocessing step of an FL image are presented by Fig. 12(a)–(c). An MCL image is illustrated by Fig. 13(a), as well as the extraction of its R channel and the result of this step by Fig. 13(b) and (c), respectively.

Table 2 presents entropy values obtained by the application of histogram equalization and Gaussian filter on each considered color channel. Minimum entropy values, represented in bold, were obtained by G, B and R channels for CLL, FL and MCL lesions, respectively. This metric was used in this study, since its minimum value indicates a good performance measure related to preprocessing filters for dealing with illumination differences and noise, as presented by Belkacem-Boussaid et al. (2010).



Fig. 13. Example of preprocessing application on a MCL image: (a) original image, 

(b) extracted R channel and (c) result of application of preprocessing techniques 

that reached minimum entropy. (For interpretation of the references to color in this 

figure legend, the reader is referred to the web version of this article.) 

Table 2
Entropy values obtained from application of the preprocessing methods on each channel from the color models of RGB, HSV, LAB, LUV, YCbCr, YIQ, hematoxylin and eosin.

Channels      CLL       FL        MCL
R (RGB)       −7.7922   −7.7889   −7.9830
G (RGB)       −7.9900   −7.7889   −7.4814
B (RGB)       −7.7914   −8.0021   −7.4822
H (HSV)       −7.7155   −7.7505   −7.7969
S (HSV)       −7.7847   −7.7981   −7.7753
V (HSV)       −7.7910   −7.7889   −7.7816
L (LAB)       −7.4909   −7.7954   −7.7823
a (LAB)       −7.5604   −7.8108   −7.4377
b (LAB)       −7.4346   −7.8081   −7.4539
L (LUV)       −7.0308   −7.6281   −7.0504
U (LUV)       −6.8558   −7.5759   −6.8782
V (LUV)       −7.4842   −7.6906   −7.6759
Y (YCbCr)     −7.7702   −7.7949   −7.7813
Cb (YCbCr)    −7.3676   −7.5534   −7.3814
Cr (YCbCr)    −7.2570   −7.5603   −7.2217
Y (YIQ)       −7.6907   −7.7949   −7.6817
I (YIQ)       −5.1639   −6.5765   −5.9798
Q (YIQ)       −7.4339   −7.6344   −7.2159
Hematoxylin   −7.4469   −7.7884   −7.6084
Eosin         −7.4848   −7.3846   −7.5996

 
3.1.1. Normalization analysis 

The normalization method of Khan (2013) ; Macenko

et al. (2009) was applied to compare the obtained results from the

preprocessing stage. Normalization techniques allow the removal

of color inconsistencies and it can be considered an important pro-

cessing step for histopathological images ( Li & Plataniotis, 2015 ).

In this method, it is necessary to choose a reference image so its

colors can be mapped on the processed images. In this regard, the

lymphoma images were analyzed in the search for the image with

the highest contrast between the background and the ROIs. In this

application, Fig. 14 illustrates the chosen image and its histogram.

The sparse distribution of the histogram (Fig. 14(b)) indicates a high contrast of this image (Gonzalez & Woods, 2000).

Original images histograms were non-uniform. This condition

makes the application of the proposed segmentation inadequate

since this color distribution indicates low contrast, making the

identification of ROIs more difficult. In this distribution, the thresh-

old values defined by the proposed segmentation are not ca-

pable of identifying the neoplastic nuclei. Even after normal-

ization, the images had non-uniform histograms. The displace-

ment of histogram peaks to the left, darkening pixels that were

brighter, still indicates low contrasts. To demonstrate this limita-

tion, Fig. 15 illustrates a CLL image after the proposed preprocess-

ing step ( Fig. 15 (a)) and after the normalization with conversion

to grayscale ( Fig. 15 (c)). Their histograms are also shown, where

it is noted that the normalized histogram presents a non-uniform

distribution, and consequently a lower contrast. 
Through the entropy metric used for the preprocessing evaluation, the normalization technique also presented higher results than the proposed preprocessing step. CLL, FL and MCL lesions presented quantitative results of −7.0409, −7.1282 and −7.0442, respectively. These values are higher than the minimum values presented in Table 2, indicating poorer performance for illumination differences and noise presence.

3.2. Segmentation

Figs. 16(a), 17(a) and 18(a) present original images of CLL, FL and MCL, respectively. Figs. 16(b), 17(b) and 18(b) present images resulting from the preprocessing step of each class, and Figs. 16(c), 17(c) and 18(c) illustrate the histograms of the preprocessed images, as well as the threshold values obtained by the segmentation step, indicated by red lines. Figs. 16(d), 17(d) and 18(d) illustrate segmentation results as binary images. Figs. 16(e), 17(e) and 18(e) present the original images with regions identified by the segmentation. Finally, Figs. 16(f), 17(f) and 18(f) show in detail some regions with correspondents highlighted for the analysis of this step.

Figs. 16(f), 17(f) and 18(f) present irregular contours of identified regions, which is in contrast to the expected results for this application. Thus, the post-processing step was necessary to correct the resulting contours. Furthermore, the minimum size of true positive regions was determined in order to remove the false positive regions, which were associated with small noise. Nuclear segmentation containing two identified nuclei as one object was also observed, as represented by some regions with red contours in Figs. 16(f), 17(f) and 18(f).

3.2.1. Analysis of evolutionary algorithms

The segmentation results obtained by the optimization methods of ABC, CS, DE, GA, PSO and WDO are presented in Table 3.

Considering the CLL lesion, the best results of accuracy, sensitivity, specificity and variation of information metrics were 81.06% (GA), 47% (ABC, CS, DE, PSO and WDO), 89% (GA) and 1.11 (ABC), respectively. As noted in Table 3, the highest value of sensitivity was shared among the vast majority of the methods used, with the exception of the GA. However, the best specificity result was obtained by this method, which directly influenced its accuracy, which stood out among the others.

When the FL and MCL lesions were considered, the best results of accuracy, specificity and variation of information were obtained by the GA. The best sensitivity (57%) of the FL lesion was shared among ABC, DE and PSO. In the MCL lesion, sensitivity reached the highest value (50%) with the DE technique. However, again, the GA achieved the lowest sensitivity and the highest specificity.

The presented results indicate the poor performance of the GA in the identification of the gold-standard regions, represented by the sensitivity metric. The results shown in Table 3 can also indicate that DE provided the best sensitivity values for the three groups. Considering these values, DE can obtain good threshold values by the use of only one best-proposed solution over the steps of selection, crossover and mutation. Meanwhile, the GA uses a set of best solutions in the steps of selection and crossover, exploring more possible solutions at the cost of more variations in its results. Through a consideration of the best results presented across most of the metrics of the analyzed lesions, the GA technique is considered the best optimization method for this application.

The other methods have presented intermediate results due to their randomness. ABC uses a random approach in the proposal of new solutions by the employed bees. In the CS method, the proposed solutions, represented by bird nests, also use randomness



Fig. 14. Reference image for normalization analysis in the preprocessing step (a) and its histogram (b). 

Table 3
Average results of segmentation through the metrics of accuracy, sensitivity, specificity and variation of information using the ABC, CS, DE, GA, PSO and WDO on lymphoma images.

Lesion   Optimization method   Accuracy   Sensitivity   Specificity   Variation of information
CLL      ABC                   79.92%     47%           86%           1.11
CLL      CS                    80.07%     47%           87%           1.20
CLL      DE                    79.94%     47%           86%           1.21
CLL      GA                    81.06%     41%           89%           1.14
CLL      PSO                   80.02%     47%           86%           1.20
CLL      WDO                   79.88%     47%           86%           1.21
FL       ABC                   81.44%     57%           85%           1.11
FL       CS                    81.69%     56%           85%           1.10
FL       DE                    81.47%     57%           85%           1.11
FL       GA                    82.83%     51%           87%           1.04
FL       PSO                   81.70%     57%           85%           1.11
FL       WDO                   81.52%     56%           85%           1.13
MCL      ABC                   78.40%     47%           84%           1.20
MCL      CS                    79.56%     42%           84%           1.15
MCL      DE                    79.29%     50%           80%           1.15
MCL      GA                    80.76%     42%           86%           1.09
MCL      PSO                   79.55%     46%           84%           1.15
MCL      WDO                   79.30%     47%           84%           1.16

independent of the best solution of each generation. In this approach, if a nest with a cuckoo egg defined by a random parameter is discovered, this nest is removed, even if it is the best solution of the current generation. Although it uses the best local and global solutions in its exploration step, PSO presents the influence between the updates of particle position and its velocity. Considering this dependency relationship, PSO convergence becomes limited by controlling the exploration ability of this method (Shi & Eberhart, 1998). The influence between variables is also noted in the WDO method, given by the updates of the position and velocity of a particle, which also limits the exploration of new solutions.

3.3. Post-processing

Fig. 19 illustrates the application of valley-emphasis on a case of each lesion. Fig. 19(a), (b) and (c) show a CLL segmented image, a region in detail and the corresponding area obtained by applying the valley-emphasis method, respectively. Fig. 19(d)–(f) illustrate a case of FL lesion containing a segmented image, a segmented area in detail and the same region after application of this technique. Fig. 19(g)–(i) correspond to a segmented image from an MCL case, one of its regions and the result of application of this step on this area.

Fig. 19(c), (f) and (i) indicate that some cells were characterized by holes and irregular contours (marked by red arrows). Thus, morphological operations of dilation and opening were applied, and Fig. 20 illustrates an example of each group after the application of these methods. Fig. 20(a), (d) and (g) present segmented images with application of the valley-emphasis method on the CLL, FL and MCL images, respectively. Fig. 20(b), (e) and (h) illustrate regions from CLL, FL and MCL cases in detail, respectively, with irregular regions marked by red arrows. Finally, Fig. 20(c), (f) and (i) present the results of the morphological operations, with red arrows indicating the obtained corrections.

3.4. Comparative analysis of results

It is important to note that no computational techniques were found that had been developed for the analysis of CLL and MCL groups that contemplate histological images from lymph node biopsies. Considering segmentation methods for FL images, there are no studies exclusively related to the segmentation of both its characteristic structures, centrocytes and centroblasts, as proposed by this study. Therefore, some studies related to segmentation of nuclear structures in histological images stained with H&E were used for comparing results. The mean-shift method (Comaniciu & Meer, 2002) was used due to its wide application by works proposed for segmentation of histological images, such as Xing and Yang (2013) and Sertel et al. (2010a). The techniques proposed by Vahadane and Sethi (2013),



Fig. 15. A CLL image after preprocessing step (a), its histogram (b), and the same image after normalization of Macenko et al. (2009) (c), and its histogram (d), indicating a 

lower contrast. 

 

Wienert et al. (2012) , de Oliveira et al. (2013) , Phoulady et al.

(2016) and Paramanandam et al. (2016) were also applied for pre-

senting nuclear segmentation methods of histological images. 

Figs. 21(a), 22(a) and 23(a) present original images with 20× magnification of CLL, FL and MCL classes, respectively. Reference images manually segmented by the specialist are represented by Figs. 21(b), 22(b) and 23(b). Figs. 21(c), 22(c) and 23(c) illustrate results obtained by application of the proposed segmentation method on CLL, FL and MCL images, respectively.



Fig. 16. Example of segmentation application on a subimage of sj-05-3344_006 case from CLL class: (a) original image, (b) image resulting from preprocessing step, (c) histogram of preprocessed image with threshold values of 80 and 197, indicated by red lines, (d) binary image that was obtained from the segmentation, (e) mapping of regions identified in segmentation on the original image and (f) the regions in detail for result analysis. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

 

Figs. 21 (d), 22 (d) and 23 (d) show the results obtained by the

ean-shift technique ( Comaniciu & Meer, 2002 ). Figs. 21 (e), 22 (e)

nd 23 (e) present the results after application of the method

ound in Vahadane and Sethi (2013) . Figs. 21 (f), 22 (f) and 23 (f)

llustrate the results of the Wienert et al. (2012) tech-

ique. Figs. 21 (g), 22 (g) and 23 (g) show the results

f de Oliveira et al. (2013) algorithm. Finally, Figs. 21 (h), 22 (h)

nd 23 (h), and Figs. 21 (i), 22 (i) and 23 (i) present the results

f Phoulady et al. (2016) and Paramanandam et al. (2016) , respec-

ively. 

In comparison to manual segmentation ( Figs. 21 (b), 22 (b)

nd 23 (b)), the results provided by the proposed method

 Figs. 21 (c), 22 (c) and 23 (c)) show the identified nuclei and

alse positive regions (indicated by red arrows). These results

 Figs. 21 (c), 22 (c) and 23 (c)) show a larger number of nuclei de-

ected (false positive) to the marking performed by the special-

st. However, its spatial distribution was similar to the specialist

egmentation. The segmentation result of FL ( Fig. 22 (c)) indicates

hat the proposed method enabled the identification of a signifi-

ant quantity of true positive regions, as indicated by the blue ar-

ows, corresponding to centrocytes and centroblasts. 

Results of the mean-shift method are presented in Figs. 21(d), 22(d) and 23(d). The results show false positive regions (indicated by red arrows). It is also noted that the regions identified by this method have larger areas than the regions marked by the specialist, which represents an overlapping problem. The amount of false positive regions is still considerable in the results of Wienert et al. (2012) (Figs. 21(f), 22(f) and 23(f)). The false negative regions were shown to be of a larger quantity in this method (indicated by red arrows).

Figs. 21(e), 22(e) and 23(e) show the results obtained by the Vahadane and Sethi (2013) technique on the different lymphoma classes. One observes that many regions were identified with several nuclei (marked by red rectangular contours), which can result in an incorrect analysis of nuclear regions.

The method of de Oliveira et al. (2013) (Figs. 21(g), 22(g) and 23(g)) presented overlapping in all classes. Using its own pre- and post-processing, this proposal did not enable the separation of identified regions containing more than one nucleus. This limitation can result in erroneous analyses in possible steps of feature extraction and classification.

The results of Phoulady et al. (2016) (Figs. 21(h), 22(h) and 23(h)) also present a considerable overlapping rate, mainly in the CLL lesion, where identified regions have larger areas than those manually segmented. Although it yields consistent nuclear morphology, this method segmented false positive regions, indicated by red arrows, and missed false negative nuclei, indicated by blue arrows in Fig. 22(h). A common limitation of this method among the lesions was the identification of two nuclei as one object, indicated by red rectangles.



Fig. 17. Example of segmentation application on a subimage of sj-05-6124-R3_006 case from FL class: (a) original image, (b) image resulting from preprocessing, (c) histogram 

of the previous image with thresholds 67 and 189, indicated by red lines, (d) binary image from the segmentation, (e) identification of regions obtained from segmentation 

on the original image and (f) the regions for result analysis. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of 

this article.) 

Table 4

Quantitative results obtained by segmentation methods applied on CLL lesion.

| Techniques | Accuracy | Sensitivity | Specificity | Variation of Information |
|---|---|---|---|---|
| Proposed Method | 81.06% | 41% | 89% | 1.14 |
| Mean-shift (Comaniciu & Meer, 2002) | 79.74% | 48% | 86% | 1.21 |
| Vahadane and Sethi (2013) | 77.07% | 21% | 88% | 1.16 |
| Wienert et al. (2012) | 78.53% | 47% | 84% | 1.25 |
| de Oliveira et al. (2013) | 70.60% | 67% | 71% | 1.43 |
| Phoulady et al. (2016) | 71.97% | 70% | 72% | 3.77 |
| Paramanandam et al. (2016) | 81.85% | 4% | 96% | 0.42 |

 
The segmentation of Paramanandam et al. (2016) presented a poor performance across all lesions. The parameter of this algorithm corresponding to the width of a typical region of interest was set to 4, which was shown to be an adequate value for application on the images used. Some false negative regions are observed in Figs. 21(i), 22(i) and 23(i), as well as some false positive regions, indicated by red arrows. Despite the irregular contours, some true positive regions were segmented, as indicated by blue arrows in Fig. 22(i). Fig. 23(i) highlights, using red rectangles, segmented regions that cover both true positive and false positive regions.

In order to evaluate the proposed method and the aforementioned techniques, Tables 4–6 show their quantitative performance. These results were obtained by the metrics of accuracy, sensitivity, specificity and variation of information of the proposed method, mean-shift and the techniques of Vahadane and Sethi (2013), Wienert et al. (2012), de Oliveira et al. (2013), Phoulady et al. (2016) and Paramanandam et al. (2016).
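The three rate-based metrics can be illustrated with a minimal sketch, assuming the standard pixel-wise definitions over binary masks (1 = nucleus pixel, 0 = background). The function names and the toy masks are hypothetical, and the evaluation protocol actually used in the study may differ in detail.

```python
def confusion_counts(predicted, reference):
    """Return (TP, FP, TN, FN) for two flat binary masks of equal length."""
    if len(predicted) != len(reference):
        raise ValueError("masks must contain the same number of pixels")
    tp = fp = tn = fn = 0
    for p, r in zip(predicted, reference):
        if p and r:
            tp += 1          # nucleus pixel correctly detected
        elif p and not r:
            fp += 1          # background marked as nucleus
        elif not p and not r:
            tn += 1          # background correctly rejected
        else:
            fn += 1          # nucleus pixel missed
    return tp, fp, tn, fn


def pixel_metrics(predicted, reference):
    """Accuracy, sensitivity and specificity from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(predicted, reference)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Toy example (hypothetical data): 3 nucleus pixels, 7 background pixels.
reference = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = pixel_metrics(predicted, reference)
print(metrics)
```

In this toy case only one of three nucleus pixels is found, yet accuracy remains at 0.7 because background pixels dominate the image.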

Evaluation of the accuracy metric, which quantifies how close the unsupervised segmentation is to the manual segmentation, indicates that the proposed method reached average results of 81.06%, 82.83% and 80.76% for the classes CLL, FL and MCL, respectively. This result is higher than the values obtained by the methods used for comparison in the FL class, but for the CLL and MCL classes, Paramanandam et al. (2016) reached the highest accuracy results. However, this quantitative metric may be insufficient for



Fig. 18. Example of application of segmentation step using a subimage of sj-05-768_013 case from MCL class: (a) original image, (b) image resulting from preprocessing step, 

(c) preprocessed image histogram with threshold values 73 and 194, indicated by red lines, (d) binary image from the segmentation, (e) identification of regions obtained 

from segmentation on the original image and (f) the regions for result analysis. (For interpretation of the references to color in this figure legend, the reader is referred to 

the web version of this article.) 

Table 5

Performance of the segmentation methods for FL images.

| Techniques | Accuracy | Sensitivity | Specificity | Variation of Information |
|---|---|---|---|---|
| Proposed Method | 82.83% | 51% | 87% | 1.04 |
| Mean-shift (Comaniciu & Meer, 2002) | 81.62% | 58% | 84% | 1.10 |
| Vahadane and Sethi (2013) | 79.38% | 25% | 86% | 1.08 |
| Wienert et al. (2012) | 81.05% | 53% | 85% | 1.11 |
| de Oliveira et al. (2013) | 70.47% | 69% | 70% | 1.33 |
| Phoulady et al. (2016) | 74.56% | 70% | 75% | 3.42 |
| Paramanandam et al. (2016) | 82.45% | 10% | 92% | 0.46 |

Table 6

Evaluation of the application of the segmentation methods on MCL lesion.

| Techniques | Accuracy | Sensitivity | Specificity | Variation of Information |
|---|---|---|---|---|
| Proposed Method | 80.76% | 42% | 86% | 1.09 |
| Mean-shift (Comaniciu & Meer, 2002) | 79.50% | 48% | 83% | 1.15 |
| Vahadane and Sethi (2013) | 80.67% | 17% | 89% | 0.96 |
| Wienert et al. (2012) | 79.95% | 41% | 85% | 1.12 |
| de Oliveira et al. (2013) | 67.10% | 68% | 66% | 1.37 |
| Phoulady et al. (2016) | 73.35% | 59% | 75% | 3.36 |
| Paramanandam et al. (2016) | 82.24% | 9% | 91% | 0.45 |



Fig. 19. Example of application of the valley-emphasis segmentation method for nuclei splitting: (a), (d) and (g) represent segmented images of CLL, FL and MCL, respectively; (b), (e) and (h) illustrate zoomed regions of the corresponding segmented images; (c), (f) and (i) present the regions resulting from this process. (For interpretation of the references to color in the text, the reader is referred to the web version of this article.)

 
the analysis of segmentation techniques. Considering the results

of Paramanandam et al. (2016) , segmentation errors occurred in

the nuclei identification, as observed in Figs. 21 (i), 22 (i) and 23 (i). 
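Why accuracy alone can mislead here follows from a simple identity: for a single confusion matrix, accuracy is the prevalence-weighted mean of sensitivity and specificity. The sketch below applies it to the sensitivity and specificity values reported for the CLL class in Table 4. The nucleus-pixel prevalence of 15% is a hypothetical value chosen for illustration only (it is not reported in the text), and since the tables report averages over images the identity holds only approximately for them.

```python
def accuracy_from_rates(sensitivity, specificity, prevalence):
    """Accuracy = prevalence * sensitivity + (1 - prevalence) * specificity."""
    return prevalence * sensitivity + (1.0 - prevalence) * specificity


# Assumed fraction of nucleus pixels in an image (hypothetical, for illustration).
prevalence = 0.15

# Sensitivity/specificity pairs taken from Table 4 (CLL class).
almost_no_detection = accuracy_from_rates(0.04, 0.96, prevalence)  # Paramanandam et al. (2016)
proposed_method = accuracy_from_rates(0.41, 0.89, prevalence)      # proposed method

print(round(almost_no_detection, 3), round(proposed_method, 3))
```

Under this assumption, a method that detects only 4% of the nucleus pixels still scores a slightly higher accuracy than one that detects 41% of them, because background pixels carry 85% of the weight.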

For the sensitivity and specificity metrics, the proposed method presented low values in the first metric and high values in the second one. The mean-shift method and that of Wienert et al. (2012) presented similar results. The lowest sensitivities were reached by Vahadane and Sethi (2013) and Paramanandam et al. (2016). In combination with their high specificity values, the accuracy values of these methods are higher than or very close



Fig. 20. Example of morphological operations application for correction of segmented cells: (a), (d) and (g) illustrate segmented images of CLL, FL and MCL, respectively, (b), 

(e) and (h) present their segmented regions highlighted and zoomed, (c), (f) and (i) present their results after this processing. (For interpretation of the references to color 

in the text, the reader is referred to the web version of this article.) 

to those of the proposed method, even with their very poor qualitative results. The highest sensitivities were obtained by de Oliveira et al. (2013) and Phoulady et al. (2016), but their specificities were the lowest when compared to the other methods, indicating higher overlapping rates.

Variation of information enables the quantification of distances between unsupervised and manual segmentations, with lower values indicating closer agreement. The proposed algorithm presented relevant results for all classes, but the lowest values were obtained by Paramanandam et al. (2016). However, the performance of this algorithm in qualitative evaluation does not



Fig. 21. Comparison of segmentation techniques on a CLL sample: (a) original image, (b) image segmented by the specialist, (c) proposed method, (d) mean-shift, (e) 

technique of Vahadane and Sethi (2013) , (f) method of Wienert et al. (2012) , (g) algorithm of de Oliveira et al. (2013) , (h) proposal of Phoulady et al. (2016) and (i) study 

of Paramanandam et al. (2016) . (For interpretation of the references to color in the text, the reader is referred to the web version of this article.) 

Fig. 22. Results of segmentation techniques for FL image: (a) original image, (b) manually segmented image, (c) proposed method, (d) mean-shift, (e) study 

of Vahadane and Sethi (2013) , (f) method of Wienert et al. (2012) , (g) proposal of de Oliveira et al. (2013) , (h) algorithm of Phoulady et al. (2016) and (i) technique 

of Paramanandam et al. (2016) . (For interpretation of the references to color in the text, the reader is referred to the web version of this article.) 

Fig. 23. Segmentation of an MCL image with different segmentation techniques: (a) original image, (b) image segmented by the specialist, (c) proposed algorithm, (d) mean-shift,

(e) algorithm of Vahadane and Sethi (2013) , (f) study of Wienert et al. (2012) , (g) technique of de Oliveira et al. (2013) , (h) method of Phoulady et al. (2016) and (i) algorithm 

of Paramanandam et al. (2016) . (For interpretation of the references to color in the text, the reader is referred to the web version of this article.) 



represent this condition, since the results showed incorrect identification of nuclei regions in the lymphoma images.
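The variation of information between two segmentations can be sketched as the sum of the two conditional entropies, VI(A, B) = H(A|B) + H(B|A), computed from the label co-occurrence counts. This is a minimal illustration assuming flat label sequences and natural logarithms; the logarithm base and the exact protocol used for Tables 4–6 are not stated in this section, so absolute values are not directly comparable.

```python
import math
from collections import Counter


def variation_of_information(labels_a, labels_b):
    """VI(A, B) = H(A|B) + H(B|A) for two labelings of the same pixels."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("labelings must be non-empty and of equal length")
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    vi = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        # -p_ab*log(p_ab/p_a) contributes to H(B|A); -p_ab*log(p_ab/p_b) to H(A|B).
        vi -= p_ab * (math.log(p_ab / p_a) + math.log(p_ab / p_b))
    return vi


# Same partition with different label names: distance 0.
same_partition = variation_of_information([0, 0, 1, 1], [5, 5, 7, 7])
# Statistically independent partitions: distance H(A) + H(B) = 2 ln 2.
independent = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])
print(same_partition, independent)
```

Because the measure depends only on how pixels are grouped, a segmentation that labels almost everything as background can still be close to a reference that is mostly background, which is consistent with the caution expressed above about reading this metric in isolation.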

Considering the obtained results, one notes the good performance of the proposed method for splitting ROIs. The same is not observed in the other techniques, a consequence of a high overlapping rate, as indicated by the results of de Oliveira et al. (2013). The results of Phoulady et al. (2016) indicate that the proposed method using the valley-emphasis technique was effective, since this study also used only intensity information for this purpose. In contrast to the large amount of false negative regions identified by Vahadane and Sethi (2013) and Paramanandam et al. (2016), the proposed algorithm and the techniques of mean-shift and Wienert et al. (2012) presented a larger amount of true positive identifications. However, mean-shift and the method of Wienert et al. (2012) presented larger quantities of false positive regions, as indicated by the subtle differences in the specificity metric and by the qualitative results. Thus, even though the proposed method presents a high false positive rate, it is still capable of identifying more true positive regions.

3.4.1. Complexity analysis

A complexity analysis was performed to verify the behavior of the methods used. The proposed approach is composed of three stages: preprocessing, segmentation and post-processing. The asymptotic behavior was estimated at O(n²), considering that the segmentation step requires the most significant processing time, with the operations of selection, crossover and mutation that are required by the GA.
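The structure that dominates the cost can be seen in a minimal GA sketch for selecting two thresholds from a gray-level histogram. This is an illustration under stated assumptions, not the authors' implementation: the fitness used here is a stand-in (between-class variance of the three resulting classes) rather than the fuzzy 3-partition entropy, and the population size, number of generations, mutation rate, truncation selection and elitism are hypothetical choices.

```python
import random


def between_class_variance(hist, t1, t2):
    """Stand-in fitness: between-class variance of the classes split at t1 and t2."""
    total = sum(hist)
    mean_all = sum(i * h for i, h in enumerate(hist)) / total
    score = 0.0
    for lo, hi in ((0, t1), (t1, t2), (t2, len(hist))):
        weight = sum(hist[lo:hi])
        if weight == 0:
            continue
        mean = sum(i * hist[i] for i in range(lo, hi)) / weight
        score += (weight / total) * (mean - mean_all) ** 2
    return score


def random_pair(rng, levels):
    """Draw two distinct thresholds in [1, levels - 1], returned in order."""
    t1, t2 = sorted(rng.sample(range(1, levels), 2))
    return (t1, t2)


def ga_two_thresholds(hist, pop_size=20, generations=30, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    levels = len(hist)

    def fitness(individual):
        return between_class_variance(hist, *individual)

    population = [random_pair(rng, levels) for _ in range(pop_size)]
    history = []  # best fitness seen at each generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: pop_size // 2]           # selection (truncation)
        children = [ranked[0]]                      # elitism keeps the best pair
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [a[0], b[1]]                    # crossover: one gene from each parent
            if rng.random() < mutation_rate:        # mutation: small random shift
                k = rng.randrange(2)
                child[k] = min(levels - 1, max(1, child[k] + rng.randint(-10, 10)))
            t1, t2 = sorted(child)
            if t1 == t2:
                t1, t2 = random_pair(rng, levels)
            children.append((t1, t2))
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


# Synthetic trimodal histogram (hypothetical data) with modes near 40, 130 and 210.
hist = [100 if (30 <= i < 50 or 120 <= i < 140 or 200 <= i < 220) else 1
        for i in range(256)]
best, history = ga_two_thresholds(hist)
print(best)
```

In this sketch the running time grows with the product of the number of generations, the population size and the cost of one fitness evaluation, which is why the segmentation stage is the natural place to look when estimating the overall cost.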

The approaches of Wienert et al. (2012), Vahadane and Sethi (2013), de Oliveira et al. (2013), Phoulady et al. (2016) and Paramanandam et al. (2016) were based on different criteria and did not provide detailed discussions focused on asymptotic behavior. This can be a limitation for comparisons with our method. However, this study considered a general analysis, and some of these methods suggest asymptotic behaviors of O(n²), for instance, those of Vahadane and Sethi (2013) and de Oliveira et al. (2013). Thus, the proposed method presents behavior similar to the estimated costs of some relevant methods that are available in the literature.

4. Conclusion

This study presented an unsupervised segmentation method of nuclei from neoplastic cells for histological images of CLL, FL and MCL, stained with H&E. Qualitative evaluation of the proposed method reached relevant results for images with 20× magnification. The proposed method was also quantitatively evaluated considering images manually segmented by a specialist. The methods of mean-shift, Vahadane and Sethi (2013), Wienert et al. (2012), de Oliveira et al. (2013), Phoulady et al. (2016) and Paramanandam et al. (2016) were applied on images of a public domain dataset for comparison purposes.

The proposed method presents robust results with great representativeness of nuclear contours. In comparison to other methods, the described technique presents contours that are more consistent with the specialist segmentation, as represented by Figs. 21–23, with a low overlapping rate. These results were obtained considering the processing stages, which enabled the identification of boundary regions more accurately. In addition, the proposed preprocessing presented better results than the normalization. In its application, the normalization did not reach a satisfactory performance for the correction of illumination differences and noise treatment, as quantitatively indicated by the entropy metric.

The GA was also important for defining adequate threshold values, considering different solutions among a large set of parameter combinations of the fuzzy 3-partition method. In comparison with the bio-inspired methods, the GA reached the best overall quantitative results. By exploring a set of the best parameters, the search space became wider, while being directed toward the definition of the best possible threshold values. The DE method also presented a good performance among the evolutionary algorithms analyzed, but it uses only one best solution in its iterations, which can lead to fast convergence with poor exploration of the search space. The low usage of randomness, different from the ABC and CS methods, along with solution updates independent from other variables, in contrast to PSO and WDO, also favors the methodology used by the GA.

Through qualitative and quantitative analyses, the combination of methods used in the different steps of this study was able to reach more effective results than the compared techniques. Even with noise amplification by histogram equalization, the preprocessing step was successful in allowing contrast enhancements. In addition, the Gaussian filter reduces possible noise effects in the later stages. Due to segmentation limitations, the post-processing was crucial for producing a greater definition of contours, as well as for splitting the nuclei. The valley-emphasis application was performed via intensity analysis of the merged regions, where the suitability of this method was noted for the color distribution of the regions. Since the histograms of these regions were close to a unimodal distribution, the best threshold value should be located in the valley regions, which this method reached. In the search for coherent morphological properties, the operations used were also defined through analyses of segmented region contours.
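The valley-emphasis idea can be illustrated with a minimal sketch, assuming the usual formulation of the criterion from Ng (2006): the Otsu-style between-class term is weighted by (1 - p_t), where p_t is the probability of gray level t, so that thresholds falling on sparsely populated levels are favored. The function name and the toy histogram are hypothetical.

```python
def valley_emphasis_threshold(hist):
    """Return t maximizing (1 - p_t) * (w0 * mu0**2 + w1 * mu1**2).

    Class 0 holds gray levels 0..t and class 1 holds levels t+1..L-1.
    """
    total = sum(hist)
    probs = [h / total for h in hist]
    levels = len(probs)
    best_t, best_score = None, float("-inf")
    for t in range(levels - 1):
        w0 = sum(probs[: t + 1])
        w1 = 1.0 - w0
        if w0 <= 0.0 or w1 <= 0.0:
            continue  # one class is empty, threshold is not meaningful
        mu0 = sum(i * probs[i] for i in range(t + 1)) / w0
        mu1 = sum(i * probs[i] for i in range(t + 1, levels)) / w1
        score = (1.0 - probs[t]) * (w0 * mu0 ** 2 + w1 * mu1 ** 2)
        if score > best_score:
            best_t, best_score = t, score
    return best_t


# Toy histogram (hypothetical data): two peaks at levels 2 and 8, empty valley at level 5.
toy_hist = [0, 5, 20, 5, 1, 0, 1, 5, 20, 5]
threshold = valley_emphasis_threshold(toy_hist)
print(threshold)
```

For this toy histogram the plain between-class term is identical at levels 4 and 5; the (1 - p_t) weight breaks the tie in favor of the empty valley level.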

Considering its application on images with great contrast and illumination differences, the proposed method provided relevant results for the investigated groups by applying intrinsic features of nuclei, disregarding their spatial distributions and shapes. Recently, only a limited number of studies have addressed the segmentation of CLL and MCL lesions. Thus, another important aspect is that the proposed method contributes a new strategy for the segmentation of these lesions. With regard to histological images of FL, there are many papers dedicated to the detection and segmentation of this tissue. However, this method demonstrated relevant results since it enabled the identification of both centrocytes and centroblasts.

One limitation of the proposed method is the amount of false positive regions present in the obtained results. The investigated segmentation methods presented similar behavior when the metrics of sensitivity and specificity are analyzed. Future works should consider optimization of the proposed method and the development of a diagnosis support system using the described steps. In the first proposal, validation can be performed using images segmented by different specialists. This method can also be evaluated on different public domain lymphoma images, to verify its performance. In addition, new approaches to the GA method should be explored so that its computational cost can be reduced. For this purpose, the crossover and mutation steps will be improved by eliminating randomness from their implementations. In the second proposal, features of the identified regions could be explored, such as investigations of frequency-based information, for the removal of false positive objects, thus constituting a detection step.

Acknowledgement

T.A.A.T. and M.Z.N. thank CAPES (1575210) and FAPEMIG (TEC - APQ-02885-15) for financial support.

References

ACS (2017). What are the key statistics about non-Hodgkin lymphoma? Online. Accessed 01.22.2017.




 
Arora, B., & Banerjee, S. (2013). Computer assisted grading schema for follicular lymphoma based on level set formulation. In Students conference on engineering and systems (SCES) (pp. 1–6). IEEE.

Bayraktar, Z. (2013). The wind driven optimization (wdo) algorithm. Online. Accessed 12.21.2016 http://www.mathworks.com/matlabcentral/fileexchange/44865-the-wind-driven-optimization-wdo-algorithm

Bayraktar, Z., Komurcu, M., & Werner, D. H. (2010). Wind driven optimization (wdo): A novel nature-inspired optimization algorithm and its application to electromagnetics. In 2010 IEEE antennas and propagation society international symposium (pp. 1–4). IEEE.

Belkacem-Boussaid, K., Prescott, J., Lozanski, G., & Gurcan, M. N. (2010). Segmentation of follicular regions on h&e slides using a matching filter and active contour model. SPIE medical imaging. International Society for Optics and Photonics.

Belkacem-Boussaid, K., Samsi, S., Lozanski, G., & Gurcan, M. N. (2011). Automatic detection of follicular regions in h&e images using iterative shape index. Computerized Medical Imaging and Graphics, 35(7), 592–602.

Bhandari, A. K., Kumar, A., & Singh, G. K. (2015). Tsallis entropy based multilevel thresholding for colored satellite image segmentation using evolutionary algorithms. Expert Systems with Applications, 42(22), 8707–8730.

Biswas, P. (2014). Searching/tuning/optimizing by particle swarm optimization (pso) method. Online. Accessed 12.21.2016 http://www.mathworks.com/matlabcentral/fileexchange/43541-particle-swarm-optimization-pso-

Bose, A., & Mali, K. (2016). Fuzzy-based artificial bee colony optimization for gray image segmentation. Signal, Image and Video Processing, 1–8.

Byrd, K. A., Zeng, J., & Chouikha, M. (2007). A validation model for segmentation algorithms of digital mammography images. Journal of Applied Science & Engineering Technology, 1.

Canellos, G. P., Lister, T. A., & Young, B. (2006). The lymphomas (2nd ed.).

Chang, V., Saavedra, J. M., Castañeda, V., Sarabia, L., Hitschfeld, N., & Härtel, S. (2014). Gold-standard and improved framework for sperm head segmentation. Computer Methods and Programs in Biomedicine, 117(2), 225–237.

Comaniciu, D., & Meer, P. (2002). Mean shift: A robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(5), 603–619.

Cuevas, E., Zaldívar, D., & Perez-Cisneros, M. (2016). Image segmentation based on differential evolution optimization. In Applications of evolutionary computation in image processing and pattern recognition (pp. 9–22). Springer.

de Oliveira, D. L. L., do Nascimento, M. Z., Neves, L. A., de Godoy, M. F., de Arruda, P. F. F., & de Santi Neto, D. (2013). Unsupervised segmentation method for cuboidal cell nuclei in histological prostate images based on minimum cross entropy. Expert Systems with Applications, 40(18), 7331–7340.

Dimitropoulos, K., Barmpoutis, P., Koletsa, T., Kostopoulos, I., & Grammalidis, N. (2016). Automated detection and classification of nuclei in pax5 and h&e-stained tissue sections of follicular lymphoma. Signal, Image and Video Processing, 1–9.

Dimitropoulos, K., Michail, E., Koletsa, T., Kostopoulos, I., & Grammalidis, N. (2014). Using adaptive neuro-fuzzy inference systems for the detection of centroblasts in microscopic images of follicular lymphoma. Signal, Image and Video Processing, 8(1), 33–40.

Estrada, F. J., & Jepson, A. D. (2009). Benchmarking image segmentation algorithms. International Journal of Computer Vision, 85(2), 167–181.

Fuchs, T. J., & Buhmann, J. M. (2011). Computational pathology: Challenges and promises for tissue analysis. Computerized Medical Imaging and Graphics, 35(7), 515–530.

Gartner, L. P., & Hiatt, J. L. (2003). Tratado de Histologia em Cores (2nd ed.).

Gençtav, A., Aksoy, S., & Önder, S. (2012). Unsupervised segmentation and classification of cervical cell images. Pattern Recognition, 45(12), 4151–4168.

Ghose, S., Oliver, A., Martí, R., Lladó, X., Vilanova, J. C., Freixenet, J., ... Meriaudeau, F. (2012). A survey of prostate segmentation methodologies in ultrasound, magnetic resonance and computed tomography images. Computer Methods and Programs in Biomedicine, 108(1), 262–287.

Gonzalez, R. C., & Woods, R. E. (2000). Processamento de Imagens Digitais. Edgard Blucher.

Gurcan, M. N., Boucheron, L. E., Can, A., Madabhushi, A., Rajpoot, N. M., & Yener, B. (2009). Histopathological image analysis: A review. Biomedical Engineering, IEEE Reviews in, 2, 147–171.

Haggerty, J. M., Wang, X. N., Dickinson, A., O'Malley, C. J., & Martin, E. B. (2014). Segmentation of epidermal tissue with histopathological damage in images of haematoxylin and eosin stained human skin. BMC Medical Imaging, 14(1).

Hammouche, K., Diaf, M., & Siarry, P. (2008). A multilevel automatic thresholding method based on a genetic algorithm for a fast image segmentation. Computer Vision and Image Understanding, 109(2), 163–175.

Heris, S. M. K. (2015a). Implementation of artificial bee colony in matlab. Online. Accessed 12.21.2016 http://www.yarpiz.com/

Heris, S. M. K. (2015b). Implementation of differential evolution (de) in matlab. Online. Accessed 12.21.2016 http://www.yarpiz.com/

Hoffman, R. A., Kothari, S., & Wang, M. D. (2014). Comparison of normalization algorithms for cross-batch color segmentation of histopathological images. In Annual international conference of the IEEE engineering in medicine and biology society (pp. 194–197). IEEE.

ImageJ (2016). Imagej (image processing and analysis in java). Online. Accessed 16.09.2016 http://imagej.nih.gov/ij/index.html

INCA (2016). Estimativa 2016 – incidência de câncer no Brasil. Online. Accessed 09.16.2016.

Insana, M., Meyers, K., & Grossman, L. (2000). Handbook of medical imaging: Medical image processing and analysis.

Irshad, H., Veillard, A., Roux, L., & Racoceanu, D. (2014). Methods for nuclei detection, segmentation, and classification in digital histopathology: A review - current status and future potential. IEEE Reviews in Biomedical Engineering, 7, 97–114.

Jianli, L., & Baoqi, Z. (2009). The segmentation of skin cancer image based on genetic neural network. In World congress on computer science and information engineering: 5 (pp. 594–599). IEEE.

Jothi, J. A. A., & Rajam, V. M. A. (2016). A survey on automated cancer diagnosis from histopathology images. Artificial Intelligence Review, 1–51.

Kaushik, D., Singh, U., Singhal, P., & Singh, V. (2013). Medical image segmentation using genetic algorithm. International Journal of Computer Applications, 81(18).

Khan, A. (2013). Stain normalisation toolbox. Online. Accessed 12.23.2016 http://www2.warwick.ac.uk/fac/sci/dcs/research/combi/research/bic/software/sntoolbox/

Kong, H., Belkacem-Boussaid, K., & Gurcan, M. (2011a). Cell nuclei segmentation for histopathological image analysis. SPIE medical imaging. 79622R–79622R, International Society for Optics and Photonics.

Kong, H., Gurcan, M., & Belkacem-Boussaid, K. (2011b). Partitioning histopathological images: An integrated framework for supervised color-texture segmentation and cell splitting. Medical Imaging, IEEE Transactions on, 30(9), 1661–1677.

Kothari, S., Phan, J. H., Stokes, T. H., & Wang, M. D. (2013). Pathology imaging informatics for quantitative analysis of whole-slide images. Journal of the American Medical Informatics Association, 20(6), 1099–1108.

Lad, K., Agrawal, M., & Pandya, M. M. (2014). Survey on genetic algorithms & basic operators. International Journal of Advanced Information Science and Technology (IJAIST), 22(22), 44–48.

Leong, F. W., Brady, M., & McGee, J. (2003). Correction of uneven illumination (vignetting) in digital microscopy images. Journal of Clinical Pathology, 56(8), 619–621.

Li, X., & Plataniotis, K. N. (2015). A complete color normalization approach to histopathology images using color cues computed from saturation-weighted statistics. IEEE Transactions on Biomedical Engineering, 62(7), 1862–1873.

Lowry, L., & Linch, D. (2013). Non-Hodgkin's lymphoma.

Luo, Y., Celenk, M., & Bejai, P. (2006). Discrimination of malignant lymphomas and leukemia using radon transform based-higher order spectra. Medical imaging. International Society for Optics and Photonics. 61445K–1–61445K–10.

Macenko, M., Niethammer, M., Marron, J. S., Borland, D., Woosley, J. T., Guan, X., ... Thomas, N. E. (2009). A method for normalizing histology slides for quantitative analysis. In ISBI: 9 (pp. 1107–1110).

Mauriño, B. B., & Siqueira, S. A. C. (2011). Classificação dos Linfomas.

McCann, M. T., Ozolek, J. A., Castro, C. A., Parvin, B., & Kovacevic, J. (2015). Automated histology analysis: Opportunities for signal processing. Signal Processing Magazine, 32(1), 78–87.

Meijering, E. (2012). Cell segmentation: 50 years down the road. IEEE Signal Processing Magazine, 29(5), 140–145.

Michail, E., Kornaropoulos, E. N., Dimitropoulos, K., Grammalidis, N., Koletsa, T., & Kostopoulos, I. (2014). Detection of centroblasts in h&e stained images of follicular lymphoma. In 22nd signal processing and communications applications conference (SIU) (pp. 2319–2322). IEEE.

Mohammed, E. A., Far, B. H., Naugler, C., & Mohamed, M. M. A. (2013a). Application of support vector machine and k-means clustering algorithms for robust chronic lymphocytic leukemia color cell segmentation. 15th international conference on e-health networking, applications and services. IEEE.

Mohammed, E. A., Far, B. H., Naugler, C., & Mohamed, M. M. A. (2013b). Chronic lymphocytic leukemia cell segmentation from microscopic blood images using watershed algorithm and optimal thresholding. In 26th annual IEEE Canadian conference on electrical and computer engineering (CCECE) (pp. 1–5). IEEE.

Ng, H. (2006). Automatic thresholding for defect detection. Pattern Recognition Letters, 27(14), 1644–1649.

Oger, M., Belhomme, P., & Gurcan, M. N. (2012). A general framework for the segmentation of follicular lymphoma virtual slides. Computerized Medical Imaging and Graphics, 36(6), 442–451.

Orlov, N. V., Chen, W. W., Eckley, D. M., Macura, T. J., Shamir, L., Jaffe, E. S., & Goldberg, I. G. (2010). Automatic classification of lymphoma images with transform-based global features. IEEE Transactions on Information Technology in Biomedicine, 14(4), 1003–1013.

Oswal, V., Belle, A., Diegelmann, R., & Najarian, K. (2013). An entropy-based automated cell nuclei segmentation and quantification: Application in analysis of wound healing process. Computational and Mathematical Methods in Medicine.

Otsu, N. (1979). A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man, and Cybernetics, 9(1), 62–66.

Oztan, B., Kong, H., Gurcan, M. N., & Yener, B. (2012). Follicular lymphoma grading using cell-graphs and multi-scale feature analysis. SPIE medical imaging: 8315.

Paramanandam, M., O'Byrne, M., Ghosh, B., Mammen, J. J., Manipadam, M. T., Thamburaj, R., & Pakrashi, V. (2016). Automated segmentation of nuclei in breast cancer histopathology images. PloS One, 11(9), e0162053.

Paulinas, M., & Ušinskas, A. (2015). A survey of genetic algorithms applications for image enhancement and segmentation. Information Technology and Control, 36(3).

Pedrini, H., & Schwartz, W. R. (2007). Análise de Imagens Digitais: Princípios, Algoritmos e Aplicações. Thomson Learning.

Phoulady, H. A., Goldgof, D. B., Hall, L. O., & Mouton, P. R. (2016). Nucleus segmentation in histology images with hierarchical multilevel thresholding. SPIE medical imaging. 979111–979111, International Society for Optics and Photonics.

http://refhub.elsevier.com/S0957-4174(17)30206-3/sbref0041
http://refhub.elsevier.com/S0957-4174(17)30206-3/sbref0041
http://refhub.elsevier.com/S0957-4174(17)30206-3/sbref0041
http://refhub.elsevier.com/S0957-4174(17)30206-3/sbref0041
http://refhub.elsevier.com/S0957-4174(17)30206-3/sbref0041
http://refhub.elsevier.com/S0957-4174(17)30206-3/sbref0041
http://refhub.elsevier.com/S0957-4174(17)30206-3/sbref0011
http://refhub.elsevier.com/S0957-4174(17)30206-3/sbref0011
http://refhub.elsevier.com/S0957-4174(17)30206-3/sbref0011
http://refhub.elsevier.com/S0957-4174(17)30206-3/sbref0011
http://refhub.elsevier.com/S0957-4174(17)30206-3/sbref0011
http://refhub.elsevier.com/S0957-4174(17)30206-3/sbref0011
http://refhub.elsevier.com/S0957-4174(17)30206-3/sbref0011
http://refhub.elsevier.com/S0957-4174(17)30206-3/sbref0012
http://refhub.elsevier.com/S0957-4174(17)30206-3/sbref0012
http://refhub.elsevier.com/S0957-4174(17)30206-3/sbref0012
http://refhub.elsevier.com/S0957-4174(17)30206-3/sbref0012
http://refhub.elsevier.com/S