ORIGINAL ARTICLE

Embedded real-time speed limit sign recognition using image
processing and machine learning techniques

Samuel L. Gomes1 • Elizângela de S. Rebouças1 • Edson Cavalcanti Neto1 • João P. Papa2 • Victor H. C. de Albuquerque3 • Pedro P. Rebouças Filho1 • João Manuel R. S. Tavares4

Received: 1 September 2015 / Accepted: 21 May 2016 / Published online: 3 June 2016

© The Natural Computing Applications Forum 2016

Abstract The number of traffic accidents in Brazil has reached alarming levels and is currently one of the leading causes of death in the country. With the number of vehicles on the roads increasing rapidly, these problems will tend to worsen. Consequently, huge investments in resources to increase road safety will be required. The system developed in this work for optical character recognition of vertical R-19 regulatory traffic signs (maximum speed limits), according to Brazilian Standards, uses a camera positioned at the front of the vehicle, facing forward, so that images of traffic signs can be captured, enabling the use of image processing and analysis techniques for sign detection. This paper proposes the detection and recognition of speed limit signs based on a cascade of boosted classifiers working with haar-like features. The recognition of the detected sign is achieved based on the optimum-path forest classifier (OPF), support vector machines (SVM), multilayer perceptron, k-nearest neighbor (kNN), extreme learning machine, least mean squares, and least squares machine learning techniques. The SVM, OPF and kNN classifiers had average accuracies higher than 99.5 %; the OPF classifier with a linear kernel took an average time of 87 μs to recognize a sign, while kNN took 11,721 μs and SVM 12,595 μs. This sign detection approach successfully found and recognized 11,320 road signs from a set of 12,520 images, leading to an overall accuracy of 90.41 %. Analyzing the system globally, recognition accuracy was 89.19 %, as 11,167 road signs from a database with 12,520 signs were correctly recognized. The processing speed of the embedded system varied between 20 and 30 frames per second. Therefore, based on these results, the proposed system can be considered a promising tool with high commercial potential.

Keywords Cascade haar-like features · Pattern recognition · Computer vision · Automotive applications

✉ Pedro P. Rebouças Filho

pedrosarf@ifce.edu.br

Samuel L. Gomes

samuelluz@lapisco.ifce.edu.br

Elizângela de S. Rebouças

elizangelareboucas@lapisco.ifce.edu.br

Edson Cavalcanti Neto

edsoncavalcantineto@gmail.com

João P. Papa

papa@fc.unesp.br

Victor H. C. de Albuquerque

victor.albuquerque@unifor.br

João Manuel R. S. Tavares

tavares@fe.up.pt

1 Laboratório de Processamento Digital de Imagens e Simulação Computacional, Instituto Federal de Educação, Ciência e Tecnologia do Ceará (IFCE), Ceará, Brazil

2 Departamento de Ciência da Computação, Universidade

Estadual Paulista, Bauru, São Paulo, Brazil

3 Programa de Pós-Graduação em Informática Aplicada,

Laboratório de Bioinformática, Universidade de Fortaleza,

Fortaleza, CE, Brazil

4 Instituto de Ciência e Inovação em Engenharia Mecânica e

Engenharia Industrial, Departamento de Engenharia

Mecânica, Faculdade de Engenharia, Universidade do Porto,

Porto, Portugal

Neural Comput & Applic (2017) 28 (Suppl 1):S573–S584

DOI 10.1007/s00521-016-2388-3

http://orcid.org/0000-0002-1878-5489


1 Introduction

Car accidents are one of the major causes of death

worldwide. Estimates made by the Secretary of Politics of

Social Security show that, in Brazil, the number of people

permanently disabled as a consequence of traffic accidents

increased from 33,000 to 352,000 between 2002 and 2012.

In addition, the number of deaths increased from 46,000 to

60,000 in the same time period. Consequently, close to one

million benefits paid nowadays by the National Social

Security Institute (INSS) are for victims of car accidents.

This cost represents more than 12 billion Brazilian Reais

from INSS funds. The data from the Lider Insurance

Company also indicate that most victims are between the

ages of 18 and 40. The benefit that generates the greatest

expense to the INSS is retirement due to disability, because

it is a benefit that is paid for a long period of time and in

most cases to young people [21].

Given this scenario, manufacturers such as Volvo,

Toyota and Ford are investing in technologies like

advanced driver assistance systems (ADAS) in their vehi-

cles to ensure the safety of occupants by helping to avoid

potential accidents. The United States Department of

Transportation estimates that an investment of US $ 1.2

billion in Intelligent Transportation Systems (ITS) tech-

nology would generate a return of US $ 30.2 billion in

approximately 20 years. Likewise, Japan has been invest-

ing US $ 700 million annually in these technologies since

2004, while South Korea plans to invest US $ 3.2 billion

between the years of 2008 and 2020 [36].

The goal of ADAS is to assist drivers and consequently

to significantly decrease the number of accidents. These

systems use technologies such as: global positioning, radar,

image sensors and techniques of computer vision. A study

by the US Insurance Institute for Highway Safety estimated

that the use of the intelligent assistance systems such as:

lane departure warning (LDW), forward collision warning

(FCW), blind spot detection and adaptable headlights that

are already available on the market can prevent or ame-

liorate one in three fatal collisions and one in five collisions

that result in moderate or severe injuries [36].

Observing this trend, this work explores the use of

computer vision as a solution for another major traffic

safety problem: the driver’s lack of attention to road signs,

which causes a large number of accidents.

Many ADAS use computer vision techniques in their operations. For example, a lane departure warning system (LDWS) alerts the driver when he or she is veering out of or changing lanes by sending visual, audio and/or vibrational signals. This system is designed to minimize accidents by addressing their main causes, such as distraction and sleepiness [54, 60].

Adaptive cruise control systems read the speed limit signs

through computer vision and alert the driver if he/she is not

obeying the limit [61].

Assistance for vehicle parking uses ultra-sonic sensors

and/or computer vision to indicate the proximity of objects.

Current systems actively control the steering wheel, just

leaving the driver with the control of moving the vehicle

[12].

Other intelligent assistance systems are the blind spot

warning system (BSWS) and the sleepiness detector. The BSWS monitors the blind spots on the passenger side and alerts the driver if there is any risk of collision while he/she is maneuvering [59]. The sleepiness

detector is an intelligent system that monitors the driver’s

facial expressions to perceive the state of the driver’s

attention, alerting the driver that he or she should rest if

signs of sleepiness or tiredness are detected [19].

Some car manufacturers have developed technologies such as autonomous braking, which can stop a car when other vehicles or obstacles are very close, and lane keeping support, which applies a corrective force to the steering when the driver veers from the correct lane. Adaptive cruise control can also be used to automatically maintain a safe speed and a safe distance relative to other vehicles, and in its most active form it can prevent a driver from exceeding the speed limit.

Barthès and Bonnifait say that the development of

advanced driver assistance systems to assist the driver and

inform him/her of the road conditions can significantly

contribute to reducing the number of accidents [8]. These

systems should respond in real time to be useful and some

of the major technologies used to ensure these require-

ments are: global positioning systems, image sensors and

computer vision.

Cavalcanti Neto et al. [36] proposed a system to detect

and recognize Brazilian vehicle license plates, in which the

registered users have permission to enter a specific area.

These authors used techniques of digital image processing,

such as Hough transform, morphology, threshold and

Canny edge detector to extract the characters, as well as

least squares, least mean squares, extreme learning

machine and neural network multilayer perceptron to

identify the numbers and letters. Neto et al. [36] used

motion detection to accelerate the embedded application

previously developed because only the moving regions in

the image were analyzed, which is not possible here

because everything is moving in the input images.

The main aim of this work was to develop an Android

application that can detect and recognize speed limit signs

in real time. The system exhibits sign detection in the car as

shown in Fig. 1. In order to develop the system, techniques

of digital image processing (DIP) are used to extract, i.e.,

segment, the characters, along with techniques of machine

learning (ML) to recognize the symbols obtained during

the DIP stage.

This paper proposes detection of speed limit signs based

on a cascade of boosted classifiers working with haar-like

features in the DIP step. It also proposes the normalization

of attributes for standardization of characters in the DIP

stage, which is usually done in the pattern recognition step.

Among the contributions of this work for the pattern

recognition step is the evaluation of seven classification

methods and some of their variations in terms of suitability

as an embedded application in real time. The recognition

algorithms used were the k-nearest neighbors (kNN),

optimum-path forest classifier (OPF) configured with seven

distance functions, least squares (LS), least mean squares

(LMS), extreme learning machine (ELM), artificial neural

network multilayer perceptron (MLP) and support vector

machines (SVM) configured with four kernels. As far as

the authors know, this is the first time that the OPF clas-

sifier has been analyzed in an embedded system.

2 Speed limit signs detection

The methodology proposed for the speed limit sign

detection is based on a cascade of boosted classifiers

working with haar-like features [30, 56]. This proposal will

be compared to the methodology suggested by Neto et al.

[36].

2.1 Based on a cascade of boosted classifiers

working with haar-like features

Viola and Jones [56] proposed a rapid object detection

algorithm using a boosted cascade of simple features, and

Lienhart and Maydt [30] proposed an extended set of Haar-

like features for rapid object detection.

This approach has been applied in various applications

to detect objects in real time, such as face detection

[17, 33, 47], pedestrian detection [43, 63], license plate

detection [64] and object classification in microscopy [3].

However, as far as the authors know, this methodology has never been used for road sign detection, nor has it been implemented in an embedded solution.

The classifier used to detect speed limit signs is trained with sample images, called positive and negative examples [30, 56]. The positive examples included hundreds of images with speed signs, and negative examples

included arbitrary images without valid signs. After a

classifier is trained, it can be applied to a region of interest

in an input image. The classifier outputs a ‘‘1’’ if the region

is likely to show a sign, and ‘‘0’’ otherwise. To search for

the object in the whole image, the search window moves

across the image and checks every location using the

classifier. The classifier is designed so that it can be easily rescaled to find objects of interest of different sizes, which is more efficient than resizing the image itself. So, to find an object

of an unknown size in the image, the scan procedure is

done several times using different scales.
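The multi-scale scan described above can be sketched as follows. This is only an illustration and not the implementation used in this work: the function names, the base window size, the scale factor, the step ratio and the stand-in classifier are all assumptions made for the example.

```python
def sliding_windows(img_w, img_h, base=24, scale_factor=1.25, step_ratio=0.25):
    """Yield (x, y, size) square search windows at growing scales.

    The window is rescaled instead of the image, as described in the text.
    base, scale_factor and step_ratio are illustrative values only.
    """
    size = base
    while size <= min(img_w, img_h):
        step = max(1, int(size * step_ratio))
        for y in range(0, img_h - size + 1, step):
            for x in range(0, img_w - size + 1, step):
                yield x, y, size
        # Always grow by at least one pixel so the loop terminates.
        size = max(size + 1, int(round(size * scale_factor)))


def detect(img_w, img_h, classifier, **kwargs):
    """Return every window for which the classifier outputs 1."""
    return [w for w in sliding_windows(img_w, img_h, **kwargs)
            if classifier(*w) == 1]


def toy_classifier(x, y, size):
    """Hypothetical stand-in for the trained cascade: fires on one window."""
    return 1 if (x, y, size) == (12, 6, 24) else 0
```

In a real detector, `classifier` would evaluate the boosted cascade on the pixels inside the window; here it only receives the window geometry so that the scan itself can be followed in isolation.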

2.2 Based on the Hough transform and Canny edge

detector

This approach was proposed by Neto et al. [36] for the

detection of license plates according to the Brazilian

Standards. This approach uses the Canny operator com-

bined with the Hough transform to detect objects.

The Canny edge detector performs two tasks: the fil-

tering of noise and highlighting the pixels defining the

border of an object in a digital image [16, 46]. To develop

this algorithm, primary studies were focused on optimal

borders that can be represented by using functions in one

dimension (1-D) [11, 53]. The authors showed that the best filter to start their algorithm was a smoothing algorithm, the Gaussian operator, followed by a Gaussian derivative, which in one dimension can be given by [31]:

$$d = -\left(\frac{x}{\sigma^2}\right) e^{-\frac{x^2}{2\sigma^2}}, \qquad (1)$$

where $\sigma^2$ consists of the data variance and $x$ the input data.

Fig. 1 Illustration of the system developed in this work
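As a rough illustration of Eq. 1, the sketch below samples the Gaussian derivative at integer offsets and applies it to a one-dimensional step signal. The function names, the kernel radius and the test signal are assumptions chosen for the example; they are not taken from the original work.

```python
import math


def gaussian_derivative(x, sigma):
    """Eq. 1: derivative-of-Gaussian response at offset x."""
    return -(x / sigma ** 2) * math.exp(-(x ** 2) / (2 * sigma ** 2))


def derivative_kernel(sigma, radius):
    """Sample Eq. 1 at the integer offsets -radius..radius."""
    return [gaussian_derivative(x, sigma) for x in range(-radius, radius + 1)]


def convolve_valid(signal, kernel):
    """1-D convolution (kernel flipped), keeping only fully overlapping positions."""
    flipped = kernel[::-1]
    k = len(flipped)
    return [sum(signal[i + j] * flipped[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


# A rising step edge: the response is zero on flat regions and peaks at the edge.
step_signal = [0.0] * 5 + [1.0] * 5
response = convolve_valid(step_signal, derivative_kernel(sigma=1.0, radius=2))
```

The response vanishes where the signal is constant and is largest around the step, which is the behaviour an edge detector relies on.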

The Canny algorithm was designed to have three main properties: a minimum detection error rate, good edge localization and a minimal (single) response per edge.

Edge detection has been used in many applications for

object segmentation [2, 35, 44, 45], and the Canny detector

has been commonly used to find objects, and Hough

transform to recognize the right objects.

The Hough transform is a feature extraction technique

used in digital image processing [15]. The aim of the technique is to find imperfect instances of objects of a desired class of shapes through a voting process. In this process, the Hough transform algorithm elects the candidates that are most similar to the desired shape.

The classic Hough transform was designed to identify

lines on images, and it has been extended to identify other

shapes, like circles or ellipses [15, 62]. Neto et al. [36, 37]

applied it to detect lines, but in this work we will use it to

detect circles.
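The voting idea behind the circular Hough transform can be sketched as below. This is a simplified illustration that assumes the radius is known in advance and that edge pixels are already available; the function names and the synthetic circle are hypothetical and do not reproduce the implementation compared in this work.

```python
import math


def hough_circle_center(edge_points, radius, angle_step_deg=1):
    """Vote for circle centers of a known radius and return the winner.

    Each distinct edge point votes once for every pixel that lies at
    distance `radius` from it (after rounding to the pixel grid).
    """
    votes = {}
    for (px, py) in set(edge_points):
        candidates = set()
        for deg in range(0, 360, angle_step_deg):
            theta = math.radians(deg)
            cx = round(px - radius * math.cos(theta))
            cy = round(py - radius * math.sin(theta))
            candidates.add((cx, cy))
        for center in candidates:
            votes[center] = votes.get(center, 0) + 1
    best = max(votes, key=votes.get)
    return best, votes[best]


def circle_points(cx, cy, radius, n=36):
    """Synthetic edge pixels on a circle, used only to exercise the sketch."""
    return [(round(cx + radius * math.cos(2 * math.pi * k / n)),
             round(cy + radius * math.sin(2 * math.pi * k / n)))
            for k in range(n)]
```

For example, `hough_circle_center(circle_points(20, 15, 8), radius=8)` accumulates the votes of all synthetic edge pixels at the true center. A practical detector would also search over a range of radii.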

3 Speed limit sign recognition

In this section, the seven machine learning techniques

under comparison are introduced.

3.1 Support vector machines

One of the fundamental goals of the learning theory can be

stated as: given two classes of known objects, assign one of

them to a new unknown object. Thus, the objective in a

two-class pattern recognition is to infer a function [52]:

$$f : X \to \{\pm 1\}, \qquad (2)$$

regarding the input-output of the training data.

Based on the principle of structural risk minimization

[55], the SVM optimization process is aimed at establish-

ing a separating function while accomplishing a trade-off

between generalization and over-fitting.

Vapnik [55] considered a class of hyperplanes in some

dot product space H:

$$\langle w, x \rangle + b = 0, \qquad (3)$$

where $w, x \in H$, $b \in \mathbb{R}$, corresponding to the decision function:

$$f(x) = \operatorname{sgn}(\langle w, x \rangle + b), \qquad (4)$$

and, based on the following two arguments, the author

proposed the Generalized Portrait learning algorithm for

problems that are separable by hyperplanes:

1. Among all hyperplanes separating the data, there exists

a unique optimal hyperplane distinguished by the

maximum margin of separation between any training

point and the hyperplane;

2. The over-fitting of the separating hyperplanes

decreases with increasing margin.

Thus, to construct the optimal hyperplane, it is necessary to

solve:

$$\underset{w \in H,\; b \in \mathbb{R}}{\text{minimize}} \quad \tau(w) = \frac{1}{2}\|w\|^2, \qquad (5)$$

subject to:

$$y_i(\langle w, x_i \rangle + b) \geq 1 \quad \text{for all } i = 1, \ldots, m, \qquad (6)$$

with the constraint (6) ensuring that $f(x_i)$ will be $+1$ for $y_i = +1$ and $-1$ for $y_i = -1$, and also fixing the scale of $w$. A detailed discussion of these arguments is provided in [52].

The function $\tau$ in (5) is called the objective function, while in Eq. 6 the functions are the inequality constraints.

Together, they form a so-called constrained optimization

problem. The separating function is then a weighted

combination of elements of the training set. These elements

are called Support Vectors and characterize the boundary

between the two classes.

The replacement referred to as the kernel trick [52] is

used to extend the concept of hyperplane classifiers to

nonlinear support vector machines. However, even with the

advantage of ‘‘kernelizing’’ the problem, the separating

hyperplane may still not exist.

In order to allow some examples to violate Eq. 6, the slack variables $\xi_i \geq 0$ are introduced [52], which leads to the constraints:

$$y_i(\langle w, x_i \rangle + b) \geq 1 - \xi_i \quad \text{for all } i = 1, \ldots, m. \qquad (7)$$

A classifier that generalizes efficiently is then found by controlling both the margin (through $\|w\|$) and the sum of the slack variables $\sum_i \xi_i$. As a result, a possible accomplishment of such a soft margin classifier is obtained by minimizing the objective function:

$$\tau(w, \xi) = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} \xi_i, \qquad (8)$$

subject to the constraint in Eq. 7, where the constant $C > 0$

determines the balance between over-fitting and general-

ization. Due to the tuning variable C, these kinds of SVM-

based classifiers are normally referred to as C-support

vector classifiers (C-SVC) [14].
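To make Eqs. 4, 7 and 8 concrete, the sketch below evaluates the decision function and the soft-margin objective for a fixed hyperplane. It does not solve the optimization problem; the function names and the toy data are assumptions used only for illustration, and the slack values are taken as the smallest ones that satisfy Eq. 7.

```python
def dot(a, b):
    """Inner product <a, b> of two equally sized vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def decision(w, b, x):
    """Eq. 4: sgn(<w, x> + b); a value of exactly 0 is mapped to +1 here."""
    return 1 if dot(w, x) + b >= 0 else -1


def soft_margin_objective(w, b, X, y, C):
    """Eq. 8 evaluated with the smallest slacks allowed by Eq. 7.

    Returns the objective value and the list of slack variables.
    """
    slacks = [max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y)]
    return 0.5 * dot(w, w) + C * sum(slacks), slacks


# Toy example: the third point lies inside the margin and needs slack 0.5.
X = [[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]]
y = [1, -1, 1]
value, slacks = soft_margin_objective([1.0, 0.0], 0.0, X, y, C=10.0)
```

A larger C penalizes the slack of the third point more heavily, which is how C trades margin width against training errors.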

The implementation used here for the SVM is the one

suggested in [10] and [13].

3.2 Optimum-path forest classifier

The optimum-path forest (OPF) is a framework for the

design of pattern classifiers based on optimal graph parti-

tions [39, 41], in which each sample is represented as a

node of a complete graph, and the arcs between them are

weighted by the distance of their corresponding feature

vectors. The idea behind OPF is to run a competition process between some key samples (prototypes) in order to partition the graph into optimum-path trees (OPTs), each rooted at one prototype. It is assumed that samples that belong to the same OPT are more strongly connected to their root (prototype) than to any other one in the optimum-path forest. Prototypes offer path costs to each node, and the prototype that offers the optimum-path cost conquers that node, which is then marked with the label of the corresponding prototype.

Let $Z = Z_1 \cup Z_2$ be a dataset labeled with a function $\lambda$, in which $Z_1$ and $Z_2$ are, respectively, training and test sets such that $Z_1$ is used to train a given classifier and $Z_2$ is used to assess its accuracy. Let $S \subset Z_1$ be a set of prototype samples. Essentially, the OPF classifier creates a discrete optimal partition of the feature space such that any sample $s \in Z_2$ can be classified according to this partition. This partition is an optimum-path forest (OPF) computed in $\mathbb{R}^n$ by the image foresting transform (IFT) algorithm [18].

The OPF algorithm may be used with any smooth path

cost function which can group samples with similar prop-

erties [18]. This work used the path cost function fmax,

which is computed as follows:

$$f_{\max}(\langle s \rangle) = \begin{cases} 0 & \text{if } s \in S, \\ +\infty & \text{otherwise,} \end{cases} \qquad f_{\max}(\pi \cdot \langle s, t \rangle) = \max\{f_{\max}(\pi), d(s, t)\}, \qquad (9)$$

in which $d(s, t)$ means the distance between samples $s$ and $t$, and a path $\pi$ is defined as a sequence of adjacent samples. The $f_{\max}(\pi)$ computes the maximum distance between adjacent samples in $\pi$, when $\pi$ is not a trivial path.
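The competition process and the path cost of Eq. 9 can be sketched as follows. This is a simplified illustration under several assumptions: the prototypes are given by hand (the OPF literature selects them automatically), the Euclidean distance is used for d(s, t), and all function names are hypothetical. It is not the LibOPF implementation used in this work.

```python
import heapq
import math


def f_max(path, prototypes, samples):
    """Eq. 9 for a path given as a list of sample indices."""
    if path[0] not in prototypes:
        return math.inf
    cost = 0.0
    for s, t in zip(path, path[1:]):
        cost = max(cost, math.dist(samples[s], samples[t]))
    return cost


def train_opf(samples, labels, prototypes):
    """Propagate prototype labels along minimum f_max paths (Dijkstra-like)."""
    n = len(samples)
    cost = [math.inf] * n
    label = list(labels)
    heap = []
    for p in prototypes:
        cost[p] = 0.0
        heapq.heappush(heap, (0.0, p))
    done = [False] * n
    while heap:
        c, s = heapq.heappop(heap)
        if done[s]:
            continue  # stale queue entry
        done[s] = True
        for t in range(n):
            if done[t]:
                continue
            offered = max(c, math.dist(samples[s], samples[t]))
            if offered < cost[t]:
                cost[t] = offered      # s conquers t
                label[t] = label[s]
                heapq.heappush(heap, (offered, t))
    return cost, label


def classify(x, samples, cost, label):
    """Assign x the label of the training sample offering the lowest cost."""
    best = min(range(len(samples)),
               key=lambda s: max(cost[s], math.dist(samples[s], x)))
    return label[best]
```

With `samples = [(0, 0), (1, 0), (5, 0), (6, 0)]`, labels `[0, 0, 1, 1]` and prototypes `{1, 2}`, each end sample is conquered by the prototype of its own class, and a new point is labeled by whichever training sample offers it the cheapest path.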

The implementation used here for the OPF is the one

suggested in [1, 40].

3.3 Least Squares

Least Squares (LS) was used first by [42] and is a very popular technique for fitting a model to a dataset:

$$y_k(i) = \varphi_k\left(\sum_{j=1}^{m} x_j \cdot w_{kj}\right). \qquad (10)$$

From Eq. 10, the output value from the network can be

obtained through:

$$Y = W \cdot X, \qquad (11)$$

where Y corresponds to the output matrix stimulated by the input vector X. Therefore, the W matrix is a matrix with a dimension of m × (k + 1) because of the bias on the input system, that is: i = 1, 2, ..., m and j = 1, 2, ..., k in Eq. 10 [9].

Among the proposed LS models, this paper adopted the

model proposed by [29]. The input attributes of the clas-

sifier are in each column of the X matrix, and the vectors of

the classifier outputs are in each column of the Y matrix.

In Eq. 11, the goal is to isolate the W matrix. In order to remove the X matrix from the right side of the equation, it is necessary to multiply both sides by its inverse. However, only a square matrix can be inverted, so both sides are first multiplied by the transpose $X^T$, which gives the square matrix $XX^T$:

$$W = Y X^T (X X^T)^{-1}. \qquad (12)$$
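The step from Eq. 11 to Eq. 12 can be written out explicitly. The short derivation below is a standard sketch and assumes that $XX^T$ is invertible, a condition the text does not state:

```latex
\begin{aligned}
Y &= W X \\
Y X^T &= W X X^T \\
Y X^T (X X^T)^{-1} &= W X X^T (X X^T)^{-1} = W
\end{aligned}
```

When $XX^T$ is singular or ill-conditioned, this closed form cannot be applied directly, which is one reason iterative rules such as LMS are also considered.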

The optimal linear auto associative memory (OLAM)

algorithm is used for both regression functions and clas-

sification. This classifier can be used either as a batch or

iteratively depending on its application [23].

The implementation used here for the LS-based classi-

fier is the one suggested in [21, 36].

3.4 Least mean squares

According to [22], a least mean square (LMS) network is

based on the use of instantaneous values and the current

values of the input network for the activation function [57].

The topology of the simple perceptron network is similar to

the LS algorithm; however, they differ in their form of

training. The output values are achieved as follows:

YðiÞ ¼ W � XðiÞ; ð13Þ

where Y(i) corresponds to the output of the simple perceptron stimulated by the input vector X(i) and i corresponds to the current iteration. Therefore, the W matrix has dimensions m × (k + 1) due to the bias of the input system, where i = 1, 2, ..., m [9, 58].

The neuron activation function y(t) uses the sign

function, and the error value is calculated at each iteration.

The W matrix is the weight matrix, which is obtained

iteratively, using the derivative of the cost function:

$$w(t+1) = w(t) - \alpha \frac{\partial \xi(w)}{\partial w}, \qquad (14)$$

where $w(t)$ is the value of the weights from the previous iteration and $\alpha$ is the learning rate [57].

Thus, the final result of W is obtained iteratively through the LMS matrix rule:


$$w(t+1) = w(t) + \alpha\, e(t)\, x(t). \qquad (15)$$
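The update rule of Eq. 15 can be illustrated with a single linear neuron. In this sketch the error e(t) is computed on the linear output (an assumption of the example), the bias is handled by appending a constant input of 1, and the function name, learning rate and toy data are hypothetical.

```python
def lms_train(samples, targets, alpha=0.1, epochs=200):
    """Train one linear neuron with the LMS rule w(t+1) = w(t) + alpha*e(t)*x(t).

    The last weight acts as the bias (its input is fixed to 1).
    """
    w = [0.0] * (len(samples[0]) + 1)
    for _ in range(epochs):
        for x, d in zip(samples, targets):
            xb = list(x) + [1.0]                       # input plus bias term
            y = sum(wi * xi for wi, xi in zip(w, xb))  # linear output
            e = d - y                                  # instantaneous error e(t)
            w = [wi + alpha * e * xi for wi, xi in zip(w, xb)]
    return w


# Toy data generated by d = 2x + 1, so the weights should approach [2, 1].
weights = lms_train([[-1.0], [0.0], [1.0]], [-1.0, 1.0, 3.0])
```

Unlike the batch solution of Eq. 12, the weights are refined one sample at a time, so no matrix inversion is needed.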

The implementation used here for the LMS classifier is the

one suggested in [36].

3.5 Extreme learning machine

The extreme learning machine (ELM) is a neural network

with a topology of a single-hidden layer feedforward neural

network (SLFN), which is a network that has a single-

hidden layer [7, 24].

The ELM uses a training method for its layers as fol-

lows: the weights of the hidden layer are randomly gen-

erated, and the output layer weights are generated after the

activation function of the hidden layer. The output of the

hidden layer is used as input to the output layer, and then

the OLAM algorithm is used to obtain the values of the

output layer weights [26].

The ELM algorithm, unlike other traditional algorithms, aims at the smallest training error and also the smallest norm of the weights [26]. The disadvantage of the ELM is that it needs a high number of neurons in the hidden layer to reach higher hit rates, which makes the implementation of the algorithm in real-time embedded systems difficult, due to its high complexity and processing time [7, 27].

The matrix weights of the hidden layer w are generated

randomly. After obtaining the weights randomly, it per-

forms the activation of the neurons in the hidden layer from

the input x(t) of the system, thus obtaining the activated

output of the hidden layer. The output of the hidden layer

becomes the input of the output layer, thus transforming

the network into a linear network [25].

The implementation used here for the ELM is the one

proposed in [36].

3.6 Multilayer perceptron

The multilayer perceptron (MLP) network is a single-layer

neural network (SLNN) organized in a cascade and sub-

divided in an input layer, one or more hidden layers and an

output layer [22, 38, 49].

According to [34], an SLNN cannot represent functions that are not linearly separable. This problem is solved by the use of two or more layers of neurons with adaptive weights. However, it is necessary to use a training algorithm to adjust the weights in these layers [5], such as those that perform error backpropagation to compute the errors of the hidden layers [22, 32].

An MLP network is composed of one output layer with nonlinear neurons and one or more intermediate layers of neurons that apply the network activation function [4, 6, 21, 50]. The signal is always forward propagated, layer-by-layer.

The data for training were defined as follows: the input vector had 1,225 elements, which is the result of vectorizing a 35 × 35 pixel image, and the class labels were 0–9 for the digits, so the number of neurons in the output layer is equal to the number of possible outputs, 10 digits.

In this work, we used a three-layer MLP with an input

layer, a hidden layer and an output layer. According to

[28], the classification of numbers on traffic signs can be

made using a MLP network. Thus, we developed an MLP

network for the database used. The training of the MLP

was based on the error backpropagation algorithm [22].

A network which has the number of hidden neurons

equal to three times the number of classes (30 neurons) was

used. The activation function used in the hidden layer was

the logistic sigmoid function. Numbers were randomly

generated between -0.0001 and 0.0001 for the initial

weights of the network [51]. In the training of the network,

a decreasing learning rate was used with an initial value of

0.5.

For the stopping criteria of the network training, it was decided that training should stop if the network spent 10 cycles without decreasing the mean square error or when this error was higher than that of the previous epoch. The problem with these stopping criteria is

that if the solution started to climb to a local minimum, the

network training could continue. With that, we defined

several starting solutions for the training to be sure that the

network stops at the global minimum.

The implementation used here for the MLP was the one

described in [48] and [36].

4 Proposed framework

The framework developed in this work was built using C

language for the Android operating system (OS). The

integrated algorithm for the automatic analysis of speed

limit signs is composed of two main steps: detection and

recognition, where the first step is to find the desired sign in

an image with several objects, and the second step is to

interpret the information on the sign, i.e., the maximum

speed limit (Fig. 2).

The first step in the computational pipeline developed is

the speed limit sign detection via a cascade of boosted

classifiers working with haar-like features [30, 56].

The classifier used to detect speed limit signs is trained

with a few sample signs, called positive and negative

examples [30, 56]. The positive examples included about

1,000 images with speed signs, and the negative examples used

were arbitrary images without valid speed limit signs.

After a classifier is trained, it can be applied to a region of

interest in an input image. The classifier outputs a ‘‘1’’ if

the region is likely to show a speed limit sign and ‘‘0’’

otherwise. To search for the object in the whole image, the

search window moves across the image and checks every

location using the classifier. The classifier is designed so

that it can easily find the objects of interest with different

sizes, which is more efficient than resizing the input image

itself. So, to find an object of an unknown size in the

image, the scan procedure should be done several times

using different scales.

The second step of the pipeline is the segmentation and

identification of the digits, but before the segmentation, it is

necessary to perform a thresholding of the sign for easy

identification of the contours. These are identified through

the application of an adaptive thresholding algorithm [20].

The next step is to filter the contours found in order to

find the digits of the speed sign. The digits, before being

sent to the next step, go through 4 filtering processes: by

height, by width and through the spacing between digits.

The height filter computes the average height of all objects and filters out those that are above or below the average by more than a tolerance of 15 %. After the filtering process by height, the next step is to filter by width. The width filtering

algorithm works the same as the algorithm for height, but

based on the width. After identifying the digits in possible

circles, the region where they are is segmented to separate the

digits correctly. Figure 2 shows the possible circles found in

green and the region where the digits are in blue.
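The height and width filters described above can be sketched as a single helper applied twice. The bounding-box layout `(x, y, width, height)`, the function name and the sample boxes are assumptions made for this example; only the 15 % tolerance around the average comes from the text.

```python
def filter_by_dimension(boxes, index, tolerance=0.15):
    """Keep boxes whose chosen dimension is within `tolerance` of the average.

    boxes are (x, y, width, height) tuples; index 2 filters by width and
    index 3 filters by height.
    """
    if not boxes:
        return []
    mean = sum(b[index] for b in boxes) / len(boxes)
    return [b for b in boxes if abs(b[index] - mean) <= tolerance * mean]


# Hypothetical contours: four digit-like boxes and one small noise blob.
candidates = [(0, 0, 10, 30), (12, 0, 10, 31), (24, 0, 10, 29),
              (36, 0, 10, 32), (48, 0, 3, 20)]
by_height = filter_by_dimension(candidates, index=3)
digits = filter_by_dimension(by_height, index=2)
```

Because the average is computed over all candidates, a large number of outliers can shift it; this sketch makes no attempt to handle that case.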

After validation of the sign and of the position of the digits, the digits are separated and standardized before the last step, which is the recognition of the digits. This pattern recognition process assumes white digits with dimensions of 35 × 35 pixels

on a black background. The digits are resized to make the

algorithm invariant to distance.

The scaling of the digits starts by resizing the

height, which should be 33 pixels. Then, the width is

defined in proportion to the original. Each digit after

resizing is placed centrally in relation to the width.
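The standardization step can be sketched as follows. The 33-pixel height, the proportional width, the horizontal centering and the 35 × 35 canvas come from the text; the nearest-neighbor resampling, the vertical centering and the function names are assumptions made for this illustration.

```python
def standardize_digit(glyph, canvas=35, target_h=33):
    """Resize a binary glyph to target_h rows and center it on a black canvas.

    glyph is a list of rows with 1 for digit pixels and 0 for background.
    The width is scaled in proportion to the height (nearest neighbor).
    """
    h, w = len(glyph), len(glyph[0])
    new_w = max(1, min(canvas, round(w * target_h / h)))
    resized = [[glyph[r * h // target_h][c * w // new_w] for c in range(new_w)]
               for r in range(target_h)]
    out = [[0] * canvas for _ in range(canvas)]
    top = (canvas - target_h) // 2    # assumed vertical centering
    left = (canvas - new_w) // 2      # horizontal centering, as in the text
    for r in range(target_h):
        out[top + r][left:left + new_w] = resized[r]
    return out


def to_vector(image):
    """Flatten the 35 x 35 image into the 1,225-element classifier input."""
    return [p for row in image for p in row]


# Hypothetical 66 x 20 glyph filled with digit pixels.
standardized = standardize_digit([[1] * 20 for _ in range(66)])
features = to_vector(standardized)
```

Fixing the height and padding the width keeps narrow digits such as "1" from being stretched, while every digit still yields a vector of the same length.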

The standardized digits were then subjected to the LS,

LMS, ELM, MLP, kNN, SVM and OPF classifiers. The

recognition performance of these classifiers was evaluated

by accuracy and processing time.

5 Results and discussion

This work proposes an efficient and powerful embedded

system to recognize speed limit signs, where the stages

must have high accuracy and low processing time. This

section presents the results of the digital image processing

and pattern recognition steps to find the best method to use

in each step.

5.1 Speed limit digits detection

The proposed method in the step of digital image processing

for the detection of speed limit signs is based on a cascade of

boosted classifiers working with haar-like features. Figure 3

shows examples of speed limit sign digits segmented by the

developed approach, showing the stages involved from the

input image to the size standardized digits.

The detection step starts with an image acquired by a

smartphone camera, and the images obtained are satisfac-

tory, because the image sensor used presents low noise,

good focus and good brightness adjustments, as can be seen

in Fig. 3a–d.

Figure 3e–h shows the results of the conversion of the

color acquired images to grayscale images. After applying

the cascade of boosted classifiers working with haar-like

features, these images have several possible signs. After applying the adaptive thresholding (Fig. 3m–p), the sign digits are obtained and presented in Fig. 3u–x.

Fig. 2 Pipeline of the developed framework

The test of the embedded system was performed using

the speed limit signs of 20, 30, 40, 60 and 80 (km/h) as

these are the most commonly used in urban environments.

A total of 12,520 images were acquired with different

inclinations and distances in streets and avenues of the cities

of Fortaleza and Maracanaú in the state of Ceará in Brazil.

From the 12,520 images of the speed limit signs that

were used as test, 11,320 signs were properly located by

the segmentation step, giving 90.41 % of success in the

detection of speed limit signs. On the other hand, the

approach proposed by Neto et al. [36] based on the Canny

operator combined with the Hough transform obtained only

45.3 % correct results.

5.2 Speed limit digits recognition

In this work, we evaluated several classifiers in order to build an
efficient and powerful embedded system to recognize the speed limit
signs. The recognition step was carried out using classifiers based on
k-nearest neighbors (kNN), optimum-path forest (OPF), least squares
(LS), least mean squares (LMS), extreme learning machine (ELM),
multilayer perceptron (MLP) artificial neural network and support
vector machines (SVM). The results are presented and discussed below.

From the results obtained in the digit detection step presented in
Sect. 5.1, a database with the segmented digits was built (Table 1).
The feature extraction approaches and classifier algorithms were
combined to yield an intelligent system with high accuracy and low
computational cost.

Fig. 3 Examples of results obtained for the segmentation of speed limit digits

For the training and test sets, a holdout procedure with 50 % of the
samples for training and 50 % for testing, repeated 10 times, was
employed. Each classifier was configured in various ways, and the best
results obtained are the ones shown in Table 2. The kNN was configured
with 1, 3 and 5 nearest neighbors. The SVM was configured with linear,
polynomial, RBF and sigmoid kernels, but only the linear and polynomial
kernels had accuracy rates above 90 %. The OPF was set with seven
distances, but only the Euclidean and Chi-squared distances obtained
accuracy rates above 99 % for all samples. The networks used by the ELM
and MLP classifiers had 1,225 neurons in the input layer, 30 in the
hidden layer and 10 in the output layer.

Each classifier was tested ten times, reshuffling the training and
testing samples each time, on an Android mobile device with a 2.5 GHz
quad-core processor and 2 GB of RAM.
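The evaluation protocol described above (a 50/50 holdout, reshuffled and repeated ten times, reporting maximum, minimum, mean and standard deviation of the accuracy) can be sketched as follows. This is only an illustration: the data are hypothetical two-dimensional toy points rather than digit features, the classifier is a plain 1-nearest-neighbor rule, and the helper names are not taken from the original implementation.

```python
import random
import statistics


def nearest_neighbor_predict(train, sample):
    """Return the label of the training point closest to `sample` (1-NN)."""
    best = min(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], sample)),
    )
    return best[1]


def repeated_holdout(data, runs=10, train_fraction=0.5, seed=0):
    """Shuffle, split and score `runs` times; return the accuracies in %."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(runs):
        shuffled = data[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        train, test = shuffled[:cut], shuffled[cut:]
        hits = sum(nearest_neighbor_predict(train, x) == y for x, y in test)
        accuracies.append(100.0 * hits / len(test))
    return accuracies


# Hypothetical toy data: two well-separated clusters standing in for two classes.
toy_rng = random.Random(42)
toy_data = [((toy_rng.gauss(0, 0.3), toy_rng.gauss(0, 0.3)), 0) for _ in range(40)]
toy_data += [((toy_rng.gauss(5, 0.3), toy_rng.gauss(5, 0.3)), 1) for _ in range(40)]

scores = repeated_holdout(toy_data)
print(f"max={max(scores):.2f} min={min(scores):.2f} "
      f"mean={statistics.mean(scores):.2f} std={statistics.pstdev(scores):.2f}")
```

The four printed statistics correspond to the first four numeric columns of Table 2; whether the original study used the population or the sample standard deviation is not stated, so `pstdev` is an arbitrary choice here.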

Table 2 summarizes the results obtained by each classifier. The values
presented show that some classifiers stand out in terms of training
speed, mainly the ELM, kNN and SVM. Other classifiers stand out in
terms of the speed to process a sample, such as the LS, LMS, MLP and
SVM with linear kernel.

In terms of accuracy, the kNN, SVM and OPF classifiers were superior to
the others, especially the kNN with 5 nearest neighbors, the SVM with
linear kernel and both OPF configurations. These classifiers stand out
for their high average accuracies, always greater than 99 %, and for
their high classification stability, with low standard deviations.

Table 2 shows the accuracy rates and the per-sample prediction times,
which are important criteria for embedded applications; the OPF and SVM
classifiers are the ones with the best results. To evaluate these
classifiers further, Table 3 presents the accuracy (Acc), sensitivity
(Se), specificity (Sp) and harmonic mean (HM) metrics for each class
under study for the worst case obtained.
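As an illustration of how per-class figures of this kind are usually derived, the sketch below computes one-vs-rest sensitivity and specificity from a multi-class confusion matrix. The matrix is hypothetical and is not data from this study, and the HM and Acc columns of Table 3 are not reproduced because their exact definitions are not restated in this section.

```python
def one_vs_rest_rates(confusion):
    """Per-class (sensitivity, specificity) from a square confusion matrix.

    confusion[i][j] counts the samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    rates = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                          # class c missed
        fp = sum(confusion[r][c] for r in range(n)) - tp     # others called c
        tn = total - tp - fn - fp
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        rates.append((sensitivity, specificity))
    return rates


# Hypothetical 3-class confusion matrix (illustrative values only).
example = [
    [48, 2, 0],
    [1, 49, 0],
    [0, 0, 50],
]
for label, (se, sp) in enumerate(one_vs_rest_rates(example)):
    print(f"class {label}: Se={100 * se:.2f} %  Sp={100 * sp:.2f} %")
```

With ten digit classes, each class is a small fraction of the samples, so specificity tends to stay close to 100 % even when a class is often missed; sensitivity is therefore the more informative column when reading Table 3.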

The values presented in Table 2 show that the SVM with linear kernel
stands out, as it has the highest mean accuracy (99.82 %) and one of
the lowest standard deviations (0.04). Also, the OPF with Euclidean
distance had the lowest testing time among the classifiers with mean
accuracy above 99 % (kNN, SVM and OPF); its training time was 16 times
lower, and its testing time about 64 times lower, than those of the SVM
with linear kernel. The OPF with Euclidean distance had an average
accuracy of 99.54 ± 0.10 %.
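The two speed ratios quoted above follow directly from the timing columns of Table 2, as the short check below shows. The dictionary layout is only a convenience for this illustration.

```python
# Values copied from Table 2: training time in s, per-sample testing time in µs.
svm_linear = {"train_s": 40.0, "test_us": 5595.0}
opf_euclidean = {"train_s": 2.5, "test_us": 87.0}

train_ratio = svm_linear["train_s"] / opf_euclidean["train_s"]
test_ratio = svm_linear["test_us"] / opf_euclidean["test_us"]

print(f"training: {train_ratio:.1f}x, testing: {test_ratio:.1f}x")
# prints "training: 16.0x, testing: 64.3x"
```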

These findings indicate that the OPF classifier with Euclidean distance
is suitable for integration in an Android application for speed limit
sign recognition with high efficiency.

The standardized digit sizes obtained in the DIP step and

used in the classifiers evaluation for the pattern recognition

step are available at website

Table 1 Number of elements in each class of the speed limit sign digit database

Digit   Class   No. of elements
0       1       1428
1       2       1841
2       3       1879
3       4       1688
4       5       1824
5       6       1569
6       7       1725
7       8       1414
8       9       1650
9       10      1952

Table 2 Results obtained in the evaluation of each classifier used for the recognition of the segmented speed limit digits

Classifier          Maximum    Minimum    Mean       Standard    Average         Average testing
                    accuracy   accuracy   accuracy   deviation   training        time for a
                    rate (%)   rate (%)   rate (%)               time (s)        sample (µs)
LS                  91.3       89.43      90.81      0.4         8.5             1.9
LMS                 90.80      45.00      76.45      17.52       50.56           4.1
ELM                 97.76      96.46      96.88      0.40        24.6            12.7
MLP                 96.83      87.12      91.93      3.08        158             14
kNN (K=1)           99.83      98.51      99.14      0.61        0.027           10,651
kNN (K=3)           99.89      99.7       99.78      0.06        0.028           10,639
kNN (K=5)           99.79      99.71      99.76      0.03        0.029           11,721
OPF (Euclidean)     99.65      99.36      99.54      0.10        2.5             87
OPF (Chi-squared)   99.7       99.23      99.47      0.13        2.5             748
SVM (Polynomial)    99.88      97.04      99.43      0.94        70              9,875
SVM (Linear)        99.87      99.76      99.82      0.04        40              5,595

Best values are in bold



5.3 Overall results and main contributions

Many methods have been proposed to detect speed signs, often using some
reference object in the input images. The first contribution of the
proposed framework is the detection of speed signs based on a cascade
of boosted classifiers combined with haar-like features. This approach
detects speed signs independently of the image acquisition distance,
which is an important feature since the signs appear smaller the
further they are from the camera. This independence is achieved because
samples of signs with various sizes were used in the training of the
cascade classifier.

Another important contribution of the proposed framework is not having
to use additional attributes, since the digits are resized to a
standard size of 35 × 35 pixels. By doing this, the digits become
invariant in terms of size, but here this is attained by processing the
image and not in the recognition step, as is normally done. This
increases the recognition speed and robustness. Another contribution,
also related to processing speed, is the evaluation of seven types of
classifiers to check which one combined the best recognition
performance with low processing time. The top recognition rate obtained
was higher than 99.7 %, with the SVM and OPF classifiers performing in
real time in the embedded system.
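The size standardization described above can be sketched as a resize followed by flattening. The sketch assumes nearest-neighbor sampling and assumes that the raw pixels of the 35 × 35 digit image form the feature vector, which would be consistent with the 1,225 input neurons reported in Sect. 5.2 (35 × 35 = 1,225). The interpolation method actually used is not stated, and the function names are hypothetical.

```python
TARGET = 35  # side length of the standardized digit image, as stated in the text


def resize_nearest(image, size=TARGET):
    """Resize a 2-D list of pixel values to size x size (nearest neighbor)."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[(y * src_h) // size][(x * src_w) // size] for x in range(size)]
        for y in range(size)
    ]


def to_feature_vector(image):
    """Flatten a 2-D image row by row into a 1-D list of pixel values."""
    return [pixel for row in image for pixel in row]


# Hypothetical digit crops of two different sizes (near and far signs).
small = [[(x + y) % 2 for x in range(20)] for y in range(28)]
large = [[(x + y) % 2 for x in range(90)] for y in range(120)]
for crop in (small, large):
    features = to_feature_vector(resize_nearest(crop))
    print(len(features))  # 1225 in both cases
```

Whatever the size of the detected digit, the classifier always receives a vector of the same length, so no scale-related attribute has to be computed in the recognition step.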

Analyzing the framework in an optimal configuration, we obtained a
detection and recognition rate of 89.19 %, which corresponds to 11,167
signs correctly detected and recognized from a database with 12,520
signs. The speed of the embedded system varied between 20 and 30 frames
per second, depending on the number of signs found in the input image.
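The overall rate can be checked from the counts reported in this section and in Sect. 5.1, since a sign contributes to it only if it is first detected and then recognized. The short calculation below uses only those counts; the variable names are illustrative.

```python
total_images = 12_520
detected = 11_320      # signs located by the detection step (Sect. 5.1)
recognized = 11_167    # signs both detected and correctly recognized

detection_rate = detected / total_images
recognition_rate = recognized / detected
overall_rate = recognized / total_images

# The overall rate is the product of the two stage rates.
assert abs(overall_rate - detection_rate * recognition_rate) < 1e-12

print(f"detection={100 * detection_rate:.2f} %  "
      f"recognition={100 * recognition_rate:.2f} %  "
      f"overall={100 * overall_rate:.2f} %")
# prints "detection=90.42 %  recognition=98.65 %  overall=89.19 %"
```

Rounding to two decimals gives 90.42 % and 98.65 % for the two stages, whereas the text reports 90.41 % and 98.64 %; the difference is consistent with the reported values having been truncated rather than rounded.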

All the solutions that were developed here are fast and

able to be embedded in commercial systems.

The drawback of the methodology developed is the error

generated by large rotations, but this can be mitigated with

the correct configuration of the camera.

Table 3 Acc, Se, Sp and HM for the worst case of the SVM, with linear and polynomial kernels, and OPF, with Euclidean and Chi-squared distances

SVM
        Linear kernel                         Polynomial kernel
Class   Sp (%)   Se (%)   HM (%)   Acc (%)    Sp (%)   Se (%)   HM (%)   Acc (%)
0       99.94    100.0    99.95    99.72      99.97    100.0    99.97    99.86
1       99.96    100.0    99.96    99.83      100.0    74.04    97.18    85.09
2       99.97    100.0    99.97    99.89      100.0    100.0    100.0    100.0
3       100.0    99.17    99.91    99.58      96.87    99.88    97.17    87.53
4       99.94    99.78    99.92    99.67      100.0    99.78    99.97    99.89
5       99.98    100.0    99.98    99.93      99.96    100.0    99.96    99.80
6       99.98    99.76    99.96    99.82      100.0    99.53    99.95    99.76
7       99.97    99.85    99.96    99.78      99.98    99.85    99.97    99.85
8       99.97    99.51    99.92    99.63      99.97    99.63    99.94    99.69
9       99.98    99.59    99.94    99.74      99.94    99.89    99.94    99.74
Total   99.97    99.76    99.95    99.76      99.67    97.04    99.40    97.04

OPF
        Euclidean distance                    Chi-squared distance
Class   Sp (%)   Se (%)   HM (%)   Acc (%)    Sp (%)   Se (%)   HM (%)   Acc (%)
0       99.93    100.0    99.94    99.71      99.97    99.81    99.96    99.81
1       99.95    100.0    99.96    99.80      99.97    99.81    99.96    99.81
2       99.61    100.0    99.65    98.37      99.95    99.81    99.94    99.72
3       99.95    96.48    99.61    98.02      99.97    98.97    99.87    99.39
4       100.0    99.64    99.96    99.82      99.93    99.81    99.92    99.62
5       100.0    99.81    99.98    99.90      99.91    99.62    99.89    99.44
6       99.95    98.91    99.85    99.27      99.65    99.08    99.60    98.00
7       99.97    99.82    99.96    99.82      99.97    100.0    99.98    99.91
8       99.91    99.07    99.83    99.16      99.79    96.51    99.47    97.31
9       99.95    99.82    99.94    99.73      99.95    98.89    99.85    99.25
Total   99.92    99.36    99.87    99.36      99.91    99.23    99.84    99.23



6 Conclusion

The problem addressed is challenging and, with the growth of cities and
the rise in the number of cars on the streets, it is extremely
important to use systems able to identify speed limit signs.

The objectives defined for this work were fully met,

since the system developed is able to detect and recognize

the speed limit signs satisfactorily. The developed system

successfully segmented the speed signs and recognized

their values with high accuracy.

The system obtained a global detection and recognition

rate of 89.19 %, with 90.41 % in the detection step and

98.64 % in the recognition step. From the images tested, it

can be concluded that the system implemented is quite

tolerant relative to the size of the image to be evaluated.

Despite the good results, this work has some limitations. The system
implemented is limited in terms of the distance between the imaging
device and the speed limit sign, and of the rotation involved: when
these are large, the identification can be erroneous.

Acknowledgments Pedro Pedrosa Rebouças Filho acknowledges the

sponsorship from the Instituto Federal do Ceará (IFCE) via grants

PROINFRA/2013, PROAPP/2014 and PROINFRA/2015. Victor

Hugo C. Albuquerque acknowledges the sponsorship from the

Brazilian National Council for Research and Development (CNPq)

through grants 470501/2013-8 and 301928/2014-2. Authors gratefully

acknowledge the funding of Project NORTE-01-0145-FEDER-

000022, SciTech—Science and Technology for Competitive and

Sustainable Industries, co-financed by Programa "Operacional
Regional do Norte (NORTE2020)" through "Fundo Europeu de
Desenvolvimento Regional (FEDER)."

References

1. Albuquerque VHC, Barbosa CV, Silva CC, Moura EP, Rebouças

Filho PP, Papa JP, Tavares JMRS (2015) Ultrasonic sensor sig-

nals and optimum-path forest classifier for the microstructural

characterization of thermally-aged inconel 625 alloy. Sensors

15(6):12,474

2. Albuquerque VHC, Rebouças Filho PP, da Silveira Cavalcanti T,

Tavares JMRS (2010) New computational solution to quantify

synthetic material porosity from optical microscopic images.

J Microsc 240(1):50–59

3. Amat F, Keller P (2013) 3D Haar-like elliptical features for

object classification in microscopy. In: 10th international sym-

posium on biomedical imaging (ISBI), pp 1194–1197

4. Arbib MA (2003) The handbook of brain theory and neural

networks. MIT Press, Cambridge

5. de Azevedo FM, Brasil LM, de Oliveira RCL (2000) Neural net-

works with applications control and expert systems. Visual Books

6. Barreto G, Frota R (2013) A unifying methodology for the

evaluation of neural network models on novelty detection tasks.

Pattern Anal Appl 16(1):83–97

7. Barros ALBP, Barreto GA (2012) Extreme learning machine

robusta para reconhecimento de faces. In: Brazilian conference

on intelligent systems. Curitiba, PR, Brasil

8. Barthès JPA, Bonnifait P (2015) Chapter 9 - Multi-Agent active

collaboration between drivers and assistance systems. In:

Advances in artificial transportation systems and simulation,

pp 163–180. Academic Press, Boston

9. Bittencourt G (2006) Artificial Intelligence - Tools and Theories,

3 edn. Federal University of Santa Catarina

10. Burges C (1998) A tutorial on support vector machines for pattern

recognition. Data Mining Knowl Discov 2(2):121–167

11. Canny J (1986) A computational approach to edge detection.

IEEE Trans Pattern Anal Mach Intell 6:679–698

12. Carrese S, Mantovani S, Nigro M (2014) A security plan pro-

cedure for heavy goods vehicles parking areas: an application to

the lazio region (Italy). Transp Res E Logist Transp Rev

65:35–49

13. Chang CC, Lin CJ (2011) Libsvm: a library for support vector

machines. ACM Trans Intell Syst Technol 2(3):27:1–27:27

14. Cortes C, Vapnik V (1995) Support vector networks. Mach Learn

20(3):273–297

15. Duda RO, Hart PE (1972) Use of the hough transformation to

detect lines and curves in pictures. Commun ACM 15(1):11–15

16. da Silva Felix JH, Cortez PC, Rebouças Filho PP, de Alexandria

AR, Costa RCS, Holanda MA (2008) Identification and quan-

tification of pulmonary emphysema through pseudocolors. In:

MICAI 2008: Advances in Artificial Intelligence, pp 957–964.

Springer

17. Elmer P, Lupp A, Sprenger S, Thaler R, Uhl A (2015) Exploring

compression impact on face detection using haar-like features. In:

Paulsen RR, Pedersen KS (eds) Image analysis, lecture notes in

computer science, vol 9127, pp 53–64. Springer International

Publishing

18. Falcão AX, Stolfi J, Lotufo RA (2004) The image foresting

transform theory, algorithms, and applications. IEEE Trans Pat-

tern Anal Mach Intell 26(1):19–29

19. Garcia I, Bronte S, Bergasa L, Almazan J, Yebes J (2012) Vision-

based drowsiness detector for real driving conditions. In: Intel-

ligent vehicles symposium (IV), pp 618–623

20. Glasbey CA (1993) Analysis of histogram-based thresholding

algorithms. CVGIP Graph Models Image Process 55:532–537

21. Gomes SL, Rebouças ES, Rebouças Filho PP (2014) Reconhec-

imento Óptico de caracteres para reconhecimento das sinal-

izações verticais das vias de trânsito. Rev SODEBRAS 9:9–12

22. Haykin SO (2008) Neural networks and learning machines.

Pearson Prentice Hall, Upper Saddle River

23. Helene O (2006) Method of least squares. Livraria da Física

24. Horata P, Chiewchanwattana S, Sunat K (2013) Robust extreme

learning machine. Neurocomputing 102:31–44

25. Huang GB, Chen L, Siew CK (2006) Universal approximation

using incremental constructive feedforward networks with ran-

dom hidden nodes. IEEE Trans Neural Netw 17:879–892

26. Huang GB, Wang DH, Lan Y (2011) Extreme learning machines:

a survey. Int J Mach Learn Cybern 2:107–122

27. Huang GB, Zhu QY, Siew CK (2006) Extreme learning machine:

theory and applications. Neurocomputing 70:489–501

28. Kocer HE, Cevik KK (2011) Artificial neural networks based

vehicle license plate recognition. Proc Comput Sci 3:1033–1037

29. Kohonen T (1989) Self-organization and associative memory, 3rd

edn. Springer-Verlag New York Inc, New York, NY

30. Lienhart R, Maydt J (2002) An extended set of haar-like features

for rapid object detection. In: International conference on image

processing, vol 1, pp I–900–I–903

31. McAndrew A (2004) Introduction to digital image processing
with MATLAB. Thomson Learning

32. Medeiros C, Barreto G (2013) A novel weight pruning method

for mlp classifiers based on the maxcore principle. Neural

Comput Appl 22(1):71–84



33. Mena AP, Bachiller Mayoral M, Díaz-López E (2015) Comparative
study of the features used by algorithms based on viola and
jones face detection algorithm. In: Bioinspired computation in
artificial systems, lecture notes in computer science, vol 9108,
pp 175–183. Springer International Publishing

34. Minsky M, Papert S (1969) Perceptrons. MIT Press, Cambridge

35. Moreira FDL, Kleinberg MN, Arruda HF, Freitas FNC, Parente

MMV, de Albuquerque VHC, Rebouças Filho PP (2016) A novel

vickers hardness measurement technique based on adaptive bal-

loon active contour method. Expert Syst Appl 45:294–306

36. Neto EC, Gomes SL, Filho PPR, de Albuquerque VHC (2015)

Brazilian vehicle identification using a new embedded plate

recognition system. Measurement 70:36–46

37. Neto EC, Rebouças ES, Moraes JL, Gomes SL, Rebouças Filho
PP (2015) Development control parking access using techniques
digital image processing and applied computational intelligence.
IEEE Trans Latin America 13:272–276

38. Nissen S (2003) Implementation of a fast artificial neural network

library (FANN). Department of Computer Science University of

Copenhagen (DIKU)

39. Papa JP, Falcão AX, de Albuquerque VHC, Tavares JMRS

(2012) Efficient supervised optimum-path forest classification for

large datasets. Pattern Recognit 45(1):512–520

40. Papa JP, Falcao AX, Suzuki CT (2009) Supervised pattern clas-

sification based on optimum-path forest. Int J Imaging Syst

Technol 19(2):120–131

41. Papa JP, Falcão AX, Suzuki CTN (2009) Supervised pattern

classification based on optimum-path forest. Int J Imaging Syst

Technol 19(2):120–131

42. Plucker JA, Esping A (2016) Human intelligence: historical
influences, current controversies, teaching resources.
http://www.intelltheory.com

43. Rakate G, Borhade S, Jadhav P, Shah M (2012) Advanced

pedestrian detection system using combination of haar-like fea-

tures, adaboost algorithm and edgelet-shapelet. In: IEEE inter-

national conference on computational intelligence computing

research (ICCIC), pp 1–5

44. Rebouças Filho PP, Cortez PC, da Silva Barros AC, Albuquerque

VHC (2014) Novel adaptive balloon active contour method based

on internal force for image segmentation - a systematic evalua-

tion on synthetic and real images. Expert Syst Appl

41(17):7707–7721

45. Rebouças Filho PP, Moreira FDL, de Lima Xavier FG, Gomes

SL, Santos JC, Freitas FNC, Freitas RG (2015) New analysis

method application in metallographic images through the con-

struction of mosaics via speeded up robust features and scale

invariant feature transform. Materials 8(7):3864

46. Rebouças Filho PP, Cortez PC, Félix JHDS, Cavalcante TdS,

Holanda MA (2013) Adaptive 2d crisp active contour model

applied to lung segmentation in ct images of the thorax of healthy

volunteers and patients with pulmonary emphysema. Revista

Brasileira de Engenharia Biomédica 29(4):363–376

47. Rezaei M, Ziaei Nafchi H, Morales S (2014) Global haar-like

features: a new extension of classic haar features for efficient face

detection in noisy images. Image and Video Technology, Lecture

Notes in Computer Science, vol 8333, pp 302–313. Springer

Berlin Heidelberg

48. Riedmiller M, Braun H (1993) A direct adaptive method for

faster backpropagation learning: the RPROP algorithm. IEEE Int

Conf Neural Netw 1:586–591

49. Ruck DW, Rogers SK, Kabrisky M, Oxley ME, Suter BW (1990)

The multilayer perceptron as an approximation to a bayes optimal

discriminant function. IEEE Trans Neural Netw 1(4):296–298

50. Russell SJ, Norvig P (2009) Artificial intelligence: a modern

approach, 3rd edn. Prentice Hall, Upper Saddle River

51. Schimidt W (1993) Initialization, backpropagation and general-

ization of feed-forward classifiers. IEEE Int Conf Neural Netw

1:598–604

52. Schölkopf B, Smola AJ (2002) Learning with kernels. MIT press,

Cambridge

53. Tavares JMR, Rebouças Filho PP, Cavalcante TDS, de Albu-

querque VHC (2009) Brinell and vickers hardness measurement

using image processing and analysis techniques. J Test Eval

38(1):1–7

54. Tu C, van Wyk B, Hamam Y, Djouani K, Du S (2013) Vehicle

position monitoring using hough transform. Int Conf Electron

Eng Comput Sci (EECS 2013) 4:316–322

55. Vapnik VN (1999) An overview of statistical learning theory.

IEEE Trans Neural Netw 10(5):988–999

56. Viola P, Jones M (2001) Rapid object detection using a boosted

cascade of simple features. IEEE Comput Soc Conf Comput

Vision Pattern Recognit 1:511–518

57. Widrow B (1990) 30 years of adaptive neural networks: perceptron,
madaline, and backpropagation. Proc IEEE 78:1415–1442

58. Widrow B, Winter R (1988) Neural nets for adaptative filtering

and adaptative pattern recognition. IEEE Comput 21:25–39

59. Wu BF, Huang HY, Chen CJ, Chen YH, Chang CW, Chen YL

(2013) A vision-based blind spot warning system for daytime and

nighttime driver assistance. Comput Electr Eng 39(3):846–862

60. Yi SC, Chen YC, Chang CH (2015) A lane detection approach

based on intelligent vision. Comput Electr Eng 42:23–29

61. Yu S, Shi Z (2015) The effects of vehicular gap changes with

memory on traffic flow in cooperative adaptive cruise control

strategy. Phys A Stat Mech Appl 428:206–223

62. Yuen HK, Illingworth J, Kittler J (1989) Detecting partially

occluded ellipses using the hough transform. Image Vis Comput

7(1):31–37

63. Zhang S, Bauckhage C, Cremers A (2014) Informed haar-like

features improve pedestrian detection. In: IEEE conference on

computer vision and pattern recognition (CVPR), pp 947–954

64. Zheng K, Zhao Y, Gu J, Hu Q (2012) License plate detection

using haar-like features and histogram of oriented gradients. In:

IEEE international symposium on industrial electronics (ISIE),

pp 1502–1505
