SPECIAL ISSUE PAPER

Real-time velocity measurement to linear motion of a rigid object with monocular image sequence analyses

Danilo Filitto · Júlio Kiyoshi Hasegawa · Airton Marco Polidório · Nardênio Almeida Martins · Franklin César Flores

D. Filitto · A. M. Polidório · N. A. Martins (corresponding author) · F. C. Flores
Department of Informatics, UEM, Maringá, PR, Brazil
e-mail: nardenio@din.uem.br; dfilitto@gmail.com; ampolidorio@gmail.com; fcflores@din.uem.br

J. K. Hasegawa
Cartography Department, UNESP, Presidente Prudente, SP, Brazil
e-mail: hasegawa@fct.unesp.br

Received: 2 July 2014 / Accepted: 3 November 2014 / Published online: 28 November 2014
© Springer-Verlag Berlin Heidelberg 2014
J Real-Time Image Proc (2016) 11:829-846, DOI 10.1007/s11554-014-0472-4

Abstract This paper presents a methodology, and all the procedures used to validate it, executed in a physics laboratory under controlled and known conditions. The validation was based on the analysis of the data registered in an image sequence and on measurements acquired by high-precision sensors. The methodology is intended to measure the velocity of a rigid object in linear motion using an image sequence acquired by a commercial digital video camera. The proposed methodology does not need a stereo pair of images to calculate the object position in 3D space: it requires only an image sequence acquired from a single viewing angle (monocular vision). To do so, the objects need to be detected while in movement, which is done by applying a segmentation technique based on the history of the values of each pixel registered in N consecutive image frames. After detecting and framing these objects, specific points belonging to the object (pixels) on the image plane (2D coordinates, or image space) are automatically chosen and then transformed into the corresponding points in the object space (3D coordinates) by applying collinearity equations or rational functions (proposed in this work). Once the object space coordinates of the points registered in the image sequence are obtained, the distance, in meters, covered by the object in a particular time interval can be measured and, consequently, its velocity can be calculated. The system is low cost, using only a computer (Intel i3 architecture) and a webcam to acquire the images (640 × 480 pixels, 30 fps). The algorithm has linear complexity, which allows the system to operate in real time. The results of the analyses are discussed, and the advantages and disadvantages of the method are presented.

Keywords Moving objects · Image segmentation · Geometric transformation · Velocity measurement · Rational polynomials · Collinearity equations · Monocular image sequence

1 Introduction

Object movement in any environment is a common research field in Digital Image Processing (DIP). Among the several types of systems capable of monitoring movement in a specific environment, there are those that detect and count moving vehicles [2, 12, 21] and those that measure the velocity of vehicles [27, 28].

In [2], a system for traffic-flow monitoring and road-traffic analysis is developed. This system uses image processing and pattern recognition methods to measure the velocity of vehicles and to recognize their license plate numbers.
In [12], a vehicle detection and classification system is proposed, which uses the size of the vehicles identified in the segmentation process to calculate their height and actual length, allowing them to be classified as vehicles or non-vehicles (vans, pick-ups, trucks). In [21], a vehicle counting system is proposed, which calculates the total number of vehicles traveling on a highway through a tracking zone; it extracts the highway structure, detects the moving vehicles and, finally, counts the vehicles identified in the detection process.

The system developed in [27] uses Tsai's two-stage camera calibration method [23] to define the intrinsic and extrinsic parameters of the camera, and a mathematical model to define the geometric relationship between the image plane and the object space, aiming to convert the coordinates of the image plane (2D), in which the vehicle lies, into world coordinates (3D). The velocity in linear motion is then measured on the basis of the time difference between two sequential frames.

In [28], the developed system aims to calculate the vehicle velocity. Its main idea is the creation of several control lines in the image, which are used to define the traffic lanes and to create points of correspondence with the real world. Based on these points, the actual world position of each pixel tracking the vehicle is calculated. With these coordinates in hand, the vehicle velocity is calculated from the distance covered between two points in a given time interval.

To develop a monitoring system based exclusively on DIP techniques, several problems need to be overcome. First of all, such a system must be able to detect and segment the monitored element in the scene. The segmentation method must be robust, as it has to circumvent several adverse conditions, such as those inherent to the element of interest itself (color, size, geometric shape, differences in texture patterns, and its own movement in the scene) as well as problems related to the environment (variations in the intensity of solar illumination, rain, shadows, interference caused by other objects present in the scene, etc.). In other words, a monitoring system based only on images has to overcome not only the problems caused by the environment, but also those caused by the object of interest itself. These problems are difficult to treat. Robust methods that detect the movement of an object in an image sequence are constantly being improved or proposed, such as those based on optical flow [4, 6-8, 18], on the motion history image [9, 10], and on background segmentation with the codebook model [15].

An additional problem arises when elements belonging to the imaged scene need to be reconstructed tridimensionally using only the data from the images. One possible way to perform this reconstruction is to use stereo pairs of images and reconstruct the object space by applying photogrammetric methods [1, 17].
To perform this reconstruction, it is first necessary to carry out the interior and exterior orientation of the camera; after rectifying and registering the images acquired in stereo vision, homologous pixels must be searched for between the images of the stereo pair in order to reconstruct the corresponding points in the object space [1, 6, 24].

As can be seen, 3D reconstruction based on image analysis is, by itself, a problem of significant difficulty. The insertion of other problems can introduce severe errors in this process: considering, for example, the motion of a body in an uncontrolled environment introduces further difficulties and uncertainties into the 3D reconstruction. Thus, this work aims only to propose a method for the 3D reconstruction of interest points observed in a sequence of images acquired under known conditions, and to use those points to measure the velocity of moving objects. The innovation introduced by this work is to consider only images acquired by a monocular vision system and, in this way, to eliminate the photogrammetric procedures of interior and exterior camera orientation, image rectification and registration, and the search for homologous pixels. A new method to solve this problem is therefore proposed and, to validate it, an experiment was executed in a physics laboratory under controlled and known conditions. This study involves the reconstruction of points belonging to the image plane (2D) into the corresponding points in the object space (3D) using monocular image sequences acquired by a commercially available, low-cost digital video camera [3, 13].

To measure the velocity of a rigid object in linear motion, the covered distances have to be computed. When this is done using images, or points of interest in the image, the image must be geometrically corrected. This correction is required due to the sphericity of the camera lens and because all acquired images are governed by the central perspective projection (as can be seen in Fig. 1): all points of the object space (3D) are registered on the image plane (2D), leading to loss of data and to geometric distortions.

The goal of this study is to present a method able to determine the velocity of a rigid object in linear motion, in real time, using monocular image sequence analysis. The study applies a technique for image sequence segmentation that considers the history of the variation of the values associated with homologous pixels registered in successive image frames [15], together with the analysis of the geometric relations between points in real space (object space) and the corresponding points on the image plane, obtained by collinearity equations [5, 26] and rational functions [19, 20, 22].

This paper is organized as follows: Sect. 2 presents the method applied during the segmentation phase. Section 3 describes how the conversion of image plane coordinates into object space coordinates is conducted. Section 4 describes the proposed system, as well as the experimental results. Finally, in Sect. 5, conclusions are presented and the final considerations are made.

2 Image segmentation based on the history of the values associated with pixels

Kim et al. [15] developed a technique for image segmentation that permits capturing background variations and dealing with scenes that contain moving objects or differences in lighting.
This technique quantizes the samples of each pixel of an image into codebooks, which represent, in a compact manner, the background model of the image sequence over a particular period of time. To apply this technique, it is necessary to acquire, a priori, a training sequence of values X for each pixel of each image frame. For N image frames, N vectors x belonging to the RGB color space, X = {x_1, x_2, ..., x_N}, are necessary to build the historical register of the color value variation (or of any other attribute) that occurred, in a determined period of time, for each pixel. Each pixel has a codebook C = {c_1, c_2, ..., c_L} composed of L codewords. Each codeword c_i, i = 1..L, is composed of an RGB vector V_i = (\bar{R}_i, \bar{G}_i, \bar{B}_i) and a tuple aux_i = \langle \hat{I}_{min,i}, \hat{I}_{max,i}, f_i, \lambda_i, p_i, q_i \rangle containing the brightness and temporal variables described as follows:

• \hat{I}_{min,i}, \hat{I}_{max,i}: respectively, the smallest and the largest brightness values observed in the historical register of codeword i;
• f_i: the frequency with which the codeword occurs;
• \lambda_i: the longest time interval, during the training period, in which the codeword was not matched;
• p_i, q_i: respectively, the first and the last access times of the codeword.

2.1 Reference model training

To cope with global and local lighting changes, Kim et al. [15] developed a model that separates color distortion from brightness distortion. Consider an input pixel x_t = (R, G, B) and a codeword c_i with V_i = (\bar{R}_i, \bar{G}_i, \bar{B}_i), and

\|x_t\|^2 = R^2 + G^2 + B^2    (1)

\|V_i\|^2 = \bar{R}_i^2 + \bar{G}_i^2 + \bar{B}_i^2    (2)

\langle x_t, V_i \rangle^2 = (\bar{R}_i R + \bar{G}_i G + \bar{B}_i B)^2    (3)

The color distortion can be calculated from

p^2 = \|x_t\|^2 \cos^2\theta = \frac{\langle x_t, V_i \rangle^2}{\|V_i\|^2}    (4)

and is represented by the function colordist, i.e.,

colordist(x_t, V_i) = \delta = \sqrt{\|x_t\|^2 - p^2}    (5)

The color distortion can be interpreted as a brightness-weighted distance in the normalized color space. This is equivalent to normalizing a codeword vector to the brightness of the input pixel; in this way, brightness is taken into consideration when measuring the color distortion, and the instability of normalized colors is avoided. Consider also the brightness function, given by

brightness(I, \langle \hat{I}_{min}, \hat{I}_{max} \rangle) = \begin{cases} \text{true}, & \text{if } I_{low} \le I \le I_{hi} \\ \text{false}, & \text{otherwise} \end{cases}    (6)

In the training period, each value x_t sampled at time t is compared to the values in the codebook to determine whether a codeword c_m corresponding to the sampled value already exists. To decide this, bounds on the color distortion and on the brightness of the codeword of index m are applied.

Table 1 Codebook construction algorithm [15]

In the algorithm presented in Table 1, the two conditions (a) and (b) of step III-(ii) are satisfied when the color of x_t and c_m are similar and the brightness of x_t lies within the acceptable brightness limits of c_m. Condition (a) verifies the distortion \delta of the values associated with an input pixel x_t = (R, G, B) relative to a codeword c_i with V_i = (\bar{R}_i, \bar{G}_i, \bar{B}_i); this condition compares the result of the colordist function with a threshold value \epsilon_1. Condition (b) verifies, in the tuple of the codeword c_m, whether the brightness value I of x_t belongs to the interval defined by the largest (I_hi) and the smallest (I_low) acceptable brightness values of the codeword c_m, that is, whether I ∈ [I_low, I_hi], with

I_{low} = \alpha \hat{I}_{max}    (7)

I_{hi} = \min\left\{ \beta \hat{I}_{max}, \frac{\hat{I}_{min}}{\alpha} \right\}    (8)

for \alpha < 1 and \beta > 1. These values reduce the range of the interval [I_low, I_hi], which becomes a stable range during codebook updating. According to Kim et al. [15], \alpha lies between 0.4 and 0.7 and \beta between 1.1 and 1.5; \alpha is obtained through experiments, with 0.4 allowing large brightness bounds and 0.7 giving tight bounds, while \beta is additionally used to limit I_hi, since shadows (rather than highlights) are observed in most cases. In this experiment, the values \alpha = 0.5 and \beta = 1.2 were set empirically.
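Putting conditions (a) and (b) together, the match test between an input pixel and a codeword can be written compactly. The following is a minimal C++ sketch of Eqs. 1-8; the type and function names (RGB, Codeword, colordist, brightnessOk, matches) are ours and only illustrate the test under the assumptions above, they do not reproduce the original implementation of [15]. The default values of alpha and beta are the ones set empirically in this experiment.

#include <algorithm>
#include <cmath>

struct RGB { double r, g, b; };

struct Codeword {
    RGB v;                 // mean color vector of the codeword (Eq. 2)
    double iMin, iMax;     // smallest/largest brightness seen during training
    // frequency f, lambda, first/last access p and q omitted for brevity
};

// Color distortion between input pixel xt and codeword vector v (Eqs. 1-5).
double colordist(const RGB& xt, const RGB& v) {
    double x2  = xt.r * xt.r + xt.g * xt.g + xt.b * xt.b;   // ||xt||^2
    double v2  = v.r * v.r + v.g * v.g + v.b * v.b;         // ||V||^2
    double dot = xt.r * v.r + xt.g * v.g + xt.b * v.b;      // <xt, V>
    double p2  = (v2 > 0.0) ? (dot * dot) / v2 : 0.0;       // ||xt||^2 cos^2(theta), Eq. 4
    return std::sqrt(std::max(0.0, x2 - p2));                // Eq. 5
}

// Brightness test of Eqs. 6-8: is the pixel brightness I inside [Ilow, Ihi]?
bool brightnessOk(double I, double iMin, double iMax,
                  double alpha = 0.5, double beta = 1.2) {
    double iLow = alpha * iMax;                              // Eq. 7
    double iHi  = std::min(beta * iMax, iMin / alpha);       // Eq. 8
    return I >= iLow && I <= iHi;
}

// A codeword matches the pixel when both conditions hold (step III-(ii) of Table 1).
bool matches(const RGB& xt, const Codeword& cw, double eps1) {
    double I = std::sqrt(xt.r * xt.r + xt.g * xt.g + xt.b * xt.b);  // brightness of xt
    return colordist(xt, cw.v) <= eps1 && brightnessOk(I, cw.iMin, cw.iMax);
}

During training, a matched codeword is updated with the new sample and its auxiliary tuple; when no codeword matches, a new one is created.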
The training for background modeling generates a codebook for each pixel, formed by the codewords that represent the history of values associated with that pixel during the training period. However, many of the entries in a pixel's codebook represent invalid values, such as noise or moving objects. Such entries are eliminated through the creation of a reference model.

2.2 Reference model

The reference model of an imaged scene (Eq. 9) is generated from the data filtered out of the codebooks of all the pixels. This filtering process removes from each codebook all the codewords that represent noise or moving objects, keeping only the entries that represent the image background:

\mathcal{M} = \{ c_m \mid c_m \in C \text{ and } \lambda_m \le T_M \}    (9)

The term T_M in Eq. 9 is a threshold used to remove the codewords that are presumably associated with noise or moving objects. According to [15], the most adequate value for this threshold is half the number of frames used in the reference model training process. In outdoor environments, it is advisable [27] to use a reference model training period of over 5 min.

2.3 Background detection

Once the reference model is obtained, it is possible to subtract from the current image the pixels belonging to static objects (the image background). The algorithm presented in Table 2 verifies whether each pixel of the image belongs to the background or to a moving object.

Table 2 Algorithm for background subtraction (BGS) [15]
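A minimal sketch of this foreground test is given below, assuming the Codeword type and the colordist and brightnessOk helpers of the previous sketch; the function name isForeground and the detection threshold epsilon_2 are ours. A pixel is labeled background when at least one codeword of its reference model matches it, and foreground otherwise.

#include <cmath>
#include <vector>

// Reference model of one pixel: the codewords kept after filtering with T_M (Eq. 9).
using ReferenceModel = std::vector<Codeword>;

// Background subtraction (BGS) for a single pixel, following Table 2:
// returns true when the pixel belongs to a moving object (foreground).
bool isForeground(const RGB& xt, const ReferenceModel& model, double eps2) {
    double I = std::sqrt(xt.r * xt.r + xt.g * xt.g + xt.b * xt.b);  // pixel brightness
    for (const Codeword& cw : model) {
        if (colordist(xt, cw.v) <= eps2 && brightnessOk(I, cw.iMin, cw.iMax))
            return false;   // matched a background codeword -> background
    }
    return true;            // no codeword matched -> foreground (moving object)
}

Applying this test to every pixel of a frame yields the binary mask later used by the detection module to frame the moving object.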
3 Coordinate conversion

This section describes the techniques used to convert coordinates from the image plane (2D) to the object space (3D). Two models are presented for this conversion: the polynomial model, suitable when the interior orientation parameters of the acquisition sensor are unknown, and the collinearity equations, based on the geometric relationship of the perspective projection and on knowledge of the interior orientation parameters of the camera.

3.1 Polynomial models

It is possible to relate the image plane and the object space through polynomial models [22]. The short form of a simple (non-rational) polynomial is given by Eqs. 10 and 11:

l = \sum_{i=0}^{m} \sum_{j=0}^{n} \sum_{k=0}^{p} a_{ijk} x^i y^j z^k    (10)

c = \sum_{i=0}^{m} \sum_{j=0}^{n} \sum_{k=0}^{p} b_{ijk} x^i y^j z^k    (11)

where (l, c) are the coordinates of a given pixel in the image matrix (row and column, respectively); x, y and z are the tridimensional Cartesian coordinates of the point on the ground; a_{ijk} and b_{ijk} are the polynomial coefficients; and m, n, p are integer values in the interval [0, 3], with m + n (+ p) being the order of the polynomial functions, generally three.

Equations 12 and 13 show in more detail an expansion of the notation of Eqs. 10 and 11, in this case for a polynomial of third order in 3D space:

l = a_1 + a_2 y + a_3 x + a_4 z + a_5 yx + a_6 yz + a_7 xz + a_8 y^2 + a_9 x^2 + a_{10} z^2 + a_{11} xyz + a_{12} y^3 + a_{13} yx^2 + a_{14} yz^2 + a_{15} y^2 x + a_{16} x^3 + a_{17} xz^2 + a_{18} y^2 z + a_{19} x^2 z + a_{20} z^3    (12)

c = b_1 + b_2 y + b_3 x + b_4 z + b_5 yx + b_6 yz + b_7 xz + b_8 y^2 + b_9 x^2 + b_{10} z^2 + b_{11} xyz + b_{12} y^3 + b_{13} yx^2 + b_{14} yz^2 + b_{15} y^2 x + b_{16} x^3 + b_{17} xz^2 + b_{18} y^2 z + b_{19} x^2 z + b_{20} z^3    (13)

Fig. 1 Distortion patterns modeled by specific polynomial terms [20]

According to [20], there are distortion or movement patterns that may be modeled or corrected by specific polynomial terms (Fig. 1). The main advantage of using the polynomial model, however, is that all sources of distortion are corrected simultaneously [19].

The parameters of Eqs. 12 and 13 are determined from the knowledge of at least 21 control points with known three-dimensional Cartesian coordinates (object space) and their corresponding positions in the image. The same equations (12 and 13) can then be used to determine the three-dimensional coordinates of points, by fixing the Z coordinate and calculating the X and Y coordinates.
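As a concrete illustration of Eqs. 12 and 13, the sketch below lists the 20 monomials of the third-order model and evaluates the row coordinate predicted for a ground point. The helper names are ours, and the estimation of the 40 coefficients from the 21 (or more) control points, typically by least squares, is not shown.

#include <array>

// The 20 monomials of the third-order 3D polynomial of Eqs. 12 and 13,
// in the same order as the coefficients a1..a20 (and b1..b20).
std::array<double, 20> monomials(double x, double y, double z) {
    return { 1.0,       y,         x,         z,
             y * x,     y * z,     x * z,     y * y,
             x * x,     z * z,     x * y * z, y * y * y,
             y * x * x, y * z * z, y * y * x, x * x * x,
             x * z * z, y * y * z, x * x * z, z * z * z };
}

// Image row predicted by Eq. 12 for a ground point (x, y, z);
// the same routine applied to the b coefficients gives the column of Eq. 13.
double predictRow(const std::array<double, 20>& a, double x, double y, double z) {
    std::array<double, 20> t = monomials(x, y, z);
    double l = 0.0;
    for (int i = 0; i < 20; ++i) l += a[i] * t[i];
    return l;
}

Each control point contributes one instance of this linear relation per image coordinate, so 21 or more points give an over-determined linear system in a_1..a_20 and b_1..b_20.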
3.2 Collinearity equations

The collinearity equations relate the image plane to the object space, transforming object space coordinates into image plane coordinates (and vice versa). In the photogrammetric process, the collinearity equations [16] reproduce mathematically the process of image formation, linking the coordinates in the object space (3D) to the corresponding coordinates in the image plane (2D). The basic principle of collinearity is the condition that the points C (perspective center), p' (image point) and P (object point) belong to the same line [5, 26].

Fig. 2 Coordinate system, parallel image and object, originating in the perspective center

The geometry shown in Fig. 2 contains the similar triangles ΔCDB ~ ΔCdb and ΔCBA ~ ΔCba and, therefore, permits establishing the following geometric relationships between the measurements:

\frac{x_p}{X} = \frac{z_p}{Z} \quad \text{and} \quad \frac{y_p}{Y} = \frac{z_p}{Z}    (14)

which lead to the projective equation system:

x_p = z_p \frac{X}{Z} \quad \text{and} \quad y_p = z_p \frac{Y}{Z}    (15)

where (x_p, y_p, z_p) are the coordinates of point p' in the image plane, corrected for systematic errors (interior orientation already carried out), and (X, Y, Z) are the coordinates of point P in the object space.

Fig. 3 Coordinate systems of the image (x_p, y_p, z_p) and the translated and rotated object (X_T, Y_T, Z_T) [16]

When the coordinates in the object space are translated, rotated or at a different scale with respect to the image plane (Fig. 3), a similarity transformation must be applied to bring the reference system of the object space (point P) into the reference system of the image plane (point p'). This transformation may be conducted in three steps: (a) applying three translations to compensate for the spatial difference between the origins; (b) applying rotations to compensate for the angular differences; and (c) correcting the difference in scale between the reference systems.

The rotations are counterclockwise, considering that the reference system of the image rotates while the object remains fixed. The rotation matrices M_\kappa, M_\varphi and M_\omega are given by

M_\kappa = \begin{bmatrix} \cos\kappa & \sin\kappa & 0 \\ -\sin\kappa & \cos\kappa & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad M_\varphi = \begin{bmatrix} \cos\varphi & 0 & -\sin\varphi \\ 0 & 1 & 0 \\ \sin\varphi & 0 & \cos\varphi \end{bmatrix}, \quad M_\omega = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\omega & \sin\omega \\ 0 & -\sin\omega & \cos\omega \end{bmatrix}    (16)

The matrices M_\omega, M_\varphi and M_\kappa are rotation matrices around the x, y and z axes, respectively. The rotation matrix M (M = M_\kappa \cdot M_\varphi \cdot M_\omega) is given by

M = \begin{bmatrix} \cos\varphi\cos\kappa & \cos\omega\sin\kappa + \sin\omega\sin\varphi\cos\kappa & \sin\omega\sin\kappa - \cos\omega\sin\varphi\cos\kappa \\ -\cos\varphi\sin\kappa & \cos\omega\cos\kappa - \sin\omega\sin\varphi\sin\kappa & \sin\omega\cos\kappa + \cos\omega\sin\varphi\sin\kappa \\ \sin\varphi & -\sin\omega\cos\varphi & \cos\omega\cos\varphi \end{bmatrix}    (17)

and the transformation is carried out through

\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = k M \begin{bmatrix} X_P - X_c \\ Y_P - Y_c \\ Z_P - Z_c \end{bmatrix}    (18)

in which
• k is a scale factor;
• X_c, Y_c and Z_c are the coordinates of the perspective center in the object reference system;
• X, Y and Z are the point coordinates in the XYZ system.

The equation system described in Eq. 18 may be written as

X = k [ m_{11}(X_P - X_c) + m_{12}(Y_P - Y_c) + m_{13}(Z_P - Z_c) ]
Y = k [ m_{21}(X_P - X_c) + m_{22}(Y_P - Y_c) + m_{23}(Z_P - Z_c) ]
Z = k [ m_{31}(X_P - X_c) + m_{32}(Y_P - Y_c) + m_{33}(Z_P - Z_c) ]    (19)

By replacing Eq. 19 in the projective Eq. 15, k is eliminated from the equation system, resulting in the collinearity equations, Eq. 20:

x_p = z_p \cdot \frac{m_{11}(X - X_c) + m_{12}(Y - Y_c) + m_{13}(Z - Z_c)}{m_{31}(X - X_c) + m_{32}(Y - Y_c) + m_{33}(Z - Z_c)}
y_p = z_p \cdot \frac{m_{21}(X - X_c) + m_{22}(Y - Y_c) + m_{23}(Z - Z_c)}{m_{31}(X - X_c) + m_{32}(Y - Y_c) + m_{33}(Z - Z_c)}    (20)

The equation system of Eq. 20 corresponds to the negative film image conception [19]; when applied to the reversal film, since z_p = -f (focal distance), it becomes

x_p = -f \cdot \frac{m_{11}(X_P - X_c) + m_{12}(Y_P - Y_c) + m_{13}(Z_P - Z_c)}{m_{31}(X_P - X_c) + m_{32}(Y_P - Y_c) + m_{33}(Z_P - Z_c)}
y_p = -f \cdot \frac{m_{21}(X_P - X_c) + m_{22}(Y_P - Y_c) + m_{23}(Z_P - Z_c)}{m_{31}(X_P - X_c) + m_{32}(Y_P - Y_c) + m_{33}(Z_P - Z_c)}    (21)

Equation 21 can be used to determine the exterior orientation parameters of the camera; for this, at least four points with known coordinates in the three-dimensional Cartesian system must be imaged.

The inverse form of the collinearity equations is obtained by applying the inverse transformation to the equation system of Eq. 18, which results in

\begin{bmatrix} X_P - X_c \\ Y_P - Y_c \\ Z_P - Z_c \end{bmatrix} = k^{-1} M^T \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}    (22)

and

\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = k_p \begin{bmatrix} x_p \\ y_p \\ z_p \end{bmatrix}    (23)

By replacing Eq. 23 in Eq. 22:

\begin{bmatrix} X_P - X_c \\ Y_P - Y_c \\ Z_P - Z_c \end{bmatrix} = k^{-1} M^T k_p \begin{bmatrix} x_p \\ y_p \\ z_p \end{bmatrix}    (24)

Equation 24 may be written as

X_P - X_c = k^{-1} k_p (m_{11} x_p + m_{21} y_p + m_{31} z_p)
Y_P - Y_c = k^{-1} k_p (m_{12} x_p + m_{22} y_p + m_{32} z_p)
Z_P - Z_c = k^{-1} k_p (m_{13} x_p + m_{23} y_p + m_{33} z_p)    (25)

By isolating the terms X_P and Y_P in Eq. 25, the inverse form of the collinearity equations is obtained (Eq. 26):

X_P = X_c + (Z_P - Z_c) \frac{m_{11} x_p + m_{21} y_p + m_{31} z_p}{m_{13} x_p + m_{23} y_p + m_{33} z_p}
Y_P = Y_c + (Z_P - Z_c) \frac{m_{12} x_p + m_{22} y_p + m_{32} z_p}{m_{13} x_p + m_{23} y_p + m_{33} z_p}    (26)
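Equation 26 is what the system evaluates at run time: given the exterior orientation (estimated beforehand from at least four control points via Eq. 21) and the height Z_P of the plane on which the object moves, every image point can be mapped to object space. The following is a minimal C++ sketch under those assumptions; the type and function names are ours, and the estimation of the orientation parameters is not shown.

#include <array>

// 3x3 rotation matrix in row-major order: m[i][j] corresponds to m_(i+1)(j+1) of Eq. 17.
using Mat3 = std::array<std::array<double, 3>, 3>;

struct CameraPose {
    Mat3 m;                // rotation matrix M (function of kappa, phi, omega)
    double Xc, Yc, Zc;     // perspective center in the object reference system
    double f;              // calibrated focal distance
};

// Inverse collinearity (Eq. 26): maps an image point (xp, yp), already corrected
// for systematic errors, onto the object plane of known height Zp.
void imageToObject(const CameraPose& cam, double xp, double yp, double Zp,
                   double& Xp, double& Yp) {
    const double zp = -cam.f;                                    // reversal-film convention (Eq. 21)
    const Mat3& m = cam.m;
    double den  = m[0][2] * xp + m[1][2] * yp + m[2][2] * zp;    // m13*xp + m23*yp + m33*zp
    double numX = m[0][0] * xp + m[1][0] * yp + m[2][0] * zp;    // m11*xp + m21*yp + m31*zp
    double numY = m[0][1] * xp + m[1][1] * yp + m[2][1] * zp;    // m12*xp + m22*yp + m32*zp
    Xp = cam.Xc + (Zp - cam.Zc) * numX / den;
    Yp = cam.Yc + (Zp - cam.Zc) * numY / den;
}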
4 The system

A computational system capable of measuring the velocity of a rigid object in linear motion was developed. This system is composed of five modules:

1. the interface module, responsible for the interface between the computational system and the video camera, capturing the images acquired by the camera and making them available for processing;
2. the detection module, responsible for detecting moving objects in the sequence of acquired images, using segmentation based on Bradski's Mean Shift algorithm;
3. the coordinate conversion module, responsible for converting coordinates from the image plane to the object space using the collinearity equations or the polynomial model; this module takes as input the a priori knowledge of 21 points of the object space with their respective points on the image plane and, if the conversion method chosen is the collinearity equations, the camera calibration information as well;
4. the module responsible for measuring the velocity of an object and conducting the analyses relative to this value;
5. the module responsible for exhibiting the image sequence with the computed details superimposed.

Fig. 4 Flowchart of the system

The implementation of the system (i.e., of its modules) is represented in the flowchart of Fig. 4 and is explained in the sequence; a structural sketch of the resulting processing loop is given after this description. All modules were implemented in C++, and the OpenCV library was used to handle and process the captured images.

The implementation starts with the initialization of the interface module, where the source of images is set (webcam or video file). Next, an image frame can be picked from the video source and stored in RAM; this image is represented by the IplImage structure provided by the OpenCV library.

The image orientation process is performed on a single image, because the camera remains in a fixed position during the acquisition of the images. After the control points are observed in the image (on the reference plate, see Fig. 6), the coordinate conversion module is activated, receiving as a parameter the mathematical model (Eqs. 12 and 13 for the polynomial method or Eq. 26 for the collinearity equations). The module also receives, as parameters, the intrinsic and extrinsic camera data, together with the control points of the image space and their respective coordinates in the object space (21 control points). In this module, a method is applied to determine the orientation parameters that will be used to convert points from the image space to the object space.

Following the initialization of the coordinate conversion module, the object detection module is started and the background image is computed from images of the environment captured for at least 40 s; this background image is used to segment the moving bodies. The monitoring task is then started by activating the velocity measurement module, and another module is responsible for exhibiting the results on screen.

The detection module captures a frame from the image sequence and segments it to verify whether there is an object inside the region of interest given by a bounding box (Fig. 5). If an object is found in the bounding box, the object control point is determined (lower left coordinate) and sent to the measurement module. The segmentation module then recalculates the background image taking into account the last captured frame; after this recalculation, the process iterates and another frame is captured from the image sequence. If the object detected in the segmentation step has moved past the region of interest, the measurement module converts the control points by applying the coordinate conversion module, and the velocity is measured. The exhibition module takes the computed data and displays it on screen.

Fig. 5 Bounding box implemented in the detection module for the storing of control points
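The sketch below shows, in compilable form, how such a loop can be organized with OpenCV. It is only a structural illustration: the camera index and the region of interest are arbitrary, and OpenCV's MOG2 background subtractor is used here merely as a convenient stand-in for the codebook segmentation of Sect. 2, which is what the actual system uses.

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main() {
    cv::VideoCapture cap(0);                    // interface module: webcam (or a video file path)
    if (!cap.isOpened()) return 1;

    auto bgs = cv::createBackgroundSubtractorMOG2();   // stand-in for the codebook model of [15]
    cv::Rect roi(200, 150, 300, 200);           // bounding box monitored by the detector (Fig. 5)

    cv::Mat frame, mask;
    while (cap.read(frame)) {
        double t = cv::getTickCount() / cv::getTickFrequency();   // timestamp of the frame (s)
        bgs->apply(frame, mask);                                   // detection module: foreground mask
        cv::threshold(mask, mask, 200, 255, cv::THRESH_BINARY);    // drop shadow pixels

        std::vector<cv::Point> fg;
        cv::findNonZero(mask(roi), fg);          // foreground pixels inside the region of interest
        if (!fg.empty()) {
            cv::Rect r = cv::boundingRect(fg);   // minimal rectangle around the moving object
            cv::Point control(roi.x + r.x, roi.y + r.y + r.height);   // lower left corner
            // Coordinate conversion module: map 'control' to object space with Eq. 26
            // (see the imageToObject sketch above), then accumulate Ds and Dt to obtain v.
            std::cout << "t = " << t << "  control = " << control << std::endl;
        }
    }
    return 0;
}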
4.1 The experiment

To validate the system and the methodology developed, an experiment was performed in an Applied Physics Laboratory with the use of a device named "Air Track" (Fig. 6), in order to measure the velocity of a rigid object in linear motion on a surface. The Air Track possesses small orifices along its sides; when compressed air is injected into the track, air leaves these orifices and forms an air cushion on the surface of the track. This permits an object to slide over the track with no (or insignificant) friction between the surfaces. Five photoelectric sensors are placed along the track, each connected to a chronometer (precision of 5 × 10^-4 s), and the distances (Δs) separating the sensors from one another are known. When an object slides over the track and passes the first sensor, all chronometers are started from zero; when the object passes each of the following sensors, the chronometer connected to that sensor is stopped. Thus, the distance traveled by the object between the sensors and the time (Δt) needed to cover each of the distances are obtained. With these data, the velocity, v = Δs/Δt, is calculated at each of the measured points.

The experiment was entirely registered in a sequence of images with the use of a low-cost digital camera. Each registered image frame was submitted to the movement detection and moving object segmentation method developed by [15]. After segmenting the moving object in the i-th frame, registered at time t_i, the coordinate (c_i, l_i) relative to the image plane was extracted (Fig. 6), a point that is significant with respect to the movement performed by the object. Once the coordinates of two consecutive frames, (c_i, l_i) and (c_{i+1}, l_{i+1}), acquired at the moments t_i and t_{i+1}, respectively, are obtained, they are analyzed to detect whether the object altered its position. If the object position changed, a rectangle is drawn surrounding the object and the coordinates of its lower left-hand corner are used in the calculations (Fig. 6).

Fig. 6 Image sequence, frame i. The movement of a rigid object is detected (vector d points to the lower left-hand corner of the image). The object is surrounded by a minimal rectangle (in white). The coordinate (c_i, l_i) of the point (in yellow) at the lower left-hand corner of the rectangle is extracted

To calculate the velocity of the object, two parameters need to be determined: (1) the time interval in which the movement is observed (Δt); and (2) the distance (Δs), in metric units, covered by the moving object during this observation. The time interval of the observation is given by the number of frames the camera is capable of imaging per second; for each acquired frame i, the time t_i at which it was captured is also registered. Thus, for two consecutive image frames,

\Delta t = t_{i+1} - t_i    (27)

The determination of the covered distance (Δs) must be performed in the object space (3D, real world). The point measured in the image plane (2D) needs to be transformed (Eqs. 12 and 13 for the polynomial model, or Eq. 22 for the collinearity equations) so that the velocity may be effectively calculated. In order for this transformation to be used in the determination of the 3D Cartesian coordinates (object space), the parameters a_i and b_i (i = 1, ..., 20) must be determined (Eqs. 12 and 13). These 40 parameters are determined using at least 21 photo-identifiable points (support points with 3D coordinates known in the object space), referencing the image to the object space coordinates. Similarly, in the case of the collinearity equations (Eq. 21), the minimum requirement is four control points. The covered distance may then be determined from the rectangle surrounding the object at the two moments.

For the performance of the experiment in the laboratory, a steel plate containing 88 visible points, distributed in a matrix arrangement and regularly spaced at 100 mm, was used to establish a 3D Cartesian coordinate system, attributing X, Y and Z coordinates to the points on the plate (Fig. 7). When the images were acquired, some of the points on the plate were chosen as control points to be measured in the image plane, enabling the determination of the transformation parameters of Eq. 22 [14]. When the transformation parameters and the calibration information of the camera are known, any point in the image may have its coordinates in the object space estimated quite precisely. By applying the Euclidean distance to the 3D object space coordinates at two moments, the covered distance (Δs) is determined, which is used to determine the object velocity.
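In code, this final step is straightforward. The sketch below (type and function names are ours) computes the velocity from two consecutive observations of the object already expressed in object space; Δt comes from the camera frame rate, as in Eq. 27.

#include <cmath>

struct ObjectObservation {
    double X, Y, Z;   // position in the object space (mm)
    double t;         // acquisition time of the frame (s)
};

// Velocity between two consecutive observations: Ds is the Euclidean distance
// in the object space and Dt follows Eq. 27.
double velocity(const ObjectObservation& a, const ObjectObservation& b) {
    double dX = b.X - a.X, dY = b.Y - a.Y, dZ = b.Z - a.Z;
    double ds = std::sqrt(dX * dX + dY * dY + dZ * dZ);   // covered distance (mm)
    double dt = b.t - a.t;                                 // time interval (s)
    return ds / dt;                                        // velocity (mm/s)
}

For a camera delivering 30 frames per second, Δt between consecutive frames is about 1/30 s.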
4.2 Procedures for conducting the experiment

Figure 8 illustrates the environment setup for the execution of the experiment. Under the air track, a steel plate was installed (Fig. 7), which permits determining the points that relate the object space to the image plane. The experiment, with the object moving over the air track, was filmed, and the acquired images were submitted to the proposed system, through which the velocity was calculated by image analysis and photogrammetric computations. The calculated velocity value was then compared with the real velocity value computed from the sensors on the air track.

Fig. 7 Air track system used to calculate the velocity of a rigid object in controlled conditions

Fig. 8 Environment setup of the experiment

4.3 Results

This section describes the results concerning the calculation of the velocity of a rigid object moving in a straight line with null acceleration. To do so, a linear regression was performed using the least squares method (LSM) [11, 25] to enable velocity interpolation. A linear model y = b + ax was used, in which the linear coefficient b and the angular coefficient a may be estimated with the following expressions:

a = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}    (28)

b = \frac{\sum_{i=1}^{n} y_i - a \sum_{i=1}^{n} x_i}{n}    (29)

The following expressions determine the deviations of the slope of the fit of the data to the linear model, Δa, and of the linear coefficient, Δb:

\Delta a = \frac{\sqrt{n} \, \sigma}{\sqrt{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}}    (30)

\sigma = \sqrt{\frac{\sum_{i=1}^{n} (y_i - a x_i - b)^2}{n - 2}}    (31)

\Delta b = \Delta a \sqrt{\frac{\sum_{i=1}^{n} x_i^2}{n}}    (32)

The slope (angular coefficient) is written as a ± Δa and the intercept (linear coefficient) as b ± Δb. The correlation coefficient r is a parameter for the study of bidimensional distributions that indicates the level of dependence between the data associated with the variables X and Y. It is obtained by applying the following expression:

r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \; \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}    (33)

with \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i and \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i. The coefficient r lies in [-1, +1] and is interpreted in the following way:

• r = +1 indicates that the linear correlation between the values of X and Y is perfect and direct;
• r = -1 indicates that the linear correlation between the values of X and Y is perfect and inverse;
• r = 0 indicates that there is no correlation, i.e., the values of X and Y are totally independent.
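A minimal C++ sketch of this fit is shown below (struct and function names are ours): it returns the slope, the intercept, their deviations and the correlation coefficient of Eqs. 28-33. When the covered distance is fitted against time with it, the slope a is the estimated velocity.

#include <cmath>
#include <vector>

struct LinearFit { double a, b, da, db, r; };   // slope, intercept, deviations, correlation

// Least squares fit of y = b + a*x with the deviations of Eqs. 30-32 and the
// correlation coefficient of Eq. 33; x and y must have the same size n >= 3.
LinearFit fitLine(const std::vector<double>& x, const std::vector<double>& y) {
    const int n = static_cast<int>(x.size());
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (int i = 0; i < n; ++i) { sx += x[i]; sy += y[i]; sxx += x[i] * x[i]; sxy += x[i] * y[i]; }

    LinearFit f{};
    double d = n * sxx - sx * sx;
    f.a = (n * sxy - sx * sy) / d;                        // Eq. 28
    f.b = (sy - f.a * sx) / n;                            // Eq. 29

    double ss = 0, sxm = 0, sym = 0, sxym = 0;
    double mx = sx / n, my = sy / n;
    for (int i = 0; i < n; ++i) {
        double e = y[i] - f.a * x[i] - f.b;               // residual of point i
        ss   += e * e;
        sxm  += (x[i] - mx) * (x[i] - mx);
        sym  += (y[i] - my) * (y[i] - my);
        sxym += (x[i] - mx) * (y[i] - my);
    }
    double sigma = std::sqrt(ss / (n - 2));               // Eq. 31
    f.da = std::sqrt(static_cast<double>(n)) * sigma / std::sqrt(d);   // Eq. 30
    f.db = f.da * std::sqrt(sxx / n);                     // Eq. 32
    f.r  = sxym / (std::sqrt(sxm) * std::sqrt(sym));      // Eq. 33
    return f;
}

For the data of Table 3, for example, x would hold the average time intervals and y the corresponding distances Δs.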
4.3.1 Computed results using photoelectric sensors

Table 3 exhibits the velocity of the rigid object computed from the sensors on the air track. The time data collected in the four experiments were used to calculate the respective average values Δt̄; these average values minimize the systematic errors that occurred during the performance of the experiments, and only the average values were used in the analysis performed in this study.

Table 3 Data obtained in the experiments performed in the laboratory and their respective calculated velocities

              Δt0 (s)   Δt1 (s)   Δt2 (s)   Δt3 (s)   Δt4 (s)   Calculated velocity (cm/s)
Experiment 1  0.0000    0.5150    1.0620    1.5540    2.1230    28.4 ± 0.3
Experiment 2  0.0000    0.5200    1.0730    1.5690    2.1470    28.1 ± 0.3
Experiment 3  0.0000    0.5200    1.0780    1.6180    2.2710    26.6 ± 0.6
Experiment 4  0.0000    0.5210    1.0760    1.5780    2.1610    27.9 ± 0.3
Δt̄ (s)        0.0000    0.5190    1.0722    1.5798    2.1755    27.7 ± 0.4
Δs (cm)       0.0000    15.000    30.000    45.000    60.000

Δt̄: average time interval values; Δt_i: value of each time interval i measured in the experiment; Δs: distance covered by the rigid object in each time interval

The linear equation calculated using the LSM was s = 0.4 + 27.7t (Fig. 9), with a correlation factor of 0.9997. This value demonstrates that the experimental data are strongly correlated with the linear model of uniform linear motion, s = s_0 + vt, where s is the distance covered from an initial position s_0 as a function of the time t at constant velocity v.

Fig. 9 Linear fit of the distances s using LSM as a function of the average time intervals Δt̄

4.3.2 Computed results based on collinearity equations and polynomial transformation method using automatic object segmentation

Table 4 illustrates the data obtained by the software developed, using the segmentation method developed by [15] and the coordinate conversion from image plane to object space based on the collinearity equations and on the polynomial transformation method. The values of distance and time interval extracted from the sequence of image frames resulted in a velocity of 26.691 cm/s with a correlation factor of 0.999 (Fig. 10a) for the collinearity equations, and a velocity of 33.484 cm/s with a correlation factor of 0.999 (Fig. 10c) for the polynomial transformation method.

Comparing the velocity obtained by the method based on the collinearity equations (26.691 cm/s) with the value obtained from the photoelectric sensors (27.7 cm/s), the deviation was on the order of 3.64 %, whereas for the velocity obtained by the polynomial transformation method (33.484 cm/s) the deviation, relative to the same sensor value (27.7 cm/s), was on the order of 20.88 %.
4.3.3 Computed results based on collinearity equations and polynomial transformation method using manual object segmentation Table 5 illustrates the data obtained by the manual capture of points in the plane image, using the coordinate con- version method from plane image to space object based on the collinearity equations and polynomial transformation method. Table 4 Automatic data extraction from images sequence acquired in the laboratory experiments Automatic point extraction Frame Time (s) Plane image coordinates Space object coordinates and distance computation Polynomial transformation Collinearity equations Xi Yi Xo Yo Distance (mm) Xo Yo Distance (mm) 1 0.000000 367 211 311.122 803.369 0.0000 323.655 424.786 0.0000 2 0.068966 367 213 311.428 791.296 12.0729 323.856 415.896 8.8900 3 0.103448 367 216 311.868 773.609 29.7602 324.154 402.727 22.0600 4 0.137931 367 218 312.150 762.081 41.2881 324.351 394.056 30.7300 5 0.172414 366 218 309.811 762.061 41.3081 322.508 394.074 30.7100 7 0.241379 367 223 312.819 734.107 69.2619 324.833 372.750 52.0400 8 0.275862 367 226 313.198 717.857 85.5121 325.116 360.213 64.5700 9 0.310345 367 228 313.442 707.229 96.1401 325.303 351.956 72.8300 10 0.344828 367 231 313.797 691.577 111.7920 325.579 339.718 85.0700 11 0.379310 367 233 314.026 681.327 122.0420 325.761 331.656 93.1300 12 0.413793 366 235 312.035 671.205 132.1640 324.174 323.687 101.1000 13 0.448276 366 237 312.265 661.230 142.1390 324.361 315.776 109.0100 14 0.482759 367 240 314.786 646.518 156.8510 326.385 304.032 120.7500 15 0.517241 368 242 317.165 636.864 166.5050 328.299 296.289 128.5000 16 0.551724 368 245 317.448 622.590 180.7790 328.546 284.830 139.9600 18 0.620690 369 250 320.027 599.356 204.0130 330.657 266.064 158.7200 19 0.655172 369 253 320.271 585.722 217.6470 330.883 255.017 169.7700 20 0.689655 369 255 320.430 576.755 226.6140 331.031 247.734 177.0500 21 0.724138 369 257 320.586 567.883 235.4860 331.179 240.516 184.2700 22 0.758621 370 260 322.891 554.746 248.6230 333.067 229.795 194.9900 23 0.793103 370 263 323.098 541.803 261.5660 333.272 219.229 205.5600 24 0.827586 370 265 323.234 533.280 270.0900 333.408 212.262 212.5200 25 0.862069 378 267 339.721 524.810 278.5590 346.691 205.242 219.5400 26 0.896552 379 270 341.839 512.283 291.0860 348.437 194.981 229.8100 27 0.931034 378 273 339.891 499.941 303.4280 346.916 184.880 239.9000 28 0.965517 379 276 341.979 487.747 315.6220 348.638 174.880 249.9100 29 1.000000 378 279 340.054 475.726 327.6430 347.135 165.033 259.7500 30 1.034480 376 282 336.172 463.863 339.5060 344.060 155.324 269.4600 The space object coordinates transformations and the distance computation by methods: polynomial transformation and collinearity equations J Real-Time Image Proc (2016) 11:829–846 841 123 The values for the distances and time intervals were extracted from the sequence of image frames, which resulted in a velocity of 27.905 cm/s with a correlation factor equivalent to 0.9980 (Fig. 10b) for the use of col- linearity equations, and a velocity of 34.99 cm/s with a correlation factor equivalent to 0.9983 (Fig. 10d) for the use polynomial transformation method. Comparing the velocity values obtained by the trans- formation method based on the collinearity Eqs. 
(27.905 cm/s) with the value obtained from the photoelectric sensors (27.7 cm/s), the deviation was on the order of 0.74 %, whereas for the velocity obtained by the polynomial transformation method (34.99 cm/s) the deviation, relative to the same sensor value (27.7 cm/s), was on the order of 26.31 %.

4.3.4 Analysis of the results

Table 6 illustrates the results obtained in the tests performed, highlighting the velocity measured with the automatic detection of the rigid object, using the segmentation method described by [15], and with manual detection. It also shows the error, in percentage, introduced by the automatic identification of the rigid object in the velocity measurement process, and the error, in percentage, of each coordinate conversion method (image plane to object space) in comparison with the velocity obtained from the photoelectric sensor data.

Fig. 10 Linear fit of the time (t) versus distance (s) using LSM. a Result computed by collinearity equations using automatic object segmentation. b Result computed by collinearity equations using manual object segmentation. c Result computed by polynomial transformation method using automatic object segmentation. d Result computed by polynomial transformation method using manual object segmentation

Based on the results obtained, it is possible to state that the technique used for the automatic identification of moving rigid objects may introduce errors in the velocity measurement process. This was due to errors in the identification of the lower left-hand corner of the moving rigid object (Fig. 11), which is used as the reference point for the velocity measurement.

5 Final considerations

This study presented, implemented and experimentally evaluated a methodology for the velocity measurement of a rigid object in motion, having as its starting point a monocular image sequence acquired by a digital video camera.
Table 5 Manual data extraction from images sequence acquired in the laboratory experiments Manual point extraction Frame Time (s) Plane image coordinates Space object coordinates and distance computation Polynomial transformation Collinearity equations Xi Yi Xo Yo Distance (mm) Xo Yo Distance (mm) 1 0.00000 367 211 311.122 803.369 0.0000 323.655 424.786 0.0000 2 0.06897 367 213 311.428 791.296 12.0729 323.856 415.896 8.8900 3 0.10345 367 216 311.868 773.609 29.7602 324.154 402.727 22.0592 4 0.13793 367 218 312.150 762.081 41.2881 324.351 394.056 30.7300 5 0.17241 366 218 309.811 762.061 41.3081 322.508 394.074 30.7121 7 0.24138 367 223 312.819 734.107 69.2619 324.833 372.75 52.0368 8 0.27586 367 226 313.198 717.857 85.5121 325.116 360.213 64.5733 9 0.31035 367 228 313.442 707.229 96.1401 325.303 351.956 72.8304 10 0.34483 367 231 313.797 691.577 111.7920 325.579 339.718 85.0683 11 0.37931 367 233 314.026 681.327 122.0420 325.761 331.656 93.1301 12 0.41379 366 235 312.035 671.205 132.1640 324.174 323.687 101.1000 13 0.44827 366 237 312.265 661.230 142.1390 324.361 315.776 109.0100 14 0.48276 365 240 310.418 646.498 156.8710 322.891 304.063 120.7230 15 0.51724 365 243 310.761 632.052 171.3170 323.176 292.497 132.2890 16 0.55172 366 247 313.344 613.201 190.1680 325.27 277.308 147.4780 18 0.62069 364 251 309.502 594.756 208.6130 322.214 262.438 162.3480 19 0.65517 364 254 309.823 581.200 222.1690 322.497 251.439 173.3470 20 0.68966 365 256 312.126 572.290 231.0790 324.368 244.174 180.6120 21 0.72414 364 260 310.437 554.725 248.6440 323.05 229.879 194.9080 22 0.75862 364 263 310.732 541.787 261.5830 323.322 219.311 205.4750 23 0.79310 364 266 311.018 529.036 274.3340 323.589 208.882 215.9040 24 0.82759 364 268 311.205 520.635 282.7340 323.766 202.004 222.7820 25 0.86207 364 271 311.479 508.180 295.1900 324.027 191.799 232.9880 26 0.89655 364 272 311.569 504.065 299.3040 324.114 188.426 236.3600 27 0.93103 364 276 311.920 487.790 315.5790 324.456 175.078 249.7080 28 0.96552 364 278 312.092 479.757 323.6120 324.625 168.489 256.2970 29 1.00000 364 281 312.344 467.836 335.5330 324.875 158.709 266.0770 30 1.03448 364 285 312.671 452.169 351.2010 325.204 145.858 278.9280 The space object coordinates transformations and the distance computation by methods: polynomial transformation and collinearity equations Table 6 Results of the tests performed in the velocity measurement Method used in velocity measurement Velocity (cm/s) obtained with the identification of objects Error (%) relative to photoelectric sensors (velocity measured by object identification) Automatic Manual Automatic Manual Photoelectric sensors 27.700 – – – Collinearity equations 26.691 27.905 3.64 0.74 Polynomial transformation method 33.484 34.990 20.80 26.31 J Real-Time Image Proc (2016) 11:829–846 843 123 The application of the modified collinearity model pre- sented in the experiments conducted to the best results in comparison to the other models. In the best result, the estimated velocity was 27.90 cm/s, with an error of 0.74 % in relation to the velocity obtained by the photoelectric sensors (27.7 cm/s). It is necessary to complement this study approaching the influence of scale variation in image acquisition and know if the use of points belonging to the object, with different elevation values on the object of interest have influence on the accuracy of the results achieved by the application the proposed methods. 
The prototype, implemented and tested in controlled conditions and with the proper considerations, demonstrated that the methodology is valid. However, further experimental tests should be performed, especially with different camera viewing angles, different distances between the camera and the scene (scale factor) and different image acquisition frame rates, to better understand how these factors may influence the results.

The advantages of using the methods proposed in this paper are:

1. they require only inexpensive and commercially available technology;
2. they do not require other types of sensors; the video camera alone is sufficient;
3. they have high computational efficiency, because the required algorithms have linear complexity;
4. they do not require the computations involved in the 3D reconstruction of images acquired from different viewing angles (stereo pairs).

The disadvantages of using the methods proposed in this paper are:

1. since these methods are based on image analysis, a robust segmentation method must be used, able to overcome the problems imposed by the scene environment;
2. specific points of the scene (object space) with known coordinates must be previously established, and these points must be identifiable at any time while the system is operating.

References

1. Aguilar, M.A., Aguilar, F.J., Agüera, F., Sánchez, J.A.: Geometric accuracy assessment of QuickBird basic imagery using different operational approaches. Photogramm. Eng. Remote Sens. 73(12), 1321-1332 (2007)
2. Atkociunas, E., Blake, R., Juozapavicius, A., Kazimianec, M.: Image processing in road traffic analysis. Nonlinear Anal. Model. Control 10(4), 315-332 (2005)
3. Avidan, S., Shashua, A.: Trajectory triangulation: 3D reconstruction of moving points from a monocular image sequence. IEEE Trans. Pattern Anal. Mach. Intell. 22(4), 348-357 (2000). doi:10.1109/34.845377
4. Barranco, F., Tomasi, M., Diaz, J., Vanegas, M., Ros, E.: Parallel architecture for hierarchical optical flow estimation based on FPGA. IEEE Trans. Very Large Scale Integration (VLSI) Syst. 20(6), 1058-1067 (2012)
5. Bonneval, H.: Levés topographiques par photogrammétrie aérienne. In: Photogrammétrie Générale: Tome 3, Collection scientifique de l'Institut Géographique National. Eyrolles Editeur, Paris, France (1972)
6. Botella, G., Garcia, A., Rodriguez, M., Ros, E., Baese, U., Molina, M.: Robust bioinspired architecture for optical flow computation. IEEE Trans. Very Large Scale Integration (VLSI) Syst. 18(4), 616-629 (2010)
7. Botella, G., Ros, E., Rodriguez, M., Garcia, A., Romero, S.: Pre-processor for bioinspired optical flow models: a customizable hardware implementation. In: Proceedings of the IEEE Mediterranean Electrotechnical Conference (MELECON), Málaga, Spain, 93-96 (2006). doi:10.1109/MELCON.2006.1653044
8. Bruhn, A., Weickert, J., Schnörr, C.: Lucas/Kanade meets Horn/Schunck: combining local and global optic flow methods. Int. J. Comput. Vision 61(3), 211-231 (2005). doi:10.1023/B:VISI.0000045324.43199.43
9. Davis, J., Bobick, A.: The representation and recognition of human movement using temporal templates. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'97), 928-934 (1997). doi:10.1109/CVPR.1997.609439
10. Davis, J., Bradski, G.: Real-time motion template gradients using Intel CVLib. In: Proceedings of the ICCV Workshop on Frame-rate Vision, 1-20 (1999)
11.
Ghilani, C.D.: General Least Squares Method and Its Application to Curve Fitting and Coordinate Transformations, in Adjustment Computations: Spatial Data Analysis (Fifth Edition). Hoboken, NJ, USA: Wiley, Inc. (2010). doi:10.1002/9780470586266.ch22 12. Gupte, S., Masoud, O., Martin, R.F.K., Apanikolopoulos, N.P.: Detection and classification of vehicles. IEEE Trans. Intell. Transp. Syst. 3(1), 37–47 (2002). doi:10.1109/6979.994794 13. Habib, Ayman F., Morgan, Michel F.: Automatic calibration of low-cost digital cameras. Opt. Eng. 42(4), 948–955 (2003). doi:10.1117/1.1555732 14. Hamid, N.F.A., Ahmad, A.: Calibration of high resolution digital camera based on different photogrammetric methods. Proceed- ings of the 8th International Symposium of the Digital Earth Fig. 11 Capture of the moving rigid object with the reference point identified worngly 844 J Real-Time Image Proc (2016) 11:829–846 123 http://dx.doi.org/10.1109/34.845377 http://dx.doi.org/10.1109/MELCON.2006.1653044 http://dx.doi.org/10.1023/B:VISI.0000045324.43199.43 http://dx.doi.org/10.1023/B:VISI.0000045324.43199.43 http://dx.doi.org/10.1109/CVPR.1997.609439 http://dx.doi.org/10.1002/9780470586266.ch22 http://dx.doi.org/10.1109/6979.994794 http://dx.doi.org/10.1117/1.1555732 (ISDE8), IOP Publishing on IOP Conf. Series. Earth Environ Sci 18, 1–6 (2014). doi:10.1088/1755-1315/18/1/012030 15. Kim, K., Chalidabhongse, T.H., Harwood, D., Davis, L.: Real- time foreground-background segmentation using codebook model. Real-Time Imaging 11(3), 167–256 (2005). doi:10.1016/j. rti.2004.12.004 16. Kraus K.: Photogrammetry Vol. 1. Fundamentals and Standard Processes (4th edition). Bonn, Germany: Ferdinand Dummlers (1993) 17. Li, R., Niu, X., Liu, C., Wu, B., Deshpande, S.: Impact of imaging geometry on 3d geopositioning accuracy of stereo ikonos imagery. Photogramm. Eng. & Remote Sens. 75(9), 1119–1125 (2009) 18. Lucas, B., Kanade, T.: An iterative image registration technique with an application to stereo vision. In: Proceedings of the Sev- enth International Joint Conference on Artificial Intelligence, vol.2, pp. 674–679. Vancouver (1981) 19. Novak, K.: Rectification of digital imagery. Photogramm. Eng. and Remote Sens. 58(3), 339–344 (1992) 20. Petrie, G., El Niweri, A.E.H.: The applicability of space imagery to the small scale topographic mapping of developing countries: a case study—the Sudan. ISPRS J. Photogramm. Remote Sens. 47(1), 1–42 (1992). doi:10.1016/0924-2716(92)90002-Q 21. Soh, J., Chun, B.T., Wang, M.: Analysis of road image sequences for vehicle counting. IEEE Trans. Intell. Transp. Syst. 1, 679–683 (1995). doi:10.1109/ICSMC.1995.537842 22. Toutin, T.: Review article: Geometric processing of remote sensing images: models, algorithms and methods. Int. J. Remote Sens. 25(10), 1893–1924 (2004). doi:10.1080/014311603100010 1611 23. Tsai, R.Y.: A versatile camera calibration technique for high- accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses. IEEE J. Robot. Autom. 3(4), 323–344 (1987). doi:10.1109/JRA.1987.1087109 24. Weng, J., Cohen, P., Herniou, M.: Camera calibration with dis- tortion models and accuracy evaluation. IEEE Trans. Pattern Anal. Mach. Intell. 14(10), 965–980 (1992). doi:10.1109/34. 159901 25. Wolberg, J.: Data analysis using the method of least squares: extracting the most information from experiments. Berlin and Heidelberg GmbH & Co. KG, Germany: Springer (2005) 26. Wong, K.W.: Basic mathematics of photogrammetry. In: Slama, C.C. (ed.) Manual of Photogrammetry (4th edition), pp. 
37–101. ASP Publishers, Falls Church (1980) 27. Yan, Y., Yancong, S., Zengqiang, M.: Research on vehicle speed measurement by video image based on Tsai’s two stage method. In: proceedings of the 5th International Conference on Computer Science and Education, 502–506 (2010). doi:10.1109/ICCSE. 2010.5593565 28. Zhiwei, H., Yuanyuan, L., Xueyi, Y.: Models of vehicle speeds measurement with a single camera. Proceedings of the International Conference on Computational Intelligence and Security Workshops, 283–286 (2007). doi:10.1109/CISW.2007. 4425492 Danilo Filitto master in Com- puter Science by State Univer- sity of Maringá (UEM), postgraduate in Computer Net- works and Data Communication by State University of Paraná (UEP), bachelor in Computer Science by University of Oeste Paulista (UNOESTE). Acts in academia as a professor since 2006, teaching at the Union of Educational Institutions of São Paulo (UNIESP—Presidente Prudente Campus), and in the National Commercial Training Service (Senac—Presidente Prudente Campus). Area of Research/ Expertise: Software Development, Data Structure, Digital Image Processing, Computer Networks. Júlio Kiyoshi Hasegawa Degree in Cartographic Engi- neering by State University Paulista Júlio de Mesquita Filho (UNESP—Presidente Prudente Campus), master in Geodetic Sciences by Federal University of Paraná (UFPR) and doctorate in Electrical Engineering by State University of Campinas (UNICAMP). Currently acts as adjunct professor at the State University Paulista Júlio de Mesquita Filho (UNESP—Pres- idente Prudente Campus), with experience in Geosciences with an emphasis in Photogrammetry, acting on the following topics: phototriangulation, exterior orienta- tion, photogrammetry, digital photogrammetry and GPS. Airton Marco Polidório degree in Chemical Engineering from the State University of Maringá (UEM), master in Electrical Engineering and Industrial Informatics by Federal Techno- logical University of Paraná (UTFPR), and doctorate in Cartography by State University Paulista Júlio de Mesquita Filho (UNESP-Presidente Prudente Campus). Currently acts as adjunct professor at the State University of Maringá (UEM), with experience in Pattern Rec- ognition, Computer Vision, and Image Processing. J Real-Time Image Proc (2016) 11:829–846 845 123 http://dx.doi.org/10.1088/1755-1315/18/1/012030 http://dx.doi.org/10.1016/j.rti.2004.12.004 http://dx.doi.org/10.1016/j.rti.2004.12.004 http://dx.doi.org/10.1016/0924-2716(92)90002-Q http://dx.doi.org/10.1109/ICSMC.1995.537842 http://dx.doi.org/10.1080/0143116031000101611 http://dx.doi.org/10.1080/0143116031000101611 http://dx.doi.org/10.1109/JRA.1987.1087109 http://dx.doi.org/10.1109/34.159901 http://dx.doi.org/10.1109/34.159901 http://dx.doi.org/10.1109/ICCSE.2010.5593565 http://dx.doi.org/10.1109/ICCSE.2010.5593565 http://dx.doi.org/10.1109/CISW.2007.4425492 http://dx.doi.org/10.1109/CISW.2007.4425492 Nardênio Almeida Martins master in Electrical Engineering at Federal University of Santa Catarina (UFSC), and doctorate in Automation and Systems Engineering at Federal Univer- sity of Santa Catarina (UFSC). Currently acts as adjunct pro- fessor at the State University de Maringá (UEM), with research concentrated in the areas of control dynamical systems, robot manipulators, mobile robots and control mechatronic systems. 
Franklin César Flores doctorate in Electrical Engineering by the State University of Campinas (UNICAMP), master in Computer Science by the University of São Paulo (USP), and bachelor in Computer Science by the State University of Maringá (UEM). Currently acts as adjunct professor in the Informatics Department of the State University of Maringá (DIN-UEM), with research interests in Digital Image Processing, Computer Vision and Mathematical Morphology.