Effect of stochastic transition in the fundamental diagram of traffic flow

In this work, we propose an alternative stochastic model for the fundamental diagram of traffic flow with a minimal number of parameters. Our approach is based on a mesoscopic viewpoint of the traffic system in terms of the dynamics of vehicle speed transitions. A key feature of the present approach lies in its stochastic nature, which makes it possible to study not only the flow-concentration relation, namely, the fundamental diagram, but also its uncertainty, namely, the variance of the fundamental diagram, an important characteristic of the observed traffic flow data. It is shown that in the simplified versions of the model consisting of only a few speed states, analytic solutions for both quantities can be obtained, which facilitates the discussion of the corresponding physical content. We also show that the effect of vehicle size can be included in the model by introducing the maximal congestion density $k_{max}$. By making use of this parameter, the free flow region and the congested flow region are naturally divided, and the transition is characterized by the capacity drop at the maximum of the flow-concentration relation. The model parameters are then adjusted to the observed traffic flow in the I-80 Freeway Dataset in the San Francisco area from the NGSIM program, where both the fundamental diagram and its variance are reasonably reproduced. Despite its simplicity, we argue that the current model provides an alternative description of the fundamental diagram and its uncertainty in the study of traffic flow.

In this work, we propose an alternative stochastic model for the fundamental diagram of traffic flow with a minimal number of parameters. Our approach is based on a mesoscopic viewpoint of the traffic system in terms of the dynamics of vehicle velocity transitions. A key feature of the present approach lies in its stochastic nature, which makes it possible to describe not only the flow-concentration relation, the so-called fundamental diagram in traffic engineering, but also its variance, an important ingredient in the observed data of traffic flow. It is shown that the model can be seen as a derivative of the Boltzmann equation when assuming a discrete velocity spectrum. The latter assumption significantly simplifies the mathematics and therefore facilitates the study of its physical content through the analytic solutions. The model parameters are then adjusted to reproduce the observed traffic flow on the "23 de maio" highway in the Brazilian city of São Paulo, where both the fundamental diagram and its variance are reasonably reproduced. Despite its simplicity, we argue that the current model provides an alternative description of the fundamental diagram in the study of traffic flow.

I. INTRODUCTION
Aside from its complexity and nonlinearity, traffic flow modeling has long attracted the attention of physicists due to its connections to transport theory and hydrodynamics (for reviews, see for example [1][2][3][4]). Corresponding to the three main scales of observation in physics, traffic flow models can generally be categorized into three classes, namely, microscopic, mesoscopic and macroscopic approaches. The macroscopic models [5][6][7][8][9][10][11][12] describe the traffic flow at a high level of aggregation: the system is treated as a continuous fluid without distinguishing its individual constituent parts. In this approach, the traffic stream is represented in terms of macroscopic quantities such as flow rate, density and velocity. Many methods of conventional hydrodynamics can thus be directly borrowed for the investigation of traffic flow. For instance, one may discuss shock waves [5,6], the stability of the equation of motion [8,11], or investigate the role of viscosity [12], in analogy to a real fluid. Mathematically, the problem is thus expressed in terms of a system of partial differential equations. The microscopic approach, on the other hand, deals with the space-time behavior of each individual vehicle as well as their interactions at the most detailed level. In this case, an ordinary differential equation is usually written down for each vehicle. Owing to the mathematical complexity, approximations are commonly introduced in order to obtain asymptotic solutions or to make the problem less computationally expensive. The car-following models [13][14][15][16][17][18][19] and the cellular automaton models [20][21][22][23] can both be seen as microscopic approaches in this context. For certain cases, such as Greenberg's logarithmic model [7,14], the above two approaches were shown to be equivalent in reproducing the fundamental diagram of traffic flow.
A mesoscopic model [24][25][26][27] seeks a compromise between the microscopic and the macroscopic approaches. The model attempts neither to distinguish nor to trace individual vehicles; instead, it describes traffic flow in terms of the vehicle distribution density as a continuous function of time, spatial coordinates and velocities. The dynamics of the distribution function, following the methods of statistical mechanics [28], is usually determined by an integro-differential equation such as the Boltzmann equation. Most mesoscopic models are derived in analogy to gas-kinetic theory. As hydrodynamics is known to be derivable from the Boltzmann equation [29][30][31], mesoscopic models for traffic flow have also been used to obtain the corresponding macroscopic equations [10,12]. These efforts thus provide a sound theoretical foundation for macroscopic models, beyond heuristic arguments and lax analogies between traffic flow and ordinary fluids.
One important empirical measurement for a long homogeneous freeway system is the so-called "fundamental diagram" of traffic flow. It plots the vehicle flow $q$ as a function of the vehicle density $k$. In a macroscopic theory, when the dynamics of the system is determined by an Euler-like or Navier-Stokes-like equation of motion, the fundamental diagram can be derived. Alternatively, one may use the fundamental diagram as an input, together with the conservation of vehicle flow and the initial conditions, to determine the temporal evolution of the system. Also, the equation of motion of either the microscopic or the mesoscopic model can be employed to calculate the fundamental diagram. The resulting theoretical estimates from any of the above approaches can then be compared to the empirical observations, which have been accumulated on highways in different countries for nearly eight decades (see for instance refs. [2,32,33]). The following common features are observed in most data: (1) the flow-concentration curve is usually divided into two regions of lower and higher vehicle density, which correspond to "free" and "congested" flow, respectively; (2) the maximum of the flow occurs at the junction between the free and congested regions; and (3) congested flow in general presents a broader scattering of the data points on the flow-concentration plane than free flow. In other words, the variance of the flow is relatively small for free traffic, it increases as the vehicle density increases, and eventually the system becomes unstable or chaotic toward the onset of traffic congestion. For this very reason, many authors regard the transition from free traffic to congestion as a phase transition.
Though most traffic flow models are able to reproduce the main features of the observed fundamental diagram, the variance of the traffic flow is less discussed in the literature, in particular in terms of analytic models accompanied by data analysis. This motivated us to carry out the present study. In this work, we introduce a simple mesoscopic model for the traffic flow based on stochastic differential equations (SDE). The equation of motion of the model governs the distribution of vehicles among different velocity states. In addition to the conventional transition terms, a stochastic transition is introduced in order to describe the stochastic nature of traffic flow. We show that in our model analytic solutions can be obtained not only for the expected values of the velocity and the traffic flow, but also for their variances. These analytic solutions are then compared to the empirical data. The paper is organized as follows. In the next section, we introduce our transport model, which features a velocity spectrum and the corresponding transition dynamics among different velocity states. To show the essence of the approach, the model is then simplified to consider only two velocity states. The resulting two-velocity-state model is discussed in detail in section III, where we derive the analytic solutions for the flow-concentration curve and its variance. The physical content of these solutions is discussed. In order to compare with the data, a chi-square fit of the model parameters is carried out in section IV, and the results are presented together with the measured data from the "23 de maio" highway in the Brazilian city of São Paulo. The last section is devoted to concluding remarks and perspectives.

II. A STOCHASTIC TRANSPORT MODEL WITH DISCRETE VELOCITY SPECTRUM
Let us consider a section of highway where the spatial distribution of the vehicles is homogeneous. For simplicity, one only considers discrete values for the velocity, namely $v_1, v_2, \cdots, v_D$, and denotes the number of vehicles travelling in the velocity state $v_i$ by $n_i$. In time, a vehicle with velocity $v_i$ may transit to another state $v_j$ according to a set of SDEs [34], Eq. (2), where the velocity transition on the r.h.s. of the equation is a sum of two contributions: besides the deterministic transition measured by the transition rate $c_{ij}$, one also allows for some randomness by introducing the stochastic transition rates $s_j$, where $w_j$ is white noise. To be specific, $c_{ij}$ measures the average rate for a vehicle with velocity $v_j$ to transit to another state with velocity $v_i$. For $j = i$, the coefficient $-c_{ii}$ gives the rate at which vehicles originally travelling with $v_i$ change their velocity. Note that these transition rates are not necessarily constants. In this model, the differential formalism is to be understood in terms of the Itô interpretation [34]. The physical content of these transition coefficients is further discussed below in Section IV. Obviously, for any physical solution, $n_i$ must be bounded from above and below. In our model, the measured vehicle velocity $v$ is defined by
$$ v = \frac{1}{N}\sum_{i=1}^{D} n_i v_i, \qquad (3) $$
where $N$ is the total number of vehicles. Consequently, the traffic flow $q$ is defined as the product of the velocity $v$ and the vehicle (linear) density $k$,
$$ q = k v. \qquad (4) $$
One notes that the transition coefficients $c_{ij}$ can be seen as the element in the $i$-th row and $j$-th column of a $D \times D$ matrix $c$. When one is only interested in the time evolution of the expected value of $n_i$, the stochastic transition terms can usually be ignored (see for example [34] for a discussion of the conditions) and consequently Eq. (2) can be written as
$$ \frac{d\langle n_i \rangle}{dt} = \sum_{j=1}^{D} c_{ij} \langle n_j \rangle. \qquad (5) $$
Now, for a closed road system (a system which satisfies a periodic boundary condition), the total number of vehicles is conserved, i.e.
$$ \sum_{i=1}^{D} n_i = N = \mathrm{const}. \qquad (6) $$
By using Eq. (6), it is straightforward to show that
$$ \sum_{i=1}^{D} c_{ij} = 0, \qquad (7) $$
which means that the matrix $c_{ij}$ is singular. Therefore, one may explicitly express $n_D$ in terms of $n_i$ ($i = 1, \cdots, D-1$) and rewrite the equations for $n_i$ ($i = 1, \cdots, D-1$) in terms of the first $D-1$ rows of $c_{ij}$, as in Eq. (8), where $\tilde{c}_{ij} \equiv c_{ij} - c_{iD}$. Similarly, one may again view $\tilde{c}_{ij}$ as the element in the $i$-th row and $j$-th column of a $(D-1) \times (D-1)$ matrix $\tilde{c}$ and rewrite the equation in matrix form, Eq. (9), where $\tilde{c}$ is non-singular, and the number of degrees of freedom of the system is thus $D-1$.
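The elimination of $n_D$ can be written out explicitly. The following is only a sketch for the expected values (brackets omitted), assuming that the deterministic part of the equation of motion has the linear form $dn_i/dt = \sum_j c_{ij} n_j$ described above; the stochastic terms are left out.

```latex
\begin{align}
n_D &= N - \sum_{j=1}^{D-1} n_j , \\
\frac{d n_i}{dt}
  &= \sum_{j=1}^{D-1} c_{ij}\, n_j + c_{iD}\Big( N - \sum_{j=1}^{D-1} n_j \Big)
   = \sum_{j=1}^{D-1} \tilde{c}_{ij}\, n_j + c_{iD}\, N ,
  \qquad i = 1, \cdots, D-1 ,
\end{align}
```

with $\tilde{c}_{ij} \equiv c_{ij} - c_{iD}$ as defined in the text. The constant term $c_{iD} N$ fixes the steady state, while $\tilde{c}$ alone controls how the system relaxes toward it.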
As an example, let us discuss the case where all the elements of the matrix $c$ are constant. The resulting problem is to diagonalize the $(D-1) \times (D-1)$ matrix $\tilde{c}$. A necessary condition for a physical solution is, therefore, that all eigenvalues of the matrix $\tilde{c}$ are negative. This is because any positive eigenvalue would imply that the number of vehicles in some state increases without bound in time, which is not physical owing to the vehicle number conservation introduced in Eq. (6). This model can be shown to be related to the Boltzmann equation approach; a brief discussion of the connection of the present model to the discrete limit of the Boltzmann equation is given in Appendix A.
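The eigenvalue condition can be checked numerically for a small system. The sketch below is illustrative only: the three-state rate matrix `c`, its numerical values and the helper names `reduced_matrix` and `eigenvalues_2x2` are assumptions made for this example and do not come from the model calibration. The check is on the real parts of the eigenvalues, which coincides with the condition stated above when the eigenvalues are real.

```python
import cmath


def reduced_matrix(c):
    """Return c-tilde with entries c[i][j] - c[i][D-1] for i, j < D-1."""
    d = len(c)
    return [[c[i][j] - c[i][d - 1] for j in range(d - 1)] for i in range(d - 1)]


def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix computed from its trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return ((tr + disc) / 2.0, (tr - disc) / 2.0)


# Hypothetical constant rates: c[i][j] is the rate from state j to state i.
# Every column sums to zero, so the total number of vehicles is conserved.
c = [
    [-0.5, 0.2, 0.1],
    [0.3, -0.6, 0.4],
    [0.2, 0.4, -0.5],
]
column_sums = [sum(c[i][j] for i in range(3)) for j in range(3)]

c_tilde = reduced_matrix(c)
eigs = eigenvalues_2x2(c_tilde)
is_stable = all(e.real < 0 for e in eigs)
print(column_sums, eigs, is_stable)
```

For larger $D$ the same check would need a general eigenvalue routine; the closed-form 2x2 case is used here only to keep the example dependency-free.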

III. A SIMPLIFIED TWO-VELOCITY-STATE MODEL
In order to show that the present model can describe the main features of the observed traffic flow data, we proceed to discuss the simplest case, where one considers only two velocity states. The equation of motion of this simplified model is given by Eq. (10), where $w_1$ and $w_2$ are independent white noises and, again, the Itô interpretation is assumed. One has $n_1 + n_2 = N$, $p_{11} = p_{21}$ and $p_{22} = p_{12}$ due to the normalization condition, Eqs. (6-7). Note that a factor $N^{\alpha}$ is introduced in the transition coefficients $c_{12}$ and $c_{22}$ to measure the asymmetry between the two velocity states, with $\alpha$ being an adjustable parameter. As discussed in the following, $\alpha > 1$ will be taken. The form of Eq. (10) is common in applications of SDE [34,35]. Owing to the simplicity of the model, its analytic solutions can be obtained straightforwardly. First, by ignoring the stochastic transition terms, one obtains the solution for the expected value of $n_i$, Eq. (11), expressed in terms of the initial value $n_1(0) \equiv n_1(t = 0)$. The steady-state solution, Eq. (12), is obtained by taking the limit $t \to \infty$. The measured vehicle velocity $v$ in this case is given by Eq. (13), and its expected value by Eq. (14). In the literature, the above result is usually expressed in terms of the traffic flow $q$ as a function of the vehicle density $k$; the latter can be written as
$$ k = \frac{N}{L}, \qquad (15) $$
where $L$ is the length of the highway section in question. Therefore, by using Eq. (4), one obtains the expected value of the traffic flow $q$, Eq. (16). One sees that when $\alpha > 1$, the vehicles have a tendency to transit to the low-velocity state when the total number $N$ increases, which is consistent with common sense: the average velocity tends to decrease when the load on the highway system becomes heavier.
In the left panel of Fig. 1, we show a sketch of the resulting fundamental diagram of the two-velocity-state model, obtained by assuming a set of rather trivial parameters in Eqs. (14)-(16). The main features of the flow-concentration curve are naturally reproduced. The parameter $\alpha$ controls the shape of the curve, and it seems that $\alpha > 3$ mostly gives qualitatively good agreement with the data, as was also observed in many other cases [32,33].
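To make the steady-state flow-concentration relation concrete, the following sketch evaluates it numerically. It assumes that the deterministic part of the two-state equation of motion is $dn_1/dt = -p_{11} n_1 + p_{22} N^{\alpha} n_2$, which is how the description above is read here, so that the steady state is $n_1 = p_{22} N^{\alpha+1} / (p_{11} + p_{22} N^{\alpha})$. The function names and all parameter values are hypothetical and chosen only for illustration; they are not the fitted values of Table I.

```python
def steady_state_n1(n_total, p11, p22, alpha):
    """Expected number of vehicles in the slow state v1 as t -> infinity."""
    weight = p22 * n_total ** alpha
    return n_total * weight / (p11 + weight)


def expected_flow(k, length, v1, v2, p11, p22, alpha):
    """Expected flow q = k * <v> for the two-velocity-state model."""
    if k == 0.0:
        return 0.0  # empty road: no vehicles, no flow
    n_total = k * length  # Eq. (15): k = N / L
    n1 = steady_state_n1(n_total, p11, p22, alpha)
    n2 = n_total - n1
    mean_velocity = (n1 * v1 + n2 * v2) / n_total
    return k * mean_velocity


# Illustrative values only: L in km, velocities in km/h, density in 1/km.
params = dict(length=1.0, v1=0.0, v2=100.0, p11=1.0, p22=1.0e-6, alpha=3.0)
densities = [10.0 * i for i in range(0, 21)]  # 0 to 200 vehicles per km
flows = [expected_flow(k, **params) for k in densities]
k_at_max = densities[flows.index(max(flows))]
print(k_at_max, max(flows))
```

With these values the flow rises almost linearly at low density, passes through a single maximum and then decreases, which is the qualitative shape discussed in the text.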
As mentioned above, one important feature of the current approach is that the stochastic transitions introduce uncertainties in the vehicle numbers, which in turn cause the vehicle flow to fluctuate around its mean value. The main features of the uncertainties of the traffic flow in the data are well known [32,33], but less discussed on the theoretical side. In the present model, we are able to calculate the variance of the flow-concentration curve analytically. Following the standard procedure of Itô calculus (see Appendix B for details), one finds that the variance of the measured vehicle velocity satisfies Eq. (17). As observed in the data, the variance of the velocity is very small at small concentration. This can be understood as follows: when there are very few vehicles on the highway, all of them tend to move at the upper speed limit, and thus the variance is negligible. In our model, for $\alpha \geq 0$, one finds that the variance goes to zero when $k \to 0$. If $\alpha < 0$, the variance diverges at small concentration, which makes the model unrealistic. On the other hand, at very high density, all the vehicles tend to occupy the lowest velocity state, corresponding to $v_1$. It is easy to imagine that this corresponds to the case of a complete traffic jam, when all the vehicles are forced to stop, and consequently the variance of the velocity also goes to zero. When $\alpha > 1$, one has $k_c$ [...]. This is another motivation for our choice of the value of $\alpha$. It is worth noting that the above features of our model come out quite naturally. In the right panel of Fig. 1, we show a sketch of the variance of the traffic flow in our model.
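The behavior of the flow variance can be explored numerically from the closed-form expression quoted at the end of the appendix. The sketch below assumes that this expression reads $\mathrm{Var}[q] = \frac{(v_1 - v_2)^2}{L^2} \frac{p_{11} p_{22} (Lk)^{\alpha+1}}{(p_{11} + p_{22} (Lk)^{\alpha})^2}$; the density of its maximum follows from setting the derivative with respect to $k$ to zero, which has a solution only for $\alpha > 1$. Function names and parameter values are hypothetical.

```python
def flow_variance(k, length, v1, v2, p11, p22, alpha):
    """Steady-state variance of the flow q (assumed closed form, see lead-in)."""
    n_total = k * length
    var_n1 = (p11 * p22 * n_total ** (alpha + 1)
              / (p11 + p22 * n_total ** alpha) ** 2)
    return (v1 - v2) ** 2 * var_n1 / length ** 2


def variance_peak_density(length, p11, p22, alpha):
    """Density at which the assumed Var[q] is maximal (requires alpha > 1)."""
    if alpha <= 1:
        raise ValueError("Var[q] has no interior maximum for alpha <= 1")
    return ((alpha + 1) * p11 / ((alpha - 1) * p22)) ** (1.0 / alpha) / length


# Illustrative values only, not the fitted parameters of the paper.
params = dict(length=1.0, v1=0.0, v2=100.0, p11=1.0, p22=1.0e-6, alpha=3.0)
k_peak = variance_peak_density(
    params["length"], params["p11"], params["p22"], params["alpha"]
)
var_at_peak = flow_variance(k_peak, **params)
print(k_peak, var_at_peak)
```

Under the same assumptions and with $v_1 = 0$, the expected flow peaks where $p_{22} (Lk)^{\alpha} = p_{11} / (\alpha - 1)$, so the variance maximum lies at a density larger by a factor $(\alpha + 1)^{1/\alpha}$.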

IV. MODEL CALIBRATION AND DATA ANALYSIS
In this section, the parameters of our model are adjusted to reproduce the observed flow-concentration data and its variance. We make use of the data from the "23 de maio" highway in the Brazilian city of São Paulo [36], collected by the company of traffic engineering (Companhia de Engenharia de Tráfego, CET) of the state of São Paulo between 2009 and 2010. The data were collected by speed sensors in the time interval between 07:00 and 19:00. Since the size of each vehicle is also measured, we only include in our analysis vehicles between 3 and 5 meters in length. The reason is that in São Paulo, small vehicles (such as motorcycles) and large vehicles (such as commuter buses) may behave very differently from other automobiles. In Fig. 2 and Fig. 3, we show the raw data as well as the results from the two-velocity-state model. The resulting curves of our model are obtained by using the parameters in Table I, where chi-square fitting was employed. One sees that the data show the main features observed in the fundamental diagram: the flow increases from zero as the density of vehicles increases, reaches its maximum and then starts to decrease; meanwhile, the flow variance also increases from zero with increasing density as the traffic starts to build up, and it attains its maximum at a higher density than that of the flow. Unfortunately, the data set employed in this study has little statistics in the high-density region, as both the flow and its variance show larger uncertainty when $k \geq 220$ (1/km).
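As a minimal illustration of what a chi-square fit of this kind involves, the sketch below scans a single parameter, $\alpha$, against synthetic flow values generated from the model itself. Everything here is an assumption made for the example: the functional form of `model_flow` (the steady-state flow of the two-state model with the deterministic rates read as in Section III), the grid scan, the unit uncertainties and the synthetic data. It does not reproduce the calibration procedure or the measurements used in this work, where several parameters are adjusted to both the flow and its variance.

```python
def model_flow(k, v2, p11, p22, alpha, length=1.0, v1=0.0):
    """Assumed steady-state flow of the two-velocity-state model."""
    weight = p22 * (k * length) ** alpha
    return k * (v1 * weight + v2 * p11) / (p11 + weight)


def chi_square(observed, sigma, predicted):
    """Sum of squared residuals weighted by the measurement uncertainties."""
    return sum(((o - p) / s) ** 2 for o, s, p in zip(observed, sigma, predicted))


def fit_alpha(densities, flows, sigma, candidates, **fixed):
    """Return the candidate alpha with the smallest chi-square, and that value."""
    best_alpha, best_chi2 = None, float("inf")
    for alpha in candidates:
        predicted = [model_flow(k, alpha=alpha, **fixed) for k in densities]
        chi2 = chi_square(flows, sigma, predicted)
        if chi2 < best_chi2:
            best_alpha, best_chi2 = alpha, chi2
    return best_alpha, best_chi2


# Synthetic, noise-free "measurements" generated with alpha = 3 (illustration only).
fixed = dict(v2=100.0, p11=1.0, p22=1.0e-6)
densities = [20.0 * i for i in range(1, 11)]  # 20 to 200 vehicles per km
flows = [model_flow(k, alpha=3.0, **fixed) for k in densities]
sigma = [1.0] * len(densities)

best_alpha, best_chi2 = fit_alpha(
    densities, flows, sigma, [2.0, 2.5, 3.0, 3.5, 4.0], **fixed
)
print(best_alpha, best_chi2)
```

A realistic calibration would replace the grid scan with a multi-parameter minimizer and use the measured scatter of the data for `sigma`.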

In Fig. 3, the blue curves and error bars show the results from our model. One sees that the data are well reproduced, and their qualitative trend is consistent with the discussion in Section III. Note that at high density, after the flow reaches its maximum and starts to decrease, it increases again around the density $k \sim 200$ (1/km). This feature of the data is somewhat peculiar, but similar to what was described in ref. [37] (see Fig. 3 of the reference) in a different context. Since it is difficult to draw any conclusion without better statistics, we implemented two different schemes in our model calibration. The first set of parameters, P1, treats $v_1$ as a free parameter, and as a result, the increase of the flow at high density is reproduced by the fit. In the second set of parameters, P2, one considers that the increase of the flow at large density is purely due to fluctuations and the lack of statistics; therefore one simply assumes that $v_1 = 0$, so that the flow always decreases once it has reached its maximum. It is found that the parameter set P1 agrees better with the empirical measurements. The set P2 shows some difficulty in reproducing the data, especially for the variance of the velocity in the region of larger concentration. However, taking into account the fact that the model only considers two velocity states, one may argue that the general trend of the data is reproduced reasonably well. It is also interesting to note that the lack of statistics at large density is partially due to the fact that the time window of the data set does not include the whole period of peak hours, when congestion usually takes place. It would therefore be interesting to carry out further analysis with a data set which includes measurements at peak hours.

V. CONCLUDING REMARKS
In this work, an alternative stochastic transport model is proposed to calculate the fundamental diagram of traffic flow and its variance. In order to show the physical content of our approach more transparently, we have focused on a simplified version of the model with a minimal number of parameters. It is shown that even in this two-velocity-state model, the stochastic nature of the model helps to capture the main features of the observed flow-concentration data. It is worth noting that in order to describe the transition from "free flow" to "congested flow", the current model does not introduce different rules for different regions of the flow. The "congested phase" in our model comes out naturally as the variance of the traffic flow grows with increasing flow, by solving a uniform set of SDEs. The model is then put to the test by calibration against the observed traffic flow data on the "23 de maio" highway in the Brazilian city of São Paulo, where both the fundamental diagram and its variance are reasonably reproduced. Although SDEs have many applications in applied mathematics, to the best of our knowledge very few applications are found in the area of traffic flow. In fact, this is the very motivation of our work. The nature of "randomness" or "stochastic processes" in traffic flow has been explored by many authors, but most of the approaches do not involve the mathematical concept of "signal noise" (such as white noise or Brownian noise, which are by their nature more realistic in modeling random processes) and are usually heavily based on numerical solutions. In the present work, the approach is based on analytic solutions for the expected value and variance of an SDE in the Itô interpretation. Though the stochastic transition term plays an important role in this study, we have deliberately avoided the discussion of some mathematical aspects of our approach.
For instance, the simple form of white noise is adopted to describe the uncertainty in vehicle transitions, where the corresponding SDE is interpreted in the Itô sense. However, it is well known that there are various forms of noise in stochastic calculus, and in fact it is not clear whether another form of stochastic noise might be more appropriate for the description of the physical system. Nor have we discussed whether Stratonovich stochastic calculus could be more convenient for our investigation. However, we argue that these features are not the focus of the present study, because the main goal of this work is to study qualitatively the effect of stochastic transitions by employing a simple model with a minimal number of parameters. Therefore, as a first step, it is of higher priority to reproduce the main features of the observed flow-concentration curve as well as its variance. In order to understand the physical content of the model more transparently, it is worthwhile to simplify the mathematics.
One important aspect which has not been discussed in the work is the stability of the traffic system. It was understood by many authors that the traffic congestion is closely connected to the stability of the equation of motion [8,19,38], or to the phase transition of the system [1,21,39,40]. Since our approach itself involves a mathematical description of uncertainty, it is quite natural to ask whether the question of stability of SDE [41][42][43][44] is related to that of traffic congestion. It is an interesting topic for further investigation.

VI. ACKNOWLEDGMENTS
Wei-Liang Qian is thankful for valuable discussions with Marcio Maia Vilela, José Aquiles Baesso Grimoni, Pasi Huovenin and Yogiro Hama. We are grateful to the company of traffic engineering (Companhia de Engenharia de Tráfego, CET) of the state of São Paulo, which generously provided the data used in our analysis. We acknowledge the financial support from Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP), Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG), Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), and Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES).

A. The connection to the Boltzmann Equation
In this section, we discuss the connection of the present model to a more general approach, the Boltzmann equation. In fact, the present model can be viewed as a special case of the gas-kinetic theory. The non-relativistic Boltzmann equation [25] reads
$$ \frac{\partial f}{\partial t} + v \frac{\partial f}{\partial x} + F \frac{\partial f}{\partial p} = \left( \frac{\partial f}{\partial t} \right)_{\mathrm{coll}}, \qquad (19) $$
where the number density $f = f(x, p, t)$ of the vehicles at position $x$ and momentum $p$ changes in time because (1) their positions change due to free streaming at velocity $v = dx/dt$; (2) their momenta are gradually altered by the external force $F = dp/dt$; and (3) their momenta are changed due to collisions. The collision terms usually differ from model to model. In the study of traffic flow, the external force $F$ is mostly ignored. Owing to its non-linear nature, the Boltzmann equation is rather difficult to solve, even numerically. In ref. [24], for instance, only the steady-state solution is discussed, where the solution is time independent, $f = f(x, p)$. In this work, on the other hand, owing to the nature of the data we compare with, we content ourselves with the space-independent solution $f = f(p, t)$, i.e., the homogeneous solution. As a first approximation, one may consider only two velocity states with $v_1 < v_2$. This simplifies the integro-differential equation to two coupled ordinary differential equations, namely
$$ \frac{d f_i}{dt} = \left( \frac{\partial f_i}{\partial t} \right)_{\mathrm{coll}}, \qquad i = 1, 2, \qquad (20) $$
where $f_i$ denotes the distribution in the velocity state $v_i$. If one further assumes that the collision terms on the right-hand side of the equation consist of two parts, the deterministic transition rates and the stochastic transition rates, Eq. (20) reduces to Eq. (10) in section III.
Compared with the Boltzmann approach in the literature, our model differs in some aspects. Firstly, we do not have the relaxation term on the r.h.s. of the Boltzmann equation as in [24]. Instead, the convergence of the solutions in time is achieved through the negative definite matrix $\tilde{c}$ introduced in Eq. (8). As a result, the present model behaves differently at small vehicle density in comparison with the traditional Boltzmann approach [25]. In the latter case, the velocity transition is mostly determined by the relaxation term, which is not present in our model. Secondly, to calculate the variance of the velocity or the flow in our model, one simply evaluates the variance of Eq. (3) or Eq. (4). This is also different from the traditional gas-kinetic approach such as refs. [10,26]. We understand that these differences simply come from the different interpretations of the model.
The above resulting expression is partly due to the fact that Var[n [...]. By substituting the concentration $k$, one arrives at the expression for the variance of the flow,
$$ \mathrm{Var}[q] = \frac{(v_1 - v_2)^2}{L^2} \, \frac{p_{11} p_{22} L^{\alpha+1} k^{\alpha+1}}{\left( p_{11} + p_{22} L^{\alpha} k^{\alpha} \right)^2}, $$
used in Section III.
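The substitution step can be spelled out as follows. This is a sketch under two assumptions that are not stated explicitly in the surviving text: that the flattened expression above is read with $L^2$ in the denominator of the prefactor, and that the steady-state variance of $n_1$ has the form implied by that reading.

```latex
\begin{align}
v &= \frac{n_1 v_1 + n_2 v_2}{N} = v_2 + \frac{(v_1 - v_2)\, n_1}{N},
   \qquad n_2 = N - n_1 , \\
\mathrm{Var}[v] &= \frac{(v_1 - v_2)^2}{N^2}\, \mathrm{Var}[n_1] , \\
\mathrm{Var}[q] &= k^2\, \mathrm{Var}[v]
   = \frac{(v_1 - v_2)^2}{L^2}\, \mathrm{Var}[n_1] ,
   \qquad k = \frac{N}{L} , \\
\mathrm{Var}[n_1] &= \frac{p_{11}\, p_{22}\, N^{\alpha+1}}{\left( p_{11} + p_{22} N^{\alpha} \right)^2}
   \;\Longrightarrow\;
   \mathrm{Var}[q] = \frac{(v_1 - v_2)^2}{L^2}\,
   \frac{p_{11}\, p_{22}\, L^{\alpha+1} k^{\alpha+1}}{\left( p_{11} + p_{22} L^{\alpha} k^{\alpha} \right)^2} .
\end{align}
```

The first three lines use only the definitions of $v$, $q$ and $k$ together with the conservation of $N$; the last line is where the assumed form of $\mathrm{Var}[n_1]$ enters.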