Fuel
Journal homepage: www.elsevier.com/locate/fuel

Full Length Article

Rapid and sensitive method for detecting adulterants in gasoline using ultra-fast gas chromatography and Partial Least Square Discriminant Analysis

Maurílio Gustavo Nespeca a,⁎, João Fernando Villarrubia Lopes Munhoz a, Danilo Luiz Flumignan b, José Eduardo de Oliveira a

a Center for Monitoring and Research of the Quality of Fuels, Biofuels, Crude Oil, and Derivatives (Cempeqc), Institute of Chemistry, São Paulo State University (UNESP), Prof. Francisco Degni 55, Zip Code 14800-060 Araraquara, SP, Brazil
b Federal Institute of Education, Science and Technology of São Paulo (IFSP), Estéfano D’avassi 625, Zip Code 1.991-502 Matão, SP, Brazil

Article info

Keywords: Gasoline adulteration; Ultra-fast gas chromatography; Partial Least Square Discriminant Analysis; Fuel quality control; Multivariate filters; Selectivity ratio

Abstract

In recent years, the Brazilian Fuel Quality Monitoring Program drastically reduced the number of analyzed fuel samples as a consequence of the current economic crisis in the country. The impoverishment of the monitoring program may lead to an increase in cases of gasoline adulteration; nonetheless, it also strengthens the search for faster and less costly methodologies for fuel quality monitoring. Thus, this study aimed to develop a rapid analytical method to detect the adulteration of gasoline with organic solvents through ultra-fast gas chromatography with flame ionization detector (UFGC-FID) associated with the supervised pattern recognition method Partial Least Square Discriminant Analysis (PLS-DA). The sample set consisted of 171 Brazilian common gasoline samples (i.e., with ethanol in their composition) and 171 adulterated gasoline samples prepared in the laboratory using 19 different solvents in the concentration range of 2–10% (v/v).
The chromatographic method required only 2.85 min and the chromatograms presented 125 peaks on average. The PLS-DA model was developed with 3 latent variables and provided correlation coefficients close to 0.99 and correct discrimination of 100% of calibration and validation samples. Therefore, the developed UFGC-FID/PLS-DA method provided a sensitive, fast and automated alternative method for the detection of adulterants in the monitoring of gasoline quality.

https://doi.org/10.1016/j.fuel.2017.11.032
Received 1 September 2017; Received in revised form 8 November 2017; Accepted 9 November 2017; Available online 21 November 2017
⁎ Corresponding author. E-mail address: mauriliogn@iq.unesp.br (M.G. Nespeca).
Fuel 215 (2018) 204–211
0016-2361/© 2017 Elsevier Ltd. All rights reserved.

1. Introduction

Brazil has experienced the biggest recession in its history and several sectors of production and services were inevitably affected by the economic crisis [1]. In 2015, the Fuel Quality Monitoring Program (PMQC) was also subject to budget cuts and, as a consequence, the number of fuel samples analyzed per year fell by 80% [2]. In the same year, the number of nonconforming samples, i.e. samples with physicochemical parameters outside the specifications regulated by the National Petroleum Agency (ANP), increased by 0.7%, which is the largest increase since the PMQC was created [3].

The composition of gasoline has great importance in engine performance and the emission of pollutants [4].
Gasoline is a complex mixture consisting mainly of paraffinic, olefinic and aromatic hydrocarbons ranging from 4 to 12 carbon atoms and, in lower concentration, substances containing oxygen and sulfur [5]. The final composition of the gasoline depends on several factors such as the nature of the crude oil, the process that was used to obtain it, the presence of additives (detergents, dispersants, octane improvers, etc.) and the local legislation that specifies the maximum content of methanol, benzene, sulfur, aromatic and olefin hydrocarbons [6,7]. In Brazil, anhydrous ethanol is a mandatory additive to replace tetraethyl lead in gasoline composition [6,8] and its concentration varies between 20% (v/v) and 27% (v/v) according to economic and production factors [6,7,9].

Unfortunately, reducing fuel monitoring provides greater opportunities for illegal adulteration of gasoline. Due to the complexity of gasoline composition, several miscible solvents can be added to this fuel without causing major changes in its physicochemical properties [10,11]. Gasoline adulteration is commonly practiced using low-cost solvents, such as kerosene, white spirit, naphtha, thinner, rubber solvent, or lower-value fuels such as diesel and ethanol; however, the use of hexane, toluene, and xylenes has also been reported [6,12,13]. Therefore, effective quality control of commercial gasoline is highly necessary in today's economic scenario, since the addition of these compounds or mixtures to gasoline can lead to engine malfunction, increased fuel consumption, tax evasion, and environmental damage caused by the intensification of CO and NOx emissions [12–15].

The quality of the gasoline can be verified by several routine tests. For example, distillation curves may reveal whether the gasoline has been adulterated with higher or lower boiling solvents. According to Takeshita et al.
(2008) [6], the presence of only 2% (v/v) of diesel fuel in gasoline can be easily detected by the final boiling point (FBP) of the distillation curve. However, a gasoline sample can present nonconforming results even without the addition of adulterants, and an adulterated sample can present physicochemical parameters within specifications [16,17]. For this reason, the ANP invested in the strategy of adding an isotopic marker to the solvents that are sold in Brazil [12]. If the gasoline sample presented the marker in the result of the official gas chromatographic (GC) method, the gasoline had been adulterated with some type of solvent [10]. Although marker monitoring has been an efficient methodology, the periodic insertion of new markers into the solvent market requires large financial resources and, additionally, marker detection is a laborious process since the chromatographic analysis requires about 20 min per sample [10,12,17].

Many researchers have dedicated efforts to developing alternative methods to improve gasoline quality monitoring. Many of these methods are based on spectroscopic techniques, such as FTIR [9,14,17–19], NIR [9,20,21], HS-MS [5,20], NMR [22–24], Raman [25,26], or GC methods [11,12,16,27–33] associated with chemometric techniques. However, few studies are devoted to detecting adulterants in gasoline through alternative methods [12,14,16–18,27,34]. Tanaka et al. (2011) [34] attempted to discriminate adulterated and unadulterated samples using the gasoline physicochemical parameters, such as atmospheric distillation temperatures, research octane number (RON) and composition; however, the SIMCA model correctly classified only 77.1% of the prediction samples. The FTIR/LDA method developed by Pereira et al. (2006) [17] was able to correctly classify 96% of the samples adulterated with four different solvents (kerosene, thinner, light and heavy naphtha). The FTIR/SIMCA method of Teixeira et al.
(2008) [14] obtained satisfactory results with correct discrimination of 100% of samples adulterated with diesel, kerosene, turpentine, and thinner, in the concentration range of 0–50% (v/v). Skrobot et al. (2005, 2007) [16,27] carried out two studies on the identification of adulterants (thinner, kerosene, light and heavy naphtha) in gasoline using gas chromatography with flame ionization detector (GC-FID) combined with unsupervised (HCA and PCA) and supervised (KNN and LDA) pattern recognition methods. The GC-FID analysis required 75 min for each sample and the supervised methods presented low sensitivity (89.3%); therefore, the method was unsuitable for routine quality monitoring [16,27]. Pedroso et al. (2008) [12] developed a sophisticated method for quantifying adulterants (white spirit, kerosene, and paint thinner) using comprehensive two-dimensional gas chromatography with flame ionization detection (GC×GC-FID) and second-order multivariate calibration (multi-way partial least squares regression, N-PLS). The second-order data obtained from 40 min chromatographic analyses resulted in models with accuracy (RMSEP) ranging from 3.3% (v/v) to 8.2% (v/v), depending on the adulterant. In that study, no pattern recognition models were developed to detect the presence of adulterant prior to quantification [12].

The use of fast, automated and accurate methods for the routine analysis of fuel quality is crucial for the improvement of monitoring programs since it reduces analysis time, expenditure on inputs and manual labor. In this perspective, this paper proposes an analytical method for the detection of several adulterants in gasoline using ultra-fast gas chromatography with flame ionization detector (UFGC-FID) associated with a supervised pattern recognition method, Partial Least Square Discriminant Analysis (PLS-DA).

2. Material and methods

2.1.
Samples

The gasoline samples used in this work were acquired from a laboratory specialized in fuel analysis, Cempeqc (Center for Monitoring and Research of the Quality of Fuels, Biofuels, Crude Oil and Derivatives), which collects fuel samples in the state of São Paulo, Brazil. From 424 analyzed samples of gasoline, we selected 171 samples to construct a representative sample set of conforming and nonconforming gasoline. Since the physicochemical parameters may be out of specification even without the adulteration of gasoline, it was important to use a substantial number of nonconforming samples to avoid false positives after modeling. The physicochemical parameters of the 171 selected samples and the number of nonconformities are shown in Table 1.

Since the detection of low concentrations of solvent in gasoline is the greatest challenge for conventional methods [15], the adulterated sample set was prepared by adding solvents to the selected gasoline samples in the concentration range of 2–10% by volume. The adulteration was carried out with hydrocarbon (6 aliphatics; 5 aromatics; 1 cyclic; and 5 mixtures of hydrocarbon groups) and oxygenated (ether and dialcohol) solvents (Table 2). Nine samples were prepared using each solvent, totaling 171 adulterated samples.

2.2. Ultra-fast gas chromatographic analysis

Chromatograms of the 342 samples were obtained using an ultra-fast gas chromatograph (UFGC), Trace GC Ultra (Thermo Scientific), equipped with a direct resistive heating module, split/splitless injector, and high-frequency FID (300 Hz). The capillary chromatographic column used for separation was a Thermo Scientific PH5 (5% phenyl; 95% dimethylpolysiloxane, 5 m × 0.10 mm i.d. × 0.40 μm) surrounded by a resistor to provide high heating rates. The samples were injected by an AS3000 autosampler with a 5 μL syringe and the injection volume was 0.1 μL.
The temperatures of the injector and the detector were set at 270 °C. The flows of the gases used by the detector were 45 mL min−1 for hydrogen, 30 mL min−1 for nitrogen (make-up gas), and 350 mL min−1 for synthetic air. Hydrogen was used as carrier gas due to its high diffusion coefficient and its flow was kept constant at 0.5 mL min−1. As the samples were not diluted in any solvent prior to injection, the split ratio used was 1:500. The column temperature program started at 40 °C, held for 0.60 min, and rose at a rate of 100 °C min−1 to 250 °C, which was maintained for 0.15 min. The chromatographic analysis time was 2.85 min per sample.

2.3. Chemometric analysis

The data were collected using ChromQuest software (Thermo Scientific) and then transferred to Matlab 2013a (Mathworks) with PLS toolbox 7.3.1 (Eigenvector Research Inc.), where the alignment and chemometric analyses were performed using a Windows 8 PC running on an Intel Core i7 3632QM 2.20 GHz processor with 8.0 GB of RAM. The obtained chromatograms were exported as vectors for the modeling and, as the data acquisition rate was about 10 points per second, each chromatogram consisted of 1711 variables. Thus, the combination of all vectors resulted in a 342 × 1711 data matrix.

The first step prior to modeling was the alignment of the chromatograms. The displacements of the peaks were corrected using the Correlation Optimized Warping (COW) algorithm [35]. Firstly, the alignment requires a reference chromatogram containing peaks which show a relatively consistent maximum. The algorithm searches for peaks in the mean chromatogram by identifying positions with a clear inflection point as a peak maximum [36]. Then, the chromatograms are divided into windows, which are aligned with the respective reference chromatogram windows. Each window of the chromatogram is stretched or compressed within the defined slack and, finally, the number of points of each window is equalized to that of the reference window [35].
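The window-wise warping described above can be illustrated with a minimal sketch. It is not the PLS toolbox routine used in this work: the function names (`resample`, `corr`, `cow_align`), the parameters `seg_len` and `slack`, and the toy signals are hypothetical, and the sketch assumes that the sample and the reference have the same number of points and that the reference divides into whole windows. Each sample window may be stretched or compressed by up to `slack` points, is interpolated to the length of the reference window, and the combination of window boundaries with the highest summed correlation is found by dynamic programming.

```python
from math import exp, sqrt


def resample(segment, n_out):
    """Linearly interpolate `segment` onto n_out equally spaced points."""
    n_in = len(segment)
    out = []
    for k in range(n_out):
        pos = k * (n_in - 1) / (n_out - 1)
        lo = int(pos)
        hi = min(lo + 1, n_in - 1)
        frac = pos - lo
        out.append(segment[lo] * (1.0 - frac) + segment[hi] * frac)
    return out


def corr(a, b):
    """Pearson correlation; returns 0.0 when either input is constant."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [v - ma for v in a]
    db = [v - mb for v in b]
    den = sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    return sum(x * y for x, y in zip(da, db)) / den if den > 0 else 0.0


def cow_align(reference, sample, seg_len, slack):
    """Warp `sample` onto `reference`; returns (warped, boundaries, score)."""
    n_seg, rest = divmod(len(reference) - 1, seg_len)
    if rest or len(sample) != len(reference):
        raise ValueError("sketch assumes equal lengths and whole windows")
    last = len(sample) - 1
    shortest, longest = max(1, seg_len - slack), seg_len + slack
    # layers[i] maps a sample boundary index to (best score so far, previous boundary)
    layers = [{0: (0.0, None)}]
    for i in range(n_seg):
        ref_seg = reference[i * seg_len:(i + 1) * seg_len + 1]
        remaining = n_seg - i - 1
        layer = {}
        for start, (score, _) in layers[i].items():
            for length in range(shortest, longest + 1):
                end = start + length
                # the rest of the sample must still fit in the remaining windows
                if not remaining * shortest <= last - end <= remaining * longest:
                    continue
                cand = score + corr(resample(sample[start:end + 1], seg_len + 1), ref_seg)
                if end not in layer or cand > layer[end][0]:
                    layer[end] = (cand, start)
        layers.append(layer)
    # walk back from the last point to recover the chosen boundaries
    bounds = [last]
    for i in range(n_seg, 0, -1):
        bounds.append(layers[i][bounds[-1]][1])
    bounds.reverse()
    warped = []
    for i in range(n_seg):
        piece = resample(sample[bounds[i]:bounds[i + 1] + 1], seg_len + 1)
        warped.extend(piece if i == 0 else piece[1:])
    return warped, bounds, layers[n_seg][last][0]


def peaks(centers):
    """Toy chromatogram of 41 points built from Gaussian peaks (center, height)."""
    return [sum(h * exp(-((x - c) ** 2) / 4.5) for c, h in centers) for x in range(41)]


reference = peaks([(5, 1.0), (15, 3.0), (25, 2.0), (35, 1.5)])
sample = peaks([(7, 1.0), (17, 3.0), (27, 2.0), (37, 1.5)])  # peaks shifted by 2 points
warped, bounds, score = cow_align(reference, sample, seg_len=10, slack=2)
print(bounds, round(score, 3))
```

The returned boundaries show how far each window edge of the sample was moved, and the score is the summed window-wise correlation that the search maximizes. A full implementation would also handle chromatograms of unequal length and a reference built from the mean chromatogram.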
Different values for the slack and window size parameters were tested and evaluated visually to provide the best alignment.

After the correction of peak displacement, the samples were divided into a calibration set (230 samples) for the development of the multivariate model and a validation set (112 samples) to test the model. The Onion algorithm was used to select the samples with less covariance (based on distance from the mean) for each set and, consequently, to obtain greater sample representativeness in both sets [37,38]. The 1:1 ratio of unadulterated and adulterated samples was maintained in calibration and validation.

From the premise that gasoline samples are composed mainly of hydrocarbons in the range of C6–C12 [39] and that the compounds have different response factors using FID [40], the data were autoscaled prior to modeling to equalize the impact of all variables [41].

Table 1. Physicochemical parameters of the gasoline samples, the number of nonconformities and specifications of ANP Regulation 40/2013. The columns from Min to Nonconforming samples refer to the gasoline sample set.

| Parameter | Method | Unit | Specification | Min | Max | Average | Standard deviation | Nonconforming samples |
|---|---|---|---|---|---|---|---|---|
| Relative density | D1298 | g cm−3 | not specified | 0.7128 | 0.7765 | 0.7430 | 0.0075 | |
| Distillation curve | | | | | | | | |
| 10% evaporated | D86 | °C | 65.0, max. | 46.3 | 60 | 53.9 | 1.6 | |
| 50% evaporated | | °C | 80.0, max. | 69.5 | 74.3 | 72.1 | 0.9 | |
| 90% evaporated | | °C | 190, max. | 134.6 | 180.9 | 158.8 | 6.9 | |
| Final boiling point | | °C | 215, max. | 168.8 | 265.6 | 202.4 | 8.7 | 8 |
| Residue | | % v/v | 2, max. | 0 | 1.7 | 0.9 | 0.3 | |
| Octane number | | | | | | | | |
| Motor octane number | Correlation to D2699/D2700 | – | 82, min. | 81.4 | 84.4 | 82.4 | 0.4 | 7 |
| Research octane number | | – | not specified | 89.1 | 101.6 | 96.7 | 1.5 | |
| Anti-knocking index | | – | 2, min. | 86.2 | 92.1 | 89.6 | 0.8 | 1 |
| Composition | | | | | | | | |
| Benzene | Correlation to D1319 | % v/v | 1.0, max. | 0 | 0.6 | 0.4 | 0.1 | |
| Saturates | | % v/v | not specified | 37.7 | 76.1 | 48.2 | 5.9 | |
| Olefins | | % v/v | 25, max. | 0 | 22.6 | 12.5 | 4.7 | |
| Aromatics | | % v/v | 35, max. | 8.5 | 26.3 | 14.7 | 2.3 | |
| Anhydrous ethanol | D5501 | % v/v | 25 ± 1 | 22 | 31 | 25.2 | 1.6 | 55 |

Table 2. Solvents used in the adulterated gasoline sample set.

| Classification | Group | Solvent |
|---|---|---|
| Hydrocarbon | Aliphatic | Light aliphatic DT solvent (Carbosolv) |
| | | Light aliphatic SB solvent (Carbosolv) |
| | | Medium aliphatic AD solvent (Carbosolv) |
| | | Solvent A-4070 (Carbosolv) |
| | | n-heptane 99.5% P.A. (Vetec Química Fina) |
| | | n-hexane 95% P.A. (Vetec Química Fina) |
| | Aromatic | Aromatic solvent AB-9 (Carbosolv) |
| | | Benzolum crystallisabile (ECIBRAS) |
| | | Toluene 99.9% P.A. (J.T. Baker) |
| | | Xylol 98% P.A. (Vetec Química Fina) |
| | | Ethylbenzene 99.8% P.A. (Acros Organic) |
| | Cyclic | Cyclohexane 99% P.A. (Vetec Química Fina) |
| | Mixture | White spirit (Thinsol Química) |
| | | Halogen free solvent (Carbosolv) |
| | | Rubber solvent |
| | | Heavy naphtha |
| | | Kerosene |
| Oxygenated | Ether | Ethyl ether (Isofar) |
| | Dialcohol | Ethylene glycol 99.5% P.A. (Carlo Erba) |

Partial Least Square Discriminant Analysis (PLS-DA) was used to develop the pattern recognition model. PLS-DA is based on the PLS algorithm; however, it uses the encoded classes as the dependent vector, y, rather than continuous numbers such as the concentration of an analyte [42]. The PLS method maximizes the relationship between the dependent variable and the scores and, consequently, the latent variables (LV) represent the directions that best discriminate the classes. When the application involves discrimination of only two classes, the PLS1 method is employed. In this paper, we labeled the classes as −1 (unadulterated) and +1 (adulterated) to center the values of y, making the algebraic derivations simpler, and the threshold probability value was set at 50%. The number of LV was chosen based on the root mean square error of cross-validation (RMSECV) values of the two classes in order to minimize the prediction errors and avoid model overfitting [43,44]. More details on the PLS-DA method can be found in Ref.
[42].

3. Results and discussion

3.1. Samples

According to the PMQC statistics, between 2007 and 2016, the average proportion of nonconforming samples in the country was 1.7% and the main nonconformities were related to the content of anhydrous ethanol (41%), distillation temperatures (31%), and octane number (11%) [3]. In this work, 38% of the sample set consisted of nonconforming samples and the nonconformities were 1% related to anti-knock index, 10% to motor octane number, 11% to final boiling point and 77% to anhydrous ethanol content. Many nonconforming samples were selected because the objective of this work is to identify the presence of adulterants regardless of sample conformity. In addition, ethanol was not used as an adulterant, since 19% of the samples used had an ethanol content above the amount specified by the legislation. Other works in the literature have already shown the possibility of discriminating gasoline samples according to conformity through GC-FID [30] or mid-infrared [19] combined with chemometric techniques; therefore, it was not the focus of the current paper.

3.2. UFGC-FID analysis

3.2.1. Chromatographic separation

The main reason for the use of gas chromatography in solvent detection by the marker method is to obtain information on individual compounds in gasoline [20,23]. However, proper identification of the marker requires good chromatographic resolution, which is not an easy task since gasoline is a mixture of hundreds of compounds. A good separation of the gasoline compounds by conventional GC requires a long analysis time, thus the detection of solvents in gasoline becomes a costly process [23]. Since the UFGC system uses shorter columns with reduced internal diameter and a resistive heating system, the chromatographic analyses are much faster than conventional GC and provide satisfactory resolution [45].
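As a quick check on the analysis time discussed here, the temperature program of Section 2.2 and the acquisition rate of Section 2.3 can be tallied in a few lines. This is a minimal sketch: the constant and function names are hypothetical, the values are those stated in the text, and counting both the first and the last sampling instant (the `+ 1`) is an assumption made to match the reported number of variables.

```python
# Temperature program and acquisition rate as stated in Sections 2.2 and 2.3.
INITIAL_TEMP_C = 40.0
FINAL_TEMP_C = 250.0
RAMP_C_PER_MIN = 100.0
INITIAL_HOLD_MIN = 0.60
FINAL_HOLD_MIN = 0.15
ACQUISITION_HZ = 10  # "about 10 points per second"


def run_time_min():
    """Total run time: initial hold + linear ramp + final hold."""
    ramp_min = (FINAL_TEMP_C - INITIAL_TEMP_C) / RAMP_C_PER_MIN
    return INITIAL_HOLD_MIN + ramp_min + FINAL_HOLD_MIN


def points_per_chromatogram(minutes, rate_hz=ACQUISITION_HZ):
    """Number of sampled points, counting the first and last instant (assumption)."""
    return round(minutes * 60 * rate_hz) + 1


total = run_time_min()
n_points = points_per_chromatogram(total)
print(f"run time: {total:.2f} min, points per chromatogram: {n_points}")
```

Under these assumptions the program reproduces the 2.85 min run time (0.60 min hold, 2.10 min ramp, 0.15 min hold) and the 1711 variables per chromatogram reported in Section 2.3.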
Unlike the detection of solvents in gasoline by the marker method, chemometric tools allow adulteration to be detected through the whole chromatogram, that is, there is no need for high resolution [33].

The analyzed samples presented an average of 125 chromatographic peaks, i.e., many gasoline compounds were not separated by the developed UFGC method since gasoline typically has more than 230 compounds [47]. Nevertheless, class discrimination can be performed by pattern recognition methods even when there is no high chromatographic resolution. Therefore, a satisfactory separation of the gasoline compounds for the classification model could be achieved in only 2.85 min through UFGC-FID.

3.2.2. Peak alignment

Variations in the retention time of the analytes are very common in chromatographic analyses due to oscillations of instrumental parameters such as pressure, temperature, and gas flow [46]. Thus, the application of pattern recognition or regression methods is highly dependent on peak alignment. The peak alignment was performed by the COW algorithm and the mean chromatogram was taken as reference. Different values of the COW parameters were evaluated, and the best alignment was obtained using slack and window size equal to 5 and 10, respectively.

3.2.3. Comparison between class chromatograms

To evaluate the main differences between the chromatograms of adulterated and unadulterated samples, the mean chromatogram of each class was obtained, and the standard deviation of the means was calculated (Fig. 1).

Fig. 1. Means of the chromatograms of the adulterated samples and of the unadulterated samples and the standard deviation of the two means.

The standard deviation of the means of the classes reveals the peaks with the greatest difference in intensity between the classes. The peak with the greatest difference was related to the retention time of ethanol (Fig. 1).
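The comparison described above amounts to a point-by-point calculation, sketched here on toy data. The function names and the four-point "chromatograms" are hypothetical, and the use of the sample standard deviation (n − 1 in the denominator) for the two class means is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def class_mean(chromatograms):
    """Point-by-point mean of equally long chromatograms (lists of intensities)."""
    return [mean(column) for column in zip(*chromatograms)]


def std_of_class_means(unadulterated, adulterated):
    """Standard deviation of the two class means at every point."""
    mean_unadulterated = class_mean(unadulterated)
    mean_adulterated = class_mean(adulterated)
    return [stdev(pair) for pair in zip(mean_unadulterated, mean_adulterated)]


# Toy data: the second point mimics an intense peak that is diluted by the adulterant.
unadulterated = [[1.0, 10.0, 3.0, 0.5], [1.2, 11.0, 3.2, 0.5]]
adulterated = [[1.0, 9.0, 3.0, 0.9], [1.2, 9.4, 3.0, 1.1]]

spread = std_of_class_means(unadulterated, adulterated)
largest = max(range(len(spread)), key=spread.__getitem__)
print("point with the largest difference between class means:", largest)
```

For two values the sample standard deviation is simply the absolute difference divided by the square root of 2, so ranking points by this quantity is equivalent to ranking them by the difference between the class means.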
This can be explained by the dilution of the ethanol present in the mixture when other solvents are added to the gasoline.

The adulteration of gasoline with pure solvents, such as benzene, toluene, ethylbenzene and xylenes (BTEX), played an important role in verifying the sensitivity of the developed model because these compounds are important in gasoline composition and their sum may reach up to 35% by volume [7]. The addition of pure solvents to the gasoline diluted not only the ethanol but also the other gasoline compounds. Thus, the more intense peaks of the gasoline samples presented the highest standard deviation.

3.3. Chemometric analysis

3.3.1. Model development

The classification model was developed using the PLS-DA method after peak alignment by the COW algorithm. Although a 1:1 ratio of adulterated and unadulterated samples was used in the model development, this ratio is not a requirement to obtain a high-sensitivity model. The steps of the PLS-DA method provide an adjustment according to the class distribution. The first step of the modeling is to estimate the values of y by the PLS method. Thereafter, assuming the predicted values of y have a normal distribution, the threshold value (discriminant function) is calculated by the Bayesian method to obtain the same prediction probability for both classes. If the number of samples in each group is different, the threshold value is adjusted [42,48,49]. As the classes were coded as −1 (unadulterated) and +1 (adulterated) and we used the same number of samples in each class, the threshold was close to zero (0.018).

Autoscaling is a typical preprocessing of chromatographic data when FID is used as detector because different compounds usually present different response factors. In addition, autoscaling is useful to equalize compounds with different variances of concentration [41]. However, regions of the chromatogram where there is no retention of analytes, i.e.
noisy variables for the model, are amplified by this preprocessing. Therefore, the initial variables of the chromatograms (between 0.00 min and 0.18 min, i.e. the first 109 of the 1711 points) were excluded and the final matrix consisted of 1602 variables.

The number of latent variables (LV) was selected through the optimal values for each class to provide the lowest prediction error for both classes without overfitting [44]. Fig. 2 reveals that 3 LVs is the most appropriate number for the modeling since the addition of more LVs does not significantly reduce the values of RMSECV1 (unadulterated samples) and RMSECV2 (adulterated samples).

The scores plot with 3 LVs (Fig. 3) shows that the classes were well separated in the 1st LV, which explained 55.62% of the X-block variance. The addition of the 2nd and 3rd LVs increased the distance between the groups, providing a better discrimination ability to the model.

3.3.2. Model validation

The application of the alternative method in the analysis of unknown samples depends on its accuracy and representativeness. The accuracy of a PLS-DA model can be evaluated by the sensitivity (true positive rate), the prediction error values and the correlation coefficients (r) [25,50]. Sensitivity equal to 100% indicates that the model correctly classified all samples, while low prediction errors and high correlation coefficients indicate how well fitted the model is.

The results in Table 3 were obtained by cross-validation (venetian blinds) and by the prediction of 112 samples (validation set) that were not used in the model development. In both validations, the model correctly predicted 100% of the samples and presented correlation coefficients above 0.98. Given that the sample set consisted of 171 gasoline samples and 19 different adulterants, the developed model was accurate and representative for the monitoring of gasoline adulteration.
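The modeling and validation steps described above (autoscaling, PLS1 on classes coded as −1 and +1, a threshold near zero, and sensitivity on samples left out of calibration) can be sketched as follows. This is only an illustration in plain Python, not the PLS toolbox model used in this work: every function name is hypothetical, the data are synthetic six-variable vectors with a deliberately clear class offset rather than chromatograms, the number of latent variables is set to 2 for the toy data, and the threshold is fixed at 0 instead of being estimated by the Bayesian method.

```python
import random
from math import sqrt


def autoscale_fit(X):
    """Column means and sample standard deviations used for autoscaling."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    stds = [sqrt(sum((v - m) ** 2 for v in col) / (n - 1)) or 1.0
            for col, m in zip(zip(*X), means)]
    return means, stds


def autoscale_apply(X, means, stds):
    """Mean-center each column and scale it to unit variance."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


def pls1_fit(Z, y, n_lv):
    """PLS1 via NIPALS; returns one (weights, loadings, y-loading) triple per LV."""
    E = [row[:] for row in Z]
    f = list(y)  # already centered here because both classes have equal size
    model = []
    for _ in range(n_lv):
        w = [sum(e * fi for e, fi in zip(col, f)) for col in zip(*E)]
        norm = sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        t = [sum(e * wj for e, wj in zip(row, w)) for row in E]
        tt = sum(v * v for v in t)
        p = [sum(e * ti for e, ti in zip(col, t)) / tt for col in zip(*E)]
        q = sum(fi * ti for fi, ti in zip(f, t)) / tt
        E = [[e - ti * pj for e, pj in zip(row, p)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        model.append((w, p, q))
    return model


def pls1_predict(model, z):
    """Predict y for one autoscaled sample by repeating the deflation steps."""
    z = list(z)
    y_hat = 0.0
    for w, p, q in model:
        t = sum(zi * wi for zi, wi in zip(z, w))
        y_hat += q * t
        z = [zi - t * pi for zi, pi in zip(z, p)]
    return y_hat


def sensitivity(y_true, y_pred, label):
    """Fraction of samples of class `label` that were assigned to `label`."""
    hits = [p == label for t, p in zip(y_true, y_pred) if t == label]
    return sum(hits) / len(hits)


def make_toy_set(rng, n_per_class):
    """Synthetic stand-in for chromatograms: 3 informative and 3 noise variables."""
    X, y = [], []
    for label in (-1, 1):  # -1 = unadulterated, +1 = adulterated
        for _ in range(n_per_class):
            informative = [5.0 + 0.5 * label + rng.gauss(0, 0.05) for _ in range(3)]
            noise = [rng.gauss(0, 1.0) for _ in range(3)]
            X.append(informative + noise)
            y.append(label)
    return X, y


rng = random.Random(0)
X_cal, y_cal = make_toy_set(rng, 20)  # calibration set
X_val, y_val = make_toy_set(rng, 10)  # validation set, not used for fitting

means, stds = autoscale_fit(X_cal)
model = pls1_fit(autoscale_apply(X_cal, means, stds), y_cal, n_lv=2)
y_hat = [pls1_predict(model, z) for z in autoscale_apply(X_val, means, stds)]
y_class = [1 if v > 0.0 else -1 for v in y_hat]  # threshold at 0 for balanced classes

sens_unadulterated = sensitivity(y_val, y_class, -1)
sens_adulterated = sensitivity(y_val, y_class, 1)
print(sens_unadulterated, sens_adulterated)
```

The validation samples are scaled with the calibration means and standard deviations, which mirrors how new samples would be predicted in routine use. On real chromatograms the class separation is far less obvious than in this toy set, which is why the number of latent variables and the threshold need to be chosen from cross-validation results.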
However, the constant addition of new samples to the prediction model is an important practice to make the model more robust.

3.3.3. Evaluation of multivariate filters

The exclusion of the variables between 0.00 and 0.18 min resulted in a more parsimonious model, but did not significantly reduce the RMSE values or increase the correlation coefficients. To reduce RMSE values and improve the model accuracy, multivariate filters were evaluated as preprocessing. Multivariate filters such as Orthogonal Signal Correction (OSC), Generalized Least Squares Weighting (GLSW) and External Parameter Orthogonalization (EPO) aim to increase model selectivity by reducing sources of unwanted variation or covariance structures. Each filter has a different mathematical approach and detailed information can be found in Refs. [51–53]. Among the tested filters, only EPO with two principal components improved the PLS-DA fit (rCV = 0.9878 and rpred = 0.9886); however, the increase in correlation coefficients was not significant. Since these multivariate filters are applied in several stages, the calibration process and the prediction of new samples require more computational processing and become more time-consuming. Therefore, we chose to use only autoscale preprocessing and the x variables after 0.19 min, since this adds no computational processing.

3.3.4. Model selectivity ratio

The selectivity ratio for Y plot indicates which variables are most important for the PLS1 discrimination. Firstly, the y values (classes) are used as target to obtain a single predictive target-projected component from the PLS components. Then, the selectivity ratio plot is obtained by calculating the ratio between explained and residual variance of the x variables on the target-projected component [54]. The higher the ratio, the greater the influence of the variable in the prediction. The selectivity ratio plot, in Fig.
4, showed that the most important variables were related to the retention time of the ethanol peak. The selectivity ratio also explains the insignificant improvement when using multivariate filters (OSC, GLSW or EPO) since most of the x variables were useful in prediction. The variables after 2 min did not present high selectivity for Y, so the time of chromatographic analysis can be reduced by increasing the heating rate after 2 min without decreasing the performance of the model.

Fig. 2. Latent variable number versus RMSECV values plot.

3.3.5. Model limitations

Although the sample set used in this study is representative of the practice of gasoline adulteration in Brazil, the classification model is limited to the physicochemical parameters of the calibration samples, to the solvents used in the adulteration and to the Brazilian gasoline composition, which contains anhydrous ethanol. Changes in the specifications of gasoline physicochemical parameters will require a revalidation of the model. In addition, new samples should be periodically added to the model to make it more robust and up to date with the fuel trade.

The limitations mentioned above will be present in prediction models regardless of the instrumental technique used. In this perspective, the use of UFGC-FID associated with the PLS-DA method has advantages such as speed, automation and sensitivity for the detection of adulterants in commercial gasoline. Therefore, the method proposed in this work is suitable for the routine monitoring of gasoline quality.

4. Conclusion

Ultra-fast gas chromatography associated with Partial Least Square Discriminant Analysis provided a rapid and sensitive analytical method for detecting adulterants in gasoline. Through chromatographic analyses of only 2.85 min, it was possible to correctly discriminate 100% of the adulterated samples from the unadulterated samples.
The method was sensitive to 19 different solvents in concentrations between 2 and 10% (v/v) and the discrimination ability was independent of the nonconformities of the gasoline samples. According to the selectivity ratio of the x variables for the prediction of Y, the time of chromatographic analysis can be further reduced since the x variables after 2 min are not significantly useful for class discrimination. In view of the current need to implement a fast and automated method for the detection of adulterants in gasoline, the method proposed here is fully adequate for the monitoring of Brazilian gasoline quality.

Fig. 3. Scores plot from PLS-DA model.

Table 3. Parameters of the PLS-DA model for adulterated and unadulterated gasoline discrimination.

| Parameter | Value |
|---|---|
| Variables | 1602 |
| X block variance captured | 73.77% |
| Y block variance captured | 48.93% |
| Latent variables | 3 |

| Parameter | Unadulterated | Adulterated |
|---|---|---|
| Sensitivity (cal) | 1.000 | 1.000 |
| Sensitivity (CV) | 1.000 | 1.000 |
| Sensitivity (pred) | 1.000 | 1.000 |
| RMSEC | 0.5096 | 0.5010 |
| RMSECV | 0.5072 | 0.5055 |
| RMSEP | 0.4990 | 0.5128 |
| r (cal) | 0.9893 | 0.9893 |
| r (CV) | 0.9871 | 0.9871 |
| r (pred) | 0.9883 | 0.9883 |

Fig. 4. Selectivity ratio for each variable from PLS-DA model.

Acknowledgements

The authors would like to thank CAPES and CNPq for providing academic scholarships, Fundunesp for financial support and Cempeqc for providing samples and equipment for analyses.

References

[1] Irredeemable? Econ 2016:1–9. https://www.economist.com/news/briefing/21684778-former-star-emerging-world-faces-lost-decade-irredeemable.
[2] ANP. Boletim de Monitoramento da Qualidade dos Combustíveis 2017. http://www.anp.gov.br/wwwanp/publicacoes/boletins-anp/2388-pmqc-edicoes-anteriores.
[3] ANP. Anuário estatístico brasileiro do petróleo, gás natural e biocombustíveis: 2016. Rio de Janeiro: 2016.
[4] Wang Y, Rong Z, Qin Y, Peng J, Li M, Lei J, et al.
The impact of fuel compositions on the particulate emissions of direct injection gasoline engine. Fuel 2016;166:543–52. http://dx.doi.org/10.1016/j.fuel.2015.11.019.
[5] Ferreiro-González M, Ayuso J, Álvarez JA, Palma MG, Barroso C. New headspace-mass spectrometry method for the discrimination of commercial gasoline samples with different research octane numbers. Energy Fuels 2014;28:6249–54. http://dx.doi.org/10.1021/ef5013775.
[6] Takeshita EV, Rezende RVP, de Souza SMAGU, de Souza AAU. Influence of solvent addition on the physicochemical properties of Brazilian gasoline. Fuel 2008;87:2168–77. http://dx.doi.org/10.1016/j.fuel.2007.11.003.
[7] ANP. Resolução ANP No 40 DE 25/10/2013 2013.
[8] Da Silva R, Cataluña R, Menezes EW De, Samios D, Piatnicki CMS. Effect of additives on the antiknock properties and Reid vapor pressure of gasoline. Fuel 2005;84:951–9. http://dx.doi.org/10.1016/j.fuel.2005.01.008.
[9] Da Silva MPF, Brito LRE, Honorato FA, Paim APS, Pasquini C, Pimentel MF. Classification of gasoline as with or without dispersant and detergent additives using infrared spectroscopy and multivariate classification. Fuel 2014;116:151–7. http://dx.doi.org/10.1016/j.fuel.2013.07.110.
[10] de Paulo JM, Mendes G, Barros JEM, Barbeira PJS. A study of adulteration in gasoline samples using flame emission spectroscopy and chemometrics tools. Analyst 2012;137:5919–24. http://dx.doi.org/10.1039/c2an35441a.
[11] Ugena L, Moncayo S, Manzoor S, Rosales D, Cáceres JO. Identification and discrimination of brands of fuels by gas chromatography and neural networks algorithm in forensic research. J Anal Methods Chem 2016;2016. http://dx.doi.org/10.1155/2016/6758281.
[12] Pedroso MP, de Godoy LAF, Ferreira EC, Poppi RJ, Augusto F. Identification of gasoline adulteration using comprehensive two-dimensional gas chromatography combined to multivariate data processing. J Chromatogr A 2008;1201:176–82. http://dx.doi.org/10.1016/j.chroma.2008.05.092.
[13] Ré-Poppi N, Almeida FFP, Cardoso CAL, Raposo Jr. JL, Viana LH, Silva TQ, et al. Screening analysis of type C Brazilian gasoline by gas chromatography – flame ionization detector. Fuel 2009;88:418–23. http://dx.doi.org/10.1016/j.fuel.2008. 10.014. [14] Teixeira LSG, Oliveira FS, dos Santos HC, Cordeiro PWL, Almeida SQ. Multivariate calibration in Fourier transform infrared spectrometry as a tool to detect adul- terations in Brazilian gasoline. Fuel 2008;87:346–52. http://dx.doi.org/10.1016/j. fuel.2007.05.016. [15] Mabood F, Gilani SA, Albroumi M, Alameri S, Al Nabhani MMO, Jabeen F, et al. Detection and estimation of Super premium 95 gasoline adulteration with Premium 91 gasoline using new NIR spectroscopy combined with multivariate methods. Fuel 2017;197:388–96. http://dx.doi.org/10.1016/j.fuel.2017.02.041. [16] Skrobot VL, Castro EVR, Pereira RCC, Pasa VMD, Fortes ICP. Identification of adulteration of gasoline applying multivariate data analysis techniques HCA and KNN in chromatographic data. Energy Fuels 2005;19:2350–6. http://dx.doi.org/10. 1021/ef050031l. [17] Pereira RCC, Skrobot VL, Castro EVR, Fortes ICP, Pasa VMD. Determination of gasoline adulteration by principal components analysis-linear discriminant analysis applied to FTIR spectra. Energy Fuels 2006;20:1097–102. http://dx.doi.org/10. 1021/ef050203e. [18] Al-Ghouti MA, Al-Degs YS, Amer M. Determination of motor gasoline adulteration using FTIR spectroscopy and multivariate calibration. Talanta 2008;76:1105–12. http://dx.doi.org/10.1016/j.talanta.2008.05.024. [19] Khanmohammadi M, Garmarudi AB, Ghasemi K, De La Guardia M. Quality based classification of gasoline samples by ATR-FTIR spectrometry using spectral feature selection with quadratic discriminant analysis. Fuel 2013;111:96–102. http://dx. doi.org/10.1016/j.fuel.2013.04.001. [20] Ferreiro-González M, Ayuso J, Álvarez JA, Palma M, Barroso CG. Gasoline analysis by headspace mass spectrometry and near infrared spectroscopy. 
Fuel 2015;153:402–7. http://dx.doi.org/10.1016/j.fuel.2015.03.019. [21] Balabin RM, Safieva RZ. Gasoline classification by source and type based on near infrared (NIR) spectroscopy data. Fuel 2008;87:1096–101. http://dx.doi.org/10. 1016/j.fuel.2007.07.018. [22] Monteiro MR, Ambrozin ARP, Liao LM, Boffo EF, Tavares LA, Ferreira MMC, et al. Study of Brazilian gasoline quality using hydrogen nuclear magnetic resonance (1H NMR) spectroscopy and chemometrics. Energy Fuels 2009;23. http://dx.doi.org/ 10.1021/ef800436p. [23] Flumignan DL, Boralle N, De Oliveira JE. Screening Brazilian commercial gasoline quality by hydrogen nuclear magnetic resonance spectroscopic fingerprintings and pattern-recognition multivariate chemometric analysis. Talanta 2010;82:99–105. http://dx.doi.org/10.1016/j.talanta.2010.04.002. [24] Kaiser CR, Borges JL, dos Santos AR, Azevedo DA, DAvila LA. Quality control of gasoline by 1H NMR: aromatics, olefinics, paraffinics, and oxygenated and benzene contents. Fuel 2010;89:99–104. http://dx.doi.org/10.1016/j.fuel.2009.06.023. [25] Li S, Dai LK. Classification of gasoline brand and origin by Raman spectroscopy and a novel R-weighted LSSVM algorithm. Fuel 2012;96:146–52. http://dx.doi.org/10. 1016/j.fuel.2012.01.001. [26] Flecher PE, Welch WT, Albin S, Cooper JB. Determination of octane numbers and Reid vapor pressure in commercial gasoline using dispersive fiber-optic Raman spectroscopy. Spectrochim Acta Part A Mol Biomol Spectrosc 1997;53A:199–206. http://dx.doi.org/10.1016/S1386-1425(97)83026-0. [27] Skrobot VL, Castro EVR, Pereira RCC, Pasa VMD, Fortes ICP. Use of principal component analysis (PCA) and linear discriminant analysis (LDA) in gas chroma- tographic (GC) data in the investigation of gasoline adulteration. Energy Fuels 2007;21:3394–400. http://dx.doi.org/10.1021/ef0701337. [28] Pierce KM, Hope JL, Johnson KJ, Wright BW, Synovec RE. 
Classification of gasoline data obtained by gas chromatography using a piecewise alignment algorithm combined with feature selection and principal component analysis. J Chromatogr A 2005;1096:101–10. http://dx.doi.org/10.1016/j.chroma.2005.04.078. [29] Watson NE, VanWingerden MM, Pierce KM, Wright BW, Synovec RE. Classification of high-speed gas chromatography-mass spectrometry data by principal component analysis coupled with piecewise alignment and feature selection. J Chromatogr A 2006;1129:111–8. http://dx.doi.org/10.1016/j.chroma.2006.06.087. [30] Flumignan DL, Tininis AG, Ferreira F de O, de Oliveira JE. Screening Brazilian C gasoline quality: application of the SIMCA chemometric method to gas chromato- graphic data. Anal Chim Acta 2007;595:128–35. http://dx.doi.org/10.1016/j.aca. 2007.02.049. [31] Flumignan DL, de Oliveira Ferreira F, Tininis AG, de Oliveira JE. Multivariate ca- librations in gas chromatographic profiles for prediction of several physicochemical parameters of Brazilian commercial gasoline. Chemom Intell Lab Syst 2008;92:53–60. http://dx.doi.org/10.1016/j.chemolab.2007.12.003. [32] Rudnev VA, Boichenko AP, Karnozhytskiy PV. Classification of gasoline by octane number and light gas condensate fractions by origin with using dielectric or gas- chromatographic data and chemometrics tools. Talanta 2011;84:963–70. http://dx. doi.org/10.1016/j.talanta.2011.02.049. [33] Parastar H, Mostafapour S, Azimi G. Quality assessment of gasoline using com- prehensive two-dimensional gas chromatography combined with unfolded partial least squares: a reliable approach for the detection of gasoline adulteration. J Sep Sci 2016;39:367–74. http://dx.doi.org/10.1002/jssc.201500720. [34] Tanaka GT, De Oliveira Ferreira F, Ferreira da Silva CE, Flumignan DL, De Oliveira JE. Chemometrics in fuel science: demonstration of the feasibility of chemometrics analyses applied to physicochemical parameters to screen solvent tracers in Brazilian commercial gasoline. 
J Chemom 2011;25:487–95. http://dx.doi.org/10. 1002/cem.1394. [35] Nielsen NPV, Carstensen JM, Smedsgaard J. Aligning of single and multiple wa- velength chromatographic profiles for chemometric data analysis using correlation optimised warping. J Chromatogr A 1998;805:17–35. http://dx.doi.org/10.1016/ S0021-9673(98)00021-1. [36] Eigenvector Research. Registerspec 2009. [37] Sousa AG, Ahl LI, Pedersen HL, Fangel JU, Sørensen SO, Willats WGT. A multi- variate approach for high throughput pectin profiling by combining glycan mi- croarrays with monoclonal antibodies. Carbohydr Res 2015;409:41–7. http://dx. doi.org/10.1016/j.carres.2015.03.015. [38] Shrestha S, Deleuran L, Gislum R. Classification of different tomato seed cultivars by multispectral visible-near infrared spectroscopy and chemometrics. J Spectr Imaging 2016;5:a1. http://dx.doi.org/10.1255/jsi.2016.a1. [39] Wiedemann LSM, D’Avila LA, Azevedo DA. Adulteration detection of Brazilian gasoline samples by statistical analysis. Fuel 2005;84:467–73. http://dx.doi.org/10. 1016/j.fuel.2004.09.013. [40] Scanlon JT, Willis DE. Calculation of flame ionization detector relative response factors using the effective carbon number concept. J Chromatogr Sci 1985;23:333–40. http://dx.doi.org/10.1093/chromsci/23.8.333. [41] Gemperline P. Practical Guide to Chemometrics, second ed., 2006. doi: 10.1201/ 9781420018301. [42] Brereton RG, Lloyd GR. Partial least squares discriminant analysis: taking the magic away. J Chemom 2014;28:213–25. http://dx.doi.org/10.1002/cem.2609. [43] Hawkins DM. The problem of overfitting. J Chem Inf Comput Sci 2004;44:1–12. http://dx.doi.org/10.1021/ci0342472. [44] Di Anibal CV, Callao MP, Ruisánchez I. 1H NMR variable selection approaches for classification. A case study: the determination of adulterated foodstuffs. Talanta 2011;86:316–23. [45] Dorman FL, Overton EB, Whiting JJ, Cochran JW, Gardea-torresdey J. Gas Chromatography. Micro 2008;80:4487–97. http://dx.doi.org/10.1021/ac800714x. 
[46] Johnson KJ, Wright BW, Jarman KH, Synovec RE. High-speed peak matching al- gorithm for retention time alignment of gas chromatographic data for chemometric analysis. J Chromatogr A 2003;996:141–55. http://dx.doi.org/10.1016/S0021- 9673(03)00616-2. [47] O’Shay TA, Hoddinott KB. Analysis of soils contaminated with petroleum con- stituents. Philafrlphia: ASTM; 1994. M.G. Nespeca et al. Fuel 215 (2018) 204–211 210 https://www.economist.com/news/briefing/21684778-former-star-emerging-world-faces-lost-decade-irredeemable https://www.economist.com/news/briefing/21684778-former-star-emerging-world-faces-lost-decade-irredeemable http://www.anp.gov.br/wwwanp/publicacoes/boletins-anp/2388-pmqc-edicoes-anteriores http://www.anp.gov.br/wwwanp/publicacoes/boletins-anp/2388-pmqc-edicoes-anteriores http://www.anp.gov.br/wwwanp/publicacoes/boletins-anp/2388-pmqc-edicoes-anteriores http://dx.doi.org/10.1016/j.fuel.2015.11.019 http://dx.doi.org/10.1021/ef5013775 http://dx.doi.org/10.1021/ef5013775 http://dx.doi.org/10.1016/j.fuel.2007.11.003 http://dx.doi.org/10.1016/j.fuel.2005.01.008 http://dx.doi.org/10.1016/j.fuel.2013.07.110 http://dx.doi.org/10.1039/c2an35441a http://dx.doi.org/10.1155/2016/6758281 http://dx.doi.org/10.1155/2016/6758281 http://dx.doi.org/10.1016/j.chroma.2008.05.092 http://dx.doi.org/10.1016/j.fuel.2008.10.014 http://dx.doi.org/10.1016/j.fuel.2008.10.014 http://dx.doi.org/10.1016/j.fuel.2007.05.016 http://dx.doi.org/10.1016/j.fuel.2007.05.016 http://dx.doi.org/10.1016/j.fuel.2017.02.041 http://dx.doi.org/10.1021/ef050031l http://dx.doi.org/10.1021/ef050031l http://dx.doi.org/10.1021/ef050203e http://dx.doi.org/10.1021/ef050203e http://dx.doi.org/10.1016/j.talanta.2008.05.024 http://dx.doi.org/10.1016/j.fuel.2013.04.001 http://dx.doi.org/10.1016/j.fuel.2013.04.001 http://dx.doi.org/10.1016/j.fuel.2015.03.019 http://dx.doi.org/10.1016/j.fuel.2007.07.018 http://dx.doi.org/10.1016/j.fuel.2007.07.018 http://dx.doi.org/10.1021/ef800436p 
http://dx.doi.org/10.1021/ef800436p http://dx.doi.org/10.1016/j.talanta.2010.04.002 http://dx.doi.org/10.1016/j.fuel.2009.06.023 http://dx.doi.org/10.1016/j.fuel.2012.01.001 http://dx.doi.org/10.1016/j.fuel.2012.01.001 http://dx.doi.org/10.1016/S1386-1425(97)83026-0 http://dx.doi.org/10.1021/ef0701337 http://dx.doi.org/10.1016/j.chroma.2005.04.078 http://dx.doi.org/10.1016/j.chroma.2006.06.087 http://dx.doi.org/10.1016/j.aca.2007.02.049 http://dx.doi.org/10.1016/j.aca.2007.02.049 http://dx.doi.org/10.1016/j.chemolab.2007.12.003 http://dx.doi.org/10.1016/j.talanta.2011.02.049 http://dx.doi.org/10.1016/j.talanta.2011.02.049 http://dx.doi.org/10.1002/jssc.201500720 http://dx.doi.org/10.1002/cem.1394 http://dx.doi.org/10.1002/cem.1394 http://dx.doi.org/10.1016/S0021-9673(98)00021-1 http://dx.doi.org/10.1016/S0021-9673(98)00021-1 http://dx.doi.org/10.1016/j.carres.2015.03.015 http://dx.doi.org/10.1016/j.carres.2015.03.015 http://dx.doi.org/10.1255/jsi.2016.a1 http://dx.doi.org/10.1016/j.fuel.2004.09.013 http://dx.doi.org/10.1016/j.fuel.2004.09.013 http://dx.doi.org/10.1093/chromsci/23.8.333 http://dx.doi.org/10.1002/cem.2609 http://dx.doi.org/10.1021/ci0342472 http://refhub.elsevier.com/S0016-2361(17)31434-5/h0220 http://refhub.elsevier.com/S0016-2361(17)31434-5/h0220 http://refhub.elsevier.com/S0016-2361(17)31434-5/h0220 http://dx.doi.org/10.1021/ac800714x http://dx.doi.org/10.1016/S0021-9673(03)00616-2 http://dx.doi.org/10.1016/S0021-9673(03)00616-2 http://refhub.elsevier.com/S0016-2361(17)31434-5/h0235 http://refhub.elsevier.com/S0016-2361(17)31434-5/h0235 [48] Wong KH, Razmovski-Naumovski V, Li KM, Li GQ, Chan K. Differentiation of Pueraria lobata and Pueraria thomsonii using partial least square discriminant analysis (PLS-DA). J Pharm Biomed Anal 2013;84:5–13. http://dx.doi.org/10. 1016/j.jpba.2013.05.040. [49] Eigenvector Research Incorporated. How is the prediction probability and threshold calculated for PLSDA? FAQ n.d.:1. 
http://www.eigenvector.com/faq/index.php? id=38%7C (accessed January 1, 2017). [50] Yin M, Tang S, Tong M. Identification of edible oils using terahertz spectroscopy combined with genetic algorithm and partial least squares discriminant analysis. Anal Methods 2016;8:2794–8. http://dx.doi.org/10.1039/C6AY00259E. [51] Wold S, Antti H, Lindgren F, Öhman J. Orthogonal signal correction of near-infrared spectra. Chemom Intell Lab Syst 1998;44:175–85. http://dx.doi.org/10.1016/ S0169-7439(98)00109-9. [52] Zorzetti BM, Shaver JM, Harynuk JJ. Estimation of the age of a weathered mixture of volatile organic compounds. Anal Chim Acta 2011;694:31–7. http://dx.doi.org/ 10.1016/j.aca.2011.03.021. [53] Roger JM, Chauchard F, Bellon-Maurel V. EPO-PLS external parameter orthogo- nalisation of PLS application to temperature-independent measurement of sugar content of intact fruits. Chemom Intell Lab Syst 2003;66:191–204. http://dx.doi. org/10.1016/S0169-7439(03)00051-0. [54] Rajalahti T, Arneberg R, Berven FS, Myhr KM, Ulvik RJ, Kvalheim OM. Biomarker discovery in mass spectral profiles by means of selectivity ratio plot. Chemom Intell Lab Syst 2009;95:35–48. http://dx.doi.org/10.1016/j.chemolab.2008.08.004. M.G. Nespeca et al. 
Fuel 215 (2018) 204–211 211 http://dx.doi.org/10.1016/j.jpba.2013.05.040 http://dx.doi.org/10.1016/j.jpba.2013.05.040 http://www.eigenvector.com/faq/index.php?id=38%7C http://www.eigenvector.com/faq/index.php?id=38%7C http://dx.doi.org/10.1039/C6AY00259E http://dx.doi.org/10.1016/S0169-7439(98)00109-9 http://dx.doi.org/10.1016/S0169-7439(98)00109-9 http://dx.doi.org/10.1016/j.aca.2011.03.021 http://dx.doi.org/10.1016/j.aca.2011.03.021 http://dx.doi.org/10.1016/S0169-7439(03)00051-0 http://dx.doi.org/10.1016/S0169-7439(03)00051-0 http://dx.doi.org/10.1016/j.chemolab.2008.08.004 Rapid and sensitive method for detecting adulterants in gasoline using ultra-fast gas chromatography and Partial Least Square Discriminant Analysis Introduction Material and methods Samples Ultra-fast gas chromatographic analysis Chemometric analysis Results and discussion Samples UFGC-FID analysis Chromatographic separation Peak alignment Comparison between class chromatograms Chemometric analysis Model development Model validation Evaluation of multivariate filters Model selectivity ratio Model limitations Conclusion Acknowledgements References