Multivariate Chemometrics with Regression and Classification Analyses in Heroin Profiling Based on the Chromatographic Data

Authors: Slobodan B. Gadžurić a,*, Sanja O. Podunavac Kuzmanović b, Milan B. Vraneš a, Marija Petrin c, Tatjana Bugarski d, Strahinja Z. Kovačević b

a Department of Chemistry, Biochemistry & Environmental Protection, Faculty of Science, University of Novi Sad, Trg D. Obradovića 3, 21000 Novi Sad, Serbia.
b Faculty of Technology, University of Novi Sad, Bulevar cara Lazara 1, 21000 Novi Sad, Serbia.
c National Forensic Technical Centre in Novi Sad, Pap Pavla 46, Novi Sad, Serbia.
d Faculty of Law, University of Novi Sad, 21000 Novi Sad, Serbia.

How to cite: Gadžurić SB, Podunavac Kuzmanović SO, Vraneš MB, Petrin M, Bugarski T, et al. Multivariate Chemometrics with Regression and Classification Analyses in Heroin Profiling Based on the Chromatographic Data. Iran J Pharm Res. 2016;15(4):e125240. https://doi.org/10.22037/ijpr.2016.1905.

Abstract

The purpose of this work is to promote and facilitate forensic profiling and chemical analysis of illicit drug samples in order to determine their origin, methods of production and transfer through the country. The article is based on the gas chromatography analysis of heroin samples seized at three different locations in Serbia. A chemometric approach with appropriate statistical tools (multiple linear regression (MLR), hierarchical cluster analysis (HCA) and the Wald-Wolfowitz run (WWR) test) was applied to the chromatographic data of the heroin samples in order to correlate and examine their geographic origin. The best MLR models were further validated by the leave-one-out technique as well as by the calculation of basic statistical parameters for the established models. To confirm the predictive power of the models, an external set of heroin samples was used. High agreement between experimental and predicted values of the acetyl thebaol to diacetyl morphine peak ratio, obtained in the validation procedure, indicated the good quality of the derived MLR models. The WWR test showed which of the examined heroin samples come from the same population, and HCA was applied to give an overview of the similarities among the studied heroin samples.

Introduction

Illicit drug profiling provides law enforcement and police authorities with essential physicochemical information that may assist in the identification and disruption of drug trafficking in a country. The results of chemical analysis may allow investigators to determine the geographical origin of an illicit drug, as well as the synthetic pathway and chemical precursors of synthetic drugs. Physical evidence combined with chemical analysis can also be used to establish links between different seizures of illicit drugs. This is the first attempt in Serbia to acquire chemical and profiling data on seized heroin samples and to disseminate the information to the appropriate national and regional governmental agencies. In this paper, the authors endeavored to determine the chemical fingerprints, or signatures, of heroin samples seized at three different locations in Serbia: the border crossings Batrovci and Horgoš and the Novi Sad municipality, labelled as batches (1), (2) and (3), respectively. Serbia was chosen as an entry point to the European Union, since the common trafficking routes from the Middle East to the EU pass through its territory, mostly via the mentioned border crossings (1) with Croatia and (2) with Hungary. It is also known that heroin can be easily prepared in “homemade” laboratories, some of which have existed on the territory of the Novi Sad municipality.

Heroin is a semi-synthetic derivative of morphine. Because opium poppies are grown in different ways and heroin is synthesized by different routes, the presence and concentration of opium alkaloids vary in the final product. In addition, the other alkaloids of opium may react during the acetylation of morphine, which further changes the composition of the final product. The presence of diluents (mannitol, glucose) and adulterants (caffeine, acetaminophen) provides additional information on the geographical origin of heroin and the way it was produced. The concentrations of opium alkaloids, adulterants and diluents together make up the chemical "profile" of heroin. Due to the presence of a large number of compounds, the chemical profiles of heroin can be very complex, which further complicates the analysis and profiling of the illicit drug (1-3).

One of the most recent ways to use the large amounts of data collected during chemical analyses of illicit drugs is the application of chemometric tools suitable for data mining and forensic profiling. In this way, investigators can gain insight into drug production, drug trafficking and the geographical origin of a sample. Chemometric studies are of great importance in modern chemistry and biochemistry, especially because they make it possible to screen a large amount of chemical data (i.e. different molecules or analytes) in a short time and at low cost (4-7). Multivariate chemometrics is a very useful and powerful tool when the main issue is dealing with multicomponent data sets (8, 9). It allows the extraction of maximum information from complicated data sets. Conclusions in forensic science must be drawn from objective sources as much as possible, and forensic scientists must always follow rigid statistical protocols when making decisions based on experimental data. Hence, the present paper explores the usefulness of multivariate chemometrics with regression and classification approaches in the discrimination of seized heroin samples. The aim of the study was to establish a simple analytical procedure followed by a chemometric approach that can recover batch links among a limited number of seized heroin samples.

Material and methods

All samples used in this work were seized during various actions by the Serbian police at the border crossings (1) and (2), together with those seized on the territory of the Novi Sad municipality (3). Chemical analysis was performed in the laboratories of the National Forensic Technical Centre in Novi Sad. All chemicals used (chloroform, pyridine and MSTFA) were of pro analysi grade, manufactured by Merck.

Figure 1 (a, b, c). Plots of predicted versus experimentally observed TEB/DAM ratio
Figure 2 (a, b, c). Plots of the residual values against the experimentally observed TEB/DAM ratio
Figure 3. Dendrogram of HCA as a result of classification of analyzed heroin samples
Table 1. Peak ratios of TEB/DAM, MAM/DAM, PAP/DAM and NOS/DAM for all investigated heroin samples in batches (1), (2) and (3)

Sample  Batch (1)                           Batch (2)                           Batch (3)
        TEB/DAM  MAM/DAM  PAP/DAM  NOS/DAM  TEB/DAM  MAM/DAM  PAP/DAM  NOS/DAM  TEB/DAM  MAM/DAM  PAP/DAM  NOS/DAM
1       0.1269   0.0062   0.0424   0.1157   0.0367   0.1433   0.0479   0.0902   0.0267   0.1233   0.0379   0.0802
2       0.1562   0.0068   0.0568   0.1283   0.0411   0.1488   0.0481   0.1462   0.0311   0.1388   0.0381   0.1362
3       0.1269   0.0062   0.0424   0.1157   0.0141   0.1104   0.0246   0.1368   0.0241   0.1204   0.0346   0.1468
4       0.1562   0.0068   0.0568   0.1283   2.5076   6.7521   0.4682   0.0076   2.5276   6.7721   0.4482   0.0276
5       0.1873   0.0073   0.0492   0.1909   2.7672   7.1083   0.4746   0.0374   2.7672   7.1083   0.4746   0.0374
6       0.1524   0.0062   0.0441   0.1790   0.0135   0.0657   0.0219   0.0524   0.0135   0.0657   0.0219   0.0524
7       0.1590   0.0062   0.0452   0.1828   0.0147   0.2456   0.0327   0.0049   0.0147   0.2456   0.0327   0.0049
8       0.1566   0.0059   0.0439   0.1764   0.1103   0.1463   0.4271   0.1045   0.0103   0.0463   0.3271   0.0045
9       0.3945   0.0276   0.0474   0.1248   0.2208   0.2293   0.2335   0.3083   0.0208   0.0930   0.0335   0.1083
10      0.3652   0.0284   0.0452   0.1192   0.0248   0.0965   0.0393   0.1099   0.0208   0.0925   0.0363   0.1055
11      0.3525   0.0284   0.0455   0.1416   0.0209   0.1307   0.0303   0.1001   0.0229   0.1337   0.0363   0.1021
12      0.3577   0.0242   0.0393   0.1211   0.1143   0.2137   0.1232   0.1052   0.0143   0.1137   0.0232   0.0049
13      0.3519   0.0287   0.0465   0.1407   0.2096   0.2589   0.2229   0.2029   0.0096   0.0888   0.0229   0.0029
14      0.3449   0.0287   0.0459   0.1442   0.3101   0.4474   0.3364   0.4564   0.0101   0.1474   0.0364   0.1564
15      0.4303   0.0284   0.0485   0.1288   0.0100   0.0732   0.0279   0.1036   0.0100   0.0732   0.0279   0.1036
16      0.3261   0.0253   0.0391   0.1136   0.0243   0.1208   0.0342   0.1472   0.0243   0.1208   0.0342   0.1472
17      0.3360   0.0288   0.4460   0.1353   2.5288   6.7730   0.4499   0.0279   2.5279   6.7730   0.4489   0.0279
18      0.3378   0.0288   0.0449   0.1345   0.1140   0.1135   0.0239   0.1045   0.0140   0.1134   0.0229   0.0045
19      0.3737   0.0272   0.0454   0.1110   0.0094   0.0883   0.0225   0.0023   0.0094   0.0883   0.0225   0.0023
20      0.3432   0.0251   0.0393   0.1196   0.2150   0.2460   0.2330   0.0052   0.0150   0.2460   0.0330   0.0052
21      0.3350   0.0290   0.0459   0.1363   0.1110   0.0487   0.3276   0.1049   0.0110   0.0467   0.3276   0.0049
22      0.3149   0.0284   0.0449   0.1392   0.2470   0.1297   0.2583   0.0817   0.0270   0.1237   0.0383   0.0807
23      0.3065   0.0277   0.0445   0.1384   0.1315   0.1392   0.0386   0.1364   0.0315   0.1392   0.0386   0.1364
24      0.3261   0.0283   0.0449   0.1385   0.0292   0.0920   0.0360   0.1050   0.0202   0.0920   0.0360   0.1050
25      0.2551   0.0277   0.0453   0.1384   0.4220   0.1330   0.0360   0.1017   0.0220   0.1330   0.0360   0.1017
26      0.2719   0.0286   0.0460   0.1411   2.7975   7.1087   0.4749   0.0378   2.7675   7.1087   0.4749   0.0378
27      0.3806   0.0290   0.0478   0.1192   0.1414   0.0660   0.0224   0.0528   0.0139   0.0660   0.0224   0.0528
28      0.3496   0.0280   0.0447   0.1435   0.0299   0.0901   0.0330   0.1080   0.0200   0.0900   0.0330   0.1080
29      0.3812   0.0288   0.0463   0.1374   0.0909   0.0557   0.0215   0.0424   0.0015   0.0857   0.0319   0.0528
30      0.4047   0.0386   0.0470   0.1401   0.0296   0.0810   0.0330   0.1100   0.0772   0.2876   0.0427   0.0104
31      0.3305   0.0284   0.0488   0.1162   0.0417   0.1492   0.0376   0.1264   0.0182   0.0482   0.3071   0.0145
32      0.3621   0.0279   0.0458   0.1415   0.1003   0.1004   0.0255   0.1377   0.0072   0.1237   0.0245   0.0070
33      0.3685   0.0267   0.0466   0.1417   0.0792   0.1138   0.0342   0.1462   0.0013   0.0878   0.0329   0.0429
34      0.2921   0.0254   0.0455   0.1432   0.0713   0.1317   0.0313   0.1021   0.0368   0.1574   0.0264   0.1364
35      0.3289   0.0284   0.0465   0.1299   0.0690   0.1235   0.0339   0.1065   0.0196   0.1247   0.0483   0.0707
36      0.3063   0.0263   0.0401   0.1176   0.0710   0.0760   0.0324   0.0528   0.0288   0.1192   0.0346   0.1324
37      0.3557   0.0277   0.4360   0.1383   0.0591   0.0865   0.0393   0.1189   0.0050   0.0920   0.0440   0.1050
38      0.3666   0.0282   0.0454   0.1251   0.0768   0.1340   0.0360   0.1019   0.0109   0.0825   0.0353   0.1057
Table 2. Best MLR models for the prediction of heroin geographical origin

Model  Y        Term       Coefficient  N   R       S       F
(1)    TEB/DAM  Intercept   0.1274      28  0.9270  0.0334  48.8618
                MAM/DAM     8.6544
                PAP/DAM    -0.0353
                NOS/DAM    -0.1806
(2)    TEB/DAM  Intercept   0.0073      28  0.9919  0.0922  873.29
                MAM/DAM     0.3627
                PAP/DAM     0.2680
                NOS/DAM     0.1453
(3)    TEB/DAM  Intercept  -0.0419      28  0.9996  0.0273  106.23
                MAM/DAM     0.3798
                PAP/DAM     0.1073
                NOS/DAM     0.1449
Table 3. The cross-validation parameters

Model  PRESS   SSY      PRESS/SSY  SPRESS  r2CV    r2adj
(1)    0.0794  0.2380   0.3336     0.0532  0.6664  0.8417
(2)    0.2601  22.4969  0.0116     0.0964  0.9884  0.9898
(3)    0.0288  23.7616  0.0012     0.0321  0.9988  0.9992
Table 4. Predicted TEB/DAM peak ratio of the test set with the residual values

Sample      TEB/DAM ratio predicted     Residuals
(test set)  (1)      (2)      (3)       (1)       (2)       (3)
1           0.3502   0.0799   0.0017     0.0310    0.0110   -0.0002
2           0.4345   0.0394   0.0734    -0.0298   -0.0098    0.0038
3           0.3505   0.0615   0.0114    -0.0199   -0.0198    0.0068
4           0.3417   0.0898   0.0087     0.0205    0.0105   -0.0015
5           0.3312   0.0705   0.0012     0.0373    0.0087    0.0001
6           0.3198   0.0789   0.0404    -0.0276   -0.0076   -0.0036
7           0.3481   0.0782   0.0208    -0.0192   -0.0092   -0.0012
8           0.3324   0.0766   0.0262    -0.0260   -0.0056    0.0026
9           0.3268   0.0512   0.0129     0.0290    0.0079   -0.0079
10          0.3472   0.0664   0.0085     0.0194    0.0104    0.0024
Table 5. The results of the Wald-Wolfowitz run test for comparison of heroin samples taking into account the TEB/DAM, MAM/DAM, PAP/DAM and NOS/DAM ratios together (rcr = 135.94)

       (1)                    (2)                     (3)
(1)    -                      H0 rejected (r = 65)    H0 rejected (r = 67)
(2)    H0 rejected (r = 65)   -                       H0 accepted (r = 149)
(3)    H0 rejected (r = 67)   H0 accepted (r = 149)   -
Table 6. The results of the Wald-Wolfowitz run test for comparison of heroin samples taking into account the TEB/DAM, MAM/DAM, PAP/DAM and NOS/DAM ratios separately (rcr = 30.51 for each ratio)

TEB/DAM  (1)                    (2)                    (3)
(1)      -                      H0 rejected (r = 11)   H0 rejected (r = 3)
(2)      H0 rejected (r = 11)   -                      H0 rejected (r = 28)
(3)      H0 rejected (r = 3)    H0 rejected (r = 28)   -

MAM/DAM  (1)                    (2)                    (3)
(1)      -                      H0 rejected (r = 2)    H0 rejected (r = 2)
(2)      H0 rejected (r = 2)    -                      H0 accepted (r = 48)
(3)      H0 rejected (r = 2)    H0 accepted (r = 48)   -

PAP/DAM  (1)                    (2)                    (3)
(1)      -                      H0 rejected (r = 9)    H0 rejected (r = 11)
(2)      H0 rejected (r = 9)    -                      H0 accepted (r = 40)
(3)      H0 rejected (r = 11)   H0 accepted (r = 40)   -

NOS/DAM  (1)                    (2)                    (3)
(1)      -                      H0 rejected (r = 13)   H0 rejected (r = 10)
(2)      H0 rejected (r = 13)   -                      H0 accepted (r = 41)
(3)      H0 rejected (r = 10)   H0 accepted (r = 41)   -

Analytical procedure.

Heroin samples were first homogenized in a mortar. A mass of 0.15–0.25 g of the sample was quantitatively transferred to a vial, together with 200 μL of a chloroform + pyridine (1:1) solution to dissolve the sample and 200 μL of the silylating reagent (MSTFA). The prepared samples were heated for 1 h at 60 °C and then injected into an Agilent 6890N gas chromatograph with a flame ionization detector (GC-FID). The injected volume was 2 μL in split mode (50:1). Chromatographic separation was achieved on a DB-1 capillary column (length 30 m, internal diameter 0.25 mm, film thickness 0.25 μm). The carrier gas was nitrogen at a pressure of 66.6 kPa. The temperature program started at 150 °C for one minute and then rose to 250 °C at a heating rate of 10 °C min−1; this temperature was maintained for 10 min.

Statistical Methods.

The complete regression and classification analyses (MLR, HCA, WWR test) were carried out with PASS 2005, GESS 2006, NCSS Statistical Software, MS Excel and Statistica v. 10 (10-12).

The general purpose of MLR analysis is to quantify the relationship between several independent (predictor) variables and a dependent variable. An MLR model is built from descriptive variables using the least squares method to minimize the residuals (13). The general MLR model is:

$$ y = a + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n \tag{1} $$

where y is the quantitative property to predict (dependent variable), xn an independent (descriptive) variable, a the intercept, and bn the regression coefficient for xn.
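As an illustration of Equation (1), the following minimal sketch fits the intercept and regression coefficients by ordinary least squares, solving the normal equations in plain Python. The function name `fit_mlr` and the demonstration data are hypothetical and are not taken from the study; the statistical packages used by the authors may compute the fit differently.

```python
def fit_mlr(X, y):
    """Least-squares fit of y = a + b1*x1 + ... + bn*xn.

    X is a list of rows of predictor values, y the list of responses.
    Returns [a, b1, ..., bn] from the normal equations (A^T A) c = A^T y.
    """
    A = [[1.0] + list(row) for row in X]  # prepend the intercept column
    k = len(A[0])
    # Augmented normal-equation matrix [A^T A | A^T y].
    M = [[sum(r[i] * r[j] for r in A) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(A, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [u - factor * v for u, v in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]


# Hypothetical noise-free data generated from y = 0.5 + 2*x1 - x2 + 0.25*x3.
X_demo = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1], [2, 1, 3]]
y_demo = [0.5 + 2 * x1 - x2 + 0.25 * x3 for x1, x2, x3 in X_demo]
coef = fit_mlr(X_demo, y_demo)
print([round(c, 6) for c in coef])
```

Because the demonstration data contain no noise, the fit recovers the generating coefficients; with real peak ratios the residuals would be non-zero and the normal equations could be poorly conditioned if the predictors were strongly correlated.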

HCA is a method for dividing a group of objects into classes so that similar objects are in the same class (cluster). The groups of entities are not known prior to the mathematical analysis and no assumptions are made about the distribution of the variables. Cluster analysis searches for objects which are close together in the variable space. The data in each cluster share some common trait, often proximity according to some defined distance measure (14).

Wald-Wolfowitz run test (WWR) can be applied to examine if two random samples come from populations with the same distribution. WWR test can detect differences in averages or spread or any other important aspect between the two populations (15). This test is efficient when each sample size is greater than or equal to 10 (15). This method includes testing the null hypothesis - H0: two samples come from populations having the same distribution. At the start it is necessary to define the critical value for “run” number (rcr). We can calculate this value based on the following equations (15):

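$$ r_{cr} = \mu - 1.96\,\sigma \quad \text{(at the 5\% level of significance)} \tag{2} $$

where:

$$ \mu = 1 + \frac{2 n_1 n_2}{n_1 + n_2} \tag{3} $$

$$ \sigma = \left( \frac{2 n_1 n_2 \,(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 \,(n_1 + n_2 - 1)} \right)^{1/2} \tag{4} $$

n1 is the size of sample 1

n2 is the size of sample 2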

So-called “run” number (r) can be obtained from the list of n1 + n2 observations from two samples in order of magnitude. It represents the number of sections of consecutive values which belong to the same sample and it can be counted from the list of n1 + n2 observations. Observations from sample 1 should be denoted as Xs and other as Ys, and then the number of runs can be counted. Afterwards, the r and rcr numbers can be compared. The H0 hypothesis has to be rejected if r ≤ rcr (15).
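The critical value and the run count described above can be sketched in a few lines of Python. The function names are hypothetical, and the handling of tied values (ties are ordered with sample 1 first) is an assumption of this sketch, since the text does not say how ties were treated.

```python
import math


def critical_runs(n1, n2, z=1.96):
    """Critical run number r_cr = mu - z*sigma, following Equations (2)-(4)."""
    mu = 1 + (2 * n1 * n2) / (n1 + n2)
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) / (
        (n1 + n2) ** 2 * (n1 + n2 - 1))
    return mu - z * math.sqrt(variance)


def count_runs(sample1, sample2):
    """Number of runs in the pooled, sorted list of observations.

    Observations from sample 1 are labelled X, those from sample 2 Y.
    Assumption: tied values are ordered with X before Y.
    """
    pooled = sorted([(v, "X") for v in sample1] + [(v, "Y") for v in sample2])
    labels = [label for _, label in pooled]
    return 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)


def reject_h0(sample1, sample2):
    """H0 (same distribution) is rejected when r <= r_cr."""
    return count_runs(sample1, sample2) <= critical_runs(len(sample1), len(sample2))


rcr_single_ratio = critical_runs(38, 38)   # one peak ratio, 38 samples per batch
rcr_all_ratios = critical_runs(152, 152)   # four peak ratios pooled, 4 * 38 values
print(round(rcr_single_ratio, 2), round(rcr_all_ratios, 2))
```

With 38 observations per batch the sketch gives rcr = 30.51, and with 4 × 38 = 152 pooled observations per batch it gives rcr = 135.94, which are the critical values listed in Tables 6 and 5, respectively.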

Results and discussion

In the first step of the present study, gas chromatography (GC) analyses were performed on thirty-eight different samples of heroin from three locations in Serbia. The GC data are summarized in Table 1 as the peak area ratios of four secondary components, namely acetyl thebaol (TEB), 6-monoacetyl morphine (MAM), papaverine (PAP) and noscapine (NOS), to the main psychoactive component, diacetyl morphine (DAM). All these components were identified according to their retention times. In the second step, we focused our efforts on developing MLR models that can determine the geographical origin of heroin samples. A set of twenty-eight samples (samples 1-28) was used for MLR modeling. TEB/DAM was used as the dependent variable in the regression analysis, and MAM/DAM, PAP/DAM and NOS/DAM were used as independent variables.

The MLR procedure was used to model the relationships between the GC data. The stepwise regression (SWR) method was used to derive the most significant models, which serve as calibration models for the prediction of the TEB/DAM peak ratio of seized heroin samples. The specifications of the best selected MLR models are shown in Table 2.

The statistical quality of the generated models was checked by the following statistical parameters: correlation coefficient (r), standard error of estimation or standard deviation (s), and the F-test (Fisher's value) for statistical significance (16-18). The correlation coefficient r is a relative measure of the fit by the regression equation; its square (the coefficient of multiple determination) represents the part of the variation in the observed data that is explained by the regression. The standard deviation is measured by the error mean square, which expresses the variation of the residuals, i.e. the variation about the regression line. Thus, the standard deviation is an absolute measure of the quality of the fit and should have a low value for the regression to be significant. The F-test reflects the ratio of the variance explained by the model to the variance due to the error in the regression. A high value of the F-test indicates that the model is statistically significant.

It is well known that there are three important components in any chemometric regression analysis: the development of the models, the validation of the models and the utilization of the developed models. Validation is a crucial aspect of any regression analysis (19). To test the predictive power of the selected models, the leave-one-out (LOO) technique was used. The developed models were validated by the calculation of the following statistical parameters: PRESS, SSY, SPRESS, r2CV and r2adj (Table 3). These parameters were calculated from the following equations:

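$$ \mathrm{PRESS} = \sum (Y_{obs} - Y_{calc})^2 \tag{5} $$

$$ \mathrm{SSY} = \sum (Y_{obs} - Y_{mean})^2 \tag{6} $$

$$ S_{PRESS} = \sqrt{\frac{\mathrm{PRESS}}{n}} \tag{7} $$

$$ r^2_{CV} = 1 - \frac{\mathrm{PRESS}}{\mathrm{SSY}} \tag{8} $$

$$ r^2_{adj} = 1 - (1 - r^2)\,\frac{n - 1}{n - p - 1} \tag{9} $$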

where, Yobs, Ycalc and Ymean are observed, calculated and mean values; n is number of the samples and p is number of independent parameters.

PRESS is an acronym for prediction sum of squares. It is used to validate a regression model with regard to its predictive ability. To calculate PRESS, each observation is omitted in turn. The remaining n-1 observations are used to fit a regression and to estimate the value of the omitted observation. This is done n times, once for each observation. The difference between the actual value, Yobs, and the predicted value, Ycalc, is the prediction error. The sum of the squared prediction errors is the PRESS value. The smaller the PRESS, the better the predictive ability of the model. SSY is the sum of squares of the dependent variable, Y, about its mean.
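The leave-one-out procedure just described can be sketched as follows. For brevity, this sketch refits a straight line with a single predictor at each step rather than the three-predictor MLR models of the study; the function names and the demonstration data are hypothetical.

```python
def fit_line(x, y):
    """Closed-form least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def loo_press(x, y):
    """PRESS: omit each observation in turn, refit, and sum squared errors."""
    press = 0.0
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]       # the remaining n-1 observations
        y_train = y[:i] + y[i + 1:]
        a, b = fit_line(x_train, y_train)
        y_calc = a + b * x[i]             # prediction for the omitted point
        press += (y[i] - y_calc) ** 2     # squared prediction error
    return press


def ssy(y):
    """Sum of squares of Y about its mean."""
    mean_y = sum(y) / len(y)
    return sum((yi - mean_y) ** 2 for yi in y)


# Hypothetical data with a little scatter around a straight line.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]
press_demo = loo_press(x_demo, y_demo)
r2_cv_demo = 1 - press_demo / ssy(y_demo)
print(press_demo, r2_cv_demo)
```

The same loop applies unchanged to a multi-predictor model: only the fitting and prediction steps inside the loop need to be replaced.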

The PRESS value can be used to compute the cross-validated r2 statistic, r2CV, which reflects the predictive ability of the model. This is a good way to validate the predictions of a regression model without selecting another sample or splitting the data. It is quite possible to have a high r2 and a very low r2CV. When this occurs, it implies that the fitted model is data dependent. By Equation (8), r2CV cannot exceed one, but it falls below zero whenever PRESS is larger than SSY; values outside the range 0-1 are truncated to stay within this range. The adjusted r-squared (r2adj) is a version of r2 that is corrected for the number of predictors relative to the number of samples, which removes the distortion due to a small sample size.
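As a quick numerical check, the sketch below recomputes the derived statistics for model (1) from the values reported in Tables 2 and 3 (PRESS = 0.0794, SSY = 0.2380, r = 0.9270, n = 28, p = 3). It assumes that SPRESS is the square root of PRESS/n and that the adjustment in r2adj uses the factor (1 - r2)(n - 1)/(n - p - 1); both are our reading of Equations (7) and (9), and the function names are hypothetical.

```python
import math


def s_press(press, n):
    """Assumed form of Equation (7): square root of PRESS / n."""
    return math.sqrt(press / n)


def r2_cv(press, ssy):
    """Equation (8): cross-validated r2."""
    return 1 - press / ssy


def r2_adj(r, n, p):
    """Assumed form of Equation (9): r2 adjusted for n samples, p predictors."""
    return 1 - (1 - r ** 2) * (n - 1) / (n - p - 1)


# Values reported for model (1) in Tables 2 and 3.
PRESS_1, SSY_1, R_1, N, P = 0.0794, 0.2380, 0.9270, 28, 3

print(round(s_press(PRESS_1, N), 4))
print(round(r2_cv(PRESS_1, SSY_1), 4))
print(round(r2_adj(R_1, N, P), 4))
```

Under these assumptions the results agree with the tabulated SPRESS, r2CV and r2adj of model (1) to within rounding of the inputs.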

In many cases, high values of r2CV and r2adj (> 0.5) are taken as proof of the high predictive ability of MLR models. However, it has been reported that this is not always justified (20). Although a low value of r2CV for the training set can indeed serve as an indicator of the low predictive ability of a model, the opposite is not necessarily true. Thus, a high value of the LOO r2CV is a necessary condition for a model to have high predictive power, but it is not a sufficient one.

Although the models showed good internal consistency, they may not be applicable to samples that were not used in building the correlation. The true predictive power of a model can only be estimated by testing it on a sufficiently large external test set. The test set must include no fewer than five samples, whose properties must cover the range of properties of the samples in the training set. This is necessary for obtaining reliable statistics when comparing the observed and predicted values for these samples. Therefore, the external predictive power of the models was further assessed with a test set of ten heroin samples.

The values of the TEB/DAM peak ratio of an external set of heroin samples (samples 29-38) were calculated by the models. These data were compared with the experimentally obtained values of the TEB/DAM ratio (Table 4, Figure 1). The data presented in Table 4 show high agreement between the experimental and predicted TEB/DAM ratios: the residual values are small, indicating the good predictive ability of the established models. According to reference (16), without validating the MLR models on an external test set, no sound conclusion could be drawn about the predictive ability of the derived models.
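The calculation behind one row of Table 4 can be sketched as follows. The coefficients are our reading of Table 2 for models (1) and (3), and the peak ratios are those of sample 29 (the first test-set sample) in Table 1; the dictionary layout and function name are hypothetical.

```python
# Coefficients as read from Table 2 (an assumption about the flattened table).
MODELS = {
    "(1)": {"intercept": 0.1274, "MAM": 8.6544, "PAP": -0.0353, "NOS": -0.1806},
    "(3)": {"intercept": -0.0419, "MAM": 0.3798, "PAP": 0.1073, "NOS": 0.1449},
}

# Peak ratios (relative to DAM) of sample 29 in batches (1) and (3), Table 1.
SAMPLE_29 = {
    "(1)": {"TEB": 0.3812, "MAM": 0.0288, "PAP": 0.0463, "NOS": 0.1374},
    "(3)": {"TEB": 0.0015, "MAM": 0.0857, "PAP": 0.0319, "NOS": 0.0528},
}


def predict_teb(model, sample):
    """Predicted TEB/DAM = intercept + sum of coefficient * peak ratio."""
    return model["intercept"] + sum(
        model[name] * sample[name] for name in ("MAM", "PAP", "NOS"))


results = {}
for batch, model in MODELS.items():
    sample = SAMPLE_29[batch]
    predicted = predict_teb(model, sample)
    residual = sample["TEB"] - predicted   # observed minus predicted
    results[batch] = (round(predicted, 4), round(residual, 4))
    print(batch, results[batch])
```

With these inputs the sketch gives a predicted ratio of 0.3502 with a residual of 0.0310 for batch (1), and 0.0017 with a residual of -0.0002 for batch (3), which match the first test-set row of Table 4.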

To investigate the existence of a systematic error in developing the MLR models, the residuals of the predicted TEB/DAM peak ratio values were plotted against the experimental values in Figure 2. The distribution of the residuals on both sides of zero indicates that no systematic error exists in the development of the regression models, as suggested by Jalali-Heravi et al. (21). It indicates that these models can be successfully applied to predict the geographic origin of seized heroin samples using the GC results. Therefore, the randomness of the residuals and their low values indicate that the obtained mathematical models can predict the dependent variable with acceptable error. The variance inflation factor (VIF) was lower than 10 for all the obtained models, so it can be concluded that there is no multicollinearity in the established models.

HCA was performed on the TEB/DAM, MAM/DAM, PAP/DAM and NOS/DAM peak ratios of the analysed heroin samples in order to reveal the similarities among them. Clustering was based on the Euclidean distance and the single linkage algorithm. The obtained dendrogram is presented in Figure 3. As can be seen from the dendrogram, on the basis of the TEB/DAM, MAM/DAM, PAP/DAM and NOS/DAM peak ratios, the most similar heroin samples come from border crossing (2) and the Novi Sad municipality (3). These entities are placed in a separate cluster, while the samples from border crossing (1) are significantly different from the others.
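A minimal sketch of agglomerative clustering with Euclidean distance and single linkage is given below. It uses only samples 1 and 2 of each batch from Table 1, so it illustrates the technique and does not reproduce the dendrogram of Figure 3; the labels and function names are hypothetical, and the software used by the authors may implement the algorithm differently.

```python
import math
from itertools import combinations


def euclidean(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def single_linkage(points):
    """Agglomerative clustering with single linkage.

    points maps a label to a tuple of values. Returns the list of merges as
    (members_a, members_b, distance), in the order they are performed.
    """
    def gap(pair):
        # Single linkage: distance between the closest members of two clusters.
        a, b = pair
        return min(euclidean(points[i], points[j]) for i in a for j in b)

    clusters = [frozenset([label]) for label in points]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=gap)
        merges.append((sorted(a), sorted(b), gap((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


# (TEB, MAM, PAP, NOS)/DAM for samples 1 and 2 of each batch (Table 1).
POINTS = {
    "1-s1": (0.1269, 0.0062, 0.0424, 0.1157),
    "1-s2": (0.1562, 0.0068, 0.0568, 0.1283),
    "2-s1": (0.0367, 0.1433, 0.0479, 0.0902),
    "2-s2": (0.0411, 0.1488, 0.0481, 0.1462),
    "3-s1": (0.0267, 0.1233, 0.0379, 0.0802),
    "3-s2": (0.0311, 0.1388, 0.0381, 0.1362),
}

merges = single_linkage(POINTS)
for members_a, members_b, distance in merges:
    print(members_a, "+", members_b, "at", round(distance, 4))
```

For this small subset, samples from batches (2) and (3) are merged with each other first, and the two batch (1) samples join the rest only in the final merge, which is consistent with the grouping described in the text.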

The WWR test was first applied to the TEB/DAM, MAM/DAM, PAP/DAM and NOS/DAM peak ratios together. The null hypothesis, “two samples come from populations having the same distribution”, was accepted for the samples seized at border crossing (2) and in the Novi Sad municipality (Table 5). This result confirms the finding obtained with the HCA method. According to the WWR test, the samples from the Novi Sad municipality and border crossing (1) do not belong to the same population, and neither do the samples from border crossings (1) and (2).

Testing the H0 hypothesis for the examined samples on the basis of the TEB/DAM, MAM/DAM, PAP/DAM and NOS/DAM peak ratios separately gave the same result as the previous WWR analysis, except for the TEB/DAM peak ratio (Table 6). In this case, none of the three batches belong to the same population. This indicates that the TEB/DAM peak ratio can be used as a discriminating factor for the analysed samples. As shown in the MLR analysis, this ratio is the dependent variable, which is predicted from the other determined peak ratios.

Conclusion

The collected data were modeled by the MLR, HCA and WWR methods. Mathematical dependences that can determine the geographical origin of heroin samples were obtained. The validity of the models was evaluated by the determination of suitable statistical parameters. The predictive ability of the defined mathematical models was tested by comparing and correlating the experimental and calculated values of the TEB/DAM peak ratio. The low residual values and high cross-validated r2 values (r2CV) indicated the predictive ability of the developed MLR models. Since the correlation was very good, the mathematical models can be used to predict the geographic origin of heroin samples seized in Serbia from the GC results. HCA showed that the samples from the Novi Sad municipality and border crossing (2) are very similar, while the WWR test showed that these samples belong to the same population according to the MAM/DAM, PAP/DAM and NOS/DAM peak ratios. The TEB/DAM peak ratio (the dependent variable in the MLR models) was the discriminating factor in the WWR test, which showed that, by this ratio, the analysed samples do not belong to the same population.


References

1. Dams R, Benijts T, Lambert WE, Massart DL, De Leenheer AP. Heroin impurity profiling: trends throughout a decade of experimenting. Forensic Sci. Int. 2001;123:81-8. [PubMed ID: 11728732].

2. Klemenc S. In common batch searching of illicit heroin samples-evaluation of data by chemometrics methods. Forensic Sci. Int. 2001;115:43-52. [PubMed ID: 11056269].

3. Dufey V, Dujourdy L, Besacier F, Chaudron H. A quick and automated method for profiling heroin samples for tactical intelligence purposes. Forensic Sci. Int. 2007;169:108-17. [PubMed ID: 16973323].

4. Podunavac-Kuzmanović SO, Markov SL, Barna DJ. Relationship between the lipophilicity and antifungal activity of some benzimidazole derivatives. J. Theor. Comp. Chem. 2007;6:687-98.

5. Podunavac-Kuzmanović SO, Cvetković DD, Barna DJ. QSAR analysis of 2-amino or 2-methyl-1-substituted benzimidazoles against Pseudomonas aeruginosa. Int. J. Mol. Sci. 2009;10:1670-82. [PubMed ID: 19468332].

6. Podunavac-Kuzmanović SO, Cvetković DD. Lipophilicity and antifungal activity of some 2-substituted benzimidazole derivatives. CICEQ. 2011;17:9-15.

7. Podunavac-Kuzmanović SO, Cvetković DD. QSAR modeling of antibacterial activity of some benzimidazole derivatives. CICEQ. 2011;17:33-8.

8. Thanasoulias NC, Piliouris ET, Kotti MSE, Evmiridis NP. Application of multivariate chemometrics in forensic soil discrimination based on the UV-Vis spectrum of the acid fraction of humus. Forensic Sci. Int. 2002;130:73-82. [PubMed ID: 12477626].

9. Thanasoulias NC, Parisis NA, Evmiridis NP. Multivariate chemometrics for the forensic discrimination of blue ball-point pen inks based on their Vis spectra. Forensic Sci. Int. 2003;138:75-84. [PubMed ID: 14642722].

10.

11. Microsoft. Microsoft Excel. Redmond, Washington: Microsoft; 2003. Computer Software.

12.

13. Esbensen KH. Multivariate Data Analysis. Practice. 5th edition. CAMO Software AS, Oslo; 2009. p. 94-105.

14. Miller JN, Miller JC. Statistics and chemometrics for analytical chemistry. 6th edition. Harlow: Pearson Education Limited; 2010.

15.

16. Snedecor GW, Cochran WG. Statistical methods. New Delhi: Oxford and IBH; 1967. p. 27-42.

17. Chatterjee S, Hadi AS, Price B. Regression analysis by examples. New York: Wiley VCH; 2000.

18. Diudea MV. QSPR/QSAR studies for molecular descriptors. New York: Nova Science; 2000.

19. Topliss JG, Edwards RP. Chance factors in studies of quantitative structure-activity relationships. J. Med. Chem. 1979;22:1238-44. [PubMed ID: 513071].

20. Golbraikh A, Tropsha A. Beware of q2! J. Mol. Graph. Model. 2002;20:269-76.

21. Jalali-Heravi M, Kyani A. Use of computer-assisted methods for the modeling of the retention time of a variety of volatile organic compounds: A PCA-MLR-ANN Approach. J. Chem. Inf. Comput. Sci. 2004;44:1328-35. [PubMed ID: 15272841].