In Silico Screening of IL-1β Production Inhibitors Using Chemometric Tools

authors:

avatar Amirhossein Sakhteman a , avatar Najmeh Edraki b , avatar Bahram Hemmateenejad c , avatar Ramin Miri b , avatar Alireza Foroumadi d , avatar Abbas Shafiee d , e , avatar Mehdi Khoshneviszadeh a , b , *

Department of Medicinal Chemistry, Faculty of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran.
Medicinal and Natural Products Chemistry Research Center, Shiraz University of Medical Sciences, Shiraz, Iran.
Department of Chemistry, Shiraz University, Shiraz, 71454, Iran.
Pharmaceutical Sciences Research Center, Tehran University of Medical Sciences, Tehran, Iran.
Department of Medicinal Chemistry, Faculty of Pharmacy, Tehran University of Medical Sciences, Tehran, Iran.

how to cite: Sakhteman A, Edraki N, Hemmateenejad B, Miri R, Foroumadi A, et al. In Silico Screening of IL-1β Production Inhibitors Using Chemometric Tools. Iran J Pharm Res. 2017;16(2):e125030. https://doi.org/10.22037/ijpr.2017.2035.

Abstract

The IL-1β plays a major role in inflammatory disorders and IL-1β production inhibitors can be used in the treatment of inflammatory and related diseases. In this study, quantitative relationships between the structures of 46 pyridazine derivatives (inhibitors of IL-1β production) and their activities were investigated by Multiple Linear Regression (MLR) technique Stepwise Regression Method (ES-SWR). The genetic algorithm (GA) has been proposed for improvement of the performance of the MLR modeling by choosing the most relevant descriptors. The results show that eight descriptors are able to describe about 83.70% of the variance in the experimental activity of the molecules in the training set. The physical meaning of the selected descriptors is discussed in detail. Power predictions of the QSAR models developed were evaluated using cross-validation, and validation through an external prediction set. The results showed satisfactory goodness-of-fit, robustness and perfect external predictive performance. The applicability domain was used to define the area of reliable predictions. Furthermore, the in silico screening technique was applied in order to predict the structure and potency of new compounds of this type using the proposed QSAR model.

Introduction

In recent years, many efforts have been done toward the development of new therapeutic agents in the area of inflammatory diseases. Cytokines are multifunctional proteins that responsible for host defense mechanisms such as inflammatory, immune and hematogenic responses (1). Cytokines have been categorized as being proinflammatory (IL-1β, TNF-α, IL-6...) or anti-inflammatory (TGF-β, IL-10, IL-13...) depending on their effects on the immune system. Inflammatory cytokines play an important role in inflammatory disease. Consequently, inhibition the production of IL-1β can reduce levels of these proinflammatory cytokine, and thereby reduce inflammation and prevent destruction effects in diseases such as rheumatoid arthritis (RA), osteoarthritis (OA) and Crohn’s disease (2). For the rational design of novel IL-1β production inhibitors, quantitative structure-activity relationships (QSAR) models and in silico screening could be useful (3). QSAR is a powerful tool to study the relationship between activity and structural parameters and could be used for design new biological active compounds and predict their potency, toxicity, ADME,etc. In this study, a series of 5, 6-bis (4-methoxyphenyl) -pyridazin derivatives with IL-1β production inhibitory activity discovered by Matsuda et al (4, 5). According to our literatures review, no QSAR study was carried out on the mentioned series. In the present investigation, a quantitative structure–activity relationship was explored using different molecular descriptors. Previous studies demonstrated that linear QSAR models had good predictive ability therefore (6-9), a virtual screening study was then carried out to identify novel biologically active patterns by modification of the original molecules. The study led to the identification of novel structures, which are potent IL-1β production inhibitors based on the QSAR model. The structures were filtered using the domain of applicability of the QSAR model.

Materials and methods

Data set

The biological data used in this study are the IL-1β production inhibitory (IC50) activity of a series of 5,6-bis (4-methoxyphenyl)-pyridazin derivatives (4, 5). The structural features and biological activity of these compounds are listed in Table 1. The biological data were converted to logarithmic scale (pIC50) and then used for subsequent QSAR analysis as dependent variables. The original data set was divided into a training set (n = 35) and a prediction set (n = 11) using the popular Kennard and Stones algorithm (10, 11). This algorithm has been widespread used with great success in many QSAR studies (12-14).

Descriptor generation

The chemical structure of molecules was constructed using Hyperchem package (Version 7, Hypercube Inc., Florida 32601 USA). The Z-matrices of the structures were provided by the software and transferred to the Gaussian 98 program. Complete geometry optimization was performed taking the most extended conformations as starting geometries. Semi empirical molecular orbital calculations (AM1) of the structures were performed using Gaussian 98 program. A large numbers of molecular descriptors were calculated using HyperChem, Gaussian 98 and Dragon Packages. Gaussian 98 was employed for calculation of different quantum chemical descriptors including dipole moment (DM), local charges, HOMO and LUMO energies, hardness (g); softness (S); electronegativity (v); and electrophilicity (x). Dragon software calculated different functional groups, topological, geometrical and constitutional descriptors for each molecule, and some chemical parameters including molecular volume (V), molecular surface area (SA), hydrophobicity (log P), hydration energy (HE) were calculated using Hyperchem software.

Data processing and modeling

The calculated descriptors were collected in a data matrix, D. First the descriptors were checked for constant or near-constant values and those detected were removed from the original data matrix. Then, the correlation of descriptors with each other’s and with the activity data was determined. Among the collinear descriptors detected (r > 0.9), one of them that had the highest correlation with activity was retained, and the rest were omitted (15, 16). For the development of QSAR equation stepwise multiple linear regression (MLR) was used. In stepwise regression, a multiple-term linear equation was built step-by-step. The basic procedures involve (i) identifying an initial model, (ii) iteratively ‹stepping›, that is repeatedly altering the model at the previous step by adding or removing a predictor variable in accordance with the ‹stepping criteria› (in our case, probability of F = 0.05 for inclusion; probability of F = 0.1 for exclusion for the forward selection method), and (iii) terminating the search when stepping is no longer possible given the stepping criteria, or when a specified maximum number of steps have been reached. Specifically, at each step, all variables are reviewed and evaluated to determine which one will contribute most to the equation. That variable will then be included in the model, and the process starts again. A limitation of the stepwise regression search approach is that it presumes there is a single ‹best› subset of X variables and seeks to identify it. There is often no unique ‹best› subset, and all possible regression models with a similar number of X variables as in the stepwise regression solution should be fitted subsequently to study whether some other subsets of X variables might be better (17, 18). We used GA for variable selection in MLR regression. The applied GA used a binary representation as the coding technique for the given problem: The presence or absence of a descriptor in a chromosome was coded by 1 or 0. The GA performed its optimization by variation and selection via the evaluation of the fitness function. The operators used here were crossover and mutation. The probability for the application of these operators was varied linearly with the generation renewal (0 ± 1% for mutation and 60 ± 90% for crossover). The population size was 100. For a typical run, the evolution of the generation was stopped when 90% of the generations took the same fitness. The fitness function was the root mean square error of cross-validation (19).

Model validation

The goodness-of-fit of the resulted QSAR models were judged using statistical parameters such as correlation coefficient (R2), standard error of regression (SE), and variance ratio (F) at specified degrees of freedom. The generated QSAR equations were also validated by leave-one-out cross-validation correlation coefficient (Q2) and root mean square error of cross-validation (RMScv)(20). In addition, an external test set composed selected11 molecules was used to judge the overall prediction ability of the resulted models (see Table 1). According to Tropsha et al. (21). The predictive ability of a QSAR model should be tested on an external set of data that has not been taken into account during the process of developing the model. In particular, to assess the predictive power of QSAR models the correlation coefficient between the predicted and observed activities of compounds from an external test (R 2), the correlation coefficients for regressions through the origin (predicted versus observed activities, or observed versus predicted activities, i.e., R02 or Rꞌ02, respectively), and the slope of the regression lines through the origin (K and K ‹, respectively) were calculated. Tropsha et al. (21, 22). considered a QSAR model to be predictive, if all of the following conditions are satisfied: (i) Q 2 > 0.5, (ii) R 2 > 0.6, (iii) R02 or Rꞌ02 is close to R2, such that [(R2 - R02) ⁄ r 2] or [(R2- Rꞌ02) ⁄ R 2] < 0.1and (iv) 0.85 ≤ K ≤1.15 or 0.85 ≤ K ‹ ≤ 1.15. In addition, according to the recommendation of Roy and Roy [18], an additional statistic for external validation (rm2) was calculated as rm2 = R2*[1-(R2-R02)1 ⁄ 2]. For a model with good external predictability, rm2 value should be > 0.5. (23)

Y-randomization test

This technique ensures the robustness of a QSAR model. The dependent variable vector pIC50 is randomly shuffled and a new QSAR model is developed using the original independent variable matrix. The new QSAR models (after several repetitions) are expected to have low R2 and Q2 values. If the opposite happens then an acceptable QSAR model cannot be obtained for the specific modeling method and data (12).

Applicability domain

Finally the QSAR model was used to identify novel active compounds via an in silico screening procedure, and thus the definition of its domain of applicability is of particular importance. The utility of a QSAR model is based on its accurate prediction ability for new compounds. A model is valid only within its training domain, and new compounds must be assessed as belonging to the domain before the model is applied. The applicability domain is assessed by the leverage values for each compound. The threshold h* is generally fixed at 3(k + 1) ⁄ n (n is the number of training set compounds, and k is the number of model parameters), On the other hand, when the leverage value of a compound is lower than the threshold value, the probability of accordance between predicted and observed values is as high as that for the training set compounds (24).

In silico screening

Finally in silico screening procedure was carried out to identify a variety of potential novel lead compounds by presenting structural modifications on the original dataset. Throughout the screening procedure, only the predictions that fall into the domain of applicability were considered reliable.

Results and Discussion

On the basis of Kennard-Stones algorithm, the dataset of the 46 pyridazine derivatives was divided into a training set (35 compounds) and a prediction or test set (11 compounds, see Table 1). Stepwise regression was used on the training data set to develop MLR QSAR model.

pIC50 = -102.168 (± 15.264) + 119.255(± 17.542) MATS4m + 0.106(± 0.026) RDF 105u – 0.168(± 0.024) RDF100u – 5.458(± 1.131) GATS 3v + 0.269(± 0.053) RDF075v + 1.222(± 0.293) C-005 + 0.073(± 0.022) RDF095u – 0.006(± 0.002) Surface area

N = 35 R2 = 0.837 R2adj= 0.780 F = 16.079 S.E.= 0.330 Q2 = 0.733

RMScv = 0.374 R2 pred= 0.754

The regression coefficients in this model are significant at the 95% level. This equation can predict and explain 75.4% and 83.7%, respectively, of the variance of the inhibitory activity.

The possibility of having included outliers in our dataset was investigated by calculating the standard residuals. Standardized residuals greater than 2.5 or less than -2.5 are considered large and indicate the exclusion of the respective data from the data set. The calculated standardized residuals were within the above upper and lower limits for all the compounds, and thus, none of them were excluded from the data set as outlier (3). The Figure1 showed that the good model was obtained with eight descriptors.

The above equation showed that the most significant descriptors are Moran autocorrelation of lag 4 weighted by mass (MATS4m), Radial Distribution Function - 105 / unweighted (RDF105u), Radial Distribution Function - 100 / unweighted (RDF100u), Geary autocorrelation of lag 3 weighted by van der Waals volume (GATS3v), Radial Distribution Function - 075 / weighted by van der Waals volume (RDF075v), CH3X (C-005), Radial Distribution Function - 095 / unweighted (RDF095u) and surface area. The correlation matrix (Table 2) indicated that the eight selected descriptors are not highly correlated. Variance Inflation Factor (VIF) values for the eight descriptors (25-27), also shown in Table 2, demonstrate that the model contains no multicollinearity.

Interpretation of the Selected Descriptors

The variety of factors such as molecular electrostatic potential, polarizability, hydrophobicity, and lipophobicity influence the binding of ligand to its target. MATS4m and GATS3v are Autocorrelation of Topological Structure. The 2D-autocorrelation descriptors explain how the values of certain functions, at intervals equal to the lag, are correlated. The 2D autocorrelation descriptors represent the topological structure of the compounds, but are more complex in nature when compared to the classical topological descriptors. The computation of these descriptors involves the summations of different autocorrelation functions corresponding to different structural lags and leads to different autocorrelation vectors corresponding to the lengths of the sub-structural fragments. Basically, the pool of 2D autocorrelation descriptors defines a wide 2D space. On behalf of a greater applicability, physicochemical properties (atomic masses, atomic van der Waals volumes, atomic Sanderson electronegativities, and atomic polarizabilities) were inserted as weighting components. As a result, these descriptors address the topology of the structure or parts thereof in association with a specific physicochemical property. Bearing in mind this aspect, the interpretation of 2D autocorrelation descriptors was uneasy. Based on derived model the positive regression coefficient of MATS4m shows that small path lengths and branching in the molecule (lag 4 weighted by atomic mass) contribute to higher activity (28). RDF105u, RDF100u, RDF075v and RDF095u are Radial Distribution Function descriptors; the 3D coordinates of the atoms of molecules can be transformed into a structure code that has a fixed number of descriptors irrespective of the size of a molecule. This task is performed by a structure coding technique referred to as Radial Distribution Function code (RDF code). In general, there are some prerequisites for a structure code:

  • Independent from the number of atoms, i.e., the size of a molecule,

  • Unambiguity regarding the three-dimensional arrangement of the atoms, and

  • Invariance against translation and rotation of the entire molecule.

Formally, the Radial Distribution Function of an ensemble of N atoms can be interpreted as the probability distribution to find an atom in a spherical volume of radius r. The equation represents the Radial Distribution Function code as it is used in this investigation:

gr=f×iN-1iNAiAje-B(r-rij)2

Where f is a scaling factor and N is the number of atoms. By including characteristic atomic properties A of the atoms i and j, the RDF codes can be used in different tasks to fit the requirements of the information to be represented.

The exponential term contains the distance rij between the atoms i and j and the smoothing parameter B, which defines the probability distribution of the individual distances. g(r) was calculated at a number of discrete points with defined intervals The radial distribution function in this form meets all the requirements for 3D structure descriptors: it is independent of the number of atoms, i. e., the size of a molecule, it is unique regarding the three-dimensional arrangement of the atoms, and it is invariant against translation and rotation of the entire molecule. Additionally, the RDF descriptors can be restricted to specific atom types or distance ranges to represent specific information in a certain three-dimensional structure space, e.g. to describe steric hindrance or structure/activity properties of a molecule. Finally, the RDF descriptors are interpretable by using simple rules sets, and thus it provides a possibility for conversion of the code back into the corresponding 3D structure. Besides information about interatomic distances in the entire molecule, the RDF descriptors provide further valuable information, e.g. about bond distances, ring types, planar and non-planar systems and atom types. By using different weighting schemes, which include atom types, electronegativity, atom mass or van der Waals radii, RDF can be adjusted to select among those atoms of molecule, which give rise to an important descriptor in deriving an appropriate QSAR (29-31).

Final descriptor C005 is one of the Ghos–Crippen atom-centred fragments related to the methyl group attached to any electronegative atom (O, N, S, P, Se, halogens) fragment. It gives information about the number of predefined structural features in the molecule. Based on the produced QSAR equation a high value of MATS4m, RDF105u, RDF075v, C-005 and RDF095u give a positive contribution to the IL-1 production inhibition. On the other hand, a high value of GATS 3v and Surface area give a negative contribution to the inhibition.

Model Validation

The IL-1β production inhibition predictability of the proposed model was evaluated by using the external set of 11 compounds (Table 1). The proposed QSAR model has all conditions to be considered as predictive models.

R2 pred = 0.754 > 0.6

[(R2pred - R02) ⁄ R2pred] = -0.078 < 0.1

[(R2pred – R׳02) ⁄ r2pred] = -0.061<0.1

rm2 = 0.595> 0.5

K= 0.83, K׳=1.00

This model was further validated by applying the Y-randomization test. Several random shuffles of the Y vector were performed. The low R2 (0.0 < R2 < 0.34) and Q2 (0.0 < Q2 < 0.25) values indicate that the good results in our original model are not due to a chance correlation or structural dependency of the training set. The extrapolation method was applied to the compounds that constitute the test set. The results are presented in Table 1. None of the 11 compounds fell outside from the domain of the model (warning leverage limit = 0.77).

The suggested method, according to the high predictive ability, could be a useful tool to the costly and time consuming experiments for determining the IL-1 β production inhibitory activity of pyridazine derivatives. The method can also be used to the screen virtual compounds in order to identify derivatives with desired activity.

In silico screening

The in silico screening was applied to the design of new structures with potential IL-1 β production inhibitors according to the developed QSAR model. The role of the in silico screen was as a guide to the identification of the most promising new synthetic targets. For this purpose, first we selected potent inhibitors from data set (3, 4, 17, 22, 23, 24, 28, 31, 32, 33, 34, 35, 36, 37) and retain the main scaffold (4-methoxy phenyl moiety) and change the heterocyclic core, aromatic moiety and connection between aromatic moiety and central heterocyclic core. The modifications incorporated in the virtual screening study were chosen based on their synthetic feasibility. The suggested structures were not involving the use of unusual ring fragments or functional groups that cannot be prepared using established protocols. The chemistry of 1,2,4-triazine was well understood and introducing practical modifications here was considered synthetically viable, the in silico screen began with the replacement of the pyridazine core by 1,2,4-triazine (Table 3). The model tolerated the introduction of 1,2,4-triazine since all those studied were within the domain of applicability. The compounds 49 and 51 showed the best activity, 6.83 and 6.56, respectively. In some cases the activities lower and just one compound has higher activity than original compound. The next step was incorporated 1,2,4-triazine instead of pyridazine core in second series of compounds that the different substituted aromatic moiety connect to central heterocyclic by O or S atoms. This model tolerated and the activity was retained (Table 4.).

The statistical parameters according to number of descriptors entered in model.
Table 1

IL-1 β production inhibitory activity—observed and predicted using developed model

Table 2

Correlation matrix for the eight selected descriptors

MATS4m GATS3v RDF095uRDF100u RDF105u RDF075v C-005 Surface area VIF*
MATS4m1 0.598 0.185 0.244 0.129 0.307 -0.273 0.480 3.785
GATS3v 1 0.064 0.143 0.242 0.379 -.005 0.214 3.143
RDF095u1 0.612 0.511 0.235 -0.154 0.548 2.830
RDF100u 1 0.608 0.555 -0.209 0.614 2.719
RDF105u1 0.639 -.099 0.585 2.744
RDF075v 1 -0.305 0.485 1.478
C-005 1 -0.132 2.389
Surface area 1 3.563
Table 3

Predicted activities of selected 1,2,4-triazin-3(2H)-one derivatives

Table 4.

Predicted activities of selected 3-aryl 1,2,4-triazin derivatives

Table 5

Predicted activities of selected 3-amino 1,2,4-triazin derivatives

ID XAr pIC50 predicted Leverage
61 3,4,5-Cl3PhNH 7.51 0.350
62 2,3,4,5,6-F5PhNH 6.67 0.120
63 PhNH 6.89 0.182
64 2,3-F2PhNH 6.72 0.055
65 2-CNPhNH 6.29 0.034
66 3-CNPhNH 6.90 0.101
67 2,5-F2PhNH 5.84 0.099
68 2,3,5,6-F4PhNH 6.36 0.103
Table 6

Predicted activities of selected 3-benzylidenehydrazine 1,2,4-triazin derivatives

Table 7

Predicted activities of selected three aryl pyridine derivatives

The other modification was replacement connective atoms (S or O) to N, this modification gave structures that showed good activity and were within the models domain of applicability (Table 5) but introducing the second nitrogen to linker slightly lead to decrease of predicted activity and was clearly within the domain of applicability (Table 6). Interestingly, replacement of pyridazine to pyridine ring led to enhanced biological activity and comfortably within the domain of applicability (Table 7). The In silico study suggested that presence of heterocyclic ring containing nitrogen group was necessary for inhibition Il-1 β production and increasing number of nitrogen ring diminished

biological activity.

Conclusion

In this study we have identified eight descriptors that successfully model the IL-1 β production inhibitory activity. The validation procedures utilized in this work (separation of data into independent training and validation sets, Y-randomization) illustrated the accuracy of produced QSAR model. Based on the developed QSAR model, we have designed novel structures that could be further investigated as novel effective IL-1 beta production inhibitors. The proposed method, due to the high predictive ability, tenders a useful alternative to the costly and time consuming experiments for cytokine production inhibitory activity.

Acknowledgements

References

  • 1.

    Enomoto H, Sawa A, Suhara H, Yamamoto N, Inoue H, Setoguchi C, Tsuji F, Okamoto M, Sasabuchi Y, Horiuchi M, Ban M. Synthesis and pharmacological evaluation of 1,1,3-substituted urea derivatives as potent TNF-alpha production inhibitors. Bioorg. Med. Chem. Lett. 2010;20:4479-82. [PubMed ID: 20580551].

  • 2.

    Tullis JS, Vanrens JC, Natchus MG, Clark MP, De B, Hsieh LC, Janusz MJ. The development of new triazole based inhibitors of tumor necrosis factor-alpha (TNF-alpha) production. Bioorg. Med. Chem. Lett. 2003;13:1665-8. [PubMed ID: 12729637].

  • 3.

    Melagraki G, Afantitis A, Sarimveis H, Koutentis PA, Kollias G, Igglessi-Markopoulou O. Predictive QSAR workflow for the in silico identification and screening of novel HDAC inhibitors. Mol. Divers. 2009;13:301-11. [PubMed ID: 19205914].

  • 4.

    Matsuda T, Aoki T, Ohgiya T, Koshi T, Ohkuchi M, Shigyo H. Synthesis and bioactivities of novel pyridazine derivatives: inhibitors of interleukin-1 beta (IL-1beta) production. Bioorg. Med. Chem. Lett. 2001;11:2369-72. [PubMed ID: 11527733].

  • 5.

    Matsuda T, Aoki T, Koshi T, Ohkuchi M, Shigyo H. Synthesis and bioactivities of novel 5,6-bis(4-methoxyphenyl)-2H-pyridazin-3-one derivatives: inhibitors of interleukin-1 beta (IL-1beta) production. Bioorg. Med. Chem. Lett. 2001;11:2373-5. [PubMed ID: 11527734].

  • 6.

    Iman MDA. QSAR and QSTR study of pyrimidine derivatives to improve their therapeutic index as antileishmanial agents. Med. Chem. Res. 2013;22:5029-35.

  • 7.

    Pereira F, Latino DA, Gaudencio SP. QSAR-assisted virtual screening of lead-like molecules from marine and microbial natural sources for antitumor and antibiotic drug discovery. Molecules. 2015;20:4848-73. [PubMed ID: 25789820].

  • 8.

    Huang SH, Tung CW, Fulop F, Li JH. Developing a QSAR model for hepatotoxicity screening of the active compounds in traditional Chinese medicines. Food Chem. Toxicol. 2015;78:71-7. [PubMed ID: 25660478].

  • 9.

    Georgey H. Synthesis, cytotoxic activity and 2D-QSAR study of some imidazoquinazoline derivatives. Molecules. 2014;19:3777-92. [PubMed ID: 24662088].

  • 10.

    Rodionova OY, Pomerantsev AL. Subset selection strategy. J. Chemometrics. 2008;22:674-85.

  • 11.

    Daszykowski M, Walczak B, Massart D. Representative subset selection. Anal. Chim. Acta. 2002;468:91-103.

  • 12.

    Afantitis A, Melagraki G, Sarimveis H, Koutentis PA, Markopoulos J, Igglessi-Markopoulou O. A novel QSAR model for predicting induction of apoptosis by 4-aryl-4H-chromenes. Bioorg. Med. Chem. 2006;14:6686-94. [PubMed ID: 16782350].

  • 13.

    Shahlaei M, Pourhossein A. A 2D image-based method for modeling some c-Src tyrosine kinase inhibitors. Med. Chem. Res. 2013;22:1-14.

  • 14.

    Xu J, Zhu L, Fang D, Wang L, Xiao S, Liu L, Xu W. QSPR studies of impact sensitivity of nitro energetic compounds using three-dimensional descriptors. J. Mol. Graph. Model. 2012;36:10-9. [PubMed ID: 22503858].

  • 15.

    Khoshneviszadeh M, Edraki N, Miri R, Foroumadi A, Hemmateenejad B. QSAR study of 4-aryl-4H-chromenes as a new series of apoptosis inducers using different chemometric tools. Chem. Biol. Drug Des. 2012;79:442-58. [PubMed ID: 22136072].

  • 16.

    Khoshneviszadeh M, Edraki N, Miri R, Hemmateenejad B. Exploring QSAR for substituted 2-sulfonyl-phenyl-indol derivatives as potent and selective COX-2 inhibitors using different chemometrics tools. Chem. Biol. Drug Des. 2008;72:564-74. [PubMed ID: 19090923].

  • 17.

    Leonard JT, Roy K. QSAR by LFER model of HIV protease inhibitor mannitol derivatives using FA-MLR, PCRA, and PLS techniques. Bioorg. Med. Chem. 2006;14:1039-46. [PubMed ID: 16213730].

  • 18.

    Darlington RB. Regression and linear models. New York: McGraw-Hill; 1990.

  • 19.

    Cho SJ, Hermsmeier MA. Genetic algorithm guided selection: variable selection and subset selection. J. Chem. Inf. Comput. Sci. 2002;42:927-36. [PubMed ID: 12132894].

  • 20.

    Kumar H, Siddiqui AA, Gupta SP. A Quantitative structure-activity relationship study on some imidazoles and 2-aminopyridines acting as nitric oxide synthase (NOS) inhibitors. Lett. Drug Des. Discov. 2013;10:666-74.

  • 21.

    Tropsha A, Gramatica P, Gombar VK. The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb. Sci. 2003;22:69-77.

  • 22.

    Golbraikh A, Tropsha A. Beware of q2! J. Mol. Graph. Model. 2002;20:269-76.

  • 23.

    Roy PP, Roy K. On some aspects of variable selection for partial least squares regression models. QSAR Comb. Sci. 2008;27:302-13.

  • 24.

    Gramatica P. Principles of QSAR models validation: internal and external. QSAR Comb. Sci. 2007;26:694-701.

  • 25.

    Hari Narayana Moorthy NS, Ramos MJ, Fernandes PA. QSAR analysis of isosteviol derivatives as-glucosidase inhibitors with element count and other descriptors. Lett. Drug Des. Discov. 2011;8:14-25.

  • 26.

    Soni LK, Gupta AK, Kaskhedikar SG. Exploration of QSAR modelling techniques and their combination to rationalize the physicochemical characters of nitrophenyl derivatives towards aldose reductase inhibition. J. Enzyme Inhib. Med. Chem. 2009;24:1002-7. [PubMed ID: 19514863].

  • 27.

    Holder AJ, Ye L, Yourtee DM, Agarwal A, Eick JD, Chappelow CC. An application of the QM-QSAR method to predict and rationalize lipophilicity of simple monomers. Dental Materials. 2005;21:591-8. [PubMed ID: 15978267].

  • 28.

    Asadollahi T, Dadfarnia S, Shabani AMH, Ghasemi JB, Sarkhosh M. QSAR models for CXCR2 receptor antagonists based on the genetic algorithm for data preprocessing prior to application of the PLS linear regression method and design of the new compounds using in silico virtual screening. Molecules. 2011;16:1928-55. [PubMed ID: 21358586].

  • 29.

    Gonzalez MP, Teran C, Teijeira M, Helguera AM. Radial distribution function descriptors: an alternative for predicting A2 A adenosine receptors agonists. Eur. J. Med. Chem. 2006;41:56-62. [PubMed ID: 16253394].

  • 30.

    Gonzalez MP, Gandara Z, Fall Y, Gomez G. Radial Distribution Function descriptors for predicting affinity for vitamin D receptor. Eur. J. Med. Chem. 2008;43:1360-5. [PubMed ID: 18068275].

  • 31.

    Fedorowicz A, Zheng L, Singh H, Demchuk E. QSAR study of skin sensitization using local lymph node assay data. Int. J. Mol. Sci. 2004;5:56-66.