A large amount of experimental data has now been accumulated in chemistry, which creates a need for improved computational methods for storing and processing it. In the approach used here, organic compounds are represented as sets of descriptors that characterize the features of their chemical structure. The transition to the gas phase is thermodynamically possible not only for liquids but also for solids: many organic compounds can pass from the solid state directly to the gaseous state, bypassing the liquid phase. This process is called sublimation. Its quantitative characteristic is the enthalpy of sublimation, denoted ΔHsub, an important thermodynamic parameter of practical interest. Data on the enthalpy of sublimation of organic compounds were taken from the literature, and values for 845 organic substances were included in the database for this work. To simplify the representation of organic compounds, we used 208 RDKit descriptors, as they are among the best descriptors for predicting the properties of chemical compounds. These descriptors are generated from shared substructure keys. In addition, models were built using Morgan molecular fingerprints (also known as circular fingerprints) with a radius of 2. Ridge regression, the random forest algorithm, the k-nearest neighbors (kNN) method, the support vector machine (SVM) method, and artificial neural networks were implemented in this work. On the training sample, the resulting random forest model fit the data without error: its prediction error is 0. The constructed ridge regression model has the following statistical characteristics for the sample: R2 = 0.88 and a prediction error of RMSE = 13.05 kJ/mol.
BIG DATA, INDUSTRY 4.0, ENTHALPY OF SUBLIMATION, MACHINE LEARNING, ARTIFICIAL INTELLIGENCE
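
The ridge regression and the two reported statistics (R2 and RMSE) can be illustrated with a minimal, dependency-free sketch. It is not the implementation used in this work. The function names (ridge_fit, solve_linear, predict, rmse, r2), the penalty value alpha, and the two-column toy matrix are all assumptions made for illustration, and the numbers are synthetic, not enthalpies of sublimation. In practice the rows of X would hold the descriptor or fingerprint values of each compound.

```python
import math


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge fit on centered data; the intercept is not penalized."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Normal equations with the L2 penalty alpha added on the diagonal
    A = [[sum(Xc[i][a] * Xc[i][b] for i in range(n)) + (alpha if a == b else 0.0)
          for b in range(p)] for a in range(p)]
    rhs = [sum(Xc[i][a] * yc[i] for i in range(n)) for a in range(p)]
    w = solve_linear(A, rhs)
    intercept = y_mean - sum(w[j] * x_mean[j] for j in range(p))
    return w, intercept


def predict(X, w, intercept):
    return [intercept + sum(wj * xj for wj, xj in zip(w, row)) for row in X]


def rmse(y_true, y_pred):
    """Root-mean-square error, in the same units as y."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic toy data (hypothetical): two "descriptor" columns, y = 2*x1 + 3*x2 + 5
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]
y = [13.0, 12.0, 23.0, 22.0, 30.0]

if __name__ == "__main__":
    w, b0 = ridge_fit(X, y, alpha=1.0)
    y_hat = predict(X, w, b0)
    print("weights:", w, "intercept:", b0)
    print("R2:", r2(y, y_hat), "RMSE:", rmse(y, y_hat))
```

With alpha set to 0 the fit reduces to ordinary least squares and reproduces the toy data exactly; a positive alpha shrinks the weights, so the training RMSE rises above zero. Both statistics should be reported on data held out from training, since a zero error on the training sample alone says little about predictive quality.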