:orphan: .. _adsDensOrganicsAppNote: Density calculation for organics using PCFF+, in HT mode ----------------------------------------------------------- .. sectionauthor:: Marianna .. reviewer: Dave .. status publishable .. expert Marianna, Dave .. keywords density, pcff+, forcefield, HT, LAMMPS .. modules LAMMPS, QSPR, HT In 1964, in his *Lectures on Physics*, Richard Feynman reflected *"If we were to name the most powerful assumption of all, which leads one on and on in an attempt to understand life, it is that all things are made of atoms and that everything that living things do can be understood in terms of the jigglings and wigglings of atoms."* Studying matter at the atomic scale can indeed provide invaluable information on how matter behaves and why it behaves the way it does. Forcefield simulations are widely used nowadays on systems containing hundreds to millions of atoms and for times that extend from a few picoseconds to microseconds. Density is a fundamental macroscopic property that can be calculated by forcefield simulations and which needs to be in excellent agreement with experimental data for any other property prediction to be meaningful. Other macroscopic properties of interest for fluids that may be calculated are saturation pressure, vaporization enthalpy, normal boiling point, critical point, solubility, diffusivity, viscosity, and thermal conductivity. In this work, we use molecular dynamics (MD) simulations, with the PCFF+ forcefield [#Rigby2026]_, to calculate saturated liquid density at several temperatures between the melting and the critical point of pure organic compounds. The calculations are performed in High-Throughput (HT) mode to facilitate and expedite setting up simulations as well as collecting and post-processing simulation output. A well-established and straightforward Group-Contribution QSPR method [#Joback1987]_ is used to select a set of temperatures for performing simulations for each compound. Forcefields ^^^^^^^^^^^^ Forcefield simulations rely—as the name implies—on the use of forcefields to describe the interatomic (or interparticle) interactions. In the case of molecular systems, these interactions are both intra- and inter-molecular. For the past four decades, numerous forcefields have been developed and proposed in the literature. Based on the forcefield development approach and techniques, the terms included, the energy functional forms used, the assumptions and approximations made, they can be classified as, e.g., Class I [#Vanommeslaeghe2010]_ [#Cornell1995]_ [#Jorgensen1996]_, Class II [#Maple1994]_ [#Sun1994]_ [#Sun1998]_, Class III, Embedded Atom (EAM) [#Daw1984]_, ReaxFF [#Senftle2016]_, and Machine Learning (ML) [#Unke2021]_. | Important aspects of any forcefield that define their applicability and their ability to be used for property prediction are: - Quality of property prediction - Chemical compounds' breadth of coverage - Transferability of parameters - Range of properties that can be studied PCFF+ description ^^^^^^^^^^^^^^^^^^^^^ The PCFF+ forcefield [#Rigby2026]_ is an extension of PCFF [#Sun1994]_, a Class II, All-Atom (AA) forcefield that has its origin in the CFF series of forcefields [#Maple1994]_. .. math:: E = \sum{E^{b}} + \sum{E^{a}} + \sum{E^{0}} + \sum{E^{t}} \\ \\ + \sum{E^{bb}} + \sum{E^{ab}} + \sum{E^{aa}} + \sum{E^{at}} + \sum{E^{bt}} \\ \\ + \sum{E^{elec}} + \sum{E^{VDW}} where .. math:: E^{b} = \sum_{i=2}^{4}{k_{i}^{b}(b - b_{0})^{i}} .. math:: E^{a} = \sum_{i=2}^{4}{k_{i}^{a}(\theta - \theta_{0})^{i}} .. math:: E^{t} = \sum_{i=1}^{4}{k_{i}^{t}(1 - cos i \phi)} .. math:: E^{0} = k^{0}(\chi - \chi_{0})^{2} .. math:: \{E^{bb},E^{aa},E^{ab}\} = k^{c}(s-s_{0})(s' - s'_{0}) .. math:: \{E^{bt}\} = (b - b_{0}) \sum_{i=1}^{3}{k_{i}^{c}(1 - cos i \phi)} \} .. math:: \{ E_{at} = (\theta - \theta_{0}) \sum_{i=1}^{3}{k_{i}^{c}(1 - cos i \phi)} \} .. math:: E^{el} = \sum_{ij}{\dfrac{q_{i}q_{j}}{r_{ij}}} .. math:: E^{VDW} = \sum_{ij}{\epsilon_{ij} \left[ 2 \left( \dfrac{r_{ij}^{*}}{r_{ij}} \right)^9 - 3 \left( \dfrac{r_{ij}^{*}}{r_{ij}} \right)^6 \right]} are the bond :math:`E^{b}`, angle :math:`E^{a}`, torsion :math:`E^{t}`, out-of-plane angles :math:`E^{0}`, bond-bond :math:`E^{bb}` / angle-angle :math:`E^{aa}` / bond-angle :math:`E^{ba}`, bond-torsion :math:`E^{bt}`, electrostatic :math:`E^{el}`, and van der Waals terms :math:`E^{VDW}`, respectively. The nonbond terms (van der Waals and electrostatic) are pairwise interactions between atoms that are separated by three or more bonds (intramolecular interactions) that belong to different molecules (intermolecular interactions). Amongst the different energy contributions, the nonbond interactions are often the most difficult to parameterize, as they cannot be directly obtained from ab initio calculations. PCFF+ extensions and improvements on the original PCFF forcefield focus on precisely this point, i.e. the optimization of existing nonbond parameters as well as the introduction and parameterization of new atom types to achieve the highest possible coverage and accuracy for organic (and some inorganic) species, while maintaining forcefield transferability and a manageable number of distinct atom types. PCFF+ provides coverage for all major groups of organic species, shown in figure :numref:`Figure %s `. As is shown, there is availability of all required forcefield parameters for 95% of the species included in DIPPR [#Wilding1998]_, one of the largest molecular databases. The vast majority of the database consists of organic compounds, but there are also many inorganic compounds. .. _pcff+coverage: .. figure:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/pcff+coverage.png Chemical families of compounds tested for forcefield coverage with PCFF+. A total of 2,150 compounds are included in the coverage analysis. Molecular Simulations ^^^^^^^^^^^^^^^^^^^^^^^^ A significant benefit of the use of forcefield simulations for property prediction is the fact that the amount of human time needed to set up the simulations can be minimized with the use of robust High-Throughput (HT) computation protocols and reliable compute engines. Here we use a robust MedeA HT computational protocol to calculate saturated liquid density at five (5) temperatures for twenty-eight (28) compounds. The only input provided by the user is a list of compounds, together with their SMILES codes and the number of temperatures at which to run the simulations. Everything is processed in a single job, a single workflow, which sets up different tasks for creating the appropriate input (including the creation of amorphous liquid-like configurations of each compound and the choice of temperatures), submitting the computations, retrieving the output once a task is finished, post-processing the output and printing the results in the desired format. Depending on the availability of computing resources, the different tasks can run in a serial or parallel way, locally or remotely. The schematic workflow of the entire job is shown in :numref:`Figure %s `. .. _workflow4denscalcHT: .. figure:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/Workflow.png :width: 3.1in Workflow of a single job for calculating saturated liquid densities at five temperatures and 1 atm, for 28 organic compounds. Computational Details ^^^^^^^^^^^^^^^^^^^^^^ The first step is the definition of a list of compounds containing the name and SMILES string. Then, for each compound in this list, the Joback Group-Contribution method is used to estimate the melting temperature (:math:`T_{m}`) and the compound's critical temperature (:math:`T_{c}`). Five temperatures are then selected, in the range :math:`(T_{m},0.8*T_{c}]`. An amorphous configuration containing several molecules is created for each compound often at a relatively low density of 0.3 g/ml, though higher densities may optionally be used. The number of molecules is set so that at least 1,200 atoms are present in the system. For each temperature, an MD simulation is started, including a short NVT run (100 ps) followed by a longer NPT run (1 ns). Density is calculated from the simulation output for the time interval for which convergence has been reached after a convergence analysis is performed. Results are collected, and the appropriate tables are printed, together with a summary of the computation. Human intervention is not required after submitting the workflow (as a single job) to the compute resources. Moreover, the number of compounds may be increased, as appropriate, without any change to the protocol or the human time needed for setting up the workflow. Results and Discussion ^^^^^^^^^^^^^^^^^^^^^^ PCFF+ based density calculations over wide ranges of temperature are presented in the figures below and compared with curves fitted to curated experimental data as developed within the DIPPR 801 database project [#Wilding1998]_ . .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/1_ethane.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/2_propane.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/3_pentane.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/4_octane.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/5_iso-octane.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/6_decane.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/7_dodecane.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/8_methanol.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/9_ethanol.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/10_propanol.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/11_isopropanol.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/12_n-pentanol.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/13_propanone.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/14_butanone.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/15_2-pentanone.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/16_methyl-propanoate.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/17_methyl-butanoate.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/18_benzene.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/19_naphthalene.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/20_anthracene.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/21_toluene.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/23_cyclohexane.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/24_methyl-cyclohexane.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/25_thiophene.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/26_pyridine.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/27_pyrole.png :width: 3.1in .. image:: /AppNotes/images/ApplicationNote-DensityOfOrganics-pcff+/28_phenol.png :width: 3.1in Overall, agreement with the correlated experimental data is excellent, generally exhibiting average absolute deviations better than the parameterization target of 1% between the melting temperature and normal boiling point imposed for all new parameter development in PCFF+. .. only:: html :download: :download:`pdf ` .. [#Rigby2026] Rigby, D. “The PCFF+ forcefield for condensed matter simulation: Overview and validation”, in preparation. .. [#Joback1987] Joback, K. G.; Reid, R. C. "Estimation of pure-component properties from Group-Contributions", *Chem. Eng. Comm.* **57**, 233–243, (1987) https://doi.org/10.1080/00986448708960487 .. [#Sun1994] Sun, H.; Mumby, S. J.; Maple, J. R.; Hagler, A. T., "An Ab Initio CFF93 All-Atom Force Field for Polycarbonates", *J. Am. Chem. Soc.* **116**, 2978–2987, (1994) https://doi.org/10.1021/ja00086a030 .. [#Sun1998] Sun, H. "COMPASS An Ab Initio Force-Field Optimized for Condensed-Phase Applications Overview with Details on Alkane and Benzene Compounds". *J. Phys. Chem. B*, **102**, 7338–7364, (1998) https://doi.org/10.1021/jp980939v .. [#Daw1984] Daw, M. S.; Baskes, M. I., "Embedded-Atom Method: Derivation and Application to Impurities, Surfaces, and Other Defects in Metals", *Phys. Rev. B*, **29**, 6443–6453, (1984) https://doi.org/10.1103/PhysRevB.29.6443 .. [#Senftle2016] Senftle, T. P.; Hong, S.; Islam, M. M.; Kylasa, S. B.; Zheng, Y.; Shin, Y. K.; Junkermeier, C.; Engel-Herbert, R.; Janik, M. J.; Aktulga, H. M.; Verstraelen, T.; Grama, A.; van Duin, A. C. T., "The ReaxFF Reactive Force-Field: Development, Applications and Future Directions", *npj Computational Materials* **2**, 15011, (2016) https://doi.org/10.1038/npjcompumats.2015.11 .. [#Vanommeslaeghe2010] Vanommeslaeghe, K.; Hatcher, E.; Acharya, C.; Kundu, S.; Zhong, S.; Shim, J.; Darian, E.; Guvench, O.; Lopes, P.; Vorobyov, I.; Mackerell, A. D., Jr., "CHARMM General Force Field: A Force Field for Drug-like Molecules Compatible with the CHARMM All-Atom Additive Biological Force Fields", *J Comput Chem*, **31**, 671–690, (2010) https://doi.org/10.1002/jcc.21367 .. [#Cornell1995] Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Merz, K. M.; Ferguson, D. M.; Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollman, P. A., "A Second Generation Force Field for the Simulation of Proteins, Nucleic Acids, and Organic Molecules", *J. Am. Chem. Soc.* **117**, 5179–5197, (1995) https://doi.org/10.1021/ja00124a002 .. [#Jorgensen1996] Jorgensen, W. L.; Maxwell, D. S.; Tirado-Rives, J., "Development and Testing of the OPLS All-Atom Force Field on Conformational Energetics and Properties of Organic Liquids", *J. Am. Chem. Soc.* **118**, 11225–11236, (1996) https://doi.org/10.1021/ja9621760 .. [#Wilding1998] Wilding, W. V.; Rowley, R. L.; Oscarson, J. L., "DIPPR® Project 801 Evaluated Process Design Data", *Fluid Phase Equilibria* **150–151**, 413–420, (1998) https://doi.org/10.1016/S0378-3812(98)00341-0 .. [#Maple1994] Maple, J. R.; Hwang, M.-J.; Stockfisch, T. P.; Dinur, U.; Waldman, M.; Ewig, C. S.; Hagler, A. T., "Derivation of Class II Force Fields. I. Methodology and Quantum Force Field for the Alkyl Functional Group and Alkane Molecules", *Journal of Computational Chemistry* **15**, 162–182, (1994) https://doi.org/10.1002/jcc.540150207 .. [#Gong2018] Gong, Z.; Wu, Y.; Wu, L.; Sun, H., "Predicting Thermodynamic Properties of Alkanes by High-Throughput Force Field Simulation and Machine Learning", *J. Chem. Inf. Model.*, **58**, 2502–2516, (2018) https://doi.org/10.1021/acs.jcim.8b00407 .. [#Unke2021] Unke, O. T.; Chmiela, S.; Sauceda, H. E.; Gastegger, M.; Poltavsky, I.; Schütt, K. T.; Tkatchenko, A.; Müller, K.-R., "Machine Learning Force Fields", *Chem. Rev.* 2021. https://doi.org/10.1021/acs.chemrev.0c01111