<?xml version="1.0" encoding="utf-8"?>
  <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
      <title>BIOVIA</title>
      <link>https://blog--3ds--com.apsulis.fr/brands/biovia/feed.xml</link>
      <description>BIOVIA</description>
      <lastBuildDate>Thu, 05 Mar 2026 16:10:04 GMT</lastBuildDate>
      <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
      <generator>3DExperience Works</generator>
      <atom:link href="https://blog--3ds--com.apsulis.fr/brands/biovia/feed.xml" rel="self" type="application/rss+xml"/>

      <item>
      <title>
      <![CDATA[ blockquote TEST ]]>
      </title>
      <link>https://blog--3ds--com.apsulis.fr/brands/biovia/blockquote/</link>
      <guid>https://blog--3ds--com.apsulis.fr/guid/275346</guid>
      <pubDate>Thu, 27 Feb 2025 09:12:58 GMT</pubDate>
      <description>
      <![CDATA[  ]]>
      </description>
      <content:encoded>
      <![CDATA[ 

Tempus gravida condimentum sed torquent class donec scelerisque rutrum. Elit cras interdum habitant dis porta. Risus odio inceptos tristique ullamcorper in sed mi.









Malesuada feugiat commodo 3DEXPERIENCE urna hac ligula facilisis dapibus nascetur eu. Nibh taciti lacinia turpis pede eleifend urna italic si phasellus imperdiet. Donec augue highlight urna ad curabitur aliquam convallis fringilla imperdiet magna. Internal link and external link.









2 Malesuada feugiat commodo 3DEXPERIENCE urna hac ligula facilisis dapibus nascetur eu. Nibh taciti lacinia turpis pede eleifend urna italic si phasellus imperdiet. Donec augue highlight urna ad curabitur aliquam convallis fringilla imperdiet magna. Internal link and external link.





NEW Malesuada feugiat commodo 3DEXPERIENCE urna hac ligula facilisis dapibus nascetur eu. Nibh taciti lacinia turpis pede eleifend urna italic si phasellus imperdiet. Donec augue highlight urna ad curabitur aliquam convallis fringilla imperdiet magna. Internal link and external link.

 ]]>
      </content:encoded>
      </item>
<item>
      <title>
      <![CDATA[ The Shape of Water ]]>
      </title>
      <link>https://blog--3ds--com.apsulis.fr/brands/biovia/the-shape-of-water/</link>
      <guid>https://blog--3ds--com.apsulis.fr/guid/273202</guid>
      <pubDate>Fri, 22 Nov 2024 01:16:25 GMT</pubDate>
      <description>
      <![CDATA[ Discover how BIOVIA Solvation Chemistry and COSMOtherm uncover the hydrophobic effect of water, a phenomenon fundamental to life and industry.
 ]]>
      </description>
      <content:encoded>
      <![CDATA[ 
Understanding the Hydrophobic Effect



Although it seems quite abstract, the interactions of molecules and atoms on a microscopic level often have profound effects on the macroscopic properties of materials and on life itself. BIOVIA Solvation Chemistry provides the scientific link between these microscopic, molecular interactions and industry-relevant experimental properties of liquids such as solubilities, vapor pressures, partition coefficients and many more. As an example, this blog post discusses the anomalous hydrophobic effect of water on a molecular level and its implications for industrial applications and for life on Earth in general.



The phenomenon known as the hydrophobic effect, originating from the Greek words ύδωρ (ydor, water) and φόβος (phobos, fear), describes how molecules that “fear” water tend to come together to avoid interacting with it. This effect explains why oil and water do not mix but form separate phases, and why some compounds are more soluble in water than others.



It is not just dry textbook knowledge, but a concept fundamental to the existence of life on our planet, serving as a key driver behind biological processes such as the formation of cells and the folding of proteins into active or inactive structures. Proteins are essential building blocks of our body, consisting of chains of amino acids. In many cases, proteins only function properly when they are in a certain folded state, a state that is typically achieved within a specific range of temperatures. Among other factors, increased temperatures can lead to the denaturation of proteins, rendering them inactive. This process can be easily observed when boiling an egg, for example, where egg albumen turns into a white, opaque substance upon denaturation of proteins (mostly ovalbumin) above 60 °C. At low temperatures, too, proteins can undergo reversible unfolding, a process known as cold denaturation. This behavior is directly linked to the temperature sensitivity of the hydrophobic effect, and in order to understand it, we need to investigate the hydrophobicity of water itself.



The Surprising Hydrophobicity of Water



It might sound surprising, but water itself can be hydrophobic, at least to a certain degree.



Water molecules can arrange themselves in specific shapes or clusters. In these clusters, molecules are connected by what chemists call a hydrogen bond. In water, this special type of bond occurs between a hydrogen atom of one water molecule and the oxygen atom of another. Since each water molecule has two hydrogen atoms and each oxygen atom can accept two hydrogen bonds, complex networks can form that stabilize different shapes of water clusters. Interestingly, the surface of these clusters has different properties than the surface of individual water molecules.



Relative screening charge density profile of the surface of a single water molecule (blue, structure top left) and a cluster of 20 molecules (orange, structure top right). The orange curve shows a clear peak of charge-neutral area at its center, which is absent for the isolated water molecule.







The picture shows the charge density surface of water and of a cluster of water molecules connected by hydrogen bonds, easily computed with BIOVIA Turbomole. Blue and red areas denote surface areas with large positive or negative screening charge densities, whereas green areas denote small screening charge densities close to zero. Generally, as a direct consequence of Coulomb’s law, opposite charges attract each other. Therefore, compounds with large positive or negative screening charge densities prefer to be in contact with compounds of matching, opposite screening charge densities. Compounds with a lot of nonpolar surface area (green) prefer other nonpolar compounds. As can be seen, a single water molecule has quite a lot of surface with positive and negative screening charge densities (blue, red). In contrast, the cluster structure has more neutral area with small screening charge densities. For this reason, these clusters partly behave like a nonpolar, hydrophobic substance.



Therefore, at low temperatures, when these clusters become more stable, the surface properties of water change. Water itself becomes increasingly hydrophobic and, in turn, hydrophobic molecules become more soluble in water. When the temperature increases, the clusters break apart and the solubility of hydrophobic molecules decreases. At even higher temperatures, other thermodynamic effects increase the solubility again, leading to a minimum in solubility that is typically found somewhere in the range of 20 to 80 °C, i.e. near room and body temperature.
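
As a minimal sketch of how such a minimum can arise mathematically, one widely used empirical correlation (an assumption here, not the COSMOtherm model) writes the logarithm of the mole-fraction solubility as ln x = A + B/T + C ln T. Setting the derivative with respect to T to zero gives a minimum at T = B/C when B and C are both positive. The coefficients below are hypothetical, not fitted to any data, and chosen only so that the minimum falls at 320 K (about 47 °C), inside the range quoted above.

```python
import math

def ln_solubility(temp_k, a, b, c):
    """Empirical form ln x = A + B/T + C*ln(T), with T in kelvin."""
    return a + b / temp_k + c * math.log(temp_k)

def minimum_temperature(b, c):
    """d(ln x)/dT = -B/T**2 + C/T = 0  ->  T_min = B/C (a minimum when B, C > 0)."""
    if b <= 0 or c <= 0:
        raise ValueError("this form only has a solubility minimum for B > 0 and C > 0")
    return b / c

# Hypothetical coefficients, for illustration only (not fitted to any data).
A, B, C = -210.0, 9600.0, 30.0

t_min = minimum_temperature(B, C)
print(f"minimum at {t_min:.1f} K ({t_min - 273.15:.2f} °C)")

# Tabulate ln x on both sides of the minimum to see the curve turn around.
for t in (280.0, 300.0, 320.0, 340.0, 360.0):
    print(t, round(ln_solubility(t, A, B, C), 4))
```

In this simplified picture, the B/T term dominates at low temperature (solubility falls on heating) and the C ln T term dominates at high temperature (solubility rises again), which mirrors the two regimes described in the text.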



The accurate assessment of this temperature dependence is of importance in many applications, ranging from the solubility of additives in oil processing or carbon capture applications, to computation of partition coefficients of active pharmaceutical ingredients.



Advancing Innovation with BIOVIA COSMOtherm



BIOVIA COSMOtherm can simulate the temperature-dependent hydrophobic effect of water efficiently and accurately, as shown in a recent publication by M. P. Andersson and M. Richter.



Aqueous solubilities of hexanol and benzaldehyde over a wide temperature range. Experimental solubilities (blue curves) show a minimum around 320 K (hexanol) and 290 K (benzaldehyde) due to the hydrophobic effect of water. COSMOtherm FINE 2023 recovers these minima with high accuracy (325 K for hexanol and 290 K for benzaldehyde).







The picture shows the temperature-dependent solubility of hexanol and benzaldehyde in water. Since both compounds are rather hydrophobic, the solubility is generally quite small, but simulation and experiment clearly show a minimum of solubility around 290 K (17 °C) for benzaldehyde and 320 K (47 °C) for hexanol. With BIOVIA COSMOtherm, the temperature-dependent solubility of any compound in water can be easily assessed out of the box, including anomalies like the hydrophobic effect of water. This ability allows large-scale in-silico screenings of molecular properties with high accuracy that complement experimental efforts, speed up innovation and reduce time-to-market for our customers.



Learn more about BIOVIA COSMO RS.







References




Andersson, M.P. &amp; Richter, M. (2024). Comment on: The shape of water &#8211; how cluster formation explains the hydrophobic effect. J. Mol. Liq., 409, 125465 https://doi.org/10.1016/j.molliq.2024.125465








Interested in staying up to date on all the latest news of BIOVIA?




Subscribe to our newsletter





 ]]>
      </content:encoded>
      </item>
<item>
      <title>
      <![CDATA[ Interpretable Machine Learning in Pipeline Pilot ]]>
      </title>
      <link>https://blog--3ds--com.apsulis.fr/brands/biovia/interpretable-machine-learning-in-pipeline-pilot/</link>
      <guid>https://blog--3ds--com.apsulis.fr/guid/272613</guid>
      <pubDate>Fri, 15 Nov 2024 17:21:04 GMT</pubDate>
      <description>
      <![CDATA[ Explore how interpretable machine learning models like SISSO are advancing scientific insight in chemistry
 ]]>
      </description>
      <content:encoded>
      <![CDATA[ 
Introduction



The majority of machine learning algorithms applied in chemistry and biology are black box models used to make predictions on given target properties.1–3



The model receives input features and generates an output, but how the model arrived at that output is unknown or extremely difficult to understand due to the complexity of the model. Therefore, extracting meaningful scientific insights from these models has proven to be a challenge.1,4



Interpretable ML models that offer predictive capabilities combined with interpretable physical equations are gaining traction in many areas of science.1,2,5,6



The goal here is to have what is termed a glass-box model where simple physical equations relate the input features to the target properties. In this way relationships in the data can be understood and improved scientific insight can be gained from the model.







Figure 1: Schematic of Black-box models and Interpretable glass-box models



Sure Independence Screening and Sparsifying Operator – SISSO



Among the methods developed for interpretable machine learning, the Sure Independence Screening and Sparsifying Operator (SISSO) methodology has been widely applied in heterogeneous catalysis and organic chemistry.7–11 SISSO belongs to the symbolic regression class of models and can be used to find mathematical functions that predict the target property.



In simple terms, SISSO consists of two parts:




Creation of a large feature space by combining the feature columns or descriptors with user selected operators (e.g., multiplication, division, ln, sqrt etc.).



Using sure-independence screening (SIS) to select the descriptors with the highest correlation to the target property, then applying ℓ0 regularization to select the low-dimensional linear models with the lowest error.
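
The two parts above can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions, not the SISSO or SISSO++ implementation: the function names (`build_feature_space`, `sis`, `l0_search`) are hypothetical, the operator set is reduced to products, ratios and squares, only one screening round is run, and the ℓ0 step is a plain exhaustive search over small descriptor subsets fitted by least squares. The data are synthetic.

```python
import itertools
import numpy as np

def build_feature_space(X, names):
    """Part 1: expand primary features with a few operators (*, /, square)."""
    feats = {name: X[:, i] for i, name in enumerate(names)}
    primary = list(feats.items())
    for (a, xa), (b, xb) in itertools.combinations(primary, 2):
        feats[f"({a}*{b})"] = xa * xb
        if np.all(xb != 0):
            feats[f"({a}/{b})"] = xa / xb
    for name, col in primary:
        feats[f"({name}^2)"] = col ** 2
    return feats

def sis(feats, y, k):
    """Part 2a: keep the k features most correlated with the target."""
    def score(col):
        if np.std(col) == 0:
            return 0.0  # constant columns carry no information
        return abs(np.corrcoef(col, y)[0, 1])
    return sorted(feats, key=lambda n: score(feats[n]), reverse=True)[:k]

def l0_search(feats, candidates, y, dim):
    """Part 2b: try every subset of size dim and keep the lowest-RMSE linear fit."""
    best = None
    for combo in itertools.combinations(candidates, dim):
        A = np.column_stack([feats[n] for n in combo] + [np.ones_like(y)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        rmse = float(np.sqrt(np.mean((A @ coef - y) ** 2)))
        if best is None or rmse < best[0]:
            best = (rmse, combo, coef)
    return best

# Synthetic data: the target depends on the product of the first two features.
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 2.0, size=(30, 3))
y = 2.0 * X[:, 0] * X[:, 1] + 0.5

feats = build_feature_space(X, ["a", "b", "c"])
top = sis(feats, y, k=5)
rmse, combo, coef = l0_search(feats, top, y, dim=1)
print("selected descriptor:", combo, "coefficients:", coef, "RMSE:", rmse)
```

The exhaustive subset search is what makes the result an explicit, readable equation, and it is also why the cost grows quickly with the number of candidate features and the descriptor dimension.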




With this approach, the aim is to use SISSO to find interpretable equations, that make scientific sense, from a range of input features. The input columns or descriptors can be experimental and/or those obtained from molecular modelling studies, including those conducted in BIOVIA Materials Studio® or BIOVIA Turbomole®.







Figure 2: Example of the types of inputs that have been used with SISSO



The original SISSO code was implemented in FORTRAN12 and does not contain direct Python support. However, a newer C++ implementation (SISSO++) has been released by the NOMAD Laboratory which has native Python integration.13,14



Let’s say I wanted to apply the SISSO algorithm to some chemistry datasets in order to expand my scientific insight. How could I go about deploying this ML method as part of my data science pipelines?



The answer is to use BIOVIA Pipeline Pilot15 to wrap the Python code and extend access to these glass-box models.



SISSO++ integration with Pipeline Pilot



Thanks to the strong integration between BIOVIA Pipeline Pilot and Python, there is a range of options for incorporating Python code into existing data pipelines.16



In this example we are going to use the Jupyter Notebook components to handle the Python portion and use the native PLP components to read, write and clean data ready for input.



We will take two sets of data, one published by bp17 and one published by Sigman and co-workers.10 The bp data covers the use of benzaldehyde promoters in H-ZSM-5 dehydration of methanol to dimethyl ether (DME) and the Sigman data a diastereoselective Rh catalysed C-H insertion.



Both datasets are small (22 rows and 84 rows) by the standards of most AI methods but reflect the realistic acquisition of small high-quality datasets typically found in industry and academia.



Inside the Python Jupyter Notebook component, SISSO++ can be set up by selecting the operators, target column and the desired train/test split. In addition, hyperparameters can be set and the calculation type can be toggled between regression and classification.



We apply the model to the bp dataset where the target property is DME STY (space time yield – a measure of catalytic performance) and the 10 descriptor columns are density functional theory (DFT) derived features for the organic promoter aldehydes (other reaction parameters are kept constant).



We obtain an interpretable equation that provides scientific insight and outputs are displayed using the Pipeline Pilot reporting components. The SISSO++ model is comparable to the reported model and makes chemical sense as it relates steric and electronic features of the aldehyde promoter to catalytic performance.







Figure 3: Output of SISSO++ regression model for bp dataset run through BIOVIA Pipeline Pilot



One potential limitation of the SISSO++ code is that it can become computationally expensive with datasets that contain a large number of features. To that end, the Materials AI team at BIOVIA along with Felix Hanke (formerly at BIOVIA) developed a BIOVIA Pipeline Pilot native version of SISSO++ for regression problems.



Native SISSO++ in Pipeline Pilot



By using the parallelisation and simplicity of BIOVIA Pipeline Pilot we can increase the speed of finding interpretable equations for scientific datasets and run the models without any coding expertise.



The new protocol applies the same SISSO++ methodology whereby a vast number of features are generated and then parsed to give the best performing equations, but this is performed in a different way inside Pipeline Pilot.







Figure 4: Native SISSO protocol in BIOVIA Pipeline Pilot



The resulting output is comparable to the SISSO++ Python package but simplifies usage for the scientist as you do not need to interact with any code. In fact, the protocol can be run through the Pipeline Pilot Web Port with users selecting parameters through drop-down menus making it ideal for scientists with no coding experience.



In this example we show the output for the dataset from Sigman and co-workers10 where the target is ΔΔG‡ (a measure of diastereoselectivity) and there are 19 DFT-derived chemical descriptors. Again, we obtain an interpretable equation for the data that is comparable to the reported model, which relates steric and electronic properties of the catalyst/ligand to the diastereoselectivity.



Because Pipeline Pilot handles large amounts of data effectively, models can also be obtained for larger datasets (&gt;50 billion generated features).







Figure 5: Example of BIOVIA Pipeline Pilot Web Port being used to run native SISSO algorithm.



Conclusion



The simple integration of Python in BIOVIA Pipeline Pilot enables us to incorporate SISSO++ and other Python packages easily into new and existing data pipelines.



We can also make use of the flexibility and speed of BIOVIA Pipeline Pilot to incorporate new methods for interpretable machine learning into data science workflows. In this way, BIOVIA Pipeline Pilot can be used to help scientists gain meaningful scientific insights from predictive models. With BIOVIA Pipeline Pilot these types of models can be deployed in a low/no-code environment to aid understanding and further innovation in scientific challenges.



References



(1) Esterhuizen, J. A.; Goldsmith, B. R.; Linic, S. Interpretable Machine Learning for Knowledge Generation in Heterogeneous Catalysis. Nat Catal 2022, 5.

(2) Azodi, C. B.; Tang, J.; Shiu, S.-H. Opening the Black Box: Interpretable Machine Learning for Geneticists. Trends in Genetics 2020, 36 (6), 442–455.

(3) Jiménez-Luna, J.; Grisoni, F.; Schneider, G. Drug Discovery with Explainable Artificial Intelligence. Nature Machine Intelligence. Nature Research October 1, 2020, pp 573–584.

(4) Molnar, C. Interpretable Machine Learning. A Guide for Making Black Box Models Explainable; https://christophm.github.io/interpretable-ml-book/, 2019, accessed 04/11/2025.

(5) La Cava, W. G.; Lee, P. C.; Ajmal, I.; Ding, X.; Solanki, P.; Cohen, J. B.; Moore, J. H.; Herman, D. S. A Flexible Symbolic Regression Method for Constructing Interpretable Clinical Prediction Models. NPJ Digit Med 2023, 6 (1).

(6) Rudin, C. Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead. Nat Mach Intell 2019, 1 (5), 206–215.

(7) Foppa, L.; Rüther, F.; Geske, M.; Koch, G.; Girgsdies, F.; Kube, P.; Carey, S. J.; Hävecker, M.; Timpe, O.; Tarasov, A. V.; Scheffler, M.; Rosowski, F.; Schlögl, R.; Trunschke, A. Data-Centric Heterogeneous Catalysis: Identifying Rules and Materials Genes of Alkane Selective Oxidation. J Am Chem Soc 2023, 145 (6), 3427–3442.

(8) Miyazaki, R.; Belthle, K. S.; Tüysüz, H.; Foppa, L.; Scheffler, M. Materials Genes of CO2 Hydrogenation on Supported Cobalt Catalysts: An Artificial Intelligence Approach Integrating Theoretical and Experimental Data. J Am Chem Soc 2024, 146 (8), 5433–5444.

(9) Wang, J.; Xie, H.; Wang, Y.; Ouyang, R. Distilling Accurate Descriptors from Multi-Source Experimental Data for Discovering Highly Active Perovskite OER Catalysts. J Am Chem Soc 2023, 145 (20), 11457–11465.

(10) Souza, L. W.; Miller, B. R.; Cammarota, R. C.; Lo, A.; Lopez, I.; Shiue, Y.-S.; Bergstrom, B. D.; Dishman, S. N.; Fettinger, J. C.; Sigman, M. S.; Shaw, J. T. Deconvoluting Nonlinear Catalyst–Substrate Effects in the Intramolecular Dirhodium-Catalyzed C–H Insertion of Donor/Donor Carbenes Using Data Science Tools. ACS Catal 2023, 104–115.

(11) Park, J.; Oh, J.; Kim, J. S.; Shin, J. H.; Jeon, N.; Chang, H.; Yun, Y. Catalyst Discovery for Propane Dehydrogenation through Interpretable Machine Learning: Leveraging Laboratory-Scale Database and Atomic Properties. ACS Sustain Chem Eng 2024, 12 (28), 10376–10386.

(12) Ouyang, R.; Curtarolo, S.; Ahmetcik, E.; Scheffler, M.; Ghiringhelli, L. M. SISSO: A Compressed-Sensing Method for Identifying the Best Low-Dimensional Descriptor in an Immensity of Offered Candidates. Phys Rev Mater 2018, 2 (8), 1–12.

(13) Purcell, T. A. R.; Scheffler, M.; Carbogno, C.; Ghiringhelli, L. M. SISSO++: A C++ Implementation of the Sure-Independence Screening and Sparsifying Operator Approach. J Open Source Softw 2022, 7 (71), 3960.

(14) Purcell, T. A. R.; Scheffler, M.; Ghiringhelli, L. M. Recent Advances in the SISSO Method and Their Implementation in the SISSO++ Code. J Chem Phys 2023, 159 (11), 114110.

(15) Pipeline Pilot. https://www.3ds.com/products-services/biovia/products/data-science/pipeline-pilot/.

(16) Pipeline Pilot | Integration of Python and Jupyter Notebook. https://www.youtube.com/watch?v=1sFaA7Fj0oM, accessed on 18/09/2024.

(17) Yang, Z.; Dennis-Smither, B. J.; Buda, C.; Easey, A.; Jackson, F.; Price, G. A.; Sainty, N.; Tan, X.; Xu, Z.; Sunley, G. J. Aromatic Aldehydes as Tuneable and Ppm Level Potent Promoters for Zeolite Catalysed Methanol Dehydration to DME. Catal Sci Technol 2023, 13 (12), 3590–3605.







Interested in staying up to date on all the latest news of BIOVIA?




Subscribe to our newsletter!

 ]]>
      </content:encoded>
      </item>
<item>
      <title>
      <![CDATA[ Next-Generation User Assistance: Making Your Life Easier ]]>
      </title>
      <link>https://blog--3ds--com.apsulis.fr/brands/biovia/next-generation-user-assistance-making-your-life-easier/</link>
      <guid>https://blog--3ds--com.apsulis.fr/guid/271901</guid>
      <pubDate>Thu, 07 Nov 2024 12:56:07 GMT</pubDate>
      <description>
      <![CDATA[ Materials Management is BIOVIA’s new cloud-based solution that enables you to capture, define, and locate any proprietary or commercially sourced chemical or biological material.
 ]]>
      </description>
      <content:encoded>
      <![CDATA[ 
Materials Management is BIOVIA’s new cloud-based solution that enables you to capture, define, and locate any proprietary or commercially sourced chemical or biological material. Built on the powerful 3DEXPERIENCE platform, this new app provides seamless access to critical information for efficient materials management. Take a look at past blogs here and here to find out more about its capabilities and ability to capture a diverse range of entities including PROTACs and molecular glues.



Materials Management and other BIOVIA apps exist as convenient, modern web apps on the 3DEXPERIENCE platform. Interoperable and frequently updated, they exploit the latest approaches and best practices in user assistance to make usage as easy as possible.



Discover Dynamic “What’s New” Content



Gone are the days of static release notes! With Materials Management, you can now quickly see the new capabilities delivered in each release. Linked directly from the app home page, What&#8217;s New lets you very quickly get up to speed with the new features provided.







What’s New content for the Materials Management R2024x FD03 release, viewable as a set of interactive tiles.



Tiles, illustrated with an image or animated GIF, provide easy-to-digest highlights that explain the value of each new feature. For each one, you can click the “Learn more” link to open a panel that provides more information about the feature, as well as a link to the detailed related topics in the Materials Management User’s Guide.







Click the “Learn More” link to open a detailed description of the selected feature.



Context Sensitive Help



Have you ever wanted to see help information tailored to a specific operation?



In Materials Management, the User Assistance panel is context-sensitive, showing you the most relevant information for the operation you’re currently working on.



Are you creating a substance and you need to generate it from the BIOVIA Basics Substances list? No problem, simply click the help icon to bring up context-sensitive guidance that explains just how to do that.







Context-sensitive help (on the right) helps you with the task in hand.



If you keep the User Assistance panel open, the context-sensitive help guidance changes as you navigate through the app, so you can always see the content you need to help you accomplish the task in hand.



Role Cockpits



BIOVIA organizes apps into roles enabling you to execute and complete key business processes, such as formulation development. The Formulation Scientist role brings together apps such as Materials Management, Scientific Notebook, Laboratory Operations, and others, so that you can undertake formulation design studies, helping to evolve and respond to changing market needs. A key element of roles like Formulation Scientist is their Role Cockpit.







The Formulation Scientist role cockpit provides scientists with access to key apps and introductory information, to enable them to be productive within minutes.



This role cockpit is designed to provide everything that a formulation scientist needs to accomplish their daily work, out-of-the-box. It provides information on what the role enables along with introductory explanations on all its apps. What’s New information is provided for each app, particularly highlighting those whose key capabilities impact the role the most.



For the Formulation Scientist role, dedicated Design, Make, and Manage Materials tabs offer dashboard-based quick access to the key apps, so that scientists can rapidly become productive with formulation development capabilities.



Through role cockpits such as this one, scientists can understand and become proficient with formula design within minutes!



Increase Productivity Faster



Courtesy of the 3DEXPERIENCE platform, BIOVIA apps and roles embrace the latest approaches to user assistance to make it as easy as possible for you to accomplish your daily tasks!







Interested in staying up to date on all the latest news from BIOVIA?




Subscribe to our Newsletter!

 ]]>
      </content:encoded>
      </item>
<item>
      <title>
      <![CDATA[ Vratin Srivastava’s BIOVIA Internship: Iterative Protein Design Using Generative Models ]]>
      </title>
      <link>https://blog--3ds--com.apsulis.fr/brands/biovia/internship-iterative-protein-design-using-generative-models/</link>
      <guid>https://blog--3ds--com.apsulis.fr/guid/270030</guid>
      <pubDate>Fri, 25 Oct 2024 11:44:46 GMT</pubDate>
      <description>
      <![CDATA[ Vratin Srivastava just completed a successful summer internship on computational protein design within the Protein Modeling and Simulation team at BIOVIA….
 ]]>
      </description>
      <content:encoded>
      <![CDATA[ 
Vratin Srivastava just completed a successful summer internship on computational protein design within the Protein Modeling and Simulation team at BIOVIA.



During his internship, Vratin leveraged state-of-the-art generative machine learning models to create an end-to-end Python pipeline for designing novel peptide binders to protein targets. He utilized Bayesian optimization techniques and iteratively refined binder designs. This approach aimed to develop a pipeline that could generate peptide designs with minimal parameter input from users. The project explored whether Bayesian optimization could potentially improve the efficiency of the design process compared to methods that generate and filter large numbers of designs.




Everyone in the organization, especially Rohith Mohan and Reed Harrison who guided me through my internship and other scientists working in the modeling and simulation team, were extremely generous, extremely helpful. It was a very rewarding experience, and I&#8217;m really thankful for the opportunity to work on this project.




Curious to learn more about Vratin’s project? Watch the video below where Vratin shares details about his project and his experience working with the BIOVIA team.












 ]]>
      </content:encoded>
      </item>
<item>
      <title>
      <![CDATA[ Biotherapeutics: What Do We Make Next? ]]>
      </title>
      <link>https://blog--3ds--com.apsulis.fr/brands/biovia/biotherapeutics-what-do-we-make-next/</link>
      <guid>https://blog--3ds--com.apsulis.fr/guid/270366</guid>
      <pubDate>Wed, 09 Oct 2024 18:39:22 GMT</pubDate>
      <description>
      <![CDATA[ For years, computational methods for small molecule drug design have offered numerous algorithms and methodologies to help generate new ideas and guide the iterative process of lead design and…
 ]]>
      </description>
      <content:encoded>
      <![CDATA[ 
The question of what we should make next has challenged the world of drug discovery for decades. For years, computational methods for small molecule drug design have offered numerous algorithms and methodologies to help generate new ideas and guide the iterative process of lead design and optimization. For a particular drug target, these methods help to identify high-quality candidates that may eventually advance to clinical development with fewer experiments and less time in the lab. From the early days of combinatorial chemistry and bioisosteric replacement to ligand-, fragment- and structure-based design, there have been many tools leveraging numerous algorithms that suit your project constraints and design criteria. More recently, AI and machine learning algorithms have become popular because they allow researchers to rapidly explore more ideas in chemical space and propose novel structures that a medicinal chemist may not have considered when looking for new drugs.



Until recently, the computational design tools for biotherapeutics seemed to require more expertise, and to be sparser and more application-specific, than the tools that exist for small molecule therapeutics. Of course, there are computational design algorithms available such as homology modeling, protein-protein docking and combinatorial scanning mutagenesis for general protein modeling and binder design, which are used in biotherapeutics lead discovery and optimization. For designing certain types of biological therapies, such as monoclonal antibodies, there are methods such as affinity maturation, humanization and immunogenicity prediction algorithms. However, to help answer directly what variation of our biotherapeutic we should make and test next, two recent AI methods, RFDiffusion and ProteinMPNN, have totally changed the nature of biotherapeutics discovery. These tools have the potential to change the way we design biotherapeutics by helping to identify novel candidates that computational and molecular biologists may not have considered.



Generating Proteins with AI: RFDiffusion and ProteinMPNN



RFDiffusion is a cutting-edge generative AI algorithm that can &#8220;diffuse&#8221; a collection of amino acids into a protein structure. The diffusion process starts with a random, noisy collection of atoms and, through a series of controlled refinements, the algorithm adjusts the structure to reduce the noise and move closer to a biologically realistic and functional protein structure. One common analogy for the diffusion process is developing a photo from a blurry image: iterative processing steps take an initial grainy image and refine the detail and clarity to produce a final clear picture.
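
The iterative refinement idea can be illustrated with a deliberately simplified toy in NumPy. This is not RFDiffusion and does not use its code or API: a real diffusion model uses a trained neural network to predict the clean structure at every step, whereas this sketch substitutes an "oracle" that already knows the answer (an idealized helix with approximate textbook geometry). All function names are hypothetical. It only shows how repeated small corrections, with shrinking injected noise, turn a random point cloud into an ordered structure.

```python
import numpy as np

def ideal_helix(n_residues):
    """Approximate alpha-helix C-alpha trace: ~2.3 A radius, 100 deg and 1.5 A rise per residue."""
    i = np.arange(n_residues)
    angle = np.deg2rad(100.0) * i
    return np.column_stack([2.3 * np.cos(angle), 2.3 * np.sin(angle), 1.5 * i])

def rmsd(a, b):
    """Root-mean-square distance between two equally ordered coordinate sets."""
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))

def toy_denoise(target, steps=50, seed=0):
    """Start from pure noise and refine toward the structure over `steps` iterations."""
    rng = np.random.default_rng(seed)
    x = rng.normal(scale=10.0, size=target.shape)  # random, noisy starting coordinates
    history = [rmsd(x, target)]
    for t in range(steps, 0, -1):
        predicted_clean = target  # oracle stand-in for a learned denoising network
        x = x + (predicted_clean - x) / t  # move a fraction of the way each step
        if t > 1:
            # Re-inject a little noise that shrinks as the process nears its end.
            x = x + rng.normal(scale=0.05 * (t - 1) / steps, size=x.shape)
        history.append(rmsd(x, target))
    return x, history

target = ideal_helix(20)
final, history = toy_denoise(target)
print(f"RMSD to target: start {history[0]:.2f}, end {history[-1]:.2e}")
```

Replacing the oracle with a model that has learned what realistic proteins look like is what allows a real diffusion method to produce new structures rather than reproduce a known one.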



RFDiffusion can be utilized for a number of different biotherapeutic design challenges, such as engineering a biologic that can bind to a viral protein to neutralize the virus. With antibody structures or other protein-protein systems, RFDiffusion can be used to design new protein scaffolds that may improve binding affinities or enhance the stability of the binding partners. RFDiffusion can also be used to generate enzyme therapeutics that may break down a specific substrate to treat metabolic disorders. Beyond biotherapeutics, RFDiffusion has the potential to help design proteins for industrial and biotechnological applications, such as enzymes that catalyze specific chemical reactions or proteins that suit very specific conditions including low or high temperature, pH, etc.



ProteinMPNN is a state-of-the-art neural network that can predict one or more probable protein sequences given a protein structure. Published results show the algorithm succeeding at one of the most critical aspects of protein sequence design: generating sequences that fold into a stable protein or peptide with a propensity to crystallize, which facilitates structure determination of these proteins. ProteinMPNN can be used in conjunction with RFDiffusion to generate new protein designs, such as new enzymes or antibodies, that can be further evaluated for desired properties such as stability, activity, affinity, and specificity. One of the strengths of ProteinMPNN is its ability to generate multiple sequence variants. This ability is invaluable, as different variants provide more options to test and identify candidates with the best performance in terms of efficacy, safety, and manufacturability. Just as significantly, these variants also provide alternative leads when candidates encounter unforeseen issues in protein optimization, during protein expression, or with ADMET challenges such as solubility and immunogenicity.



Together, RFDiffusion and ProteinMPNN significantly expand the biological space that can be explored in silico before biologists need to commit to expensive and time-consuming physical experimentation.  They have the potential to open up exciting avenues for more intelligent, model- and data-driven workflows driving innovation in biotherapeutic design.



Generating Proteins with RFDiffusion and ProteinMPNN in Discovery Studio Simulation



In BIOVIA Discovery Studio Simulation, a new Generate Protein Scaffolds protocol now provides easy access to RFDiffusion workflows, the first of which is motif scaffolding. Users can start with a specific part of an existing protein (the motif) and design a complete new protein scaffold that incorporates this motif. This approach allows precise control over the functional regions of the protein, as well as control over the protein scaffold design, via different model weights that suit particular proteins and complexes.



Figure 1: Discovery Studio Simulation users now have access to motif scaffolding with RFDiffusion.



A second new protocol, Generate Protein Sequences, allows users access to not only ProteinMPNN, where they can easily define sequence residues for design, but also to LigandMPNN and SolubleMPNN models. LigandMPNN is an extension to ProteinMPNN that is able to consider protein, small-molecule, nucleic acid, and metal ion ligands as additional context for designing sequences, with the potential to improve the chemical properties of the designed sequences. SolubleMPNN could be a better model to use when protein solubility is part of your design criteria. Users can determine the degree of sequence diversity and confidence desired, as part of the generative design, and have the ability to control the bias of particular amino acids.
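
To make the ideas of sequence diversity and amino acid bias more concrete, here is a minimal sketch of temperature-scaled sampling with a per-residue bias. It is not the ProteinMPNN API or the Discovery Studio protocol: the function `sample_sequence`, the per-position scores, and the bias values are all hypothetical and chosen only for illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def sample_sequence(logits, temperature=0.1, bias=None, rng=None):
    """Sample one sequence from per-position amino-acid scores.

    logits: list of dicts mapping amino-acid letter to a score (made up here).
    temperature: lower values concentrate on the top-scoring residue,
        higher values give more diverse sequences.
    bias: optional dict of score offsets, e.g. {"C": -5.0} to disfavour Cys.
    """
    rng = rng or random.Random()
    bias = bias or {}
    sequence = []
    for position in logits:
        scaled = [(position.get(aa, 0.0) + bias.get(aa, 0.0)) / temperature
                  for aa in AMINO_ACIDS]
        top = max(scaled)  # subtract the maximum for numerical stability
        weights = [math.exp(s - top) for s in scaled]
        sequence.append(rng.choices(AMINO_ACIDS, weights=weights)[0])
    return "".join(sequence)

# Illustrative scores for a three-residue design region.
logits = [{"A": 2.0, "C": 1.0}, {"L": 3.0, "C": 2.0}, {"K": 1.0}]
rng = random.Random(1)

# Low temperature: samples collapse onto the highest-scoring residues.
cold = {sample_sequence(logits, 0.01, rng=rng) for _ in range(20)}

# Higher temperature plus a strong negative bias: diverse, but never cysteine.
no_cys = [sample_sequence(logits, 1.0, bias={"C": -1000.0}, rng=rng)
          for _ in range(20)]
print(cold, no_cys[:3])
```

Lowering the temperature trades diversity for confidence, while the bias term shifts the balance for individual amino acids without changing the underlying scores.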



Figure 2: Discovery Studio Simulation users can now generate new sequences using ProteinMPNN models and use AlphaFold/OpenFold to generate their 3D structures for further applications.







These two significant new enhancements are exciting additions to the biotherapeutics and protein design tools in Discovery Studio Simulation in the 3DEXPERIENCE® Cloud, which already includes AlphaFold and OpenFold AI structure prediction. They expand the ever-growing arsenal of powerful AI tools for molecular modelers and biologists to help answer the question of “what to make and test next” and accelerate the rational design of biologics. In combination with the existing physics-based methods in Discovery Studio Simulation, users can rapidly explore many more possibilities in silico before arriving at the final handful of candidates that are ready to become a successful commercial biotherapeutic or a biological to be used in agriculture, food and beverage, or environmental industries.



Nobel Prizes in Chemistry and Physics



This year’s Nobel Prizes in Chemistry and Physics celebrate how AI is pushing the boundaries of scientific research. John J. Hopfield and Geoffrey E. Hinton were awarded the Nobel Prize in Physics for their foundational discoveries in machine learning with artificial neural networks, while David Baker, Demis Hassabis, and John Jumper received the Nobel Prize in Chemistry for breakthroughs in computational protein design and protein structure prediction.



At BIOVIA, we are proud to be part of this AI revolution. By integrating AlphaFold2, OpenFold, RFDiffusion, and the ProteinMPNN family of models into our platform, we empower researchers with cutting-edge tools for protein structure prediction and protein design.



Watch the video to learn more about how Discovery Studio Simulation now helps users generate novel biologics with RFDiffusion and LigandMPNN models.
















 ]]>
      </content:encoded>
      </item>
<item>
      <title>
      <![CDATA[ Precision Polymer Modeling: Leveraging Materials Studio and Scripting Innovations ]]>
      </title>
      <link>https://blog--3ds--com.apsulis.fr/brands/biovia/precision-polymer-modeling-leveraging-materials-studio-and-scripting-innovations/</link>
      <guid>https://blog--3ds--com.apsulis.fr/guid/270066</guid>
      <pubDate>Wed, 09 Oct 2024 15:04:39 GMT</pubDate>
      <description>
      <![CDATA[ Materials Studio offers a user-friendly yet powerful platform for modeling a huge range of systems, with this report focusing specifically on polymers and polymer networks…
 ]]>
      </description>
      <content:encoded>
      <![CDATA[ 
Materials Studio offers a user-friendly yet powerful platform for modeling a huge range of systems, with this report focusing specifically on polymers and polymer networks. In conjunction with Pipeline Pilot, various methods are available to model such complex materials, ranging from the widely reported cross-linked polyethylene (XLPE) to custom networks with unique reaction mechanisms. United atom forcefields are developed to efficiently model such systems involving large numbers of particles, providing a balance between computational efficiency and accuracy.



The Importance of Forcefields in Molecular Simulation



A critical element of molecular simulation is the forcefield. Forcefields govern the behaviour of particles in a system through both bonded and non-bonded interactions, covering contributions including bond stretching, angle bending, dihedral torsions, etc. In polymer models, the forcefield is essential for determining how polymer chains fold, entangle with one another and respond to external stimuli. By accurately modelling these atomic-level interactions, forcefields enable the prediction of macroscopic properties such as density, radius of gyration, and glass transition temperature. Thus, an effective forcefield is essential for both the accuracy of simulations and the reliability of the predicted properties.
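
As a rough illustration of the bonded and non-bonded contributions mentioned above, the sketch below writes out commonly used functional forms: harmonic bond and angle terms, a three-term cosine series for torsions in the OPLS style, and a 12-6 Lennard-Jones term. The function names and all parameter values are placeholders chosen for this example and are not taken from any particular forcefield in Materials Studio.

```python
import math

def bond_energy(r, r0, k):
    """Harmonic bond stretch: k * (r - r0)^2."""
    return k * (r - r0) ** 2

def angle_energy(theta, theta0, k):
    """Harmonic angle bend, with angles in radians."""
    return k * (theta - theta0) ** 2

def torsion_energy(phi, v1, v2, v3):
    """Three-term cosine series for a dihedral angle phi in radians."""
    return 0.5 * (v1 * (1 + math.cos(phi))
                  + v2 * (1 - math.cos(2 * phi))
                  + v3 * (1 + math.cos(3 * phi)))

def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones non-bonded term."""
    s6 = (sigma / r) ** 6
    return 4 * epsilon * (s6 * s6 - s6)

# Scan a torsion from 0 to 360 degrees in 60-degree steps.
# The coefficients 1.4, -0.2 and 0.3 are placeholders, not fitted values.
for degrees in range(0, 361, 60):
    energy = torsion_energy(math.radians(degrees), 1.4, -0.2, 0.3)
    print(degrees, round(energy, 3))
```

A scan of this kind, energy as a function of torsion angle, is the sort of profile that can be compared against reference data when validating torsion parameters.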



There are a number of forcefields available depending on the type of system being modelled. Some forcefields are optimized for specific materials while others, like COMPASSIII [1], are designed for a broader range of materials. Forcefield development continues to evolve, as exemplified by the recent emergence of MACE [2], machine learning software that is used, among other things, to generate forcefields. Of particular significance in polymer modelling are OPLS and OPLS-UA. OPLS, Optimised Potentials for Liquid Simulations [3], was specifically designed to simulate liquids and later expanded to cover a wide range of organic molecules, biomolecules, and polymers. OPLS-UA is a modified version of this forcefield for a united atom approach. Given that molecular simulations can be computationally expensive, especially for systems with large numbers of particles, it is often desirable to simplify these systems by reducing particle counts while minimizing the loss of accuracy. This simplification is achieved through united atom or coarse-graining methods. These approaches also provide the further benefit of longer time steps, allowing users to cover more simulated time within a set timeframe.



Custom Forcefield Development in Materials Studio



In Materials Studio, users can create custom forcefields by entering parameters in various functional forms, which gives them full control. Through literature review and modification of OPLS atomistic parameters, followed by iterative testing and refinement, an OPLS-UA custom forcefield was developed [4]. Torsion parameters, for instance, were validated using conformational analysis on single molecules. The integration of Perl scripting within Materials Studio provided precise control, and a simple script was created to extract torsion energy as a function of torsion angle. These results were then compared to literature data or DMol³ calculations. Once refined, the forcefield was tested on six polymer systems, with an Amorphous Cell constructed for each system and equilibrated through molecular dynamics simulations. Physical properties of the equilibrated systems were extracted and compared to literature values. One key test was the radius of gyration as a function of the degree of polymerization, which is well documented for many polymers. To automate the process of generating cells, equilibrating them, and calculating the radius of gyration for systems with incrementally larger degrees of polymerization, custom Pipeline Pilot protocols were developed [5]. This integration significantly reduced time and minimized human error. It allowed the radius of gyration data to be compared with theoretical values, ultimately validating the accuracy of the developed forcefield, as exemplified in Figure 1.
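
For readers who want to see what the radius of gyration test involves numerically, the following sketch computes a mass-weighted radius of gyration and fits the exponent of its growth with the degree of polymerization N. It is independent of the Pipeline Pilot protocols mentioned above; the function names are hypothetical, and the input data is synthetic, generated from the ideal-chain assumption that the radius of gyration grows as the square root of N.

```python
import math

def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a set of 3D coordinates."""
    masses = masses or [1.0] * len(coords)
    total = sum(masses)
    centre = [sum(m * c[i] for m, c in zip(masses, coords)) / total
              for i in range(3)]
    squared = sum(m * sum((c[i] - centre[i]) ** 2 for i in range(3))
                  for m, c in zip(masses, coords))
    return math.sqrt(squared / total)

def scaling_exponent(degrees, rg_values):
    """Least-squares slope of log(Rg) against log(N)."""
    xs = [math.log(n) for n in degrees]
    ys = [math.log(r) for r in rg_values]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    numerator = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    denominator = sum((x - mx) ** 2 for x in xs)
    return numerator / denominator

# Two equal-mass beads 2 units apart have a radius of gyration of 1.
print(radius_of_gyration([(0, 0, 0), (2, 0, 0)]))

# Synthetic data following Rg = 0.5 * N^0.5 (assumed, not simulation output).
degrees = [10, 20, 40, 80]
rg_values = [0.5 * n ** 0.5 for n in degrees]
print(round(scaling_exponent(degrees, rg_values), 3))
```

With real simulation output, the fitted exponent and prefactor would be the quantities to compare against theoretical or literature values.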







Alongside the creation and validation of the forcefield, a simple method was developed to convert a polymer from full atomistic to a united atom representation. This was achieved by removing hydrogen atoms, adjusting the mass numbers of the corresponding beads, and assigning the appropriate forcefield type to each bead. A custom Perl script was created for this process, using a Study Table to perform a pattern search through the included fragments. Each fragment corresponds to a different atom type, such as sp³ CH₂, sp² CH, or aromatic CH, along with their associated forcefield types. As a result, only the name of the input 3D Atomistic Structure file needs to be inserted into the script for the polymer to be automatically converted to united atom and typed.
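
The conversion logic can be sketched outside Materials Studio as follows. This is a simplified stand-in for the Perl script described above, not a port of it: it assumes a plain dictionary representation of atoms and bonds, and it assigns placeholder type labels from the element and hydrogen count rather than performing a fragment pattern search.

```python
def to_united_atom(atoms, bonds):
    """Collapse hydrogens onto the heavy atoms they are bonded to.

    atoms: dict mapping atom id to {"element": str, "mass": float}
    bonds: list of (id_a, id_b) pairs
    Returns (beads, bead_bonds); bead type labels are placeholders.
    """
    beads = {i: {"element": a["element"], "mass": a["mass"], "n_h": 0}
             for i, a in atoms.items() if a["element"] != "H"}
    bead_bonds = []
    for a, b in bonds:
        if a in beads and b in beads:
            bead_bonds.append((a, b))  # heavy-heavy bond is kept as-is
            continue
        # One end is hydrogen: fold its mass into the heavy neighbour.
        heavy = a if a in beads else b
        hydrogen = b if a in beads else a
        if heavy in beads:
            beads[heavy]["mass"] += atoms[hydrogen]["mass"]
            beads[heavy]["n_h"] += 1
    for bead in beads.values():
        # Hypothetical label such as "CH3_UA"; real forcefield types differ.
        bead["type"] = f'{bead["element"]}H{bead["n_h"]}_UA'
    return beads, bead_bonds

# Ethane: two carbons (ids 1, 2) and six hydrogens (ids 3 to 8).
atoms = {1: {"element": "C", "mass": 12.011}, 2: {"element": "C", "mass": 12.011}}
atoms.update({i: {"element": "H", "mass": 1.008} for i in range(3, 9)})
bonds = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 6), (2, 7), (2, 8)]

beads, bead_bonds = to_united_atom(atoms, bonds)
print(beads, bead_bonds)
```

Distinguishing sp³, sp² and aromatic carbons, as the fragment-based approach does, would need bond-order or ring information that this sketch deliberately leaves out.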



With a method in place to convert and type polymers, along with the validated OPLS-UA forcefield, polymer and polymer network modelling using this forcefield could be further explored. Depending on the system being constructed, various reactions need to be modelled, most notably addition, cycloaddition, and condensation reactions. Materials Studio offers several methods for this purpose, with the choice depending on the central reaction. Three main methods were employed to model these systems: a customizable networking Perl script, a Pipeline Pilot protocol, and the Reaction Finder tool. Examples of each are discussed below.



Cross-linked polyethylene (XLPE) is a widely studied polymer network, used in numerous applications ranging from electrical insulation in wires to components in fuel gaskets, owing to its high chemical and heat resistance. Users can model this network in Materials Studio using the Pipeline Pilot connector with the modified Network Protocol, as demonstrated in Figure 2. The process begins by creating an Amorphous Cell containing both ethane molecules and polyethylene chains, with reactive atoms defined. These reactive atoms dictate the reactions that build the network, allowing the polymer chains to grow linearly, branch out, or cross-link with other chains. The probability of each reaction is specified, providing additional control over the end system. By altering the probability of the defined reactions, the cross-linking density is modified and the effect on physical properties after equilibration can be investigated, as exemplified in Figure 3.







Figure 2: Fragment from XLPE displaying cross-links formed between chains.  The atoms are coloured by forcefield type.



Figure 3: Effect of the extent of cross-linking in equilibrated XLPE models on the density. 



Diels-Alder reactions are prominent cycloaddition reactions between a diene and an alkene, and can be used to join monomers in more complex polymer systems. The Reaction Finder [6] tool is employed to precisely define these reactions by drawing both the reactants and products, mapping the atoms between them, and identifying close contacts between the reactive atoms. This is exemplified in Figure 4. This approach offers significant control, making it particularly useful for modelling specific structures with atypical reactive sites.



Figure 4: Reactants (left) with close contacts defined for a Diels-Alder reaction to form products (right).



To model condensation reactions, a customizable networking Perl script [7] was adapted and used effectively. Reactive atoms are defined in both the monomer and curing agent, which are then organised into an Amorphous Cell. The script specifies these reactive atoms along with the input structure document. An additional subroutine within the script is used to define specific bond-breaking events during the reaction, producing the desired product as shown in Figure 5. This versatile script supports the cross-linking of various monomers and curing agents and can be tailored to meet different system requirements.



Figure 5: Reactants (left) with reactive atoms R1 and R2 defined to form the product (right).



Conclusion: The Power of Materials Studio for Modeling Polymer Networks using United Atom Forcefields



In conclusion, Materials Studio proves to be an exceptional platform for modelling complex polymer systems and networks, offering a wide range of tools and customization options. Its integration with Pipeline Pilot, the ability to develop and refine custom forcefields, and the flexibility provided through Perl scripting enable precise control over model building and simulations. The platform&#8217;s versatility and the high degree of control it offers open up exciting possibilities for future research, allowing users to push the boundaries of polymer design, optimization, and property prediction with great accuracy and efficiency.







[1] Akkermans, R. L. C., Spenley, N. A. and Robertson, S. H. (2020) ‘COMPASS III: automated fitting workflows and extension to ionic liquids’, Molecular Simulation, 47(7), pp. 540–551, doi: 10.1080/08927022.2020.1808215.



[2] https://mace-docs.readthedocs.io/en/latest/ (accessed 09/2024).



[3] William L. Jorgensen, David S. Maxwell, and Julian Tirado-Rives (1996), ‘Development and Testing of the OPLS All-Atom Force Field on Conformational Energetics and Properties of Organic Liquids’, Journal of the American Chemical Society, 118(45), pp. 11225-11236, doi: 10.1021/ja9621760.



[4] Available in the BIOVIA Materials Studio community.



[5] Available in the BIOVIA Materials Studio community.



[6] J.W. Abbott and F. Hanke, &#8216;Kinetically Corrected Monte Carlo-Molecular Dynamics Simulations of Solid Electrolyte Interphase Growth&#8217;, J Chem Theor Comput, 18, 925 (2022).



[7] Developed by Jason DeJoannis, Stephen Todd &amp; James Wescott (available in the BIOVIA Materials Studio community).












 ]]>
      </content:encoded>
      </item>
<item>
      <title>
      <![CDATA[ Miko Stulajter’s BIOVIA Internship: Prototyping and New Features in BIOVIA Molecular Design ]]>
      </title>
      <link>https://blog--3ds--com.apsulis.fr/brands/biovia/miko-stulajters-biovia-internship-prototyping-and-new-features-in-biovia-molecular-design/</link>
      <guid>https://blog--3ds--com.apsulis.fr/guid/269774</guid>
      <pubDate>Tue, 01 Oct 2024 08:28:00 GMT</pubDate>
      <description>
      <![CDATA[ Miko Stulajter completed a second three-month summer 3DEXPERIENCE® scientific visualization internship with the BIOVIA R&D software engineering team. After a rewarding first internship experience…
 ]]>
      </description>
      <content:encoded>
      <![CDATA[ 
Miko Stulajter completed a second three-month summer 3DEXPERIENCE® scientific visualization internship with the BIOVIA R&amp;D software engineering team. After a rewarding first internship experience, Miko was excited to return for another summer, attracted by the meaningful work he had accomplished through his first internship.



During this internship, Miko focused on implementing and prototyping new features for visualizing molecular trajectories in BIOVIA Molecular Design. Miko developed a new trajectory animation dialog UI, significantly improving user control over trajectory playback. He also prototyped a streaming server to facilitate efficient testing and development for reading remotely hosted trajectory files. Additionally, Miko explored utilizing D3.js to create interactive trajectory visualizations, such as a timeseries of trajectory properties and a potential energy surface visualization of trajectory frames, enhancing the overall trajectory analysis experience in BIOVIA Molecular Design.



Beyond his technical contributions, Miko’s quick learning and adaptability allowed him to rapidly develop and prototype these new features, making him an invaluable part of our intern program this year.




“I really enjoyed my first internship with BIOVIA and was excited to return for a second one. It was fulfilling to work on new projects and see some of the work I did last summer now in production. Coming back to BIOVIA felt like I never left.”




Curious to learn more about Miko’s project and his experience working with the BIOVIA team? Watch the video below, where Miko shares his work and the insights he gained.








 ]]>
      </content:encoded>
      </item>
<item>
      <title>
      <![CDATA[ Intern Spotlight: An Investigation into New DFTB+ Parameterisation Techniques ]]>
      </title>
      <link>https://blog--3ds--com.apsulis.fr/brands/biovia/intern-spotlight-an-investigation-into-new-dftb-parameterisation-techniques/</link>
      <guid>https://blog--3ds--com.apsulis.fr/guid/269284</guid>
      <pubDate>Wed, 18 Sep 2024 10:32:49 GMT</pubDate>
      <description>
      <![CDATA[ Watch this video to learn about Fred Kirk’s summer internship on the BIOVIA Quantum Mechanics team at Dassault Systemes.
 ]]>
      </description>
      <content:encoded>
      <![CDATA[ 
Fred Kirk recently completed a two-month summer internship as a software engineering intern on the BIOVIA Quantum Mechanics team. During these two months he worked on research and development projects specifically focused on improving Density Functional Tight Binding (DFTB+) parameterization techniques.



Fred&#8217;s primary project centered on using sparse Gaussian process regression to fit empirical splines, enhancing the accuracy and efficiency of a DFT (Density Functional Theory) approximation. In addition to his technical contributions, Fred brought fresh perspectives and creative problem-solving skills to the team, making him an invaluable part of our intern program this year.




“What started out as quite an intimidating project quickly became a genuine interest. With endless support from the many brilliant minds in the office, my task steadily progressed. By the end, I had an outcome that I was truly proud of.”




Curious to learn more about Fred’s project and his experience working with the BIOVIA team? Watch the video below, where Fred shares insights into his research and the broader impact of his work on the industry.







If you are interested in exploring the challenges and rewards of a career in scientific software development at Dassault Systèmes, visit this website:



 Be the Next Game Changer | Careers &#8211; Dassault Systèmes (3ds.com)
 ]]>
      </content:encoded>
      </item>
<item>
      <title>
      <![CDATA[ Forcefield Optimization for Energetic Materials ]]>
      </title>
      <link>https://blog--3ds--com.apsulis.fr/brands/biovia/forcefield-optimization-for-energetic-materials/</link>
      <guid>https://blog--3ds--com.apsulis.fr/guid/268336</guid>
      <pubDate>Tue, 20 Aug 2024 16:09:54 GMT</pubDate>
      <description>
      <![CDATA[ Discover how BIOVIA’s computational tools advance the study of molecular crystals, including crystal structure prediction, lattice energy calculations, and the latest COMPASSIII forcefield developments.
 ]]>
      </description>
      <content:encoded>
      <![CDATA[ 
Understanding Molecular Crystals and Their Computational Study



Molecular crystals, such as trinitrotoluene, paracetamol or caffeine (1), are composed of individual molecules bound by nonbonded interactions (2). The intramolecular bonds that hold the atoms together within each molecule are much stronger than the intermolecular interactions that bind the molecules together in a crystal. The cohesive forces that bind these molecules may include Van der Waals forces, dipole-dipole interactions, π-π interactions and hydrogen bonding. Furthermore, the properties of molecular crystals are closely related to their structure, making accurate prediction of crystal structures and properties a crucial challenge in various industries. Researchers study inorganic and organic crystals to develop innovative drugs and energetic materials.



Moreover, computational chemistry has emerged as a valuable tool, enabling the rapid study of these compounds (3). By using mathematical models and equations, various molecular properties and characteristics can be predicted prior to synthesis in a lab, minimizing the associated cost, time and risks. Crystal prediction (4), including lattice energy prediction, density calculations, and polymorphism prediction, is extensively studied, particularly in the pharmaceutical and energy industries.



Forcefields (5, 6), or interatomic potentials, are computational models used in classical simulations to study, among other things, crystal structures. Over the years, researchers have developed several forcefields to accurately predict crystal properties. Comparative studies have also been conducted to evaluate these forcefields through different protocols. Robinson et al. (6) present an extensive comparison of 324 forcefield protocols for calculating lattice energies of molecular crystals. This study examined several well-known forcefields, including DREIDING, Universal, CVFF, PCFF, and COMPASS, using the Materials Studio® collection. They highlighted the COMPASS (Condensed-phase Optimized Molecular Potentials for Atomistic Simulation Studies) forcefield for its effective performance in lattice energy calculations on general organic molecular crystals, after analyzing 235 crystals from the Cambridge Structural Database (7). The study demonstrated that several protocols based on this forcefield achieved good performance and relatively low standard deviation.



COMPASSIII for Molecular Crystals



Figure 1: Experimental crystal of terephthalamide, data from XRD diffractometer







To further improve COMPASS, the Materials Studio Cambridge team developed COMPASSIII (8), a version of the COMPASS (9) forcefield extended to ionic liquids and other materials. COMPASS was originally designed for predicting condensed-phase properties, with a focus on materials. It was the first ab initio forcefield that enabled accurate and simultaneous prediction of properties for a broad range of molecules. This forcefield largely inherited its structure from the Consistent Force Field (CFF). COMPASSIII introduced 27 new types, covering a wide range of materials, with their own typing rules and valence interactions.



To assess how accurately COMPASSIII predicts lattice energies and densities of molecular crystals, calculations were performed with the Forcite module of Materials Studio 2024 on the 235 structures used by Robinson et al. (6). To automate the calculations and make the analysis more efficient, Pipeline Pilot® (10) protocols, in particular the Materials Collection, were used to run simulations consecutively on all desired molecules. Spearman’s correlation coefficient, the R-squared value and the Root Mean Squared Error (RMSE) were calculated against experimental data from the Cambridge Structural Database. The results, presented in Table 1, show that COMPASSIII yields better results than the other forcefields compared in Materials Studio.
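
The three statistics named above can be computed in a few lines. The sketch below is a generic, standard-library illustration rather than the Pipeline Pilot protocol used in the study. It assumes that R-squared means the squared Pearson correlation (the source does not say which definition was used), and the energy values are invented for the example.

```python
import math

def ranks(values):
    """Average ranks (1-based), sharing the mean rank among ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i != len(order):
        j = i
        while j + 1 != len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def pearson(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

def rmse(predicted, reference):
    """Root mean squared error."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference))
                     / len(reference))

# Made-up lattice energies in kJ/mol, for illustration only.
predicted = [-95.0, -120.0, -80.0, -150.0]
experimental = [-100.0, -118.0, -85.0, -140.0]
print(spearman(predicted, experimental))
print(pearson(predicted, experimental) ** 2)  # R-squared under the assumption above
print(rmse(predicted, experimental))
```

Spearman rewards getting the ordering of crystals right, while RMSE penalises the size of individual errors, so the two can disagree about which forcefield looks best.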



Table 1: Result comparison for lattice energy calculations on 235 crystals using the COMPASSIII, Universal, DREIDING, CVFF and PCFF forcefields



        COMPASSIII   Universal   DREIDING   CVFF   PCFF
R²      0.70         0.29        0.40       0.32   0.51







Lattice energy and density calculations with COMPASSIII were also performed on a larger dataset containing 821 molecules, which included the structures used by Robinson et al. (6). Density, another significant property of molecular crystals, was accurately predicted with COMPASSIII, achieving an R-squared of 0.97, as shown in Table 3. Although the lattice energy results were promising, there is still potential for further enhancement. To this end, outliers, defined as compounds with a relative error in lattice energy over 10%, were identified and further investigated. The main problematic family of compounds was those with amino and nitro functions on aromatic heterocycles.
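
The outlier criterion described above is simple to express in code. The following is a minimal sketch with hypothetical crystal names and invented energies; only the 10% threshold comes from the text.

```python
def relative_error(predicted, reference):
    """Absolute relative error of a prediction, as a fraction."""
    return abs(predicted - reference) / abs(reference)

def find_outliers(records, threshold=0.10):
    """Return names whose lattice-energy relative error exceeds the threshold."""
    return [name for name, predicted, reference in records
            if relative_error(predicted, reference) > threshold]

# (name, predicted, experimental) lattice energies in kJ/mol; values invented.
records = [
    ("crystal_A", -98.0, -100.0),   # 2% error
    ("crystal_B", -130.0, -110.0),  # about 18% error
    ("crystal_C", -75.0, -85.0),    # about 12% error
]
print(find_outliers(records))
```

Grouping the flagged names by chemical family is then the natural next step for spotting systematic weaknesses in the parameters.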



Table 3: Results for lattice energy and density calculations on 821 molecular crystals using COMPASSIII forcefield



            Lattice Energy   Density
Spearman    0.87             0.99
R²          0.74             0.97
RMSE        17.6 kJ/mol      0.04 g/cm³







Study of Compounds to Enhance COMPASSIII Parameters



Figure 2: Example of compounds with amino and nitro functions on aromatic rings







To improve the COMPASSIII forcefield, problematic compounds were examined, particularly those with amino and nitro functions on aromatic heterocycles. Among these are some complex structures, such as 2,6-Diamino-3,5-dinitropyrazine-1-oxide, also known as LLM-105. This molecule, featuring a pyrazine-N-oxide structure with two amino and two nitro substituents, is highly energetic. The material is of particular interest because of its high thermal stability and its insensitivity to friction, impact and shock. LLM-105 and simpler molecules containing similar functional groups were studied. In particular, partial charges were recalculated with Density Functional Theory (DFT) using the Electrostatic Potentials (ESP) method in DMol³, and incorporated in COMPASSIII.



Figure 3: Relative errors of Lattice Energy Calculations







For most structures with a high relative error in lattice energy, adjusting the partial charges used by COMPASSIII reduced the error below the 10% threshold, as shown in Figure 3. However, while adjusting the charges significantly decreases the error, LLM-105 is more complex and needs further improvement. Indeed, the molecular crystal of LLM-105 has a significant hydrogen bonding network, displayed in Figure 4, which can influence lattice energy and density computations and requires further investigation.



Figure 4: Hydrogen bonding network for LLM-105 crystal







Conclusion



Accurately predicting the properties of molecular crystals is a significant challenge. As previously demonstrated, the COMPASSIII forcefield outperforms other well-known forcefields for these compounds. However, it is essential to continually optimize and review forcefields to ensure accurate parametrization and simulations. By continuously improving COMPASSIII, precise predictions for a wide range of compounds can be achieved.



Learn more about BIOVIA Software.



References



(1) Braga, D., Casali, L., &amp; Grepioni, F. (2022). The relevance of crystal forms in the pharmaceutical field: sword of Damocles or innovation tools? International Journal of Molecular Sciences, 23(16), 9013.



(2) Politzer, P., &amp; Murray, J. S. (2003). Energetic materials: part 1. Decomposition, crystal and molecular properties. Theoretical and Computational Chemistry, Volume 12. Elsevier: Amsterdam.



(3) Jorgensen, W. L., &amp; Tirado-Rives, J. (2005). Molecular modeling of organic and biomolecular systems using BOSS and MCPRO. Journal of Computational Chemistry, 26(16), 1689-1700.



(4) Nyman, J., Pundyke, O. S., &amp; Day, G. M. (2016). Accurate force fields and methods for modelling organic molecular crystals at finite temperatures. Physical Chemistry Chemical Physics, 18(23), 15828-15837.



(5) Ermer, O. (2005). Calculation of molecular properties using force fields. Applications in organic chemistry. In Bonding forces (pp. 161-211). Berlin, Heidelberg: Springer Berlin Heidelberg.



(6) Marchese Robinson, R. L., Geatches, D., Morris, C., Mackenzie, R., Maloney, A. G., Roberts, K. J., &#8230; &amp; Vatvani, D. R. M. (2019). Evaluation of force-field calculations of lattice energies on a large public dataset, assessment of pharmaceutical relevance, and comparison to density functional theory. Journal of Chemical Information and Modeling, 59(11), 4778-4792.



(7) Groom, C. R., Bruno, I. J., Lightfoot, M. P., &amp; Ward, S. C. (2016). The Cambridge Structural Database. Acta Crystallographica Section B: Structural Science, Crystal Engineering and Materials, 72(2), 171-179.



(8) Akkermans, R. L., Spenley, N. A., &amp; Robertson, S. H. (2021). COMPASS III: Automated fitting workflows and extension to ionic liquids. Molecular Simulation, 47(7), 540-551.



(9) Sun, H. (1998). COMPASS: an ab initio force-field optimized for condensed-phase applications overview with details on alkane and benzene compounds. The Journal of Physical Chemistry B, 102(38), 7338-7364.



(10) BIOVIA, Dassault Systèmes, Pipeline Pilot, version 2024, San Diego: Dassault Systèmes, 2024.




 ]]>
      </content:encoded>
      </item>
    </channel>
   </rss>