Implementation of residue-level coarse-grained models in GENESIS for large-scale molecular dynamics simulations

Cheng Tan; Jaewoon Jung; Chigusa Kobayashi; Diego Ugarte La Torre; Shoji Takada; Yuji Sugita

doi:10.1371/journal.pcbi.1009578

Abstract

Residue-level coarse-grained (CG) models have become one of the most popular tools in biomolecular simulations because they balance modeling accuracy against computational efficiency. To investigate large-scale biological phenomena in molecular dynamics (MD) simulations with CG models, unified treatments of proteins and nucleic acids, as well as efficient parallel computations, are indispensable. In the GENESIS MD software, we implement several residue-level CG models, covering structure-based and context-based potentials for both well-folded biomolecules and intrinsically disordered regions. An amino acid residue in a protein is represented as a single CG particle centered at the Cα atom position, while a nucleotide in RNA or DNA is modeled with three beads. Thus, a single CG particle represents around ten heavy atoms in both proteins and nucleic acids. The input data for CG MD simulations are GROMACS-style input files generated by a newly developed toolbox, GENESIS-CG-tool. To optimize the performance of CG MD simulations, we utilize multiple neighbor lists, each of which is attached to a different nonbonded interaction potential in the cell-linked list method. We found that random number generation for Gaussian distributions in the Langevin thermostat is one of the bottlenecks in CG MD simulations. Therefore, we parallelize these computations with the message-passing interface (MPI) to improve the performance on PC clusters and supercomputers. As examples of large-scale biomolecular simulations with residue-level CG models, we simulate the Herpes simplex virus (HSV) type 2 B-capsid and chromatin models containing more than 1,000 nucleosomes in GENESIS. This framework extends the spatial and temporal scales accessible to multi-scale simulations of biologically relevant phenomena, such as genome-scale chromatin folding or phase-separated membrane-less condensations.

Author summary

Molecular dynamics (MD) simulations have been widely used to investigate biological phenomena that are difficult to study with experiments alone. Since all-atom MD simulations of large biomolecular complexes are computationally expensive, coarse-grained (CG) models based on different approximations and interaction potentials have been developed. There are two practical issues in biological MD simulations with CG models. The first is the generation of input files for highly heterogeneous systems. In contrast to well-established all-atom models, each CG model introduces its own specific features, making it difficult to generate input data for systems containing different types of biomolecules. The second is how to improve the computational performance of CG MD simulations of heterogeneous biological systems. Here, we introduce a user-friendly toolbox that generates input files in a unified format for residue-level CG models of folded and disordered proteins, RNAs, and DNAs, and we optimize the performance of CG MD simulations via efficient parallelization in the GENESIS software. Our implementation will serve as a framework to develop novel CG models and investigate various biological phenomena in the cell.

Citation: Tan C, Jung J, Kobayashi C, Torre DUL, Takada S, Sugita Y (2022) Implementation of residue-level coarse-grained models in GENESIS for large-scale molecular dynamics simulations. PLoS Comput Biol 18(4): e1009578. https://doi.org/10.1371/journal.pcbi.1009578

Editor: Dina Schneidman-Duhovny, Hebrew University of Jerusalem, ISRAEL

Received: October 20, 2021; Accepted: March 26, 2022; Published: April 5, 2022

Copyright: © 2022 Tan et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The source code and user manual of GENESIS v1.7.1 are available at https://www.r-ccs.riken.jp/labs/cbrt/, and GENESIS-CG-tool is both included as a part of GENESIS v1.7.1 and deployed at https://github.com/noinil/genesis_cg_tool. Tutorials on performing CG simulations with GENESIS v1.7.1 are available at https://www.r-ccs.riken.jp/labs/cbrt/tutorials2019/. All the MD simulation files and data to produce the results are available from https://github.com/RikenSugitaLab/cg-development-GENESIS-1.7.0.

Funding: The research was supported in part by MEXT as “Program for Promoting Researches on the Supercomputer Fugaku” (Biomolecular dynamics in a living cell (JPMXP1020200101) / MD-driven Precision Medicine (JPMXP1020200201)), Grant-in-aid for scientific research (Grant Numbers, 19H05645, 21H05249 (to YS), 21H05282 (to JJ and CT)), and by RIKEN Pioneering research projects (Glycolipidologue Initiative, biology of intracellular environments) (to YS). The computer resources were provided by RIKEN (HOKUSAI BigWaterfall), RIKEN Center for Computational Science (Fugaku supercomputer), and HPCI (project IDs: hp200028, hp200129, hp200135, hp210172, hp210177). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

This is a PLOS Computational Biology Software paper.

Introduction

Molecular dynamics (MD) simulation of a biomolecule in solution or membrane has been successfully applied to biophysical or biochemical problems at high spatial and temporal resolutions that are unattainable by experimental techniques. Recently, biological phenomena involving many biomolecules at larger scales have attracted much more attention than before in molecular and cellular biology, as well as medical and pharmaceutical studies. Genome-scale chromatin folding [1] and phase-separated membrane-less condensations [2] are well-known examples, which have already been targets of MD simulation studies [3–7]. To obtain biological insights on these phenomena, further methodological and computational challenges need to be met. An enormous number of atoms, for instance, more than 10–100 million atoms, are involved in such target systems, requiring us to improve the computational performance of the simulations. Alongside the size problem, there are also barriers in achieving sufficient conformational sampling in biologically meaningful time scales. Considering these issues, coarse-grained (CG) models of biomolecules are very useful in MD simulations, being a trade-off between modeling accuracy and computational efficiency. The CG models are able to decrease the number of degrees of freedom in a system and to smooth out the ruggedness of free energy landscapes, both of which efficiently accelerate MD simulations of biological systems [8].

Although there are various levels of coarse-graining [9–13], they can be categorized roughly into two classes of models in terms of their theories and parameterizations: physics-based or structure-based CG models. The MARTINI model [13] is one of the representatives in the former, while the Gō model [14–16], which describes the funnel-like energy landscape of protein folding, is probably the most well-known structure-based one. The resolution of each CG model, namely, how many heavy atoms are integrated as a single CG particle, is another crucial point in the choice of CG models. Besides, the solvent is treated differently, either represented by explicit but simplified water models [13] or modeled implicitly through effective interactions [12,17]. In the latter case, Brownian dynamics with hydrodynamic interactions [18,19] or the Lattice Boltzmann method [20,21] can be used to account for the effects of solvent-induced correlations. Here, we focus on residue-level CG models with non-hydrodynamic implicit solvation [22], whose interaction potentials are primarily described via structure-based ones. These models have been utilized in many simulation studies of folding dynamics of single proteins [14], conformational dynamics of nucleic acids [23,24], and complex molecular machinery [25]. In these models, an amino acid residue is usually simplified as one CG particle on the C_α atom, and a nucleotide is represented using three particles corresponding to the phosphate (P), sugar (S), and base (B), respectively (Fig 1). With this representation, proteins and nucleic acids are modeled with a similar resolution of around ten heavy atoms per single CG particle. Furthermore, as an extension from the original Gō model, the atomic interaction-based CG model version 2+ (AICG2+) [26] protein model introduces flexible local potentials and sequence-dependent contact parameters to gain a finer description of the free energy landscapes. The same strategy has been applied to RNA molecules [27]. On the other hand, for DNA, the 3SPN.2C model has been designed to describe the sequence-dependent geometric and mechanical properties of double-stranded DNA (dsDNA) [28].

Fig 1. Mapping of a protein (top) and nucleic acids (bottom) from atomistic structures to the residue-level CG models.

Each amino acid residue is represented with one CG particle (Cα), and each nucleotide is represented using three CG particles, phosphate (P), sugar (S), and base (B), respectively. Left: three-dimensional structure of protein and nucleic acids. Big and transparent spheres are the CG particles, with Cα in green, P in red, S in yellow, and B in blue. Small and solid spheres represent heavy atoms, with phosphorus in dark blue, nitrogen in pale blue, oxygen in red, Cα in green, and all the other carbon in white. Right: two-dimensional ball-and-stick cartoons of protein and nucleic acids abstracted from the left structures, using the same color scheme.

https://doi.org/10.1371/journal.pcbi.1009578.g001

The AICG2+ and 3SPN.2C models have been used jointly in simulations of protein-DNA complexes [29–31], although these two were developed by different research groups. In these simulations, electrostatic and excluded volume interactions between proteins or between proteins and nucleic acids are necessary for heterogeneous biological systems. In addition to generic and physical interactions, the PWMcos method integrates high-throughput sequencing experimental results in the form of the position weight matrix (PWM) with complex structural information to describe the recognition of DNA bases by proteins [32]. Residue-level CG models are not limited to well-folded proteins; they can also be used to study highly flexible, intrinsically disordered regions (IDRs). In particular, two pairwise interaction models, the hydrophobicity scale (HPS) model [33] and the Kim-Hummer (KH) model [34], have been tested and refined by Best, Mittal, and their coworkers to reproduce the phase behaviors of protein IDRs [35,36] and RNAs [37].

The CG models mentioned above were implemented in general-purpose MD packages such as LAMMPS [38,39], GROMACS [40,41], NAMD [40,42], and OpenMM [43], or specialized CG software such as CafeMol [44]. Although these implementations are convenient tools for performing CG simulations, there are still some barriers for regular users or even developers. Each residue-based CG model is usually developed by different groups in different MD packages. If someone wants to use two of them simultaneously, it is not straightforward to combine two different CG models by treating their input or output data properly. Therefore, there is a necessity to incorporate the latest CG models into a unified working environment and provide a user-friendly tool for preparing MD input files. Furthermore, simultaneous usages of different CG models can be a powerful tool in the cutting-edge studies of protein-nucleic acid complexes in cellular environments. For example, the liquid-liquid phase-separated (LLPS) systems often involve complicated interactions among multi-domain proteins [45], IDRs [46], RNAs [47], and DNAs [48] and reach the length scales of hundreds to thousands of nanometers [49]. Such systems require both proper modeling of biomolecular interactions and efficient handling of computational resources.

In the current work, we present an implementation of several residue-level CG models of protein and nucleic acids in the GENESIS MD software [50,51]. In conjunction, we provide a collection of Julia [52] scripts, which we call GENESIS-CG-tool, to help users generate CG MD input files of complex biomolecular systems. The optimization and parallelization of CG MD simulations are the other focus in the current work. Although all-atom MD simulations in GENESIS have been highly optimized and parallelized, additional efforts are required to improve the performance of CG MD simulations because of the uniqueness of individual interaction potentials. We also noticed that some computations that take negligible fractions of time in all-atom MD simulations could be a bottleneck in CG MD simulations. After the implementation in GENESIS, memory and performance benchmark tests are carried out. CG MD simulations in GENESIS are applied to several cases of different sizes and molecular compositions, such as protein diffusion on DNA and LLPS consisting of protein IDRs and RNAs. The atomistic and CG MD simulations available in GENESIS can be a practical framework to investigate cellular-scale biological phenomena at multi-scale resolutions.

Design and implementation

Basic interaction terms

We first define basic interaction terms that are commonly used in various residue-level CG models. Each model, such as AICG2+, 3SPN.2C, HPS/KH, combines several basic interaction terms to represent intra- and inter-molecular potentials of proteins and/or nucleic acids.

Bond potential.

The most typical bond potential has the following harmonic form:

$$E_{bond}(b_i) = k_{b,i}\,(b_i - b_{i,0})^2 \tag{1}$$

where b_i is the bond length between two neighboring CG particles, b_i,0 is the value of b_i in the reference structure, and k_b,i is the force constant.

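A higher-order term is also used in some CG models, such as the 3SPN.2C DNA model [39]:

$$E_{bond}^{quartic}(b_i) = k_{b,i}^{(2)}\,(b_i - b_{i,0})^2 + k_{b,i}^{(4)}\,(b_i - b_{i,0})^4 \tag{2}$$

where b_i and b_i,0 have the same meaning as in Eq (1), and k_b,i^(2) and k_b,i^(4) are the force constants in the quadratic and quartic terms, respectively.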

Angle potential.

The harmonic angle potential is given by:

$$E_{angle}(\theta_i) = k_{a,i}\,(\theta_i - \theta_{i,0})^2 \tag{3}$$

where θ_i is the bond angle formed by the adjacent three CG particles, θ_i,0 is its reference value, and k_a,i is the force constant.

The AICG2+ protein model [26] uses a knowledge-based angle potential, which is based on the Boltzmann inversion method [53]:

$$E_{angle}^{flp}(\theta_i) = -k_B T \ln\frac{P_a(\theta_i)}{\sin\theta_i} \tag{4}$$

where k_B is the Boltzmann constant, T is the temperature of the MD simulation, and P_a(θ_i) is an amino-acid type-dependent probability distribution of the angles analyzed from a set of PDB structures. The resulting potential is further refined via the iterative Boltzmann algorithm. In practice, in MD simulations, we calculate the energies and forces using spline functions fitted to a table containing the energy and its derivatives at several points.
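The tabulated evaluation described above can be illustrated with a short sketch. It is not the GENESIS implementation: the Boltzmann-inverted form with a sin θ Jacobian, the cubic Hermite interpolation scheme, the grid, and the synthetic distribution are all assumptions made for illustration. A Hermite scheme is used here because it consumes exactly a table of energies and derivatives.

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def boltzmann_inversion(theta, prob, temperature):
    """Angle energy from a probability table: -kB*T*ln(P(theta)/sin(theta))."""
    return -KB * temperature * np.log(prob / np.sin(theta))

def table_lookup(x_tab, e_tab, de_tab, x):
    """Cubic Hermite interpolation of tabulated energies and derivatives.

    Returns the interpolated energy and its derivative dE/dx at x.
    """
    i = np.clip(np.searchsorted(x_tab, x) - 1, 0, len(x_tab) - 2)
    h = x_tab[i + 1] - x_tab[i]
    t = (x - x_tab[i]) / h
    energy = ((2 * t**3 - 3 * t**2 + 1) * e_tab[i]
              + (t**3 - 2 * t**2 + t) * h * de_tab[i]
              + (-2 * t**3 + 3 * t**2) * e_tab[i + 1]
              + (t**3 - t**2) * h * de_tab[i + 1])
    slope = ((6 * t**2 - 6 * t) * (e_tab[i] - e_tab[i + 1]) / h
             + (3 * t**2 - 4 * t + 1) * de_tab[i]
             + (3 * t**2 - 2 * t) * de_tab[i + 1])
    return energy, slope

# Synthetic distribution (hypothetical): chosen so that the inverted energy
# is the parabola k_true * (theta - theta0)**2.
temperature = 300.0
k_true, theta0 = 10.0, 1.8
theta_tab = np.linspace(0.5, 3.0, 26)
prob_tab = np.sin(theta_tab) * np.exp(-k_true * (theta_tab - theta0) ** 2
                                      / (KB * temperature))
e_tab = boltzmann_inversion(theta_tab, prob_tab, temperature)
de_tab = np.gradient(e_tab, theta_tab, edge_order=2)  # tabulated derivatives
energy, slope = table_lookup(theta_tab, e_tab, de_tab, 1.73)
```

The force on the angle is the negative of `slope`. Because the table stores derivatives alongside energies, energy and force stay consistent between grid points.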

Dihedral potential.

The periodic proper dihedral potential is defined as follows:

$$E_{dihedral}(\phi_i) = k_{\phi,i,n}\left[1 - \cos\left(n(\phi_i - \phi_{i,0})\right)\right] \tag{5}$$

where ϕ_i and ϕ_i,0 are the dihedral angle and its reference value, respectively, n is an integer number that controls the periodicity of the function, and k_ϕ,i,n is the force constant.

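We also implement a Gaussian-type dihedral potential, which is used in the AICG2+ [26] and the 3SPN.2C models [39]:

$$E_{dihedral}^{Gaussian}(\phi_i) = -\epsilon_{\phi,i}\exp\left(-\frac{(\phi_i - \phi_{i,0})^2}{2\sigma_{\phi,i}^2}\right) \tag{6}$$

where ϕ_i and ϕ_i,0 have the same meaning as in Eq (5), σ_ϕ,i controls the Gaussian width, and ϵ_ϕ,i is the force constant.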
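A practical detail of any Gaussian dihedral term is that the angular difference has to be taken on a circle. The sketch below assumes an attractive well of depth ϵ centered on the reference angle and wraps the difference into [-π, π); both the sign convention and the wrapping strategy are assumptions for illustration, not a description of the GENESIS source code.

```python
import math

def gaussian_dihedral(phi, phi0, sigma, epsilon):
    """Gaussian dihedral well; returns (energy, dE/dphi). Angles in radians."""
    # Wrap the difference so that +pi and -pi are treated as the same angle.
    delta = (phi - phi0 + math.pi) % (2.0 * math.pi) - math.pi
    energy = -epsilon * math.exp(-delta**2 / (2.0 * sigma**2))
    d_energy = -energy * delta / sigma**2
    return energy, d_energy

# Two angles on opposite sides of the +/-pi seam are only 0.2 rad apart.
e_seam, _ = gaussian_dihedral(math.pi - 0.1, -math.pi + 0.1, 0.3, 1.0)
e_plain, _ = gaussian_dihedral(-0.2, 0.0, 0.3, 1.0)
```

Without the wrapping step, the first call would see a difference of almost 2π and return an energy close to zero.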

Similar to Eq (4), AICG2+ [26] also uses a knowledge-based dihedral term, which is based on the Boltzmann inversion method [53]:

$$E_{dihedral}^{flp}(\phi_i) = -k_B T \ln P_d(\phi_i) \tag{7}$$

where P_d(ϕ_i) is the probability distribution of the dihedral angle. This potential is further refined via the iterative Boltzmann algorithm and is also calculated as a tabulated function for better performance. Notably, in each of the dihedral potentials described above, we introduce a recently proposed algorithm [54] to improve the robustness and numerical stability of the force evaluations.

Nonbonded interactions.

The structure-based Gō-like models for proteins usually use a 12–10 potential to model a native contact, defined as two residues containing heavy atoms at a distance less than a specific cutoff in the native structure [14,26]. The potential [14] is defined as:

$$E_{Gō}(r_i) = \epsilon_{Gō,i}\left[5\left(\frac{\sigma_i}{r_i}\right)^{12} - 6\left(\frac{\sigma_i}{r_i}\right)^{10}\right] \tag{8}$$

where r_i is the distance between two CG particles in a native-contact pair, σ_i is the native value of r_i, and ϵ_Gō,i is the force constant. Note that it is applicable not only to proteins but also to nucleic acids if their experimental structures are known.
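As an illustration of the native-contact definition above, the following sketch lists residue pairs that have any pair of heavy atoms closer than a cutoff, and stores the Cα distance as the reference value σ_i for the 12–10 term. The 6.5 Å cutoff, the minimum sequence separation of four residues, and the function names are hypothetical choices for this example, not values taken from the text.

```python
import numpy as np

def native_contacts(heavy_atoms, ca_coords, cutoff=6.5, min_separation=4):
    """Return (i, j, sigma) for residue pairs in contact in the native structure.

    heavy_atoms: list of (n_atoms, 3) arrays, one per residue.
    ca_coords:   (n_residues, 3) array of C-alpha positions.
    """
    contacts = []
    n_res = len(heavy_atoms)
    for i in range(n_res):
        for j in range(i + min_separation, n_res):
            diff = heavy_atoms[i][:, None, :] - heavy_atoms[j][None, :, :]
            if np.sqrt((diff ** 2).sum(axis=-1)).min() < cutoff:
                sigma = float(np.linalg.norm(ca_coords[i] - ca_coords[j]))
                contacts.append((i, j, sigma))
    return contacts

def go_12_10(r, sigma, epsilon):
    """12-10 native-contact energy; the minimum (-epsilon) sits at r = sigma."""
    x = sigma / r
    return epsilon * (5.0 * x ** 12 - 6.0 * x ** 10)

# Toy five-residue hairpin with a single heavy atom per residue.
ca = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0],
               [7.6, 5.0, 0.0], [0.0, 5.0, 0.0]])
atoms = [ca[k:k + 1] for k in range(len(ca))]
contacts = native_contacts(atoms, ca)
```

The double loop is quadratic in the number of residues, which is why this step becomes expensive for large complexes.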

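The Lennard-Jones (LJ) potential (or 12–6 potential) is also widely used to describe pairwise nonlocal interactions and is commonly defined as:

$$E_{LJ}(r_i) = 4\epsilon_{LJ,i}\left[\left(\frac{\sigma_i}{r_i}\right)^{12} - \left(\frac{\sigma_i}{r_i}\right)^{6}\right] \tag{9}$$

where ϵ_LJ,i is the depth of the energy minimum and, in this form, σ_i is the distance at which the energy crosses zero.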

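The LJ potential truncated at its energy minimum is a purely repulsive potential and can be used as the excluded volume interaction:

$$E_{exv}(r_i) = \begin{cases} \epsilon_{exv,i}\left[\left(\dfrac{\sigma_i}{r_i}\right)^{12} - 2\left(\dfrac{\sigma_i}{r_i}\right)^{6}\right] + \epsilon_{exv,i}, & r_i < r_{C,exv} \\ 0, & r_i \ge r_{C,exv} \end{cases} \tag{10}$$

where ϵ_exv,i sets the energy scale, and r_C,exv is the cutoff distance and has the same value as σ_i. Note that the LJ forms in Eq (9) and Eq (10) differ by a coefficient of 2 in the 6th-power term. However, these two forms are mathematically equivalent when the parameters (σ and ϵ) are appropriately converted.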
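The conversion between the two LJ conventions can be written out explicitly. The derivation below assumes that Eq (9) is the usual 4ϵ form and that Eq (10) places the energy minimum at r = σ; the primed symbol σ′ is introduced here only to distinguish the two conventions.

```latex
\begin{aligned}
\text{Let } \sigma' &= 2^{1/6}\sigma
  \quad\Rightarrow\quad
  \left(\frac{\sigma}{r}\right)^{6} = \frac{1}{2}\left(\frac{\sigma'}{r}\right)^{6},
  \qquad
  \left(\frac{\sigma}{r}\right)^{12} = \frac{1}{4}\left(\frac{\sigma'}{r}\right)^{12}, \\[4pt]
4\epsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right]
  &= 4\epsilon\left[\frac{1}{4}\left(\frac{\sigma'}{r}\right)^{12}
     - \frac{1}{2}\left(\frac{\sigma'}{r}\right)^{6}\right]
   = \epsilon\left[\left(\frac{\sigma'}{r}\right)^{12}
     - 2\left(\frac{\sigma'}{r}\right)^{6}\right].
\end{aligned}
```

The right-hand side reaches its minimum value of −ϵ at r = σ′, so adding ϵ and truncating at r = σ′ gives a potential that is purely repulsive and goes continuously to zero at the cutoff.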

A simpler form of the excluded volume interaction is given by: (11) where r_C,exv = 2σ_i and .

Other pairwise nonbonded interactions include the Gaussian potential: (12) and the Morse potential: (13) where ϵ_G,i and ϵ_M,i are the “depth”, and w_i and α_i are the “width” of the Gaussian and Morse potentials, respectively. In the 3SPN.2C model [39], the Morse potential is divided into two parts, the repulsive () and the attractive (), as defined in the following: (13A) and (13B)

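The electrostatic interactions are modeled using the Debye-Hückel theory:

$$E_{ele}(r_i) = \frac{q_{i_1} q_{i_2}}{4\pi\varepsilon_0\varepsilon_r}\,\frac{e^{-r_i/\lambda_D}}{r_i} \tag{14}$$

where r_i is the distance between the charged particles i₁ and i₂, whose charges are q_i₁ and q_i₂, respectively, ε₀ is the dielectric permittivity of vacuum, λ_D is the Debye screening length, and ε_r is the relative permittivity of the solution, which is a function of the solution temperature T (in K) and salt molarity C (in mol/L): ε_r = e(T)a(C), where $e(T) = 249.4 - 0.788\,T + 7.20\times10^{-4}\,T^2$ [55], and $a(C) = 1 - 0.2551\,C + 5.151\times10^{-2}\,C^2 - 6.889\times10^{-3}\,C^3$ [56]. The Debye length is defined as:

$$\lambda_D = \sqrt{\frac{\varepsilon_0\varepsilon_r k_B T}{2 N_A e_c^2 I}}$$

where N_A is Avogadro’s number, e_c is the elementary charge, and I is the ionic strength of the solution.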
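To give a feeling for the magnitudes involved, the sketch below evaluates a screened Coulomb interaction in SI-derived units. It assumes the standard Debye-Hückel expression, the polynomial fits for the relative permittivity used in the 3SPN.2 family of DNA models, and a monovalent salt (so that ionic strength equals salt molarity); the function names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
KB = 1.380649e-23         # Boltzmann constant, J/K
NA = 6.02214076e23        # Avogadro's number, 1/mol
EC = 1.602176634e-19      # elementary charge, C
J_PER_KCAL = 4184.0

def relative_permittivity(temp_k, salt_molar):
    """eps_r = e(T) * a(C); polynomial coefficients are an assumption (3SPN.2-style fits)."""
    e_t = 249.4 - 0.788 * temp_k + 7.20e-4 * temp_k ** 2
    a_c = (1.0 - 0.2551 * salt_molar + 5.151e-2 * salt_molar ** 2
           - 6.889e-3 * salt_molar ** 3)
    return e_t * a_c

def debye_length(temp_k, ionic_strength_molar):
    """Debye screening length in angstroms."""
    eps_r = relative_permittivity(temp_k, ionic_strength_molar)
    ionic_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    lam_m = math.sqrt(EPS0 * eps_r * KB * temp_k
                      / (2.0 * NA * EC ** 2 * ionic_si))
    return lam_m * 1e10

def debye_huckel_energy(r_angstrom, q1, q2, temp_k, ionic_strength_molar):
    """Screened Coulomb energy in kcal/mol; charges in units of e."""
    eps_r = relative_permittivity(temp_k, ionic_strength_molar)
    lam = debye_length(temp_k, ionic_strength_molar)
    # e^2 * N_A / (4 pi eps0), converted from J*m/mol to kcal*angstrom/mol
    coulomb = EC ** 2 * NA / (4.0 * math.pi * EPS0) / J_PER_KCAL * 1e10
    return coulomb * q1 * q2 / (eps_r * r_angstrom) * math.exp(-r_angstrom / lam)

lam_300 = debye_length(300.0, 0.15)
e_at_52 = debye_huckel_energy(52.0, 1.0, -1.0, 300.0, 0.15)
```

Under these assumptions the screening length at 300 K and 150 mM is a little under 8 Å, so a pair of unit charges separated by several screening lengths interacts only very weakly.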

Modulating functions.

When residue-level CG models are used to describe specific nonbonded interactions, such as the hydrogen bonds (HBs) or the π-π stacking, it is necessary to introduce multi-body potentials. For instance, in the 3SPN.2C model, the base-stacking, base-pairing, and cross-base-stacking interactions are described with the angle and torsion-angle dependent multi-body potential functions [28,39]. Specifically, the angle-dependent modulating function in the 3SPN.2C model is defined as [39]: (15) where Δθ is the difference between an angle (θ) and its reference value, and γ controls the tuning range.

Residue-level CG models

We next describe the CG models as the combinations of the basic interaction terms defined in the last section. We mainly focus on the energy function forms rather than the detailed parameters, which users can easily change in their MD simulations.

The AICG2+ model for folded proteins.

The energy function of the AICG2+ model [26] is defined as: (16) where Γ and Γ₀ are the simulated and native structures, respectively. The reference values in each term are determined from the native values in Γ₀, except for the non-native contact term. In the non-native contact interactions, σ_i is dependent on the excluded volume radii of the particles in contact, using the combination rule of , where i₁ and i₂ are the indices of the particles in the i-th non-native contact, with and as their residue type-dependent radii.

The HPS and KH models for IDR.

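The two IDR models, HPS [33] and KH [34], share the same energy function [35]:

$$E_{IDR}(\Gamma) = \sum_{bonds} E_{bond}(b_i) + \sum_{pairs} E_{HPS,KH}(r_i) + \sum_{charged\ pairs} E_{ele}(r_i) \tag{17}$$

where Γ is the conformation of an IDP. E_HPS,KH(r_i) has the Ashbaugh-Hatch form [57]:

$$E_{HPS,KH}(r_i) = \begin{cases} E_{LJ}(r_i) + (1 - \lambda_i)\,\epsilon_{LJ,i}, & r_i \le 2^{1/6}\sigma_i \\ \lambda_i\,E_{LJ}(r_i), & r_i > 2^{1/6}\sigma_i \end{cases} \tag{18}$$

where λ_i in the HPS model is the hydrophobicity scale, and in the KH model is a sign function to regulate the attractiveness or repulsiveness. All the other quantities, r_i, σ_i, and ϵ_LJ,i, have the same meaning as defined in Eq (9). Specifically, ϵ_LJ,i and λ_i are based on the amino acid hydrophobicity in HPS [33] or the Miyazawa-Jernigan potential in KH [34,58], respectively.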
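The role of λ_i in the Ashbaugh-Hatch form is easiest to see numerically. The sketch below assumes the standard piecewise definition built on the 4ϵ LJ form; the parameter values in the usage lines are illustrative only and are not taken from the HPS or KH parameter sets.

```python
def lennard_jones(r, sigma, epsilon):
    """Standard 12-6 LJ energy with well depth epsilon."""
    x6 = (sigma / r) ** 6
    return 4.0 * epsilon * (x6 * x6 - x6)

def ashbaugh_hatch(r, sigma, epsilon, lam):
    """LJ-like pair energy whose attractive well is scaled by lam."""
    r_min = 2.0 ** (1.0 / 6.0) * sigma  # position of the LJ minimum
    if r <= r_min:
        return lennard_jones(r, sigma, epsilon) + (1.0 - lam) * epsilon
    return lam * lennard_jones(r, sigma, epsilon)

sigma, epsilon = 6.0, 0.2           # illustrative values (angstrom, kcal/mol)
r_min = 2.0 ** (1.0 / 6.0) * sigma
full_lj = ashbaugh_hatch(8.0, sigma, epsilon, 1.0)   # lam = 1: plain LJ
repulsive = ashbaugh_hatch(8.0, sigma, epsilon, 0.0)  # lam = 0: no attraction
```

With λ = 1 the function reduces to the plain LJ potential, with λ = 0 only the repulsive core remains, and negative λ turns the tail repulsive. The two branches meet at the LJ minimum, where the energy equals −λϵ.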

The same potential functions are also used in the HPS RNA model [37], which has been developed to study the co-condensation of RNA and protein IDRs. In the HPS RNA model [37], the resolution is only one bead for one nucleotide, which is lower than the other nucleic acid CG models in GENESIS. Accordingly, the bond lengths and particle radii are also larger than the other models. Note that GENESIS also provides a structure-based three-bead-per-nucleotide RNA model, as described in a later section.

The 3SPN.2C model for DNA.

The potential energy function of the 3SPN.2C model [28,39] is defined as follows: (19) where Γ is the conformation of the DNA molecule, and "exv pairs" are the nonbonded particle pairs that are not involved in either base-pairing or stacking interactions. The three terms for base-base interactions, E_bstk, E_bp, and E_cstk, are defined as multi-body energy functions, using the basic equations described in the previous section (Eq (13A), (13B) and (15)): (20) (21) (22) where r_i represents the distance between the two interacting bases, and the angles (θ_BS,i, θ_1,i, θ_2,i, θ_3,i, and θ_CS,i) and dihedral angles (ϕ_BP,i) are formed by the surrounding sugar and phosphate sites (for details of the definition, please refer to [39]).

The structure-based model for RNA.

In addition to the one-bead (per-nucleotide) HPS model, we also implement a structure-based three-bead model for RNA [27], whose parameters are determined with the fluctuation matching method [59]. The energy function of this model [27] is defined as: (23) where Γ is the conformation of the RNA molecule. The electrostatic term, E_ele, is optional in the intra-RNA interactions but required for protein-RNA interactions [27]. In the latter case, r_i in E_ele(r_i) represents the distance between a charged amino-acid residue and an RNA phosphate.

The PWMcos model for protein-DNA interaction.

The protein-DNA binding is generally considered in two parts: the sequence-nonspecific interactions between amino acids with the DNA backbone groups (mainly the electrostatic interactions) and the sequence-specific interactions between amino acids and DNA bases. The PWMcos model can describe the latter, incorporating the PWM information into the structure-based interactions [32]. The model first defines a list of DNA-binding protein residues (DB-C_αs) forming contact with DNA in the native structure, and then the potential energy is then given by [32]: (24) where m′ is the base in the complementary strand that forms a base-pair with base m, b_i is the base type of any base i (b_i∈[A, C, G, T]), and is the coordinates of particles in each conformation. The function is defined as: where r_ij is the distance between the i-th and the j-th C_α, and θ₁, θ₂, and θ₃ are three angles defined by the surrounding particles (for details, please refer to [32]). E_Gaussian(r) and f(Δθ) functions are defined in Eq (12) and (15), respectively. The energy coefficient of E_Gaussian(r) (ϵ_G,i in Eq (12)) is defined as a function of base type b_i based on the PWM: where e_PWM,m(b) is the element in the mth column of the PWM and in the row corresponding to base type b, N_m is the total number of protein contacts formed with the mth base pair, ϵ′ is a hyperparameter that shifts the absolute value of the function, and γ is a scaling factor to change the strength of the interaction.

Notably, a variation of the PWMcos has been used to model sequence-non-specific hydrogen bonds between protein and DNA backbone [30]: (25) where r_ij represents the distance from the i-th phosphate (instead of base) to the j-th C_α, and θ₁ and θ₂ are the angles defined by the local particles around the target phosphate and C_α atom in the configuration with coordinates .

Summary of the potential energy functions available in GENESIS.

Although the models mentioned above are developed independently, some have been proved to work well with each other. For instance, the association of the AICG2+ and the 3SPN.2C models has shown the capability to study the protein-DNA binding systems [29,31]. Between proteins and DNA, electrostatic interactions contribute most to the sequence-nonspecific affinity. For the accuracy of electrostatics modeling, the RESPAC method was developed to calculate the surface charge distribution of proteins [60]. Furthermore, the angle-dependent HB potential enables modeling the rotation-coupled and uncoupled sliding of nucleosomal DNA [30,61]. The PWMcos method also facilitates the target search of transcription factors on both linear DNA and nucleosomes [32,62].

In addition to these well-developed methods, GENESIS also allows the combinations of different models. In Table 1, we list all the available CG models in GENESIS 1.7.0 in mixtures of proteins, DNAs, and RNAs.

Table 1. Available CG models in heterogeneous biomolecular systems.

https://doi.org/10.1371/journal.pcbi.1009578.t001

Information flow

Fig 2 shows a flowchart for running CG simulations in GENESIS. We roughly divide the whole procedure into three steps: (i) preparation, (ii) simulation, and (iii) analysis. GENESIS provides tools to achieve the task in each step. In the preparation stage, we use a collection of scripts, “GENESIS-CG-tool”, to translate experimental results into CG topology and coordinate files. Together with the MD control file, these files are then parsed by one of the GENESIS MD engines, atdyn, to perform energy/force calculations and time integrations. As the output data from MD simulations, GENESIS produces a log file of system properties including temperature, system size, and potential energy. The output coordinates from MD simulation are written in a trajectory file with the DCD format, which are processed by the GENESIS analysis tools and can be visualized by VMD [63], PyMOL [64], or other molecular graphics software. Here, we mainly focus on our developments in the first two stages, since the analysis using the DCD format files and analysis tools is common to those of all-atom MD simulations.

Fig 2. Information flow in GENESIS CG MD simulations.

Dashed boxes represent collections of files used together as input or output at a particular step. Solid red boxes are used for the executable tools provided by GENESIS. Small boxes include typical filename extensions.

https://doi.org/10.1371/journal.pcbi.1009578.g002

Preparation.

We next explain the usage of the GENESIS-CG-tool, which can read structural information and generate force field parameters and coordinates for MD simulations. As input, the structural information is in the PDB format, which includes atom coordinates determined by experiments such as X-ray crystallography, NMR, or cryo-EM. As output, we decided to use a unified GROMACS-like file format for all the CG models. Particularly, the output of GENESIS-CG-tool includes a major topology file (.top) and a bundle of molecule-specific files (.itp), as well as a coordinate file for all the particles in the system (.gro) (Fig 2). The molecule-specific topology (.itp) files contain the information of intra-molecular interactions, such as bond, angle, dihedral, and Gō-type native-contact interactions. Each term includes the indices of the involved particles, reference values of the variables, and the force parameters. Additionally, an integer variable (called the “function type”) is used to distinguish the various potential functions of the same interaction type. For instance, function types 1 and 21 select the harmonic bond potential (Eq (1)) and the quartic bond potential (Eq (2)), respectively.

GENESIS-CG-tool was developed with Julia [52], a fast, open-source, and reproducible programming language. The project’s main entry point is a script called “aa_2_cg.jl”, which works in the command-line environment of UNIX-like shells. The script takes a PDB file name as the positional argument and several model-related optional arguments. During the processing, the atomistic coordinates are read from PDB files. Then, local and nonlocal interactions, including their native values, are detected from the PDB structure. Other parameters, such as force constants, are determined by the models specified for each molecule.

As a simple example, assuming that one wants to prepare CG MD input files for a protein using a PDB file, “PRO1.pdb”, the command “aa_2_cg.jl PRO1.pdb” will generate “PRO1_cg.top” and “PRO1_cg.itp” topology files and “PRO1_cg.gro” coordinate file (Figs 2 and S1). The AICG2+ model is used for proteins by default, which can be changed with the option “--force-field-protein”.

The 3SPN.2C model of DNA uses reference structures that include sequence-dependent geometric features. These structures can be generated by software such as 3DNA [65]. In GENESIS-CG-tool, with no requirement of external packages, we provide a simple “sequence-to-structure” script, which reads in DNA sequences and directly outputs both atomistic PDB and CG topology/coordinate files for dsDNA, including the necessary sequence-dependent information (see S1 Fig).

For heterogeneous systems that contain more than one type of biomolecules, the molecule-specific “.itp” files can be linked into the “.top” file, using the “#include” keyword.

Notably, recent improvements in experimental techniques such as cryo-EM have yielded more high-resolution structures of huge biomolecular complexes, which are valuable resources for MD simulations. The sizes of these structures, however, exceed the capacity of the traditional PDB file format. Another format, PDBx/mmCIF [66], has been used to store the coordinate information of large structures, and GENESIS-CG-tool also accepts PDBx/mmCIF files as input. We also noticed that detecting the Gō-like native contacts is rather time-consuming for large protein complexes, so we use Julia thread parallelization to accelerate this procedure in GENESIS-CG-tool.

We have packaged GENESIS-CG-tool as a part of GENESIS v1.7.0. We have also deployed GENESIS-CG-tool in a separate GitHub repository: https://github.com/noinil/genesis_cg_tool. Users can refer to the online wiki page of this project for more details on the usage and available file formats.

MD algorithms

In GENESIS v1.7.0, there are two MD programs: spdyn and atdyn. Spdyn is parallelized using the mid-point cell method [67], one of the domain-decomposition schemes; while atdyn is parallelized with the atomic decomposition scheme [50,51]. We implement the residue-level CG models only in atdyn because of the feasibility of implementations and the number of CG particles required in most CG MD simulations. In atdyn, the Clementi Gō [14], Karanicolas-Brooks Gō [68], Domain-enhanced Model (DoME) [69], and dual/multi-basin Gō models [70] are also available and use the same topology and coordinate file formats as described in the previous section. Since atdyn is already a well-developed framework, the original source code is reused as much as possible to implement the residue-level CG models. In the following, we describe the newly introduced functions and optimization schemes only.

Cutoff scheme of the nonbonded interactions.

The nonbonded interaction terms are the most time-consuming computations in both all-atom and CG MD simulations. Cutoff schemes are used for the nonbonded terms in CG MD simulations to reduce the computation time. For relatively short-range nonbonded interactions, such as the two excluded volume terms, cutoff values are already included in the definition (see Eq (10) and (11)). In contrast, the other interactions do not have generally established cutoff values. We provide three parameters to control the cutoff lengths of the E_bp, E_LJ, and E_ele terms in our implementations. We set the default cutoff value of E_bp to 18Å, following the original 3SPN.2C model [28,39]. We use a value of 52Å for E_ele, which ensures that the absolute value of the energy at the cutoff distance is smaller than 10⁻⁴ kcal/mol (calculated at a temperature of 300K and an ionic concentration of 150mM, for two charges of ±1 e⁻). This cutoff value of E_ele is slightly larger than the one used in the original 3SPN.2C model and guarantees accuracy [39]. As for the LJ potential used in the HPS/KH IDR models, the parameters ϵ_LJ,i and σ_i depend on the residue types [35]. Therefore, we determine the LJ cutoff such that, for the i-th pair, the absolute value of the energy at the cutoff distance is 10⁻⁴ times the depth of the energy minimum: |E_LJ(r_C,LJ)| = 10⁻⁴ϵ_LJ,i. This gives a value of r_C,LJ≃5.849σ_i. We then choose the largest σ_i from the HPS/KH models to determine a unified value of 39Å as the cutoff for E_LJ. Note that these default values are considered the “safest” ones. Users may change them to smaller values after carefully evaluating the balance between computational efficiency and accuracy.
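The LJ cutoff criterion can be checked numerically. The sketch below assumes the standard 4ϵ form of the LJ potential and solves for the distance, in units of σ, beyond the minimum at which the magnitude of the energy falls to a given fraction of the well depth. The bisection bounds and the function name are arbitrary choices for this example.

```python
def lj_cutoff_factor(tolerance=1e-4, lo=1.2, hi=20.0, iterations=200):
    """Return x = r_c / sigma (beyond the minimum) where |E_LJ| = tolerance * epsilon."""
    def residual(x):
        x6 = x ** -6
        # |E_LJ| / epsilon for x > 1, minus the target fraction
        return 4.0 * (x6 - x6 * x6) - tolerance

    # |E_LJ| decreases monotonically beyond the minimum, so bisection is safe.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

factor = lj_cutoff_factor()

def lj_cutoff(sigma_max, tolerance=1e-4):
    """Cutoff distance for the largest sigma in the parameter set."""
    return lj_cutoff_factor(tolerance) * sigma_max
```

The resulting factor is close to 5.85, in line with the value quoted above. Because the cutoff scales linearly with σ, the largest σ in the parameter set fixes the unified cutoff.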

Several nonbonded interaction terms are considered only for pre-defined pairs of CG particles, such as the DNA base-stacking (E_bstk) and the Gō-type native contacts (E_Gō). For these terms, we do not apply cutoffs but directly calculate all the interactions.

Neighbor lists.

Neighbor lists are used to maintain a list of particles that can participate in nonbonded energy/force interactions within a few MD integration steps. We set the neighbor list distances (r_P, also called “pair-list distance” in GENESIS) to be roughly 5Å larger than the largest cutoff distance (r_C). A complete list of the default cutoff and neighbor list distances can be seen in Table 2.

Table 2. Default values for the nonbonded cutoff and pair-list distances.

https://doi.org/10.1371/journal.pcbi.1009578.t002

We utilize the cell-linked list strategy to construct the neighbor lists. Fig 3 illustrates how we determine the neighbor lists of different nonbonded interaction terms. During every update of the neighbor lists, the system is first divided into small cells. In each cell C_i and for each interaction term, we construct a “particle list”, L_nb−term(C_i), which contains the particles that are located in C_i and involved in the interaction (“nb-term”). For example, L_ele(C_i) is a list of all the charged particles in the cell C_i. We then determine the “neighboring cells” of each cell C_i for each interaction term as the set of cells C_k that satisfy r_min(C_i, C_k) ≤ r_P,nb−term, where r_min(C_i, C_k) is the minimum distance between the cells C_i and C_k and r_P,nb−term is the neighbor-list distance of that term. As shown in Fig 3A, relatively short-range interaction terms such as the excluded-volume terms (E_exv) have a small number of neighboring cells, whereas those with longer cutoff distances, such as E_ele, have more neighboring cells.

Fig 3. Using the cell linked list method to construct multiple neighbor lists.

(A) An example of determining the neighbor lists for E_exv (representing either of the two excluded-volume terms) and E_ele in a two-dimensional space. The gray dotted lines show the edges of the cells. The center cell with dark dashed edges (C_i) contains the particle a (black dot) whose neighbor list is to be determined. r_P,exv and r_P,ele are the neighbor-list distances for E_exv and E_ele, respectively. The neighboring cells for the E_exv term are represented by the red cells and surrounded by the red dashed lines. Those for the E_ele term are shown as the green area edged by green dashed lines. The red and green circles indicate the range of the final neighbor lists for E_exv and E_ele, respectively. (B) Pseudo code of the algorithm to determine a neighbor list of the electrostatic interaction. L_ele(C_k) is a list of all the charged particles in cell C_k.

https://doi.org/10.1371/journal.pcbi.1009578.g003

After defining the “particle list” (L_nb−term(C_i)) and the “neighboring cells” of each cell, the neighbor list for each particle can be determined by the algorithm in Fig 3B (for convenience, we use the electrostatic interaction as an example here). The neighbor lists are updated every 20 steps in the CG simulations.
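
The procedure can be illustrated with a short Python sketch. It is not the GENESIS (Fortran) implementation and it ignores periodic boundaries. It only shows the two ingredients described above, a per-term particle list in each cell and a per-term set of neighboring cells selected by the minimum cell-cell distance. All function names, the cell size, and the choice of "charged" particles are assumptions made for illustration.

```python
import math
import random
from collections import defaultdict
from itertools import product


def dist2(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def neighboring_offsets(r_pair, cell_size):
    """Offsets of cells whose minimum distance to a given cell is <= r_pair."""
    reach = int(r_pair // cell_size) + 1
    offsets = []
    for off in product(range(-reach, reach + 1), repeat=3):
        # Adjacent cells touch, so only |offset| - 1 cell widths separate them.
        gap2 = sum((max(abs(o) - 1, 0) * cell_size) ** 2 for o in off)
        if gap2 <= r_pair ** 2:
            offsets.append(off)
    return offsets


def build_neighbor_list(coords, members, r_pair, cell_size):
    """Neighbor list of one nonbonded term; `members` are the particles it acts on."""
    # Per-cell "particle list", restricted to the particles of this term.
    cells = defaultdict(list)
    for i in members:
        key = tuple(math.floor(c / cell_size) for c in coords[i])
        cells[key].append(i)

    offsets = neighboring_offsets(r_pair, cell_size)
    neighbors = {i: [] for i in members}
    for key, plist in cells.items():
        for off in offsets:
            other = tuple(k + o for k, o in zip(key, off))
            if other not in cells:
                continue
            for i in plist:
                for j in cells[other]:
                    # Store each pair once, on the particle with the lower index.
                    if j > i and dist2(coords[i], coords[j]) <= r_pair ** 2:
                        neighbors[i].append(j)
    return {i: sorted(js) for i, js in neighbors.items()}


def brute_force_neighbor_list(coords, members, r_pair):
    """Reference O(N^2) construction, used only to check the cell-based result."""
    return {i: sorted(j for j in members
                      if j > i and dist2(coords[i], coords[j]) <= r_pair ** 2)
            for i in members}


random.seed(0)
coords = [tuple(random.uniform(0.0, 60.0) for _ in range(3)) for _ in range(90)]
charged = list(range(0, 90, 3))   # hypothetical: every third particle is charged
ele_list = build_neighbor_list(coords, charged, r_pair=20.0, cell_size=8.0)
reference = brute_force_neighbor_list(coords, charged, r_pair=20.0)
```

Calling `build_neighbor_list` once per interaction term, each time with its own `members` and `r_pair`, gives one neighbor list per term, which is the idea behind the multiple neighbor lists of Fig 3.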

Using an individual particle list and an individual set of neighboring cells for each nonbonded interaction term, we can minimize the pairwise distance calculations. First, as described above, we can avoid do-loops over unnecessary cells for short-range interactions by using the potential-dependent neighboring cells. Second, in each cell, we consider only a pre-assigned subgroup of particles (the L_nb−term(C_i) lists defined above) instead of all the particles. For example, in a 3SPN.2C DNA system, to construct the neighbor list for the electrostatic interactions, we only have to scan the charged particles (phosphates), which are only ~1/3 of all the particles. For the base-pairing interactions, we instead scan another group of particles, the bases, which are also ~1/3 of all the particles. Generally speaking, for pairwise non-local interactions, considering 1/N of the particles means only 1/N² of the calculations compared to scanning over all the particles. Therefore, using the cell-linked list and grouping particles according to interaction terms can save a large amount of computation in the neighbor list construction. On the other hand, we also note that when periodic boundary conditions are used with the cell-linked list method, each dimension of the simulation box should be equal to or larger than three times the largest pair-list distance (57Å for the electrostatic interaction by default).
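
As a small worked example of these two statements, the sketch below counts candidate pairs for a hypothetical DNA system of 3000 CG particles in which one third of the particles are phosphates, and evaluates the minimum box edge implied by the default 57Å pair-list distance. The system size and the helper names are assumptions chosen for illustration.

```python
def n_pairs(n):
    """Number of distinct pairs among n particles."""
    return n * (n - 1) // 2


def min_box_edge(pairlist_distances):
    """Smallest periodic box edge allowed by the three-cell rule stated above."""
    return 3.0 * max(pairlist_distances)


n_total = 3000                  # hypothetical system: 1000 nucleotides x 3 beads
n_phosphate = n_total // 3      # only the phosphates are charged

all_pairs = n_pairs(n_total)
ele_pairs = n_pairs(n_phosphate)
fraction = ele_pairs / all_pairs          # close to 1/9 = 1/N^2 with N = 3

min_edge = min_box_edge([57.0])           # Angstrom, default electrostatic value
print(all_pairs, ele_pairs, round(fraction, 4), min_edge)
```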

Time Integration.

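GENESIS atdyn provides various integrators and thermostats/barostats. For the current CG implementation, we employ the velocity-Verlet integrator coupled with the Langevin thermostat. In the velocity-Verlet algorithm, coordinates (r_i) and velocities (v_i) are updated following:

$$\mathbf{r}_i(t+\Delta t) = \mathbf{r}_i(t) + \mathbf{v}_i(t)\,\Delta t + \frac{\mathbf{F}_i(t)}{2m_i}\,\Delta t^2 \tag{26A}$$

$$\mathbf{v}_i(t+\Delta t) = \mathbf{v}_i(t) + \frac{\mathbf{F}_i(t)+\mathbf{F}_i(t+\Delta t)}{2m_i}\,\Delta t \tag{26B}$$

where t is the simulation time, Δt is the integration step size, and m_i, r_i, F_i, and v_i are the mass, coordinate, force, and velocity of the i-th particle, respectively. In the Langevin thermostat, the force F_i is calculated by:

$$\mathbf{F}_i = -\nabla_i V_{total} - m_i\gamma\,\mathbf{v}_i + \boldsymbol{\xi}_i(t) \tag{27}$$

where V_total is the total potential energy and γ is the friction constant. ξ_i(t) is the Gaussian noise vector satisfying:

$$\left\langle \boldsymbol{\xi}_i(t)\,\boldsymbol{\xi}_i(t')^{T} \right\rangle = \frac{2 m_i \gamma k_B T}{\Delta t}\,\delta_{t,t'}\,\mathbf{I} \tag{28}$$

where k_B is the Boltzmann constant, T is temperature, δ_t,t′ = 1 when t = t′ and 0 otherwise, and I is an identity matrix of size 3×3. ξ_i(t′)^T is the transpose of ξ_i(t′) and the product follows matrix multiplication.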
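
A minimal Python sketch of one such integration step is given below. It assumes the usual Langevin form of the force (conservative force, friction −mγv, and Gaussian noise with per-component variance 2mγk_BT/Δt) and resolves the velocity dependence of the friction at t+Δt algebraically. This is one possible way to realize the scheme and is not necessarily how GENESIS implements it. Reduced units and all function names are assumptions.

```python
import math
import random


def noise_std(mass, gamma, kT, dt):
    """Standard deviation of each noise component for a discrete time step dt."""
    return math.sqrt(2.0 * mass * gamma * kT / dt)


def draw_noise(masses, gamma, kT, dt, rng):
    """One Gaussian noise value per degree of freedom."""
    return [rng.gauss(0.0, noise_std(m, gamma, kT, dt)) for m in masses]


def langevin_vv_step(x, v, f_cons, noise, force, masses, gamma, kT, dt, rng):
    """Advance positions and velocities by one velocity-Verlet step."""
    # Half kick with the full force at time t (conservative + friction + noise).
    v_half = [vi + 0.5 * dt / m * (fi - m * gamma * vi + ni)
              for vi, fi, ni, m in zip(v, f_cons, noise, masses)]
    x_new = [xi + dt * vh for xi, vh in zip(x, v_half)]
    f_new = force(x_new)
    noise_new = draw_noise(masses, gamma, kT, dt, rng)
    # Second half kick; the friction at t + dt depends on v_new, so solve for it.
    denom = 1.0 + 0.5 * gamma * dt
    v_new = [(vh + 0.5 * dt / m * (fi + ni)) / denom
             for vh, fi, ni, m in zip(v_half, f_new, noise_new, masses)]
    return x_new, v_new, f_new, noise_new


def harmonic_force(x):
    """Force of a unit-stiffness harmonic well, used only as a toy potential."""
    return [-xi for xi in x]


# Sanity check: with gamma = 0 the step reduces to plain velocity Verlet,
# so the energy of a harmonic oscillator should be conserved closely.
rng = random.Random(1)
masses = [1.0]
gamma, kT, dt = 0.0, 1.0, 0.01
x, v = [1.0], [0.0]
f = harmonic_force(x)
noise = draw_noise(masses, gamma, kT, dt, rng)
for _ in range(1000):
    x, v, f, noise = langevin_vv_step(x, v, f, noise, harmonic_force,
                                      masses, gamma, kT, dt, rng)
energy = 0.5 * v[0] ** 2 + 0.5 * x[0] ** 2
```

The noise drawn at the end of one step is reused at the start of the next, so that the force at time t+Δt is the same in both half kicks.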

We use 10 femtoseconds (fs) as the literal time step size (Δt in Eq 26), which is determined by the highest vibration frequency of the particles. On the other hand, the time scale mapping of large-scale conformational dynamics of biomolecules is system-dependent and requires a careful comparison between simulated and experimental physical quantities [22].

Performance Optimization

Next, we optimize the performance of CG MD simulations implemented in GENESIS. First, we optimize the nonbonded interactions, which are the most time-consuming parts in both residue-level CG and all-atom force-field models. There are two common optimization strategies. The first is to vectorize the innermost loops to utilize as many SIMD (single instruction, multiple data) instructions as possible. The second is to minimize the number of computations by using conditional statements. In contrast to all-atom models, “conditional” statements are unavoidable in CG models because of the complex functional forms of the potential energy. In our experience, the ratio r_P³:r_C³ is a good criterion for deciding whether a calculation should be vectorized or should use “conditional” statements. For instance, in the case of E_ele, the ratio is about 1.3, and it is better to vectorize the inner loop without any “conditional” statements. The ratios for the excluded-volume terms (E_exv) are greater than 3.5, and it is better to optimize by minimizing the amount of calculation with conditional statements. The ratio for E_LJ or E_bp is less than 3, but we use conditional statements because the form of the energy expression itself changes in the HPS, KH, or 3SPN.2C models.
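
The ratio used as a criterion here is simple to evaluate. The sketch below computes it for the electrostatic term from the default cutoff (52Å) and pair-list distance (57Å) given in the text. The function name is illustrative, and reading the ratio as "listed pairs per interacting pair" assumes a roughly uniform particle density.

```python
def pairlist_volume_ratio(r_cut, r_pair):
    """Ratio r_P^3 : r_C^3, i.e. listed pairs per pair that is inside the cutoff."""
    return (r_pair / r_cut) ** 3


ratio_ele = pairlist_volume_ratio(r_cut=52.0, r_pair=57.0)
print(f"E_ele: r_P^3 / r_C^3 = {ratio_ele:.2f}")
```

A ratio near 1 means that almost every listed pair is inside the cutoff, so skipping pairs with a conditional saves little and a branch-free vectorized loop is preferable. A large ratio means that most listed pairs can be skipped.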

Another computational bottleneck in CG MD simulations is the generation of random numbers with Gaussian distributions when using the Langevin thermostat. This is mainly because of the replicated-data MPI parallelization scheme used in GENESIS atdyn, in which all processes have a copy of the coordinate data. While keeping the replicated-data scheme, we distribute the random number generation over the MPI processes and apply “MPI_Allgatherv” to collect the results. We do not apply MPI parallelization to the integration of coordinates or momenta because the communication time is comparable to or even longer than the integration itself. Instead, OpenMP parallelization is applied to the integration.
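
The data layout behind this scheme can be sketched without MPI. In the hedged example below, each "rank" generates only its own block of Gaussian numbers, and the blocks are placed into a full array using the counts and displacements that a call to MPI_Allgatherv would need. The serial loop stands in for the collective call, and the per-rank seeding is a simplification chosen for illustration, not the generator used in GENESIS.

```python
import random


def allgatherv_layout(n_values, n_ranks):
    """Block sizes and offsets for distributing n_values over n_ranks."""
    base, extra = divmod(n_values, n_ranks)
    counts = [base + (1 if rank < extra else 0) for rank in range(n_ranks)]
    displs = [sum(counts[:rank]) for rank in range(n_ranks)]
    return counts, displs


def gaussian_block(rank, count, seed=2021):
    """Standard normal numbers generated by one rank from its own stream."""
    rng = random.Random(seed * 1000003 + rank)   # hypothetical per-rank seeding
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


def emulate_allgatherv(n_values, n_ranks):
    """Serial stand-in: every rank fills its block, then all blocks are combined."""
    counts, displs = allgatherv_layout(n_values, n_ranks)
    full = [0.0] * n_values
    for rank in range(n_ranks):
        block = gaussian_block(rank, counts[rank])
        full[displs[rank]:displs[rank] + counts[rank]] = block
    return full


counts, displs = allgatherv_layout(10, 4)
noise = emulate_allgatherv(3 * 1000, 8)   # 3 components for 1000 particles
```

In an actual MPI program each rank would execute only its own `gaussian_block` call, so the cost of random number generation per process decreases with the number of processes, while every process still ends up with the complete array.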

Results

Benchmark of MD simulations with residue-level CG models

We first examine the memory usage and the performance benchmark in MD simulations with residue-level CG models. The memory usage was examined with a single process and a single thread on a machine with 93GiB RAM (random access memory). The CPU benchmarks were executed with 5 OpenMP threads and various numbers of MPI processes on a computer server with Intel Xeon Gold-6148 (2.4GHz) CPUs (20 cores per CPU and two CPUs per node) and InfiniBand EDR networking. GENESIS was compiled with the Intel compiler in combination with Intel MPI version 2019.5.281. Fig 4 shows the results of the memory benchmark (Fig 4A) and the CPU benchmarks using different biological systems (Fig 4B, 4C and 4D).

Fig 4. Benchmark of the GENESIS CG simulations.

(A) Memory benchmark of the RNA Polymerase II (PolII) systems. A single PolII has 3,695 CG particles and is duplicated into an n×n×n (n = 1,…,10) grid. The insets show the structure of a single PolII (left) and 1000 PolIIs (right). Protein is shown in blue, DNA in red, and RNA in yellow. (B) CPU benchmark of a system containing 120 DPS proteins. The total number of CG particles is 222,360. (C) CPU benchmark for a system of 5000 chains of 100aa IDPs. The total number of CG particles is 500,000. (D) CPU benchmark for a system including 512 nucleosomes. The total number of CG particles is 1,044,480. The insets of (B), (C), and (D) show the initial structure of each system, respectively. N represents the total number of CG particles in each system. For all the CPU benchmarks, we used 5 OpenMP threads.

https://doi.org/10.1371/journal.pcbi.1009578.g004

To test the memory consumption, we built systems consisting of different numbers of RNA Polymerase II (PolII). The smallest system has only one PolII and contains 3695 CG particles (3542 from protein, 124 from DNA, and 29 from RNA), based on the X-ray crystallography structure (PDB ID: 1R9T) [71]. We then duplicated the single PolII to create systems of different sizes, ranging from thousands to millions of particles. In all these tested systems, we used V_AICG2+ for proteins, V_3SPN.2C for DNAs, and V_RNA for RNAs. For proteins, we used integer charges distributed on the charged residues (+e for Arg and Lys, −e for Asp and Glu). For the interactions between different biomolecule types, we applied the excluded-volume term (E_exv) and E_ele as general nonspecific interactions, and E_Gō for native contacts between protein-RNA, protein-DNA, and RNA-DNA to maintain the whole structure. All the cutoff and pair-list distances were set to the default values (see Table 2). As shown in Fig 4A, we carried out simulations with n³ (n = 1,…,10) duplicated PolIIs to track the memory usage. The consumed memory scales with the system size, from ~40 MB for a single PolII to ~22 GB for 1000 PolIIs. Notably, on an ordinary computer with 10GB memory, we can run GENESIS CG MD simulations for a system with about 2 million particles.

To examine the strong scaling of GENESIS CG simulations, we prepare three systems comprised of different types of biomolecules:

1) 120 copies of the DPS protein [72]. In total, 222,360 CG particles modeled with V_AICG2+;
2) 5000 chains of IDPs, each consisting of 100 amino acids. In total, 500,000 CG particles modeled with V_HPS;
3) 512 copies of nucleosomes [73]. In total, 1,044,480 particles, using V_AICG2+ for proteins, V_3SPN.2C for DNAs, and E_ele for protein-DNA interactions.

The excluded-volume term (E_exv) and E_ele are used as general non-specific interactions for systems 1) and 3), and E_HB is additionally applied between protein and DNA in system 3). For all the cutoff and pair-list distances, we use the default values listed in Table 2. The benchmark results for these three systems are shown in Fig 4B, 4C and 4D, respectively. The results suggest that the parallel performance scales at least up to 64 MPI processes (8 nodes) in all three systems. Although the scalability is not perfect, it is useful for simulating large biological systems. We discuss possible reasons for the insufficient scalability of our implementation and future plans to improve the performance in the Discussion section.

Application 1: protein target search on DNA

The AICG2+ protein model (V_AICG2+), and the 3SPN.2C DNA model (V_3SPN.2C), in combination with the sequence-specific (E_PWMcos) or nonspecific (E_HB) protein-DNA interactions, can be used to study the binding, diffusion, and conformational changes of protein-DNA complexes [29,30,61,62]. As a simple example, we performed MD simulations of the sex-determining region Y protein (sry) [74] and its target DNA. The modeling and simulation results are shown in Fig 5.

Fig 5. CG MD simulations of the DNA target search process by protein sry.

(A) Solution NMR structure of sry binding on DNA (PDB entry: 1J46). (B) Coarse-grained sry with the Cα particles colored by the partial charges (determined using the RESPAC method). DNA structure from the PDB is also shown as lines for reference. (C) Sequence logo of sry’s DNA target. (D) Sequence of DNA used in the CG simulations. The consensus sequence “TAAACAAT” is inserted at the center of a 50bp poly-CG DNA. (E) Time series of sry’s binding position on DNA (bsry). The green region represents the consensus sequence. (F) Time series of the bending angle of DNA (θ). Two representative structures of sry binding at the poly-CG region or the consensus region are shown on top of (E). In these structures, DNA is colored in gray, except for the consensus sequence region in green. Sry is shown in red.

https://doi.org/10.1371/journal.pcbi.1009578.g005

Sry uses a conserved High-Mobility Group (HMG) domain to recognize its consensus DNA sequence, “TAAACAAT” [75]. The HMG domain intercalates into the DNA minor groove and forms contacts with the bases. This type of minor groove binding of sry results in a sharp bending of the DNA [76], which plays a crucial role in regulating the transcription of its target gene [77]. Starting from the solution NMR structure of the sry-DNA complex (PDB entry 1J46 [76], see Fig 5A), we generated the CG model of sry using the GENESIS-CG-tool. We used the AICG2+ model (V_AICG2+) for the whole protein. Specifically, we applied Gō-type native contact interactions (E_Gō) to the folded HMG domain (residue 9–70) and assigned only the local flexible potentials to the N-terminal (residue 1–8) and C-terminal tails (residue 71–85). We also employed the RESPAC method [60] to calculate the partial charge distribution on the surface residues of sry (see Figs 5B and S2). As for the DNA, we designed a 50bp poly-CG sequence, with the 8-bp piece “TAAACAAT” inserted at the center (Fig 5C and 5D). The atomistic structure of the dsDNA was then constructed from this 50-bp sequence using the 3DNA package [65]. The 3SPN.2C CG topology and coordinate files were then generated from the atomistic structure using the GENESIS-CG-tool. For sry to find its consensus sequence in the simulations, we utilized the PWMcos model (E_PWMcos) to incorporate information from the position frequency matrix (PFM, downloaded from JASPAR database [78] profile MA0084.1 [75]). For details of the model parameters, please see S1 Text.

After preparing the input files, we carried out MD simulations of the sry-DNA complex for 10⁷ steps. In the initial structure, sry was placed around 70Å away from the DNA. Soon after the simulation started, sry bound to the DNA and began sliding. We monitored the binding site of sry on DNA (Fig 5E) and the bending of DNA (Fig 5F). As can be seen, sry preferentially bound to the consensus sequence (represented by the green region in Fig 5E) and occasionally left this target, diffusing quickly along the DNA. As expected, we observed coupling between the sequence-specific recognition and the DNA bending caused by the intercalation of the HMG domain. These results show that the CG models (AICG2+, 3SPN.2C, and PWMcos) for protein-DNA interactions have been correctly implemented in GENESIS and can be used to study similar biological systems.

Application 2: phase behaviors of IDR and RNA

The HPS and KH IDR models have been used to study the phase behavior of IDRs [35]. Here, we used the IDR from the protein Fused in Sarcoma (FUS) [79] to show that our implementation of these models in GENESIS can simulate the condensation of proteins. We first simulated a single chain of the FUS IDR (163aa, see S1 Text for the sequence). The final structure of the single-chain simulation was then duplicated to create a 120-chain system. We carefully arranged the multiple chains so that the system had a box size of 18nm×18nm×200nm, as required for the slab sampling method. We performed a 10⁷-step simulation, as shown in Fig 6A and 6B. During the simulation, we monitored the density change of the CG particles along the z-axis (the longest dimension). As shown in the top panel of Fig 6A, at the beginning of the simulation, the FUS IDRs formed relatively small and low-density clusters. As the simulation time increased, we observed several merging events of the condensates. After ~3.5×10⁶ steps, all the FUS IDRs had joined the same cluster, and the density finally reached around 5 particles per nm³ (Fig 6A and 6B). This simple example shows the possibility of using the HPS model in GENESIS to explore the physical mechanisms of IDR condensation. Besides, we noticed that there are some recent updates of the parameters of the HPS model [36,80]. These parameter changes can be easily used with GENESIS by modifying the parameter files, with no need to touch the code. In this sense, GENESIS can also serve as a handy framework for users to re-calibrate their own sets of parameters.
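
The density along the long axis of a slab simulation can be obtained by histogramming the z coordinates of the CG particles and dividing by the volume of each slab. The sketch below is a generic illustration with toy data. It is not the analysis script used for Fig 6A, and the function name and arguments are assumptions.

```python
import math


def density_profile_z(z_coords, area_xy, z_min, z_max, n_bins):
    """Number density (particles per unit volume) in equal-width slabs along z."""
    width = (z_max - z_min) / n_bins
    counts = [0] * n_bins
    for z in z_coords:
        k = math.floor((z - z_min) / width)
        if 0 <= k < n_bins:          # ignore particles outside [z_min, z_max)
            counts[k] += 1
    slab_volume = area_xy * width
    return [c / slab_volume for c in counts]


# Toy data (not from the simulations): four particles in a box with a
# cross-section of 2 nm^2 and a length of 10 nm, divided into 1 nm slabs.
profile = density_profile_z([0.5, 1.5, 1.6, 9.9], area_xy=2.0,
                            z_min=0.0, z_max=10.0, n_bins=10)
print(profile)
```

For the FUS system described above, `area_xy` would be 18nm×18nm and the profile would be evaluated for every saved frame to build a time series like the one in Fig 6A.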

Fig 6. CG MD simulations of the condensation of IDRs and RNAs.

(A) and (B) show the slab simulation results of 120 FUS IDRs. (A) Time series of FUS density along the long axis (z). The intensity of the blue color represents the density, as shown in the color bar above. (B) The first and last structure of the MD trajectory shown in (A). The FUS IDRs are shown as tubes. Different colors from green to blue are used to distinguish chains. (C), (D), and (E) show the simulation of the mixture of LAF1 IDRs and 15nt poly-A (A15) RNAs. (C) The last structure of a 5×10⁶ steps slab simulation of 300 LAF1 IDRs and 100 A15 RNAs. LAF1 IDRs are in green or blue, whereas RNAs are in red or purple color. (D) The last structure of a 2×10⁶ steps droplet simulation of 750 LAF1 IDRs and 250 A15 RNAs. The color scheme is the same as (C). (E) Radial distribution function of protein and RNA CG particles as a function of the distance from the center of mass of the LAF1 droplet (r). The structure of the condensation is shown as transparent background for reference.

https://doi.org/10.1371/journal.pcbi.1009578.g006

In conjunction with the HPS model for protein IDR, the HPS model for RNA has also been recently developed [37]. The HPS RNA model has a different resolution from the other CG models described above, with one CG particle per nucleotide. Nevertheless, we also implemented this model in GENESIS, preparing for possible applications in the study of biophysical condensations involving both protein and RNA. Here we built and simulated two systems of IDR from the protein LAF1 and 15-nt (nucleotide) poly-A RNA (hereafter called A15). The two systems consisted of 300 LAF1 together with 100 A15 (57,300 particles in total) and 750 LAF1 together with 250 A15 (143,250 particles), respectively. The slab method was used to simulate the first system for 5×10⁶ steps, with the simulation box of 20nm×20nm×200nm (Fig 6C). The second system was simulated in a cubic box of size (100nm)³ for 2×10⁶ steps (Fig 6D). As can be seen in Fig 6C and 6D, in both situations, the A15 RNA entered the condensation of LAF1. We also quantitatively analyzed the co-condensation of LAF1 and RNA by plotting the radial distribution function (RDF) of protein and RNA particles as a function of the distance from the center of the LAF1 droplet (Fig 6E). There is a higher density of RNA inside the LAF1 condensation than in bulk (Fig 6E). These results are consistent with the previous simulation results [37].
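
A radial profile of this kind can be computed by binning particle distances from the droplet center into spherical shells and normalizing by the shell volumes. The following sketch shows the idea with toy coordinates. It ignores periodic boundaries, and the function names are assumptions rather than the analysis code used for Fig 6E.

```python
import math


def center_of_mass(points, masses):
    """Mass-weighted center of a set of 3D points."""
    total = sum(masses)
    return tuple(sum(m * p[d] for p, m in zip(points, masses)) / total
                 for d in range(3))


def radial_density(points, center, r_max, n_bins):
    """Number density in concentric spherical shells around `center`."""
    width = r_max / n_bins
    counts = [0] * n_bins
    for p in points:
        k = int(math.dist(p, center) / width)
        if k < n_bins:               # ignore particles beyond r_max
            counts[k] += 1
    density = []
    for k, c in enumerate(counts):
        shell = 4.0 / 3.0 * math.pi * (((k + 1) * width) ** 3 - (k * width) ** 3)
        density.append(c / shell)
    return density


# Toy data (not from the simulations)
com = center_of_mass([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 3.0])
rho = radial_density([(0.5, 0.0, 0.0), (0.0, 1.5, 0.0)],
                     center=(0.0, 0.0, 0.0), r_max=2.0, n_bins=2)
```

Evaluating `radial_density` separately for the protein and RNA particles, with the center taken from the LAF1 particles, gives two curves that can be compared inside and outside the droplet.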

Application 3: virus capsid

We also explored the ability of our GENESIS CG implementation to simulate large-scale biological systems, such as a virus capsid. Here we report our modeling and simulation of the herpes simplex virus (HSV) capsid, whose high-resolution structure was recently solved by cryo-electron microscopy (cryo-EM) (PDB entry: 5ZAP) [81]. The capsid comprises about 4000 proteins assembled into 12 pentons (pentameric blocks), 150 hexons (hexameric blocks), and 320 triplexes gluing all the subunits together [81]. For such a vast structure, the number of atoms and the length scale exceed the upper limits of the PDB format. Therefore, the structural information of the HSV capsid is recorded in a file in the mmCIF format. We used the GENESIS-CG-tool to parse this file and generate the CG structure, which consists of 1,687,980 CG particles in total (Fig 7A). The AICG2+ potentials were used to model the whole structure. After generating the necessary input files, we simulated the structure for 2×10⁶ steps. We tracked the radius-of-gyration (R_g), the root-mean-square-deviation (RMSD), and the native-ness (Q, see S1 Text for definition) of the whole structure (shown in Fig 7B). As can be seen, the structure was stably maintained during our simulation. Although we only show a short equilibrium simulation of the HSV capsid here, to our knowledge, it is one of the largest biological systems simulated with residue-level CG models. We expect to run CG simulations of similar systems using GENESIS to probe the conformational changes and dynamic assembly or disassembly of the capsid.

Fig 7. Simulation of the herpes simplex virus (HSV) type 2 B-capsid.

(A) Structure of the coarse-grained HSV capsid, based on PDB entry 5ZAP. The pentameric blocks (pentons) are colored in blue, the hexameric blocks (hexons) are colored in green (VP5) and cyan (VP26). All the other proteins that glue together the pentons and hexons are colored in white. (B) Time series of the radius of gyration (Rg), RMSD, and nativeness (Q) of the whole structure during a 2×10⁶-step simulation.

https://doi.org/10.1371/journal.pcbi.1009578.g007

Application 4: chromatin

Eukaryotic chromatin is the architecture that stores genetic information by packaging DNA into supercoiled structures around the histone octamer proteins. In addition to the function of genome organization, chromatin also hosts many biochemical reactions around the information flow between DNA and RNA. Due to its extraordinary importance, chromatin and related biological phenomena have been widely studied by MD simulations at different length scales and models at different resolutions [4,82–84]. In particular, residue-level CG models, including those we implemented here (V_AICG2+, V_3SPN.2C, E_PWMcos, and E_HB), have shown their abilities to decipher the dynamics of the nucleosome, which is the building block and basic unit of chromatin [30,31,61,62,85]. Here we tried to expand these residue-level CG MD simulations to a larger length scale. Instead of using a natural genomic sequence and constructing a Hi-C experiment-based structure, we chose poly-CG as the DNA sequence and built an artificial chromatin structure by connecting single nucleosome structures (based on PDB 1KX5) [73] with linear linker DNAs. The constructed structure contains 1024 nucleosomes linked by a 219,213-bp dsDNA (see Fig 8A and 8B). More details of the structure modeling can be found in S1 Text.

Fig 8. MD simulation of artificial chromatin composed of 1024 nucleosomes.

(A) Structure of the whole artificial chromatin. DNA is colored in red, and histones are colored in cyan (for the folded domains) and green (tails). (B) A zoom-in structure of two adjacent nucleosomes as a part of the artificial chromatin. (C) Time series of the radius-of-gyration (Rg) of the N-terminal tails of the histones (top) and the number of wrapped DNA base-pairs on histone (bottom). The solid lines represent the average values, and the shadowed regions show the standard errors.

https://doi.org/10.1371/journal.pcbi.1009578.g008

Similar to previous studies [30,61], we used V_AICG2+ for histones, V_3SPN.2C for DNA, and E_HB for hydrogen bonds between histone and DNA phosphates. In addition, we applied V_HPS to the flexible histone tails (see S1 Text for definition). We then performed 5×10⁵-step MD simulations of this system at a temperature of 300K and 150mM ionic concentration. The time series of R_g of the four types of N-terminal histone tails and the number of wrapped DNA base-pairs are shown in Fig 8C. These results were based on the statistics over the 1024 nucleosomes. This example illustrates the possibility of using our GENESIS CG implementation to carry out studies on the large-scale chromatin dynamics with accurate modeling of DNA and both well-folded and intrinsically disordered parts of proteins.

Discussion

In this paper, we have implemented residue-level CG models of biomolecules in the GENESIS MD software and have applied them in CG MD simulations of various biomolecules by combining different CG models. The combination of the AICG2+ model for proteins with the 3SPN.2C model for DNA and the PWMcos potential for protein-DNA sequence-specific interactions has been applied in previous studies [62]. In contrast, the combination of the HPS model for IDRs and AICG2+ for folded proteins is tested for the first time in this new framework. We also noticed several recently published CG models at a resolution equivalent or similar to ours, such as the TIS RNA model [23] and the iSoLF lipid model [86]. We will consider incorporating these models into GENESIS to cover a broader range of target biomolecules. In the applications presented here, we used the default interaction parameters in GENESIS. For more practical applications, such as genome-scale chromatin folding or phase-separated membrane-less condensations, one may need to optimize more carefully the interaction terms between the IDR and the other components and their parameters. If necessary, the parameters should be calibrated carefully with experimental information as references. The GENESIS CG framework provides a convenient environment for these attempts.

Similar to the current work, other groups have also released combinatorial implementations of CG methods, such as the OpenAWSEM-Open3SPN2 within the OpenMM framework [43], the 3SPN.2C packages in the LAMMPS framework [38,39], and the CafeMol package [44]. In particular, the OpenMM package utilizes GPU calculations to gain better performance [43]. Compared with that implementation, our development aims to cover more models and provide a more flexible and feasible environment for the studies of larger-scale biomolecular systems. CafeMol has been successfully applied for CG simulations in various biomolecular studies [26,29,30,62,87,88]. However, it is not designed for the high-performance computing of larger-scale systems. We have optimized our code in GENESIS to enable high-performance simulations of systems consisting of millions of particles on regular computers (Figs 4, 7 and 8). There is an upper limit on the particle number with the atomic decomposition scheme in atdyn, which essentially limits the applicability of the current code to huge biological systems. A domain-decomposition strategy and an efficient load balancer suitable for residue-level CG models are required to solve this problem. These two will be implemented in a new MD engine in GENESIS. However, it is worth noting that there are technical difficulties in developing the domain-decomposition scheme and load balancer for CG simulations. The most widely appreciated problem is the inhomogeneous particle distribution in residue-level CG systems due to the implicit treatment of the solvent. In a conventional uniform spatial decomposition, the non-uniform particle density causes severe load imbalance among different domains. Therefore, special decomposition algorithms should be considered [89]. Another direction is to consider explicit solvent models [13,90], which intrinsically solve the load-balance problem but introduce significantly more computations on the solvent particles. Besides, the large cutoff value of the non-bonded interactions also sets a lower limit on the target system size. We assume at least two domains in each dimension to apply domain decomposition. Each domain should have a size equal to or larger than the longest cutoff (in our implementation, 52Å for the electrostatic term). This requirement essentially prevents the application of the program to relatively small systems. In developing the new CG MD engine in GENESIS, we will have to solve these problems.

Although the CG models have boosted the computational efficiency to be orders of magnitude faster than the all-atom force fields [22], the speed-up still looks insufficient when studying systems at the organellar or cellular scales. It is necessary to employ enhanced sampling methods, many of which [91–94] have also been implemented in GENESIS. Taking advantage of this, we will combine CG MD simulations with some of the enhanced sampling methods to obtain more statistically reliable data within reasonable computational times and to enable free-energy analysis at the organellar or cellular scales.

In parallel to the advances in computational biophysics, mesoscopic pictures of biomolecules in the organellar or cellular environments have been better described using the latest experimental techniques, such as cryo-electron microscopy (cryo-EM) [95], cryo-electron tomography (cryo-ET) [96], high-speed atomic force microscopy (AFM) [97], and in-cell NMR [98]. Integrating structural information from these experiments with computational biomolecular dynamics is one of the most important tasks in this research field. Recently, there have been several attempts [99–101] to make more realistic structural models of sub-cellular systems by utilizing available experimental results as much as possible. We hope that the implementation of residue-level CG models can provide more realistic structural dynamics of heterogeneous biological systems together with recent experimental information and modeling strategies. For this purpose, multi-scale and multi-resolution simulations are helpful in the trade-off between modeling accuracy and computational efficiency, if we can easily switch the resolution of target biological systems from these residue-level CG models to other CG models with different approximations (MARTINI [13], SPICA [102], PACE [103], etc.) or all-atom force field models (AMBER [104], CHARMM [105], GROMOS [106], etc.). In the GENESIS software, most of the above models are already available, and some of them are being implemented by us. Mapping from the atomistic structure to CG is deterministic and easy, although information such as amino-acid backbone dihedral angles (φ and ψ) and hydrogen bonds is largely lost. The other direction, namely, reconstructing the atomistic structure from CG coordinates, is usually much more difficult [107]. In our current implementation, the structure-based models, such as AICG2+ [26], can uphold the secondary, tertiary, or even higher-order structures in the native structure. Additionally, by using the angle-dependent non-local potentials, the 3SPN.2C [28] and the PWMcos [32] models can partly preserve the base-stacking or hydrogen-bonding interactions in the DNA and at the protein-DNA interface, respectively. Structures sampled with these CG models can be remapped back to reasonable atomistic structures, provided the reconstruction methods [107,108] are good enough. In the near future, we may be able to investigate the inside of the cell at various resolutions, just as Google Earth can visualize both countries and local towns.

In summary, we have implemented several residue-level CG models in the GENESIS MD software, including the AICG2+ model for protein, the 3SPN.2C model for DNA, the PWMcos model for protein-DNA interactions, and the HPS/KH model for IDR and RNA. The MD input data for these CG models are given in a GROMACS-style file format by an easy-to-use toolbox, GENESIS-CG-tool. The computational performance of CG MD simulations is optimized for regular computers by preparing nonbonded interaction lists suitable for the CG potential functions, vectorizing do-loop structures in the time-consuming subroutines, parallelizing the generation of random numbers in Langevin dynamics, and so on. With these efforts, GENESIS CG MD simulations can be applied efficiently to large biological systems containing millions of CG particles on regular computers. The models and programs implemented here would be valuable for investigating heterogeneous biological phenomena involving folded/disordered proteins, RNAs, and DNAs at high concentrations. Implementations for larger systems containing billions of CG particles and the use of enhanced sampling in such systems will follow the current implementation in the GENESIS software.

Availability and future directions

The source code and user manual of GENESIS v1.7.1 are available at the https://www.r-ccs.riken.jp/labs/cbrt/ website, and GENESIS-CG-tool is both included as a part of GENESIS v1.7.1 and hosted at https://github.com/noinil/genesis_cg_tool. Tutorials on performing CG simulations with GENESIS v1.7.1 are available at https://www.r-ccs.riken.jp/labs/cbrt/tutorials2019/. All the MD simulation files and data needed to reproduce the results are available from https://github.com/RikenSugitaLab/cg-development-GENESIS-1.7.0. We plan to use GENESIS v1.7.1 to study large-scale biological phenomena such as LLPS of proteins and nucleic acids. We also plan to implement more CG models of biomolecules in GENESIS.

Supporting information

S1 Text. Supplementary methods of molecular dynamics simulations.

https://doi.org/10.1371/journal.pcbi.1009578.s001

(DOCX)

S1 Fig. Examples of using GENESIS-CG-tool to prepare CG topology and coordinate files.

(A) Generate CG files (PRO1_cg.gro for coordinates; PRO1_cg.top and PRO1_cg.itp for topology) for a protein, based on its atomistic PDB file (PRO1.pdb). (B) Generate CG files (DNA1_cg.gro for coordinates; DNA1_cg.top and DNA1_cg.itp for topology) as well as an atomistic coordinate file (DNA1.pdb) for a 20bp DNA, from its DNA sequence (DNA1.fasta).

https://doi.org/10.1371/journal.pcbi.1009578.s002

(TIF)

S2 Fig. Charge distribution on the surface residues of protein sry.

The charges of the surface residues of the HMG domain (residues 9–70, dark blue) of sry were determined with the RESPAC method, whereas the charges on the N-tail and C-tail were the original integer charges (green). The same data are used to plot the CG structure of sry (Fig 5B).

https://doi.org/10.1371/journal.pcbi.1009578.s003

(TIF)

Acknowledgments

The authors thank Ai Shinobu and Giovanni B Brandani for testing the developments of CG models in GENESIS v1.7.0.

References

1. Jerković I, Cavalli G. Understanding 3D genome organization by multidisciplinary methods. Nat Rev Mol Cell Biol. 2021;22: 511–528. pmid:33953379
2. Yoshizawa T, Nozawa R-S, Jia TZ, Saio T, Mori E. Biological phase separation: cell biology meets biophysics. Biophys Rev. 2020;12: 519–539. pmid:32189162
3. Di Pierro M, Zhang B, Aiden EL, Wolynes PG, Onuchic JN. Transferable model for chromosome architecture. Proc Natl Acad Sci. 2016;113: 12168–12173. pmid:27688758
4. Nuebler J, Fudenberg G, Imakaev M, Abdennur N, Mirny LA. Chromatin organization by an interplay of loop extrusion and compartmental segregation. Proc Natl Acad Sci. 2018;115: E6697–E6706. pmid:29967174
5. Borgia A, Borgia MB, Bugge K, Kissling VM, Heidarsson PO, Fernandes CB, et al. Extreme disorder in an ultrahigh-affinity protein complex. Nature. 2018;555: 61–66. pmid:29466338
6. Martin EW, Holehouse AS, Peran I, Farag M, Incicco JJ, Bremer A, et al. Valence and patterning of aromatic residues determine the phase behavior of prion-like domains. Science. 2020;367: 694–699. pmid:32029630
7. Conicella AE, Dignon GL, Zerze GH, Schmidt HB, D’Ordine AM, Kim YC, et al. TDP-43 α-helical structure tunes liquid–liquid phase separation and function. Proc Natl Acad Sci. 2020;117: 5883–5894. pmid:32132204
8. Saunders MG, Voth GA. Coarse-Graining Methods for Computational Biology. Annu Rev Biophys. 2013;42: 73–93. pmid:23451897
9. Kmiecik S, Gront D, Kolinski M, Wieteska L, Dawid AE, Kolinski A. Coarse-Grained Protein Models and Their Applications. Chem Rev. 2016;116: 7898–7936. pmid:27333362
10. Gopal SM, Mukherjee S, Cheng Y-M, Feig M. PRIMO/PRIMONA: A coarse-grained model for proteins and nucleic acids that preserves near-atomistic accuracy. Proteins Struct Funct Bioinforma. 2010;78: 1266–1281. pmid:19967787
11. Sterpone F, Melchionna S, Tuffery P, Pasquali S, Mousseau N, Cragnolini T, et al. The OPEP protein model: from single molecules, amyloid formation, crowding and hydrodynamics to DNA/RNA systems. Chem Soc Rev. 2014;43: 4871–4893. pmid:24759934
12. Davtyan A, Schafer NP, Zheng W, Clementi C, Wolynes PG, Papoian GA. AWSEM-MD: Protein Structure Prediction Using Coarse-Grained Physical Potentials and Bioinformatically Based Local Structure Biasing. J Phys Chem B. 2012;116: 8494–8503. pmid:22545654
13. Marrink SJ, Risselada HJ, Yefimov S, Tieleman DP, de Vries AH. The MARTINI Force Field: Coarse Grained Model for Biomolecular Simulations. J Phys Chem B. 2007;111: 7812–7824. pmid:17569554
14. Clementi C, Nymeyer H, Onuchic JN. Topological and energetic factors: what determines the structural details of the transition state ensemble and “en-route” intermediates for protein folding? an investigation for small globular proteins. J Mol Biol. 2000;298: 937–953. pmid:10801360
15. Go N. Theoretical Studies of Protein Folding. Annu Rev Biophys Bioeng. 1983;12: 183–210. pmid:6347038
16. Takada S. Gō model revisited. Biophys Physicobiology. 2019;16: 248–255. pmid:31984178
17. Arnarez C, Uusitalo JJ, Masman MF, Ingólfsson HI, de Jong DH, Melo MN, et al. Dry Martini, a Coarse-Grained Force Field for Lipid Membrane Simulations with Implicit Solvent. J Chem Theory Comput. 2015;11: 260–275. pmid:26574224
18. Ermak DL, McCammon JA. Brownian dynamics with hydrodynamic interactions. J Chem Phys. 1978;69: 1352–1360.
19. Ando T, Skolnick J. Sliding of Proteins Non-specifically Bound to DNA: Brownian Dynamics Studies with Coarse-Grained Protein and DNA Models. Clore GM, editor. PLoS Comput Biol. 2014;10: e1003990. pmid:25504215
20. Sterpone F, Derreumaux P, Melchionna S. Protein Simulations in Fluids: Coupling the OPEP Coarse-Grained Force Field with Hydrodynamics. J Chem Theory Comput. 2015;11: 1843–1853. pmid:26574390
21. Brandner AF, Timr S, Melchionna S, Derreumaux P, Baaden M, Sterpone F. Modelling lipid systems in fluid with Lattice Boltzmann Molecular Dynamics simulations and hydrodynamics. Sci Rep. 2019;9: 16450. pmid:31712588
22. Takada S, Kanada R, Tan C, Terakawa T, Li W, Kenzaki H. Modeling Structural Dynamics of Biomolecular Complexes by Coarse-Grained Molecular Simulations. Acc Chem Res. 2015;48: 3026–3035. pmid:26575522
23. Nguyen HT, Hori N, Thirumalai D. Theory and simulations for RNA folding in mixtures of monovalent and divalent cations. Proc Natl Acad Sci. 2019;116: 21022–21030. pmid:31570624
24. Hinckley DM, Lequieu JP, de Pablo JJ. Coarse-grained modeling of DNA oligomer hybridization: Length, sequence, and salt effects. J Chem Phys. 2014;141: 035102. pmid:25053341
25. Kubo S, Niina T, Takada S. Molecular dynamics simulation of proton-transfer coupled rotations in ATP synthase FO motor. Sci Rep. 2020;10: 8225. pmid:32427921
26. Li W, Wang W, Takada S. Energy landscape views for interplays among folding, binding, and allostery of calmodulin domains. Proc Natl Acad Sci. 2014;111: 10550–10555. pmid:25002491
27. Hori N, Takada S. Coarse-Grained Structure-Based Model for RNA-Protein Complexes Developed by Fluctuation Matching. J Chem Theory Comput. 2012;8: 3384–3394. pmid:26605744
28. Freeman GS, Hinckley DM, Lequieu JP, Whitmer JK, de Pablo JJ. Coarse-grained modeling of DNA curvature. J Chem Phys. 2014;141: 165103. pmid:25362344
29. Tan C, Terakawa T, Takada S. Dynamic Coupling among Protein Binding, Sliding, and DNA Bending Revealed by Molecular Dynamics. J Am Chem Soc. 2016;138: 8512–8522. pmid:27309278
30. Brandani GB, Niina T, Tan C, Takada S. DNA sliding in nucleosomes via twist defect propagation revealed by molecular simulations. Nucleic Acids Res. 2018;46: 2788–2801. pmid:29506273
31. Lequieu J, Schwartz DC, de Pablo JJ. In silico evidence for sequence-dependent nucleosome sliding. Proc Natl Acad Sci. 2017;114: E9197–E9205. pmid:29078285
32. Tan C, Takada S. Dynamic and Structural Modeling of the Specificity in Protein–DNA Interactions Guided by Binding Assay and Structure Data. J Chem Theory Comput. 2018;14: 3877–3889. pmid:29806939
33. Kapcha LH, Rossky PJ. A Simple Atomic-Level Hydrophobicity Scale Reveals Protein Interfacial Structure. J Mol Biol. 2014;426: 484–498. pmid:24120937
34. Kim YC, Hummer G. Coarse-grained Models for Simulations of Multiprotein Complexes: Application to Ubiquitin Binding. J Mol Biol. 2008;375: 1416–1433. pmid:18083189
35. Dignon GL, Zheng W, Kim YC, Best RB, Mittal J. Sequence determinants of protein phase behavior from a coarse-grained model. Ofran Y, editor. PLOS Comput Biol. 2018;14: e1005941. pmid:29364893
36. Dannenhoffer-Lafage T, Best RB. A Data-Driven Hydrophobicity Scale for Predicting Liquid–Liquid Phase Separation of Proteins. J Phys Chem B. 2021;125: 4046–4056. pmid:33876938
37. Regy RM, Dignon GL, Zheng W, Kim YC, Mittal J. Sequence dependent phase separation of protein-polynucleotide mixtures elucidated using molecular simulations. Nucleic Acids Res. 2020;48: 12593–12603. pmid:33264400
38. Plimpton S. Fast Parallel Algorithms for Short-Range Molecular Dynamics. J Comput Phys. 1995;117: 1–19.
39. Hinckley DM, Freeman GS, Whitmer JK, de Pablo JJ. An experimentally-informed coarse-grained 3-site-per-nucleotide model of DNA: Structure, thermodynamics, and dynamics of hybridization. J Chem Phys. 2013;139: 144903. pmid:24116642
40. Noel JK, Levi M, Raghunathan M, Lammert H, Hayes RL, Onuchic JN, et al. SMOG 2: A Versatile Software Package for Generating Structure-Based Models. Prlic A, editor. PLOS Comput Biol. 2016;12: e1004794. pmid:26963394
41. Pronk S, Páll S, Schulz R, Larsson P, Bjelkmar P, Apostolov R, et al. GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit. Bioinformatics. 2013;29: 845–854. pmid:23407358
42. Phillips JC, Braun R, Wang W, Gumbart J, Tajkhorshid E, Villa E, et al. Scalable molecular dynamics with NAMD. J Comput Chem. 2005;26: 1781–1802. pmid:16222654
43. Lu W, Bueno C, Schafer NP, Moller J, Jin S, Chen X, et al. OpenAWSEM with Open3SPN2: A fast, flexible, and accessible framework for large-scale coarse-grained biomolecular simulations. Schneidman-Duhovny D, editor. PLOS Comput Biol. 2021;17: e1008308. pmid:33577557
44. Kenzaki H, Koga N, Hori N, Kanada R, Li W, Okazaki K, et al. CafeMol: A Coarse-Grained Biomolecular Simulator for Simulating Proteins at Work. J Chem Theory Comput. 2011;7: 1979–1989. pmid:26596457
45. Li P, Banjade S, Cheng H-C, Kim S, Chen B, Guo L, et al. Phase transitions in the assembly of multivalent signalling proteins. Nature. 2012;483: 336–340. pmid:22398450
46. Borcherds W, Bremer A, Borgia MB, Mittag T. How do intrinsically disordered protein regions encode a driving force for liquid–liquid phase separation? Curr Opin Struct Biol. 2021;67: 41–50. pmid:33069007
47. Roden C, Gladfelter AS. RNA contributions to the form and function of biomolecular condensates. Nat Rev Mol Cell Biol. 2021;22: 183–195. pmid:32632317
48. Turner AL, Watson M, Wilkins OG, Cato L, Travers A, Thomas JO, et al. Highly disordered histone H1−DNA model complexes and their condensates. Proc Natl Acad Sci. 2018;115: 11964–11969. pmid:30301810
49. Banani SF, Lee HO, Hyman AA, Rosen MK. Biomolecular condensates: organizers of cellular biochemistry. Nat Rev Mol Cell Biol. 2017;18: 285–298. pmid:28225081
50. Jung J, Mori T, Kobayashi C, Matsunaga Y, Yoda T, Feig M, et al. GENESIS: a hybrid-parallel and multi-scale molecular dynamics simulator with enhanced sampling algorithms for biomolecular and cellular simulations. Wiley Interdiscip Rev Comput Mol Sci. 2015;5: 310–323. pmid:26753008
51. Kobayashi C, Jung J, Matsunaga Y, Mori T, Ando T, Tamura K, et al. GENESIS 1.1: A hybrid-parallel molecular dynamics simulator with enhanced sampling algorithms on multiple computational platforms. J Comput Chem. 2017;38: 2193–2206. pmid:28718930
52. Bezanson J, Edelman A, Karpinski S, Shah VB. Julia: A Fresh Approach to Numerical Computing. SIAM Rev. 2017;59: 65–98.
53. Terakawa T, Takada S. Multiscale Ensemble Modeling of Intrinsically Disordered Proteins: p53 N-Terminal Domain. Biophys J. 2011;101: 1450–1458. pmid:21943426
54. Tan C, Jung J, Kobayashi C, Sugita Y. A singularity-free torsion angle potential for coarse-grained molecular dynamics simulations. J Chem Phys. 2020;153: 044110. pmid:32752657
55. Catenaccio A, Daruich Y, Magallanes C. Temperature dependence of the permittivity of water. Chem Phys Lett. 2003;367: 669–671.
56. Stogryn A. Equations for Calculating the Dielectric Constant of Saline Water. IEEE Trans Microw Theory Tech. 1971;19: 733–736.
57. Ashbaugh HS, Hatch HW. Natively Unfolded Protein Stability as a Coil-to-Globule Transition in Charge/Hydropathy Space. J Am Chem Soc. 2008;130: 9536–9542. pmid:18576630
58. Miyazawa S, Jernigan RL. Residue–Residue Potentials with a Favorable Contact Pair Term and an Unfavorable High Packing Density Term, for Simulation and Threading. J Mol Biol. 1996;256: 623–644. pmid:8604144
59. Chu J-W, Voth GA. Coarse-Grained Modeling of the Actin Filament Derived from Atomistic-Scale Simulations. Biophys J. 2006;90: 1572–1582. pmid:16361345
60. Terakawa T, Takada S. RESPAC: Method to Determine Partial Charges in Coarse-Grained Protein Model and Its Application to DNA-Binding Proteins. J Chem Theory Comput. 2014;10: 711–721. pmid:26580048
61. Niina T, Brandani GB, Tan C, Takada S. Sequence-dependent nucleosome sliding in rotation-coupled and uncoupled modes revealed by molecular simulations. Panchenko ARR, editor. PLOS Comput Biol. 2017;13: e1005880. pmid:29194442
62. Tan C, Takada S. Nucleosome allostery in pioneer transcription factor binding. Proc Natl Acad Sci. 2020;117: 20586–20596. pmid:32778600
63. Humphrey W, Dalke A, Schulten K. VMD: Visual molecular dynamics. J Mol Graph. 1996;14: 33–38. pmid:8744570
64. Schrödinger L. The PyMOL Molecular Graphics System, Version 1.8. 2015 Nov.
65. Lu X-J, Olson WK. 3DNA: a versatile, integrated software system for the analysis, rebuilding and visualization of three-dimensional nucleic-acid structures. Nat Protoc. 2008;3: 1213–1227. pmid:18600227
66. Adams PD, Afonine PV, Baskaran K, Berman HM, Berrisford J, Bricogne G, et al. Announcing mandatory submission of PDBx/mmCIF format files for crystallographic depositions to the Protein Data Bank (PDB). Acta Crystallogr Sect D Struct Biol. 2019;75: 451–454. pmid:30988261
67. Jung J, Mori T, Sugita Y. Midpoint cell method for hybrid (MPI+OpenMP) parallelization of molecular dynamics simulations. J Comput Chem. 2014;35: 1064–1072. pmid:24659253
68. Karanicolas J, Brooks CL. The origins of asymmetry in the folding transition states of protein L and protein G. Protein Sci. 2009;11: 2351–2361. pmid:12237457
69. Kobayashi C, Matsunaga Y, Koike R, Ota M, Sugita Y. Domain Motion Enhanced (DoME) Model for Efficient Conformational Sampling of Multidomain Proteins. J Phys Chem B. 2015;119: 14584–14593. pmid:26536148
70. Best RB, Chen Y-G, Hummer G. Slow Protein Conformational Dynamics from Multiple Experimental Structures: The Helix/Sheet Transition of Arc Repressor. Structure. 2005;13: 1755–1763. pmid:16338404
71. Westover KD, Bushnell DA, Kornberg RD. Structural Basis of Transcription. Cell. 2004;119: 481–489. pmid:15537538
72. Grant RA, Filman DJ, Finkel SE, Kolter R, Hogle JM. The crystal structure of Dps, a ferritin homolog that binds and protects DNA. Nat Struct Biol. 1998;5: 294–303. pmid:9546221
73. Davey CA, Sargent DF, Luger K, Maeder AW, Richmond TJ. Solvent Mediated Interactions in the Structure of the Nucleosome Core Particle at 1.9Å Resolution. J Mol Biol. 2002;319: 1097–1113. pmid:12079350
74. Sekido R, Lovell-Badge R. Sex determination involves synergistic action of SRY and SF1 on a specific Sox9 enhancer. Nature. 2008;453: 930–934. pmid:18454134
75. Harley VR, Lovell-Badge R, Goodfellow PN. Definition of a consensus DNA binding site for SRY. Nucleic Acids Res. 1994;22: 1500–1501. pmid:8190643
76. Murphy EC, Zhurkin VB, Louis JM, Cornilescu G, Clore GM. Structural Basis for SRY-dependent 46-X,Y Sex Reversal: Modulation of DNA Bending by a Naturally Occurring Point Mutation. J Mol Biol. 2001;312: 481–499. pmid:11563911
77. Pontiggia A, Rimini R, Harley VR, Goodfellow PN, Lovell-Badge R, Bianchi ME. Sex-reversing mutations affect the architecture of SRY-DNA complexes. EMBO J. 1994;13: 6115–6124. pmid:7813448
78. Fornes O, Castro-Mondragon JA, Khan A, van der Lee R, Zhang X, Richmond PA, et al. JASPAR 2020: update of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 2019;48: D87–D92. pmid:31701148
79. Patel A, Lee HO, Jawerth L, Maharana S, Jahnel M, Hein MY, et al. A Liquid-to-Solid Phase Transition of the ALS Protein FUS Accelerated by Disease Mutation. Cell. 2015;162: 1066–1077. pmid:26317470
80. Regy RM, Thompson J, Kim YC, Mittal J. Improved coarse-grained model for studying sequence dependent phase separation of disordered proteins. Protein Sci. 2021;30: 1371–1379. pmid:33934416
81. Yuan S, Wang J, Zhu D, Wang N, Gao Q, Chen W, et al. Cryo-EM structure of a herpesvirus capsid at 3.1 Å. Science. 2018;360: eaao7283. pmid:29622627
82. Öztürk MA, De M, Cojocaru V, Wade RC. Chromatosome Structure and Dynamics from Molecular Simulations. Annu Rev Phys Chem. 2020;71: 101–119. pmid:32017651
83. Wedemann G, Langowski J. Computer Simulation of the 30-Nanometer Chromatin Fiber. Biophys J. 2002;82: 2847–2859. pmid:12023209
84. Buckle A, Brackley CA, Boyle S, Marenduzzo D, Gilbert N. Polymer Simulations of Heteromorphic Chromatin Predict the 3D Folding of Complex Genomic Loci. Mol Cell. 2018;72: 786–797.e11. pmid:30344096
85. Takada S, Brandani GB, Tan C. Nucleosomes as allosteric scaffolds for genetic regulation. Curr Opin Struct Biol. 2020;62: 93–101. pmid:31901887
86. Ugarte La Torre D, Takada S. Coarse-grained implicit solvent lipid force field with a compatible resolution to the Cα protein representation. J Chem Phys. 2020;153: 205101. pmid:33261497
87. Terakawa T, Kenzaki H, Takada S. p53 Searches on DNA by Rotation-Uncoupled Sliding at C-Terminal Tails and Restricted Hopping of Core Domains. J Am Chem Soc. 2012;134: 14555–14562. pmid:22880817
88. Dai L, Xu Y, Du Z, Su X, Yu J. Revealing atomic-scale molecular diffusion of a plant-transcription factor WRKY domain protein along DNA. Proc Natl Acad Sci. 2021;118: e2102621118. pmid:34074787
89. Grime JMA, Voth GA. Highly Scalable and Memory Efficient Ultra-Coarse-Grained Molecular Dynamics Simulations. J Chem Theory Comput. 2014;10: 423–431. pmid:26579921
90. Garaizar A, Espinosa JR. Salt dependent phase behavior of intrinsically disordered proteins from a coarse-grained model with explicit water and ions. J Chem Phys. 2021;155: 125103. pmid:34598583
91. Sugita Y, Okamoto Y. Replica-exchange molecular dynamics method for protein folding. Chem Phys Lett. 1999;314: 141–151.
92. Sugita Y, Kitao A, Okamoto Y. Multidimensional replica-exchange method for free-energy calculations. J Chem Phys. 2000;113: 6042–6051.
93. Miao Y, Feher VA, McCammon JA. Gaussian Accelerated Molecular Dynamics: Unconstrained Enhanced Sampling and Free Energy Calculation. J Chem Theory Comput. 2015;11: 3584–3595. pmid:26300708
94. Kamiya M, Sugita Y. Flexible selection of the solute region in replica exchange with solute tempering: Application to protein-folding simulations. J Chem Phys. 2018;149: 072304. pmid:30134668
95. Fernandez-Leiro R, Scheres SHW. Unravelling biological macromolecules with cryo-electron microscopy. Nature. 2016;537: 339–346. pmid:27629640
96. Hylton RK, Swulius MT. Challenges and triumphs in cryo-electron tomography. iScience. 2021;24: 102959. pmid:34466785
97. Uchihashi T, Ganser C. Recent advances in bioimaging with high-speed atomic force microscopy. Biophys Rev. 2020;12: 363–369. pmid:32172451
98. Luchinat E, Banci L. In-Cell NMR in Human Cells: Direct Protein Expression Allows Structural Studies of Protein Folding and Maturation. Acc Chem Res. 2018;51: 1550–1557. pmid:29869502
99. Johnson GT, Autin L, Al-Alusi M, Goodsell DS, Sanner MF, Olson AJ. cellPACK: a virtual mesoscope to model and visualize structural systems biology. Nat Methods. 2015;12: 85–91. pmid:25437435
100. Jewett AI, Stelter D, Lambert J, Saladi SM, Roscioni OM, Ricci M, et al. Moltemplate: A Tool for Coarse-Grained Modeling of Complex Biological Matter and Soft Condensed Matter Physics. J Mol Biol. 2021;433: 166841. pmid:33539886
101. Martínez L, Andrade R, Birgin EG, Martínez JM. PACKMOL: A package for building initial configurations for molecular dynamics simulations. J Comput Chem. 2009;30: 2157–2164. pmid:19229944
102. Seo S, Shinoda W. SPICA Force Field for Lipid Membranes: Domain Formation Induced by Cholesterol. J Chem Theory Comput. 2019;15: 762–774. pmid:30514078
103. Han W, Wan C-K, Jiang F, Wu Y-D. PACE Force Field for Protein Simulations. 1. Full Parameterization of Version 1 and Verification. J Chem Theory Comput. 2010;6: 3373–3389. pmid:26617092
104. Tian C, Kasavajhala K, Belfon KAA, Raguette L, Huang H, Migues AN, et al. ff19SB: Amino-Acid-Specific Protein Backbone Parameters Trained against Quantum Mechanics Energy Surfaces in Solution. J Chem Theory Comput. 2020;16: 528–552. pmid:31714766
105. Huang J, Rauscher S, Nawrocki G, Ran T, Feig M, de Groot BL, et al. CHARMM36m: an improved force field for folded and intrinsically disordered proteins. Nat Methods. 2017;14: 71–73. pmid:27819658
106. Oostenbrink C, Villa A, Mark AE, Van Gunsteren WF. A biomolecular force field based on the free enthalpy of hydration and solvation: The GROMOS force-field parameter sets 53A5 and 53A6. J Comput Chem. 2004;25: 1656–1676. pmid:15264259
107. Badaczewska-Dawid AE, Kolinski A, Kmiecik S. Computational reconstruction of atomistic protein structures from coarse-grained models. Comput Struct Biotechnol J. 2020;18: 162–176. pmid:31969975
108. Shimizu M, Takada S. Reconstruction of Atomistic Structures from Coarse-Grained Models for Protein–DNA Complexes. J Chem Theory Comput. 2018;14: 1682–1694. pmid:29397721

[ref1] 1. Jerkovic´ I, Cavalli G. Understanding 3D genome organization by multidisciplinary methods. Nat Rev Mol Cell Biol. 2021; 22: 511–528. pmid:33953379
View Article
PubMed/NCBI
Google Scholar

[2] View Article

[3] PubMed/NCBI

[4] Google Scholar

[ref2] 2. Yoshizawa T, Nozawa R-S, Jia TZ, Saio T, Mori E. Biological phase separation: cell biology meets biophysics. Biophys Rev. 2020;12: 519–539. pmid:32189162
View Article
PubMed/NCBI
Google Scholar

[6] View Article

[7] PubMed/NCBI

[8] Google Scholar

[ref3] 3. Di Pierro M, Zhang B, Aiden EL, Wolynes PG, Onuchic JN. Transferable model for chromosome architecture. Proc Natl Acad Sci. 2016;113: 12168–12173. pmid:27688758
View Article
PubMed/NCBI
Google Scholar

[10] View Article

[11] PubMed/NCBI

[12] Google Scholar

[ref4] 4. Nuebler J, Fudenberg G, Imakaev M, Abdennur N, Mirny LA. Chromatin organization by an interplay of loop extrusion and compartmental segregation. Proc Natl Acad Sci. 2018;115: E6697–E6706. pmid:29967174
View Article
PubMed/NCBI
Google Scholar

[14] View Article

[15] PubMed/NCBI

[16] Google Scholar

[ref5] 5. Borgia A, Borgia MB, Bugge K, Kissling VM, Heidarsson PO, Fernandes CB, et al. Extreme disorder in an ultrahigh-affinity protein complex. Nature. 2018;555: 61–66. pmid:29466338
View Article
PubMed/NCBI
Google Scholar

[18] View Article

[19] PubMed/NCBI

[20] Google Scholar

[ref6] 6. Martin EW, Holehouse AS, Peran I, Farag M, Incicco JJ, Bremer A, et al. Valence and patterning of aromatic residues determine the phase behavior of prion-like domains. Science (80-). 2020;367: 694–699. pmid:32029630
View Article
PubMed/NCBI
Google Scholar

[22] View Article

[23] PubMed/NCBI

[24] Google Scholar

[ref7] 7. Conicella AE, Dignon GL, Zerze GH, Schmidt HB, D’Ordine AM, Kim YC, et al. TDP-43 α-helical structure tunes liquid–liquid phase separation and function. Proc Natl Acad Sci. 2020;117: 5883–5894. pmid:32132204
View Article
PubMed/NCBI
Google Scholar

[26] View Article

[27] PubMed/NCBI

[28] Google Scholar

[ref8] 8. Saunders MG, Voth GA. Coarse-Graining Methods for Computational Biology. Annu Rev Biophys. 2013;42: 73–93. pmid:23451897
View Article
PubMed/NCBI
Google Scholar

[30] View Article

[31] PubMed/NCBI

[32] Google Scholar

[ref9] 9. Kmiecik S, Gront D, Kolinski M, Wieteska L, Dawid AE, Kolinski A. Coarse-Grained Protein Models and Their Applications. Chem Rev. 2016;116: 7898–7936. pmid:27333362
View Article
PubMed/NCBI
Google Scholar

[34] View Article

[35] PubMed/NCBI

[36] Google Scholar

[ref10] 10. Gopal SM, Mukherjee S, Cheng Y-M, Feig M. PRIMO/PRIMONA: A coarse-grained model for proteins and nucleic acids that preserves near-atomistic accuracy. Proteins Struct Funct Bioinforma. 2010;78: 1266–1281. pmid:19967787
View Article
PubMed/NCBI
Google Scholar

[38] View Article

[39] PubMed/NCBI

[40] Google Scholar

[ref11] 11. Sterpone F, Melchionna S, Tuffery P, Pasquali S, Mousseau N, Cragnolini T, et al. The OPEP protein model: from single molecules, amyloid formation, crowding and hydrodynamics to DNA/RNA systems. Chem Soc Rev. 2014;43: 4871–4893. pmid:24759934
View Article
PubMed/NCBI
Google Scholar

[42] View Article

[43] PubMed/NCBI

[44] Google Scholar

[ref12] 12. Davtyan A, Schafer NP, Zheng W, Clementi C, Wolynes PG, Papoian GA. AWSEM-MD: Protein Structure Prediction Using Coarse-Grained Physical Potentials and Bioinformatically Based Local Structure Biasing. J Phys Chem B. 2012;116: 8494–8503. pmid:22545654
View Article
PubMed/NCBI
Google Scholar

[46] View Article

[47] PubMed/NCBI

[48] Google Scholar

[ref13] 13. Marrink SJ, Risselada HJ, Yefimov S, Tieleman DP, de Vries AH. The MARTINI Force Field: Coarse Grained Model for Biomolecular Simulations. J Phys Chem B. 2007;111: 7812–7824. pmid:17569554
View Article
PubMed/NCBI
Google Scholar

[50] View Article

[51] PubMed/NCBI

[52] Google Scholar

[ref14] 14. Clementi C, Nymeyer H, Onuchic JN. Topological and energetic factors: what determines the structural details of the transition state ensemble and “en-route” intermediates for protein folding? an investigation for small globular proteins. J Mol Biol. 2000;298: 937–953. pmid:10801360
View Article
PubMed/NCBI
Google Scholar

[54] View Article

[55] PubMed/NCBI

[56] Google Scholar

[ref15] 15. Go N. Theoretical Studies of Protein Folding. Annu Rev Biophys Bioeng. 1983;12: 183–210. pmid:6347038
View Article
PubMed/NCBI
Google Scholar

[58] View Article

[59] PubMed/NCBI

[60] Google Scholar

[ref16] 16. Takada S. Gō model revisited. Biophys Physicobiology. 2019;16: 248–255. pmid:31984178
View Article
PubMed/NCBI
Google Scholar

[62] View Article

[63] PubMed/NCBI

[64] Google Scholar

[ref17] 17. Arnarez C, Uusitalo JJ, Masman MF, Ingólfsson HI, de Jong DH, Melo MN, et al. Dry Martini, a Coarse-Grained Force Field for Lipid Membrane Simulations with Implicit Solvent. J Chem Theory Comput. 2015;11: 260–275. pmid:26574224
View Article
PubMed/NCBI
Google Scholar

[66] View Article

[67] PubMed/NCBI

[68] Google Scholar

[ref18] 18. Ermak DL, McCammon JA. Brownian dynamics with hydrodynamic interactions. J Chem Phys. 1978;69: 1352–1360.
View Article
Google Scholar

[70] View Article

[71] Google Scholar

[ref19] 19. Ando T, Skolnick J. Sliding of Proteins Non-specifically Bound to DNA: Brownian Dynamics Studies with Coarse-Grained Protein and DNA Models. Clore GM, editor. PLoS Comput Biol. 2014;10: e1003990. pmid:25504215

[ref20] 20. Sterpone F, Derreumaux P, Melchionna S. Protein Simulations in Fluids: Coupling the OPEP Coarse-Grained Force Field with Hydrodynamics. J Chem Theory Comput. 2015;11: 1843–1853. pmid:26574390

[ref21] 21. Brandner AF, Timr S, Melchionna S, Derreumaux P, Baaden M, Sterpone F. Modelling lipid systems in fluid with Lattice Boltzmann Molecular Dynamics simulations and hydrodynamics. Sci Rep. 2019;9: 16450. pmid:31712588

[ref22] 22. Takada S, Kanada R, Tan C, Terakawa T, Li W, Kenzaki H. Modeling Structural Dynamics of Biomolecular Complexes by Coarse-Grained Molecular Simulations. Acc Chem Res. 2015;48: 3026–3035. pmid:26575522

[ref23] 23. Nguyen HT, Hori N, Thirumalai D. Theory and simulations for RNA folding in mixtures of monovalent and divalent cations. Proc Natl Acad Sci. 2019;116: 21022–21030. pmid:31570624

[ref24] 24. Hinckley DM, Lequieu JP, de Pablo JJ. Coarse-grained modeling of DNA oligomer hybridization: Length, sequence, and salt effects. J Chem Phys. 2014;141: 035102. pmid:25053341

[ref25] 25. Kubo S, Niina T, Takada S. Molecular dynamics simulation of proton-transfer coupled rotations in ATP synthase FO motor. Sci Rep. 2020;10: 8225. pmid:32427921

[ref26] 26. Li W, Wang W, Takada S. Energy landscape views for interplays among folding, binding, and allostery of calmodulin domains. Proc Natl Acad Sci. 2014;111: 10550–10555. pmid:25002491

[ref27] 27. Hori N, Takada S. Coarse-Grained Structure-Based Model for RNA-Protein Complexes Developed by Fluctuation Matching. J Chem Theory Comput. 2012;8: 3384–3394. pmid:26605744

[ref28] 28. Freeman GS, Hinckley DM, Lequieu JP, Whitmer JK, de Pablo JJ. Coarse-grained modeling of DNA curvature. J Chem Phys. 2014;141: 165103. pmid:25362344

[ref29] 29. Tan C, Terakawa T, Takada S. Dynamic Coupling among Protein Binding, Sliding, and DNA Bending Revealed by Molecular Dynamics. J Am Chem Soc. 2016;138: 8512–8522. pmid:27309278

[ref30] 30. Brandani GB, Niina T, Tan C, Takada S. DNA sliding in nucleosomes via twist defect propagation revealed by molecular simulations. Nucleic Acids Res. 2018;46: 2788–2801. pmid:29506273

[ref31] 31. Lequieu J, Schwartz DC, de Pablo JJ. In silico evidence for sequence-dependent nucleosome sliding. Proc Natl Acad Sci. 2017;114: E9197–E9205. pmid:29078285

[ref32] 32. Tan C, Takada S. Dynamic and Structural Modeling of the Specificity in Protein–DNA Interactions Guided by Binding Assay and Structure Data. J Chem Theory Comput. 2018;14: 3877–3889. pmid:29806939

[ref33] 33. Kapcha LH, Rossky PJ. A Simple Atomic-Level Hydrophobicity Scale Reveals Protein Interfacial Structure. J Mol Biol. 2014;426: 484–498. pmid:24120937

[ref34] 34. Kim YC, Hummer G. Coarse-grained Models for Simulations of Multiprotein Complexes: Application to Ubiquitin Binding. J Mol Biol. 2008;375: 1416–1433. pmid:18083189

[ref35] 35. Dignon GL, Zheng W, Kim YC, Best RB, Mittal J. Sequence determinants of protein phase behavior from a coarse-grained model. Ofran Y, editor. PLOS Comput Biol. 2018;14: e1005941. pmid:29364893

[ref36] 36. Dannenhoffer-Lafage T, Best RB. A Data-Driven Hydrophobicity Scale for Predicting Liquid–Liquid Phase Separation of Proteins. J Phys Chem B. 2021;125: 4046–4056. pmid:33876938

[ref37] 37. Regy RM, Dignon GL, Zheng W, Kim YC, Mittal J. Sequence dependent phase separation of protein-polynucleotide mixtures elucidated using molecular simulations. Nucleic Acids Res. 2020;48: 12593–12603. pmid:33264400

[ref38] 38. Plimpton S. Fast Parallel Algorithms for Short-Range Molecular Dynamics. J Comput Phys. 1995;117: 1–19.

[ref39] 39. Hinckley DM, Freeman GS, Whitmer JK, de Pablo JJ. An experimentally-informed coarse-grained 3-site-per-nucleotide model of DNA: Structure, thermodynamics, and dynamics of hybridization. J Chem Phys. 2013;139: 144903. pmid:24116642

[ref40] 40. Noel JK, Levi M, Raghunathan M, Lammert H, Hayes RL, Onuchic JN, et al. SMOG 2: A Versatile Software Package for Generating Structure-Based Models. Prlic A, editor. PLOS Comput Biol. 2016;12: e1004794. pmid:26963394

[ref41] 41. Pronk S, Páll S, Schulz R, Larsson P, Bjelkmar P, Apostolov R, et al. GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit. Bioinformatics. 2013;29: 845–854. pmid:23407358

[ref42] 42. Phillips JC, Braun R, Wang W, Gumbart J, Tajkhorshid E, Villa E, et al. Scalable molecular dynamics with NAMD. J Comput Chem. 2005;26: 1781–1802. pmid:16222654

[ref43] 43. Lu W, Bueno C, Schafer NP, Moller J, Jin S, Chen X, et al. OpenAWSEM with Open3SPN2: A fast, flexible, and accessible framework for large-scale coarse-grained biomolecular simulations. Schneidman-Duhovny D, editor. PLOS Comput Biol. 2021;17: e1008308. pmid:33577557

[ref44] 44. Kenzaki H, Koga N, Hori N, Kanada R, Li W, Okazaki K, et al. CafeMol: A Coarse-Grained Biomolecular Simulator for Simulating Proteins at Work. J Chem Theory Comput. 2011;7: 1979–1989. pmid:26596457

[ref45] 45. Li P, Banjade S, Cheng H-C, Kim S, Chen B, Guo L, et al. Phase transitions in the assembly of multivalent signalling proteins. Nature. 2012;483: 336–340. pmid:22398450

[ref46] 46. Borcherds W, Bremer A, Borgia MB, Mittag T. How do intrinsically disordered protein regions encode a driving force for liquid–liquid phase separation? Curr Opin Struct Biol. 2021;67: 41–50. pmid:33069007

[ref47] 47. Roden C, Gladfelter AS. RNA contributions to the form and function of biomolecular condensates. Nat Rev Mol Cell Biol. 2021;22: 183–195. pmid:32632317

[ref48] 48. Turner AL, Watson M, Wilkins OG, Cato L, Travers A, Thomas JO, et al. Highly disordered histone H1−DNA model complexes and their condensates. Proc Natl Acad Sci. 2018;115: 11964–11969. pmid:30301810

[ref49] 49. Banani SF, Lee HO, Hyman AA, Rosen MK. Biomolecular condensates: organizers of cellular biochemistry. Nat Rev Mol Cell Biol. 2017;18: 285–298. pmid:28225081

[ref50] 50. Jung J, Mori T, Kobayashi C, Matsunaga Y, Yoda T, Feig M, et al. GENESIS: a hybrid-parallel and multi-scale molecular dynamics simulator with enhanced sampling algorithms for biomolecular and cellular simulations. Wiley Interdiscip Rev Comput Mol Sci. 2015;5: 310–323. pmid:26753008

[ref51] 51. Kobayashi C, Jung J, Matsunaga Y, Mori T, Ando T, Tamura K, et al. GENESIS 1.1: A hybrid-parallel molecular dynamics simulator with enhanced sampling algorithms on multiple computational platforms. J Comput Chem. 2017;38: 2193–2206. pmid:28718930

[ref52] 52. Bezanson J, Edelman A, Karpinski S, Shah VB. Julia: A Fresh Approach to Numerical Computing. SIAM Rev. 2017;59: 65–98.

[ref53] 53. Terakawa T, Takada S. Multiscale Ensemble Modeling of Intrinsically Disordered Proteins: p53 N-Terminal Domain. Biophys J. 2011;101: 1450–1458. pmid:21943426

[ref54] 54. Tan C, Jung J, Kobayashi C, Sugita Y. A singularity-free torsion angle potential for coarse-grained molecular dynamics simulations. J Chem Phys. 2020;153: 044110. pmid:32752657

[ref55] 55. Catenaccio A, Daruich Y, Magallanes C. Temperature dependence of the permittivity of water. Chem Phys Lett. 2003;367: 669–671.

[ref56] 56. Stogryn A. Equations for Calculating the Dielectric Constant of Saline Water. IEEE Trans Microw Theory Tech. 1971;19: 733–736.

[ref57] 57. Ashbaugh HS, Hatch HW. Natively Unfolded Protein Stability as a Coil-to-Globule Transition in Charge/Hydropathy Space. J Am Chem Soc. 2008;130: 9536–9542. pmid:18576630

[ref58] 58. Miyazawa S, Jernigan RL. Residue–Residue Potentials with a Favorable Contact Pair Term and an Unfavorable High Packing Density Term, for Simulation and Threading. J Mol Biol. 1996;256: 623–644. pmid:8604144

[ref59] 59. Chu J-W, Voth GA. Coarse-Grained Modeling of the Actin Filament Derived from Atomistic-Scale Simulations. Biophys J. 2006;90: 1572–1582. pmid:16361345

[ref60] 60. Terakawa T, Takada S. RESPAC: Method to Determine Partial Charges in Coarse-Grained Protein Model and Its Application to DNA-Binding Proteins. J Chem Theory Comput. 2014;10: 711–721. pmid:26580048

[ref61] 61. Niina T, Brandani GB, Tan C, Takada S. Sequence-dependent nucleosome sliding in rotation-coupled and uncoupled modes revealed by molecular simulations. Panchenko ARR, editor. PLOS Comput Biol. 2017;13: e1005880. pmid:29194442

[ref62] 62. Tan C, Takada S. Nucleosome allostery in pioneer transcription factor binding. Proc Natl Acad Sci. 2020;117: 20586–20596. pmid:32778600

[ref63] 63. Humphrey W, Dalke A, Schulten K. VMD: Visual molecular dynamics. J Mol Graph. 1996;14: 33–38. pmid:8744570

[ref64] 64. Schrödinger L. The PyMOL Molecular Graphics System, Version 1.8. 2015 Nov.

[ref65] 65. Lu X-J, Olson WK. 3DNA: a versatile, integrated software system for the analysis, rebuilding and visualization of three-dimensional nucleic-acid structures. Nat Protoc. 2008;3: 1213–1227. pmid:18600227

[ref66] 66. Adams PD, Afonine PV, Baskaran K, Berman HM, Berrisford J, Bricogne G, et al. Announcing mandatory submission of PDBx/mmCIF format files for crystallographic depositions to the Protein Data Bank (PDB). Acta Crystallogr Sect D Struct Biol. 2019;75: 451–454. pmid:30988261

[ref67] 67. Jung J, Mori T, Sugita Y. Midpoint cell method for hybrid (MPI+OpenMP) parallelization of molecular dynamics simulations. J Comput Chem. 2014;35: 1064–1072. pmid:24659253

[ref68] 68. Karanicolas J, Brooks CL. The origins of asymmetry in the folding transition states of protein L and protein G. Protein Sci. 2009;11: 2351–2361. pmid:12237457

[ref69] 69. Kobayashi C, Matsunaga Y, Koike R, Ota M, Sugita Y. Domain Motion Enhanced (DoME) Model for Efficient Conformational Sampling of Multidomain Proteins. J Phys Chem B. 2015;119: 14584–14593. pmid:26536148

[ref70] 70. Best RB, Chen Y-G, Hummer G. Slow Protein Conformational Dynamics from Multiple Experimental Structures: The Helix/Sheet Transition of Arc Repressor. Structure. 2005;13: 1755–1763. pmid:16338404

[ref71] 71. Westover KD, Bushnell DA, Kornberg RD. Structural Basis of Transcription. Cell. 2004;119: 481–489. pmid:15537538

[ref72] 72. Grant RA, Filman DJ, Finkel SE, Kolter R, Hogle JM. The crystal structure of Dps, a ferritin homolog that binds and protects DNA. Nat Struct Biol. 1998;5: 294–303. pmid:9546221

[ref73] 73. Davey CA, Sargent DF, Luger K, Maeder AW, Richmond TJ. Solvent Mediated Interactions in the Structure of the Nucleosome Core Particle at 1.9Å Resolution. J Mol Biol. 2002;319: 1097–1113. pmid:12079350

[ref74] 74. Sekido R, Lovell-Badge R. Sex determination involves synergistic action of SRY and SF1 on a specific Sox9 enhancer. Nature. 2008;453: 930–934. pmid:18454134

[ref75] 75. Harley VR, Lovell-Badge R, Goodfellow PN. Definition of a consensus DNA binding site for SRY. Nucleic Acids Res. 1994;22: 1500–1501. pmid:8190643

[ref76] 76. Murphy EC, Zhurkin VB, Louis JM, Cornilescu G, Clore GM. Structural Basis for SRY-dependent 46-X,Y Sex Reversal: Modulation of DNA Bending by a Naturally Occurring Point Mutation. J Mol Biol. 2001;312: 481–499. pmid:11563911

[ref77] 77. Pontiggia A, Rimini R, Harley VR, Goodfellow PN, Lovell-Badge R, Bianchi ME. Sex-reversing mutations affect the architecture of SRY-DNA complexes. EMBO J. 1994;13: 6115–6124. pmid:7813448

[ref78] 78. Fornes O, Castro-Mondragon JA, Khan A, van der Lee R, Zhang X, Richmond PA, et al. JASPAR 2020: update of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 2019;48: D87–D92. pmid:31701148

[ref79] 79. Patel A, Lee HO, Jawerth L, Maharana S, Jahnel M, Hein MY, et al. A Liquid-to-Solid Phase Transition of the ALS Protein FUS Accelerated by Disease Mutation. Cell. 2015;162: 1066–1077. pmid:26317470

[ref80] 80. Regy RM, Thompson J, Kim YC, Mittal J. Improved coarse-grained model for studying sequence dependent phase separation of disordered proteins. Protein Sci. 2021;30: 1371–1379. pmid:33934416

[ref81] 81. Yuan S, Wang J, Zhu D, Wang N, Gao Q, Chen W, et al. Cryo-EM structure of a herpesvirus capsid at 3.1 Å. Science. 2018;360: eaao7283. pmid:29622627

[ref82] 82. Öztürk MA, De M, Cojocaru V, Wade RC. Chromatosome Structure and Dynamics from Molecular Simulations. Annu Rev Phys Chem. 2020;71: 101–119. pmid:32017651

[ref83] 83. Wedemann G, Langowski J. Computer Simulation of the 30-Nanometer Chromatin Fiber. Biophys J. 2002;82: 2847–2859. pmid:12023209

[ref84] 84. Buckle A, Brackley CA, Boyle S, Marenduzzo D, Gilbert N. Polymer Simulations of Heteromorphic Chromatin Predict the 3D Folding of Complex Genomic Loci. Mol Cell. 2018;72: 786–797.e11. pmid:30344096

[ref85] 85. Takada S, Brandani GB, Tan C. Nucleosomes as allosteric scaffolds for genetic regulation. Curr Opin Struct Biol. 2020;62: 93–101. pmid:31901887

[ref86] 86. Ugarte La Torre D, Takada S. Coarse-grained implicit solvent lipid force field with a compatible resolution to the Cα protein representation. J Chem Phys. 2020;153: 205101. pmid:33261497

[ref87] 87. Terakawa T, Kenzaki H, Takada S. p53 Searches on DNA by Rotation-Uncoupled Sliding at C-Terminal Tails and Restricted Hopping of Core Domains. J Am Chem Soc. 2012;134: 14555–14562. pmid:22880817

[ref88] 88. Dai L, Xu Y, Du Z, Su X, Yu J. Revealing atomic-scale molecular diffusion of a plant-transcription factor WRKY domain protein along DNA. Proc Natl Acad Sci. 2021;118: e2102621118. pmid:34074787

[ref89] 89. Grime JMA, Voth GA. Highly Scalable and Memory Efficient Ultra-Coarse-Grained Molecular Dynamics Simulations. J Chem Theory Comput. 2014;10: 423–431. pmid:26579921

[ref90] 90. Garaizar A, Espinosa JR. Salt dependent phase behavior of intrinsically disordered proteins from a coarse-grained model with explicit water and ions. J Chem Phys. 2021;155: 125103. pmid:34598583

[ref91] 91. Sugita Y, Okamoto Y. Replica-exchange molecular dynamics method for protein folding. Chem Phys Lett. 1999;314: 141–151.

[ref92] 92. Sugita Y, Kitao A, Okamoto Y. Multidimensional replica-exchange method for free-energy calculations. J Chem Phys. 2000;113: 6042–6051.

[ref93] 93. Miao Y, Feher VA, McCammon JA. Gaussian Accelerated Molecular Dynamics: Unconstrained Enhanced Sampling and Free Energy Calculation. J Chem Theory Comput. 2015;11: 3584–3595. pmid:26300708

[ref94] 94. Kamiya M, Sugita Y. Flexible selection of the solute region in replica exchange with solute tempering: Application to protein-folding simulations. J Chem Phys. 2018;149: 072304. pmid:30134668

[ref95] 95. Fernandez-Leiro R, Scheres SHW. Unravelling biological macromolecules with cryo-electron microscopy. Nature. 2016;537: 339–346. pmid:27629640

[ref96] 96. Hylton RK, Swulius MT. Challenges and triumphs in cryo-electron tomography. iScience. 2021;24: 102959. pmid:34466785

[ref97] 97. Uchihashi T, Ganser C. Recent advances in bioimaging with high-speed atomic force microscopy. Biophys Rev. 2020;12: 363–369. pmid:32172451

[ref98] 98. Luchinat E, Banci L. In-Cell NMR in Human Cells: Direct Protein Expression Allows Structural Studies of Protein Folding and Maturation. Acc Chem Res. 2018;51: 1550–1557. pmid:29869502

[ref99] 99. Johnson GT, Autin L, Al-Alusi M, Goodsell DS, Sanner MF, Olson AJ. cellPACK: a virtual mesoscope to model and visualize structural systems biology. Nat Methods. 2015;12: 85–91. pmid:25437435

[ref100] 100. Jewett AI, Stelter D, Lambert J, Saladi SM, Roscioni OM, Ricci M, et al. Moltemplate: A Tool for Coarse-Grained Modeling of Complex Biological Matter and Soft Condensed Matter Physics. J Mol Biol. 2021;433: 166841. pmid:33539886

[ref101] 101. Martínez L, Andrade R, Birgin EG, Martínez JM. PACKMOL: A package for building initial configurations for molecular dynamics simulations. J Comput Chem. 2009;30: 2157–2164. pmid:19229944

[ref102] 102. Seo S, Shinoda W. SPICA Force Field for Lipid Membranes: Domain Formation Induced by Cholesterol. J Chem Theory Comput. 2019;15: 762–774. pmid:30514078

[ref103] 103. Han W, Wan C-K, Jiang F, Wu Y-D. PACE Force Field for Protein Simulations. 1. Full Parameterization of Version 1 and Verification. J Chem Theory Comput. 2010;6: 3373–3389. pmid:26617092

[ref104] 104. Tian C, Kasavajhala K, Belfon KAA, Raguette L, Huang H, Migues AN, et al. ff19SB: Amino-Acid-Specific Protein Backbone Parameters Trained against Quantum Mechanics Energy Surfaces in Solution. J Chem Theory Comput. 2020;16: 528–552. pmid:31714766

[ref105] 105. Huang J, Rauscher S, Nawrocki G, Ran T, Feig M, de Groot BL, et al. CHARMM36m: an improved force field for folded and intrinsically disordered proteins. Nat Methods. 2017;14: 71–73. pmid:27819658

[ref106] 106. Oostenbrink C, Villa A, Mark AE, Van Gunsteren WF. A biomolecular force field based on the free enthalpy of hydration and solvation: The GROMOS force-field parameter sets 53A5 and 53A6. J Comput Chem. 2004;25: 1656–1676. pmid:15264259

[ref107] 107. Badaczewska-Dawid AE, Kolinski A, Kmiecik S. Computational reconstruction of atomistic protein structures from coarse-grained models. Comput Struct Biotechnol J. 2020;18: 162–176. pmid:31969975

[ref108] 108. Shimizu M, Takada S. Reconstruction of Atomistic Structures from Coarse-Grained Models for Protein–DNA Complexes. J Chem Theory Comput. 2018;14: 1682–1694. pmid:29397721
