In 2017, the wwPDB plans to introduce a new procedure that allows the Depositor of Record (defined as the Principal Investigator for the entry) to manage substantial revisions to previously released PDB archival entries.
At present, revised atomic coordinates for an existing released PDB entry are assigned a new accession code, and the prior entry is obsoleted. This long-standing wwPDB policy had the unintended consequence of breaking connections with publications and usage of the prior set of atomic coordinates, resulting in a non-trivial barrier to submission of atomic coordinate revisions by our Depositors of Record.
The wwPDB is introducing a file versioning system that allows Depositors of Record to update their own previously released entries. Please note that, in the first phase, file versioning will be applied to atomic coordinates refined against unchanged experimental data.
Version numbers of each PDB archive entry will be designated using a #-# identifier. The first number specifies the major version, and the second designates the minor version. The Structure of Record (i.e., the initial set of released atomic coordinates) is designated as Version 1-0. Thereafter, the major version number is incremented with each substantial revision of a given entry (e.g., Version 2-0, when the atomic coordinates are replaced for the first time by the Depositor of Record). “Major version changes” are defined as updates to the atomic coordinates, polymer sequence(s), and/or chemical identity of a ligand. All other changes are defined as “minor changes”. When a major change is made, the minor version number is reset to 0 (e.g., 1-0 to 1-1 to 2-0). For the avoidance of doubt, the wwPDB will retain, for every major version of an entry, its latest minor version within the PDB archive.
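As an illustration only, the numbering rules described above can be sketched in a few lines of Python. The names `EntryVersion` and `retained` are hypothetical and are not part of any wwPDB software; the sketch assumes that a version string is always two non-negative integers joined by a hyphen.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class EntryVersion:
    """Hypothetical model of a '#-#' version identifier (major-minor)."""
    major: int = 1  # the Structure of Record is Version 1-0
    minor: int = 0

    @classmethod
    def parse(cls, text: str) -> "EntryVersion":
        # Assumption: exactly one hyphen separating two integers, e.g. "2-0".
        major, minor = text.split("-")
        return cls(int(major), int(minor))

    def bump_minor(self) -> "EntryVersion":
        # Any change other than coordinates, sequence, or ligand identity.
        return EntryVersion(self.major, self.minor + 1)

    def bump_major(self) -> "EntryVersion":
        # A major change resets the minor version number to 0.
        return EntryVersion(self.major + 1, 0)

    def __str__(self) -> str:
        return f"{self.major}-{self.minor}"


def retained(versions):
    """Keep, for each major version, only its latest minor version."""
    latest = {}
    for v in versions:
        if v.major not in latest or v.minor > latest[v.major].minor:
            latest[v.major] = v
    return [latest[m] for m in sorted(latest)]


history = [EntryVersion.parse("1-0")]
history.append(history[-1].bump_minor())  # 1-1
history.append(history[-1].bump_major())  # 2-0
kept = [str(v) for v in retained(history)]
```

In this sketch, the history 1-0, 1-1, 2-0 reduces to the retained set 1-1 and 2-0, which matches the retention rule stated above.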
Current wwPDB policies governing the deposition of independently refined structures based on the data generated by a research group or laboratory separate from that of the Depositor of Record remain unchanged. Versioning of atomic coordinates will be strictly limited to substitutions made by the Depositor of Record.
Upon introduction of the file versioning system, the wwPDB will revise each PDB accession code by extending its length and prepending “PDB” (e.g., "1abc" will become "pdb_00001abc"). This process will enable text mining detection of PDB entries in the published literature and allow for more informative and transparent delivery of revised data files. For example, the atomic coordinates for the second major version of PDB entry 1abc would have the following form under the new file-naming schema:
The wwPDB is mindful of the importance of continuity in providing services and supporting User activities. For as long as practicable, the wwPDB will continue assigning PDB codes that can be truncated losslessly to the current four-character style. In the same spirit, initial implementation of entry file versioning will appear in a new, parallel branch of the PDB archive FTP tree. More details on the new FTP tree organization and accessibility of version information will be forthcoming. Data files in the current archive location ftp://ftp.wwpdb.org/pub/pdb/data/structures/ will continue to use the familiar naming style and will contain the latest version in the corresponding versioned archive.
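For users who maintain their own scripts, the relationship between the current four-character codes and the extended form can be sketched as follows. This is a minimal example, not an official tool: the function names are hypothetical, and the eight-character, zero-padded body is inferred solely from the single example given above ("1abc" becoming "pdb_00001abc").

```python
import re

# Assumption: a current-style code is four alphanumeric characters.
_LEGACY = re.compile(r"[0-9a-z]{4}")
# Assumption: the extended form is "pdb_" plus an eight-character body,
# inferred from the example "pdb_00001abc".
_EXTENDED = re.compile(r"pdb_([0-9a-z]{8})")


def extend_pdb_id(legacy: str) -> str:
    """Convert a four-character code such as '1abc' to 'pdb_00001abc'."""
    code = legacy.strip().lower()
    if not _LEGACY.fullmatch(code):
        raise ValueError(f"not a four-character PDB code: {legacy!r}")
    return f"pdb_{code:0>8}"  # left-pad the body with zeros to width 8


def truncate_pdb_id(extended: str) -> str:
    """Recover the four-character code when that is possible without loss."""
    match = _EXTENDED.fullmatch(extended.strip().lower())
    if not match:
        raise ValueError(f"not an extended PDB code: {extended!r}")
    body = match.group(1)
    if not body.startswith("0000"):
        # Truncating would discard information, so refuse.
        raise ValueError(f"cannot be truncated losslessly: {extended!r}")
    return body[4:]


extended = extend_pdb_id("1abc")
roundtrip = truncate_pdb_id(extended)
```

Refusing to truncate a code whose leading characters are not zeros keeps the conversion lossless, in line with the continuity goal described above.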
On July 12, 2017, the wwPDB partners plan to update the PDB FTP archive with PDB structure entry files conforming to V5.0 of the PDBx/mmCIF dictionary, which already supports the global wwPDB system for Deposition, Biocuration, and Validation of PDB data - OneDep.
In preparation for this update, to allow the community ample time to test the planned update and to provide feedback, the wwPDB is now delivering PDBx/mmCIF and XML structure entry files for all entries in the PDB archive conforming to the new data standards via a new FTP repository (ftp://ftp-beta.wwpdb.org/). This collection of test files will be updated in concert with regular weekly updates of the PDB archive.
Complete lists of changes can be found at the wwPDB website (https://www.wwpdb.org/documentation/remediation).
The wwPDB strongly encourages the community to review and test the updated files.
Users should report V5.0 data issues to firstname.lastname@example.org
We will attempt to address the reported issues incrementally as User feedback is received in advance of the rollout on July 12, 2017.
Other derived data and experimental data files will be delivered incrementally to the ftp-beta tree between May 3 and July 12, 2017.
The test FTP area (ftp://ftp.wwpdb.org/pub/pdb/test_data/EM/) containing the updated 3DEM model files made available in December 2016 will be retired effective May 3, 2017.
The wwPDB is preparing to update the PDBx/mmCIF model files for all entries in the PDB archive to V5 of the PDBx/mmCIF dictionary. When completed, all PDB model files will have better organized content and will conform to the revised data model used within the wwPDB OneDep System. A list of changes will be available at the wwPDB website (https://www.wwpdb.org/documentation/remediation). Since January 2016, the OneDep system (https://www.wwpdb.org/deposition/system-information) has supported Deposition, Biocuration, and Validation of structures determined by experimental methods currently accepted by the PDB.
The updated model files for all experimental methods will be made available on a new PDB FTP server (ftp://ftp-beta.wwpdb.org/pub/pdb/data/structures/), and the corresponding PDBx/mmCIF dictionary will be released in May 2017. The test FTP area (ftp://ftp.wwpdb.org/pub/pdb/test_data/EM/) containing the updated 3DEM model files made available in December 2016 will be retired at the same time.
The current PDB FTP archive will be updated with new files corresponding to the V5 PDBx/mmCIF dictionary in July 2017. Users are strongly encouraged to review and test the updated data files.
The wwPDB partners are pleased to announce that updated validation reports for all X-ray, NMR, and 3DEM structures deposited in the PDB archive are available as of March 15, 2017.
The updates include new percentile statistics reflecting the state of the PDB archive on December 31, 2016, and updated versions of the Mogul software (2017) and CSD archive (as538be).
The updated reports are accessible from the following FTP sites:
A copy of the previous version is archived at RCSB PDB and PDBj.
These updated wwPDB validation reports provide an assessment of structure quality using widely accepted standards and criteria, recommended by community experts serving in the Validation Task Force. The wwPDB partners strongly encourage journal editors and referees to request them from authors as part of the manuscript submission and review process. The reports are date-stamped and display the wwPDB logo, and contain the same information, regardless of which wwPDB site processed the entry. Provision of wwPDB validation reports is already required by Nature, eLife, The Journal of Biological Chemistry, the International Union of Crystallography (IUCr) journals, FEBS journals, Journal of Immunology and Angew Chem Int Ed Engl as part of their manuscript-submission process.
Validation reports are also provided to depositors through OneDep, the wwPDB portal for validation, deposition, and biocuration of structure data. The wwPDB partners encourage the use of the stand-alone validation server and the webservice API at any time prior to data deposition. Depositors are required to review and accept the reports as part of the data submission process. Validation reports will continue to be developed and improved as we receive recommendations from the expert Validation Task Forces (VTF) for X-ray, NMR, EM, and ligand validation, and as we collect feedback from depositors and users.
Further information and sample validation reports are available.
Your feedback, comments, and questions are welcome at email@example.com.
On November 18-19, 2016, the Human Frontier Science Program Organization (HFSPO) hosted a meeting of senior managers of key data resources (including members of the Worldwide Protein Data Bank) and leaders of several major funding organizations to discuss the challenges associated with sustaining biological and biomedical (i.e., life sciences) data resources and associated infrastructure.
A strong consensus emerged from the group that core data resources for the life sciences should be supported through a coordinated international effort(s) that better ensure long-term sustainability and that appropriately align funding with scientific impact. Ideally, funding for such data resources should allow for access at no charge, as is presently the usual (and preferred) mechanism.
The Global Life Sciences Data Resources (GLSDR) Working Group has published a letter in Nature and a preprint in bioRxiv on Data Management: A global coalition to sustain core data:
The paper describing the wwPDB OneDep system is now available. The wwPDB has deployed a unified system for deposition, biocuration, and validation of macromolecular structures globally across all wwPDB, EMDB, and BMRB deposition sites to meet the evolving requirements of the scientific community to archive structural data over the coming decades.
The OneDep system provides a user-friendly deposition interface and improved structure validation with the benefit of recommendations from expert task forces representing the respective methodological communities. Biocuration is also processed more efficiently because OneDep supports a more automated workflow.
As Milka Kostic, the Senior Editor at Structure and Cell Chemical Biology, described, OneDep is a step in the right direction and offers a single point of entry into the atomic coordinate deposition process, as well as improving processes of structure validation and data biocuration.
A snapshot of the PDB archive (ftp://ftp.wwpdb.org) as of January 1, 2017 has been added to ftp://snapshots.wwpdb.org/. Snapshots have been archived annually since January 2005 to provide readily identifiable data sets for research on the PDB archive.
The directory 20170101 includes the 125,463 experimentally-determined coordinate files and related experimental data available at that time. Coordinate data are available in PDBx/mmCIF, PDB, and XML file formats. The date and time stamp of each file indicates the last time the file was modified. The snapshot is 757 GB.