Guidelines for protein data
Make your COVID-19 research data useful and accessible to the rest of the research community by publishing it in a public repository together with descriptive metadata.
ELIXIR.SI can support you with Data Management Planning early in your project to make data sharing more efficient, both through personal consultations and by providing a customised tool for creating Data Management Plans. We can also help you identify relevant repositories and common international standards for describing and publishing your data, and guide you through the submission process.
We recommend using the PRIDE repository provided by the ProteomeXchange Consortium. The repository accepts protein and peptide identification and quantification data together with the supporting mass spectra and any other related data types. Submissions are made with the PX Submission Tool.
Other types of proteomics data should also be made available; for these we recommend Slovenian repositories. To make the data useful and ready for analysis and integration, provide a detailed description of the data format and of how the variables are organized. Each protein variable should come with a unique identifier such as a UniProt ID or an ENSG ID, and the versions of the databases used to link the data should be stated.
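As an illustration of the identifier requirement above, the following minimal Python sketch checks that every row of a protein table carries a syntactically valid identifier before submission. The UniProt pattern follows the accession format documented by UniProt, and the Ensembl pattern assumes human gene IDs ("ENSG" followed by 11 digits, optionally with a version suffix). The function names and the column name `protein_id` are hypothetical, and the check only covers syntax; it does not confirm that an identifier exists in the database.

```python
import re

# UniProt accession format as documented in the UniProt help pages.
UNIPROT_RE = re.compile(
    r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)
# Assumption: human Ensembl gene IDs, "ENSG" + 11 digits, optional ".version".
ENSG_RE = re.compile(r"^ENSG[0-9]{11}(?:\.[0-9]+)?$")


def classify_identifier(value):
    """Return 'uniprot', 'ensembl_gene', or None if the syntax is not recognised."""
    value = value.strip()
    if UNIPROT_RE.match(value):
        return "uniprot"
    if ENSG_RE.match(value):
        return "ensembl_gene"
    return None


def find_invalid_rows(rows, id_column="protein_id"):
    """Return the indices of rows whose identifier is missing or malformed."""
    return [
        index
        for index, row in enumerate(rows)
        if classify_identifier(row.get(id_column, "")) is None
    ]


# Hypothetical table rows; the third one has no usable identifier.
rows = [
    {"protein_id": "P0DTC2", "intensity": "1.2e6"},
    {"protein_id": "ENSG00000130234", "intensity": "3.4e5"},
    {"protein_id": "spike", "intensity": "9.9e4"},
]
invalid = find_invalid_rows(rows)
```

Rows reported by `find_invalid_rows` can then be corrected by hand before the table is deposited.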
Metadata provides ‘data about data’, and may include information on the methodology used to collect the data, analytical and procedural information, definitions of variables, units of measurement, any assumptions made, the format and file type of the data, and the software used to collect and/or process the data. Researchers are strongly encouraged to use community metadata standards where these are in place.
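The kinds of information listed above can be kept in a small machine-readable record that travels with the data files. The sketch below is only an illustration: the field names (`methodology`, `variables`, `assumptions`, `file_format`, `software`) and all example values are hypothetical and do not represent a community standard, which should be preferred wherever one exists.

```python
import json

# Hypothetical top-level fields mirroring the categories named in the text.
REQUIRED_FIELDS = ("methodology", "variables", "assumptions", "file_format", "software")


def missing_metadata_fields(record):
    """List top-level fields that are absent and variables lacking a definition or unit."""
    missing = [field for field in REQUIRED_FIELDS if field not in record]
    for name, spec in record.get("variables", {}).items():
        for key in ("definition", "unit"):
            if not spec.get(key):
                missing.append(f"variables.{name}.{key}")
    return missing


# Hypothetical example record; every value here is a placeholder.
example = {
    "methodology": "Label-free quantification by LC-MS/MS",
    "variables": {
        "intensity": {"definition": "Summed peptide intensity", "unit": "arbitrary units"},
    },
    "assumptions": ["Missing values indicate signal below the detection limit"],
    "file_format": "tab-separated text",
    "software": [{"name": "example-search-engine", "version": "0.0"}],
}

problems = missing_metadata_fields(example)
# Serialise the record so it can be stored next to the data files.
sidecar_text = json.dumps(example, indent=2)
```

A record with an empty `problems` list can be saved alongside the data, for example as a JSON file with the same base name as the data file.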
We highly recommend structuring metadata, for example sample metadata, from the very beginning of the project in a way that enables sequence data submission without having to reformat the metadata.