
PLANNING AND STRUCTURING YOUR RESEARCH PROJECT
When preparing a new research project, planning in advance is crucial: it is important to think not only about the questions your research will address, its impact and its outputs, but also about the data you will produce and the different phases your data will go through during the project. Thinking about data in advance, even before starting to produce it, helps you manage it better: this applies both to the data generated by the project and to data produced by others that are reused for research.
Writing a Data Management Plan (DMP), a formal document that outlines what you will do with your data during and after a research project, can be very helpful and can facilitate the retrieval, understanding and reuse of your data in the future. Importantly, remember to:
- comply with FAIR principles
- examine research data use regulations that may influence your project, by checking compliance with the rules on ethical principles, privacy and intellectual property
- identify who owns the rights to the data
- recognise potential issues that may arise with AI
DATA COLLECTION AND ORGANIZATION
To collect and organize your research data well, first define standard methodologies and processes. Then structure your data logically, ordering it into organized datasets and naming files according to standard naming conventions. As you start collecting data, store your datasets in appropriate storage spaces that provide periodic backups. Avoid keeping everything in just one place: it is preferable to have several backup copies.
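As an illustration of a naming convention, the sketch below builds file names from a fixed set of elements. The pattern used here (project, description, collection date, version) and the function names `slugify` and `build_filename` are assumptions made for this example, not a prescribed standard; adapt them to the conventions of your discipline or institution.

```python
import re
from datetime import date


def slugify(text: str) -> str:
    """Lowercase the text and replace runs of non-alphanumeric characters with a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project: str, description: str, collected: date,
                   version: int, extension: str) -> str:
    """Assemble a file name as project_description_YYYYMMDD_vNN.ext (hypothetical pattern)."""
    parts = [
        slugify(project),
        slugify(description),
        collected.strftime("%Y%m%d"),   # ISO-like date keeps files sortable
        f"v{version:02d}",              # zero-padded version number
    ]
    return "_".join(parts) + "." + extension.lstrip(".").lower()


print(build_filename("ProjX", "Soil Samples (site A)", date(2024, 3, 15), 2, ".CSV"))
```

Putting the date in year-month-day order and zero-padding the version number means that files sort chronologically and by version in an ordinary directory listing.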
To guarantee the transparency, traceability and reproducibility of your research, in this phase it is also essential to collect all the documentation necessary to understand the data. Start by preparing an exhaustive README file: the file will allow users accessing your data to fully understand the project in terms of data collection methods, time period and key reference points of the data. As you collect and organize data, you should also keep track of the metadata needed to support your data and of the most suitable format to use. You can collect harmonized and persistent documentation by using tools such as an electronic lab notebook.
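A README can start from a simple template. The following is a minimal sketch that renders one from the elements mentioned above; the field names, the layout and the function `render_readme` are illustrative assumptions, and a real README will usually need more detail (for example licensing and contact information).

```python
from typing import Sequence


def render_readme(title: str, collection_methods: str, time_period: str,
                  reference_points: Sequence[str], file_format: str) -> str:
    """Return a plain-text README body built from a few descriptive fields."""
    lines = [
        f"PROJECT: {title}",
        f"DATA COLLECTION METHODS: {collection_methods}",
        f"TIME PERIOD: {time_period}",
        f"FILE FORMAT: {file_format}",
        "KEY REFERENCE POINTS:",
    ]
    # One bullet per reference point, in the order given
    lines.extend(f"- {point}" for point in reference_points)
    return "\n".join(lines) + "\n"


# Example values are placeholders, not real project data
print(render_readme(
    title="Demo project",
    collection_methods="field survey",
    time_period="2023-01 to 2023-06",
    reference_points=["site A", "site B"],
    file_format="CSV",
))
```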
DATA SECURITY, STORAGE AND PRESERVATION
In this phase it is essential to ensure the integrity of your research. This means both guaranteeing its quality (always carry out quality control and check your final data for possible alteration) and ensuring its "physical" preservation. It is crucial to put security measures in place and to use appropriate long-term storage tools to prevent data loss and leaks, and to mitigate the risk of unauthorized access by using proper infrastructure and setting up adequate processes (when using or generating personal data, this is a legal obligation). A retention period of at least 10 years is usually advised: this allows for reproducibility and reuse, and it may also be explicitly required by funders or publishers. The choice of the right storage tool (i.e., a data repository) is also closely related to the final phase of data sharing and reuse.
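One common way to check stored files for alteration is to record a checksum when a file is deposited and to recompute it later. The sketch below uses SHA-256 from the Python standard library; the choice of algorithm and the helper names `sha256_of` and `is_unaltered` are assumptions for illustration, and a repository may provide its own fixity checks.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unaltered(path: Path, recorded_checksum: str) -> bool:
    """Return True if the file's current checksum matches the recorded one."""
    return sha256_of(path) == recorded_checksum
```

In practice, the recorded checksum would be stored alongside the dataset documentation when the file is archived, and `is_unaltered` would be run on each backup copy at regular intervals.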
RESULTS DISSEMINATION AND DATA REUSE