You may need to use different file formats at different stages of the Research Data Management Lifecycle, but for long-term preservation you will need to store your files in a durable format. This ensures your files can be opened by others (including your future self) using readily available programs, perhaps long after the research project has concluded.
Revisit the data management snafu video in the introduction to the Toolkit to see what can happen when "bad" formats are used.
Durable formats generally include:
- formats endorsed by standards agencies such as Standards Australia and ISO
- open formats developed and maintained by communities of interest, such as OpenDocument Format
- lossless formats
- formats widely used within a given discipline
Risks to long-term access include:
- proprietary formats
- file format and software obsolescence
Discipline-specific or other requirements may oblige you to use software that does not save data in a durable format, e.g. specialised programs to capture or generate data. Where you can do so without losing data integrity, export your data to a more durable format such as plain text and include the export alongside the original files when you archive them. This is often possible: for example, you can export .csv files from SPSS (with value labels) and archive them alongside the .sav files.
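As a rough illustration of exporting with value labels, the following Python sketch writes coded survey data to CSV with each code replaced by its label. It uses only the standard library and does not read .sav files itself (that would need SPSS or a third-party reader, not shown here). The function name `export_labelled_csv`, the sample rows and the label mapping are all hypothetical.

```python
import csv
import io


def export_labelled_csv(rows, value_labels, out):
    """Write rows to CSV, replacing coded values with their labels.

    rows         -- non-empty list of dicts, one per record (hypothetical data)
    value_labels -- {column: {code: label}}; codes without a label are kept as-is
    out          -- any text file-like object opened for writing
    """
    fieldnames = list(rows[0])  # column order is taken from the first record
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        labelled = {
            column: value_labels.get(column, {}).get(value, value)
            for column, value in row.items()
        }
        writer.writerow(labelled)


# Hypothetical coded data, standing in for values held in a .sav file.
rows = [
    {"id": 1, "sex": 1, "agree": 5},
    {"id": 2, "sex": 2, "agree": 3},
]
value_labels = {
    "sex": {1: "Female", 2: "Male"},
    "agree": {1: "Strongly disagree", 3: "Neutral", 5: "Strongly agree"},
}

buffer = io.StringIO()
export_labelled_csv(rows, value_labels, buffer)
csv_text = buffer.getvalue()
```

In practice you would pass an open file instead of the in-memory buffer, and keep the original .sav file in the archive so that nothing is lost if the export turns out to be incomplete.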
It is also important to document the formats in which data were captured and stored, as well as the software used and its version. See the Data Documentation and Metadata section of the Toolkit for more information.
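One possible way to record this information is a small machine-readable manifest stored next to the data. The sketch below is a minimal example using only the Python standard library. The field names, the `MANIFEST.json` file name, and the software name and version in the demo are assumptions made for illustration, not a format prescribed by the Toolkit.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def describe_file(path, software, version):
    """Return a dict recording a file's format, size, checksum and origin."""
    path = Path(path)
    data = path.read_bytes()
    return {
        "file": path.name,
        "format": path.suffix.lstrip(".").lower(),  # taken from the extension
        "bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "software": software,
        "software_version": version,
    }


def write_manifest(entries, manifest_path):
    """Write the list of file descriptions as indented JSON; return the path."""
    manifest_path = Path(manifest_path)
    manifest_path.write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return manifest_path


# Demo in a temporary directory with a hypothetical exported file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "survey.csv"
    sample.write_bytes(b"id,score\n1,5\n")
    entry = describe_file(sample, software="IBM SPSS Statistics", version="29")
    manifest = write_manifest([entry], Path(tmp) / "MANIFEST.json")
    reloaded = json.loads(manifest.read_text(encoding="utf-8"))
```

The checksum lets a future user confirm that a file has not changed since it was archived, while the software fields indicate which program is needed to open the original.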
The ETH-Bibliothek (Swiss Federal Institute of Technology) provides some advice for packaging data into archives in their factsheet Recommendations for uploading data (no longer available online):
Packaged files can be used for archiving large collections of heterogeneous datasets with some provisos: