What does this mean?
It will be much easier to find and reuse data if many labels are attached to it. Principle R1 is related to F2, but R1 focuses on the ability of a user (machine or human) to decide if the data is actually USEFUL in a particular context. To make this decision, the data publisher should provide not just metadata that allows discovery, but also metadata that richly describes the context in which the data was generated. This may include the experimental protocols, the manufacturer and brand of the machine or sensor that created the data, the species used, the drug regime, etc. Moreover, R1 states that the data publisher should not attempt to predict the data consumer’s identity and needs. We chose the term ‘plurality’ to indicate that the metadata author should be as generous as possible in providing metadata, even including information that may seem irrelevant.
Some points to take into consideration (non-exhaustive list):
- Describe the scope of your data: for what purpose was it generated/collected?
- Mention any particularities or limitations about the data that other users should be aware of.
- Specify the date of generation/collection of the data, the lab conditions, who prepared the data, the parameter settings, and the name and version of the software used.
- State whether it is raw or processed data.
- Ensure that all variable names are explained or self-explanatory (i.e., defined in the research field’s controlled vocabulary).
- Clearly specify and document the version of the archived and/or reused data.
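As a rough illustration of the points above, a metadata record can be checked against a list of recommended attributes before publication. This is only a sketch: the field names, the `missing_fields` helper, and the example values are assumptions made up for this illustration, not part of R1 or of any standard metadata schema.

```python
from typing import Dict, List

# Hypothetical attribute names, loosely following the checklist above.
# They are illustrative only and do not come from a standard schema.
RECOMMENDED_FIELDS = [
    "purpose",             # scope: why the data was generated/collected
    "limitations",         # particularities other users should know about
    "date_collected",
    "lab_conditions",
    "prepared_by",
    "parameter_settings",
    "software",            # name and version of the software used
    "processing_level",    # raw or processed
    "variables",           # explanation of each variable name
    "data_version",
]


def missing_fields(metadata: Dict[str, object]) -> List[str]:
    """Return the recommended fields that are absent or empty in the record."""
    return [field for field in RECOMMENDED_FIELDS if not metadata.get(field)]


# Made-up example record; every value is a placeholder.
record = {
    "purpose": "Example measurement series for a demonstration",
    "limitations": "",  # left empty on purpose
    "date_collected": "2020-01-01",
    "prepared_by": "Example Lab",
    "parameter_settings": {"sampling_rate_hz": 10},
    "software": {"name": "example-tool", "version": "1.0"},
    "processing_level": "raw",
    "variables": {"temp_c": "Temperature in degrees Celsius"},
    "data_version": "1.0",
}

print(missing_fields(record))
```

Here the check reports `limitations` (present but empty) and `lab_conditions` (absent). In practice the attribute list would come from the controlled vocabulary or community standard of the research field rather than from a hand-written list like this one.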
Links to Resources
- R1 is intended to help you avoid these mistakes: Data Sharing and Management Snafu in 3 Short Acts