Related Data

Linked datasets, software, or code repositories

Engagement (E)

Interoperable (I3)

Justification

Cross-dataset and data-software links enable integration and computational reproducibility. Four sources converge on this signal.

Practical Guide

should-have

Link related datasets and code to build the knowledge ecosystem.

Datasets in connected knowledge graphs receive 16.5x more citations (RR = 16.46, p < 0.001). This is the same lift as for publication links (E1), because Zenodo records both kinds of link in the same field. The real value is ecosystem connectivity: linked datasets are discoverable through multiple entry points.

For Repositories

Support linked dataset fields (IsPartOf, HasPart, IsSupplementTo)
Enable linking to code repositories (GitHub, GitLab)
Map to DataCite #12 RelatedIdentifier with rich relation types
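
As a minimal sketch of the mapping above, the following Python turns a repository's link field into a DataCite-style relatedIdentifier entry. The property names (relatedIdentifier, relatedIdentifierType, relationType, resourceTypeGeneral) follow the DataCite schema; the function name, the DOI pattern, the restriction to three relation types, and the identifiers are illustrative assumptions, not part of any repository's actual implementation.

```python
import re

# Only the three relation types named in this guide; DataCite defines many more.
ALLOWED_RELATIONS = {"IsPartOf", "HasPart", "IsSupplementTo"}

# Simplified DOI shape check (assumption: good enough for illustration).
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def to_related_identifier(identifier, relation_type, resource_type=None):
    """Build one DataCite-style relatedIdentifier entry (hypothetical helper)."""
    if relation_type not in ALLOWED_RELATIONS:
        raise ValueError(f"unsupported relation type: {relation_type}")
    cleaned = identifier.strip()
    # Strip a resolver prefix so DOIs are stored in bare form.
    bare = re.sub(r"^https?://(dx\.)?doi\.org/", "", cleaned)
    id_type = "DOI" if DOI_PATTERN.match(bare) else "URL"
    entry = {
        "relatedIdentifier": bare if id_type == "DOI" else cleaned,
        "relatedIdentifierType": id_type,
        "relationType": relation_type,
    }
    if resource_type is not None:
        entry["resourceTypeGeneral"] = resource_type
    return entry


# Hypothetical identifiers, for illustration only.
links = [
    to_related_identifier(
        "https://doi.org/10.1234/example-collection", "IsPartOf", "Dataset"
    ),
    to_related_identifier(
        "https://github.com/example-org/example-pipeline", "IsSupplementTo", "Software"
    ),
]
```

Rejecting unknown relation types at deposit time keeps the stored links qualified, which is what the I3 principle asks for.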

For Depositors

Link to related datasets, especially if your data is part of a collection
Link to code repositories (GitHub) used to process or analyze the data
Use HasPart/IsPartOf for multi-file datasets spanning multiple records
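
For a dataset that spans several records, the HasPart/IsPartOf advice above can be sketched as a pair of reciprocal links. This is an illustration only: the helper name and the DOIs are hypothetical, and the entries use DataCite-style property names rather than any specific repository's deposit format.

```python
def reciprocal_part_links(collection_doi, part_dois):
    """Return {record DOI: [related-identifier entries]} for a collection and its parts."""
    links = {collection_doi: []}
    for part in part_dois:
        # The collection record points down to each part...
        links[collection_doi].append(
            {
                "relatedIdentifier": part,
                "relatedIdentifierType": "DOI",
                "relationType": "HasPart",
            }
        )
        # ...and each part record points back up to the collection.
        links[part] = [
            {
                "relatedIdentifier": collection_doi,
                "relatedIdentifierType": "DOI",
                "relationType": "IsPartOf",
            }
        ]
    return links


# Hypothetical DOIs, for illustration only.
part_links = reciprocal_part_links(
    "10.1234/example-collection",
    ["10.1234/example-part-1", "10.1234/example-part-2"],
)
```

Recording the link in both directions means a reader who lands on any single record can reach the rest of the dataset.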

The citation lift (16.5x) is the same as for E1 because both signals share one Zenodo field. Four standards converge on this signal, which extends the knowledge ecosystem beyond publication links.

Standards Sources

Convergence score: 4/4 independent sources (strongly justified)

Standard	Field / Property	Obligation Level
DataCite 4.6	#12 RelatedIdentifier (IsPartOf, HasPart)	Recommended
Dublin Core	Relation	Core Element
schema.org	hasPart / isPartOf	Recommended
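
To illustrate how the three standards in the table express the same link, the sketch below renders one "is part of" relationship in each vocabulary. The function name, the output keys (datacite, dublin_core, schema_org), and the DOIs are assumptions made for this example; the Dublin Core entry uses the unqualified relation element, which carries the target but not the relation type.

```python
import json


def crosswalk_part_of(child_doi, parent_doi):
    """Express 'child is part of parent' in three vocabularies (illustrative)."""
    child_url = f"https://doi.org/{child_doi}"
    parent_url = f"https://doi.org/{parent_doi}"
    return {
        # DataCite: a qualified relation with an explicit relationType.
        "datacite": {
            "relatedIdentifiers": [
                {
                    "relatedIdentifier": parent_doi,
                    "relatedIdentifierType": "DOI",
                    "relationType": "IsPartOf",
                }
            ]
        },
        # Dublin Core: the generic relation element, without a relation type.
        "dublin_core": {"dc:relation": [parent_url]},
        # schema.org: JSON-LD using the isPartOf property.
        "schema_org": {
            "@context": "https://schema.org",
            "@type": "Dataset",
            "@id": child_url,
            "isPartOf": {"@type": "Dataset", "@id": parent_url},
        },
    }


# Hypothetical DOIs, for illustration only.
crosswalk = crosswalk_part_of("10.1234/example-part-1", "10.1234/example-collection")
print(json.dumps(crosswalk["schema_org"], indent=2))
```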

FAIR Principle Alignment

Primary mapping: Interoperable (I3)

I3: (Meta)data include qualified references to other (meta)data

RDA FAIR Data Maturity Model Indicators:

RDA-I3-01D: Data includes references to other data

How This Signal Is Measured

Presence of related dataset DOIs or code repository URLs. The signal is binary: it is met when at least one such link is present.
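
A minimal sketch of this binary check, assuming each record exposes its links as a list of DataCite-style entries. The function name, the list of code hosts, and the rule that any DOI counts are assumptions for illustration; the actual scoring pipeline is not described here.

```python
from urllib.parse import urlparse

# Assumed set of code hosts; the guide names GitHub and GitLab.
CODE_HOSTS = {"github.com", "gitlab.com"}


def has_related_data(related_identifiers):
    """Return True if at least one related DOI or code repository URL is present."""
    for entry in related_identifiers:
        identifier = entry.get("relatedIdentifier", "").strip()
        if not identifier:
            continue
        # Assumption: any DOI counts, since the field does not separate
        # publication links from dataset links.
        if entry.get("relatedIdentifierType", "").upper() == "DOI":
            return True
        host = urlparse(identifier).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        if host in CODE_HOSTS:
            return True
    return False


# Hypothetical record, for illustration only.
example_links = [
    {
        "relatedIdentifier": "https://github.com/example-org/example-pipeline",
        "relatedIdentifierType": "URL",
    }
]
signal_met = has_related_data(example_links)
```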

Empirical Evidence (Zenodo, n=1.3M)

Per-signal statistics use Zenodo as the primary validation source because it is the largest general-purpose repository with structured DataCite metadata, natural variance across all 25 signals, and available citation/usage data. Domain-specific repositories exhibit ceiling effects or restricted variance that preclude per-signal discrimination. Cross-repository validation is reported separately.

Prevalence

54.3%

of Zenodo datasets

Citation Lift

16.5x

vs. datasets without

Data Source

Zenodo (CERN)

1,328,100 records analyzed

Interpretation: Prevalence is the same as for E1 because Zenodo's related_identifiers field captures both publication and data links. Datasets in connected knowledge graphs receive dramatically more citations, which is consistent with ecosystem connectivity driving reuse.