SHARE Score

E2

Related Data

Linked datasets, software, or code repositories

Engagement (E)
Interoperable (I3)

Justification

Cross-dataset and data-software links enable integration and computational reproducibility. Four sources converge on this signal.

Practical Guide

should-have

Link related datasets and code. Builds the knowledge ecosystem.

Cross-dataset and data-software links enable integration and computational reproducibility. Datasets in connected knowledge graphs receive 16.5x more citations (RR = 16.46, p < 0.001). This is the same lift as for publication links (E1) because Zenodo captures both kinds of link in the same field. The real value is ecosystem connectivity: linked datasets are discoverable through multiple entry points.

For Repositories

  • Support linked dataset fields (IsPartOf, HasPart, IsSupplementTo)
  • Enable linking to code repositories (GitHub, GitLab)
  • Map to DataCite #12 RelatedIdentifier with rich relation types
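As a rough illustration of the DataCite mapping above, the sketch below builds `relatedIdentifiers` entries in the shape used by DataCite JSON metadata. The helper name `related_identifier`, the DOI, and the GitHub URL are hypothetical placeholders, and the relation types listed are only a subset of DataCite's controlled vocabulary.

```python
import json

# Subset of DataCite relationType values; the full controlled list is longer.
RELATION_TYPES = {"IsPartOf", "HasPart", "IsSupplementTo", "IsSupplementedBy"}
# Subset of DataCite relatedIdentifierType values.
IDENTIFIER_TYPES = {"DOI", "URL"}


def related_identifier(identifier, identifier_type, relation_type, resource_type=None):
    """Build one DataCite-style relatedIdentifier entry (hypothetical helper)."""
    if relation_type not in RELATION_TYPES:
        raise ValueError(f"unsupported relation type: {relation_type}")
    if identifier_type not in IDENTIFIER_TYPES:
        raise ValueError(f"unsupported identifier type: {identifier_type}")
    entry = {
        "relatedIdentifier": identifier,
        "relatedIdentifierType": identifier_type,
        "relationType": relation_type,
    }
    if resource_type is not None:
        entry["resourceTypeGeneral"] = resource_type
    return entry


# Placeholder identifiers: one parent collection and one code repository.
related_identifiers = [
    related_identifier("10.1234/example-collection", "DOI", "IsPartOf", "Dataset"),
    related_identifier(
        "https://github.com/example-org/example-pipeline",
        "URL",
        "IsSupplementedBy",
        "Software",
    ),
]

print(json.dumps({"relatedIdentifiers": related_identifiers}, indent=2))
```

Validating the relation type at entry time is one way a repository could keep links machine-readable rather than free text.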

For Depositors

  • Link to related datasets, especially if your data is part of a collection
  • Link to code repositories (GitHub) used to process or analyze the data
  • Use HasPart/IsPartOf for multi-file datasets spanning multiple records
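For a dataset split across several records, the HasPart/IsPartOf advice above can be sketched as reciprocal links: the collection record lists each part, and each part points back to the collection. The function name `link_collection` and all DOIs are hypothetical; the entry shape assumes DataCite-style `relatedIdentifier` fields.

```python
def link_collection(collection_doi, part_dois):
    """Return {doi: [relatedIdentifier entries]} with reciprocal links.

    The collection gets one HasPart entry per part; every part gets a
    single IsPartOf entry pointing back at the collection.
    """

    def entry(doi, relation):
        return {
            "relatedIdentifier": doi,
            "relatedIdentifierType": "DOI",
            "relationType": relation,
        }

    links = {collection_doi: [entry(part, "HasPart") for part in part_dois]}
    for part in part_dois:
        links[part] = [entry(collection_doi, "IsPartOf")]
    return links


# Placeholder DOIs for a two-part collection.
links = link_collection(
    "10.1234/example-collection",
    ["10.1234/example-part-1", "10.1234/example-part-2"],
)

for doi, entries in links.items():
    for item in entries:
        print(doi, item["relationType"], item["relatedIdentifier"])
```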

Same 16.5x lift as E1 (shared field in Zenodo). Four standards converge. Builds knowledge ecosystem beyond publication links.

Standards Sources

Convergence score: 4/4 independent sources (strongly justified)

| Standard | Field / Property | Obligation Level |
|---|---|---|
| DataCite 4.6 | #12 RelatedIdentifier (IsPartOf, HasPart) | Recommended |
| Dublin Core | Relation | Core Element |
| schema.org | hasPart / isPartOf | Recommended |

FAIR Principle Alignment

Primary mapping: Interoperable (I3)

  • I3: (Meta)data include qualified references to other (meta)data

RDA FAIR Data Maturity Model Indicators:

  • RDA-I3-01D: Data includes references to other data

How This Signal Is Measured

Presence of related dataset DOIs or code repository URLs. Binary: at least one linked.
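A minimal sketch of this binary check follows. It assumes a record carries a `related_identifiers` list whose items have an `identifier` key, which is an assumption about the record shape rather than the scorer's documented input. The function names, the DOI pattern, and the list of code-hosting domains are likewise illustrative. The sketch does not try to distinguish dataset DOIs from publication DOIs.

```python
import re
from urllib.parse import urlparse

# Loose DOI shape check (assumption: prefix "10.", registrant code, suffix).
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
# Code hosts named in the guide; extend as needed.
CODE_HOSTS = {"github.com", "gitlab.com"}


def is_doi(value):
    """True if value looks like a DOI, with or without a resolver prefix."""
    value = value.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if value.lower().startswith(prefix):
            value = value[len(prefix):]
            break
    return bool(DOI_PATTERN.match(value))


def is_code_repo_url(value):
    """True if value is a URL on one of the known code-hosting domains."""
    host = (urlparse(value.strip()).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host in CODE_HOSTS


def has_related_data(record):
    """Binary signal: at least one linked DOI or code repository URL."""
    links = record.get("related_identifiers") or []
    return any(
        is_doi(link.get("identifier", "")) or is_code_repo_url(link.get("identifier", ""))
        for link in links
    )


# Placeholder records for illustration.
linked = {"related_identifiers": [{"identifier": "https://github.com/example-org/example-pipeline"}]}
unlinked = {"related_identifiers": []}
print(has_related_data(linked), has_related_data(unlinked))
```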

Empirical Evidence (Zenodo, n=1.3M)

Per-signal statistics use Zenodo as the primary validation source because it is the largest general-purpose repository with structured DataCite metadata, natural variance across all 25 signals, and available citation/usage data. Domain-specific repositories exhibit ceiling effects or restricted variance that preclude per-signal discrimination. Cross-repository validation is reported separately.

Prevalence: 54.3% of Zenodo datasets

Citation Lift: 16.2x vs. datasets without

Data Source: Zenodo (CERN), 1,328,100 records analyzed

Interpretation: Prevalence is the same as for E1 because Zenodo's related_identifiers field captures both publication and data links. Datasets in connected knowledge graphs receive dramatically more citations, which is consistent with ecosystem connectivity driving reuse.

Quantitative Evidence

Scoring Formula

related_dataset_doi ∈ record → 4 pts

Contribution: 4 of 100 points · Engagement bucket (0–20)

With Signal Present: 720,512 datasets (54.3%), μ = 0.428 citations/dataset

Without Signal: 607,588 datasets (45.7%), μ = 0.026 citations/dataset

Rate Ratio: 16.46 (95% CI: [16.20, 16.73])

P-value: < 0.001 (z = 343.37)

Significance: Positive association

Method: Poisson rate ratio · Source: Zenodo (n = 1,328,100)
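The figures above can be roughly reproduced from the published group sizes and mean citation rates. The sketch below assumes a Wald interval on the log rate ratio, which is a common choice for Poisson rate ratios, though the source does not state which variant was used. The function name is hypothetical, and total citation counts are back-calculated from the rounded means, so results only approximate the reported values.

```python
import math


def poisson_rate_ratio(mean_with, n_with, mean_without, n_without, z_crit=1.959964):
    """Rate ratio with a Wald CI on the log scale (assumed method)."""
    # Approximate total event counts from the rounded per-dataset means.
    events_with = mean_with * n_with
    events_without = mean_without * n_without

    rate_ratio = mean_with / mean_without
    log_rr = math.log(rate_ratio)
    # Standard error of log(RR) for two independent Poisson counts.
    se_log_rr = math.sqrt(1.0 / events_with + 1.0 / events_without)

    return {
        "rate_ratio": rate_ratio,
        "ci_low": math.exp(log_rr - z_crit * se_log_rr),
        "ci_high": math.exp(log_rr + z_crit * se_log_rr),
        "z": log_rr / se_log_rr,
    }


# Inputs taken from the statistics reported in this section.
result = poisson_rate_ratio(0.428, 720_512, 0.026, 607_588)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

With these rounded inputs the sketch should land close to the reported rate ratio, interval, and z statistic; small differences are expected because the exact citation counts are not given.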

Note: Same prevalence as E1: Zenodo’s related_identifiers field captures both publication and data links.
