Bibliographic References
Cited works with DOIs, linked methods papers
Justification
References to methods papers and related works create knowledge graphs. DataCite recommends RelatedIdentifier with rich relation types. Dublin Core includes Relation and Source as core elements. schema.org lists citation as a recommended property. RDA-I3-01M and RDA-I3-03M (both rated Important) call for references to other metadata, with RDA-I3-03M requiring that they be qualified.
Practical Guide
Link referenced works to build knowledge graph connections.
Bibliographic references connect datasets to published works. Their citation impact is neutral (RR = 1.01, p = 0.666): having references does not directly predict citations. Their value lies in contextual discovery, because linking your dataset to methods papers and related literature helps users understand what your data is for and how to use it.
For Repositories
- Support structured reference fields with DOI/PMID identifiers
- Map to DataCite #12 RelatedIdentifier with rich relation types
- Auto-extract references from uploaded papers when possible
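The DataCite mapping in the list above could be sketched as follows. This is a minimal sketch, not any repository's actual ingestion code: the function name `to_related_identifier`, the loose DOI pattern, and the rule that any all-digit string is treated as a PMID are assumptions made for illustration. The output keys (`relatedIdentifier`, `relatedIdentifierType`, `relationType`) and the `References` relation type follow the DataCite schema.

```python
import re

# Loose DOI shape check ("10.", registrant code, "/", suffix).
# Assumption for illustration only, not an official validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def to_related_identifier(identifier: str, relation_type: str = "References") -> dict:
    """Turn a DOI or PMID string into a DataCite-style relatedIdentifier entry."""
    ident = identifier.strip()
    # Remove common DOI resolver/prefix forms so only the bare DOI remains.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if ident.lower().startswith(prefix):
            ident = ident[len(prefix):]
            break
    if DOI_PATTERN.match(ident):
        id_type = "DOI"
    elif ident.isdigit():
        # Assumption: a purely numeric identifier is a PubMed ID.
        id_type = "PMID"
    else:
        raise ValueError(f"Unrecognised identifier: {identifier!r}")
    return {
        "relatedIdentifier": ident,
        "relatedIdentifierType": id_type,
        "relationType": relation_type,
    }
```

For example, `to_related_identifier("https://doi.org/10.1234/example", "IsDescribedBy")` yields an entry for a methods paper; the DOI here is a placeholder, not a real work.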
For Depositors
- Link to methods papers and key publications that informed your data
- Use DOIs for references when available for machine readability
- Include both the dataset's source papers and downstream analysis publications
The knowledge graph value is real, but the citation evidence is neutral. Because only 0.9% of datasets include references, adding them sets a dataset apart.
Standards Sources
Convergence score: 4/4 independent sources
| Standard | Field / Property | Obligation Level |
|---|---|---|
| DataCite 4.6 | #12 RelatedIdentifier | Recommended |
| Dublin Core | Relation / Source | Core Element |
| schema.org | citation | Recommended |
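To illustrate how the three fields in the table line up, here is a hedged sketch that expresses one referenced paper in each vocabulary. The function name `crosswalk`, the output layout, and the choice of `ScholarlyArticle` as the cited type are assumptions for illustration; real exports depend on the repository's serializer.

```python
def crosswalk(doi: str) -> dict:
    """Express one cited DOI as DataCite, Dublin Core, and schema.org metadata."""
    url = f"https://doi.org/{doi}"
    return {
        # DataCite #12 RelatedIdentifier with an explicit relation type.
        "datacite": {
            "relatedIdentifiers": [
                {
                    "relatedIdentifier": doi,
                    "relatedIdentifierType": "DOI",
                    "relationType": "References",
                }
            ]
        },
        # Dublin Core Relation element, given as a resolvable URL.
        "dublin_core": {"relation": [url]},
        # schema.org citation property on a Dataset (JSON-LD shape).
        "schema_org": {
            "@context": "https://schema.org",
            "@type": "Dataset",
            "citation": [{"@type": "ScholarlyArticle", "@id": url}],
        },
    }
```

Calling `crosswalk("10.1234/example")` with a placeholder DOI returns all three representations. Only the DataCite form carries a relation type, which is what makes the reference qualified in the sense of FAIR principle I3.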
FAIR Principle Alignment
Primary mapping: Interoperable (I3)
- I3: (Meta)data include qualified references to other (meta)data
RDA FAIR Data Maturity Model Indicators:
- RDA-I3-01M: Metadata includes references to other metadata
- RDA-I3-03M: Metadata includes qualified references to other metadata
How This Signal Is Measured
Presence of references with identifiers (DOIs, PMIDs). Binary: at least one reference present.
Empirical Evidence (Zenodo, n=1.3M)
Per-signal statistics use Zenodo as the primary validation source because it is the largest general-purpose repository with structured DataCite metadata, natural variance across all 25 signals, and available citation/usage data. Domain-specific repositories exhibit ceiling effects or restricted variance that preclude per-signal discrimination. Cross-repository validation is reported separately.
- Prevalence: 0.9% of Zenodo datasets
- Citation lift: 1.0x vs. datasets without the signal
- Data source: Zenodo (CERN), 1,328,100 records analyzed
Interpretation: Neutral citation lift but bibliographic references create knowledge graphs enabling contextual discovery. Value is in network effects, not direct citation prediction.
Quantitative Evidence
Scoring Formula
bibliographic_references.length ≥ 1 → 4 pts
Contribution: 4 of 100 points · Harmonization bucket (0–20)
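The scoring rule can be sketched as a small function. This is an illustrative sketch under stated assumptions: the record shape (dicts with optional `doi` / `pmid` keys) and the function names are hypothetical, a score of 0 when the signal is absent is implied rather than stated, and counting only references that carry an identifier follows the measurement description rather than the formula itself.

```python
SIGNAL_POINTS = 4  # from the scoring formula: 4 of 100 points


def has_identifier(ref: dict) -> bool:
    """True if the reference carries a DOI or PMID (hypothetical field names)."""
    return bool(ref.get("doi") or ref.get("pmid"))


def score_bibliographic_references(bibliographic_references: list[dict]) -> int:
    """Binary signal: full points for at least one identified reference, else 0."""
    identified = [ref for ref in bibliographic_references if has_identifier(ref)]
    return SIGNAL_POINTS if len(identified) >= 1 else 0
```

Because the signal is binary, a dataset with one identified reference scores the same as one with fifty.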
| Group | Datasets | Share | Mean citations per dataset (μ) |
|---|---|---|---|
| With signal present | 11,436 | 0.9% | 0.246 |
| Without signal | 1,316,664 | 99.1% | 0.244 |

- Rate ratio: 1.01 (95% CI: 0.97–1.05)
- P-value: 0.666 (z = 0.43)
- Method: Poisson rate ratio · Source: Zenodo (n = 1,328,100)
Note: Not statistically significant (p = 0.666). Value is in knowledge graph connectivity and contextual discovery, not direct citation prediction.
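For readers who want to check the arithmetic, a Poisson rate ratio with a Wald interval on the log scale can be sketched as below. The choice of the Wald interval and the function name are assumptions; the source states only "Poisson rate ratio". Citation totals are not reported, so any inputs back-calculated from the rounded means (for example 0.246 × 11,436 ≈ 2,813 and 0.244 × 1,316,664 ≈ 321,266) reproduce the published figures only approximately.

```python
import math


def poisson_rate_ratio(events_a: int, n_a: int, events_b: int, n_b: int,
                       z_crit: float = 1.96):
    """Rate ratio of group A vs. group B with a Wald CI and two-sided p-value."""
    rate_a = events_a / n_a
    rate_b = events_b / n_b
    rr = rate_a / rate_b
    log_rr = math.log(rr)
    # Standard error of log(RR) for two independent Poisson counts.
    se_log = math.sqrt(1 / events_a + 1 / events_b)
    z = log_rr / se_log
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    ci = (math.exp(log_rr - z_crit * se_log), math.exp(log_rr + z_crit * se_log))
    return rr, ci, z, p_value


# Event counts below are approximations derived from the rounded means.
rr, ci, z, p_value = poisson_rate_ratio(2813, 11436, 321266, 1316664)
```

The interval is wide relative to the effect because the group with the signal is small: its roughly 2,800 citations dominate the standard error.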