SHARE Score

S1

Geographic Context

Coordinates, place names, bounding boxes

Stewardship (S)
Findable (F2)

Justification

Geographic metadata enables spatial discovery and contextual understanding of datasets. DataCite lists GeoLocation as a Recommended property, supporting coordinates, place names, and bounding boxes. schema.org includes spatialCoverage as a recommended property for Google Dataset Search ranking. Dublin Core’s Coverage element encompasses spatial topics. Together, these three independent standards converge on the importance of geographic context for dataset findability.

Practical Guide

domain-specific

Add location data. Essential for spatial datasets, optional otherwise.

Geographic metadata (coordinates, place names, bounding boxes) helps users discover datasets through spatial searches. Our data shows a 0.34x citation ratio for geo-tagged datasets. This is not because location data hurts, but because geo-tagged datasets serve specialized communities (ecology, geosciences) with lower citation norms. If your data has a geographic dimension, tag it.

Why this signal matters despite the numbers

The citation ratio below 1 (0.34x) reflects community size, not data quality. Geo-tagged datasets serve niche domains with fewer potential citers, and geographic metadata enables spatial discovery that citation counts don't capture.

For Repositories

  • Add optional GeoLocation fields (lat/lon, place name, bounding box)
  • Map to DataCite #18 GeoLocation or schema.org spatialCoverage
  • Auto-suggest geographic enrichment for ecology and environmental datasets

For Depositors

  • Include coordinates or place names if your data has a geographic component
  • Use ISO 3166 country codes for standardized geographic tagging
  • Add bounding boxes for region-level datasets

High value for spatial datasets. Because prevalence is low (5.7%), most general repositories can skip this.
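As a rough illustration of the repository and depositor steps above, the sketch below expresses one bounding box in both target vocabularies. The helper name `bounding_box_metadata` and the sample coordinates are hypothetical. The DataCite keys follow the DataCite JSON representation of GeoLocation, and the schema.org `box` string is assumed to use the "south west north east" ordering; check both against the current specifications before relying on them.

```python
import json


def bounding_box_metadata(place, west, south, east, north):
    """Return (datacite_fragment, schema_org_fragment) for one bounding box.

    Hypothetical helper for illustration; inputs are decimal degrees.
    """
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must lie in [-180, 180]")

    # DataCite-style fragment (property #18 GeoLocation).
    datacite = {
        "geoLocations": [
            {
                "geoLocationPlace": place,
                "geoLocationBox": {
                    "westBoundLongitude": west,
                    "eastBoundLongitude": east,
                    "southBoundLatitude": south,
                    "northBoundLatitude": north,
                },
            }
        ]
    }

    # schema.org fragment; box ordering "south west north east" is an assumption.
    schema_org = {
        "spatialCoverage": {
            "@type": "Place",
            "name": place,
            "geo": {
                "@type": "GeoShape",
                "box": f"{south} {west} {north} {east}",
            },
        }
    }
    return datacite, schema_org


# Illustrative, approximate coordinates only.
datacite, schema_org = bounding_box_metadata("Iceland", -24.6, 63.2, -13.4, 66.6)
print(json.dumps(datacite, indent=2))
print(json.dumps(schema_org, indent=2))
```

Keeping both fragments in one helper is only a convenience for the example; a repository would typically generate each from its own internal location model.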

Standards Sources

Convergence score: 4/4 independent sources · Strongly justified

Standard        Field / Property       Obligation Level
DataCite 4.6    #18 GeoLocation        Recommended
schema.org      spatialCoverage        Recommended
Dublin Core     Coverage (spatial)     Core Element

FAIR Principle Alignment

Primary mapping: Findable (F2)

  • F2: Data are described with rich metadata

RDA FAIR Data Maturity Model Indicators:

  • RDA-F2-01M: Rich metadata is provided to allow discovery

How This Signal Is Measured

Presence of geographic coordinates (lat/lon), place names, country codes, or bounding boxes in dataset metadata. Binary: present or absent.
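A minimal sketch of such a binary check, assuming records arrive as DataCite-style JSON with a `geoLocations` list. The function name `has_geographic_context` is hypothetical, and the source does not say which field carries country codes, so that case is left out here.

```python
BOX_KEYS = (
    "westBoundLongitude",
    "eastBoundLongitude",
    "southBoundLatitude",
    "northBoundLatitude",
)


def has_geographic_context(record: dict) -> bool:
    """Return True if any geoLocations entry has a place, a point, or a full box."""
    for loc in record.get("geoLocations") or []:
        if not isinstance(loc, dict):
            continue

        # A non-empty place name counts as geographic context.
        place = loc.get("geoLocationPlace")
        if isinstance(place, str) and place.strip():
            return True

        # A point needs both latitude and longitude (0.0 is a valid value).
        point = loc.get("geoLocationPoint")
        if isinstance(point, dict) and all(
            point.get(k) is not None for k in ("pointLatitude", "pointLongitude")
        ):
            return True

        # A bounding box needs all four bounds.
        box = loc.get("geoLocationBox")
        if isinstance(box, dict) and all(box.get(k) is not None for k in BOX_KEYS):
            return True

    return False


print(has_geographic_context({"geoLocations": [{"geoLocationPlace": "Reykjavik"}]}))
print(has_geographic_context({"titles": [{"title": "No location here"}]}))
```

Empty strings and partially filled boxes are treated as absent, which is one reasonable reading of "present or absent" but not necessarily the rule the scoring pipeline applies.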

Empirical Evidence (Zenodo, n=1.3M)

Per-signal statistics use Zenodo as the primary validation source because it is the largest general-purpose repository with structured DataCite metadata, natural variance across all 25 signals, and available citation/usage data. Domain-specific repositories exhibit ceiling effects or restricted variance that preclude per-signal discrimination. Cross-repository validation is reported separately.

Prevalence: 5.7% of Zenodo datasets

Citation Lift: 0.3x vs. datasets without

Data Source: Zenodo (CERN), 1,328,100 records analyzed

Interpretation: Datasets with geographic context tend to serve specialized communities (ecology, geosciences) with lower citation norms. The lift below 1 reflects domain specificity, not lower quality. Geographic metadata enables spatial discovery that citations don't capture.

Quantitative Evidence

Scoring Formula

geographic_metadata ∈ record → 4 pts

Contribution: 4 of 100 points · Stewardship bucket (0–20)

With Signal Present: 75,888 datasets (5.7%), μ = 0.085 citations/dataset

Without Signal: 1,252,212 datasets (94.3%), μ = 0.254 citations/dataset

Rate Ratio: 0.34 (95% CI: [0.33, 0.34])

P-value: < 0.001 (z = -87.04)

Significance: Negative association

Method: Poisson rate ratio · Source: Zenodo (n = 1,328,100)
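The figures above can be approximately reproduced with a Wald test on the log rate ratio. This is a sketch under stated assumptions: the page names only "Poisson rate ratio", so the Wald form is a guess at the exact procedure, and the citation totals are reconstructed from the rounded per-dataset means rather than taken from raw counts. The function name `poisson_rate_ratio` is hypothetical.

```python
import math


def poisson_rate_ratio(events_a, exposure_a, events_b, exposure_b, z_crit=1.96):
    """Rate ratio (a vs. b) with a Wald confidence interval and z statistic."""
    rate_a = events_a / exposure_a
    rate_b = events_b / exposure_b
    rr = rate_a / rate_b

    # Standard error of log(RR) for two independent Poisson counts.
    se_log_rr = math.sqrt(1.0 / events_a + 1.0 / events_b)
    log_rr = math.log(rr)

    z = log_rr / se_log_rr
    ci = (
        math.exp(log_rr - z_crit * se_log_rr),
        math.exp(log_rr + z_crit * se_log_rr),
    )
    return rr, ci, z


# Dataset counts and mean citations as reported on this page.
n_with, n_without = 75_888, 1_252_212
citations_with = 0.085 * n_with        # reconstructed from a rounded mean
citations_without = 0.254 * n_without  # reconstructed from a rounded mean

rr, (ci_low, ci_high), z = poisson_rate_ratio(
    citations_with, n_with, citations_without, n_without
)
print(f"rate ratio = {rr:.3f}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}], z = {z:.2f}")
```

With these rounded inputs the point estimate comes out near 0.335 rather than exactly 0.34, while the interval rounds to [0.33, 0.34] and z lands at about -87.04, which is consistent with the reported values having been computed from unrounded means.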
