A researcher-level metric for data sharing practices, directly analogous to the h-index for publications.
A researcher has an S-Index of n if n of their datasets have SHARE scores of at least n.
Just like the h-index rewards both quantity and quality of publications, the S-Index rewards researchers who consistently share well-documented datasets.
Example Calculation
A researcher has 8 datasets with SHARE scores, sorted descending: [78, 72, 68, 61, 55, 45, 42, 31]

Check each position against its score:

- Position 1: 78 ≥ 1? Yes ✓
- Position 2: 72 ≥ 2? Yes ✓
- ...
- Position 8: 31 ≥ 8? Yes ✓

S-Index = 8 (limited by the number of datasets)

Citation: S-Index(v1.0) = 8 (as of 2026-01-24, n=8 datasets)
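The calculation above is mechanical enough to automate. A minimal sketch in Python (the function name `s_index` and the example scores are illustrative, not part of any published tooling):

```python
def s_index(share_scores):
    """Largest n such that n datasets each have a SHARE score >= n."""
    # Sort scores descending so position i holds the i-th best dataset.
    ranked = sorted(share_scores, reverse=True)
    result = 0
    for position, score in enumerate(ranked, start=1):
        if score >= position:
            result = position  # this position still satisfies score >= position
        else:
            break  # scores only decrease, so no later position can qualify
    return result

# The worked example from above:
scores = [78, 72, 68, 61, 55, 45, 42, 31]
print(s_index(scores))  # → 8 (limited by the number of datasets)
```

Note that the loop can stop at the first failing position: because scores are sorted descending while the position threshold keeps rising, no later dataset can qualify.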
Rewards Quantity + Quality
Achieving a high S-Index requires multiple high-scoring datasets. One perfect dataset isn't enough.
Robust to Outliers
Low-scoring datasets don't drag down a high S-Index, so early-career experiments don't penalize you.
Familiar Model
Works like the h-index researchers already understand. Easy adoption and intuitive comparison.
Gaming Resistant
The index can't be inflated by adding many low-quality datasets, which encourages genuine improvement.
| S-Index | Rating |
|---|---|
| 50+ | Exceptional - Prolific with consistently strong practices |
| 30-49 | Strong - Substantial portfolio of well-documented datasets |
| 15-29 | Developing - Growing portfolio, typical for mid-career |
| 5-14 | Early - Building portfolio, new to open data sharing |
| 1-4 | Beginning - Just starting to share data openly |
| Property | h-Index (Publications) | S-Index (Data Sharing) |
|---|---|---|
| Definition | h papers with ≥ h citations each | n datasets with SHARE score ≥ n each |
| What it measures | Publication impact | Data sharing quality and consistency |
| Scale | Unbounded (typically 0-100+) | 0-100 (bounded by SHARE score max) |
| Cross-field comparison | Difficult (citation norms vary) | Fair (universal signal vocabulary) |
Note: The S-Index measures data sharing practices, not research quality. A high S-Index indicates excellent metadata practices across many datasets. It should be used alongside, not instead of, traditional research quality indicators.