The LinkageLibrary is a community and repository for researchers involved in combining datasets. It facilitates comparison of different algorithms and promotes transparency and replicability of research. We invite computer scientists, statisticians, and social, behavioral, economic, and health scientists to deposit code and/or data, and to join the conversation.


  • Accelerate the development of new record linkage and evaluation methods, and their use on real data
  • Improve reproducibility of analyses
  • Develop critical collaborations between researchers, users, and data custodians
  • Help close the gap between research and practice
  • Train the next generation of multi-disciplinary data scientists who can lead the field
  • Build a cross-disciplinary community around data linkage

Three Types of Projects

Data & Code

Data Only

Code Only