Past, Present and Future Challenges in Sharing Science: From PhysioNet to Foundation Models

Gari Clifford
Emory University and Georgia Institute of Technology


Abstract

Over the last 25 years, the sharing of data and models for research in cardiology has evolved from sneakernet to the internet: from mailing tapes and compact discs holding a handful of well-curated recordings of an array of arrhythmias to the high-speed download of an entire hospital database. Yet bandwidth and local computation have not kept pace with the rate at which we can strip-mine our data archives. Recently, the trend towards developing large foundation models has required enormous computing power, making large-scale local clusters or centralized cloud instances of data and compute the only viable option for developing such models. Instead of democratizing access to data and compute, as was the intent of the pioneers in this field (such as Mark, Moody and Goldberger), this trend is concentrating innovation in the hands of the rich and powerful. This talk (and article) will present a personal tour of this history, ending with a discussion of the most promising future directions and potential solutions to this concentration of power.