News media play an important role in disseminating scientific publications to general audiences. With the rise of Internet technologies, not only have news sites moved online, but new actors have entered the stage, such as blogs and various social media platforms1. Still, news media remain one of the most common sources for citizens to learn about scientific developments2. Scientific publishers and academic organizations have professionalized the dissemination of science news, e.g., by establishing public information officers (PIOs). PIOs send out press releases to inform the public and the press about noteworthy news or events, e.g., a new publication or other scientific news of general interest. As such, the ‘academic press release’3 does not differ much from press releases known from non-academic areas. Significantly, press embargoes help in timing and synchronizing academic press releases across many news outlets: press releases may only be published after an embargo date, but are released to journalists early to give them time to prepare their reporting. Platforms like EurekAlert! play an important role as brokers between PIOs and journalists: while the former send press releases to EurekAlert!, the latter get early access to press releases through EurekAlert!.
EurekAlert! (https://www.eurekalert.org) is an editorially independent, non-profit, online science news service, launched in 1996 and operated by the American Association for the Advancement of Science (AAAS). It was established to fill a gap noticed by science journalists, press officers, and journal publishers, who wished to use the possibilities of the Internet to send and receive their science research news more broadly4. Nowadays, EurekAlert! disseminates news from universities, medical centers, journals, government agencies, and other research organizations5. It offers press releases in English, French, German, Spanish, Portuguese, Japanese, and Chinese and, as of 2016, had more than 10,000 PIOs and nearly 12,000 journalists registered worldwide6. EurekAlert!’s focus on being an intermediary between journalists and emitters of press releases, providing access to (unredacted) academic press releases at a global scale and across scientific disciplines, makes it stand out from other online news services. AlphaGalileo is a comparable service but lists fewer contributors (2,000) and journalists (7,000)7. This makes EurekAlert! a prime source for studying academic press releases at a global scale.
Because they link the general public to science, academic press releases are of interest to science communication research and have been studied both in small-scale and in large-scale quantitative analyses, e.g., by Autzen8 and Sumner et al.9. Other studies have examined the quality of information in press releases10 and potential differences from the publications they promote11. Academic press releases, and those coming from EurekAlert! in particular, have also become the focus of research in the area of altmetrics, which studies online traces of scientific impact12. Bowman and Hassan13 conducted the first descriptive analysis of EurekAlert! press releases using the Altmetric.com database as a data source, combined with a web-scraping approach. They found that EurekAlert! was the second-largest news source on Altmetric.com mentioning scientific publications. Lemke et al.14 identified a potential association between an article’s performance and certain qualities (structure, accessibility, and engaging narrative) of its press releases.
Since data on academic press releases is not readily available, studies have extracted the necessary data anew each time, applying different approaches13,15. This points to a barrier for large-scale data-driven research on academic press releases, including those from EurekAlert!: the lack of a systematic summary of data, data structures, and research directions. This barrier to reproducibility and to the development of quantitative analyses of academic press releases calls for a structured, comprehensive, open, and well-documented database. Such a database would allow researchers to explore new research questions, test hypotheses, and reproduce results. In fact, the lack of such a database may be seen as an important limitation to the introduction and application of large-scale quantitative approaches in the study of science communication processes. While previous work15 has already presented large-scale analyses of academic press releases, a systematic outline for collecting and structuring the data is still lacking. This paper aims to fill that gap and describes a comprehensive dataset of EurekAlert! press releases16. We provide a detailed description of the collection and curation of EurekAlert! press release metadata and the creation of a relational database for those records, building on the framework discussed by Orduña-Malea and Costas15.
In presenting a data paper, we add to existing examples in the fields of scientometrics and science of science studies17,18,19. For science communication, we expect our contribution to pave the way for new, particularly quantitative, research directions: from descriptive statistics on volume, topics, and contributing organizations to more advanced analyses linking press releases with scientific publications, social media, and citation data. Moreover, the data supports studying potential biases in science communication, such as overrepresentation of certain topics, as well as changes over time in topical coverage, institutional representation, or geographic origin. Comparisons between the content of scientific publications and their corresponding press releases offer a way of examining accuracy, readability, and framing. When combined with altmetrics and citation data, it also becomes possible to assess the downstream impact of press releases in terms of public attention and academic visibility.
Ultimately, we view this dataset16 as a starting point for broader efforts in press release-based science communication research. We invite others to build upon this resource, whether by linking it with additional sources (e.g., ROR, PubMed, or Mendeley), expanding it with multilingual press releases, or using it to explore new hypotheses. By publishing an open dataset and the related scripts and procedures, we follow open science principles20 and aim to facilitate future research by those interested in press releases and science communication dynamics. Ideally, science communication researchers will follow up by producing similar datasets and curation approaches, collectively broadening the analytical scope of science communication research.