DMTN-318
DOI usage for LSST Data Releases
Abstract
Each LSST Data Release must be associated with dataset DOIs. This document describes the motivation and policy for issuing DOIs for Data Releases.
Digital Object Identifiers in Publications
The AAS strongly recommend that all papers submitted to their journals follow the best practices outlined in Chen et al. [2022]. This includes the following exhortation (from section 5.4):
Use DOIs to cite related content if available. This includes specific data sets, software, and services used to produce results in the published articles. Archive these in persistent repositories and link them to the article through DOIs minted by the repositories. The DOI links should be included in the bibliography to ensure proper citation, and also be put where the data are discussed to make it easier for readers to locate and access the data.
Furthermore, when submitting a paper the author is explicitly asked for dataset DOIs as part of the process. This indicates that we should ensure that sufficient DOIs are created for LSST Data Releases and that we are clear with the community that they should be used. DOIs for datasets are extremely useful for citation tracking, if used properly, and will provide us with additional metrics on scientific output from the observatory over and above relying on citations for the data release papers: a citation of the data release dataset implies actual usage of the data, whereas a citation of the data release paper may simply be included as part of a discussion of the survey.
Issuing DOIs
The Department of Energy Office of Scientific and Technical Information provides to DOE grant holders the ability to issue DOIs through their E-Link service.
Given that the NSF-DOE Vera C. Rubin Observatory is part-funded by the DOE we are allowed to use this mechanism to issue DOIs.
For instrument and dataset DOIs it was agreed that we are allowed to use lsst.io as the formal landing pages for each DOI.
Instrument DOIs
OSTI allows instrumentation to be given DOIs. Doing this allows us to reference all our data releases to the instrument that generated the data. This is an extremely useful way to anchor the observatory outputs and science users should be encouraged to cite the instrument DOI explicitly.
As part of the Data Preview 1 activity [O'Mullane, 2025] we demonstrated this by creating a DOI for LSSTComCam [SLAC National Accelerator Laboratory (SLAC) and NSF-DOE Vera C. Rubin Observatory, 2024]. We intend to create DOIs for both LATISS and LSSTCam as well.
Data Release DOIs
Rubin Observatory plans to make annual data releases of the Legacy Survey of Space and Time [Ivezić et al., 2019]. There will always be an umbrella DOI issued for each data release as a whole. From a citation tracking perspective this is the minimum requirement, but we know that there will be very few papers published that really make use of the full set of imaging and catalog data. Furthermore, Independent Data Access Centers (IDACs) [O'Mullane et al., 2021] are not going to be taking copies of the full data release and instead are likely to restrict their holdings to, for example, the object catalog and deep coadd images, or even a version of the object catalog with a subset of the columns (sometimes called “Object Lite”). IDACs will be encouraged to create DOIs for their holdings that explicitly link back to the primary data release archive. To enable this we intend to issue DOIs to every catalog (object, source, forced source, etc.) and every Data Butler [Jenness et al., 2022] dataset type (visit images, difference images, coadds).
An open question is whether to create DOIs for “virtual” Butler data products that can be regenerated on demand.
Relationships
DOIs are much more useful if relationships to other DOIs are specified in the metadata. DataCite and Crossref define how relationships between DOIs are expressed, and we will relate release, dataset, and instrument DOIs in the following ways:
The data release dataset “isCollectedBy” the instrument.
The catalog and dataset type component DOIs will be declared as “isPartOf” the main data release DOI.
If an IDAC creates a DOI for their copy of a component they shall use “isVariantFormOf” the original dataset even if the copy is identical.[1] They shall not use “isPartOf”.
If an IDAC has a cut down version of a catalog (e.g., “object lite”) then that version “isVariantFormOf” the original full catalog.
If a data release includes Butler datasets that contain tabular data (for example in Parquet files) that are also available as catalogs in Qserv, then the catalog DOI “isVariantFormOf” the Butler dataset to indicate that they are almost, but not exactly, the same data and that the Parquet files are the original form.
A new data release “Obsoletes” a previous data release.[2]
A new release’s catalog dataset (e.g., “Object”) also “Obsoletes” the corresponding catalog dataset in the previous release.
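As an illustration only, the relationships listed above could be expressed as DataCite-style `relatedIdentifiers` entries. The sketch below is not the format required by OSTI E-Link, which may differ. The helper names and all `10.99999/example/...` DOIs are hypothetical placeholders; only the LSSTComCam instrument DOI is taken from the references. The relation types are written with a leading capital, as in the DataCite `relationType` vocabulary, and the exact spellings should be checked against the schema version in use.

```python
import json

# Relation types named in this section, capitalized as in the DataCite
# relationType vocabulary (assumption: verify against the schema in use).
RELATION_TYPES = {"IsCollectedBy", "IsPartOf", "IsVariantFormOf", "Obsoletes"}

# Instrument DOI for LSSTComCam, taken from the reference list.
LSSTCOMCAM_DOI = "10.71929/RUBIN/2561361"


def related(doi: str, relation: str) -> dict:
    """Build one DataCite-style relatedIdentifier entry."""
    if relation not in RELATION_TYPES:
        raise ValueError(f"Relation not used by this policy: {relation}")
    return {
        "relatedIdentifier": doi,
        "relatedIdentifierType": "DOI",
        "relationType": relation,
    }


def release_relations(instrument_doi, previous_release_doi=None):
    """Relations for the umbrella data release DOI."""
    rels = [related(instrument_doi, "IsCollectedBy")]
    if previous_release_doi is not None:
        rels.append(related(previous_release_doi, "Obsoletes"))
    return rels


def component_relations(release_doi, previous_component_doi=None,
                        original_form_doi=None):
    """Relations for a catalog or dataset type DOI within a release."""
    rels = [related(release_doi, "IsPartOf")]
    if original_form_doi is not None:
        # e.g. a Qserv catalog derived from Butler Parquet datasets
        rels.append(related(original_form_doi, "IsVariantFormOf"))
    if previous_component_doi is not None:
        rels.append(related(previous_component_doi, "Obsoletes"))
    return rels


def idac_copy_relations(original_doi):
    """Relations for an IDAC copy or cut-down version; never IsPartOf."""
    return [related(original_doi, "IsVariantFormOf")]


if __name__ == "__main__":
    # Hypothetical DOIs for a second release and its object catalog.
    object_catalog = component_relations(
        "10.99999/example/dr2",
        previous_component_doi="10.99999/example/dr1-object",
    )
    print(json.dumps({"relatedIdentifiers": object_catalog}, indent=2))
```

Keeping the IDAC case in its own function makes it harder to attach “isPartOf” to a copy by accident, which the policy above explicitly forbids.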
References
William O'Mullane. Data Preview 1: Definition and planning. Technical Note RTN-085, Vera C. Rubin Observatory, February 2025. URL: https://rtn-085.lsst.io/.
William O'Mullane, Beth Willman, Melissa Graham, Leanne Guy, Robert Blum, and Phil Marshall. Guidelines for Rubin Independent Data Access Centers. Technical Note RTN-003, Vera C. Rubin Observatory, August 2021. URL: https://rtn-003.lsst.io/.
Tracy X. Chen, Marion Schmitz, Joseph M. Mazzarella, Xiuqin Wu, Julian C. van Eyken, Alberto Accomazzi, Rachel L. Akeson, Mark Allen, Rachael Beaton, G. Bruce Berriman, Andrew W. Boyle, Marianne Brouty, Ben H. P. Chan, Jessie L. Christiansen, David R. Ciardi, David Cook, Raffaele D'Abrusco, Rick Ebert, Cren Frayer, Benjamin J. Fulton, Christopher Gelino, George Helou, Calen B. Henderson, Justin Howell, Joyce Kim, Gilles Landais, Tak Lo, Cécile Loup, Barry Madore, Giacomo Monari, August Muench, Anaïs Oberto, Pierre Ocvirk, Joshua E. G. Peek, Emmanuelle Perret, Olga Pevunova, Solange V. Ramirez, Luisa Rebull, Ohad Shemmer, Alan Smale, Raymond Tam, Scott Terek, Doug Van Orsow, Patricia Vannier, and Shin-Ywan Wang. Best Practices for Data Publication in the Astronomical Literature. The Astrophysical Journal Supplement Series, 260(1):5, May 2022. arXiv:2106.01477, doi:10.3847/1538-4365/ac6268.
Željko Ivezić, Steven M. Kahn, J. Anthony Tyson, Bob Abel, Emily Acosta, Robyn Allsman, David Alonso, Yusra AlSayyad, Scott F. Anderson, John Andrew, James Roger P. Angel, George Z. Angeli, Reza Ansari, Pierre Antilogus, Constanza Araujo, Robert Armstrong, Kirk T. Arndt, Pierre Astier, Éric Aubourg, Nicole Auza, Tim S. Axelrod, Deborah J. Bard, Jeff D. Barr, Aurelian Barrau, James G. Bartlett, Amanda E. Bauer, Brian J. Bauman, Sylvain Baumont, Ellen Bechtol, Keith Bechtol, Andrew C. Becker, Jacek Becla, Cristina Beldica, Steve Bellavia, Federica B. Bianco, Rahul Biswas, Guillaume Blanc, Jonathan Blazek, Roger D. Blandford, Josh S. Bloom, Joanne Bogart, Tim W. Bond, Michael T. Booth, Anders W. Borgland, Kirk Borne, James F. Bosch, Dominique Boutigny, Craig A. Brackett, Andrew Bradshaw, William Nielsen Brandt, Michael E. Brown, James S. Bullock, Patricia Burchat, David L. Burke, Gianpietro Cagnoli, Daniel Calabrese, Shawn Callahan, Alice L. Callen, Jeffrey L. Carlin, Erin L. Carlson, Srinivasan Chandrasekharan, Glenaver Charles-Emerson, Steve Chesley, Elliott C. Cheu, Hsin-Fang Chiang, James Chiang, Carol Chirino, Derek Chow, David R. Ciardi, Charles F. Claver, Johann Cohen-Tanugi, Joseph J. Cockrum, Rebecca Coles, Andrew J. Connolly, Kem H. Cook, Asantha Cooray, Kevin R. Covey, Chris Cribbs, Wei Cui, Roc Cutri, Philip N. Daly, Scott F. Daniel, Felipe Daruich, Guillaume Daubard, Greg Daues, William Dawson, Francisco Delgado, Alfred Dellapenna, Robert de Peyster, Miguel de Val-Borro, Seth W. Digel, Peter Doherty, Richard Dubois, Gregory P. Dubois-Felsmann, Josef Durech, Frossie Economou, Tim Eifler, Michael Eracleous, Benjamin L. Emmons, Angelo Fausti Neto, Henry Ferguson, Enrique Figueroa, Merlin Fisher-Levine, Warren Focke, Michael D. Foss, James Frank, Michael D. Freemon, Emmanuel Gangler, Eric Gawiser, John C. Geary, Perry Gee, Marla Geha, Charles J. B. Gessner, Robert R. Gibson, D. Kirk Gilmore, Thomas Glanzman, William Glick, Tatiana Goldina, Daniel A. 
Goldstein, Iain Goodenow, Melissa L. Graham, William J. Gressler, Philippe Gris, Leanne P. Guy, Augustin Guyonnet, Gunther Haller, Ron Harris, Patrick A. Hascall, Justine Haupt, Fabio Hernandez, Sven Herrmann, Edward Hileman, Joshua Hoblitt, John A. Hodgson, Craig Hogan, James D. Howard, Dajun Huang, Michael E. Huffer, Patrick Ingraham, Walter R. Innes, Suzanne H. Jacoby, Bhuvnesh Jain, Fabrice Jammes, M. James Jee, Tim Jenness, Garrett Jernigan, Darko Jevremović, Kenneth Johns, Anthony S. Johnson, Margaret W. G. Johnson, R. Lynne Jones, Claire Juramy-Gilles, Mario Jurić, Jason S. Kalirai, Nitya J. Kallivayalil, Bryce Kalmbach, Jeffrey P. Kantor, Pierre Karst, Mansi M. Kasliwal, Heather Kelly, Richard Kessler, Veronica Kinnison, David Kirkby, Lloyd Knox, Ivan V. Kotov, Victor L. Krabbendam, K. Simon Krughoff, Petr Kubánek, John Kuczewski, Shri Kulkarni, John Ku, Nadine R. Kurita, Craig S. Lage, Ron Lambert, Travis Lange, J. Brian Langton, Laurent Le Guillou, Deborah Levine, Ming Liang, Kian-Tat Lim, Chris J. Lintott, Kevin E. Long, Margaux Lopez, Paul J. Lotz, Robert H. Lupton, Nate B. Lust, Lauren A. MacArthur, Ashish Mahabal, Rachel Mandelbaum, Thomas W. Markiewicz, Darren S. Marsh, Philip J. Marshall, Stuart Marshall, Morgan May, Robert McKercher, Michelle McQueen, Joshua Meyers, Myriam Migliore, Michelle Miller, David J. Mills, Connor Miraval, Joachim Moeyens, Fred E. Moolekamp, David G. Monet, Marc Moniez, Serge Monkewitz, Christopher Montgomery, Christopher B. Morrison, Fritz Mueller, Gary P. Muller, Freddy Muñoz Arancibia, Douglas R. Neill, Scott P. Newbry, Jean-Yves Nief, Andrei Nomerotski, Martin Nordby, Paul O'Connor, John Oliver, Scot S. Olivier, Knut Olsen, William O'Mullane, Sandra Ortiz, Shawn Osier, Russell E. Owen, Reynald Pain, Paul E. Palecek, John K. Parejko, James B. Parsons, Nathan M. Pease, J. Matt Peterson, John R. Peterson, Donald L. Petravick, M. E. Libby Petrick, Cathy E. 
Petry, Francesco Pierfederici, Stephen Pietrowicz, Rob Pike, Philip A. Pinto, Raymond Plante, Stephen Plate, Joel P. Plutchak, Paul A. Price, Michael Prouza, Veljko Radeka, Jayadev Rajagopal, Andrew P. Rasmussen, Nicolas Regnault, Kevin A. Reil, David J. Reiss, Michael A. Reuter, Stephen T. Ridgway, Vincent J. Riot, Steve Ritz, Sean Robinson, William Roby, Aaron Roodman, Wayne Rosing, Cecille Roucelle, Matthew R. Rumore, Stefano Russo, Abhijit Saha, Benoit Sassolas, Terry L. Schalk, Pim Schellart, Rafe H. Schindler, Samuel Schmidt, Donald P. Schneider, Michael D. Schneider, William Schoening, German Schumacher, Megan E. Schwamb, Jacques Sebag, Brian Selvy, Glenn H. Sembroski, Lynn G. Seppala, Andrew Serio, Eduardo Serrano, Richard A. Shaw, Ian Shipsey, Jonathan Sick, Nicole Silvestri, Colin T. Slater, J. Allyn Smith, R. Chris Smith, Shahram Sobhani, Christine Soldahl, Lisa Storrie-Lombardi, Edward Stover, Michael A. Strauss, Rachel A. Street, Christopher W. Stubbs, Ian S. Sullivan, Donald Sweeney, John D. Swinbank, Alexander Szalay, Peter Takacs, Stephen A. Tether, Jon J. Thaler, John Gregg Thayer, Sandrine Thomas, Adam J. Thornton, Vaikunth Thukral, Jeffrey Tice, David E. Trilling, Max Turri, Richard Van Berg, Daniel Vanden Berk, Kurt Vetter, Francoise Virieux, Tomislav Vucina, William Wahl, Lucianne Walkowicz, Brian Walsh, Christopher W. Walter, Daniel L. Wang, Shin-Yawn Wang, Michael Warner, Oliver Wiecha, Beth Willman, Scott E. Winters, David Wittman, Sidney C. Wolff, W. Michael Wood-Vasey, Xiuqin Wu, Bo Xin, Peter Yoachim, and Hu Zhan. LSST: From Science Drivers to Reference Design and Anticipated Data Products. The Astrophysical Journal, 873(2):111, Mar 2019. arXiv:0805.2366, doi:10.3847/1538-4357/ab042c.
Tim Jenness, James F. Bosch, Andrei Salnikov, Nate B. Lust, Nathan M. Pease, Michelle Gower, Mikolaj Kowalik, Gregory P. Dubois-Felsmann, Fritz Mueller, and Pim Schellart. The Vera C. Rubin Observatory Data Butler and pipeline execution system. In Software and Cyberinfrastructure for Astronomy VII, volume 12189 of Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, 1218911. August 2022. arXiv:2206.14941, doi:10.1117/12.2629569.
SLAC National Accelerator Laboratory (SLAC) and NSF-DOE Vera C. Rubin Observatory. LSST Commissioning Camera. 2024. URL: https://www.osti.gov/servlets/purl/2561361/, doi:10.71929/RUBIN/2561361.