FITS vs. HDF5 data format

Dear all,

Is there a concise written-up discussion of FITS vs. HDF5 for X-ray data somewhere? I know this was mentioned in the HEAD 19 splinter (which I could not attend because wrong continent …), for example in the context of Athena. Colleagues from optical astronomy are having a general discussion about data formats, and I would like to contribute something useful from the X-ray perspective.

If there is nothing written up, I’m also open to being told stuff or being pointed towards talk slides :wink:

Thanks!

Hi Victoria,

That’s a great topic! I am afraid that I do not have a concise discussion on paper, but I can tell you some things from experience.

A couple of years ago, we started working with HDF5 files in our SRON lab for the TES development for Athena X-IFU. We needed a format to store large volumes of raw TES data and HDF5 does a very good job there. Our software engineers, who worked with FITS before, were really happy with the HDF5 library, which they found much more modern and clear than the cfitsio library. We still use HDF5 for TES data files today and it works really well.

In principle, HDF5 can hold multiple data tables, just like FITS, including metadata (keywords). There is no limit on the number of datasets that I am aware of, and datasets can be nested in groups, much like files in a directory structure. The metadata, equivalent to FITS keywords, can be attached both to HDF5 groups (the rough counterpart of FITS extensions) and to the datasets themselves. The difference from FITS is that a metadata attribute (keyword) in HDF5 only has a name and a value. In FITS, you can also add a comment to a keyword containing, for example, the unit of the value. In HDF5, this is easily solved by creating a second attribute that holds the unit specification. This is the only ‘disadvantage’ that I can see in the HDF5 format itself, and it can be handled easily in file specifications and software.
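To make the "second attribute for the unit" idea concrete, here is a minimal sketch using h5py, the Python binding for HDF5 (the thread does not say which binding was used in the lab, so that choice is an assumption). The `_UNIT` and `_COMMENT` suffixes, the group path `raw/pixel_001`, the `SAMPRATE` keyword and all values are made up for illustration; they are not part of any agreed specification.

```python
import h5py
import numpy as np


def write_keyword(obj, name, value, unit=None, comment=None):
    """Attach a FITS-like keyword to an HDF5 group or dataset.

    HDF5 attributes only carry a name and a value, so the unit and the
    comment go into companion attributes. The "_UNIT" and "_COMMENT"
    suffixes are a hypothetical convention, not a standard.
    """
    obj.attrs[name] = value
    if unit is not None:
        obj.attrs[name + "_UNIT"] = unit
    if comment is not None:
        obj.attrs[name + "_COMMENT"] = comment


def build_demo():
    # driver="core" with backing_store=False keeps the file in memory,
    # so nothing is written to disk for this demonstration.
    with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
        # Groups nest like directories; intermediate groups are created.
        grp = f.create_group("raw/pixel_001")
        write_keyword(grp, "INSTRUME", "TES-DEMO")

        # Placeholder data standing in for a raw detector record.
        dset = grp.create_dataset("current", data=np.zeros(8))
        write_keyword(dset, "SAMPRATE", 156250.0, unit="Hz",
                      comment="sampling rate (made-up value)")

        # Copy the attributes out before the file is closed.
        return dict(grp.attrs), dict(dset.attrs), dset.name


if __name__ == "__main__":
    group_attrs, dataset_attrs, path = build_demo()
    print(path)
    print(group_attrs)
    print(dataset_attrs)
```

Keeping the unit in its own attribute means a reader can get at it without parsing a free-text comment, but it only works if everyone agrees on the naming rule, which is exactly what a file specification would have to fix.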

What makes the adoption of HDF5 in X-ray astronomy challenging:

  • For the adoption of HDF5, one would need a generally accepted file specification for event files, spectra, responses, etc., similar to the OGIP format specification for FITS.
  • One needs to newly develop or add HDF5 support to specific mission software, ftools, ds9, etc. There are no general HDF5 tools available for X-ray astronomy that I know of.

Otherwise, I think HDF5 is a very capable file format that is ready and optimized for use in cluster computing, where it will probably perform much better than FITS, not least thanks to its parallel reading/writing capability. This would be the strongest advantage of HDF5 over FITS.

I hope this information is useful to you and others.

Should this group (HEACIT) try to work on an HDF5 standard? One really easy way to do that might be to just “mirror” OGIP into HDF5 (i.e. use the same keywords and table columns). Then, tools could implement that relatively easily as a “proof of concept”. One would still get some of the HDF5 benefits (e.g. improved speed for certain read operations, reading parts of a data file over the network), and writing a converter to/from the existing OGIP FITS files would be easy. That does not add much new functionality, but it could serve as a way to “try it out” in more packages and with more users.
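As a rough illustration of what such a “mirror” could look like, here is a sketch with h5py and NumPy that writes one event table as an HDF5 group and then reads back only a few rows. It is not a proposed standard: the layout (one group per extension, one dataset per column, a `TUNIT` attribute per column), the helper names `write_events` and `read_rows`, and the toy data are all assumptions made for this example. Only the keyword and column names (`EXTNAME`, `HDUCLASS`, `HDUCLAS1`, `TELESCOP`, `TIME`, `PI`) are borrowed from existing FITS event files.

```python
import os
import tempfile

import h5py
import numpy as np


def write_events(path, header, columns, units):
    """Write one table: header keywords become group attributes,
    each column becomes a dataset (hypothetical layout)."""
    with h5py.File(path, "w") as f:
        ext = f.create_group(header["EXTNAME"])
        for key, value in header.items():
            ext.attrs[key] = value
        for name, data in columns.items():
            dset = ext.create_dataset(name, data=data)
            if name in units:
                # Stand-in for the FITS TUNITn keyword of this column.
                dset.attrs["TUNIT"] = units[name]


def read_rows(path, extname, column, start, stop):
    """Read rows start..stop-1 of one column; only the requested
    slice is returned, not the whole column."""
    with h5py.File(path, "r") as f:
        return f[extname][column][start:stop]


def demo():
    header = {
        "EXTNAME": "EVENTS",
        "HDUCLASS": "OGIP",
        "HDUCLAS1": "EVENTS",
        "TELESCOP": "DEMO",  # made-up mission name
    }
    # Toy data: 1000 events, one every 0.5 s.
    columns = {
        "TIME": np.arange(1000, dtype="f8") * 0.5,
        "PI": np.arange(1000, dtype="i4") % 256,
    }
    units = {"TIME": "s", "PI": "chan"}

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "events.h5")
        write_events(path, header, columns, units)
        return read_rows(path, "EVENTS", "TIME", 10, 13)


if __name__ == "__main__":
    print(demo())
```

A converter from existing OGIP FITS files would fill `header`, `columns` and `units` from the FITS extension instead of from toy data; that step is left out here to keep the sketch free of further dependencies.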

Thanks so much, Jelle! That was super useful!

My next question would have been how easy it would be to implement a converter IF we had an HDF5 standard defined, but Moritz partly answered that with his question.

Moritz - I don’t have strong opinions, as I am also just exploring right now. I do wonder whether somebody has already tried something similar, given that we are certainly not the first ones to discuss HDF5 as a format?

I’m not an X-ray astronomer (my background is theory), so this may be a bit of a red herring, but at the 2019 HDF5 European Workshop, I saw a number of groups show off visualisation tools for a variety of HEP fields (HDF5 European Workshop for Science and Industry, Day 1 - YouTube contains some of them). It might be worth considering what is available in adjacent fields, and seeing whether there is anything that can be reused (or that can at least inform the design of new tools).

Thanks for the suggestion. Indeed there are interesting visualisation tools, also in astronomy, that support HDF5. In our recent HEACIT meeting, CARTA was mentioned as a great tool to view astronomical data cubes. I had a first look and it is really promising, also for X-ray astronomy use. See https://cartavis.org/ for more information.
