Discussion notes from HEACIT Splinter at HEAD 19

HEACIT hosted a splinter session at the HEAD 19 meeting on Monday, March 14.

We gave an overview of HEACIT and a general overview of the software landscape in this Google slide deck.

X-ray polarization talk

  • Remember that there will be negative values and zeros in your model
  • Stokes parameters - three detectors (IQU), each with different PHAs, independent calibrations that you need to model simultaneously
  • Every event has angle on the detector → angle on the sky → IQU; weight parameter is needed for likelihood analysis, based on the modulation factor for that event (E, track length)
  • 87 pages of files!! FITS files will look like average FITS files with a few extra pieces of information – angle (direction of the track), track itself (Level 1), weight, modulation factor
  • Level 2: IQU for that event that you can try to model
  • User FTOOLS - post Level 2 for the user to make selections on the data, make a sky map (three levels); ixpepolarization - get a measurement by selecting the region you want in ds9
  • Python version of IXPE: a simulator and an analyzer (Baldini+ 2022); data is in a format usable by XSPEC, so you can use simulated data to try out your tools
  • Weights are in level 2 files, ixpeobssim — provides ARFs and RMFs that are modified by the weights
  • Q and U spectra can be negative
  • Q: Is there public data available now? Yes, Cas A
  • Q: Software release will include all the area response files, etc. Not quite out but out soon.
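To make the angle → IQU step in the notes above concrete, here is a minimal Python sketch. It assumes the common per-event convention q = 2 cos 2φ, u = 2 sin 2φ and a single effective modulation factor for the whole selection, which is a simplification: the notes say the weight depends on the modulation factor of each event. The function names are hypothetical and are not part of the IXPE FTOOLS or ixpeobssim.

```python
import math

def event_stokes(phi):
    """Per-event Stokes parameters (i, q, u) for a track angle phi in radians."""
    return 1.0, 2.0 * math.cos(2.0 * phi), 2.0 * math.sin(2.0 * phi)

def polarization_from_events(angles, weights, modulation_factor):
    """Weighted polarization degree and angle (radians) from event angles.

    Assumption: one effective modulation factor for all events.
    """
    I = Q = U = 0.0
    for phi, w in zip(angles, weights):
        i, q, u = event_stokes(phi)
        I += w * i
        Q += w * q
        U += w * u
    # Q and U are signed sums, so either one can come out negative.
    q_norm = Q / (modulation_factor * I)
    u_norm = U / (modulation_factor * I)
    degree = math.hypot(q_norm, u_norm)
    angle = 0.5 * math.atan2(u_norm, q_norm)
    return degree, angle
```

Because Q and U are signed sums, a binned Q or U spectrum built this way can contain negative values and zeros, which is why the model has to allow for them.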

Discussion about next generation needs:

  • Might want to take advantage of R
  • Still a lot of statistical theory that needs to be developed to do it right
  • How is the efficiency of R? Can encounter some problems, also an issue of non-intersecting communities
  • Let’s talk about Julia: it’s the successor of R, C, Python, a lot more efficient. Runs at C speed. Compiled just-in-time.
  • How easy is Julia to learn? … good question … Version 1 came out in 2018. Version 1.8 is coming out. It’s similar to IDL in the way it works.
  • Need more expert humans (R, Julia, etc.)
  • Strategy to speed up Python: Cython; Numba does just-in-time compiling
  • Maybe the discussion shouldn’t be about languages — we have a problem with the data model, effectively unchanged since the 1980s. Very linear detectors, simplified Poisson statistics. Has been very successful but about to break. XRISM will squeak by because of PSF and small number of pixels. Athena will break everything.
  • Joern doesn’t believe some of these languages will be around in 15 years
  • Factor of 10 increase in data size, complexity, collecting area (high count rate, no longer exactly Poisson); 1000s of spectra at once. XRISM will have fewer spectra. Why would you degrade the spatial resolution if you don’t have to? Cross-talk is an issue. Non-linear behavior. Pileup will be a major issue that people will have to handle to an unprecedented degree.
  • John Zuhone — simulated Athena datasets are enormous, 10s of GB event file. Trying to fit a microcalorimeter data with a single temperature is silly.
  • Athena working groups are really thinking hard about data, should we switch to HDF5 rather than FITS? Working groups are thinking long term.
  • +1 for not using FITS for Athena. All missions are different. But once you constrict yourself to a particular format you constrict yourself. FITS has been limiting.
  • Solar physics colleagues may have dealt with these problems (Parker solar probe, for example … data far ahead of the models as well as enormous datasets). Thought: Not clear that the problem is solved?
  • From the gamma-ray perspective, we also have this issue. There is a pre-processing pipeline and you provide binned data to people. Not everyone needs event lists and processing themselves. There have been two different approaches to combining multi-instruments. Common event format and common definition of what should be in each type of file. Also 3ML is a solution, with different plugins for each instrument.
  • Higher S/N makes the detector systematic response a much more obvious problem. Why not treat it like other wavelengths that operate in a Gaussian space? Our detectors are too non-linear. Is this an issue of not understanding our responses well enough? Or something else? The detector is non-linear because the response is non-static.
  • Right now we are Poisson limited. In the future we will be instrument limited.
  • When will we need to start throwing away data? Transmission limited, storage limited. It shouldn’t be the computation (on the ground) that limits you.
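
As a small illustration of the Poisson versus Gaussian question raised in the discussion, the sketch below compares a Poisson fit statistic with a Gaussian chi-square on binned counts. It assumes the Cash statistic in the form C = 2 Σ [m − d + d ln(d/m)] with no background term, and a Pearson chi-square using the model variance. The function names are hypothetical, and none of the detector effects mentioned above (pileup, cross-talk, non-static response) are modelled.

```python
import math

def cash_statistic(counts, model):
    """Poisson (Cash) fit statistic for observed counts d and model counts m > 0."""
    total = 0.0
    for d, m in zip(counts, model):
        total += m - d
        if d > 0:  # the d*ln(d/m) term tends to 0 as d -> 0
            total += d * math.log(d / m)
    return 2.0 * total

def chi_square(counts, model):
    """Gaussian approximation: Pearson chi-square with model variance."""
    return sum((d - m) ** 2 / m for d, m in zip(counts, model))

# Low counts: the two statistics disagree noticeably.
low = cash_statistic([1, 0, 2], [2.0, 2.0, 2.0]), chi_square([1, 0, 2], [2.0, 2.0, 2.0])
# High counts: they nearly coincide.
high = cash_statistic([10000], [10100.0]), chi_square([10000], [10100.0])
```

In the high-count regime the choice of statistic matters less, which is consistent with the point made above that the limiting factor shifts from counting noise to knowledge of the instrument.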