Online reading of BinTableHDU subsets of DESI survey

Vital-Fernandez · January 18, 2024, 10:31pm

Hi! I am following the cloud-hosted .fits documentation to read some .fits files from the DESI survey.

This is an example of the code I am using:

import numpy as np
import astropy
from astropy.io import fits

program = 'dark'
release = 'fuji'
catalogue = 'healpix'
target_ID = 39627835576420141
fits_url = 'https://data.desi.lbl.gov/public/edr/spectro/redux/fuji/zcatalog/zall-pix-fuji.fits'

with fits.open(fits_url, use_fsspec=True) as hdul:
    zCatalogBin = hdul['ZCATALOG']
    targetID_data = zCatalogBin.data['TARGETID']
    program_data = zCatalogBin.data['PROGRAM']
    idx_target = np.where((targetID_data == target_ID) & (program_data == program))[0]

    # Get healpix, survey and redshift
    hpx = zCatalogBin.data['HEALPIX'][idx_target]
    survey = zCatalogBin.data['SURVEY'][idx_target]
    redshift = zCatalogBin.data['Z'][idx_target]

# Compute the url address
url_list = []
for i, idx in enumerate(idx_target):
    hpx_number = hpx[i]
    hpx_ref = f'{hpx_number}'[:-2]
    target_dir = f"/healpix/{survey[i]}/{program}/{hpx_ref}/{hpx_number}"
    coadd_fname = f"coadd-{survey[i]}-{program}-{hpx_number}.fits"

The first time the code reads the data:

targetID_data = zCatalogBin.data['TARGETID']

It takes a long time, while the rest of the commands run faster. My guess is that it downloads the whole file, although I cannot find it in the cache directory returned by:

astropy.config.paths.get_cache_dir()

(which is at /home/user/.astropy/cache)

I tried using the recommended .section attribute:

zCatalogBin.section['TARGETID']

But I think that is only available for ImageHDU and not BinTableHDU.

Is there an equivalent high-efficiency approach for tables?

Thanks for any advice.

saimn · January 22, 2024, 8:09am

For tables, the limitation comes from the way they are written on disk, i.e. row by row. So loading a subset of rows should work, but loading only a few columns still means reading all the rows.
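
To illustrate what "row by row" means in practice, here is a minimal sketch using a NumPy structured array as a stand-in for the on-disk layout of a binary table (FITS stores each row as a fixed-size, big-endian record). The column names come from the thread; the dtypes and values are made up for illustration.

```python
import numpy as np

# Hypothetical three-column table: names mirror the thread, dtypes and
# values are invented for this sketch.
row_dtype = np.dtype([("TARGETID", ">i8"), ("HEALPIX", ">i4"), ("Z", ">f8")])
table = np.zeros(5, dtype=row_dtype)
table["TARGETID"] = np.arange(5)

# Bytes per row (what the FITS header calls NAXIS1).
row_nbytes = table.dtype.itemsize

# A single column is a strided view: consecutive values sit one full row
# apart, so reading the column touches every row of the table.
column = table["TARGETID"]
column_stride = column.strides[0]

# A block of rows, by contrast, is one contiguous byte range.
first, last = 1, 3
raw = table.tobytes()[first * row_nbytes:last * row_nbytes]
rows = np.frombuffer(raw, dtype=row_dtype)

print(row_nbytes, column_stride)
print(rows["TARGETID"])
```

The column stride equals the row size, which is why a remote reader cannot fetch one column with a single small byte range, while a run of rows maps onto exactly one range.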

Vital-Fernandez · January 22, 2024, 3:38pm

Hello @saimn! Thank you very much for your reply.

That makes sense. What would be the correct way to request just a few rows at a time in the code above?

If .section is not supported for BinTableHDU, doesn't .data request all the rows and columns?

Thanks again for any advice.

saimn · January 22, 2024, 5:56pm

Slicing .data by rows may work, as it does for a local file. But I am not sure this has been tested with a remote file and fsspec, since the main goal was to support images and interest in tables is more limited.
