Online reading of BinTableHDU subsets of DESI survey

Hi! I am following the astropy documentation on cloud-hosted FITS files to read some .fits files from the DESI survey.

This is an example of the code I am using:

import numpy as np
import astropy
from astropy.io import fits

program = 'dark'
release = 'fuji'
catalogue = 'healpix'
target_ID = 39627835576420141
fits_url = 'https://data.desi.lbl.gov/public/edr/spectro/redux/fuji/zcatalog/zall-pix-fuji.fits'

with fits.open(fits_url, use_fsspec=True) as hdul:

    zCatalogBin = hdul['ZCATALOG']
    targetID_data = zCatalogBin.data['TARGETID']
    program_data = zCatalogBin.data['PROGRAM']
    idx_target = np.where((targetID_data == target_ID) & (program_data == program))[0]

    # Get healpix, survey and redshift
    hpx = zCatalogBin.data['HEALPIX'][idx_target]
    survey = zCatalogBin.data['SURVEY'][idx_target]
    redshift = zCatalogBin.data['Z'][idx_target]

    # Compute the url address
    url_list = []
    for i, idx in enumerate(idx_target):
        hpx_number = hpx[i]
        hpx_ref = f'{hpx_number}'[:-2]
        target_dir = f"/healpix/{survey[i]}/{program}/{hpx_ref}/{hpx_number}"
        coadd_fname = f"coadd-{survey[i]}-{program}-{hpx_number}.fits"

The first time the code reads the data:

targetID_data = zCatalogBin.data['TARGETID']

it takes a long time; the remaining commands run faster. My guess is that it downloads the file, although I cannot find it in the cache directory returned by:

astropy.config.paths.get_cache_dir()

(which is at /home/user/.astropy/cache)

I tried using the recommended .section attribute:

zCatalogBin.section['TARGETID']

But I think that is only available for ImageHDU and not BinTableHDU.

Is there an equivalent high-efficiency approach for tables?

Thanks for any advice.

For tables, the limitation comes from the way they are written on disk, i.e. row by row. Loading a subset of rows should therefore work, but loading only a few columns still requires reading every row.
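To make the row-by-row point concrete, here is a toy model of such a layout using only the standard library. It is a sketch and not astropy's implementation: the two-column row format (an int64 TARGETID followed by a float64 Z) and all function names are assumptions for illustration.

```python
import struct

# Hypothetical two-column row: TARGETID (int64) then Z (float64),
# stored big-endian with no padding between fields.
ROW_FORMAT = ">qd"
ROW_SIZE = struct.calcsize(ROW_FORMAT)  # bytes per row


def pack_table(rows):
    """Serialize (targetid, z) tuples one row after another."""
    return b"".join(struct.pack(ROW_FORMAT, tid, z) for tid, z in rows)


def row_byte_range(start, stop):
    """Rows start..stop-1 occupy one contiguous byte range."""
    return start * ROW_SIZE, stop * ROW_SIZE


def column_byte_ranges(n_rows, field_offset, field_size):
    """A single column is scattered: one small byte range per row."""
    return [
        (i * ROW_SIZE + field_offset, i * ROW_SIZE + field_offset + field_size)
        for i in range(n_rows)
    ]


table = pack_table([(100 + i, 0.1 * i) for i in range(5)])

# Rows 1 and 2 can be fetched with a single contiguous read.
begin, end = row_byte_range(1, 3)
rows = [
    struct.unpack(ROW_FORMAT, table[offset:offset + ROW_SIZE])
    for offset in range(begin, end, ROW_SIZE)
]

# The Z column alone needs one range per row, spread over the whole table.
z_ranges = column_byte_ranges(5, field_offset=8, field_size=8)
```

A row slice maps to one byte range, while a column maps to as many ranges as there are rows, which is why a reader will usually fetch the full rows rather than issue one tiny request per value.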

Hello @saimn! Thank you very much for your reply.

That makes sense. What would be the correct way to request just a few rows at a time in the code above?

If .section is not supported for BinTableHDU, doesn't accessing .data request all the rows and columns?

Thanks again for any advice.

Slicing .data by rows may work, as it does for a local file. But I am not sure this has been tested with a remote file and fsspec, since the main goal was to support images and there is less interest in tables.
