Downloading multiple fields asynchronously with astroquery


I’ve been trying to download multiple fields from Gaia using the cone_search_async method, by starting the jobs in one loop and then getting the results in another, as in the example below.

  fields = [
      (250, -46, 2),
      (250, -47, 2),
      (250, -48, 2),
      (250, -49, 2),
      (250, -50, 2),
      (250, -51, 2),
  ]

  fields = [helper(*field) for field in fields]

  gaia_credentials_file = "gaia_credentials.txt"

  with gaia_credentials(gaia_credentials_file):
      results = [get_gaia_catalog(*inputs) for inputs in fields]
      results = [res.get_data() for res in results]

The helper() and get_gaia_catalog() functions just format the arguments, and the gaia_credentials context manager handles logging in and out. (get_gaia_catalog basically loads the coordinates into a SkyCoord and calls Gaia.cone_search_async with the proper parameters.)

I’ve also tried the async/await syntax (with asyncio.gather), but that didn’t work either. Watching the logs, I can see that each job starts and finishes before the next one starts, so everything runs sequentially: the loop above takes 5 s for 1 field and 30 s for all 6 queries, so there is no gain in speed.
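One possible explanation for the asyncio.gather result, assuming the astroquery call blocks until the job has finished (which is what the logs suggest): gather can only overlap coroutines that actually await something, so a blocking call wrapped in a coroutine still runs one after the other. The sketch below shows the effect with a hypothetical stand-in, blocking_query, that just sleeps; it is not the astroquery API. Handing the blocking call to a thread with asyncio.to_thread (Python 3.9+) lets the calls overlap.

```python
import asyncio
import time

FIELDS = [(250, -46 - i, 2) for i in range(6)]


def blocking_query(ra, dec, radius):
    # Hypothetical stand-in for a blocking catalog download (not astroquery).
    time.sleep(0.2)
    return (ra, dec, radius)


async def naive(field):
    # Nothing is awaited here, so the blocking call holds the event loop
    # and the coroutines run one after another.
    return blocking_query(*field)


async def threaded(field):
    # The blocking call runs in a worker thread, so the event loop stays free.
    return await asyncio.to_thread(blocking_query, *field)


async def timed(coro_fn, fields):
    start = time.perf_counter()
    results = await asyncio.gather(*(coro_fn(f) for f in fields))
    return results, time.perf_counter() - start


def compare():
    naive_results, naive_secs = asyncio.run(timed(naive, FIELDS))
    threaded_results, threaded_secs = asyncio.run(timed(threaded, FIELDS))
    return naive_results, naive_secs, threaded_results, threaded_secs
```

Calling compare() returns the same results for both variants, with the naive version taking roughly the sum of all the sleeps and the threaded version noticeably less.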

I think I’m just not using it properly, but I can’t find an example that fetches more than one field at a time.

Can someone give an example of how to fetch multiple fields asynchronously?

Solved the problem by using a ThreadPool on top of astroquery:

from multiprocessing.pool import ThreadPool

fields = [helper(250, -46 - i, 2) for i in range(6)]

gaia_credentials_file = "gaia_credentials.txt"

with gaia_credentials(gaia_credentials_file):
    with ThreadPool() as p:
        # starmap_async returns a single AsyncResult; get() blocks until every call has finished
        res = p.starmap_async(get_gaia_catalog, fields).get()

I don’t know what is happening inside the astroquery API, but the code snippet above gave me a 30x speed gain, and I can clearly see the jobs being initialized one after the other and then downloaded in parallel.
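For anyone who wants to try the pattern without Gaia credentials, here is a minimal, self-contained sketch of the same idea. The function fake_cone_search is a hypothetical stand-in that only sleeps and echoes its arguments; in the real code it would be get_gaia_catalog. The pool size of 6 is also just an assumption, chosen to match the number of fields.

```python
import time
from multiprocessing.pool import ThreadPool


def fake_cone_search(ra, dec, radius):
    # Hypothetical stand-in for a blocking download; sleeps instead of querying.
    time.sleep(0.2)
    return {"ra": ra, "dec": dec, "radius": radius}


def fetch_all(fields, workers=6):
    # starmap unpacks each tuple into positional arguments, blocks until
    # every call has finished, and returns results in input order.
    with ThreadPool(workers) as pool:
        return pool.starmap(fake_cone_search, fields)


fields = [(250, -46 - i, 2) for i in range(6)]

start = time.perf_counter()
catalogs = fetch_all(fields)
elapsed = time.perf_counter() - start
print(f"fetched {len(catalogs)} fields in {elapsed:.2f} s")
```

Threads are enough here because the work is waiting on the network rather than using the CPU, so the calls overlap even with the GIL.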