Proxy on http/https redirect

Hello,

I’m having a problem trying to use fido to download behind a proxy. Both HTTP_PROXY and HTTPS_PROXY environment variables are set, however I still get an ssl_handshake_timeout error when trying to download:

Errors:
(<parfive.results.Error object at 0x7f599ee932c0>
http://jsoc.stanford.edu/SUM66/D1758128850/S00000/hmi.mrmap_latlon_720s_nrt.20240617_192400_TAI.data.fits,
ssl_handshake_timeout should be a positive number, got 0.0)(<parfive.results.Error object at 0x7f599ee93ae0>

I can wget the files with both proxy variables set, but Fido is giving me this error. I noticed that the server is redirecting from http://jsoc.stanford.edu → https://jsoc1.stanford.edu. Could this be proxy related?

Hi @nijusan,

Can you tell me how you are adding the proxy settings in your Python script?

If I recall correctly, the proxy has to be set before anything else (sunpy, for example) is imported.
If you do it after, parfive will not see the environment variables and will error like that.

If that doesn’t work, we need to work out what is going on and try to fix it for you!

Ah, I thought it was picking up the system environment variables. In my .bashrc I have the following:

export HTTP_PROXY=http://proxy.server:9090
export HTTPS_PROXY=http://proxy.server:9090

Is there something I need to do in the Python script itself to pass this info in?

Yeah, you need to manually set it in the Python script.
I am unsure why, but in my experience (at least on a MacBook) Python often doesn’t seem to pick up all the environment variables.

import os
os.environ["HTTP_PROXY"] = "http://proxy.server:9090"
os.environ["HTTPS_PROXY"] = "http://proxy.server:9090"

<rest of imports and python code>

That should hopefully work!
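To confirm what the Python process actually sees, a minimal sketch along these lines might help. It uses only the standard library. The helper name `check_proxy_env` is made up for illustration, and all it does is report each proxy variable that is set and whether its URL parses to a usable host (a URL with an extra slash after the scheme, for example, does not).

```python
import os
from urllib.parse import urlparse

# Variable names commonly consulted for proxies (upper- and lower-case forms).
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy")


def check_proxy_env(environ=os.environ):
    """Return {name: (value, has_host)} for each proxy variable that is set."""
    report = {}
    for name in PROXY_VARS:
        value = environ.get(name)
        if value is None:
            continue
        # A malformed URL such as "http:///host:port" parses with no hostname.
        report[name] = (value, urlparse(value).hostname is not None)
    return report


if __name__ == "__main__":
    for name, (value, ok) in check_proxy_env().items():
        print(f"{name}={value!r} -> {'ok' if ok else 'no host found in URL'}")
```

If nothing is printed when this runs from the same environment as your script, the variables are not reaching Python at all.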

Hi,

Setting the environment variables as stated above doesn’t seem to change the error. I’m still getting the “ssl_handshake_timeout” error from parfive.

I did notice the following in the sunpy release that mentions proxy support:

Download behind proxies

With the release of parfive 1.1, sunpy has been patched to be able to utilize proxy servers when downloading files.

  • Proxy URL is read from the environment variables HTTP_PROXY or HTTPS_PROXY.
  • Proxy Authentication proxy_auth should be passed as an aiohttp.BasicAuth object, explicitly by the user.
  • Proxy Headers proxy_headers should be passed as a dict object, explicitly by the user.

What does the thing about “proxy_auth” needing to be set mean?

Oh, maybe that’s related to a proxy that needs authentication to access? That doesn’t apply to me.

Can you tell me your version of parfive?

Can you also try setting PARFIVE_TOTAL_TIMEOUT?

So

os.environ["PARFIVE_TOTAL_TIMEOUT"] = "100"

and see if that does anything?
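Note that os.environ only accepts string values, so a numeric timeout has to be written as a string. A short sketch of the difference, where 100 is just the example figure used here:

```python
import os

# os.environ values must be str; assigning an int raises TypeError.
try:
    os.environ["PARFIVE_TOTAL_TIMEOUT"] = 100
except TypeError as exc:
    print(f"int rejected: {exc}")

# Passing the number as a string works.
os.environ["PARFIVE_TOTAL_TIMEOUT"] = "100"
print(os.environ["PARFIVE_TOTAL_TIMEOUT"])
```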

The parfive version is 2.1.0.

Interesting: when I change the timeout value as you suggested, the redirect seems to occur and then I get a certificate error:

(<parfive.results.Error object at 0x7fde3397d0e0>
http://jsoc.stanford.edu/SUM2/D1758449218/S00000/hmi.mrmap_latlon_720s_nrt.20240618_164800_TAI.data.fits,
Cannot connect to host jsoc1.stanford.edu:443 ssl:True [SSLCertVerificationError: (1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1133)')])(<parfive.results.Error object at 0x7fde2e695b30>

Before setting that I saw no mention of “jsoc1.stanford.edu:443”:

(<parfive.results.Error object at 0x7fe6d8047180>
http://jsoc.stanford.edu/SUM2/D1758449218/S00000/hmi.mrmap_latlon_720s_nrt.20240618_164800_TAI.data.fits,
ssl_handshake_timeout should be a positive number, got 0.0)(<parfive.results.Error object at 0x7fe6d80479a0>

So overall, the combination of all three environment variables doesn’t work?

Unfortunately, I am out of ideas. These three are what I have to set for Fido to work behind my work proxy, so I don’t know what else to suggest.

Thanks for the help to this point. I’m thinking maybe the proxy server is overriding my certificate chain? If I find the cause I’ll update this thread.

The weird thing is that if I take the failed file links from above and plug them into wget or Chrome, they download just fine.

Chrome should be navigating the proxy automatically based on the system settings.
wget, as you said, is getting the proxy settings from the shell, so it works via the proxy.

I can’t explain why it’s not working in the Python script with everything set at the top.

The only other thing I can think of is that SSL error. I have seen that happen on macOS and people have some workarounds for it: python - SSL: CERTIFICATE_VERIFY_FAILED certificate verify failed: unable to get local issuer certificate (_ssl.c:1129)'))) - Stack Overflow

Though the workaround should be platform agnostic.
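Before trying a workaround, it may help to see where this Python looks for trusted CA certificates, since that can differ from what Chrome or wget use. This is only a sketch using the standard ssl module. `describe_ca_sources` is a hypothetical helper name, and checking REQUESTS_CA_BUNDLE is an assumption that only matters if requests is involved.

```python
import os
import ssl


def describe_ca_sources():
    """Collect where this Python looks for trusted CA certificates."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,  # CA bundle file actually found, or None
        "capath": paths.capath,  # CA directory actually found, or None
        "openssl_cafile": paths.openssl_cafile,  # compiled-in default file
        "openssl_capath": paths.openssl_capath,  # compiled-in default directory
        # Environment overrides, if any are set.
        "SSL_CERT_FILE": os.environ.get("SSL_CERT_FILE"),
        "SSL_CERT_DIR": os.environ.get("SSL_CERT_DIR"),
        "REQUESTS_CA_BUNDLE": os.environ.get("REQUESTS_CA_BUNDLE"),
    }


if __name__ == "__main__":
    for key, value in describe_ca_sources().items():
        print(f"{key}: {value}")
```

If the proxy re-signs HTTPS traffic with its own certificate, that issuer would need to be in one of these locations for verification to succeed.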

It might be worth just writing a Python script using requests to download the file and seeing if setting the proxy works for it.
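Something like the following could serve as that test script. It is a rough sketch, not a drop-in replacement for Fido: `proxies_from_env` and `download` are names I made up, and it assumes the requests package is installed. requests normally reads the proxy variables on its own, so passing them explicitly here is just to make the test unambiguous.

```python
import os


def proxies_from_env(environ=os.environ):
    """Build a requests-style proxies dict from HTTP_PROXY / HTTPS_PROXY."""
    proxies = {}
    for scheme in ("http", "https"):
        value = environ.get(f"{scheme.upper()}_PROXY") or environ.get(f"{scheme}_proxy")
        if value:
            proxies[scheme] = value
    return proxies


def download(url, dest, timeout=60):
    """Stream url to dest through the configured proxy, following redirects."""
    import requests  # third-party; imported here so the helper above works without it

    with requests.get(url, proxies=proxies_from_env(), stream=True, timeout=timeout) as resp:
        resp.raise_for_status()
        # resp.history lists any redirects (e.g. http -> https) that were followed.
        print("redirects:", [r.url for r in resp.history], "final:", resp.url)
        with open(dest, "wb") as fh:
            for chunk in resp.iter_content(chunk_size=1 << 20):
                fh.write(chunk)
    return dest
```

Calling download() with one of the failing links and a local filename should show whether the redirect and the certificate check behave the same way outside of parfive.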