Anyone else having problems displaying the dask dashboard after ssh-tunneling to Casper? I click on the link (e.g. /proxy/8787/status), and I get the error [Errno 111] Connection refused.
It seems that Dask is running properly despite this problem...
Can you confirm that dask is actually running the dashboard on port 8787?
How do I check?
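One way to check, sketched with only the standard library: try to open a TCP connection to the port and see whether anything answers. The helper name port_is_listening is made up for this example, and "localhost" assumes you run it on the same node as the scheduler (with dask-jobqueue that is normally the node where the notebook kernel runs). A refused connection here is the same [Errno 111] that the proxy reports.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection raises OSError subclasses on failure:
        # ConnectionRefusedError (errno 111 on Linux) or a timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call (8787 is the port from the dashboard link in this thread):
#     port_is_listening("localhost", 8787)
```

If this returns False, the dashboard is either on a different port or not reachable on that interface. The client.dashboard_link attribute of your dask Client shows the address dask itself thinks the dashboard is using.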
Also, can you confirm that you have jupyter-server-proxy in your environment?
How do I check?
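Besides running conda list in a terminal, you can ask the notebook kernel itself, which avoids checking the wrong environment by accident. This is a small sketch using importlib.metadata from the standard library (Python 3.8+); the helper name installed_version is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Example call, run inside the same kernel that creates the dask cluster:
#     installed_version("jupyter-server-proxy")
```

A return value of None means the package is not installed in the environment the kernel is running in.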
How are you setting the dashboard link in dask's configuration?
dask.config.set({'distributed.dashboard.link': '/proxy/8787/status'})
or
dask.config.set({'distributed.dashboard.link': '/proxy/{port}/status'})
????
I'm doing the second one, with {port} in it.
Okay
Also, can you confirm that you have jupyter-server-proxy in your environment?
How about this :point_up:
When I type conda list in my environment, that package is not listed.
Sounds like we found the issue
OK, I will try that out.
You need jupyter-server-proxy for dask.config.set({'distributed.dashboard.link': '/proxy/{port}/status'}) to work.
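To make the difference between the two link settings concrete, here is a rough sketch of how a templated link gets filled in. It only mimics the idea with str.format; that distributed expands the template in essentially this way is an assumption, and expand_dashboard_link is a made-up helper, not part of dask.

```python
def expand_dashboard_link(template: str, port: int, host: str = "127.0.0.1") -> str:
    """Fill a dashboard link template with the actual host and port."""
    # str.format ignores keyword arguments the template does not use,
    # so a template without {host} or {port} is returned unchanged.
    return template.format(host=host, port=port)


# Templated link: follows the dashboard to whatever port it ended up on.
templated = expand_dashboard_link("/proxy/{port}/status", port=8787)

# Hard-coded link: always points at 8787, even if the dashboard is elsewhere.
hard_coded = expand_dashboard_link("/proxy/8787/status", port=41235)
```

Either way the result is a relative /proxy/... path, and it is jupyter-server-proxy that serves that route from the Jupyter server, which is why the link cannot work without the package.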
@Brian Bonnlander, I am noticing similar behavior even when I have jupyter-server-proxy installed. What versions of dask/distributed/bokeh are you running?
# packages in environment at /glade/u/home/bonnland/miniconda3/envs/lens-conversion:
#
# Name                    Version                   Build  Channel
dask                      2.15.0                     py_0    conda-forge
distributed               2.15.2           py38h32f6830_0    conda-forge
bokeh                     1.4.0            py38h32f6830_1    conda-forge
It is our zarrification environment, built a few days ago.
Try downgrading dask and distributed to 2.14.0, and see if the issue goes away.
So, after changing the environment, is a kernel restart sufficient to get the changes? Or do I have to restart the lab?
Try refreshing the lab first
I did conda install dask=2.14.0 distributed=2.14.0, which took a while to solve the environment, but... after hitting the circular 'Refresh' button on the lab and restarting the kernel, the dashboard works! Downgrading was the answer.
Great!... Something weird is going on depending on the versions of dask/distributed/bokeh one is using. I was running into a similar issue with the following:
$ conda list dask
# packages in environment at /glade/work/abanihi/softwares/miniconda3/envs/analysis:
#
# Name                    Version                   Build  Channel
dask                      2.15.0                     py_0    conda-forge
dask-core                 2.15.0                     py_0    conda-forge
dask-jobqueue             0.7.1                      py_0    conda-forge
dask-mpi                  2.0.0                    py37_0    conda-forge
$ conda list bokeh
# packages in environment at /glade/work/abanihi/softwares/miniconda3/envs/analysis:
#
# Name                    Version                   Build  Channel
bokeh                     2.0.1            py37hc8dfbb8_0    conda-forge
I had a similar problem recently. I wonder whether it might be useful to keep track of the working combinations of versions in here, so that when someone runs into problems they can try downgrading / upgrading to the "tested" versions first?
OK, for me these commands got me a working dashboard:
conda activate my-pangeo-environment
conda install dask=2.14.0 distributed=2.14.0 bokeh=1.4.0
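If we do keep a list of tested versions around, a quick check like the following could flag differences before anyone starts debugging. This is only a sketch: TESTED_VERSIONS holds the combination reported as working in this thread, and find_mismatches is a hypothetical helper name.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Combination reported as working in this thread.
TESTED_VERSIONS = {"dask": "2.14.0", "distributed": "2.14.0", "bokeh": "1.4.0"}


def find_mismatches(
    tested: Dict[str, str],
    installed: Optional[Dict[str, str]] = None,
) -> Dict[str, Tuple[Optional[str], str]]:
    """Return {package: (installed_version, tested_version)} where they differ.

    If `installed` is None, versions are looked up in the running environment.
    """
    mismatches = {}
    for name, wanted in tested.items():
        if installed is not None:
            found = installed.get(name)
        else:
            try:
                found = metadata.version(name)
            except metadata.PackageNotFoundError:
                found = None  # package is not installed at all
        if found != wanted:
            mismatches[name] = (found, wanted)
    return mismatches
```

Calling find_mismatches(TESTED_VERSIONS) inside the notebook kernel returns an empty dict when the environment matches the tested combination.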
FYI this worked for me as well, after a few frustrating weeks of not seeing a dashboard. I downgraded to the same versions as @Brian Bonnlander and the dashboard is running again. @Anderson Banihirwe any idea what's going on here? I recall an earlier issue with bokeh but this seems like a dask thing now as well. Are the developers aware?
There were some issues with distributed 2.15.0 and 2.15.1. However, yesterday I ran into a similar issue with the latest version (2.16.0). I haven't had time to narrow down the possible causes....
I am going to try out different versions of distributed, bokeh, and jupyter-server-proxy to see if I can come up with a combination of versions that is problematic, and then I will open an issue upstream.
As an update, it turns out that there were some changes in dask's distributed scheduler codebase that broke the dashboard functionality when the network interface was explicitly specified (under the hood, dask-jobqueue explicitly specifies that dask should use the infiniband interface)....
So, for anyone who is running into this same issue, one way to fix this is to pass dashboard_address='0.0.0.0', which tells the dashboard server to listen on all network interfaces:
cluster = SLURMCluster(...., scheduler_options={"dashboard_address": "0.0.0.0"})
or
cluster = PBSCluster(...., scheduler_options={"dashboard_address": "0.0.0.0"})
or
cluster = NCARCluster(...., scheduler_options={"dashboard_address": "0.0.0.0"})
Or you can wait for the next release of distributed (I think it's going to be 2.19.1), which will include a fix for this issue...
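For anyone wondering why '0.0.0.0' helps: it is the wildcard address, so a server bound to it accepts connections arriving on any interface, including the loopback one that the Jupyter proxy connects through. The sketch below shows this with a plain socket rather than dask, so it illustrates the idea only and is not what distributed does internally; listen_on is a made-up helper.

```python
import socket


def listen_on(address: str) -> socket.socket:
    """Open a listening TCP socket on `address`, letting the OS pick a free port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((address, 0))  # port 0 asks the OS for any free port
    srv.listen(1)
    return srv


# Bind to the wildcard address, then connect through the loopback interface.
server = listen_on("0.0.0.0")
port = server.getsockname()[1]
with socket.create_connection(("127.0.0.1", port), timeout=2.0):
    reachable_via_loopback = True
server.close()
```

A server bound only to one specific interface (for example the InfiniBand one) would not accept that loopback connection, which matches the "Connection refused" seen through the proxy.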
Last updated: May 16 2025 at 17:14 UTC