Casper dask dashboard errors · jupyterlab-hub

Brian Bonnlander (May 04 2020 at 23:13):

Anyone else having problems displaying the dask dashboard after ssh-tunneling to Casper? I click on the link (e.g. /proxy/8787/status), and I get the error [Errno 111] Connection refused.

Brian Bonnlander (May 04 2020 at 23:16):

Anderson Banihirwe (May 04 2020 at 23:29):

Brian Bonnlander (May 04 2020 at 23:31):

Anderson Banihirwe (May 04 2020 at 23:31):

Anderson Banihirwe (May 04 2020 at 23:32):

dask.config.set({'distributed.dashboard.link': '/proxy/8787/status'})

dask.config.set({'distributed.dashboard.link': '/proxy/{port}/status'})
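The difference between the two settings above is that the second one uses a {port} placeholder instead of a hard-coded port. Here is a minimal sketch of how such a link template could be expanded, assuming plain str.format substitution. The helper name expand_dashboard_link and the sample port values are hypothetical, and distributed's own implementation may differ in detail.

```python
def expand_dashboard_link(template, port, host="127.0.0.1", scheme="http"):
    """Fill the placeholders of a dashboard link template (hypothetical helper)."""
    # str.format ignores unused keyword arguments, so a template may use
    # any subset of {scheme}, {host} and {port}.
    return template.format(scheme=scheme, host=host, port=port)


# Hard-coded port: the link always points at 8787, even if the dashboard
# ended up on another port.
fixed = expand_dashboard_link("/proxy/8787/status", port=9999)

# Templated port: the link follows whatever port is passed in.
templated = expand_dashboard_link("/proxy/{port}/status", port=9999)

print(fixed)      # /proxy/8787/status
print(templated)  # /proxy/9999/status
```

The templated form is the safer choice when several schedulers run on the same node, since only the first one is likely to get port 8787.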

Brian Bonnlander (May 04 2020 at 23:33):

Anderson Banihirwe (May 04 2020 at 23:33):

Brian Bonnlander (May 04 2020 at 23:34):

Anderson Banihirwe (May 04 2020 at 23:35):

Brian Bonnlander (May 04 2020 at 23:35):

Anderson Banihirwe (May 04 2020 at 23:35):

You need jupyter-server-proxy installed for the dask.config.set({'distributed.dashboard.link': '/proxy/{port}/status'}) setting to work
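A quick way to check for the package from a notebook cell is sketched below, assuming its importable module name is jupyter_server_proxy. The helper is_installed is hypothetical. This only inspects the kernel's environment; the extension has to be available to the Jupyter server itself, which may run from a different environment.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in this Python environment."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("jupyter_server_proxy"):
    print("jupyter-server-proxy is importable from this kernel")
else:
    print("jupyter-server-proxy was not found in this kernel's environment")
```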

Anderson Banihirwe (May 05 2020 at 00:26):

@Brian Bonnlander, I am noticing similar behavior even when I have jupyter-server-proxy installed. What versions of dask/distributed/bokeh are you running?

Brian Bonnlander (May 05 2020 at 00:47):

# packages in environment at /glade/u/home/bonnland/miniconda3/envs/lens-conversion:
#
# Name                    Version                   Build  Channel
dask                      2.15.0                     py_0    conda-forge
distributed               2.15.2           py38h32f6830_0    conda-forge
bokeh                     1.4.0            py38h32f6830_1    conda-forge
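The same information can also be read from inside the running kernel, which helps confirm that the notebook is using the environment you think it is. This is a minimal sketch using only the standard library (Python 3.8+). The helper name installed_versions is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("dask", "distributed", "bokeh")):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name:<12} {ver or 'not installed'}")
```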

Anderson Banihirwe (May 05 2020 at 00:50):

Try downgrading dask and distributed to 2.14.0, and see if the issue goes away

Brian Bonnlander (May 05 2020 at 00:54):

So, after changing the environment, is a kernel restart sufficient to get the changes? Or do I have to restart the lab?

Anderson Banihirwe (May 05 2020 at 00:56):

Brian Bonnlander (May 05 2020 at 01:03):

I did conda install dask=2.14.0 distributed=2.14.0, which took a while to solve the environment, but after hitting the circular 'Refresh' button on the lab and restarting the kernel, the dashboard works! Downgrading was the answer.

Anderson Banihirwe (May 05 2020 at 01:06):

Great!... Something weird is going on depending on the versions of dask/distributed/bokeh one is using. I was running into a similar issue with the following:

$ conda list dask
# packages in environment at /glade/work/abanihi/softwares/miniconda3/envs/analysis:
#
# Name                    Version                   Build  Channel
dask                      2.15.0                     py_0    conda-forge
dask-core                 2.15.0                     py_0    conda-forge
dask-jobqueue             0.7.1                      py_0    conda-forge
dask-mpi                  2.0.0                    py37_0    conda-forge

$ conda list bokeh
# packages in environment at /glade/work/abanihi/softwares/miniconda3/envs/analysis:
#
# Name                    Version                   Build  Channel
bokeh                     2.0.1            py37hc8dfbb8_0    conda-forge

Anna-Lena Deppenmeier (May 05 2020 at 16:00):

I had a similar problem recently. I wonder whether it might be useful to keep track of the working combinations of versions in here, so that when someone runs into problems they can try downgrading / upgrading to the "tested" versions first?!

Brian Bonnlander (May 05 2020 at 16:12):

conda activate my-pangeo-environment
conda install dask=2.14.0 distributed=2.14.0 bokeh=1.4.0
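As a sketch of the "tested versions" idea, the pins from the command above can be kept in a small table and compared against what an environment reports. The names TESTED and mismatches are hypothetical, and the "reported" values are the ones posted earlier in this thread.

```python
# Combination reported as working in this thread.
TESTED = {"dask": "2.14.0", "distributed": "2.14.0", "bokeh": "1.4.0"}


def mismatches(installed, tested=TESTED):
    """Return {package: (installed, tested)} for every package that differs from its pin."""
    return {
        name: (installed.get(name), wanted)
        for name, wanted in tested.items()
        if installed.get(name) != wanted
    }


# Versions from the environment that showed the broken dashboard.
reported = {"dask": "2.15.0", "distributed": "2.15.2", "bokeh": "1.4.0"}
print(mismatches(reported))
# {'dask': ('2.15.0', '2.14.0'), 'distributed': ('2.15.2', '2.14.0')}
```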

Riley Brady (May 22 2020 at 16:58):

FYI this worked for me as well, after a few frustrating weeks of not seeing a dashboard. I downgraded to the same versions as @Brian Bonnlander and the dashboard is running again. @Anderson Banihirwe any idea what's going on here? I recall an earlier issue with bokeh but this seems like a dask thing now as well. Are the developers aware?

Anderson Banihirwe (May 22 2020 at 18:00):

There were some issues with distributed 2.15.0 and 2.15.1. However, yesterday I ran into a similar issue with the latest version (2.16.0). I haven't had time to narrow down the possible causes....

Anderson Banihirwe (May 22 2020 at 18:01):

I am going to try out different versions of distributed, bokeh, and jupyter-server-proxy to see if I can come up with a combination of versions that is problematic, and then I will open an issue upstream

Anderson Banihirwe (Jul 02 2020 at 15:44):

As an update, it turns out that some changes in dask's distributed scheduler codebase broke the dashboard when the network interface was explicitly specified (under the hood, dask-jobqueue explicitly tells dask to use the InfiniBand interface).

One way to fix this is to pass dashboard_address='0.0.0.0' in scheduler_options, which tells the dashboard server to listen on all network interfaces:

cluster = SLURMCluster(..., scheduler_options={"dashboard_address": "0.0.0.0"})

cluster = PBSCluster(..., scheduler_options={"dashboard_address": "0.0.0.0"})

cluster = NCARCluster(..., scheduler_options={"dashboard_address": "0.0.0.0"})
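To illustrate why the bind address matters, here is a minimal standard-library sketch (plain sockets, not dask code). It assumes a host where the loopback address 127.0.0.1 is available. The helper can_connect is hypothetical. A listener bound to 0.0.0.0 accepts connections arriving on any interface, including loopback, while connecting to a port with no listener is what produces "Connection refused".

```python
import socket


def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds, False if it is refused."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except ConnectionRefusedError:
        # Same failure the proxy reports when nothing listens at the target address.
        return False


# Listen on all interfaces; port 0 lets the OS pick a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("0.0.0.0", 0))
server.listen(1)
port = server.getsockname()[1]

reachable_while_listening = can_connect("127.0.0.1", port)

# Once the listener is gone, the same address and port should refuse connections.
server.close()
reachable_after_close = can_connect("127.0.0.1", port)

print(reachable_while_listening, reachable_after_close)
```

On a typical Linux machine the first check should succeed and the second should be refused, which mirrors the proxy reaching, or failing to reach, the dashboard.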

Anderson Banihirwe (Jul 02 2020 at 15:47):

Or you can wait for the next release of distributed (I think it's going to be 2.19.1), which will include a fix for this issue...
