Stream: dask

Topic: Casper PBS


Max Grover (Apr 06 2021 at 23:24):

@all Here is a post detailing how to get started using Dask with PBSCluster on Casper through the new JupyterHub, which launches tomorrow: https://ncar.github.io/esds/posts/casper_pbs_dask/
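The general pattern is roughly the sketch below (the resource values, queue name, and PBS select line are illustrative placeholders, not necessarily what the post recommends; see the post for Casper-specific settings):

    from dask_jobqueue import PBSCluster
    from dask.distributed import Client

    # Each argument describes ONE worker job; dask-jobqueue generates the
    # corresponding #PBS directives and submits the jobs for you.
    cluster = PBSCluster(
        cores=1,                                    # cores per worker
        memory='4GB',                               # memory per worker
        queue='casper',                             # placeholder queue name
        walltime='01:00:00',
        resource_spec='select=1:ncpus=1:mem=4GB',   # placeholder PBS -l line
    )

    cluster.scale(2)          # submit PBS jobs for 2 workers
    client = Client(cluster)  # point dask computations at the new scheduler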

Dafydd Stephenson (Aug 21 2021 at 00:25):

I'm trying to get this working currently but am unsure of the order of things. If anybody has it working, any help would be much appreciated!

Currently I'm starting a JupyterHub session (e.g. Casper batch, 2 nodes, 16 CPUs per node, 100 GiB per node) and then running a version of the example code (from the GitHub page) in the resulting notebook. But it seems strange to me to be providing all the session information again rather than just, say, a job ID from which it could be grabbed. Additionally, the PBSCluster documentation suggests that it sets up a new job from scratch (e.g. passing arguments to #PBS) rather than using the one I already have. Am I doing things the right way around, or asking for a whole new job after already starting one? And (apologies for two questions in one post!) what is the significance of cluster.scale(2)?

Daniel Kennedy (Aug 24 2021 at 20:01):

Hi Dafydd,
I use that PBSCluster approach to get dask workers for my notebooks.

I think of those dask clusters as associated with a specific notebook rather than with a JupyterHub session. So I don't ask for my computational resources when logging into JupyterHub; I just adjust the wallclock. Then I ask for the required number of CPUs within a given notebook with that PBSCluster sequence.

The cluster.scale() function goes and gets your extra computational resources: cluster.scale(2) would deliver 2 workers, each sized by the cores and memory you passed to PBSCluster.
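For example (the worker sizing here is just illustrative):

    from dask_jobqueue import PBSCluster

    # Each worker's size comes from the PBSCluster arguments, e.g. 4 cores
    # and 16GB of memory per worker here.
    cluster = PBSCluster(cores=4, memory='16GB', queue='casper',
                         walltime='01:00:00')

    cluster.scale(2)  # request 2 workers = 8 cores total with this sizing

    # Alternatively, let dask grow and shrink the pool with the workload:
    cluster.adapt(minimum=0, maximum=4)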

But I'm interested to hear if there are other approaches out there.

Dafydd Stephenson (Aug 24 2021 at 21:35):

Hi Daniel,
This makes sense, thanks! I've managed to get this behaving more as expected by launching the notebook on the login node and then requesting the dask workers on the compute nodes from within the notebook. I suppose a related question would then be how to get dask to use an existing set of resources (say for running a python script to process output at the end of a model run which is already using many nodes/cpus), but that's probably one for another stream.

Anderson Banihirwe (Aug 24 2021 at 22:34):

@Dafydd Stephenson,

I suppose a related question would then be how to get dask to use an existing set of resources (say, for running a Python script to process output at the end of a model run which is already using many nodes/CPUs), but that's probably one for another stream.

You may find dask-mpi to be useful for this kind of setup: https://mpi.dask.org/en/latest/
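The basic pattern is small. A sketch (the script name and rank count are placeholders): dask_mpi.initialize() carves an existing MPI job into a scheduler (rank 0), your client code (rank 1), and workers (the remaining ranks).

    # process.py -- run inside an existing MPI allocation, e.g.:
    #     mpirun -np 8 python process.py
    from dask_mpi import initialize

    # Rank 0 becomes the dask scheduler, rank 1 runs the code below as the
    # client, and the remaining ranks become dask workers.
    initialize()

    from dask.distributed import Client

    client = Client()  # connects to the scheduler started by initialize()

    # ... open the model output and process it with dask from here ...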

Dafydd Stephenson (Aug 24 2021 at 22:36):

You may find dask-mpi to be useful for this kind of setup: https://mpi.dask.org/en/latest/

This looks exactly like what I'm looking for and seems very straightforward to set up. Thanks!


Last updated: Jan 30 2022 at 12:01 UTC