Hello, I'm trying to get a dask cluster going, but a command that worked earlier today is now giving a permission-denied error related to /glade/scratch. I assume this is because /glade/scratch is no more. I'm using the PBSCluster command from dask_jobqueue. Is it possible that something in there is hardcoded to use /glade/scratch and needs to be updated?
I'm also getting that same error
I was able to solve this by modifying the following file
~/.config/dask/ncar-jobqueue.yaml
Update the following lines under casper-dav
log-directory: '/glade/derecho/scratch/${USER}/dask/casper-dav/logs'
local-directory: '/glade/derecho/scratch/${USER}/dask/casper-dav/local-dir'
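Those lines use ${USER}, which (assuming the config loader expands environment variables the way a shell does) resolves to your username at load time. A quick stdlib sketch of the same expansion, for checking what path the new setting will actually point at:

```python
import os

# The YAML values above contain ${USER}; os.path.expandvars reproduces
# shell-style expansion so you can preview the resolved path.
os.environ.setdefault("USER", "someuser")  # fallback if USER is unset
log_dir = os.path.expandvars("/glade/derecho/scratch/${USER}/dask/casper-dav/logs")
print(log_dir)
```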
That fixed it! Thank you so much, Gustavo!! :)
Hmm, this is not resolving the issue for me. I didn't have an ncar-jobqueue.yaml file in ~/.config/dask. I had a jobqueue.yaml file, and I changed all the occurrences of /glade/scratch in there to /glade/derecho/scratch. That didn't work. I then copied Gustavo's ncar-jobqueue.yaml file into ~/.config/dask, and that still didn't work. I restarted jupyterhub each time. Any other thoughts?
@Isla Simpson Yes, /glade/scratch is no longer available. Your dask default settings are either in ~/.config/dask/ncar-jobqueue.yaml (as @Gustavo M Marques suggested) or in ~/.dask/jobqueue.yaml.
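A quick way to see which of those files still reference the retired path (a sketch; it just searches both config locations mentioned above):

```shell
# Search both dask config locations for leftover /glade/scratch references.
# 2>/dev/null silences missing-directory noise; "|| true" treats grep's
# exit status 1 (no matches found) as success.
grep -rn '/glade/scratch' ~/.config/dask ~/.dask 2>/dev/null || true
```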
For you specifically, I checked, and you need to update the local directory or log directory and job extra arguments in the following file:
cat ~/.dask/jobqueue.yaml
distributed:
  comm:
    compression: null
  scheduler:
    bandwidth: 1000000000
  worker:
    memory:
      pause: 0.8
      spill: false
      target: 0.9
      terminate: 0.95
jobqueue:
  pbs:
    cores: 36
    interface: ib0
    job-extra: []
    local-directory: /glade/scratch/islas
    log-directory: /glade/scratch/islas
    memory: 109GB
    name: dask-worker
    processes: 1
    queue: regular
    resource-spec: select=1:ncpus=36:mem=109GB
    walltime: 01:00:00
  slurm:
    cores: 1
    interface: ib0
    job-extra:
      - -C casper
      - -o /glade/scratch/islas/dask-worker.o%J
      - -e /glade/scratch/islas/dask-worker.e%J
    local-directory: /glade/scratch/islas
    log-directory: /glade/scratch/islas
    memory: 25GB
    name: dask-worker
    processes: 1
    walltime: 06:00:00
Also, I noticed the default values in this file are not optimal. I would suggest the following values instead:
  pbs:
    cores: 1
    interface: ext
    job-extra: []
    local-directory: /glade/derecho/scratch/islas
    log-directory: /glade/derecho/scratch/islas
    memory: 4GiB
    name: dask-worker
    processes: 1
    queue: casper
    resource-spec: select=1:ncpus=1:mem=4GB
    walltime: 01:00:00
I would also remove the slurm section, as we are not using it. Please let me know if you have any questions or concerns. :-)
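For anyone starting the cluster from Python rather than relying on the YAML defaults, the suggested values above can also be passed directly to PBSCluster. This is an untested sketch: the keyword names are dask_jobqueue's snake_case equivalents of the YAML keys, and the launch itself is commented out since it needs a live PBS system:

```python
# The suggested YAML defaults above, expressed as explicit keyword arguments.
# dask_jobqueue maps hyphenated YAML keys (e.g. local-directory) to
# snake_case constructor parameters (local_directory).
pbs_kwargs = {
    "cores": 1,
    "processes": 1,
    "memory": "4GiB",
    "queue": "casper",
    "interface": "ext",
    "resource_spec": "select=1:ncpus=1:mem=4GB",
    "walltime": "01:00:00",
    "local_directory": "/glade/derecho/scratch/islas",
    "log_directory": "/glade/derecho/scratch/islas",
}

# Launching requires a PBS scheduler, so it is left commented out here:
# from dask_jobqueue import PBSCluster
# cluster = PBSCluster(**pbs_kwargs)
# cluster.scale(jobs=2)
print(pbs_kwargs["resource_spec"])
```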
Oh, I see. I have two jobqueue.yaml files: one in ~/.dask and one in ~/.config/dask, and I only changed the one in ~/.config/dask. I've changed the correct one now and I'm not getting the error any more. Thanks a lot!
Last updated: May 16 2025 at 17:14 UTC