A ticket was created about Dask workers that finish their work but are left running until they hit the PBS walltime limit. As a result, they abort non-gracefully and generate excessive email warnings.
Is there an agreed-upon way to terminate Dask jobs cleanly, both to use compute resources efficiently and to avoid those emails? See this Dask Discourse thread for related discussion and a possible solution.
I know that using cluster = PBSCluster(..., job_extra_directives=['-m n'])
should fix the email problem (the -m n directive tells PBS to send no mail), but that's not ideal: it only silences the warnings, while the workers still run out their walltime.
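For reference, a minimal sketch of that workaround, assuming dask_jobqueue is installed; the queue and resource values are placeholders to adjust for your site:

```python
from dask_jobqueue import PBSCluster

# Placeholder queue/resource values -- adjust for your site.
cluster = PBSCluster(
    queue="regular",
    cores=4,
    memory="16GB",
    walltime="01:00:00",
    # '-m n' tells PBS to send no mail for these jobs; it silences the
    # warnings but does nothing about the non-graceful walltime abort.
    job_extra_directives=["-m n"],
)
```

This is a configuration sketch, not a runnable example: it needs a PBS scheduler to do anything useful.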
I usually run:
client.close()
cluster.close()
at the bottom of a notebook when I'm done working to shut down my Dask workers. But that requires the extra manual step of remembering to close them. It would be interesting to explore an automated solution (one that doesn't involve letting the wallclock run out).
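One low-effort automated option is to lean on Python context managers, so the close calls happen even if the script or notebook cell errors out; both Client and the cluster objects support the with statement. The sketch below demonstrates the pattern with a stand-in class (FakeResource is hypothetical, purely so the example runs without a PBS system):

```python
from contextlib import ExitStack

class FakeResource:
    """Stand-in for a Dask Client or PBSCluster; in real use you would
    write: with PBSCluster(...) as cluster, Client(cluster) as client:"""
    def __init__(self, name, log):
        self.name, self.log = name, log
    def close(self):
        self.log.append(f"{self.name} closed")

log = []
with ExitStack() as stack:
    cluster = FakeResource("cluster", log)
    stack.callback(cluster.close)   # registered first, runs last
    client = FakeResource("client", log)
    stack.callback(client.close)    # registered last, runs first
    # ... submit work here ...

# Callbacks run in LIFO order: the client closes before the cluster,
# mirroring the manual client.close(); cluster.close() sequence.
print(log)  # ['client closed', 'cluster closed']
```

The LIFO ordering matters: closing the client before the cluster matches the order used manually above.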
Thanks. Per some of the threads online, those commands don't always give an "error-free" exit, even though they do close each Dask task. I see there is also a Worker.close() method, so that might be worth trying too, placed at the end of a set of work assigned to a worker. https://distributed.dask.org/en/stable/worker.html
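Another angle worth mentioning: instead of closing workers from the client side, each worker can be told to close itself gracefully shortly before walltime, so PBS never has to kill it. The dask worker CLI accepts --lifetime and --lifetime-stagger flags for this. A sketch, with parameter names as in recent dask-jobqueue releases (check your installed version) and placeholder times:

```python
from dask_jobqueue import PBSCluster

# Workers shut themselves down after ~55 minutes (staggered by up to
# 5 minutes so they don't all leave at once), comfortably inside the
# 1-hour walltime, so they exit cleanly instead of being killed.
cluster = PBSCluster(
    cores=4,
    memory="16GB",
    walltime="01:00:00",
    worker_extra_args=["--lifetime", "55m", "--lifetime-stagger", "5m"],
)
```

Again a configuration sketch rather than a runnable example; it requires a PBS system to test.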
I'll share with Fred C and maybe update here if they recommend differently.
Last updated: May 16 2025 at 17:14 UTC