is anyone else having trouble spinning up a server on casper using jupyterhub.ucar.edu? I don't see any requests in squeue...
I was able to get in a few minutes ago... but I only requested a single node
@Deepak Cherian, I am able to get in as well. Can you confirm that your gladequota is okay?
I had trouble yesterday. Eventually it worked in the afternoon and I am not having trouble today.
yup gladequota looks great. is there a log somewhere? The website says "Spawning server..."
now it's failed with a timeout
The logs reside in /glade/scratch/$USER/.jupyter_logs
However, I don't think they show up until the job is up and running
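For anyone who wants to check quickly whether a job has written anything yet, here is a minimal sketch that lists the newest files in that directory. The path is the one quoted above; the function name `newest_logs` is made up for illustration, and an empty result is only a hint that no job has started writing logs.

```python
import os
from pathlib import Path


def newest_logs(log_dir, limit=5):
    """Return up to `limit` files in `log_dir`, most recently modified first."""
    directory = Path(log_dir)
    if not directory.is_dir():
        # No directory yet usually means no job has started writing logs.
        return []
    files = [p for p in directory.iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


if __name__ == "__main__":
    # Path taken from the message above; $USER is expanded from the environment.
    log_dir = os.path.expandvars("/glade/scratch/$USER/.jupyter_logs")
    for path in newest_logs(log_dir):
        print(path)
```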
Are you using the default settings?
ya just tried that.
I wonder if the queue is simply choked up? Many of the Casper nodes are now on PBS—I presume this means they are unavailable via Slurm.
Do all the requests use Slurm by default?
CISL is in the process of transitioning Casper from all Slurm to all PBS. Perhaps @Brian Vanderwende or @mickc have some insight here.
Approximately half of the nodes have been moved from Slurm to PBS, so indeed you will see slower dispatch times than normal during busy hours until things are totally migrated to PBS. I expect that this will get better as the week goes along as folks migrate their traditional jobs.
The production JupyterHub still uses Slurm (I forgot to mention that explicitly!)
@all, in case you missed it, jupyterhub.ucar.edu now redirects to jupyterhub.hpc.ucar.edu; the Hub has been updated and there is a new interface for spawning a server (or multiple servers via the control panel).
I am having issues creating a Jupyter Notebook on Cheyenne.
When I try to create a new notebook, I am getting the error:
"Unexpected error while saving file: Untitled.ipynb attempt to write a readonly database"
I am attaching screenshots of the error message I am getting.
Screen-Shot-2021-04-12-at-10.18.07-AM.png
Screen-Shot-2021-04-12-at-10.18.13-AM.png
I am getting this error with the JupyterHub and with interactive sessions on cheyenne. I have no issues on my laptop or on the cgd machines. This only happens on the CISL machines. I must have a permission problem somewhere but I haven't been able to figure it out. I looked on stackoverflow, but I haven't found a way to solve this problem.
There are a lot of things that can look like a permission error, too. Such as not having storage space. But I'm not sure what the problem is.
@Cecile Hannay, you might want to check your quota: Kevin is right, it could simply be no disk space available.
@Kevin Paul and @Matt Long: It was a good thought but I haven't reached my quota and it is not a space issue.
Are you using the new JupyterHub? (jupyterhub.hpc.ucar.edu) And if so, are you on the Cheyenne Login option?
I think we need clarification. The first error message you get is the "Unexpected error while saving..." error, right? And then you get the second error after you click "Dismiss" on the first error message dialog box?
@Max Grover: @Cecile Hannay is seeing this on both the JHub and via self-launched (i.e., SSH tunnels) Jupyter sessions.
@Kevin Paul: This is correct. I get the first error when I try to create a notebook and the second error after clicking "Dismiss" on the dialog box.
From the command line, can you try running jupyter notebook --NotebookNotary.db_file=':memory:'
there is a thread from google groups https://groups.google.com/a/continuum.io/g/anaconda/c/dGcZoFIci1k on this, but I am not sure if you have tried this. It says that it could be that you do not have write permissions to your home directory for some reason. Along with a github issue thread here https://github.com/jupyter/notebook/issues/5321
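One way to test the write-permission theory directly is to try creating a small file in the directory in question. This is only a sketch: `check_writable` is a hypothetical helper, not something from Jupyter or CISL, and it also tries to tell a permission problem apart from a full disk or exhausted quota, since both can look alike from inside a notebook.

```python
import errno
import os
import tempfile


def check_writable(directory):
    """Try to create and remove a small file in `directory`.

    Returns (True, "ok") on success, or (False, reason) where `reason`
    separates a permission problem from a full disk or exhausted quota.
    """
    try:
        # NamedTemporaryFile deletes the file again when the block exits.
        with tempfile.NamedTemporaryFile(dir=directory) as handle:
            handle.write(b"write test")
            handle.flush()
            os.fsync(handle.fileno())
    except PermissionError:
        return False, "permission denied"
    except OSError as exc:
        if exc.errno in (errno.ENOSPC, errno.EDQUOT):
            return False, "no space left or quota exceeded"
        return False, f"other error: {exc.strerror}"
    return True, "ok"
```

Calling `check_writable(os.path.expanduser("~"))` from a terminal on the Hub would exercise the home directory. A clean result does not rule out problems with individual files inside it.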
@Max Grover I am not sure at which stage I should try that command.
I either use the JupyterHub (jupyterhub.hpc.ucar.edu) or start-jupyter.
Can you open up a terminal once you are on the Jupyterhub? Then type it in there?
@Max Grover
Thanks for your reply. I tried that but I am not sure it is doing anything. I still get the same error.
I open a terminal and type the command:
Screen-Shot-2021-04-12-at-5.19.23-PM.png
Here is what happens on the screen but I cannot click on these links.
Screen-Shot-2021-04-12-at-5.19.39-PM.png
When I try to open a Jupyter Notebook, I get the same error.
Screen-Shot-2021-04-12-at-5.20.12-PM.png
@Cecile Hannay, I think you should send this info to CISL help. I cannot reproduce your problem on my end.
You can look in
/glade/scratch/${USER}/.jupyter_logs
Perhaps there is something useful there? (I tried to look for you, but don't have permission.)
cc @Brian Vanderwende
I have worked with @Max Grover.
From the command: jupyter notebook --NotebookNotary.db_file=':memory:'
it looks like I don't have write permissions to my home directory for jupyter Notebook.
I contacted cislhelp yesterday but I will update with this new piece of information.
We also looked at:
/glade/scratch/${USER}/.jupyter_logs
All my recent attempts didn't create any jupyter_logs.
weird. I don't have any idea what could be going wrong. Is the behavior consistent on Casper PBS Batch, Login, etc.?
Here is the same behavior on casper.
I will try to clean up to get my quota under 90% as Brian suggested.
my home directory is at 94.52%
so I wouldn't guess that that's the trouble...but it's beyond me what might be going wrong.
@Brian Vanderwende
I brought my quota to:
/glade/u/home/hannay 43.09 GB 50.00 GB 86.18 % 147355
I am still getting the same error.
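For checking a quota row against the 90% mark mentioned earlier in the thread, a small sketch follows. The column order (path, used, quota, percent used, file count) is inferred from the row pasted above rather than from any gladequota documentation, it only handles rows laid out exactly like that one, and the function names are made up.

```python
def parse_quota_line(line):
    """Split one quota row into its fields.

    Assumed column order (inferred from the pasted row, not documented):
    path, used, unit, quota, unit, percent used, '%', file count.
    """
    parts = line.split()
    return {
        "path": parts[0],
        "used_gb": float(parts[1]),
        "quota_gb": float(parts[3]),
        "percent": float(parts[5]),
        "files": int(parts[7]),
    }


def over_threshold(line, threshold=90.0):
    """True when the row's percent-used field is at or above `threshold`."""
    return parse_quota_line(line)["percent"] >= threshold


if __name__ == "__main__":
    row = "/glade/u/home/hannay 43.09 GB 50.00 GB 86.18 % 147355"
    print(parse_quota_line(row), over_threshold(row))
```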
@Cecile Hannay Thanks for letting me know. Which instance did you use in your most recent attempts? A JupyterHub session or a tunnel? If the Hub, which system and was it batch or login? This information will help me narrow down which logs to have the admins look at.
@Brian Vanderwende In my last attempt after reducing the quota, I tried on the JupyterHub on cheyenne.
Thanks Cecile. We are taking a look.
@Brian Vanderwende on a related topic, the new JupyterHub interface http://jupyterhub.hpc.ucar.edu/ is really slick. A couple of documentation items that might make it much more accessible:
The problem has been solved with cisl. I am posting the fix here in case someone runs into the same issue.
The problem was that the auto-created file:
~/.local/share/jupyter/nbsignatures.db
got corrupted.
Because this database couldn't be accessed properly, I couldn't access/create notebooks. Erasing the file solved the problem.
Last thing: before deleting the file, you need to make sure you don't have any Jupyter sessions that have that file locked / opened.
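If someone hits this later, here is a rough sketch of how the file could be inspected and moved aside rather than deleted outright. It is not the procedure CISL used. It assumes the file is an SQLite database (the "readonly database" wording suggests so), the helper names are made up, and it cannot tell whether another Jupyter session has the file open, so shut those down first as noted above.

```python
import sqlite3
from pathlib import Path


def signature_db_status(db_path):
    """Classify the notebook signature database at `db_path`.

    Returns "missing", "ok", or "unusable: <detail>".
    """
    path = Path(db_path).expanduser()
    if not path.exists():
        # Avoid sqlite3.connect() here: it would create an empty file.
        return "missing"
    conn = sqlite3.connect(str(path), timeout=1, isolation_level=None)
    try:
        # Garbage or a damaged file is reported or raises at this step.
        result = conn.execute("PRAGMA integrity_check").fetchone()[0]
        if result != "ok":
            return f"unusable: {result}"
        # Take a write lock and release it without changing anything.
        conn.execute("BEGIN IMMEDIATE")
        conn.execute("ROLLBACK")
    except sqlite3.DatabaseError as exc:
        # Covers "file is not a database", read-only and locked databases.
        return f"unusable: {exc}"
    finally:
        conn.close()
    return "ok"


def move_aside(db_path):
    """Rename the database to <name>.bak instead of deleting it."""
    path = Path(db_path).expanduser()
    backup = path.with_name(path.name + ".bak")
    path.rename(backup)  # on POSIX this replaces an existing .bak file
    return backup
```

With the path from this thread, that would be `signature_db_status("~/.local/share/jupyter/nbsignatures.db")`, followed by `move_aside(...)` only if the status is not "ok". Since the file is described above as auto-created, Jupyter should presumably recreate it on the next start.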
Great to hear!
Thanks, @Cecile Hannay!
Hello, I'm having trouble accessing jupyterhub this morning. I can type my username and password, but then it brings me to a page that says "This page isn't working". See screenshot. Would anyone be able to help me?
Screen-Shot-2021-04-15-at-10.16.23-AM.png
I had the same error two days ago. It was temporary and it worked after trying again later.
Oh ok, thanks Cecile! I'll try again in a little while
@Jared Baker is aware of these issues.
Ok, thanks... it's still not working.
What about now?
still not working..
Neato. Okay, well I'm going to go see if I can prune your entries in the state database since the API is not doing what it says it is.
ok thanks! Just let me know when I should try again...
@Kristen Krumhardt I imagine the web page if you refresh will ask you to log in again.
no it just keeps saying "This page isn't working"
Do you get the same error message when accessing the page from a private browser window or a different browser ?
I was just able to spawn a server...
Made another change Kristen. what about now?
yes, now it's asking me sign in again
oh now it looks like it might work! says 'my server is starting up'
yes! it's working! thank you!
@Anderson Banihirwe I tried with different browsers before and it just brought me to a blank white page
Good deal, what a doozy.
but now problem solved:))
well, I just got kicked off jupyterhub and it's been stuck on this page for a couple min. Is anyone else having this issue? Screen-Shot-2021-04-15-at-1.23.36-PM.png
I tried to sign in today and was unable to do so
Yes. I was trying to figure this out just now.
When I first go to jupyterhub.hpc.ucar.edu I get an unfamiliar login page
Screen-Shot-2021-04-15-at-1.25.45-PM.png
Then I get the "Your server is stopping" page as above.
I have tried rebooting, clearing browser cache etc.
I am able to connect from a shell with jupyter lab, but my Dask Dashboard hangs with the same infinite waiting page.
The hub remains unstable for me as well.
There are some runaway things now. trying to keep it alive.
Yeah, it looks like there's issues on casper-login1. The load is very high on it and my ssh login is hanging on it.
I was able to get the node wrangled back and hopefully didn't interrupt too much running through the hub. Apologies, but what a perfect storm.
thanks @Jared Baker!
Let me know if it's not working for you I suppose.
I'm still getting that "Your server is stopping" page...
were you running a casper-batch job?
yes
but then I got kicked off.. and then I tried to restart the server with another casper-batch job but it just keeps landing on this page
Okay, I have a theory on what happened here then. I'm curious if it was in the process of spawning then the hub became overwhelmed and never got the update. Do you mind if I try something interesting?
I don't mind! try anything:)
I am also seeing the same thing I did earlier today ("Your server is stopping")
glad I'm not the only one!
I have the same issue! I've been having issues since around noon
FYI I was getting some very flaky behavior late yesterday (I thought it might be a glade issue) and just shut down for the day. Perhaps something is still running from that instance?
This feels pretty systemic to me—it's never really been stable since the PBS switch—but today has been particularly bad. I've switched to using SSH tunnels so I can get work done.
@Jared Baker, please let us know what's most helpful for you regarding testing, complaining, etc.
It was absolutely systemic. I'm not sure I can really blame PBS here. JupyterHub with the new login spawners had at one point 22k open file handles. The system security limits were preventing proper response times, then things started stacking up on the Hub's polling eventually leading up to the "hang" on casper-login1 today.
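For the curious, a rough sketch of how open file handles can be compared with the per-process limit on Linux. This is not the tooling used on the Hub. It assumes /proc is mounted, the function names are invented, and the limit it reads applies only to the process running the script.

```python
import os
import resource


def open_fd_count(pid="self"):
    """Count open file descriptors for `pid` via /proc (Linux only).

    Returns None where /proc/<pid>/fd is unavailable or unreadable.
    For "self" the count may include the descriptor used for the listing.
    """
    try:
        return len(os.listdir(f"/proc/{pid}/fd"))
    except OSError:
        return None


def fd_usage():
    """Return (open_count, soft_limit) for the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return open_fd_count(), soft


if __name__ == "__main__":
    count, soft = fd_usage()
    print(f"open descriptors: {count}, soft limit: {soft}")
```

Passing another process ID to `open_fd_count` generally requires owning that process or having root.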
@Kristen Krumhardt I've attempted to insert a proxy route and I think I was successful on that, but I'm not sure if it gave you the ability to access your instance again. I think it's still trying to stop.
Thanks @Jared Baker! I'll try again tomorrow!
things were going smoothly for awhile this morning...but then I just lost my kernel. I am not getting any error messages, the interface has simply stopped responding.
...and now it's back. It must have choked on something for a bit
and now it's unresponsive again. I am on crhtc53. load average: 5.11, 5.59, 5.33...doesn't seem terrible.
It let me sign in this morning and then it went to "This page isn't working", like it was yesterday morning.
Kristen, your instance seemed to have a mismatched route. I've removed it. I'm hoping :fingers_crossed: that will give you options to spawn a server again.
Yes! it worked this time!
@Matt Long I think your issue is hub agnostic. I'm not sure what might be causing the log messages "kernel interrupted" that are in the job logs (/glade/scratch/$USER/.jupyter_logs/). I'm going to have to look at those messages more closely. I honestly have no idea why a kernel may see interruptions like that, at least at present.
I can believe that. hasn't happened again. With Cheyenne and the old Hub, I found that the share queue was hard to use because of intermittent unresponsiveness. Could just be a load issue, I guess.
@Jared Baker, the hub seems pretty stable today. Thanks for all your work on it!
That's good. Made a couple background changes. Although this has given me a reason to write some tools to inspect the health of the hub as well. Enjoy your weekend!
you too! Thanks!
Seems like yesterday's jupyterhub issues are still not resolved. I clicked on "Production" from the main page and got a CIT login prompt, but then landed on the screen below. Now I get this screen when I start over and click on "Production".
Screen-Shot-2021-04-17-at-8.38.23-AM.png
I was able to spawn a session this morning, a few hours ago, and it's still running.
I was getting that same behavior earlier this week though...tried different browser and such...didn't help
@Stephen Yeager you can try again whenever. You'll need to log back in.
the hub has become unresponsive for me:
when I click "production" on https://jupyterhub.hpc.ucar.edu/, it seemingly starts to load a new page, but seems to be waiting indefinitely.
Mine seemingly loaded just fine. I'll go check state, then it'll be to the logs.
Hub is not working for me - clicking Production at https://jupyterhub.hpc.ucar.edu/ just spins going on an hour now - tried 2 different machines and 3 different browsers. Matt says it works for him. Any suggestions, or ideas why the experience is user dependent? Thanks!
My guess is that the Hub is either bogged down with users and can't respond to more requests, or there is something in your browser cache that needs to be cleared out. But that's just a guess.
Thanks Kevin - CISL says "We have had to do some work on the JupyterHub login system since yesterday evening. I believe this is impacting your login ability. We will be continuing to work on it tomorrow and a notice will be sent out soon detailing the downtime while work is being done." Odd that it only affects some users.
It only affects some users because the hub DoS'ed one of the login nodes and to get it restored, we had to block things at a network level rather than a host-level. I'm cleaning it up, but unfortunately it's a slow process. I have some potential workarounds to restore access if you'd like to pursue that?
a workaround would be great, thanks, and thanks for the explanation too
@Britt Stephens might be back now.
thanks - I can log in, but can't open any notebooks (existing or new) - you likely already know, but I have a help ticket open that Daniel Howard has been iterating with me on
If you refresh the page, it will ask that you re-login at this point, but I think it'll be back to normal
thanks Jared - unfortunately same behavior - trying to open an existing notebook gives "File Load Error for cmip6-sno-compute.ipynb Unhandled error" and trying to start a new notebook gives "Launcher Error Cannot read property 'path' of undefined"
That's a new one for me.
So I think the error is a red herring. Your $HOME is at 100%.
Can we move something to the scratch filesystem to check?
maybe the SOCO2_210309.tar file?
ah, great catch - that worked - sorry I didn't notice that before - thanks a lot for the help!
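For finding candidates to move out of a full home directory, here is a minimal sketch that lists the largest files under a directory. `largest_files` is a hypothetical helper, and walking a large home directory on GLADE could take a while.

```python
import os


def largest_files(root, limit=10):
    """Return [(size_bytes, path), ...] for the biggest files under `root`."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so symlinks are measured as links, not as their targets.
                size = os.lstat(path).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
            sizes.append((size, path))
    sizes.sort(reverse=True)
    return sizes[:limit]
```

Something like `for size, path in largest_files(os.path.expanduser("~")): print(f"{size / 1e9:8.2f} GB  {path}")` would print the top ten, which is usually enough to spot a stray tar file.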
Last updated: May 16 2025 at 17:14 UTC