This is more of a general Cheyenne/Casper question than a Jupyter-specific one. Does anyone have a sense of what causes such sluggish pip install
times sometimes? I use conda
for my environment, but when developing e.g. project-specific packages of code, I need to run pip install . --upgrade
on the main package directory.
Some days this seems to be instantaneous. Other days it will take minutes to run. Any thoughts on this in general or how to speed it up? Today is a day that it's taking minutes. It seems to happen on the login node and compute nodes.
It seems to happen on the login node and compute nodes.
This is likely due to some GLADE/filesystem issue
Can you try running pip install
with the --verbose
option on to see if there is any useful information about the problem
pip install . --upgrade --verbose
P.S.: Apparently the --verbose
option is additive. Beware :grinning:
$ pip --help
-v, --verbose Give more output. Option is additive, and can be
used up to 3 times.
@Anderson Banihirwe, the --verbose
option doesn't give enlightening information. It hangs here for 30s-1min (sometimes longer):
$ pip install . --upgrade --verbose
Non-user install because site-packages writeable
Created temporary directory: /glade/scratch/rbrady/tmp/pip-ephem-wheel-cache-qljvs7i7
Created temporary directory: /glade/scratch/rbrady/tmp/pip-req-tracker-dzd7io56
Initialized build tracking at /glade/scratch/rbrady/tmp/pip-req-tracker-dzd7io56
Created build tracker: /glade/scratch/rbrady/tmp/pip-req-tracker-dzd7io56
Entered build tracker: /glade/scratch/rbrady/tmp/pip-req-tracker-dzd7io56
Created temporary directory: /glade/scratch/rbrady/tmp/pip-install-deditwzg
Processing /glade/work/rbrady/projects/carbonpathways
Created temporary directory: /glade/scratch/rbrady/tmp/pip-req-build-7vwn7mxe
Then cranks through this:
Processing /glade/work/rbrady/projects/carbonpathways
Created temporary directory: /glade/scratch/rbrady/tmp/pip-req-build-7vwn7mxe
Added file:///glade/work/rbrady/projects/carbonpathways to build tracker '/glade/scratch/rbrady/tmp/pip-req-tracker-dzd7io56'
Running setup.py (path:/glade/scratch/rbrady/tmp/pip-req-build-7vwn7mxe/setup.py) egg_info for package from file:///glade/work/rbrady/projects/carbonpathways
Running command python setup.py egg_info
running egg_info
creating /glade/scratch/rbrady/tmp/pip-req-build-7vwn7mxe/pip-egg-info/carbonpathways.egg-info
writing /glade/scratch/rbrady/tmp/pip-req-build-7vwn7mxe/pip-egg-info/carbonpathways.egg-info/PKG-INFO
writing dependency_links to /glade/scratch/rbrady/tmp/pip-req-build-7vwn7mxe/pip-egg-info/carbonpathways.egg-info/dependency_links.txt
writing top-level names to /glade/scratch/rbrady/tmp/pip-req-build-7vwn7mxe/pip-egg-info/carbonpathways.egg-info/top_level.txt
writing manifest file '/glade/scratch/rbrady/tmp/pip-req-build-7vwn7mxe/pip-egg-info/carbonpathways.egg-info/SOURCES.txt'
reading manifest file '/glade/scratch/rbrady/tmp/pip-req-build-7vwn7mxe/pip-egg-info/carbonpathways.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file '/glade/scratch/rbrady/tmp/pip-req-build-7vwn7mxe/pip-egg-info/carbonpathways.egg-info/SOURCES.txt'
Source in /glade/scratch/rbrady/tmp/pip-req-build-7vwn7mxe has version 0.1.0, which satisfies requirement carbonpathways==0.1.0 from file:///glade/work/rbrady/projects/carbonpathways
Removed carbonpathways==0.1.0 from file:///glade/work/rbrady/projects/carbonpathways from build tracker '/glade/scratch/rbrady/tmp/pip-req-tracker-dzd7io56'
Building wheels for collected packages: carbonpathways
Created temporary directory: /glade/scratch/rbrady/tmp/pip-wheel-8jdqqqq1
Building wheel for carbonpathways (setup.py) ...
Destination directory: /glade/scratch/rbrady/tmp/pip-wheel-8jdqqqq1
Running command /glade/work/rbrady/miniconda3/envs/carbonpathways/bin/python3.8 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/glade/scratch/rbrady/tmp/pip-req-build-7vwn7mxe/setup.py'"'"'; __file__='"'"'/glade/scratch/rbrady/tmp/pip-req-build-7vwn7mxe/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /glade/scratch/rbrady/tmp/pip-wheel-8jdqqqq1
running bdist_wheel
running build
running build_py
creating build
creating build/lib
creating build/lib/carbonpathways
copying carbonpathways/memory.py -> build/lib/carbonpathways
copying carbonpathways/parallel.py -> build/lib/carbonpathways
copying carbonpathways/__init__.py -> build/lib/carbonpathways
copying carbonpathways/regions.py -> build/lib/carbonpathways
copying carbonpathways/subset.py -> build/lib/carbonpathways
copying carbonpathways/preprocess.py -> build/lib/carbonpathways
creating build/lib/carbonpathways/visualization
copying carbonpathways/visualization/visualize.py -> build/lib/carbonpathways/visualization
copying carbonpathways/visualization/__init__.py -> build/lib/carbonpathways/visualization
creating build/lib/carbonpathways/data
copying carbonpathways/data/make_dataset.py -> build/lib/carbonpathways/data
copying carbonpathways/data/__init__.py -> build/lib/carbonpathways/data
running egg_info
creating carbonpathways.egg-info
writing carbonpathways.egg-info/PKG-INFO
writing dependency_links to carbonpathways.egg-info/dependency_links.txt
writing top-level names to carbonpathways.egg-info/top_level.txt
writing manifest file 'carbonpathways.egg-info/SOURCES.txt'
reading manifest file 'carbonpathways.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'carbonpathways.egg-info/SOURCES.txt'
copying carbonpathways/particle_test_file.nc -> build/lib/carbonpathways
installing to build/bdist.linux-x86_64/wheel
running install
running install_lib
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/wheel
creating build/bdist.linux-x86_64/wheel/carbonpathways
copying build/lib/carbonpathways/memory.py -> build/bdist.linux-x86_64/wheel/carbonpathways
copying build/lib/carbonpathways/parallel.py -> build/bdist.linux-x86_64/wheel/carbonpathways
copying build/lib/carbonpathways/__init__.py -> build/bdist.linux-x86_64/wheel/carbonpathways
copying build/lib/carbonpathways/regions.py -> build/bdist.linux-x86_64/wheel/carbonpathways
copying build/lib/carbonpathways/particle_test_file.nc -> build/bdist.linux-x86_64/wheel/carbonpathways
copying build/lib/carbonpathways/subset.py -> build/bdist.linux-x86_64/wheel/carbonpathways
creating build/bdist.linux-x86_64/wheel/carbonpathways/visualization
copying build/lib/carbonpathways/visualization/visualize.py -> build/bdist.linux-x86_64/wheel/carbonpathways/visualization
copying build/lib/carbonpathways/visualization/__init__.py -> build/bdist.linux-x86_64/wheel/carbonpathways/visualization
creating build/bdist.linux-x86_64/wheel/carbonpathways/data
copying build/lib/carbonpathways/data/make_dataset.py -> build/bdist.linux-x86_64/wheel/carbonpathways/data
copying build/lib/carbonpathways/data/__init__.py -> build/bdist.linux-x86_64/wheel/carbonpathways/data
copying build/lib/carbonpathways/preprocess.py -> build/bdist.linux-x86_64/wheel/carbonpathways
running install_egg_info
Copying carbonpathways.egg-info to build/bdist.linux-x86_64/wheel/carbonpathways-0.1.0-py3.8.egg-info
running install_scripts
adding license file "LICENSE" (matched pattern "LICEN[CS]E*")
creating build/bdist.linux-x86_64/wheel/carbonpathways-0.1.0.dist-info/WHEEL
creating '/glade/scratch/rbrady/tmp/pip-wheel-8jdqqqq1/carbonpathways-0.1.0-py3-none-any.whl' and adding 'build/bdist.linux-x86_64/wheel' to it
adding 'carbonpathways/__init__.py'
adding 'carbonpathways/memory.py'
adding 'carbonpathways/parallel.py'
adding 'carbonpathways/particle_test_file.nc'
adding 'carbonpathways/preprocess.py'
adding 'carbonpathways/regions.py'
adding 'carbonpathways/subset.py'
adding 'carbonpathways/data/__init__.py'
adding 'carbonpathways/data/make_dataset.py'
adding 'carbonpathways/visualization/__init__.py'
adding 'carbonpathways/visualization/visualize.py'
adding 'carbonpathways-0.1.0.dist-info/LICENSE'
adding 'carbonpathways-0.1.0.dist-info/METADATA'
adding 'carbonpathways-0.1.0.dist-info/WHEEL'
adding 'carbonpathways-0.1.0.dist-info/top_level.txt'
adding 'carbonpathways-0.1.0.dist-info/RECORD'
removing build/bdist.linux-x86_64/wheel
done
Created wheel for carbonpathways: filename=carbonpathways-0.1.0-py3-none-any.whl size=52801 sha256=0e3e90d90d10e3861cdd426b74b357a77785d693cbc22aae06885f3fc32983b0
Stored in directory: /glade/scratch/rbrady/tmp/pip-ephem-wheel-cache-qljvs7i7/wheels/64/2a/a6/9dc322f41f7002c714ffef0f74000ba1384978c0591cfd84be
Successfully built carbonpathways
Installing collected packages: carbonpathways
Attempting uninstall: carbonpathways
Found existing installation: carbonpathways 0.1.0
Uninstalling carbonpathways-0.1.0:
Created temporary directory: /glade/work/rbrady/miniconda3/envs/carbonpathways/lib/python3.8/site-packages/~arbonpathways-0.1.0.dist-info
Removing file or directory /glade/work/rbrady/miniconda3/envs/carbonpathways/lib/python3.8/site-packages/carbonpathways-0.1.0.dist-info/
Created temporary directory: /glade/work/rbrady/miniconda3/envs/carbonpathways/lib/python3.8/site-packages/~arbonpathways
Removing file or directory /glade/work/rbrady/miniconda3/envs/carbonpathways/lib/python3.8/site-packages/carbonpathways/
Successfully uninstalled carbonpathways-0.1.0
Created temporary directory: /glade/scratch/rbrady/tmp/pip-unpacked-wheel-7qxfk_no
Successfully installed carbonpathways-0.1.0
Cleaning up...
Removing source in /glade/scratch/rbrady/tmp/pip-req-build-7vwn7mxe
Removed build tracker: '/glade/scratch/rbrady/tmp/pip-req-tracker-dzd7io56'
This is about as lightweight a package as you can get. I notice that some days it will install on the order of seconds. Other days, minutes. Also, what does additive mean in this case? Just for my own knowledge.
Which version of pip
are you running? What is the output of time pip install . --upgrade
? Is there a reason you are using the --upgrade
flag? Additive might mean that --verbose --verbose
will give you even more information.
Additive might mean that
--verbose --verbose
will give you even more information.
Yep.. and the short version looks like: pip install -vvv ....
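To illustrate what "additive" means for a flag like this, here is a minimal sketch using Python's standard argparse module. The parser below is a hypothetical stand-in, not pip's actual option parser; it only shows how a counted flag accumulates each time it is repeated.

```python
import argparse

# Hypothetical demo CLI with an additive verbosity flag.
# This is NOT pip's real parser, just the same idea in miniature.
parser = argparse.ArgumentParser(prog="demo")
parser.add_argument(
    "-v", "--verbose",
    action="count",   # each occurrence adds 1 to the value
    default=0,
    help="Give more output. Option is additive.",
)


def verbosity(argv):
    """Return how many times the verbose flag appears in argv."""
    return parser.parse_args(argv).verbose


print(verbosity([]))                          # flag absent
print(verbosity(["--verbose", "--verbose"]))  # long form given twice
print(verbosity(["-vvv"]))                    # short form stacked three times
```

The stacked short form and the repeated long form end up as the same kind of counter, which is why -vvv is just shorthand for passing --verbose three times.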
Scratch (where --verbose
implies temporary things are being installed) is at 9% full. Work, where pip/conda installs to, is 57% full. The interior carbonpathways
folder with Python code is 252 KB, although the main folder is 66 GB since I have a ./data
folder with some post-processed output for now. That's not included whatsoever in the setup.py
file, so I figured it ignored that kind of stuff. I figured --upgrade
overwrote the current installation, as opposed to doing an uninstall then reinstall. Also, this is independent of Cheyenne or Casper node.
I think even with that large /data
folder I have some days where this sort of install works in a second or two.
I guess I'm not changing the version name, but I am adding code and modules. So I figured --upgrade overwrote the current installation. As opposed to doing uninstall then reinstall.
The -e
, or --editable
, option might be a better alternative to --upgrade
, i.e. pip install -e .
When using the -e
option, pip will just link the package to its original location, meaning any changes to the original package are reflected directly in your environment.
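One way to see whether an editable install is in effect is to check where Python would import the package from. The following is a rough sketch using only the standard library; the helper names module_location and imported_from are made up for illustration, and the carbonpathways path in the comment is just the one from this thread.

```python
import importlib.util
from pathlib import Path


def module_location(name):
    """Return the directory a top-level module/package would be imported from.

    Returns None if the module cannot be found or has no file location
    (for example built-in modules or namespace packages).
    """
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.has_location:
        return None
    return Path(spec.origin).resolve().parent


def imported_from(name, source_dir):
    """True if `name` would be imported from somewhere inside `source_dir`."""
    location = module_location(name)
    if location is None:
        return False
    source = Path(source_dir).resolve()
    return location == source or source in location.parents


# With an editable install, one would expect something like
#   imported_from("carbonpathways", "/glade/work/rbrady/projects/carbonpathways")
# to be True, whereas a regular install resolves into site-packages instead.
print(module_location("json"))
```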
Hm. I need to read up on that. That installs instantaneously but isn't working with autoreload
. In the pip install . --upgrade
case, I don't have to restart my notebook. If I run pip install -e .
with and without --upgrade
I don't get updates to functions in my notebook.
It takes 1 min 6s to install.
I am assuming this is the real (wall clock) time. When you get a chance, can you post the full output of time pip install . --upgrade
? When I first asked for this, I was going for the user
, and sys
times as well.
Hm. I need to read up on that. That installs instantaneously but isn't working with
autoreload
. In the pip install . --upgrade
case, I don't have to restart my notebook. If I run pip install -e .
with and without --upgrade
I don't get updates to functions in my notebook.
Ooooh I see... I didn't know you were using the autoreload
magic as well.. %autoreload
has some caveats...
real 1m6.161s user 0m1.715s sys 1m2.335s
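In the timing just above, sys accounts for roughly 1m2s of the 1m6s of wall-clock time, which hints that the time is going into kernel-side work such as file I/O rather than Python computation. For anyone who wants the same three numbers from inside Python (say, from a notebook), here is a small sketch; time_command is a hypothetical helper, and child CPU times are only meaningful on Unix-like systems.

```python
import os
import subprocess
import sys
import time


def time_command(cmd):
    """Run `cmd` and return (real, user, sys) in seconds, like the shell's `time`.

    user/sys are the CPU times accumulated by child processes while the
    command ran. On platforms that do not report child times they stay at 0.
    """
    before = os.times()
    start = time.perf_counter()
    subprocess.run(cmd, check=False,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    real = time.perf_counter() - start
    after = os.times()
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system


# Harmless demo: time a Python process that does nothing.
real, user, system = time_command([sys.executable, "-c", "pass"])
print(f"real {real:.3f}s  user {user:.3f}s  sys {system:.3f}s")

# To time the install discussed here, one could call (not run in this sketch):
#   time_command([sys.executable, "-m", "pip", "install", ".", "--upgrade"])
```

A large sys value relative to user is the pattern to look for when suspecting filesystem overhead.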
I'm using autoreload
since I'm working with dask_jobqueue
and don't want to have to kill and restart all my workers every time I update my local package.
As I suspected, -e
is way faster:
$ time pip install -e .
Obtaining file:///glade/scratch/abanihi/carbonpathways
Installing collected packages: carbonpathways
Running setup.py develop for carbonpathways
Successfully installed carbonpathways
real 0m4.519s
user 0m1.941s
sys 0m0.639s
My takeaway from this is that you either have to trade pip install . --upgrade
speed for the flexibility provided by %autoreload
magic or go with pip install -e .
at the expense of having to rerun your notebook from scratch :frown:
Thanks for the input! Wasn't sure if I was missing something. I'll ping you in this thread if a day comes up soon where the timings are drastically different. If that doesn't happen for a while, maybe I'll try moving my data
folder out of there to see if setup
is somehow including it. Although I don't think that's the case.
By the way, the data
directory may be the culprit here....
During pip install . --upgrade
, you will notice that pip creates a temporary directory.
It then copies everything from the main carbonpathways
directory, including data, into this temporary directory.
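If the whole project directory is copied, then the biggest top-level entries are what cost the time. A quick sketch for spotting them, using only the standard library (tree_size and sizes_by_entry are hypothetical helper names, and symlinks are deliberately not followed):

```python
import os
from pathlib import Path


def tree_size(path):
    """Total size in bytes of regular files under `path` (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):  # does not follow dir symlinks
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def sizes_by_entry(project_dir):
    """Return [(name, bytes), ...] for top-level entries, largest first."""
    sizes = []
    for entry in Path(project_dir).iterdir():
        if entry.is_symlink():
            size = 0
        elif entry.is_dir():
            size = tree_size(entry)
        else:
            size = entry.stat().st_size
        sizes.append((entry.name, size))
    return sorted(sizes, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Show the five largest entries in the current directory.
    for name, size in sizes_by_entry(".")[:5]:
        print(f"{size / 1e6:10.1f} MB  {name}")
```

Run from the project root, a 66 GB data folder would be expected to sit at the top of this list.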
Hm, I'll move that tomorrow and see what happens. I'm following more of a cookiecutter repo format (https://github.com/bradyrx/cookiecutter-climate-science) to keep everything nice and together. Data isn't going up to git of course but is nice to have consolidated there rather than in scratch. So maybe I should forgo that for speed.
Removing the data
directory reduces the wall clock time to
real 0m5.541s user 0m1.894s sys 0m0.863s
That's a huge improvement compared to the original
real 1m6.161s user 0m1.715s sys 1m2.335s
Hm, okay. That's it then. I'm wondering if there's some flag or way to have pip ignore certain sub-directories, because it is convenient to keep my post-proc data there for the project.
So, you may want to move data
somewhere else
Data isn't going up to git of course but is nice to have consolidated there rather than in scratch
could you keep data elsewhere on /glade/work
and then softlink it in your git clone?
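The softlink idea could be scripted roughly as follows. This is only a sketch: relocate_and_link is a hypothetical helper, the paths in the usage comment are placeholders, and whether pip follows the link when copying the project is not settled in this thread, so it is worth timing the install afterwards.

```python
import shutil
from pathlib import Path


def relocate_and_link(data_dir, new_home):
    """Move `data_dir` to `new_home` and leave a symlink at the old path.

    Does nothing if `data_dir` is already a symlink. Refuses to overwrite
    an existing `new_home`.
    """
    data_dir = Path(data_dir)
    new_home = Path(new_home).resolve()
    if data_dir.is_symlink():
        return data_dir  # already relocated
    if new_home.exists():
        raise FileExistsError(f"{new_home} already exists")
    new_home.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(data_dir), str(new_home))
    # Absolute target so the link works regardless of the current directory.
    data_dir.symlink_to(new_home, target_is_directory=True)
    return data_dir


# Example with placeholder paths (not run here):
#   relocate_and_link("carbonpathways/data", "/glade/work/<user>/carbonpathways-data")
```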
That's a good idea @Michael Levy . I'll just do a soft link into my repo. Thanks! Well, I'll check that pip doesn't try to copy that linked directory.
Well, I'll check that pip doesn't try to copy that linked directory.
yeah, I don't know pip
well enough to know what it'll do, but :fingers_crossed:
Another solution is to edit your MANIFEST.in
file and add the following line
prune data*
I haven't tested this yet though
it does not work
Give python setup.py install
a try.
You may not need to move your data
directory after all.
I was thinking about that instead of pip! Will give that a try tomorrow and let you know.
I ended up going down a rabbit hole, and I think it paid off... :grinning:
Good News:
If you upgrade to pip>=20.1
, you should be good... It appears that this issue (https://github.com/pypa/pip/issues/2195) was addressed in https://github.com/pypa/pip/pull/7882
I tested it, and here's what I got
real 0m4.362s user 0m1.857s sys 0m0.553s
That's it for me for today :grinning: .... I won't spam this stream/topic again at least for today...
Works great, thanks so much @Anderson Banihirwe. The old update-the-package trick. Well, if you just do a standard upgrade it only seems to go to 20.0.2 or so. So one does have to force >=20.1
in the conda environment.
real 0m1.833s user 0m1.387s sys 0m0.388s
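Since the fix hinges on having pip>=20.1, a quick check from inside the environment can confirm which side of that line you are on. A minimal sketch with the standard library (Python 3.8+); release_tuple and pip_is_at_least are hypothetical helpers, and the parsing is deliberately simple, so pre-release suffixes such as "20.1b1" are treated like the plain release.

```python
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string):
    """Parse the leading numeric components: '20.0.2' -> (20, 0, 2)."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break  # stop at the first non-digit, e.g. the 'b' in '1b1'
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def pip_is_at_least(minimum=(20, 1)):
    """Return True/False for the installed pip, or None if pip is not found."""
    try:
        installed = release_tuple(version("pip"))
    except PackageNotFoundError:
        return None
    return installed >= minimum


print(pip_is_at_least())
```

With the 20.0.2 mentioned above this comparison comes out False, which matches the observation that a plain upgrade was not enough.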
Last updated: May 16 2025 at 17:14 UTC