Stream: python-questions
Topic: counting nans per month
Anna-Lena Deppenmeier (Jun 07 2021 at 22:42):
Hi all, I'm trying to count the nans in my dataset per month. I have previously counted nans in the entire dataset like this:
nans_wdt = xr.ufuncs.isnan(ds_wdt_iso.wdt_iso).sum(dim='time')
but now would like to do this per month. Using groupby('time.month') doesn't work unfortunately:
AttributeError: 'DatasetGroupBy' object has no attribute '_unary_op'
I can count per month with this (where mon stands for the months I'm currently counting):
xr.ufuncs.isnan(ds_wdt_iso.sel(time=ds_wdt_iso.wdt_iso.time.dt.month.isin([mon]))).sum(dim='time')
but then the only way I see to get all 12 months is to loop -- can someone help with a more elegant solution?
Deepak Cherian (Jun 07 2021 at 23:15):
ds.groupby("time.month").map(lambda g: g.isnull().sum("time"))
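A minimal sketch of this suggestion on synthetic data, since the original dataset is not shown in the thread. The variable name wdt_iso is taken from the question; the two-year daily time axis and the positions of the NaNs are assumptions made up for illustration. The last line is a variation not mentioned in the thread, which applies isnull() before grouping and should give the same counts.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the dataset in the question (assumed layout):
# two years of daily values with a few NaNs planted at known positions.
time = pd.date_range("2000-01-01", "2001-12-31", freq="D")
values = np.ones(time.size)
values[:10] = np.nan    # 10 missing days in January 2000
values[40:45] = np.nan  # 5 missing days in February 2000 (Feb 10 to Feb 14)
ds = xr.Dataset({"wdt_iso": ("time", values)}, coords={"time": time})

# Count NaNs per calendar month, pooled over all years.
# Each group g holds only the time steps that fall in one month.
nan_counts = ds.groupby("time.month").map(lambda g: g.isnull().sum("time"))

# Variation (not from the thread): mask first, then group and sum.
nan_counts_alt = ds.isnull().groupby("time.month").sum("time")

print(nan_counts.wdt_iso.values)
```

The result has a month dimension of length 12 in place of time, so nan_counts.wdt_iso.sel(month=1) gives the January count.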
Anna-Lena Deppenmeier (Jun 08 2021 at 15:16):
cool I'll try it and learn more about map and lambda (I noticed it saved the day already with Danica's problem)
Deepak Cherian (Jun 08 2021 at 15:44):
A non-lambda version is
def count_null(obj):
    return obj.isnull().sum("time")

ds.groupby("time.month").map(count_null)  # applies count_null to each group
They're equivalent; for such short functions that are used only once, it's just a lot more convenient to use a lambda (especially if you're lazy like me)
Brian Bonnlander (Jun 08 2021 at 19:59):
Coding "best practices" can be endlessly debated, but I follow the general rule of avoiding the use of lambda if you're writing code that other people in the wider community will want to run also. One of python's potential advantages is that ideas can be expressed in short, simple pieces of code that can be understood quickly.
Danica Lombardozzi (Jun 08 2021 at 20:30):
@Brian Bonnlander I do not fully understand lambda functionality (it was also provided in an example to me earlier this week), but can you explain why it might be problematic when sharing code more broadly?
Brian Bonnlander (Jun 08 2021 at 20:42):
I didn't mean anything super profound. Lambda functions are not strictly necessary in most computer languages. They act as a form of convenience. Lambdas are "unnamed" or "ephemeral" functions, which can be convenient when you need a function once and never want to use it again.
Most places where lambdas are used, you could instead define a function first (as above with the count_null() function). Function definitions are nice because you can give them English names that describe what they do. Most people find that helpful.
It always takes me longer to understand what someone is trying to do with their code if many steps are happening on a single line. Lambdas kind of require you to produce longer coding expressions, and that makes it harder for others to quickly determine the code's intent.
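To make the comparison concrete, here is a small plain-Python sketch of the same operation written both ways. The records list and the nan_count helper are hypothetical names invented for this illustration and do not come from the thread.

```python
# Hypothetical (month, NaN count) pairs, made up for illustration.
records = [("jan", 3), ("feb", 0), ("mar", 7)]

# Lambda version: the sort key is an unnamed function written inline.
by_count_lambda = sorted(records, key=lambda rec: rec[1])


# Named version: the function name states the intent of the sort key.
def nan_count(rec):
    """Return the NaN count from a (month, count) pair."""
    return rec[1]


by_count_named = sorted(records, key=nan_count)

# Both approaches produce the same ordering.
print(by_count_lambda == by_count_named)
```

Both calls behave identically; the difference is only in whether the reader gets a descriptive name to go on.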
Danica Lombardozzi (Jun 08 2021 at 20:45):
Thanks for this clear explanation! It's very helpful for me to understand lambda and best coding practices.
Last updated: Jan 30 2022 at 12:01 UTC