The NA-CORDEX climate dataset has simulation runs with daily values on several different calendars (360-day, 365-day noleap, and 365-day with leap years) that I would like to combine into a single xarray dataset. Is anyone familiar with an example of how to do this in xarray? If it's not too difficult, I would like the combined dataset not to throw away values, but to pad out days missing from one calendar or the other with NaN values. I have looked online for examples of how to do this, but have not found anything yet. Thanks in advance for any pointers.
maybe something like
ds360 = ds360.assign(time=ds360.indexes["time"].asi8)
...
combined = xr.merge([ds360, ...])
combined.time.attrs["units"] = "microseconds since 1970-01-01"
combined = xr.decode_cf(combined)
basically you get everything on to a common reference axis. then xarray's automatic alignment will insert NaNs in the right place.
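As a rough illustration of the NaN padding described above, here is a plain-Python sketch that aligns two runs on one shared axis and fills the gaps. It deliberately avoids xarray and cftime calls. The helper names (`days_360`, `days_noleap`, `outer_align`) and the toy values are made up for the example, and using (year, month, day) labels as the common axis, rather than integer microseconds, is an assumption.

```python
import math

# Month lengths for a 365-day "noleap" calendar (assumed standard lengths).
NOLEAP_MONTH_LENGTHS = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def days_360(year):
    """(year, month, day) labels for a 360-day calendar: 12 months of 30 days."""
    return [(year, m, d) for m in range(1, 13) for d in range(1, 31)]


def days_noleap(year):
    """(year, month, day) labels for a 365-day calendar without leap days."""
    return [
        (year, m, d)
        for m, length in enumerate(NOLEAP_MONTH_LENGTHS, start=1)
        for d in range(1, length + 1)
    ]


def outer_align(runs):
    """Outer-join several {label: value} mappings, padding gaps with NaN."""
    labels = sorted(set().union(*(run.keys() for run in runs.values())))
    aligned = {
        name: [run.get(label, math.nan) for label in labels]
        for name, run in runs.items()
    }
    return labels, aligned


# Toy daily values: just the day-of-year index as a float.
run_360 = {label: float(i) for i, label in enumerate(days_360(2000))}
run_noleap = {label: float(i) for i, label in enumerate(days_noleap(2000))}

labels, aligned = outer_align({"run_360": run_360, "run_noleap": run_noleap})
print(len(labels))  # 358 shared days + 2 only in 360-day + 7 only in noleap = 367
```

Nothing is thrown away in this sketch: 30 February survives as a label with NaN for the noleap run, and each 31st survives with NaN for the 360-day run. Whether those labels are meaningful on a real time axis is a separate question.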
Thanks for the suggestion! I may not be able to try it right away, but my goal is to create an example notebook that demonstrates this.
the best way might be to open an issue at cftime asking for a function to convert between calendars.
I have found pandas time handling to almost always be able to do what I need. Have you used that much? It has the best timezone handling I have seen, too.
Something like,
import pandas as pd
pd.to_datetime([your time array])
to get into pandas and then combine from there. I can be more specific as needed but this is a starting point. I know pandas isn't xarray but clearly they play really well together.
Pandas datetime functionality works great, but unfortunately it supports only the proleptic Gregorian calendar. As @Brian Bonnlander pointed out, he's dealing with some non-standard calendars, i.e. 360-day and 365-day noleap. cftime is likely going to be the only option.
However, I think @Kristen Thyng makes a good point that many times (not always) you only need to know timestamps, and durations (i.e., distances between timestamps) are not needed. When you do not need to use durations, then pandas should be fine. If you need to use durations, then some smart handling of calendars is needed.
...Although some care probably needs to be taken to make sure that the timestamps generated for non-standard calendars actually make sense.
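To make the timestamp-versus-duration point concrete, here is a small sketch using only Python's standard `datetime` module, which follows the proleptic Gregorian calendar. The helper name `to_gregorian` and the chosen year are illustrative assumptions, not something from the dataset.

```python
from datetime import date


def to_gregorian(year, month, day):
    """Return a datetime.date, or None if the label is not a valid Gregorian date."""
    try:
        return date(year, month, day)
    except ValueError:
        return None


# February in a 360-day calendar has 30 days.
feb_360 = [(2001, 2, day) for day in range(1, 31)]
invalid = [label for label in feb_360 if to_gregorian(*label) is None]
print(invalid)  # [(2001, 2, 29), (2001, 2, 30)]

# Duration from 1 January to 1 March 2001 in each calendar.
gregorian_days = (date(2001, 3, 1) - date(2001, 1, 1)).days  # 31 + 28
days_360_calendar = (3 - 1) * 30  # two 30-day months
print(gregorian_days, days_360_calendar)  # 59 60
```

Most labels carry over as plain timestamps, but two February days in the 360-day calendar have no Gregorian counterpart, and the same pair of dates is a different number of days apart depending on the calendar.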
I am not sure it's a good idea to combine datasets with different calendars into a single xarray Dataset.
However, I think Kristen Thyng makes a good point that many times (not always) you only need to know timestamps, and durations (i.e., distances between timestamps) are not needed. When you do not need to use durations, then pandas should be fine. If you need to use durations, then some smart handling of calendars is needed.
Yes, I was thinking more like this: if you know (or can recreate) the dates, you can still combine between calendars. I wasn't aware, though, that pandas only works with one calendar. I guess I only ever use one!
Last updated: May 16 2025 at 17:14 UTC