Stream: general

Topic: summing up fields in a xarray


Jean-Francois Lamarque (Apr 29 2021 at 20:29):

Is there a way to sum up the fields within a single xarray Dataset, for example all the 3-D fields in the one below? I want to add the fields without knowing their names in advance.

<xarray.Dataset>
Dimensions:       (lat: 96, lon: 144, time: 324)
Coordinates:
  * lon           (lon) float32 0.0 2.5 5.0 7.5 10.0 ... 350.0 352.5 355.0 357.5
  * lat           (lat) float32 -90.0 -88.10526 -86.210526 ... 88.10526 90.0
  * time          (time) datetime64[ns] 1850-01-15 1850-02-15 ... 2100-12-15
Data variables:
    bb            (time, lat, lon) float32 ...
    anthro        (time, lat, lon) float32 ...
    date          (time) int32 ...
    gridbox_area  (lat, lon) float32 ...
    volcano       (time, lat, lon) float32 ...

Matt Long (Apr 29 2021 at 20:36):

here's a toy example

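import numpy as np
import xarray as xr

# three 1-D variables that share the dimension 'x'
ds = xr.Dataset({
    'a': xr.DataArray(np.ones(10), dims=('x',)),
    'b': xr.DataArray(np.ones(10), dims=('x',)),
    'c': xr.DataArray(np.ones(10), dims=('x',)),
})

# keep only the variables whose dims are exactly ('x',)
varnames = [v for v in ds.data_vars if ds[v].dims == ('x',)]

# start from zeros with the same shape and coords, then accumulate
total = xr.full_like(ds[varnames[0]], fill_value=0.)
for v in varnames:
    total += ds[v]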

Matt Long (Apr 29 2021 at 20:37):

in your case, ds.data_vars will include gridbox_area, so you want to filter out the variables that don't have the right dims
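
Applied to a dataset shaped like the one in the question, the filter might look like the sketch below. The dataset here is a small hypothetical stand-in (sizes shrunk, all values set to 1) built only so the snippet runs on its own; the variable names are taken from the question.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the dataset in the question (sizes shrunk).
nt, nlat, nlon = 3, 4, 5
dims3d = ("time", "lat", "lon")
ds = xr.Dataset({
    "bb": (dims3d, np.ones((nt, nlat, nlon), dtype="float32")),
    "anthro": (dims3d, np.ones((nt, nlat, nlon), dtype="float32")),
    "date": (("time",), np.arange(nt, dtype="int32")),
    "gridbox_area": (("lat", "lon"), np.ones((nlat, nlon), dtype="float32")),
    "volcano": (dims3d, np.ones((nt, nlat, nlon), dtype="float32")),
})

# Keep only the (time, lat, lon) variables; date and gridbox_area drop out.
varnames = [v for v in ds.data_vars if ds[v].dims == dims3d]

# Accumulate into a zero-filled array with the same shape and coordinates.
total = xr.full_like(ds[varnames[0]], fill_value=0.)
for v in varnames:
    total += ds[v]
```

The comparison is against the exact dims tuple, so a variable stored as (lat, lon, time) would not match.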

Jean-Francois Lamarque (Apr 29 2021 at 20:44):

Looks like what I need. Thanks!

Deepak Cherian (Apr 29 2021 at 21:32):

If you get rid of date and gridbox_area, then ds.to_array("variable").sum("variable") will do what you want. It stacks all the data variables into a new array along a new dimension named "variable", then sums along that dimension.
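
To make the intermediate step visible, here is a small sketch on a hypothetical dataset that contains only the three 3-D fields (constant values 1, 2 and 3, chosen so the sum is easy to check):

```python
import numpy as np
import xarray as xr

dims = ("time", "lat", "lon")
shape = (2, 3, 4)

# Hypothetical dataset holding only the fields to be summed.
ds = xr.Dataset({
    "bb": (dims, np.full(shape, 1.0)),
    "anthro": (dims, np.full(shape, 2.0)),
    "volcano": (dims, np.full(shape, 3.0)),
})

# One DataArray with a new leading dimension; its coordinate holds the names.
stacked = ds.to_array("variable")

# Collapse the new dimension again, leaving a (time, lat, lon) array.
total = stacked.sum("variable")
```

Here stacked has dims ("variable", "time", "lat", "lon"), and the "variable" coordinate carries the original variable names.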

Jean-Francois Lamarque (Apr 29 2021 at 21:55):

That's an interesting approach, Deepak. Not all files have those fields, so it is a bit of a mess, and some querying has to happen either way, whether to remove the extra fields or to pick the ones to sum.
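
One way to cope with files that may or may not carry the helper fields is to drop them with errors="ignore", so that missing names are skipped instead of raising. This is a sketch only: the helper name sum_fields and the HELPER_VARS list are made up for illustration, and it assumes every remaining data variable is a field to be summed.

```python
import numpy as np
import xarray as xr

# Assumed list of non-field variables, taken from the dataset in the question.
HELPER_VARS = ["date", "gridbox_area"]

def sum_fields(ds):
    """Drop helper variables if present, then sum whatever is left."""
    fields = ds.drop_vars(HELPER_VARS, errors="ignore")
    return fields.to_array("variable").sum("variable")

dims = ("time", "lat", "lon")
shape = (2, 3, 4)

# Hypothetical file that has the helper variables...
full = xr.Dataset({
    "bb": (dims, np.ones(shape)),
    "anthro": (dims, np.ones(shape)),
    "date": (("time",), np.arange(2)),
    "gridbox_area": (("lat", "lon"), np.ones((3, 4))),
})
# ...and one that does not.
bare = full.drop_vars(HELPER_VARS)

total_full = sum_fields(full)
total_bare = sum_fields(bare)
```

Both calls give the same (time, lat, lon) result, since the helper variables are removed when present and ignored when absent.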

Matt Long (Apr 29 2021 at 22:36):

you could combine the solutions:

varnames = [v for v in ds.data_vars if ds[v].dims == ('x',)]
ds[varnames].to_array('variable').sum('variable')
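
For the dataset in the question, the combined version could be wrapped in a small helper. The sketch below is an illustration only: the function name sum_fields_with_dims, its default dims and the test dataset are assumptions, not part of xarray or of the original files.

```python
import numpy as np
import xarray as xr

def sum_fields_with_dims(ds, dims=("time", "lat", "lon")):
    """Sum every data variable whose dims match `dims` exactly."""
    dims = tuple(dims)
    varnames = [v for v in ds.data_vars if ds[v].dims == dims]
    if not varnames:
        raise ValueError(f"no data variables with dims {dims}")
    # Select the matching variables, stack them, and sum over the new dim.
    return ds[varnames].to_array("variable").sum("variable")

# Hypothetical dataset with two 3-D fields plus the helper variables.
dims3d = ("time", "lat", "lon")
ds = xr.Dataset({
    "bb": (dims3d, np.full((2, 3, 4), 1.0)),
    "volcano": (dims3d, np.full((2, 3, 4), 2.0)),
    "date": (("time",), np.arange(2)),
    "gridbox_area": (("lat", "lon"), np.ones((3, 4))),
})

total = sum_fields_with_dims(ds)
```

Because the selection is by dims and not by name, the same call works whether or not a given file contains date, gridbox_area or any particular field.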

Last updated: Jan 30 2022 at 12:01 UTC