So I have an nlat x nlon x ndays array (quite large, ~3 GB) and I am trying to parallelize a particular indexing operation.
A bit tricky to explain, but at each lat/lon point I index the time axis by the days of the year, days = (1, 2, ..., 365), and I rewrite that with an equivalent-sized array that is a randomized version of those days, days_rand = (45, 365, ..., 1, 3). Then my operation seems simple.
The key thing is that days and days_rand change for each ilat/ilon.
var_new(ilat,ilon,days) = var_old(ilat,ilon,days_rand)
I do this for a multi-year array; days and days_rand are precomputed.
Dask will not let me compute like this, presumably because of the slicing restrictions described here:
https://docs.dask.org/en/stable/array-slicing.html
namely that I cannot index with an array along more than one dimension (ilat and ilon are still plain slices, I'm presuming).
So I am then .load()-ing the array and operating on it with ilat, ilon for loops.
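Roughly, the serial version looks like this (a minimal sketch; the function name and 0-based indexing are my own simplification):

```python
import numpy as np

def shuffle_days(var_old, days_rand):
    # var_old: (nlat, nlon, ndays); days_rand: same shape, holding a
    # 0-based permutation of the day indices at every grid point
    nlat, nlon, ndays = var_old.shape
    var_new = np.empty_like(var_old)
    for ilat in range(nlat):
        for ilon in range(nlon):
            # reorder the time axis at this grid point by its own permutation
            var_new[ilat, ilon, :] = var_old[ilat, ilon, days_rand[ilat, ilon, :]]
    return var_new
```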
So, any ideas on parallelizing this, either in dask or numpy?
Thanks!
Rich
Does vindex work? https://docs.dask.org/en/stable/generated/dask.array.Array.vindex.html#dask.array.Array.vindex
If not, I would chunk so that all timesteps are in one block, and then map_blocks your permutation.
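A rough sketch of that rechunk + map_blocks idea (the array sizes, chunking, and the 0-based days_rand are assumptions for illustration; days_rand is taken to be an in-memory NumPy array):

```python
import numpy as np
import dask.array as da

nlat, nlon, ndays = 180, 360, 365
var_old = da.random.random((nlat, nlon, ndays), chunks=(45, 90, ndays))
# per-grid-point permutations of the day indices (0-based here)
days_rand = np.argsort(np.random.rand(nlat, nlon, ndays), axis=2)

# keep the full time axis in one chunk so each block can be permuted locally
var_old = var_old.rechunk({0: "auto", 1: "auto", 2: -1})

def _permute_block(block, block_info=None):
    # which lat/lon slab of the global array does this block cover?
    (i0, i1), (j0, j1), _ = block_info[0]["array-location"]
    idx = days_rand[i0:i1, j0:j1, :]
    # plain NumPy advanced indexing within the block: each grid point's
    # time axis is reordered by its own permutation
    ii = np.arange(block.shape[0])[:, None, None]
    jj = np.arange(block.shape[1])[None, :, None]
    return block[ii, jj, idx]

var_new = var_old.map_blocks(_permute_block, dtype=var_old.dtype)
```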
Oh this might work: https://numpy.org/doc/stable/reference/generated/numpy.take_along_axis.html#numpy.take_along_axis
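For what it's worth, a small self-contained sketch of that last suggestion in pure NumPy (sizes and the 0-based days_rand are made up for illustration):

```python
import numpy as np

nlat, nlon, ndays = 4, 5, 365
var_old = np.random.rand(nlat, nlon, ndays)
# a different random permutation of the day indices at every grid point
days_rand = np.argsort(np.random.rand(nlat, nlon, ndays), axis=2)

# one vectorized call replaces the ilat/ilon double loop:
# var_new[i, j, t] == var_old[i, j, days_rand[i, j, t]]
var_new = np.take_along_axis(var_old, days_rand, axis=2)
```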