geowatch.utils.util_kwutil module

Functions that may be moved to kwutil

geowatch.utils.util_kwutil.distributed_subitems(items, max_num=None)

Return a subset of items that is maximally distributed over the input index space, i.e. the chosen indexes maximize the spacing between them.

Parameters:

  • items (List | Dict) – an ordered indexable object

  • max_num (int | None) – the maximum number of items to return

Returns:

a subset of the input with a length at most max_num.

Return type:

List | Dict

Note

Prefer using the generator variant farthest_first() instead.

Todo

  • [X] Find a better name. ChatGPT suggests using “spread”, which I like. Maybe spreadsort, spreadshuffle, spreadtraverse, spreadtake, or takespread?

  • [ ] Figure out where this lives, probably kwutil.

  • [X] Maybe we should force this to be a generator? Or make a generator variant?

Example

>>> from geowatch.utils.util_kwutil import *  # NOQA
>>> items = list(range(100))
>>> max_num = 5
>>> sub_items = distributed_subitems(items, max_num)
>>> print(sub_items)
[0, 25, 50, 75, 99]

Example

>>> from geowatch.utils.util_kwutil import *  # NOQA
>>> items = {chr(i): i for i in range(ord('a'), ord('a') + 26)}
>>> max_num = 5
>>> sub_items = distributed_subitems(items, max_num)
>>> print(sub_items)
{'a': 97, 'g': 103, 'n': 110, 't': 116, 'z': 122}
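The selection in the two examples above can be reproduced with evenly spaced, truncated index positions. The following is a minimal sketch under that assumption, not necessarily how geowatch implements it; the names spread_indexes and spread_take are hypothetical and exist only for illustration.

```python
def spread_indexes(n, max_num=None):
    """Hypothetical helper: choose at most ``max_num`` indexes spread over range(n)."""
    if max_num is None or max_num >= n:
        return list(range(n))  # nothing needs to be dropped
    if max_num <= 0:
        return []
    if max_num == 1:
        return [0]
    # Evenly spaced positions over [0, n], truncated to integers. The final
    # position (n itself) is clamped to the last valid index, n - 1.
    return [min((i * n) // (max_num - 1), n - 1) for i in range(max_num)]


def spread_take(items, max_num=None):
    """Hypothetical stand-in for distributed_subitems (assumed behavior)."""
    idxs = spread_indexes(len(items), max_num)
    if isinstance(items, dict):
        # Dicts preserve insertion order, so keys can be indexed by position.
        keys = list(items)
        return {keys[i]: items[keys[i]] for i in idxs}
    return [items[i] for i in idxs]


print(spread_take(list(range(100)), 5))
print(spread_take({chr(i): i for i in range(ord('a'), ord('a') + 26)}, 5))
```

With these assumptions the two calls print the same list and dict as the documented examples. The first and last items are always kept whenever max_num is at least 2 and smaller than the input length.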

geowatch.utils.util_kwutil.farthest_first(items, max_num=None, first=None)

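Generator variant of distributed_subitems(). Yields a subset of items maximally distributed over the input index space, i.e. the chosen indexes maximize the space between them. As the examples show, items are produced in farthest-first order rather than input order, and a dict input is yielded as (key, value) pairs.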

Parameters:
  • items (List | Dict) – an ordered indexable object

  • max_num (int | None) – the maximum number of items to yield

  • first (int | None) – if specified, start with this index.

Returns:

a subset of the input with a length at most max_num.

Return type:

List | Dict

Example

>>> from geowatch.utils.util_kwutil import *  # NOQA
>>> items = list(range(100))
>>> max_num = 5
>>> sub_items = list(farthest_first(items, max_num))
>>> print(sub_items)
[99, 0, 50, 25, 75]

Example

>>> from geowatch.utils.util_kwutil import *  # NOQA
>>> items = {chr(i): i for i in range(ord('a'), ord('a') + 26)}
>>> max_num = 5
>>> sub_items = list(farthest_first(items, max_num))
>>> print(dict(sub_items))
{'z': 122, 'a': 97, 'n': 110, 'g': 103, 't': 116}
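The ordering in the examples above is consistent with emitting the last index, then the first, then the midpoints of the remaining intervals in breadth-first order. The sketch below illustrates that breadth-first bisection; it is an assumption about the behavior, not the geowatch source. The names bisect_order and farthest_first_sketch are hypothetical, and the first parameter is not modeled here.

```python
from collections import deque
from itertools import islice


def bisect_order(n):
    """Hypothetical helper: yield each index in range(n) exactly once,
    starting with the last and first, then interval midpoints breadth-first."""
    if n <= 0:
        return
    seen = set()
    for idx in (n - 1, 0):
        if idx not in seen:
            seen.add(idx)
            yield idx
    # Each queue entry is a half-open interval [low, high) still to be split.
    queue = deque([(0, n)])
    while queue:
        low, high = queue.popleft()
        mid = (low + high) // 2
        if mid not in seen:
            seen.add(mid)
            yield mid
        # Only intervals wider than one index contain unvisited positions.
        if mid - low > 1:
            queue.append((low, mid))
        if high - mid > 1:
            queue.append((mid, high))


def farthest_first_sketch(items, max_num=None):
    """Hypothetical stand-in for farthest_first (assumed behavior)."""
    idxs = bisect_order(len(items))
    if max_num is not None:
        idxs = islice(idxs, max_num)
    if isinstance(items, dict):
        keys = list(items)
        for i in idxs:
            yield keys[i], items[keys[i]]
    else:
        for i in idxs:
            yield items[i]


print(list(farthest_first_sketch(list(range(100)), 5)))
```

Because the result is a generator, a caller can stop consuming early and still hold a well-spread prefix, which is the practical advantage over building the whole subset up front.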