geowatch.utils.util_param_grid module

Handles GitHub Actions-like parameter matrices.

The main function of interest here is expand_param_grid() and its underlying workhorse: extended_github_action_matrix().

geowatch.utils.util_param_grid.handle_yaml_grid(default, auto, arg)[source]

Unused

Example

>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> default = {}
>>> auto = {}
>>> arg = ub.codeblock(
>>>     '''
>>>     matrix:
>>>         foo: ['bar', 'baz']
>>>     include:
>>>         - {'foo': 'buz', 'bug': 'boop'}
>>>     ''')
>>> grid = handle_yaml_grid(default, auto, arg)
>>> print(f'grid = {ub.urepr(grid, nl=1)}')
>>> default = {'baz': [1, 2, 3]}
>>> arg = '''
>>>     include:
>>>     - {
>>>       "thresh": 0.1,
>>>       "morph_kernel": 3,
>>>       "norm_ord": 1,
>>>       "agg_fn": "probs",
>>>       "thresh_hysteresis": "None",
>>>       "moving_window_size": "None",
>>>       "polygon_fn": "heatmaps_to_polys"
>>>     }
>>>     '''
>>> handle_yaml_grid(default, auto, arg)
geowatch.utils.util_param_grid.coerce_list_of_action_matrices(arg)[source]

Preprocess the parameter grid input into a standard form

CommandLine

xdoctest -m geowatch.utils.util_param_grid coerce_list_of_action_matrices

Example

>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> arg = ub.codeblock(
    '''
    matrices:
      - matrix:
            foo: bar
      - matrix:
            foo: baz
    '''
    )
>>> arg = coerce_list_of_action_matrices(arg)
>>> print(arg)
>>> assert len(arg) == 2
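
The idea of a "standard form" can be illustrated with a small sketch. It is not the geowatch implementation: `coerce_matrices_sketch` is a hypothetical name, it works on already-parsed dictionaries instead of YAML text, and the three accepted shapes are assumptions based only on the inputs that appear in the examples on this page.

```python
def coerce_matrices_sketch(data):
    """Hypothetical normalization: always return a list of matrix dicts."""
    if isinstance(data, dict) and 'matrices' in data:
        # A top-level "matrices" key already holds a list of entries.
        return list(data['matrices'])
    if isinstance(data, dict):
        # A single matrix specification is wrapped in a list.
        return [data]
    if isinstance(data, list):
        # A bare list of matrix specifications is passed through.
        return list(data)
    raise TypeError(f'unsupported grid specification: {type(data)!r}')


parsed = {'matrices': [{'matrix': {'foo': 'bar'}}, {'matrix': {'foo': 'baz'}}]}
matrices = coerce_matrices_sketch(parsed)
print(len(matrices))
```

With this shape, downstream code only ever has to loop over a list, regardless of how the grid was written.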
geowatch.utils.util_param_grid.prevalidate_param_grid(arg)[source]

Determine if something may go wrong

geowatch.utils.util_param_grid.expand_param_grid(arg, max_configs=None)[source]

Our own method for specifying many combinations. Uses the GitHub Actions matrix method under the hood, with our own extensions (see extended_github_action_matrix()).

Parameters:
  • arg (str | Dict) – text or parsed yaml that defines the grid. Handled by coerce_list_of_action_matrices().

  • max_configs (int | None) – if specified, restricts generation to at most this number of configs. NOTE: may be removed in the future to reduce complexity. It is easy enough to get this behavior with itertools.islice().

Yields:

dict – a concrete item from the grid
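
The `max_configs` cap can be reproduced on any generator with `itertools.islice`. The sketch below uses a hypothetical stand-in generator, `toy_grid`, and not `expand_param_grid` itself, so that it runs without geowatch installed.

```python
import itertools


def toy_grid(matrix):
    """Hypothetical stand-in for a grid generator: one dict per combination."""
    keys = list(matrix)
    for combo in itertools.product(*(matrix[k] for k in keys)):
        yield dict(zip(keys, combo))


matrix = {'model': ['a', 'b'], 'tta_time': [0, 4], 'scale': ['1GSD', '4GSD']}

# Take at most 3 configurations without materializing the full grid.
first_three = list(itertools.islice(toy_grid(matrix), 3))
print(first_three)
```

Because the grid is produced lazily, the remaining combinations are never built.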

Example

>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> arg = ub.codeblock(
    '''
    - matrix:
        trk.pxl.model: [trk_a, trk_b]
        trk.pxl.data.tta_time: [0, 4]
        trk.pxl.data.set_cover_algo: [None, approx]
        trk.pxl.data.test_dataset: [D4_S2_L8]
        act.pxl.model: [act_a, act_b]
        act.pxl.data.test_dataset: [D4_WV_PD, D4_WV]
        act.pxl.data.input_space_scale: [1GSD, 4GSD]
        trk.poly.thresh: [0.17]
        act.poly.thresh: [0.13]
        exclude:
          #
          # The BAS A should not run with tta
          - trk.pxl.model: trk_a
            trk.pxl.data.tta_time: 4
          # The BAS B should not run without tta
          - trk.pxl.model: trk_b
            trk.pxl.data.tta_time: 0
          #
          # The SC B should not run on the PD dataset when GSD is 1
          - act.pxl.model: act_b
            act.pxl.data.test_dataset: D4_WV_PD
            act.pxl.data.input_space_scale: 1GSD
          # The SC A should not run on the WV dataset when GSD is 4
          - act.pxl.model: act_a
            act.pxl.data.test_dataset: D4_WV
            act.pxl.data.input_space_scale: 4GSD
          #
          # The BAS A and SC B model should not run together
          - trk.pxl.model: trk_a
            act.pxl.model: act_b
          # Other misc exclusions to make the output cleaner
          - trk.pxl.model: trk_b
            act.pxl.data.input_space_scale: 4GSD
          - trk.pxl.data.set_cover_algo: None
            act.pxl.data.input_space_scale: 1GSD
        include:
          #
          # only try the 10GSD scale for trk model A
          - trk.pxl.model: trk_a
            trk.pxl.data.input_space_scale: 10GSD
    ''')

>>> grid_items = list(expand_param_grid(arg))
>>> print('grid_items = {}'.format(ub.urepr(grid_items, nl=1, sort=0)))
>>> from geowatch.utils.util_dotdict import dotdict_to_nested
>>> print(ub.urepr([dotdict_to_nested(p) for p in grid_items], nl=-3, sort=0))
>>> print(len(grid_items))
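
The grid items in this example use dotted keys such as `trk.pxl.model`, which the example converts to nested dictionaries with `dotdict_to_nested`. The general idea of that conversion can be sketched as follows; `nest_dotted_keys` is a hypothetical stand-in written for illustration and may differ from the geowatch helper in its details.

```python
def nest_dotted_keys(flat):
    """Hypothetical sketch: turn {'a.b.c': 1} into {'a': {'b': {'c': 1}}}."""
    nested = {}
    for dotted, value in flat.items():
        *parents, leaf = dotted.split('.')
        node = nested
        for part in parents:
            # Walk down the tree, creating intermediate dictionaries as needed.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


item = {
    'trk.pxl.model': 'trk_a',
    'trk.pxl.data.tta_time': 0,
    'act.poly.thresh': 0.13,
}
nested_item = nest_dotted_keys(item)
print(nested_item)
```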
geowatch.utils.util_param_grid.github_action_matrix(arg)[source]

Implements the github action matrix strategy exactly as described.

Unless I’ve implemented something incorrectly, I believe this method is limited and have extended it in extended_github_action_matrix().

Parameters:

arg (Dict | str) – a dictionary or a yaml file that resolves to a dictionary containing the key “matrix”, which maps parameters to a list of possible values. For convenience, if a single scalar value is detected it is converted to a list of 1 item. The matrix may also include an “include” and “exclude” item, which are lists of dictionaries that modify existing / add new matrix configurations or remove them. The “include” and “exclude” parameters can also be specified at the same level as “matrix” for convenience.

Yields:

dict – a single entry in the grid.

References

https://docs.github.com/en/actions/using-jobs/using-a-matrix-for-your-jobs#expanding-or-adding-matrix-configurations

CommandLine

xdoctest -m geowatch.utils.util_param_grid github_action_matrix:2

Example

>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> arg = ub.codeblock(
         '''
           matrix:
             fruit: [apple, pear]
             animal: [cat, dog]
             include:
               - color: green
               - color: pink
                 animal: cat
               - fruit: apple
                 shape: circle
               - fruit: banana
               - fruit: banana
                 animal: cat
         ''')
>>> grid_items = list(github_action_matrix(arg))
>>> print('grid_items = {}'.format(ub.urepr(grid_items, nl=1)))
grid_items = [
    {'fruit': 'apple', 'animal': 'cat', 'color': 'pink', 'shape': 'circle'},
    {'fruit': 'apple', 'animal': 'dog', 'color': 'green', 'shape': 'circle'},
    {'fruit': 'pear', 'animal': 'cat', 'color': 'pink'},
    {'fruit': 'pear', 'animal': 'dog', 'color': 'green'},
    {'fruit': 'banana'},
    {'fruit': 'banana', 'animal': 'cat'},
]
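
The include rules behind the output above can be sketched in plain Python. This is a simplified illustration and not the geowatch implementation: `include_sketch` is a hypothetical name, the inputs are already-parsed dictionaries, and exclude handling is omitted. An include entry is merged into every matrix combination whose original values it does not contradict; if it fits none, it becomes a new standalone item.

```python
import itertools


def include_sketch(matrix, include):
    """Hypothetical sketch of the GitHub Actions "include" semantics."""
    keys = list(matrix)
    base = [dict(zip(keys, combo))
            for combo in itertools.product(*(matrix[k] for k in keys))]
    # Keep the original combinations separate so their values are never overwritten.
    items = [dict(b) for b in base]
    extra = []
    for inc in include:
        matched = False
        for orig, item in zip(base, items):
            # The entry applies when it agrees with every original key it shares.
            if all(orig[k] == v for k, v in inc.items() if k in orig):
                item.update(inc)
                matched = True
        if not matched:
            extra.append(dict(inc))
    return items + extra


matrix = {'fruit': ['apple', 'pear'], 'animal': ['cat', 'dog']}
include = [
    {'color': 'green'},
    {'color': 'pink', 'animal': 'cat'},
    {'fruit': 'apple', 'shape': 'circle'},
    {'fruit': 'banana'},
    {'fruit': 'banana', 'animal': 'cat'},
]
include_items = include_sketch(matrix, include)
for row in include_items:
    print(row)
```

Note that later include entries may overwrite values added by earlier ones (here `color: pink` replaces `color: green` for the cat items), while values from the original matrix are left untouched.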

Example

>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> arg = ub.codeblock(
        '''
          matrix:
            os: [macos-latest, windows-latest]
            version: [12, 14, 16]
            environment: [staging, production]
            exclude:
              - os: macos-latest
                version: 12
                environment: production
              - os: windows-latest
                version: 16
    ''')
>>> grid_items = list(github_action_matrix(arg))
>>> print('grid_items = {}'.format(ub.urepr(grid_items, nl=1)))
grid_items = [
    {'os': 'macos-latest', 'version': 12, 'environment': 'staging'},
    {'os': 'macos-latest', 'version': 14, 'environment': 'staging'},
    {'os': 'macos-latest', 'version': 14, 'environment': 'production'},
    {'os': 'macos-latest', 'version': 16, 'environment': 'staging'},
    {'os': 'macos-latest', 'version': 16, 'environment': 'production'},
    {'os': 'windows-latest', 'version': 12, 'environment': 'staging'},
    {'os': 'windows-latest', 'version': 12, 'environment': 'production'},
    {'os': 'windows-latest', 'version': 14, 'environment': 'staging'},
    {'os': 'windows-latest', 'version': 14, 'environment': 'production'},
]
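
The exclude rule that produces the nine items above can be sketched in a few lines. This is an illustrative stand-in, not the geowatch code: `exclude_sketch` is a hypothetical name and the inputs are already-parsed dictionaries. A combination is dropped when every key/value pair of some exclude entry matches it, so a partial entry such as `os: windows-latest, version: 16` removes all environments for that pair.

```python
import itertools


def exclude_sketch(matrix, exclude):
    """Hypothetical sketch of the GitHub Actions "exclude" semantics."""
    keys = list(matrix)
    for combo in itertools.product(*(matrix[k] for k in keys)):
        item = dict(zip(keys, combo))
        # Drop the item if some exclude entry matches it on all of its keys.
        excluded = any(
            all(k in item and item[k] == v for k, v in exc.items())
            for exc in exclude
        )
        if not excluded:
            yield item


matrix = {
    'os': ['macos-latest', 'windows-latest'],
    'version': [12, 14, 16],
    'environment': ['staging', 'production'],
}
exclude = [
    {'os': 'macos-latest', 'version': 12, 'environment': 'production'},
    {'os': 'windows-latest', 'version': 16},
]
exclude_items = list(exclude_sketch(matrix, exclude))
print(len(exclude_items))
```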

Example

>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> arg = ub.codeblock(
         '''
         matrix:
           old_variable:
               - null
               - auto
         include:
             - old_variable: null
               new_variable: 1
             - old_variable: null
               new_variable: 2
         ''')
>>> grid_items = list(github_action_matrix(arg))
>>> print('grid_items = {}'.format(ub.urepr(grid_items, nl=1)))
geowatch.utils.util_param_grid.extended_github_action_matrix(arg)[source]

A variant of the github action matrix for our mlops framework that overcomes some of the former limitations.

This keeps the same weird include / exclude semantics, but adds an additional “submatrix” component that has the following semantics.

A submatrices entry is a list of dictionaries, where each dictionary may map a key to more than one value. Each dictionary is expanded into a list of items in the same way a matrix is, so the submatrices are “resolved” to a list of dictionary items just like “include”. The difference is that when the common elements of a submatrix grid item match a matrix grid item, the matrix item is updated with the new values and yielded immediately. Subsequent submatrix grid items can yield different variations of the same matrix item. The GitHub Actions include rules are then applied on top of this.

Parameters:

arg (Dict | str) – See github_action_matrix, but with new submatrices

Yields:

dict – a single entry in the grid.
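
The submatrix semantics described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the geowatch implementation: all function names are hypothetical, inputs are already-parsed dictionaries, include and exclude handling is omitted, and multiple submatrix groups are assumed to be applied one after another. A matrix item that matches no submatrix item is assumed to be yielded unchanged.

```python
import itertools


def as_list(value):
    """Wrap scalars so that every matrix value is a list."""
    return list(value) if isinstance(value, (list, tuple)) else [value]


def expand(spec):
    """Expand a dict of lists into one dict per combination."""
    keys = list(spec)
    for combo in itertools.product(*(as_list(spec[k]) for k in keys)):
        yield dict(zip(keys, combo))


def apply_submatrices(items, submatrices):
    """Yield one variant per matching submatrix item, or the item itself."""
    resolved = [sub for entry in submatrices for sub in expand(entry)]
    for item in items:
        matched = False
        for sub in resolved:
            # A submatrix item matches when it agrees on every shared key.
            if all(item[k] == v for k, v in sub.items() if k in item):
                yield {**item, **sub}
                matched = True
        if not matched:
            yield item


def submatrix_sketch(matrix, *submatrix_groups):
    items = expand(matrix)
    for group in submatrix_groups:
        items = apply_submatrices(items, group)
    return list(items)


matrix = {'common_variable': ['a', 'b'], 'old_variable': [None, 'auto']}
submatrices = [
    {'old_variable': None, 'new_variable1': [1, 2], 'new_variable2': [3, 4]},
    {'old_variable': None, 'new_variable2': [33, 44]},
    {'old_variable': 'blag', 'new_variable': [10, 20]},  # never matches
]
sub_items = submatrix_sketch(matrix, submatrices)
print(len(sub_items))
```

Under these assumptions the sketch produces 14 items for this input, which agrees with the count asserted in the corresponding doctest for this function: each of the two `old_variable: null` matrix items gains six variants, and the two `auto` items pass through unchanged.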

CommandLine

xdoctest -m geowatch.utils.util_param_grid extended_github_action_matrix:2

Example

>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> from geowatch.utils import util_param_grid
>>> arg = ub.codeblock(
         '''
           matrix:
             fruit: [apple, pear]
             animal: [cat, dog]
             submatrices1:
               - color: green
               - color: pink
                 animal: cat
               - fruit: apple
                 shape: circle
               - fruit: banana
               - fruit: banana
                 animal: cat
         ''')
>>> grid_items = list(extended_github_action_matrix(arg))
>>> print('grid_items = {}'.format(ub.urepr(grid_items, nl=1)))

Example

>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> arg = ub.codeblock(
        '''
          matrix:
            os: [macos-latest, windows-latest]
            version: [12, 14, 16]
            environment: [staging, production]
            exclude:
              - os: macos-latest
                version: 12
                environment: production
              - os: windows-latest
                version: 16
    ''')
>>> grid_items = list(extended_github_action_matrix(arg))
>>> print('grid_items = {}'.format(ub.urepr(grid_items, nl=1)))

Example

>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> from geowatch.utils import util_param_grid
>>> # Specifying an explicit list of things to run
>>> arg = ub.codeblock(
         '''
         submatrices:
            - common_variable: a
              old_variable: a
            - common_variable: a
              old_variable: null
              new_variable: 1
            - common_variable: a
              old_variable: null
              new_variable: 11
            - common_variable: a
              old_variable: null
              new_variable: 2
            - common_variable: b
              old_variable: null
              new_variable: 22
         ''')
>>> grid_items = list(extended_github_action_matrix(arg))
>>> print('grid_items = {}'.format(ub.urepr(grid_items, nl=1)))
>>> assert len(grid_items) == 5

Example

>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> from geowatch.utils import util_param_grid
>>> arg = ub.codeblock(
         '''
         matrix:
           common_variable:
               - a
               - b
           old_variable:
               - null
               - auto
         submatrices:
             - old_variable: null
               new_variable1:
                   - 1
                   - 2
               new_variable2:
                   - 3
                   - 4
             - old_variable: null
               new_variable2:
                   - 33
                   - 44
             # These won't be used because blag doesn't exist
             - old_variable: blag
               new_variable:
                   - 10
                   - 20
         ''')
>>> grid_items = list(extended_github_action_matrix(arg))
>>> print('grid_items = {}'.format(ub.urepr(grid_items, nl=1)))
>>> assert len(grid_items) == 14

Example

>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> from geowatch.utils import util_param_grid
>>> arg = ub.codeblock(
         '''
         matrix:
           step1.src:
               - dset1
               - dset2
               - dset3
               - dset4
           step1.resolution:
               - 10
               - 20
               - 30
         submatrices1:
            - step1.resolution: 10
              step2.resolution: [10, 15]
            - step1.resolution: 20
              step2.resolution: 20
         submatrices2:
            - step1.src: dset1
              step2.src: big_dset1A
            - step1.src: dset2
              step2.src:
                 - big_dset2A
                 - big_dset2B
            - step1.src: dset3
              step2.src: big_dset3A
         ''')
>>> grid_items = list(extended_github_action_matrix(arg))
>>> print('grid_items = {}'.format(ub.urepr(grid_items, nl=1)))
>>> assert len(grid_items) == 20