geowatch.utils.util_param_grid module
Handles GitHub-Actions-like parameter matrices.
The main function of interest here is expand_param_grid() and its underlying workhorse, extended_github_action_matrix().
- geowatch.utils.util_param_grid.handle_yaml_grid(default, auto, arg)
Unused
Example
>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> default = {}
>>> auto = {}
>>> arg = ub.codeblock(
>>>     '''
>>>     matrix:
>>>         foo: ['bar', 'baz']
>>>     include:
>>>         - {'foo': 'buz', 'bug': 'boop'}
>>>     ''')
>>> grid = handle_yaml_grid(default, auto, arg)
>>> print(f'grid = {ub.urepr(grid, nl=1)}')

>>> default = {'baz': [1, 2, 3]}
>>> arg = '''
>>>     include:
>>>     - {
>>>           "thresh": 0.1,
>>>           "morph_kernel": 3,
>>>           "norm_ord": 1,
>>>           "agg_fn": "probs",
>>>           "thresh_hysteresis": "None",
>>>           "moving_window_size": "None",
>>>           "polygon_fn": "heatmaps_to_polys"
>>>       }
>>>     '''
>>> handle_yaml_grid(default, auto, arg)
- geowatch.utils.util_param_grid.coerce_list_of_action_matrices(arg)
Preprocess the parameter grid input into a standard form
CommandLine
xdoctest -m geowatch.utils.util_param_grid coerce_list_of_action_matrices
Example
>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> arg = ub.codeblock(
    '''
    matrices:
      - matrix:
            foo: bar
      - matrix:
            foo: baz
    '''
    )
>>> arg = coerce_list_of_action_matrices(arg)
>>> print(arg)
>>> assert len(arg) == 2
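To make "standard form" concrete, here is a minimal sketch of the kind of normalization involved. It works on already-parsed data rather than YAML text, and the helper name `sketch_coerce_matrices` and the exact set of accepted input shapes are assumptions drawn from the examples on this page, not the library's implementation.

```python
def sketch_coerce_matrices(data):
    # Hypothetical helper: normalize parsed YAML into a list of
    # action-matrix dictionaries. Accepted shapes are assumptions.
    if isinstance(data, dict) and 'matrices' in data:
        # A top-level "matrices" key already holds the list.
        data = data['matrices']
    if isinstance(data, dict):
        # A single action matrix becomes a one-item list.
        data = [data]
    if not isinstance(data, list):
        raise TypeError(f'expected a dict or list, got {type(data).__name__}')
    return [dict(item) for item in data]


from_matrices_key = sketch_coerce_matrices(
    {'matrices': [{'matrix': {'foo': 'bar'}}, {'matrix': {'foo': 'baz'}}]})
from_list = sketch_coerce_matrices([{'matrix': {'foo': 'bar'}}])
from_single = sketch_coerce_matrices({'matrix': {'foo': ['bar', 'baz']}})
print(len(from_matrices_key), len(from_list), len(from_single))
```

Returning a list in every case lets downstream code iterate over matrices without checking which form the caller supplied.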
- geowatch.utils.util_param_grid.prevalidate_param_grid(arg)
Determine if something may go wrong
- geowatch.utils.util_param_grid.expand_param_grid(arg, max_configs=None)
Our own method for specifying many combinations. Uses the GitHub Actions matrix method under the hood, with our own extensions (see extended_github_action_matrix()).
- Parameters:
arg (str | Dict) – text or parsed yaml that defines the grid. Handled by coerce_list_of_action_matrices().
max_configs (int | None) – if specified, restrict to generating at most this number of configs. NOTE: may be removed in the future to reduce complexity. It is easy enough to get this behavior with itertools.islice().
- Yields:
dict – a concrete item from the grid
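The note about itertools.islice() can be illustrated with a short sketch. Because the grid is yielded lazily, slicing the generator caps the number of configs without materializing the rest. `toy_grid` and its parameter names are hypothetical stand-ins for expand_param_grid(arg), used so the snippet runs without geowatch installed.

```python
import itertools


def toy_grid():
    # Hypothetical stand-in for expand_param_grid(arg): lazily yields
    # one config dict per combination of two made-up parameters.
    for lr, depth in itertools.product([0.1, 0.01, 0.001], [2, 4]):
        yield {'lr': lr, 'depth': depth}


# Take at most 4 configs; the remaining combinations are never generated.
first_four = list(itertools.islice(toy_grid(), 4))
print(first_four)
```

The same pattern should apply to the real generator, e.g. `itertools.islice(expand_param_grid(arg), n)`.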
Example
>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> arg = ub.codeblock(
    '''
    - matrix:
        trk.pxl.model: [trk_a, trk_b]
        trk.pxl.data.tta_time: [0, 4]
        trk.pxl.data.set_cover_algo: [None, approx]
        trk.pxl.data.test_dataset: [D4_S2_L8]
        act.pxl.model: [act_a, act_b]
        act.pxl.data.test_dataset: [D4_WV_PD, D4_WV]
        act.pxl.data.input_space_scale: [1GSD, 4GSD]
        trk.poly.thresh: [0.17]
        act.poly.thresh: [0.13]
      exclude:
        #
        # The BAS A should not run with tta
        - trk.pxl.model: trk_a
          trk.pxl.data.tta_time: 4
        # The BAS B should not run without tta
        - trk.pxl.model: trk_b
          trk.pxl.data.tta_time: 0
        #
        # The SC B should not run on the PD dataset when GSD is 1
        - act.pxl.model: act_b
          act.pxl.data.test_dataset: D4_WV_PD
          act.pxl.data.input_space_scale: 1GSD
        # The SC A should not run on the WV dataset when GSD is 4
        - act.pxl.model: act_a
          act.pxl.data.test_dataset: D4_WV
          act.pxl.data.input_space_scale: 4GSD
        #
        # The BAS A and SC B models should not run together
        - trk.pxl.model: trk_a
          act.pxl.model: act_b
        # Other misc exclusions to make the output cleaner
        - trk.pxl.model: trk_b
          act.pxl.data.input_space_scale: 4GSD
        - trk.pxl.data.set_cover_algo: None
          act.pxl.data.input_space_scale: 1GSD
      include:
        # only try the 10GSD scale for trk model A
        - trk.pxl.model: trk_a
          trk.pxl.data.input_space_scale: 10GSD
    ''')
>>> grid_items = list(expand_param_grid(arg))
>>> print('grid_items = {}'.format(ub.urepr(grid_items, nl=1, sort=0)))
>>> from geowatch.utils.util_dotdict import dotdict_to_nested
>>> print(ub.urepr([dotdict_to_nested(p) for p in grid_items], nl=-3, sort=0))
>>> print(len(grid_items))
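The example above prints each grid item twice: once with flat dotted keys and once after nesting. As a rough illustration of what that nesting means, the sketch below splits dotted keys into nested dictionaries. `nest_dotted_keys` is a hypothetical helper written for this illustration; it is not geowatch's dotdict_to_nested and may differ from it in edge cases.

```python
def nest_dotted_keys(flat):
    # Hypothetical helper: split each key on '.' and build the
    # corresponding nested dictionaries.
    nested = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split('.')
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


item = {
    'trk.pxl.model': 'trk_a',
    'trk.pxl.data.tta_time': 0,
    'trk.poly.thresh': 0.17,
}
nested_item = nest_dotted_keys(item)
print(nested_item)
```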
- geowatch.utils.util_param_grid.github_action_matrix(arg)
Implements the github action matrix strategy exactly as described.
Unless I’ve implemented something incorrectly, I believe this method is limited, so I have extended it in extended_github_action_matrix().
- Parameters:
arg (Dict | str) – a dictionary or a yaml file that resolves to a dictionary containing the key “matrix”, which maps parameters to a list of possible values. For convenience, if a single scalar value is detected it is converted to a list of 1 item. The matrix may also contain “include” and “exclude” items, which are lists of dictionaries that modify existing / add new matrix configurations or remove them. The “include” and “exclude” parameters can also be specified at the same level as “matrix” for convenience.
- Yields:
dict – a single entry in the grid.
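The matrix / include / exclude rules described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the module's implementation: `sketch_action_matrix` is a hypothetical name, it takes already-parsed arguments instead of YAML, and it applies excludes before includes, following the GitHub Actions documentation.

```python
import itertools


def sketch_action_matrix(matrix, include=(), exclude=()):
    # Hypothetical helper. Scalars are wrapped into one-item lists,
    # as the parameter description above states.
    keys = list(matrix)
    value_lists = [v if isinstance(v, list) else [v] for v in matrix.values()]
    combos = [dict(zip(keys, vals)) for vals in itertools.product(*value_lists)]
    if not keys:
        combos = []

    # exclude: drop any combination that matches every key/value pair
    # of at least one exclude entry (omitted keys act as wildcards).
    combos = [
        combo for combo in combos
        if not any(all(k in combo and combo[k] == v for k, v in ex.items())
                   for ex in exclude)
    ]

    # include: extend every combination whose original matrix values are
    # not contradicted; otherwise the entry becomes a new combination.
    originals = [dict(combo) for combo in combos]
    extras = []
    for inc in include:
        matched = False
        for orig, combo in zip(originals, combos):
            if all(orig[k] == v for k, v in inc.items() if k in orig):
                combo.update(inc)
                matched = True
        if not matched:
            extras.append(dict(inc))
    return combos + extras


grid = sketch_action_matrix(
    matrix={'fruit': ['apple', 'pear'], 'animal': ['cat', 'dog']},
    include=[
        {'color': 'green'},
        {'color': 'pink', 'animal': 'cat'},
        {'fruit': 'apple', 'shape': 'circle'},
        {'fruit': 'banana'},
        {'fruit': 'banana', 'animal': 'cat'},
    ],
)
for item in grid:
    print(item)
```

Include entries are compared against the original matrix values only, so a later include may overwrite a key added by an earlier include (here, `color`) but never a matrix key.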
CommandLine
xdoctest -m geowatch.utils.util_param_grid github_action_matrix:2
Example
>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> arg = ub.codeblock(
    '''
    matrix:
        fruit: [apple, pear]
        animal: [cat, dog]
        include:
            - color: green
            - color: pink
              animal: cat
            - fruit: apple
              shape: circle
            - fruit: banana
            - fruit: banana
              animal: cat
    ''')
>>> grid_items = list(github_action_matrix(arg))
>>> print('grid_items = {}'.format(ub.urepr(grid_items, nl=1)))
grid_items = [
    {'fruit': 'apple', 'animal': 'cat', 'color': 'pink', 'shape': 'circle'},
    {'fruit': 'apple', 'animal': 'dog', 'color': 'green', 'shape': 'circle'},
    {'fruit': 'pear', 'animal': 'cat', 'color': 'pink'},
    {'fruit': 'pear', 'animal': 'dog', 'color': 'green'},
    {'fruit': 'banana'},
    {'fruit': 'banana', 'animal': 'cat'},
]
Example
>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> arg = ub.codeblock(
    '''
    matrix:
        os: [macos-latest, windows-latest]
        version: [12, 14, 16]
        environment: [staging, production]
        exclude:
            - os: macos-latest
              version: 12
              environment: production
            - os: windows-latest
              version: 16
    ''')
>>> grid_items = list(github_action_matrix(arg))
>>> print('grid_items = {}'.format(ub.urepr(grid_items, nl=1)))
grid_items = [
    {'os': 'macos-latest', 'version': 12, 'environment': 'staging'},
    {'os': 'macos-latest', 'version': 14, 'environment': 'staging'},
    {'os': 'macos-latest', 'version': 14, 'environment': 'production'},
    {'os': 'macos-latest', 'version': 16, 'environment': 'staging'},
    {'os': 'macos-latest', 'version': 16, 'environment': 'production'},
    {'os': 'windows-latest', 'version': 12, 'environment': 'staging'},
    {'os': 'windows-latest', 'version': 12, 'environment': 'production'},
    {'os': 'windows-latest', 'version': 14, 'environment': 'staging'},
    {'os': 'windows-latest', 'version': 14, 'environment': 'production'},
]
Example
>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> arg = ub.codeblock(
    '''
    matrix:
        old_variable:
            - null
            - auto
        include:
            - old_variable: null
              new_variable: 1
            - old_variable: null
              new_variable: 2
    ''')
>>> grid_items = list(github_action_matrix(arg))
>>> print('grid_items = {}'.format(ub.urepr(grid_items, nl=1)))
- geowatch.utils.util_param_grid.extended_github_action_matrix(arg)
A variant of the github action matrix for our mlops framework that overcomes some of the former limitations.
This keeps the same weird include / exclude semantics, but adds an additional “submatrix” component with the following semantics.
A submatrices entry is a list of dictionaries, but each dictionary may map a key to more than one value, and each is expanded into a list of items in the same way the matrix dictionary is. In this respect the submatrix is “resolved” to a list of dictionary items just like “include”. The difference is that when the common elements of a submatrix grid item match a matrix grid item, the matrix item is updated with the new values and yielded immediately. Subsequent submatrix grid items can yield different variations of the same matrix item. The GitHub Actions include rules are then applied on top of this.
- Parameters:
arg (Dict | str) – See github_action_matrix, but with new submatrices
- Yields:
dict – a single entry in the grid.
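The submatrix semantics described above can be sketched as follows. This is a simplified illustration, not the module's implementation: the names `expand_grid` and `apply_submatrices` are hypothetical, include / exclude handling is omitted, and the rule that a matrix item matching no submatrix item passes through unchanged is an assumption inferred from the item counts asserted in the examples on this page.

```python
import itertools


def expand_grid(params):
    # Wrap scalars into one-item lists, then take the Cartesian product.
    keys = list(params)
    value_lists = [v if isinstance(v, list) else [v] for v in params.values()]
    return [dict(zip(keys, vals)) for vals in itertools.product(*value_lists)]


def apply_submatrices(base_items, submatrices):
    # Resolve every submatrix dictionary into concrete grid items.
    sub_items = [item for sub in submatrices for item in expand_grid(sub)]
    for base in base_items:
        matched = False
        for sub in sub_items:
            common = base.keys() & sub.keys()
            if all(base[k] == sub[k] for k in common):
                # One variation of the base item per matching submatrix item.
                matched = True
                yield {**base, **sub}
        if not matched:
            # Assumption: items that match nothing pass through unchanged.
            yield dict(base)


matrix = {'common_variable': ['a', 'b'], 'old_variable': [None, 'auto']}
submatrices = [
    {'old_variable': None, 'new_variable1': [1, 2], 'new_variable2': [3, 4]},
    {'old_variable': None, 'new_variable2': [33, 44]},
    {'old_variable': 'blag', 'new_variable': [10, 20]},  # never matches
]
items = list(apply_submatrices(expand_grid(matrix), submatrices))
print(len(items))
```

Each of the two items with `old_variable: null` expands into six variations (four from the first submatrix, two from the second), and the two `auto` items pass through, giving 14 items in total. Several submatrix groups (for example `submatrices1` and `submatrices2`) can be modeled by feeding the output of one call into the next.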
CommandLine
xdoctest -m geowatch.utils.util_param_grid extended_github_action_matrix:2
Example
>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> from geowatch.utils import util_param_grid
>>> arg = ub.codeblock(
    '''
    matrix:
        fruit: [apple, pear]
        animal: [cat, dog]
    submatrices1:
        - color: green
        - color: pink
          animal: cat
        - fruit: apple
          shape: circle
        - fruit: banana
        - fruit: banana
          animal: cat
    ''')
>>> grid_items = list(extended_github_action_matrix(arg))
>>> print('grid_items = {}'.format(ub.urepr(grid_items, nl=1)))
Example
>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> arg = ub.codeblock(
    '''
    matrix:
        os: [macos-latest, windows-latest]
        version: [12, 14, 16]
        environment: [staging, production]
        exclude:
            - os: macos-latest
              version: 12
              environment: production
            - os: windows-latest
              version: 16
    ''')
>>> grid_items = list(extended_github_action_matrix(arg))
>>> print('grid_items = {}'.format(ub.urepr(grid_items, nl=1)))
Example
>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> from geowatch.utils import util_param_grid
>>> # Specifying an explicit list of things to run
>>> arg = ub.codeblock(
    '''
    submatrices:
        - common_variable: a
          old_variable: a
        - common_variable: a
          old_variable: null
          new_variable: 1
        - common_variable: a
          old_variable: null
          new_variable: 11
        - common_variable: a
          old_variable: null
          new_variable: 2
        - common_variable: b
          old_variable: null
          new_variable: 22
    ''')
>>> grid_items = list(extended_github_action_matrix(arg))
>>> print('grid_items = {}'.format(ub.urepr(grid_items, nl=1)))
>>> assert len(grid_items) == 5
Example
>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> from geowatch.utils import util_param_grid
>>> arg = ub.codeblock(
    '''
    matrix:
        common_variable:
            - a
            - b
        old_variable:
            - null
            - auto
    submatrices:
        - old_variable: null
          new_variable1:
              - 1
              - 2
          new_variable2:
              - 3
              - 4
        - old_variable: null
          new_variable2:
              - 33
              - 44
        # These won't be used because blag doesn't exist
        - old_variable: blag
          new_variable:
              - 10
              - 20
    ''')
>>> grid_items = list(extended_github_action_matrix(arg))
>>> print('grid_items = {}'.format(ub.urepr(grid_items, nl=1)))
>>> assert len(grid_items) == 14
Example
>>> from geowatch.utils.util_param_grid import *  # NOQA
>>> from geowatch.utils import util_param_grid
>>> arg = ub.codeblock(
    '''
    matrix:
        step1.src:
            - dset1
            - dset2
            - dset3
            - dset4
        step1.resolution:
            - 10
            - 20
            - 30
    submatrices1:
        - step1.resolution: 10
          step2.resolution: [10, 15]
        - step1.resolution: 20
          step2.resolution: 20
    submatrices2:
        - step1.src: dset1
          step2.src: big_dset1A
        - step1.src: dset2
          step2.src:
              - big_dset2A
              - big_dset2B
        - step1.src: dset3
          step2.src: big_dset3A
    ''')
>>> grid_items = list(extended_github_action_matrix(arg))
>>> print('grid_items = {}'.format(ub.urepr(grid_items, nl=1)))
>>> assert len(grid_items) == 20