geowatch.mlops.aggregate module¶
Loads results from an evaluation, aggregates them, and reports text or visual results.
This is the main entry point for the mlops.aggregate CLI. It contains the logic to consolidate rows of results into macro averages and to compute a parameter hash id (param_hashid) for each row. It also contains the basic text report logic (although maybe that should be moved out?). It relies on several other files in this directory:
aggregate_loader.py - handles the loading of individual rows from mlops output
aggregate_plots.py - handles plotting relationships between parameters and metrics
smart_global_helper.py - quick and dirty project-specific stuff that ideally won’t get in the way of general use-cases but should eventually be factored out.
Todo
- [ ] The package_fpath (i.e. model_cols) reporting does heuristics to shorten the path to the package, but we shouldn’t do this. We should make a new column that indicates it is a shortened name for the model; otherwise it is confusing.
- class geowatch.mlops.aggregate.AggregateLoader(*args, **kwargs)[source]¶
Bases: DataConfig
Base config that will be mixed into AggregateEvluationConfig. This config just defines the parts related to constructing Aggregator objects (i.e. loading the tables).
Valid options: []
- Parameters:
*args – positional arguments for this data config
**kwargs – keyword arguments for this data config
- default = {'cache_resolved_results': <Value(True)>, 'display_metric_cols': <Value('auto')>, 'eval_nodes': <Value(None)>, 'io_workers': <Value('avail')>, 'pipeline': <Value('joint_bas_sc')>, 'primary_metric_cols': <Value('auto')>, 'target': <Value(None)>}¶
- normalize()¶
- class geowatch.mlops.aggregate.AggregateEvluationConfig(*args, **kwargs)[source]¶
Bases: AggregateLoader
Aggregates results from multiple DAG evaluations.
Valid options: []
- Parameters:
*args – positional arguments for this data config
**kwargs – keyword arguments for this data config
- default = {'cache_resolved_results': <Value(True)>, 'custom_query': <Value(None)>, 'display_metric_cols': <Value('auto')>, 'embed': <Value(False)>, 'eval_nodes': <Value(None)>, 'export_tables': <Value(False)>, 'inspect': <Value(None)>, 'io_workers': <Value('avail')>, 'output_dpath': <Value('./aggregate')>, 'pipeline': <Value('joint_bas_sc')>, 'plot_params': <Value(False)>, 'primary_metric_cols': <Value('auto')>, 'query': <Value(None)>, 'resource_report': <Value(False)>, 'rois': <Value('auto')>, 'snapshot': <Value(False)>, 'stdout_report': <Value(True)>, 'symlink_results': <Value(False)>, 'target': <Value(None)>}¶
- main(**kwargs)¶
Aggregate entry point.
Loads results for each evaluation node_type, constructs aggregator objects, and then executes user-specified commands, which could include filtering, macro-averaging, reporting, plotting, etc.
- normalize()¶
- geowatch.mlops.aggregate.main(cmdline=True, **kwargs)[source]¶
Aggregate entry point.
Loads results for each evaluation node_type, constructs aggregator objects, and then executes user-specified commands, which could include filtering, macro-averaging, reporting, plotting, etc.
- class geowatch.mlops.aggregate.TopResultsReport(region_id_to_summary, top_param_lut)[source]¶
Bases: object
Object to hold the result of Aggregator.report_best().
- class geowatch.mlops.aggregate.AggregatorAnalysisMixin[source]¶
Bases: object
Analysis methods for Aggregator.
- varied_parameter_report(concise=True, concise_value_char_threshold=80)[source]¶
Dump a machine- and human-readable varied parameter report.
- Parameters:
concise (bool) – if True, sacrifice row homogeneity for shorter encodings
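A hypothetical sketch (not the geowatch implementation) of the core of a varied parameter report: given result rows as parameter dicts, find which parameters actually take more than one value. All names here are illustrative.

```python
# Illustrative only: detect which parameters vary across result rows.
rows = [
    {'model': 'A', 'lr': 0.1, 'epochs': 10},
    {'model': 'B', 'lr': 0.1, 'epochs': 20},
    {'model': 'A', 'lr': 0.1, 'epochs': 10},
]

def varied_parameters(rows):
    """Map each parameter that varies to its sorted observed values."""
    varied = {}
    keys = {k for row in rows for k in row}
    for key in sorted(keys):
        values = {row.get(key) for row in rows}
        if len(values) > 1:
            varied[key] = sorted(values, key=repr)
    return varied

print(varied_parameters(rows))
# {'epochs': [10, 20], 'model': ['A', 'B']}
```

Constant parameters (here ``lr``) are dropped, which is what keeps a varied parameter report concise.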
- analyze(metrics_of_interest=None)[source]¶
Performs a statistical analysis on each varied parameter. Note that this makes independence assumptions that may not hold in general.
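A hypothetical sketch of the kind of per-parameter analysis this performs: for each varied parameter, compare the mean metric across its values while treating parameters independently (the independence assumption noted above). Names and data are illustrative.

```python
# Illustrative only: mean metric per value of one parameter.
from statistics import mean

rows = [
    {'lr': 0.1, 'metric': 0.80},
    {'lr': 0.1, 'metric': 0.82},
    {'lr': 0.2, 'metric': 0.90},
    {'lr': 0.2, 'metric': 0.88},
]

def metric_by_param_value(rows, param, metric):
    """Group rows by the parameter value and average the metric per group."""
    groups = {}
    for row in rows:
        groups.setdefault(row[param], []).append(row[metric])
    return {value: mean(vals) for value, vals in groups.items()}

print(metric_by_param_value(rows, 'lr', 'metric'))
```

A real analysis would also need a significance test; comparing raw means is only the first step.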
- report_best(top_k=100, shorten=True, per_group=None, verbose=1, reference_region=None, print_models=False, concise=False, show_csv=False, grouptop=None) TopResultsReport [source]¶
Report the top k pointwise results for each region / macro-region.
Note
Results are chosen per-region independently. To get comparable results for a specific set of parameters, choose a reference_region, which could be a macro region.
- Parameters:
top_k (int) – number of top results for each region
shorten (bool) – if True, shorten the columns by removing non-ambiguous prefixes with respect to a known node eval_type.
concise (bool) – if True, remove certain columns that communicate context for a more concise report.
reference_region (str | None) – if specified, filter the top results in all other regions to only those corresponding to the top results in this region (or macro region). Can be set to the special key “final” to choose the last region, which is typically a macro region.
show_csv (bool) – also print as a CSV suitable for copy/paste into Google Sheets.
grouptop (str | List[str]) – if specified, a list of columns to “suboptimize”: we group the table by these columns (e.g. the model column) and then only consider the “best” scoring results within each group. This can help remove clutter when attempting to choose between values of a specific parameter.
Todo
This might need to become a class that builds the TopResultsReport as it is getting somewhat complex.
- Returns:
contains:
region_id_to_summary (Dict[str, DataFrame]): mapping from region_id to top-k results
top_param_lut (Dict[str, DataFrame]): mapping from param hash to invocation details
- Return type:
TopResultsReport
Example
>>> from geowatch.mlops.aggregate import *  # NOQA
>>> agg = Aggregator.demo(rng=0, num=100).build()
>>> agg.report_best(print_models=True, top_k=3)
>>> agg.report_best(print_models=True, top_k=3, grouptop='special:model')
>>> agg.report_best(print_models=True, top_k=3, grouptop='special:model', reference_region='region1')
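The ``grouptop`` behavior can be sketched without the geowatch machinery: group rows by a column (e.g. the model), keep only the best scoring row per group, then rank the survivors. This is a hypothetical illustration, not the actual implementation; all names are invented.

```python
# Illustrative only: "suboptimize" over a grouping column, then rank.
rows = [
    {'model': 'A', 'param_hashid': 'p1', 'metric': 0.70},
    {'model': 'A', 'param_hashid': 'p2', 'metric': 0.90},
    {'model': 'B', 'param_hashid': 'p3', 'metric': 0.85},
    {'model': 'B', 'param_hashid': 'p4', 'metric': 0.60},
]

def grouptop(rows, group_col, metric_col):
    """Keep the best row per group, then sort the survivors by the metric."""
    best = {}
    for row in rows:
        key = row[group_col]
        if key not in best or row[metric_col] > best[key][metric_col]:
            best[key] = row
    return sorted(best.values(), key=lambda r: r[metric_col], reverse=True)

top = grouptop(rows, 'model', 'metric')
print([r['param_hashid'] for r in top])
# ['p2', 'p3']
```

The lower-scoring configurations of each model (``p1``, ``p4``) are filtered out, which is the clutter reduction described above.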
- class geowatch.mlops.aggregate.Aggregator(table, output_dpath=None, node_type=None, primary_metric_cols='auto', display_metric_cols='auto', dag=None)[source]¶
Bases: NiceRepr, AggregatorAnalysisMixin, _AggregatorDeprecatedMixin
Stores multiple data frames that separate metrics, parameters, and other information using consistent pandas indexing. Can be filtered to a comparable subset of choice. Can also handle building macro-averaged results over different “regions” with the same parameters.
Set config based on your problem
Example
>>> from geowatch.mlops.aggregate import *  # NOQA
>>> agg = Aggregator.demo(rng=0, num=3).build()
>>> print(f'agg.config = {ub.urepr(agg.config, nl=1)}')
>>> print('--- The table of only metrics ---')
>>> print(agg.metrics)
>>> print('--- The table of resource utilization ---')
>>> print(agg.resources)
>>> print('--- The table of explicitly requested hyperparameters (to distinguish from defaults) ---')
>>> print(agg.requested_params)
>>> print('--- The table of resolved hyperparameters ---')
>>> print(agg.resolved_params)
>>> print('--- The table with unique indexes for each experiment ---')
>>> print(agg.index)
>>> print('--- The entire joined table ---')
>>> print(agg.table)
- Parameters:
table (pandas.DataFrame) – a table with a specific column structure (e.g. built by the aggregate_loader). See the demo for an example. Needs more docs here.
output_dpath (None | PathLike) – Path where output aggregate results should be written
node_type (str | None) – should not need to specify this anymore. This should just be the “node” column in the table.
primary_metric_cols (List[str] | Literal[‘auto’]) – if “auto”, then the “node_type” must be known by the global helpers. Otherwise list the metric columns in the priority that should be used to rank the rows.
display_metric_cols (List[str] | Literal[‘auto’]) – if “auto”, then the “node_type” must be known by the global helpers. Otherwise list the metric columns in the order they should be displayed (after the primary metrics).
dag (geowatch.mlops.Pipeline) – The pipeline that the evaluation table corresponds to. Only needed if introspection is necessary. If all “auto” params are specified, this should not be needed.
- classmethod demo(num=10, rng=None)[source]¶
Construct a demo aggregator for testing.
This gives an example of the very particular column format that is expected as input to the aggregator.
- Parameters:
num (int) – number of rows
rng (int | None) – random number generator / state
- Returns:
Aggregator
Example
>>> from geowatch.mlops.aggregate import *  # NOQA
>>> agg = Aggregator.demo(rng=0, num=100)
>>> print(agg.table)
>>> agg.build()
>>> agg.analyze()
>>> agg.resource_summary_table()
>>> agg.report_best()
- build()[source]¶
Inspect the aggregator’s table and build supporting information
- Returns:
returns self for method chaining
- Return type:
Self
- property primary_macro_region¶
- filterto(index=None, models=None, param_hashids=None, query=None)[source]¶
Build a new aggregator with a subset of rows from this one.
- Parameters:
index (List | pd.Index) – a subset of pandas row indexes to restrict to
models (List[str]) – list of effective model names (not paths) to restrict to.
param_hashids (List[str]) – list of parameter hashids to restrict to
query (str) – A custom query string, currently parsed by our_hack_query(), which can either be a DataFrame.query or a simple eval using df as the dataframe variable (i.e. agg.table) that should resolve to flags or indexes indicating which rows to take. See the example for demo usage.
- Returns:
A new aggregator with a subset of data
- Return type:
Aggregator
Example
>>> from geowatch.mlops.aggregate import *  # NOQA
>>> agg = Aggregator.demo(rng=0, num=100)
>>> agg.build()
>>> subagg = agg.filterto(query='df["context.demo_node.uuid"].str.startswith("c")')
>>> assert len(subagg) > 0, 'query should return something'
>>> assert subagg.table['context.demo_node.uuid'].str.startswith('c').all()
>>> assert not agg.table['context.demo_node.uuid'].str.startswith('c').all()
>>> print(subagg.table['context.demo_node.uuid'])
- FIXME:
On 2024-02-12 CI failed this test with: assert len(subagg) > 0, ‘query should return something’ AssertionError: query should return something. Not sure where the non-determinism came from.
Another instance on 2024-04-19. Job log is: https://gitlab.kitware.com/computer-vision/geowatch/-/jobs/9652752
This is likely because of unseeded UUIDs, which should now be fixed.
- property metrics¶
- property resources¶
- property index¶
- property requested_params¶
- property specified_params¶
- property resolved_params¶
- property default_vantage_points¶
- build_effective_params()[source]¶
Consolidate / cleanup / expand information.
THIS COMPUTES THE param_hashid COLUMN!
The “effective params” normalize the full set of given parameters so we can compute a more consistent param_hashid. This is done by condensing paths (which is a debatable design decision) as well as mapping non-hashable data to strings.
Populates:
self.hashid_to_effective_params : Dict[str, Dict[str, Any]]
self.mappings
self.effective_params
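A minimal sketch of how a param_hashid might be derived, assuming the normalization described above (condense paths, stringify non-hashable values) followed by hashing a canonical serialization. The real geowatch implementation differs in its details; every name here is illustrative.

```python
# Illustrative only: normalize parameters, then hash them into a short id.
import hashlib
import json
import os

def effective_params(params):
    """Normalize params so equivalent configurations hash identically."""
    effective = {}
    for key, value in params.items():
        if isinstance(value, str) and os.sep in value:
            # Condense paths: only the basename contributes to the hash.
            value = os.path.basename(value)
        elif not isinstance(value, (str, int, float, bool, type(None))):
            # Map non-hashable data to strings.
            value = str(value)
        effective[key] = value
    return effective

def param_hashid(params):
    canonical = json.dumps(effective_params(params), sort_keys=True)
    return hashlib.sha1(canonical.encode()).hexdigest()[:8]

a = {'model_fpath': '/data/runs/model.pt', 'thresh': 0.3}
b = {'model_fpath': '/other/place/model.pt', 'thresh': 0.3}
assert param_hashid(a) == param_hashid(b)  # same id after path condensing
```

Because both rows condense to the same effective params, they receive the same id, which is what later makes them macro-comparable across regions.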
- find_macro_comparable(verbose=0)[source]¶
Search for groups that have the same parameters over multiple regions.
We determine if two rows have the same parameters by using the param_hashid, so the details of how that is computed (and which parameters are ignored when computing it, e.g. paths to datasets) have a big impact on the behavior of this function.
- SeeAlso:
Aggregator.build_effective_params() - the method that determines which parameters go into the param_hashid, and how to normalize them.
- gather_macro_compatable_groups(regions_of_interest)[source]¶
Given a set of ROIs, find groups in the comparable regions that contain all of the requested ROIs.
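The group-gathering step can be sketched as a set-cover check: given a mapping from each param_hashid to the regions it was evaluated on, keep only the groups whose regions include every requested ROI. This is a hypothetical illustration of the idea, not the actual implementation.

```python
# Illustrative only: keep parameter groups that cover all requested ROIs.
def macro_comparable_groups(hashid_to_regions, rois):
    """Filter to groups whose evaluated regions are a superset of rois."""
    rois = set(rois)
    return {hashid: regions
            for hashid, regions in hashid_to_regions.items()
            if rois <= set(regions)}

hashid_to_regions = {
    'p1': ['region1', 'region2'],
    'p2': ['region1'],                        # missing region2: excluded
    'p3': ['region1', 'region2', 'region3'],
}
print(sorted(macro_comparable_groups(hashid_to_regions, ['region1', 'region2'])))
# ['p1', 'p3']
```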
- build_single_macro_table(rois, average='mean')[source]¶
Builds a single macro table for a choice of regions.
A macro table is a table of parameters and metrics macro-averaged over multiple regions of interest.
There are some hard-coded values in this function, but the core idea is general; they just need to be parameterized correctly.
- Parameters:
rois (List[str]) – names of regions to average together
average (str) – mean or gmean
- Return type:
DataFrame | None
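The averaging itself can be sketched for a single comparable parameter group: each metric is averaged over the chosen regions with either an arithmetic mean or a geometric mean (the two ``average`` choices documented above). This is a hypothetical stand-alone illustration; the real function operates on pandas tables.

```python
# Illustrative only: macro-average one parameter group's metrics over regions.
import math

def macro_average(region_rows, average='mean'):
    """Average each metric key over a list of per-region metric dicts."""
    out = {}
    for key in region_rows[0]:
        vals = [row[key] for row in region_rows]
        if average == 'gmean':
            # Geometric mean via logs (assumes positive metric values).
            out[key] = math.exp(sum(math.log(v) for v in vals) / len(vals))
        else:
            out[key] = sum(vals) / len(vals)
    return out

rows = [{'metric1': 0.5, 'metric2': 0.9},   # region1
        {'metric1': 0.7, 'metric2': 0.8}]   # region2
print(macro_average(rows))            # arithmetic mean per metric
print(macro_average(rows, 'gmean'))   # geometric mean per metric
```

The geometric mean penalizes uneven per-region performance more than the arithmetic mean, which is one reason to offer both.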
- geowatch.mlops.aggregate.aggregate_param_cols(df, aggregator=None, hash_cols=None, allow_nonuniform=False)[source]¶
Aggregates parameter columns. Specified hash_cols should be dataset-specific columns to be hashed. All other columns should be effectively the same, otherwise we will warn.
- Parameters:
hash_cols (None | List[str]) – columns whose values should be hashed together.
- Returns:
a single row representing the combined rows
- Return type:
Todo
[ ] optimize this
[ ] Rectify with ~/code/watch/geowatch/utils/util_pandas.py :: aggregate_columns
Example
>>> from geowatch.mlops.aggregate import *  # NOQA
>>> import pandas as pd
>>> agg = Aggregator.demo(num=3)
>>> agg.build()
>>> df = pd.concat([agg.table] * 3).reset_index()
>>> import scipy.stats.mstats
>>> gmean = scipy.stats.mstats.gmean
>>> aggregator = {'metrics.demo_node.metric1': gmean}
>>> hash_cols = 'param_hashid'
>>> allow_nonuniform = True
>>> hash_cols = ['region_id'] + agg.test_dset_cols
>>> agg_row = aggregate_param_cols(df, aggregator=aggregator, hash_cols=hash_cols, allow_nonuniform=allow_nonuniform)
>>> print(agg_row)
- geowatch.mlops.aggregate.macro_aggregate(agg, group, aggregator, average='mean')[source]¶
Helper function