geowatch.mlops.confusor_analysis module

This is a SMART-specific analysis of TP/FP/TN/FN site cases with visualizations.

#### LORES

DVC_DATA_DPATH=$(geowatch_dvc --tags='phase2_data' --hardware=hdd)
python -m geowatch.mlops.confusor_analysis \
    --metrics_node_dpath /home/joncrall/remote/toothbrush/data/dvc-repos/smart_expt_dvc/_drop7_nowinter_baseline_joint_bas_sc/eval/flat/bas_poly_eval/bas_poly_eval_id_ec937017/ \
    --out_dpath /home/joncrall/remote/toothbrush/data/dvc-repos/smart_expt_dvc/_drop7_nowinter_baseline_joint_bas_sc/eval/flat/bas_poly_eval/bas_poly_eval_id_ec937017/lores-confusion \
    --true_region_dpath="$DVC_DATA_DPATH"/annotations/drop7/region_models \
    --true_site_dpath="$DVC_DATA_DPATH"/annotations/drop7/site_models \
    --viz_sites=True \
    --reload=0

#### TEST WITH HIGHRES KWCOCO

# ON KR_R002

DVC_DATA_DPATH=$(geowatch_dvc --tags='phase2_data' --hardware=hdd)
python -m geowatch.mlops.confusor_analysis \
    --metrics_node_dpath /home/joncrall/remote/toothbrush/data/dvc-repos/smart_expt_dvc/_drop7_nowinter_baseline_joint_bas_sc/eval/flat/bas_poly_eval/bas_poly_eval_id_ec937017/ \
    --true_region_dpath="$DVC_DATA_DPATH"/annotations/drop7/region_models \
    --true_site_dpath="$DVC_DATA_DPATH"/annotations/drop7/site_models \
    --src_kwcoco="$DVC_DATA_DPATH"/Aligned-Drop7/KR_R002/imgonly-KR_R002.kwcoco.zip \
    --viz_sites=True \
    --reload=0

#### TEST WITH AC KWCOCO

DVC_DATA_DPATH=$(geowatch_dvc --tags='phase2_data' --hardware=hdd)
python -m geowatch.mlops.confusor_analysis \
    --metrics_node_dpath /home/joncrall/remote/toothbrush/data/dvc-repos/smart_expt_dvc/_demo_ac_eval/eval/flat/sc_poly_eval/sc_poly_eval_id_f689ba48 \
    --true_region_dpath="$DVC_DATA_DPATH"/annotations/drop7/region_models \
    --true_site_dpath="$DVC_DATA_DPATH"/annotations/drop7/site_models \
    --viz_sites=True \
    --reload=1

class geowatch.mlops.confusor_analysis.ConfusorAnalysisConfig(*args, **kwargs)[source]

Bases: DataConfig

Requires that IARPA metrics are computed

Valid options: []

Parameters:
  • *args – positional arguments for this data config

  • **kwargs – keyword arguments for this data config

default = {'ac_kwcoco': <Value(None)>, 'bas_kwcoco': <Value(None)>, 'bas_metric_dpath': <Value(None)>, 'detections_fpath': <Value(None)>, 'dst_kwcoco': <Value(None)>, 'embed': <Value(False)>, 'metrics_node_dpath': <Value(None)>, 'out_dpath': <Value(None)>, 'performer_id': <Value('kit')>, 'pred_sites': <Value(None)>, 'proposals_fpath': <Value(None)>, 'region_id': <Value(None)>, 'reload': <Value(False)>, 'src_kwcoco': <Value(None)>, 'stage_to_metrics': <Value(None)>, 'stage_to_sites': <Value(None)>, 'strict': <Value(False)>, 'true_region_dpath': <Value(None)>, 'true_site_dpath': <Value(None)>, 'viz_sites': <Value(False)>, 'viz_summary': <Value(False)>}

normalize()

geowatch.mlops.confusor_analysis.main(cmdline=1, **kwargs)[source]

CommandLine

xdoctest -m /home/joncrall/code/watch/geowatch/mlops/confusor_analysis.py main
HAS_DVC=1 xdoctest -m geowatch.mlops.confusor_analysis main:0

Example

>>> # xdoctest: +REQUIRES(env:HAS_DVC)
>>> from geowatch.mlops.confusor_analysis import *  # NOQA
>>> import geowatch
>>> import ubelt as ub
>>> data_dvc_dpath = geowatch.find_dvc_dpath(tags='phase2_data', hardware='auto')
>>> region_id = 'NZ_R001'
>>> true_site_dpath = data_dvc_dpath / 'annotations/drop6/site_models'
>>> true_region_dpath = data_dvc_dpath / 'annotations/drop6/region_models'
>>> src_kwcoco = data_dvc_dpath / f'Drop6-MeanYear10GSD/imgonly-{region_id}.kwcoco.zip'
>>> dst_kwcoco = data_dvc_dpath / f'Drop6-MeanYear10GSD/confusor-{region_id}.kwcoco.zip'
>>> dag_dpath = ub.Path('/data/joncrall/dvc-repos/smart_expt_dvc/_airflow/ta2_preeval10_pyenv_t33_post3')
>>> dpath = dag_dpath / region_id
>>> bas_metric_dpath = dpath / 'metrics/overall/bas'
>>> #bas_metric_dpath = dpath / 'local_metrics' / region_id / 'overall/bas'
>>> out_dpath = dpath / 'local_metrics'
>>> cmdline = 0
>>> kwargs = dict(
>>>     bas_metric_dpath=bas_metric_dpath,
>>>     pred_sites=(dpath / 'sc-fusion/sc_out_site_models'),
>>>     true_site_dpath=true_site_dpath,
>>>     true_region_dpath=true_region_dpath,
>>>     out_dpath=out_dpath,
>>>     dst_kwcoco=dst_kwcoco,
>>>     src_kwcoco=src_kwcoco,
>>>     region_id=region_id,
>>> )
>>> main(cmdline=cmdline, **kwargs)

class geowatch.mlops.confusor_analysis.ConfusionAnalysis(config)[source]

Bases: object

Note: this class is a refactoring of a large mono-function, so its methods must be called in a particular order.

reload()[source]

Reloads data that is assumed to have been written previously.

FIXME:

Be robust to bad files being placed in these directories.

load_geojson_models()[source]

Loads the true and predicted site models

load_confusion_assignment()[source]

Load the association between true and predicted site models computed by the metrics framework.

Note

The possible confusion codes and the corresponding confusion_color are assigned in geowatch.heuristics.IARPA_CONFUSION_COLORS

load_new_stage_stuff()[source]

We should redo the confusion analysis at each stage of the pipeline to determine where mistakes and good decisions are made.

add_confusion_to_geojson_models()[source]

Modify region and site models by adding confusion info to their cache.

Add properties to each site model (and their associated site summaries) indicating the type of confusion they are causing based on the following “confusion specs”.

True Confusion Spec
-------------------

"cache":  {
    "confusion": {
        "true_site_id": str,          # redundant site id information,
        "pred_site_ids": List[str],   # the matching predicted site ids,
        "type": str,                  # the type of true confusion assigned by T&E
        "color": str,                 # a named color coercible via kwimage.Color.coerce
    }
}

Predicted Confusion Spec
------------------------

"cache":  {
    "confusion": {
        "pred_site_id": str,          # redundant site id information,
        "true_site_ids": List[str],   # the matching true site ids,
        "type": str,                  # the type of predicted confusion assigned by T&E
        "color": str,                 # a named color coercible via kwimage.Color.coerce
    }
}
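
As an illustration of the two specs above, the following minimal sketch builds "cache" entries in plain Python. The helper names (make_true_confusion, make_pred_confusion), the site ids, and the type and color values are hypothetical placeholders and are not part of geowatch. Real confusion codes and colors come from geowatch.heuristics.IARPA_CONFUSION_COLORS.

```python
from typing import Any, Dict, List


def make_true_confusion(true_site_id: str, pred_site_ids: List[str],
                        type_: str, color: str) -> Dict[str, Any]:
    """Hypothetical helper: build a cache entry following the True Confusion Spec."""
    return {
        "confusion": {
            "true_site_id": true_site_id,         # redundant site id information
            "pred_site_ids": list(pred_site_ids),  # the matching predicted site ids
            "type": type_,                         # confusion type (placeholder value here)
            "color": color,                        # a named color
        }
    }


def make_pred_confusion(pred_site_id: str, true_site_ids: List[str],
                        type_: str, color: str) -> Dict[str, Any]:
    """Hypothetical helper: build a cache entry following the Predicted Confusion Spec."""
    return {
        "confusion": {
            "pred_site_id": pred_site_id,          # redundant site id information
            "true_site_ids": list(true_site_ids),  # the matching true site ids
            "type": type_,                         # confusion type (placeholder value here)
            "color": color,                        # a named color
        }
    }


# Toy site-model properties; the ids, types, and colors are made up.
true_props = {"site_id": "XX_R001_0001", "cache": {}}
pred_props = {"site_id": "XX_R001_9001", "cache": {}}

true_props["cache"].update(
    make_true_confusion("XX_R001_0001", ["XX_R001_9001"], "example_true_type", "green"))
pred_props["cache"].update(
    make_pred_confusion("XX_R001_9001", ["XX_R001_0001"], "example_pred_type", "blue"))
```

Each side stores the ids of its matches on the other side, so a true site can be traced to its predictions and vice versa without reloading the assignment table.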

build_hard_cases()[source]

Build feedback to retrain on. Does two things:

1. Finds the false positive cases that do not overlap with any truth and adds them as negative examples to a new set of “hard annotations”.

2. Finds the false negative examples and increases their weight.
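
The two steps above can be sketched on toy data as follows. This is not the geowatch implementation: it assumes axis-aligned boxes in place of real site polygons, and the function name build_hard_cases_sketch, the "false_positive" / "false_negative" labels, the "weight" field, and the fn_weight factor are all hypothetical.

```python
from typing import Any, Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def boxes_overlap(a: Box, b: Box) -> bool:
    """Return True if two axis-aligned boxes share interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def build_hard_cases_sketch(true_sites: List[Dict[str, Any]],
                            pred_sites: List[Dict[str, Any]],
                            fn_weight: float = 2.0):
    # Step 1: false positives that overlap no truth become hard negatives.
    hard_negatives = []
    for pred in pred_sites:
        if pred["type"] != "false_positive":
            continue
        if not any(boxes_overlap(pred["box"], t["box"]) for t in true_sites):
            hard_negatives.append(
                {"site_id": pred["site_id"], "box": pred["box"], "status": "negative"})

    # Step 2: false negatives get a larger weight; other truth is unchanged.
    reweighted = []
    for t in true_sites:
        weight = t.get("weight", 1.0)
        if t["type"] == "false_negative":
            weight *= fn_weight
        reweighted.append({**t, "weight": weight})
    return hard_negatives, reweighted


true_sites = [
    {"site_id": "T1", "box": (0, 0, 10, 10), "type": "true_positive"},
    {"site_id": "T2", "box": (50, 50, 60, 60), "type": "false_negative"},
]
pred_sites = [
    {"site_id": "P1", "box": (5, 5, 12, 12), "type": "false_positive"},        # overlaps T1
    {"site_id": "P2", "box": (100, 100, 110, 110), "type": "false_positive"},  # overlaps nothing
    {"site_id": "P3", "box": (1, 1, 9, 9), "type": "true_positive"},
]
hard_negatives, reweighted = build_hard_cases_sketch(true_sites, pred_sites)
```

Here only P2 becomes a hard negative, because P1 overlaps a true site, and only T2 has its weight increased.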

dump_confusion_geojson()[source]

Write confusion geojson files for visualization and analysis

dump_hardneg_geojson()[source]

Write new annotation file that can be fed back to the system.

dump_hardneg_kwcoco()[source]

Write kwcoco files for potential system feedback (not currently used).

dump_confusion_kwcoco()[source]

Write confusion kwcoco files for visualization and analysis

dump_summary_viz()[source]

Write the summary visualization. Noted as too slow.

dump_site_case_viz()[source]

Per-site visualization for analysis and presentations.

build_site_confusion_cases()[source]

Build a set of cases that inspect the predictions of a single site.

geowatch.mlops.confusor_analysis.make_case(pred_sites, true_sites, true_utm_geoms, pred_utm_geoms, main_pred_idx, region_start_date, region_end_date, performer_id, type_)[source]

Build a dict with enough info to make a plot.

geowatch.mlops.confusor_analysis.visualize_case(coco_dset, case, true_id_to_site, pred_id_to_site)[source]

Creates a visualization for a confusion case.

cases = sorted(cases, key=lambda x: x['time_iou'])[::-1]
case = cases[1]

geowatch.mlops.confusor_analysis.make_pred_score_timeline(main_pred_site)[source]
geowatch.mlops.confusor_analysis.make_case_timeline(case)[source]

executor = ub.Executor('process', max_workers=1)
future = executor.submit(make_case_timeline, case)
future.result()

geowatch.mlops.confusor_analysis.visualize_all_timelines(cases, coco_dset, type_to_sites, type_to_summary)[source]
geowatch.mlops.confusor_analysis.differentiate_site_id(site_id, tag)[source]
geowatch.mlops.confusor_analysis.fix_site_id(site_id, region_id, performer_id)[source]
geowatch.mlops.confusor_analysis.coco_upgrade_track_ids(coco_dset)[source]
geowatch.mlops.confusor_analysis.make_summary_visualization(dst_dset, viz_dpath)[source]
geowatch.mlops.confusor_analysis.to_styled_kml(data)[source]

Make a KML version of the geojson that works nicely with QGIS.

geowatch.mlops.confusor_analysis.nan_to_null(x)[source]
geowatch.mlops.confusor_analysis.safediv(a, b)[source]