geowatch.cli.coco_combine_features module

Combine kwcoco files with different “auxiliary” / “asset” features into a single kwcoco file.

class geowatch.cli.coco_combine_features.CocoCombineFeatures(*args, **kwargs)

Bases: DataConfig

Combine kwcoco files with different “auxiliary” / “asset” features into a single kwcoco file.

The names of the kwcoco images in all of the input src datasets must be the same.
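The name requirement above can be checked before combining. The following is a minimal sketch, not part of geowatch: it assumes each dataset is a plain dictionary in the kwcoco JSON layout (an 'images' list whose entries carry a 'name'), and the helper name find_missing_names is hypothetical.

```python
from typing import Dict, List


def find_missing_names(datasets: List[dict]) -> Dict[int, List[str]]:
    """
    Hypothetical helper: map the index of each dataset to the image names
    it lacks relative to the union of names over all datasets.
    An empty result means every dataset has the same image names.
    """
    # Collect the set of image names in each dataset.
    name_sets = [{img['name'] for img in dset['images']} for dset in datasets]
    all_names = set().union(*name_sets)
    missing = {}
    for idx, names in enumerate(name_sets):
        diff = sorted(all_names - names)
        if diff:
            missing[idx] = diff
    return missing


# Toy stand-ins for kwcoco files; image ids may differ, names must agree.
part1 = {'images': [{'id': 1, 'name': 'a'}, {'id': 2, 'name': 'b'}]}
part2 = {'images': [{'id': 7, 'name': 'b'}, {'id': 8, 'name': 'a'}]}
part3 = {'images': [{'id': 1, 'name': 'a'}]}

ok = find_missing_names([part1, part2])
bad = find_missing_names([part1, part2, part3])
```

Here ok is empty because both parts share the names 'a' and 'b', while bad reports that the third dataset lacks 'b'.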

Todo

  • [ ] This might go in kwcoco proper? This could be folded into “union”

Valid options: []

Parameters:
  • *args – positional arguments for this data config

  • **kwargs – keyword arguments for this data config

default = {'absolute': <Value(False)>, 'dst': <Value(None)>, 'io_workers': <Value('avail')>, 'src': <Value([])>}
geowatch.cli.coco_combine_features.main(cmdline=True, **kwargs)

Example

>>> from geowatch.cli import coco_combine_features
>>> import geowatch
>>> import ubelt as ub
>>> dset = geowatch.coerce_kwcoco('geowatch-msi')
>>> dpath = ub.Path.appdir('geowatch/tests/combine_features').ensuredir()
>>> # Breakup the data into two parts with different features
>>> dset1 = dset.copy()
>>> dset2 = dset.copy()
>>> dset1.fpath = dpath / 'part1.kwcoco.json'
>>> dset2.fpath = dpath / 'part2.kwcoco.json'
>>> # Remove all but the first asset from dset1
>>> for coco_img in dset1.images().coco_images:
...     del coco_img.img['auxiliary'][1:]
>>> # Remove the first asset from dset2
>>> for coco_img in dset2.images().coco_images:
...     del coco_img.img['auxiliary'][0]
>>> dset1.dump()
>>> dset2.dump()
>>> from geowatch.utils import kwcoco_extensions
>>> chan_stats0 = kwcoco_extensions.coco_channel_stats(dset)['chan_hist']
>>> chan_stats1 = kwcoco_extensions.coco_channel_stats(dset1)['chan_hist']
>>> chan_stats2 = kwcoco_extensions.coco_channel_stats(dset2)['chan_hist']
>>> assert chan_stats1 != chan_stats0, 'channels should be different'
>>> # Combining the two modified kwcoco files should result in the original
>>> dst_fpath = dpath / 'combo.kwcoco.json'
>>> kwargs = {
>>>     'src': [str(dset1.fpath), str(dset2.fpath)],
>>>     'dst': str(dst_fpath),
>>> }
>>> cmdline = 0
>>> coco_combine_features.main(cmdline=cmdline, **kwargs)
>>> dst_dset = geowatch.coerce_kwcoco(dst_fpath)
>>> chan_stats3 = kwcoco_extensions.coco_channel_stats(dst_dset)['chan_hist']
>>> assert chan_stats3 == chan_stats0, (
>>>     'combined features should match those of the original dset')

Example

>>> # xdoctest: +REQUIRES(env:DVC_DPATH)
>>> # xdoctest: +SKIP
>>> # drop1-S2-L8-aligned-old deprecated
>>> from geowatch.cli.coco_combine_features import *  # NOQA
>>> import os
>>> _default = ub.expandpath('$HOME/data/dvc-repos/smart_watch_dvc')
>>> dvc_dpath = ub.Path(os.environ.get('DVC_DPATH', _default))
>>> fpath1 = dvc_dpath / 'drop1-S2-L8-aligned/data.kwcoco.json'
>>> #fpath1 = dvc_dpath / 'drop1-S2-L8-aligned-old/data.kwcoco.json'
>>> fpath2 = dvc_dpath / 'drop1-S2-L8-aligned-old/uky_invariants.kwcoco.json'
>>> fpath3 = dvc_dpath / 'drop1-S2-L8-aligned/_testcombo.kwcoco.json'
>>> assert fpath1.exists()
>>> assert fpath2.exists()
>>> cmdline = False
>>> kwargs = {
>>>     'src': [str(fpath1), str(fpath2)],
>>>     'dst': str(fpath3),
>>> }
>>> main(cmdline, **kwargs)
geowatch.cli.coco_combine_features.combine_auxiliary_features(dst_dset, src_dsets)

Copies all assets that dst_dset does not already have from src_dsets into dst_dset.

Updates each image in dst_dset with every asset it is missing (as determined by the ‘channels’ attribute) from the corresponding image in each of the src_dsets.

Parameters:
  • dst_dset (kwcoco.CocoDataset) – modified in place

  • src_dsets (List[kwcoco.CocoDataset])

Returns:

The input dst_dset, modified in place.

Return type:

kwcoco.CocoDataset
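The per-image update described above can be sketched with plain dictionaries. This is an illustration under stated assumptions, not the geowatch implementation: each image is a dict with an 'auxiliary' list of assets, each asset has a 'channels' string, and the helper name merge_assets_by_channels is hypothetical.

```python
from typing import Dict, List


def merge_assets_by_channels(dst_img: Dict, src_imgs: List[Dict]) -> Dict:
    """
    Hypothetical helper: append to dst_img every asset whose 'channels'
    value is not already present. Modifies dst_img in place and returns it.
    """
    dst_assets = dst_img.setdefault('auxiliary', [])
    # Channels that the destination image already provides.
    have = {asset.get('channels') for asset in dst_assets}
    for src_img in src_imgs:
        for asset in src_img.get('auxiliary', []):
            chan = asset.get('channels')
            if chan not in have:
                dst_assets.append(asset)
                have.add(chan)
    return dst_img


# Toy images that share a name but carry different assets.
dst_img = {'name': 'img1', 'auxiliary': [
    {'channels': 'B1', 'file_name': 'img1_B1.tif'}]}
src_img_a = {'name': 'img1', 'auxiliary': [
    {'channels': 'B1', 'file_name': 'other_B1.tif'},
    {'channels': 'B8', 'file_name': 'img1_B8.tif'}]}
src_img_b = {'name': 'img1', 'auxiliary': [
    {'channels': 'B11', 'file_name': 'img1_B11.tif'}]}

merged = merge_assets_by_channels(dst_img, [src_img_a, src_img_b])
merged_channels = [asset['channels'] for asset in merged['auxiliary']]
```

In this sketch the duplicate 'B1' asset from src_img_a is skipped, so the destination keeps its own copy and gains only 'B8' and 'B11'.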

Example

>>> from geowatch.cli.coco_combine_features import *  # NOQA
>>> import kwcoco
>>> base = kwcoco.CocoDataset.demo('vidshapes8-multispectral')
>>> dset1 = base.copy()
>>> dset2 = base.copy()
>>> dset3 = base.copy()
>>> dset4 = base.copy()
>>> for img in dset1.index.imgs.values():
>>>     del img['auxiliary'][0::3]
>>> for img in dset2.index.imgs.values():
>>>     del img['auxiliary'][1::3]
>>> dset2.remove_images([2, 3])
>>> for img in dset3.index.imgs.values():
>>>     del img['auxiliary'][2::3]
>>> dset3.remove_images([2, 3])
>>> for img in dset4.index.imgs.values():
>>>     del img['auxiliary'][0::2]
>>> dset4.remove_images([2, 3])
>>> dst_dset = dset1
>>> src_dsets = [dset2, dset3, dset4]
>>> for img in dset1.index.imgs.values():
...     assert len(img['auxiliary']) != 5
>>> dst_dset = combine_auxiliary_features(dst_dset, src_dsets)
>>> lens1 = list(map(len, dset1.images(set(dset1.imgs) - {2, 3}).lookup('auxiliary')))
>>> assert ub.allsame([5] + lens1)
>>> lens2 = list(map(len, dset1.images({2, 3}).lookup('auxiliary')))
>>> assert ub.allsame([3] + lens2)
geowatch.cli.coco_combine_features.associate_images(dset1, dset2)

Get image ids for images in two datasets that share the same name.

This is a heuristic for getting pairs of images that correspond between two datasets.

Parameters:
  • dset1 (kwcoco.CocoDataset)

  • dset2 (kwcoco.CocoDataset)

Return type:

Tuple[List[int], List[int], Dict]
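Name-based association can be sketched as follows. This is a rough illustration, not the geowatch code: the inputs are plain lists of image dicts with 'id' and 'name' keys, the helper name associate_by_name is hypothetical, and the contents of the returned dictionary are an assumption, since the documentation above does not describe them.

```python
from typing import Dict, List, Tuple


def associate_by_name(imgs1: List[dict], imgs2: List[dict]) -> Tuple[List[int], List[int], Dict]:
    """
    Hypothetical helper: return aligned lists of image ids for images
    that share a name, plus a small summary dictionary.
    """
    name_to_gid1 = {img['name']: img['id'] for img in imgs1}
    name_to_gid2 = {img['name']: img['id'] for img in imgs2}
    # Sort the shared names so the output order is deterministic.
    common = sorted(set(name_to_gid1) & set(name_to_gid2))
    gids1 = [name_to_gid1[name] for name in common]
    gids2 = [name_to_gid2[name] for name in common]
    # Assumed report layout: counts of shared and unshared names.
    report = {
        'num_common': len(common),
        'num_only_in_1': len(set(name_to_gid1) - set(name_to_gid2)),
        'num_only_in_2': len(set(name_to_gid2) - set(name_to_gid1)),
    }
    return gids1, gids2, report


imgs1 = [{'id': 1, 'name': 'a'}, {'id': 2, 'name': 'b'}, {'id': 3, 'name': 'c'}]
imgs2 = [{'id': 10, 'name': 'c'}, {'id': 11, 'name': 'a'}, {'id': 12, 'name': 'z'}]

gids1, gids2, report = associate_by_name(imgs1, imgs2)
```

The two id lists are aligned, so gids1[i] and gids2[i] refer to images with the same name even when the ids themselves differ between datasets.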