geowatch.cli.smartflow_ingress module

class geowatch.cli.smartflow_ingress.SmartflowIngressConfig(*args, **kwargs)

Bases: DataConfig

Ingress KWCOCO data into the T&E baseline framework structure.

Valid options: []

Parameters:
  • *args – positional arguments for this data config

  • **kwargs – keyword arguments for this data config

default = {'assets': <Value(None)>, 'aws_profile': <Value(None)>, 'dont_error_on_missing_asset': <Value(False)>, 'dryrun': <Value(False)>, 'input_path': <Value(None)>, 'outdir': <Value(None)>, 'show_progress': <Value(False)>}
geowatch.cli.smartflow_ingress.main()

geowatch.cli.smartflow_ingress.smartflow_ingress(input_path, assets, outdir, aws_profile=None, dryrun=False, show_progress=False, dont_error_on_missing_asset=False)

Downloads a STAC manifest and selected items within it.

Parameters:
  • input_path (str) – The path in the s3 bucket that the STAC item will be downloaded from.

  • assets (List[str | Dict]) – A list of keys into the STAC item assets to download. Entries may also be dictionaries, each of which must contain a "key": <str> item and may carry other options that control behavior, such as "allow_missing": True.

  • outdir (str | PathLike) – local path to download to.

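  • aws_profile (str | None) – name of the AWS profile to use; passed through as an aws cp argument.

  • dryrun (bool) – if True, run aws cp in dry-run mode; passed through as an aws cp argument.

  • show_progress (bool) – if True, show transfer progress; passed through as an aws cp argument.

  • dont_error_on_missing_asset (bool) – if True, warn rather than raise an error when an asset is missing. TODO: the variable name is too long and contains a double negative; maybe rename to “missing_policy” or “ignore_missing”.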

Returns:

mapping from downloaded assets to their local path

Return type:

Dict[str, str | PathLike]
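
The two accepted forms of assets (a bare key, or a dictionary with a "key" item and optional flags) can be brought into one shape before use. The following is a minimal sketch of that idea, not the module's implementation: normalize_asset_specs is a hypothetical helper, and the choice to let entries without an explicit "allow_missing" inherit dont_error_on_missing_asset is an assumption made for illustration.

```python
def normalize_asset_specs(assets, dont_error_on_missing_asset=False):
    """
    Hypothetical helper (not part of geowatch): coerce every entry of
    ``assets`` into a dict of the form {'key': str, 'allow_missing': bool}.
    """
    normalized = []
    for spec in assets:
        if isinstance(spec, str):
            # A bare string is shorthand for a spec with only a key.
            spec = {'key': spec}
        elif isinstance(spec, dict):
            if 'key' not in spec:
                raise KeyError(
                    'asset spec must contain a "key" item: {!r}'.format(spec))
            # Copy so the caller's dictionary is not mutated.
            spec = dict(spec)
        else:
            raise TypeError(
                'asset spec must be a str or dict, got {!r}'.format(type(spec)))
        # Assumption: entries without an explicit flag inherit the
        # function-wide setting.
        spec.setdefault('allow_missing', dont_error_on_missing_asset)
        normalized.append(spec)
    return normalized


specs = normalize_asset_specs(
    ['asset_file1', 'asset_dir1', {'key': 'foobar', 'allow_missing': True}])
for spec in specs:
    print(spec['key'], spec['allow_missing'])
```

With a single shape, the code that handles a missing asset only has to consult one per-asset flag, regardless of how the caller wrote the entry.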

Example

>>> from geowatch.cli.smartflow_ingress import *  # NOQA
>>> import json
>>> import os
>>> import ubelt as ub
>>> dpath = ub.Path.appdir('geowatch/tests/smartflow_ingress/dst').ensuredir()
>>> fake_remote = ub.Path.appdir('geowatch/tests/smartflow_ingress/fake_remote').ensuredir()
>>> fake_fpath = fake_remote / 'my_path.txt'
>>> fake_fpath.write_text('foobar')
>>> fake_dpath = (fake_remote / 'my_dir').ensuredir()
>>> (fake_dpath / 'content1').touch()
>>> (fake_dpath / 'content2').touch()
>>> (fake_dpath / 'subdir1').ensuredir()
>>> (fake_dpath / 'subdir1/subcontent1').touch()
>>> (fake_dpath / 'subdir1/subcontent2').touch()
>>> (fake_dpath / 'subdir1/subsubdir').ensuredir()
>>> # Save this dummy STAC item locally.
>>> # In practice we download it, but we are using dry run mode
>>> # so we can't do that here.
>>> demo_stac_content = {'raw_images': [],
>>>  'stac': {'type': 'FeatureCollection',
>>>   'features': [{'type': 'Feature',
>>>     'stac_version': '1.0.0',
>>>     'stac_extensions': [],
>>>     'id': '66d3e2f605a44aa8b7bacc6ce7e96b9a',
>>>     'geometry': {'type': 'Polygon',
>>>      'coordinates': (((-109.56, 44.56),
>>>        (-109.57, 44.55),
>>>        (-109.53, 44.56),
>>>        (-109.56, 44.56)),)},
>>>     'bbox': [-109.57, 44.52, -109.51, 44.56],
>>>     'properties': {},
>>>     'assets': {'asset_file1': {'href': str(fake_fpath)},
>>>      'asset_dir1': {'href': str(fake_dpath)}}}]}}
>>> remote_dpath = (dpath / 'remote').ensuredir()
>>> input_path = remote_dpath / 'items.jsonl'
>>> input_path.write_text(json.dumps(demo_stac_content))
>>> outdir = (dpath / 'local').ensuredir()
>>> assets = ['asset_file1', 'asset_dir1', {'key': 'foobar', 'allow_missing': True}]
>>> kwcoco_stac_item_assets = smartflow_ingress(
>>>     input_path,
>>>     assets,
>>>     outdir,
>>> )
>>> assert kwcoco_stac_item_assets['asset_file1'] == os.fspath(outdir / 'my_path.txt')
>>> assert kwcoco_stac_item_assets['asset_dir1'] == os.fspath(outdir / 'my_dir')
>>> assert len(ub.Path(kwcoco_stac_item_assets['asset_dir1']).ls()) > 0
>>> assert ub.Path(kwcoco_stac_item_assets['asset_file1']).exists()
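
The demo manifest in the example nests each asset under stac, then features, then assets, with the location stored in href. As a rough sketch of how requested keys could be resolved against that layout, based only on the structure shown in the example and not on the module's actual code (find_asset_hrefs and its allow_missing argument are hypothetical names, and the paths are placeholders):

```python
import warnings


def find_asset_hrefs(manifest, keys, allow_missing=False):
    """
    Hypothetical helper (not part of geowatch): map each requested asset
    key to the href recorded for it in the manifest's STAC features.
    """
    found = {}
    for feature in manifest['stac']['features']:
        for key, asset in feature.get('assets', {}).items():
            if key in keys:
                found[key] = asset['href']
    missing = [key for key in keys if key not in found]
    if missing:
        message = 'assets not found in manifest: {!r}'.format(missing)
        if allow_missing:
            # Mirror the "warn instead of error" behavior described above.
            warnings.warn(message)
        else:
            raise KeyError(message)
    return found


# A trimmed-down manifest with the same nesting as the example above.
manifest = {
    'raw_images': [],
    'stac': {
        'type': 'FeatureCollection',
        'features': [{
            'type': 'Feature',
            'assets': {
                'asset_file1': {'href': '/fake_remote/my_path.txt'},
                'asset_dir1': {'href': '/fake_remote/my_dir'},
            },
        }],
    },
}
hrefs = find_asset_hrefs(manifest, ['asset_file1', 'asset_dir1'])
print(hrefs)
```

Each href found this way is the source location that would be copied into outdir, and the returned mapping of smartflow_ingress records where each asset ended up locally.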