geowatch.cli.smartflow_ingress module
- class geowatch.cli.smartflow_ingress.SmartflowIngressConfig(*args, **kwargs)
Bases: DataConfig
Ingress KWCOCO data to T&E baseline framework structure
Valid options: []
- Parameters:
*args – positional arguments for this data config
**kwargs – keyword arguments for this data config
- default = {'assets': <Value(None)>, 'aws_profile': <Value(None)>, 'dont_error_on_missing_asset': <Value(False)>, 'dryrun': <Value(False)>, 'input_path': <Value(None)>, 'outdir': <Value(None)>, 'show_progress': <Value(False)>}
- geowatch.cli.smartflow_ingress.smartflow_ingress(input_path, assets, outdir, aws_profile=None, dryrun=False, show_progress=False, dont_error_on_missing_asset=False)
Downloads a STAC manifest and the selected assets of the items within it.
- Parameters:
input_path (str) – The path in the S3 bucket that the STAC item will be downloaded from.
assets (List[str | Dict]) – A list of keys into the STAC item assets that we will download. Can also be a list of dictionaries, each of which must contain a "key": <str> item, as well as other options to control behavior, like "allow_missing": True.
outdir (str | PathLike) – local path to download to.
aws_profile (str | None) – aws cp argument
dryrun (bool) – aws cp argument
show_progress (bool) – aws cp argument
dont_error_on_missing_asset (bool) – if True, warn instead of raising an error when an asset is missing. TODO: variable name is too long and has a double negative. Maybe rename to “missing_policy” or “ignore_missing”.
- Returns:
mapping from downloaded assets to their local path
- Return type:
Example
>>> from geowatch.cli.smartflow_ingress import *  # NOQA
>>> import json
>>> import os
>>> import ubelt as ub
>>> dpath = ub.Path.appdir('geowatch/tests/smartflow_ingress/dst').ensuredir()
>>> fake_remote = ub.Path.appdir('geowatch/tests/smartflow_ingress/fake_remote').ensuredir()
>>> fake_fpath = fake_remote / 'my_path.txt'
>>> fake_fpath.write_text('foobar')
>>> fake_dpath = (fake_remote / 'my_dir').ensuredir()
>>> (fake_dpath / 'content1').touch()
>>> (fake_dpath / 'content2').touch()
>>> (fake_dpath / 'subdir1').ensuredir()
>>> (fake_dpath / 'subdir1/subcontent1').touch()
>>> (fake_dpath / 'subdir1/subcontent2').touch()
>>> (fake_dpath / 'subdir1/subsubdir').ensuredir()
>>> # Save this dummy stac item locally
>>> # In practice we download it, but we are using dry run mode
>>> # so we can't do that here.
>>> demo_stac_content = {'raw_images': [],
>>>  'stac': {'type': 'FeatureCollection',
>>>   'features': [{'type': 'Feature',
>>>     'stac_version': '1.0.0',
>>>     'stac_extensions': [],
>>>     'id': '66d3e2f605a44aa8b7bacc6ce7e96b9a',
>>>     'geometry': {'type': 'Polygon',
>>>      'coordinates': (((-109.56, 44.56),
>>>        (-109.57, 44.55),
>>>        (-109.53, 44.56),
>>>        (-109.56, 44.56)),)},
>>>     'bbox': [-109.57, 44.52, -109.51, 44.56],
>>>     'properties': {},
>>>     'assets': {'asset_file1': {'href': str(fake_fpath)},
>>>                'asset_dir1': {'href': str(fake_dpath)}}}]}}
>>> remote_dpath = (dpath / 'remote').ensuredir()
>>> input_path = remote_dpath / 'items.jsonl'
>>> input_path.write_text(json.dumps(demo_stac_content))
>>> outdir = (dpath / 'local').ensuredir()
>>> assets = ['asset_file1', 'asset_dir1', {'key': 'foobar', 'allow_missing': True}]
>>> kwcoco_stac_item_assets = smartflow_ingress(
>>>     input_path,
>>>     assets,
>>>     outdir,
>>> )
>>> assert kwcoco_stac_item_assets['asset_file1'] == os.fspath(outdir / 'my_path.txt')
>>> assert kwcoco_stac_item_assets['asset_dir1'] == os.fspath(outdir / 'my_dir')
>>> assert len(ub.Path(kwcoco_stac_item_assets['asset_dir1']).ls()) > 0
>>> assert ub.Path(kwcoco_stac_item_assets['asset_file1']).exists()
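The assets parameter accepts a mix of plain string keys and dictionaries carrying a "key" plus options such as "allow_missing". The sketch below illustrates one way such a mixed list could be normalized and matched against a STAC item's assets mapping. It is not geowatch's implementation: the helper names normalize_asset_specs and select_asset_hrefs, the ignore_missing argument (standing in for dont_error_on_missing_asset), and the exact missing-asset policy are all assumptions made for illustration.

```python
import warnings


def normalize_asset_specs(assets):
    """Hypothetical helper: coerce every entry into a dict with a "key"."""
    specs = []
    for entry in assets:
        if isinstance(entry, str):
            # A bare string is shorthand for a required asset.
            specs.append({'key': entry, 'allow_missing': False})
        elif isinstance(entry, dict):
            if 'key' not in entry:
                raise ValueError(f'asset spec has no "key": {entry!r}')
            # Fill in the default, letting the caller's options win.
            specs.append({'allow_missing': False, **entry})
        else:
            raise TypeError(f'unsupported asset spec: {entry!r}')
    return specs


def select_asset_hrefs(item_assets, assets, ignore_missing=False):
    """Hypothetical helper: map each requested key to its href.

    Assumed policy: a missing asset is skipped with a warning when its
    spec sets allow_missing or when ignore_missing is True; otherwise a
    KeyError is raised.
    """
    selected = {}
    for spec in normalize_asset_specs(assets):
        key = spec['key']
        if key in item_assets:
            selected[key] = item_assets[key]['href']
        elif spec['allow_missing'] or ignore_missing:
            warnings.warn(f'asset {key!r} is missing; skipping')
        else:
            raise KeyError(key)
    return selected


# Illustrative data shaped like the "assets" entry of the demo STAC item.
item_assets = {
    'asset_file1': {'href': '/fake_remote/my_path.txt'},
    'asset_dir1': {'href': '/fake_remote/my_dir'},
}
assets = ['asset_file1', 'asset_dir1', {'key': 'foobar', 'allow_missing': True}]

with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # silence the expected "missing" warning
    hrefs = select_asset_hrefs(item_assets, assets)
```

Normalizing up front keeps the per-asset logic uniform: every entry is a dictionary by the time the missing-asset policy is applied, so string and dictionary inputs go through the same code path.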