geowatch.cli.smartflow_egress module

class geowatch.cli.smartflow_egress.SmartflowEgressConfig(*args, **kwargs)[source]

Bases: DataConfig

Egress KWCOCO data to the T&E baseline framework structure.

Valid options: ['assets', 'aws_profile', 'dryrun', 'input_path', 'outdir', 'show_progress']

Parameters:
  • *args – positional arguments for this data config

  • **kwargs – keyword arguments for this data config

default = {
    'assets': <Value(None)>,
    'aws_profile': <Value(None)>,
    'dryrun': <Value(False)>,
    'input_path': <Value(None)>,
    'outdir': <Value(None)>,
    'show_progress': <Value(False)>,
}
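
A minimal construction sketch, assuming the standard scriptconfig DataConfig behavior where defaults may be overridden with keyword arguments (the paths below are hypothetical, for illustration only):

>>> from geowatch.cli.smartflow_egress import SmartflowEgressConfig
>>> config = SmartflowEgressConfig(
>>>     input_path='/tmp/ingress/kwcoco.json',   # hypothetical local input
>>>     outdir='s3://example-bucket/egress',     # hypothetical destination
>>>     dryrun=True,
>>> )
>>> # DataConfig instances support dictionary-style access
>>> print(config['dryrun'])
True
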
geowatch.cli.smartflow_egress.main()[source]
geowatch.cli.smartflow_egress.smartflow_egress_with_arg_processing(assetnames_and_paths, region_path, output_path, outbucket, aws_profile=None, dryrun=False, newline=False, show_progress=False)[source]
geowatch.cli.smartflow_egress.smartflow_egress(assetnames_and_local_paths, region_path, output_path, outbucket, aws_profile=None, dryrun=False, newline=False, show_progress=False)[source]

Uploads specified assets to S3 with a STAC manifest.

Parameters:
  • assetnames_and_local_paths (Dict) – Mapping from an asset name to the local path to upload. The asset name will be indexable in the uploaded STAC item. Any local path specified multiple times will only be uploaded once, but multiple STAC assets will be associated with it.

  • region_path (str | PathLike) – local path to the region file associated with a processing node

  • output_path (str) – The path in the S3 bucket that the STAC item will be uploaded to.

  • outbucket (str) – The S3 bucket that assets will be uploaded to.

  • aws_profile (str | None) – AWS profile name; forwarded to the underlying ``aws s3 cp`` invocations

  • dryrun (bool) – if True, perform no actual uploads and only report what would be transferred

  • newline (bool) – controls whether the output STAC items file is written as newline-delimited JSON (JSONL) or formatted JSON

  • show_progress (bool) – if True, display upload progress

Returns:

Mapping from asset names to their uploaded output paths

Return type:

Dict

CommandLine

xdoctest -m geowatch.cli.smartflow_egress smartflow_egress

Example

>>> from geowatch.cli.smartflow_egress import *  # NOQA
>>> from geowatch.geoannots.geomodels import RegionModel
>>> from os.path import join
>>> import ubelt as ub
>>> dpath = ub.Path.appdir('geowatch/tests/smartflow_egress').ensuredir()
>>> local_dpath = (dpath / 'local').ensuredir()
>>> remote_root = (dpath / 'fake_s3_loc').ensuredir()
>>> #outbucket = 's3://fake/bucket'
>>> outbucket = remote_root
>>> output_path = join(outbucket, 'items.jsonl')
>>> region = RegionModel.random()
>>> region_path = dpath / 'demo_region.geojson'
>>> region_path.write_text(region.dumps())
>>> assetnames_and_local_paths = {
>>>     'asset_file1': dpath / 'my_path1.txt',
>>>     'asset_file2': dpath / 'my_path2.txt',
>>>     'asset_file_reference': dpath / 'my_path1.txt',
>>>     'asset_dir1': dpath / 'my_dir1',
>>> }
>>> # Generate local data we will pretend to egress
>>> assetnames_and_local_paths['asset_file1'].write_text('foobar1')
>>> assetnames_and_local_paths['asset_file2'].write_text('foobar2')
>>> assetnames_and_local_paths['asset_dir1'].ensuredir()
>>> (assetnames_and_local_paths['asset_dir1'] / 'data1').write_text('data1')
>>> (assetnames_and_local_paths['asset_dir1'] / 'data2').write_text('data2')
>>> te_output = smartflow_egress(
>>>     assetnames_and_local_paths,
>>>     region_path,
>>>     output_path,
>>>     outbucket,
>>>     newline=False,
>>> )
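
A quick follow-up check on the result (this assumes, per the docstring, that the returned mapping keeps the input asset names as keys):

>>> # Every input asset should have an egressed output path
>>> for name in assetnames_and_local_paths:
>>>     assert name in te_output
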
geowatch.cli.smartflow_egress.fallback_copy(local_path, asset_s3_outpath)[source]

Copying with fsspec alone has been observed to cause issues, so this provides a fallback to a raw ``aws s3`` CLI command and adds extra verbosity.
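
A minimal sketch of the pattern this implements, i.e. try fsspec first and shell out to the AWS CLI on failure (the function name, retry logic, and logging here are illustrative assumptions, not the actual implementation):

import os
import subprocess
import fsspec


def fallback_copy_sketch(local_path, asset_s3_outpath, aws_profile=None):
    """Try an fsspec upload first; fall back to the ``aws s3 cp`` CLI."""
    local_path = os.fspath(local_path)
    try:
        # s3fs (the fsspec S3 backend) accepts a named AWS profile
        fs = fsspec.filesystem('s3', profile=aws_profile)
        fs.put(local_path, asset_s3_outpath, recursive=True)
    except Exception as ex:
        print(f'fsspec copy failed ({ex!r}); retrying with the AWS CLI')
        command = ['aws', 's3', 'cp', local_path, asset_s3_outpath]
        if aws_profile is not None:
            # i.e. `aws --profile NAME s3 cp SRC DST`
            command[1:1] = ['--profile', aws_profile]
        if os.path.isdir(local_path):
            command.append('--recursive')
        subprocess.run(command, check=True)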