geowatch.tasks.fusion.datamodules.data_utils module
I don't like the name of this file. I want to rename it, but for now it exists to keep the size of the datamodule down.
Todo
[ ] Break BalancedSampleTree and BalancedSampleForest into their own balanced sampling module.
[ ] Make a good augmentation module
[ ] Determine where MultiscaleMask should live.
- geowatch.tasks.fusion.datamodules.data_utils.resolve_scale_request(request=None, data_gsd=None)
Helper for handling user-specified and machine-specified spatial scale requests.
- Parameters:
request (None | float | str) – Indicate a relative or absolute requested scale. If given as a float, this is interpreted as a scale factor relative to the underlying data. If given as a string, it will accept the format “{:f} *GSD” and resolve to an absolute GSD. Defaults to 1.0.
data_gsd (None | float) – if specified, this indicates the GSD of the underlying data. (Only valid for geospatial data). TODO: is there a better generalization?
- Returns:
- resolved: a dictionary containing the keys
scale (float): the scale factor to obtain the requested GSD
gsd (float | None): if data_gsd is given, this is the absolute GSD of the request.
- Return type:
Dict[str, Any]
Note
The returned scale is relative to the DATA. If you are resizing a sampled image, then use it directly, but if you are adjusting a sample WINDOW, then it needs to be used inversely.
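The note above can be made concrete with a little arithmetic. The following is a minimal sketch, not the geowatch implementation: the helper names (relative_scale, resized_image_dsize, native_window_dsize) are hypothetical, and it assumes the relative scale equals data_gsd / request_gsd, which follows from GSD being measured in ground distance per pixel.

```python
def relative_scale(request_gsd, data_gsd):
    """Scale factor (relative to the data) that yields ``request_gsd``.

    Hypothetical helper: a coarser requested GSD means fewer pixels,
    so the factor is data_gsd / request_gsd.
    """
    return data_gsd / request_gsd


def resized_image_dsize(native_dsize, scale):
    """Resizing a sampled image: use the scale directly."""
    w, h = native_dsize
    return (round(w * scale), round(h * scale))


def native_window_dsize(output_dsize, scale):
    """Adjusting a sample window: use the scale inversely."""
    w, h = output_dsize
    return (round(w / scale), round(h / scale))


# Data at 10 GSD, request at 20 GSD: the data must be shrunk by half.
scale = relative_scale(request_gsd=20, data_gsd=10)  # 0.5

# A 128x128 sampled image becomes 64x64 after resizing.
print(resized_image_dsize((128, 128), scale))  # (64, 64)

# To end up with a 64x64 output, the window must cover 128x128 native pixels.
print(native_window_dsize((64, 64), scale))  # (128, 128)
```

The two helpers are inverses of each other, which is the point of the note: the same factor shrinks an image but enlarges the window needed to produce it.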
Example
>>> from geowatch.tasks.fusion.datamodules.data_utils import *  # NOQA
>>> resolve_scale_request(1.0)
>>> resolve_scale_request('native')
>>> resolve_scale_request('10 GSD', data_gsd=10)
>>> resolve_scale_request('20 GSD', data_gsd=10)
Example
>>> from geowatch.tasks.fusion.datamodules.data_utils import *  # NOQA
>>> import ubelt as ub
>>> grid = list(ub.named_product({
>>>     'request': ['10GSD', '30GSD'],
>>>     'data_gsd': [10, 30],
>>> }))
>>> grid += list(ub.named_product({
>>>     'request': [None, 1.0, 2.0, 0.25, 'native'],
>>>     'data_gsd': [None, 10, 30],
>>> }))
>>> for kwargs in grid:
>>>     print('kwargs = {}'.format(ub.urepr(kwargs, nl=0)))
>>>     resolved = resolve_scale_request(**kwargs)
>>>     print('resolved = {}'.format(ub.urepr(resolved, nl=0)))
>>>     print('---')
- geowatch.tasks.fusion.datamodules.data_utils.fliprot(img, rot_k=0, flip_axis=None, axes=(0, 1))
- Parameters:
img (ndarray) – H, W, C
rot_k (int) – number of ccw rotations
flip_axis (Tuple[int, …]) – either [], [0], [1], or [0, 1]. 0 is the y axis and 1 is the x axis.
axes (Tuple[int, int]) – the location of the y and x axes
Example
>>> import numpy as np
>>> from geowatch.tasks.fusion.datamodules.data_utils import fliprot, inv_fliprot
>>> img = np.arange(16).reshape(4, 4)
>>> unique_fliprots = [
>>>     {'rot_k': 0, 'flip_axis': None},
>>>     {'rot_k': 0, 'flip_axis': (0,)},
>>>     {'rot_k': 1, 'flip_axis': None},
>>>     {'rot_k': 1, 'flip_axis': (0,)},
>>>     {'rot_k': 2, 'flip_axis': None},
>>>     {'rot_k': 2, 'flip_axis': (0,)},
>>>     {'rot_k': 3, 'flip_axis': None},
>>>     {'rot_k': 3, 'flip_axis': (0,)},
>>> ]
>>> for params in unique_fliprots:
>>>     img_fw = fliprot(img, **params)
>>>     img_inv = inv_fliprot(img_fw, **params)
>>>     assert np.all(img == img_inv)
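To illustrate how a flip/rotation pair and its inverse fit together, here is a minimal NumPy sketch. It is not the geowatch implementation: the names sketch_fliprot and sketch_inv_fliprot are hypothetical, the rotate-then-flip order is an assumption, and so is the choice to interpret flip_axis as indexes into axes (0 for y, 1 for x).

```python
import numpy as np


def sketch_fliprot(img, rot_k=0, flip_axis=None, axes=(0, 1)):
    """Rotate by rot_k quarter turns (ccw), then flip (assumed order)."""
    if rot_k != 0:
        img = np.rot90(img, k=rot_k, axes=axes)
    if flip_axis is not None:
        # Assumption: flip_axis entries index into ``axes`` (0 -> y, 1 -> x).
        img = np.flip(img, axis=tuple(axes[a] for a in flip_axis))
    return img


def sketch_inv_fliprot(img, rot_k=0, flip_axis=None, axes=(0, 1)):
    """Undo sketch_fliprot by applying the inverse steps in reverse order."""
    if flip_axis is not None:
        # A flip is its own inverse.
        img = np.flip(img, axis=tuple(axes[a] for a in flip_axis))
    if rot_k != 0:
        # Rotating by -rot_k undoes a rotation by rot_k.
        img = np.rot90(img, k=-rot_k, axes=axes)
    return img


# Round trip on a non-square H, W, C array.
img = np.arange(3 * 5 * 2).reshape(3, 5, 2)
for rot_k in range(4):
    for flip_axis in [None, (0,)]:
        fw = sketch_fliprot(img, rot_k=rot_k, flip_axis=flip_axis)
        inv = sketch_inv_fliprot(fw, rot_k=rot_k, flip_axis=flip_axis)
        assert np.array_equal(img, inv)
```

Four rotations combined with an optional flip over one axis enumerate the eight distinct symmetries of a square, which is why the example above lists exactly eight parameter combinations.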
- geowatch.tasks.fusion.datamodules.data_utils.fliprot_annot(annot, rot_k, flip_axis=None, axes=(0, 1), canvas_dsize=None)
- geowatch.tasks.fusion.datamodules.data_utils.inv_fliprot_annot(annot, rot_k, flip_axis=None, axes=(0, 1), canvas_dsize=None)
- geowatch.tasks.fusion.datamodules.data_utils.inv_fliprot(img, rot_k=0, flip_axis=None, axes=(0, 1))
Undo a fliprot
- Parameters:
img (ndarray) – H, W, C
- geowatch.tasks.fusion.datamodules.data_utils.samecolor_nodata_mask(stream, hwc, relevant_bands, use_regions=0, samecolor_values=None)
Find a 2D mask that indicates what values should be set to nan. This is typically done by finding clusters of zeros in specific bands.
Example
>>> from geowatch.tasks.fusion.datamodules.data_utils import *  # NOQA
>>> import kwcoco
>>> import kwarray
>>> stream = kwcoco.FusedChannelSpec.coerce('foo|red|green|bar')
>>> stream_oset = ub.oset(stream)
>>> relevant_bands = ['red', 'green']
>>> relevant_band_idxs = [stream_oset.index(b) for b in relevant_bands]
>>> rng = kwarray.ensure_rng(0)
>>> hwc = (rng.rand(32, 32, stream.numel()) * 3).astype(int)
>>> use_regions = 0
>>> samecolor_values = {0}
>>> samecolor_mask = samecolor_nodata_mask(
>>>     stream, hwc, relevant_bands, use_regions=use_regions,
>>>     samecolor_values=samecolor_values)
>>> assert samecolor_mask.sum() == (hwc[..., relevant_band_idxs] == 0).any(axis=2).sum()
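The assertion in the example above suggests what the per-pixel case (use_regions=0) computes: a pixel is flagged when any relevant band holds one of the samecolor values. A minimal NumPy-only sketch of that idea follows. The function name sketch_samecolor_mask is hypothetical, it takes band indexes directly rather than a channel spec, and it does not attempt the region-based mode.

```python
import numpy as np


def sketch_samecolor_mask(hwc, band_idxs, samecolor_values=(0,)):
    """Return a 2D boolean mask of pixels to treat as nodata.

    A pixel is flagged if ANY of the selected bands equals one of
    ``samecolor_values``. Only the per-pixel case is sketched here.
    """
    relevant = hwc[..., band_idxs]                      # H, W, len(band_idxs)
    is_same = np.isin(relevant, list(samecolor_values))
    return is_same.any(axis=2)                          # H, W


rng = np.random.RandomState(0)
hwc = (rng.rand(32, 32, 4) * 3).astype(int)  # values in {0, 1, 2}
mask = sketch_samecolor_mask(hwc, band_idxs=[1, 2], samecolor_values={0})

# The mask can then be used to overwrite flagged pixels with nan.
data = hwc.astype(float)
data[mask] = np.nan
```

Because the mask is 2D, assigning through it sets every channel of a flagged pixel to nan, including bands that were not used to compute the mask.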
- class geowatch.tasks.fusion.datamodules.data_utils.MultiscaleMask
Bases: object
A helper class to build up a mask indicating which pixels are unobservable, based on data from different resolutions.
In other words, if you have multiple masks, and each mask has a different resolution, then this will iteratively upscale the masks to the largest resolution seen so far and perform a logical or. This helps keep the memory footprint small.
Todo
Does this live in kwimage?
CommandLine
xdoctest -m geowatch.tasks.fusion.datamodules.data_utils MultiscaleMask --show
Example
>>> from geowatch.tasks.fusion.datamodules.data_utils import *  # NOQA
>>> image = kwimage.grab_test_image()
>>> image = kwimage.ensure_float01(image)
>>> rng = kwarray.ensure_rng(1)
>>> mask1 = kwimage.Mask.random(shape=(12, 12), rng=rng).data
>>> mask2 = kwimage.Mask.random(shape=(32, 32), rng=rng).data
>>> mask3 = kwimage.Mask.random(shape=(16, 16), rng=rng).data
>>> omask = MultiscaleMask()
>>> omask.update(mask1)
>>> omask.update(mask2)
>>> omask.update(mask3)
>>> masked_image = omask.apply(image, np.nan)
>>> # Now we can use our upscaled masks on an image.
>>> masked_image = kwimage.fill_nans_with_checkers(masked_image, on_value=0.3)
>>> # xdoctest: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> inputs = kwimage.stack_images(
>>>     [kwimage.atleast_3channels(m * 255) for m in [mask1, mask2, mask3]],
>>>     pad=2, bg_value='kw_green', axis=1)
>>> kwplot.imshow(inputs, pnum=(1, 3, 1), title='input masks')
>>> kwplot.imshow(omask.mask, pnum=(1, 3, 2), title='final mask')
>>> kwplot.imshow(masked_image, pnum=(1, 3, 3), title='masked image')
>>> kwplot.show_if_requested()
- update(mask)
Expand the observable mask to the larger data and take the logical or of the resized masks.
- property masked_fraction
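The accumulate-and-upscale behavior described for MultiscaleMask can be sketched in plain NumPy. This is an illustration under stated assumptions, not the geowatch code: the class SketchMultiscaleMask and the helper nearest_resize are hypothetical, nearest-neighbor resampling is an assumed choice, and masked_fraction is assumed to mean the fraction of pixels currently masked.

```python
import numpy as np


def nearest_resize(mask, shape):
    """Nearest-neighbor resize of a 2D mask to ``shape`` (hypothetical helper)."""
    h, w = shape
    rows = np.arange(h) * mask.shape[0] // h
    cols = np.arange(w) * mask.shape[1] // w
    return mask[np.ix_(rows, cols)]


class SketchMultiscaleMask:
    """Accumulate masks of different resolutions with a logical or."""

    def __init__(self):
        self.mask = None

    def update(self, mask):
        mask = np.asarray(mask, dtype=bool)
        if self.mask is None:
            self.mask = mask
            return
        # Only ever grow to the largest resolution seen so far.
        shape = tuple(max(a, b) for a, b in zip(self.mask.shape, mask.shape))
        self.mask = nearest_resize(self.mask, shape) | nearest_resize(mask, shape)

    @property
    def masked_fraction(self):
        # Assumed meaning: fraction of pixels that are currently masked.
        return 0.0 if self.mask is None else float(self.mask.mean())


coarse = np.zeros((2, 2), dtype=bool)
coarse[0, 0] = True            # masks the top-left quadrant
fine = np.zeros((4, 4), dtype=bool)
fine[3, 3] = True              # masks a single fine pixel

omask = SketchMultiscaleMask()
omask.update(coarse)
omask.update(fine)
print(omask.mask.shape)        # (4, 4)
print(omask.masked_fraction)   # 0.3125
```

Keeping a single running mask, rather than storing every input and combining them at the end, is what keeps the memory footprint bounded by the largest resolution encountered.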