geowatch.utils.util_gis module

Utilities for geopandas and other geographic information system tools

geowatch.utils.util_gis.plot_geo_background()[source]
geowatch.utils.util_gis.geopandas_pairwise_overlaps(gdf1, gdf2, locator='iloc', predicate='intersects')[source]

Find pairwise relationships between each geometries

Parameters:
  • gdf1 (GeoDataFrame) – query geo data

  • gdf2 (GeoDataFrame) – database geo data (builds spatial index)

  • locator (str) – either iloc for positional indexes or loc for pandas indexes.

  • predicate (str, default=’intersects’) – a DE-9IM [1] predicate. (e.g. if intersection finds intersections between geometries)

References

Todo

  • [ ] This can move to geowatch.utils

Returns:

mapping from integer-iloc-indexes in gdf1 to overlapping integer-iloc-indexes in gdf2

Return type:

dict

Example

>>> from geowatch.utils.util_gis import *  # NOQA
>>> import geopandas as gpd
>>> import geodatasets
>>> geodatasets.get_path('GeoDa AirBnB')
>>> gdf1 = gpd.read_file(geodatasets.get_path('GeoDa milwaukee1'))
>>> gdf2 = gpd.read_file(geodatasets.get_path('GeoDa milwaukee2'))
>>> #gdf1 = gpd.read_file(
>>> #    gpd.datasets.get_path('naturalearth_lowres')
>>> #)
>>> #gdf2 = gpd.read_file(
>>> #    gpd.datasets.get_path('naturalearth_cities')
>>> #)
>>> #gdf1 = gdf1.set_index('name')
>>> #gdf2 = gdf2.set_index('name')
>>> idx_mapping = geopandas_pairwise_overlaps(gdf1, gdf2, locator='iloc')
>>> id_mapping = geopandas_pairwise_overlaps(gdf1, gdf2, locator='loc')

Benchmark

import timerit ti = timerit.Timerit(10, bestof=3, verbose=2) for timer in ti.reset(‘with sindex O(N * log(M))’):

with timer:

fast_mapping = geopandas_pairwise_overlaps(gdf1, gdf2)

for timer in ti.reset(‘without sindex O(N * M)’):
with timer:

from collections import defaultdict slow_mapping = defaultdict(list) for idx1, geom1 in enumerate(gdf1.geometry):

slow_mapping[idx1] = [] for idx2, geom2 in enumerate(gdf2.geometry):

if geom1.intersects(geom2):

slow_mapping[idx1].append(idx2)

# check they are the same assert set(slow_mapping) == set(fast_mapping) for idx1 in slow_mapping:

slow_idx2s = slow_mapping[idx1] fast_idx2s = fast_mapping[idx1] assert sorted(fast_idx2s) == sorted(slow_idx2s)

geowatch.utils.util_gis.latlon_text(lat, lon, precision=6)[source]

Make a lat,lon string suitable for a filename.

Pads with leading zeros so file names will align nicely at the same level of prcision.

Parameters:
  • lat (float) – degrees latitude

  • lon (float) – degrees longitude

  • precision (float, default=6) –

    Number of trailing decimal places. As rule of thumb set this to:

    6 - for ~10cm accuracy, 5 - for ~1m accuracy, 2 - for ~1km accuracy,

Todo

  • [ ] This can move to geowatch.utils

Notes

1 degree of latitude is very roughly the order of 100km, so the default precision of 6 localizes down to ~0.1 meters, which will usually be sufficient for satellite applications, but be mindful of using this text in applications that require more precision. Note 1 degree of longitude will vary, but will always be at least as precise as 1 degree of latitude.

Example

>>> lat = 90
>>> lon = 180
>>> print(latlon_text(lat, lon))
N90.000000E180.000000
>>> lat = 0
>>> lon = 0
>>> print(latlon_text(lat, lon))
N00.000000E000.000000

Example

>>> print(latlon_text(80.123, 170.123))
>>> print(latlon_text(10.123, 80.123))
>>> print(latlon_text(0.123, 0.123))
N80.123000E170.123000
N10.123000E080.123000
N00.123000E000.123000
>>> print(latlon_text(80.123, 170.123, precision=2))
>>> print(latlon_text(10.123, 80.123, precision=2))
>>> print(latlon_text(0.123, 0.123, precision=2))
N80.12E170.12
N10.12E080.12
N00.12E000.12
>>> print(latlon_text(80.123, 170.123, precision=5))
>>> print(latlon_text(10.123, 80.123, precision=5))
>>> print(latlon_text(0.123, 0.123, precision=5))
N80.12300E170.12300
N10.12300E080.12300
N00.12300E000.12300
geowatch.utils.util_gis.demo_regions_geojson_text()[source]
geowatch.utils.util_gis.load_geojson(file, default_axis_mapping='OAMS_TRADITIONAL_GIS_ORDER')[source]
Parameters:
  • file (str | file) – path or file object containing geojson data.

  • axis_mapping (str, default=’OAMS_TRADITIONAL_GIS_ORDER’) – The axis-ordering of the geojson file on disk. This is assumed to be traditional ordering by default according to the geojson spec.

Returns:

a dataframe with geo info.

Note: geopandas always stores data in traditional XY, although its CRS does seem to hold axis order info?

# OLD: This will ALWAYS return # with an OAMS_AUTHORITY_COMPLIANT wgs84 crs (i.e. lat,lon) even # though the on disk order is should be OAMS_TRADITIONAL_GIS_ORDER.

Return type:

GeoDataFrame

References

https://geopandas.org/docs/user_guide/projections.html#the-axis-order-of-a-crs

Example

>>> import io
>>> from geowatch.utils.util_gis import *  # NOQA
>>> geojson_text = demo_regions_geojson_text()
>>> file = io.StringIO()
>>> file.write(geojson_text)
>>> file.seek(0)
>>> region_df = load_geojson(file)
>>> # xdoctest: +REQUIRES(--show)
>>> import kwplot
>>> import geopandas as gpd
>>> kwplot.autompl()
>>> wld_map_gdf = gpd.read_file(
>>>     gpd.datasets.get_path('naturalearth_lowres')
>>> ).to_crs('OGC:CRS84')
>>> ax = wld_map_gdf.plot()
>>> region_df.plot(ax=ax, edgecolor='orange', alpha=0.8)
>>> # https://gis.stackexchange.com/questions/372564/userwarning-when-trying-to-get-centroid-from-a-polygon-geopandas
>>> centroid = region_df.to_crs('+proj=cea').centroid.to_crs(region_df.crs)
>>> centroid.plot(ax=ax, edgecolor='orange', alpha=0.8)

print(‘region_df.crs = {!r}’.format(region_df.crs)) print(‘wld_map_gdf.crs = {!r}’.format(wld_map_gdf.crs))

geowatch.utils.util_gis.read_geojson(file, default_axis_mapping='OAMS_TRADITIONAL_GIS_ORDER')
Parameters:
  • file (str | file) – path or file object containing geojson data.

  • axis_mapping (str, default=’OAMS_TRADITIONAL_GIS_ORDER’) – The axis-ordering of the geojson file on disk. This is assumed to be traditional ordering by default according to the geojson spec.

Returns:

a dataframe with geo info.

Note: geopandas always stores data in traditional XY, although its CRS does seem to hold axis order info?

# OLD: This will ALWAYS return # with an OAMS_AUTHORITY_COMPLIANT wgs84 crs (i.e. lat,lon) even # though the on disk order is should be OAMS_TRADITIONAL_GIS_ORDER.

Return type:

GeoDataFrame

References

https://geopandas.org/docs/user_guide/projections.html#the-axis-order-of-a-crs

Example

>>> import io
>>> from geowatch.utils.util_gis import *  # NOQA
>>> geojson_text = demo_regions_geojson_text()
>>> file = io.StringIO()
>>> file.write(geojson_text)
>>> file.seek(0)
>>> region_df = load_geojson(file)
>>> # xdoctest: +REQUIRES(--show)
>>> import kwplot
>>> import geopandas as gpd
>>> kwplot.autompl()
>>> wld_map_gdf = gpd.read_file(
>>>     gpd.datasets.get_path('naturalearth_lowres')
>>> ).to_crs('OGC:CRS84')
>>> ax = wld_map_gdf.plot()
>>> region_df.plot(ax=ax, edgecolor='orange', alpha=0.8)
>>> # https://gis.stackexchange.com/questions/372564/userwarning-when-trying-to-get-centroid-from-a-polygon-geopandas
>>> centroid = region_df.to_crs('+proj=cea').centroid.to_crs(region_df.crs)
>>> centroid.plot(ax=ax, edgecolor='orange', alpha=0.8)

print(‘region_df.crs = {!r}’.format(region_df.crs)) print(‘wld_map_gdf.crs = {!r}’.format(wld_map_gdf.crs))

geowatch.utils.util_gis.get_crs84()[source]

Constructing the CRS84 is slow. This function memoizes it so it only happens once.

Returns:

pyproj.crs.crs.CRS

Example

>>> get_crs84()
geowatch.utils.util_gis.shapely_flip_xy(geom)[source]
geowatch.utils.util_gis.project_gdf_to_local_utm(gdf_crs84, max_utm_zones=1, mode=0, tolerance=None)[source]

Find the local UTM zone for a geo data frame and project to it.

All geometry in the GDF must be in the same UTM zone.

Parameters:
  • gdf_crs84 (geopandas.GeoDataFrame) – The data with CRS-84 geometry to project into a local UTM

  • max_utm_zones (int) – If the data spans more than this many UTM zones, error. Otherwise, we take the first one.

  • mode (int) – which version of the logic to use. Default is 0, which is the original. Also have version 1 which is experimental and maybe faster.

  • tolerance (int) – if the projected region is further than this many meters from the center of the estimated UTM zone, raise a pyproj.exceptions.CRSError error. Only used in mode 1.

Returns:

The a copy of the input dataframe with projected geometry.

Return type:

geopandas.GeoDataFrame

Notes

Assumes input geometry is in CRS-84.

If an empty dataframe is given as input, it is returned as-is. No copy is made.

Example

>>> import geopandas as gpd
>>> from geowatch.utils.util_gis import *  # NOQA
>>> import kwarray
>>> import kwimage
>>> rng = kwarray.ensure_rng(0)
>>> # Gen lat/lons between 0 and 1, which is in UTM zone 31N
>>> gdf_crs84 = gpd.GeoDataFrame({'geometry': [
>>>     kwimage.Polygon.random(rng=rng).to_shapely(),
>>>     kwimage.Polygon.random(rng=rng).to_shapely(),
>>>     kwimage.Polygon.random(rng=rng).to_shapely(),
>>> ]}, crs='OGC:CRS84')
>>> gdf_utm = project_gdf_to_local_utm(gdf_crs84)
>>> assert gdf_utm.crs.name == 'WGS 84 / UTM zone 31N'

Example

>>> import geopandas as gpd
>>> from geowatch.utils.util_gis import *  # NOQA
>>> import kwarray
>>> import kwimage
>>> # If the data is too big for a single UTM zone,
>>> rng = kwarray.ensure_rng(0)
>>> gdf_crs84 = gpd.GeoDataFrame({'geometry': [
>>>     kwimage.Polygon.random(rng=rng).scale(90).to_shapely(),
>>>     kwimage.Polygon.random(rng=rng).scale(90).to_shapely(),
>>>     kwimage.Polygon.random(rng=rng).scale(90).to_shapely(),
>>> ]}, crs='OGC:CRS84')
>>> import pytest
>>> with pytest.raises(ValueError):
>>>     gdf_utm = project_gdf_to_local_utm(gdf_crs84)

Example

>>> from geowatch.utils.util_gis import *  # NOQA
>>> import geopandas as gpd
>>> import kwarray
>>> import kwimage
>>> # Data convers a lot of utm zones
>>> rng = kwarray.ensure_rng(0)
>>> gdf_crs84 = gpd.GeoDataFrame({'geometry': [
>>>     kwimage.Polygon.random(rng=rng).scale(2).translate((-1, -1)).scale(0.001, about='centroid').scale((180, 90)).to_shapely()
>>>     for _ in range(5)
>>> ]}, crs='OGC:CRS84')
>>> # Mode 1 uses a tolerance test instead
>>> gdf_utm = project_gdf_to_local_utm(gdf_crs84, mode=1, tolerance=float('inf'))
>>> import pytest
>>> import pyproj
>>> with pytest.raises(pyproj.exceptions.CRSError):
>>>     gdf_utm = project_gdf_to_local_utm(gdf_crs84, mode=1, tolerance=10000)
>>> # Mode 1 uses the bounds of the entire region

# TODO: Gracefully handle cases where the UTM zones are different # but all neighbors. Find a good example of this. # Example: # >>> import geopandas as gpd # >>> from geowatch.utils.util_gis import * # NOQA # >>> import kwarray # >>> # If the data is too big for a single UTM zone, # >>> rng = kwarray.ensure_rng(0) # >>> # Corner case in Madagascar where the extent isn’t too big, but it # >>> # spans multiple UTM zones # >>> poly = kwimage.Polygon.coerce({ # >>> “type”: “Polygon”, # >>> “coordinates”: [ # >>> [[-73.77200379967688, 42.864783745778894], # >>> [-73.77177715301514, 42.86412514733195], # >>> [-73.77110660076141, 42.8641654498268], # >>> [-73.77105563879013, 42.86423720786224], # >>> [-73.7710489332676, 42.864399400374786], # >>> [-73.77134531736374, 42.8649134986743], # >>> [ -73.77200379967688, 42.864783745778894]]] # >>> }) # >>> gdf_crs84 = gpd.GeoDataFrame({ # >>> ‘geometry’: [poly.to_shapely()]}, crs=’OGC:CRS84’)

geowatch.utils.util_gis.utm_epsg_from_latlon(lat, lon)[source]

Find a reasonable UTM CRS for a given lat / lon

The purpose of this function is to get a reasonable CRS for computing distances in meters. If the region of interest is very large, this may not be valid.

Parameters:
  • lat (float) – degrees in latitude

  • lon (float) – degrees in longitude

Returns:

the ESPG code of the UTM zone

Return type:

int

References

https://gis.stackexchange.com/questions/190198/how-to-get-appropriate-crs-for-a-position-specified-in-lat-lon-coordinates https://gis.stackexchange.com/questions/365584/convert-utm-zone-into-epsg-code

Example

>>> from geowatch.utils.util_gis import *  # NOQA
>>> epsg_code = utm_epsg_from_latlon(0, 0)
>>> print('epsg_code = {!r}'.format(epsg_code))
epsg_code = 32631
class geowatch.utils.util_gis.UTM_TransformContext(data_crs84)[source]

Bases: object

Helper to project into UTM space, perform some transform, and then project back to the original CRS.

Currently only supports CRS84

Example

>>> import kwimage
>>> from geowatch.utils.util_gis import *  # NOQA
>>> data_crs84 = kwimage.Polygon.random()
>>> with UTM_TransformContext(data_crs84) as self:
>>>     orig_utm_poly = kwimage.Polygon.coerce(self.geoms_utm.iloc[0])
>>>     new_utm_poly = orig_utm_poly.scale(2, about='center')
>>>     self.finalize(new_utm_poly)
>>> final_result = kwimage.Polygon.coerce(self.final_geoms_crs84.iloc[0])
>>> naive_result = data_crs84.scale(2, about='center')
>>> # Note the subtle difference in the naive vs context result
>>> print(f'final_result={final_result}')
>>> print(f'naive_result={naive_result}')
Parameters:

data_crs84 (Coercible[GeoSeries]) – something we know how to transform into a GeoSeries

finalize(final_utm)[source]
Parameters:

final_utm (Coercible[GeoSeries]) – something coercable to geometry in UTM coordinates

geowatch.utils.util_gis.find_local_meter_epsg_crs(geom_crs84)[source]

Find the “best” meter based CRS for a smallish geographic region.

Currently this only returns UTM zones. Might be better to return an Albers projection if the geometry spans more than one UTM zone.

Parameters:

geom_crs84 (shapely.geometry.base.BaseGeometry) – shapely geometry in CRS84 (lon/lat wgs84)

Returns:

epsg code

Return type:

int

References

[1] https://gis.stackexchange.com/questions/148181/choosing-projection-crs-for-short-distance-based-analysis/148187 [2] http://projfinder.com/

Example

>>> from geowatch.utils.util_gis import *  # NOQA
>>> import kwimage
>>> geom_crs84 = kwimage.Polygon.random().translate(-0.5).scale((180, 90)).to_shapely()
>>> epsg_zone = find_local_meter_epsg_crs(geom_crs84)
SeeAlso:

GeoDataFrame.estimate_utm_crs

Todo

  • [ ] Albers?

  • [ ] Better UTM zone intersection

  • [ ] Fix edge cases

geowatch.utils.util_gis.check_latlons(lat, lon)[source]

Quick check to see if latitudes and longitudes are valid.

Longitude (x) is always between -180 and 180 (degrees east) Latitude (y) is always between -90 and 90 (degrees north)

Example

>>> from geowatch.utils.util_gis import *  # NOQA
>>> import pytest
>>> assert not check_latlons(1000, 1000)
>>> assert check_latlons(0, 0)
geowatch.utils.util_gis.coerce_geojson_datas(arg, format='dataframe', allow_raw=False, workers=0, mode='thread', verbose=1, desc=None, parse_float=None)[source]

Attempts to resolve an argument into multiple geojson datas.

Multiple threads / processes are used to load the specified information and the function generates dictionaries of information containing the file path and the loaded data as they become available.

The argument can be:

  1. A path to a geojson file (or a list of them)

  2. A glob string specifying multiple geojson files (or a list of them)

  3. A path to a json manifest file.

  4. If allow_raw is True, then the input can be a raw json string, dict, or GeoDataFrame.

Parameters:
  • arg (str | PathLike | List[str | PathLike]) – an argument that is coerceable to one or more GeoDataFrames.

  • format (str) – Indicates the returned format of the data. Can be ‘dataframe’ where the ‘data’ key will be a GeoDataFrame, or ‘json’ where the raw json data will be returned.

  • allow_raw (bool) – if True, we will also check if the arguments are raw json / geopandas data that can be loaded. In general try not to enable this.

  • workers (int) – number of io workers

  • mode (str) – concurrent executor mode. Can be ‘serial’, ‘thread’, or ‘process’.

  • desc (str) – custom message for progress bar.

Yields:

List[Dict[str, Any | GeoDataFrame | Dict]]

A list of dictionaries formated with the keys:

  • fpath (str): the file path the data was loaded from (

    if applicable)

  • data (GeoDataFrame | dict):

    the data loaded in the requested format

SeeAlso:
  • load_site_or_region_dataframes - the function that does the loading

    after the arguments are coerced.

Example

>>> # xdoctest: +SKIP("failing on CI. unsure why")
>>> from geowatch.utils.util_gis import *  # NOQA
>>> from geowatch.demo.metrics_demo import generate_demodata
>>> info1 = generate_demodata.generate_demo_metrics_framework_data(roi='DR_R001')
>>> info2 = generate_demodata.generate_demo_metrics_framework_data(roi='DR_R002')
>>> info3 = generate_demodata.generate_demo_metrics_framework_data(roi='DR_R003')
>>> info4 = generate_demodata.generate_demo_metrics_framework_data(roi='DR_R012')
>>> info5 = generate_demodata.generate_demo_metrics_framework_data(roi='DR_R022')
>>> #
>>> region_fpaths = sorted(info1['true_region_dpath'].glob('*.geojson'))
>>> site_fpaths = sorted(info1['true_site_dpath'].glob('*.geojson'))
>>> #
>>> import json
>>> manifest_fpath1 =  info1['output_dpath'] / 'demo_manifest1.json'
>>> manifest_fpath2 =  info1['output_dpath'] / 'demo_manifest2.json'
>>> manifest_data1 = {
>>>     'files': [str(p) for p in region_fpaths[0:2]]
>>> }
>>> manifest_data2 = {
>>>     'files': [str(p) for p in region_fpaths[3:4]]
>>> }
>>> manifest_fpath1.write_text(json.dumps(manifest_data1))
>>> manifest_fpath2.write_text(json.dumps(manifest_data2))
>>> variants = []
>>> # ==========
>>> # Test Cases
>>> # ==========
>>> #
>>> # List of region files
>>> arg = region_fpaths
>>> result = list(coerce_geojson_datas(arg))
>>> assert len(result) == 5
>>> #
>>> # Glob for region files
>>> arg = str(info1['true_region_dpath']) + '/*R*2*.geojson'
>>> result = list(coerce_geojson_datas(arg))
>>> assert len(result) == 3
>>> #
>>> # Manifest file
>>> arg = manifest_fpath1
>>> result = list(coerce_geojson_datas(arg))
>>> assert len(result) == 2
>>> #
>>> # Manifest file glob
>>> arg = str(manifest_fpath1.parent / '*.json')
>>> result = list(coerce_geojson_datas(arg))
>>> assert len(result) == 3
>>> #
>>> # Manifest file glob and a region path
>>> arg = [manifest_fpath2, region_fpaths[0]]
>>> result = list(coerce_geojson_datas(arg))
>>> assert len(result) == 2
>>> #
>>> # Site glob and a manifest glob
>>> arg = [str(info1['true_site_dpath']) + '/DR_R002_*.geojson',
...        str(manifest_fpath1 + '*')]
>>> result = list(coerce_geojson_datas(arg))
>>> assert len(result) == 9
>>> #
>>> # Site directory and manifest file.
>>> arg = [str(info1['true_site_dpath']),
...        str(manifest_fpath1 + '*')]
>>> result = list(coerce_geojson_datas(arg))
>>> assert len(result) == 31
>>> # Test raw loading and format swapping
>>> from geowatch.utils import util_gis
>>> arg = util_gis.demo_regions_geojson_text()
>>> result1 = list(coerce_geojson_datas(arg, allow_raw=False))
>>> assert len(result1) == 0
>>> result2 = list(coerce_geojson_datas(arg, allow_raw=True))
>>> assert len(result2) == 1
>>> arg = result2
>>> result3 = list(coerce_geojson_datas(arg, format='dataframe', allow_raw=True))
>>> assert result3 == result2
>>> result4 = list(coerce_geojson_datas(arg, format='json', allow_raw=True))
>>> assert isinstance(result4[0]['data'], dict)
>>> result5 = list(coerce_geojson_datas(
>>>     result4, format='dataframe', allow_raw=True))
>>> assert isinstance(result4[0]['data'], dict)
>>> #
>>> # Test nothing case
>>> assert len(list(coerce_geojson_datas([], allow_raw=True))) == 0
geowatch.utils.util_gis.coerce_geojson_paths(data, return_manifests=False)[source]

Resolves the argument to a list of geojson paths.

Parameters:
  • data (Any) – argument to coerce. The argument can be a full path, a glob string, a path to a manifest file or any combination of the previous in a list. If it is a manifest file. It must be a json/yaml file with a top-level “files” key that points to a list of the actual file paths.

  • return_manifests (bool) – if True additionally returns paths to any intermediate manifest files.

Example

>>> from geowatch.utils.util_gis import *  # NOQA
>>> import json
>>> from geowatch.demo.metrics_demo import generate_demodata
>>> # Setup a bunch of geojson files
>>> outdir = ub.Path.appdir("geowatch/tests/gis/coerce_geojson_v1")
>>> # outdir.delete().ensuredir()
>>> info1 = generate_demodata.generate_demo_metrics_framework_data(roi='DR_R001', outdir=outdir)
>>> info2 = generate_demodata.generate_demo_metrics_framework_data(roi='DR_R002', outdir=outdir)
>>> info3 = generate_demodata.generate_demo_metrics_framework_data(roi='DR_R003', outdir=outdir)
>>> info4 = generate_demodata.generate_demo_metrics_framework_data(roi='DR_R012', outdir=outdir)
>>> info5 = generate_demodata.generate_demo_metrics_framework_data(roi='DR_R022', outdir=outdir)
>>> region_fpaths = sorted(info1['true_region_dpath'].glob('*.geojson'))
>>> site_fpaths = sorted(info1['true_site_dpath'].glob('*.geojson'))
>>> manifest_fpath1 =  info1['output_dpath'] / 'demo_manifest1.json'
>>> manifest_data1 = {'files': [str(p) for p in region_fpaths[0:2]]}
>>> geojson_dpath = info1['true_site_dpath']
>>> manifest_fpath1.write_text(json.dumps(manifest_data1))
>>> # Test manifest case
>>> geojson_fpaths = coerce_geojson_paths(manifest_fpath1)
>>> assert len(geojson_fpaths) == 2
>>> # Test directory case
>>> geojson_fpaths = coerce_geojson_paths(geojson_dpath)
>>> assert len(geojson_fpaths) == 15, 'maybe cache is bad?'
>>> # Test glob case
>>> geojson_fpaths = coerce_geojson_paths(geojson_dpath / '*R001_*')
>>> assert len(geojson_fpaths) == 3
>>> # Test list of files and globstr
>>> data = [geojson_dpath / '*R002_*'] + geojson_fpaths
>>> geojson_fpaths = coerce_geojson_paths(data)
>>> assert len(geojson_fpaths) == 6
>>> # Test manifest case2
>>> info = coerce_geojson_paths(manifest_fpath1, return_manifests=True)
>>> assert len(info['manifest_fpaths']) == 1
>>> assert len(info['geojson_fpaths']) == 2
geowatch.utils.util_gis.load_geojson_datas(geojson_fpaths, format='dataframe', workers=0, mode='thread', verbose=1, desc=None, yield_after_submit=False, parse_float=None)[source]

Generator that loads sites (and the path they loaded from) in parallel

Parameters:
  • geojson_fpaths (Iterable[PathLike]) – geojson paths to load

  • format (str) – Can be “dataframe” (for pandas) or “json” (for nested dict / list)

  • workers (int) – number of background loading workers

  • mode (str) – concurrent executor mode

  • desc (str) – overwrite message for the progress bar

  • yield_after_submit (bool) – backend argument that will yield None after the data is submitted to force the data loading to start processing in the background.

Todo

pass something in to allow for better progress management when a lot of functions are calling this.

Yields:

Dict – containing keys, ‘fpath’ and ‘gdf’.

SeeAlso:
  • coerce_geojson_datas - the coercable version of this function.

  • coerce_geojson_paths - only coerces paths

Example

>>> from geowatch.utils.util_gis import *  # NOQA
>>> import ubelt as ub
>>> dpath = ub.Path.appdir('geowatch', 'tests', 'util_gis', 'load_geojson_data')
>>> dpath.ensuredir()
>>> fpath = dpath / 'data.geojson'
>>> fpath.write_text(demo_regions_geojson_text())
>>> # Test both format loaders work correctly.
>>> gdf = list(load_geojson_datas([fpath], format='dataframe'))[0]['data']
>>> dct = list(load_geojson_datas([fpath], format='json'))[0]['data']
>>> import geopandas as gpd
>>> assert isinstance(gdf, gpd.GeoDataFrame)
>>> assert isinstance(dct, dict)
geowatch.utils.util_gis.crs_geojson_to_gdf(geometry, crs_info=None)[source]

TODO: fix the name and scope of this function.

The idea is to convert one of our CRS aware geojson geometries to a GeoDataFrame for manipulation.

Parameters:
  • geometry – This is a geojson geometry object with that has a crs_info object in its properties.

  • crs_info (Dict | None) – if the geojson does not have a crs_info in its properties you can specify it here.

Returns:

gpd.GeoDataFrame

Example

>>> # xdoctest: +REQUIRES(env:LARGE_DOWNLOAD)
>>> from geowatch.gis.geotiff import *  # NOQA
>>> from geowatch.utils.util_gis import *  # NOQA
>>> import kwimage
>>> from geowatch.demo.dummy_demodata import dummy_rpc_geotiff_fpath
>>> gpath_or_ref = gpath = dummy_rpc_geotiff_fpath()
>>> info = geotiff_crs_info(gpath)
>>> # Case 1: Traditional CRS84 Geojson
>>> geometry = info['geos_corners']
>>> gdf = crs_geojson_to_gdf(geometry)
>>> assert gdf.crs.axis_info[0].name == 'Geodetic longitude'
>>> assert gdf.crs.axis_info[1].name == 'Geodetic latitude'
>>> # Case 2: Authority Compliant WGS84 Geojson
>>> wgs84_poly = kwimage.Polygon.coerce(info['wgs84_corners'])
>>> wgs84_geometry = wgs84_poly.to_geojson()
>>> wgs84_geometry['properties'] = {'crs_info': info['wgs84_crs_info']}
>>> crs84_gdf = crs_geojson_to_gdf(wgs84_geometry)
>>> assert crs84_gdf.crs.axis_info[0].name == 'Geodetic longitude'
>>> assert crs84_gdf.crs.axis_info[1].name == 'Geodetic latitude'
>>> # Case 3: UTM
>>> utm_poly = kwimage.Polygon.coerce(info['utm_corners'])
>>> utm_geometry = utm_poly.to_geojson()
>>> utm_geometry['properties'] = {'crs_info': info['utm_crs_info']}
>>> utm_gdf = crs_geojson_to_gdf(utm_geometry)
>>> assert utm_gdf.crs.axis_info[0].name == 'Easting'
>>> assert utm_gdf.crs.axis_info[1].name == 'Northing'
geowatch.utils.util_gis.coerce_crs(crs_info)[source]

Get a pyproj CRS from something they should be inferrable from

Example

>>> from geowatch.gis.geotiff import *  # NOQA
>>> from geowatch.utils.util_gis import *  # NOQA
>>> coerce_crs('crs84')
>>> coerce_crs(['EPSG', '4326'])