geowatch.utils.process_context module

Defines the ProcessContext object, which is what mlops expects jobs to be wrapped in.

Todo

  • [ ] Make “most” telemetry opt-in

class geowatch.utils.process_context.ProcessContext(name=None, type='process', args=None, config=None, extra=None, track_emissions=False, request_all_telemetry=True, request_most_telemetry=True)[source]

Bases: object

Context manager to track the context under which a result was computed.

This tracks things like the start and end time, the command line that can reproduce the process (assuming an appropriate environment), the configuration the process was run with, the details of the machine the process was run on, the power usage and carbon emissions of the process, and other information.

Parameters:
  • args (str | List[str]) – This should be the sys.argv or the command line string that can be used to rerun the process

  • config (Dict) – This should be a configuration dictionary (likely based on sys.argv)

  • name (str) – the name of this process

  • type (str) – The type of this process (usually keep the default of process)

  • request_all_telemetry (bool) – if False, all telemetry is disabled. This is forced to False if PROCESS_CONTEXT_DISABLE_ALL_TELEMETRY is in the environment.

  • request_most_telemetry (bool) – if False, most telemetry (environment details such as machine information) is disabled. This is forced to False if PROCESS_CONTEXT_DISABLE_MOST_TELEMETRY is in the environment.

Note

This module provides telemetry, which records user-identifiable information. While useful, it does raise ethical concerns about user privacy, and the people running this code have a right to know about it and opt out. In the future we will change our policy to opt-in, but for system stability, we are not changing defaults.

Note

There are two levels of telemetry.

Environment telemetry. These are things like the machine the code was run on. Use PROCESS_CONTEXT_DISABLE_MOST_TELEMETRY=1 to opt out.

Core telemetry. The start / stop / sys.argv / config objects are necessary for mlops to do anything, but they can leak information by containing system paths. Emissions tracking is also in this category. Use PROCESS_CONTEXT_DISABLE_ALL_TELEMETRY to opt out.
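The interaction between the constructor flags and the two opt-out variables can be sketched as follows. The helper name resolve_telemetry_flags is hypothetical and not part of geowatch. The sketch assumes that the mere presence of a variable disables that level, and that disabling all telemetry also disables the "most" level; the actual implementation may differ on both points.

```python
import os


def resolve_telemetry_flags(request_all=True, request_most=True, environ=None):
    """Hypothetical helper: apply environment opt-outs to the requested flags."""
    environ = os.environ if environ is None else environ
    # Assumption: presence of the variable (any value) is enough to opt out.
    if 'PROCESS_CONTEXT_DISABLE_ALL_TELEMETRY' in environ:
        request_all = False
    if 'PROCESS_CONTEXT_DISABLE_MOST_TELEMETRY' in environ:
        request_most = False
    # Assumption: with all telemetry off, the "most" level is off as well.
    if not request_all:
        request_most = False
    return request_all, request_most


# Explicit dictionaries stand in for os.environ to keep the demo deterministic.
flags_default = resolve_telemetry_flags(environ={})
flags_most_off = resolve_telemetry_flags(
    environ={'PROCESS_CONTEXT_DISABLE_MOST_TELEMETRY': '1'})
flags_all_off = resolve_telemetry_flags(
    environ={'PROCESS_CONTEXT_DISABLE_ALL_TELEMETRY': '1'})
```

Passing the environment as an argument is only a convenience for the demonstration; in a real process the variables would be exported in the shell before the job starts.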

CommandLine

xdoctest -m geowatch.utils.process_context ProcessContext

Example

>>> from geowatch.utils.process_context import *
>>> import torch
>>> import rich
>>> import ubelt as ub
>>> device = torch.device(0) if torch.cuda.is_available() else torch.device('cpu')
>>> # Adding things like disk info and tracking emission usage
>>> self = ProcessContext(track_emissions='offline')
>>> obj1 = self.start().stop()
>>> self.add_disk_info('.')
>>> self.add_device_info(device)
>>> #
>>> # Telemetry can be mostly disabled
>>> self = ProcessContext(track_emissions='offline', request_most_telemetry=False)
>>> obj2 = self.start().stop()
>>> self.add_disk_info('.')
>>> self.add_device_info(device)
>>> # Telemetry can be completely disabled
>>> self = ProcessContext(track_emissions='offline', request_all_telemetry=False)
>>> obj3 = self.start().stop()
>>> self.add_disk_info('.')
>>> self.add_device_info(device)
>>> rich.print('full_telemetry = {}'.format(ub.urepr(obj1, nl=3)))
>>> rich.print('some_telemetry = {}'.format(ub.urepr(obj2, nl=3)))
>>> rich.print('no_telemetry = {}'.format(ub.urepr(obj3, nl=3)))

Example

>>> from geowatch.utils.process_context import *
>>> # flush can measure intermediate progress
>>> self = ProcessContext(track_emissions='offline')
>>> self.add_disk_info('.')
>>> obj1 = self.start().flush()
>>> obj1_orig = obj1.copy()
>>> obj2 = self.stop()

write_invocation(invocation_fpath)[source]

Write a helper file that contains a locally reproducible invocation of this process.
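To illustrate what such a helper file might contain, here is a minimal sketch that writes a command line as a small shell script. The function write_invocation_sketch, the script layout, and the example arguments are all assumptions made for illustration; they are not taken from the geowatch implementation, which may write a different format.

```python
import os
import shlex
import sys
import tempfile


def write_invocation_sketch(invocation_fpath, argv=None):
    """Hypothetical stand-in: write argv as a shell script that can be rerun."""
    argv = list(sys.argv) if argv is None else list(argv)
    # shlex.join quotes each argument so the line is safe to paste in a shell.
    command = shlex.join(argv)
    with open(invocation_fpath, 'w') as file:
        file.write('#!/bin/bash\n')
        file.write(command + '\n')
    return command


with tempfile.TemporaryDirectory() as dpath:
    fpath = os.path.join(dpath, 'invocation.sh')
    # 'my_module' is a placeholder name, not a real geowatch entry point.
    command = write_invocation_sketch(
        fpath, argv=['python', '-m', 'my_module', '--name', 'two words'])
    with open(fpath) as file:
        written = file.read()
```

Quoting matters here because arguments that contain spaces or shell metacharacters would otherwise change meaning when the file is rerun.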

start()[source]
flush()[source]
stop()[source]
add_device_info(device)[source]

Add information about a torch device that was used in this process.

Does nothing if telemetry is disabled.

Parameters:

device (torch.device) – torch device to add info about

add_disk_info(path)[source]

Add information about a storage disk that was used in this process.

Does nothing if telemetry is disabled.

geowatch.utils.process_context.jsonify_config(config)[source]

Converts an object to a jsonifiable config as best as possible.
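A best-effort conversion of this kind could look roughly like the sketch below. The function jsonify_sketch and its fallback rules are assumptions made for illustration, not the behavior of jsonify_config itself: containers are converted recursively, paths become strings, and anything unrecognized is kept as its repr instead of raising an error.

```python
import json
import pathlib


def jsonify_sketch(obj):
    """Hypothetical best-effort conversion to JSON-compatible types."""
    if isinstance(obj, dict):
        # JSON object keys must be strings.
        return {str(k): jsonify_sketch(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [jsonify_sketch(v) for v in obj]
    if isinstance(obj, (set, frozenset)):
        # Sort by repr so that the output order is deterministic.
        return [jsonify_sketch(v) for v in sorted(obj, key=repr)]
    if isinstance(obj, pathlib.PurePath):
        return str(obj)
    if obj is None or isinstance(obj, (str, int, float, bool)):
        return obj
    # Fallback: keep a string representation rather than failing.
    return repr(obj)


config = {
    'src': pathlib.PurePosixPath('data/in.json'),
    'dims': (2, 3),
    'tags': {'b', 'a'},
}
text = json.dumps(jsonify_sketch(config), sort_keys=True)
```

The repr fallback trades fidelity for robustness: the resulting record can always be serialized, but values stored that way cannot be round-tripped back into their original objects.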

class geowatch.utils.process_context.Reconstruction[source]

Bases: object

geowatch.utils.process_context.main()[source]

Simple CLI to get hardware measurements that process context would provide.