Prerequisites
Running Smartflow requires setting up aws and kubectl. We have streamlined the aws and kubectl install instructions below, but you may also find the official resources useful:

Install:

- AWS CLI tool: aws (https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html)
- Kubernetes command-line tool: kubectl (https://kubernetes.io/docs/tasks/tools/#kubectl)
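On Linux x86_64, both installs boil down to a few shell commands. This is a minimal sketch following the official guides linked above; adjust the OS and architecture to match your machine:

# Install the AWS CLI v2 (Linux x86_64)
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install

# Install kubectl (Linux x86_64)
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl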
AWS Configuration
Run aws configure. This command will ask you for some parameters:

- aws account id: 023300502152
- region: us-west-2
- user name, access key id, secret key: Coordinate with Yoni to get these if you don’t have them already
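The script in the next section assumes an AWS profile named "iarpa", so it is worth configuring that profile explicitly. A sketch of the prompts (the bracketed values are placeholders, not real credentials):

aws configure --profile iarpa
# AWS Access Key ID [None]: <your access key id>
# AWS Secret Access Key [None]: <your secret key>
# Default region name [None]: us-west-2
# Default output format [None]: json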
Configuring kubectl to reach Smartflow
(From David Joy)
Once kubectl and aws are installed, you’ll want to configure kubectl to reach the cluster where Smartflow is running. Here’s a little bash script that should do that for you:
# Smartflow environment name and the AWS profile configured above
ENVIRONMENT_NAME=kitware-prod-v2
export AWS_PROFILE="iarpa"
AWS_REGION=us-west-2

# Look up the account ID associated with the active profile
AWS_ACCOUNT_ID=$(aws sts get-caller-identity --profile "$AWS_PROFILE" --query "Account" --output text)
echo "Verify this is your correct kitware-smart AWS Account ID"
echo "AWS_ACCOUNT_ID = $AWS_ACCOUNT_ID"

# Add the Smartflow EKS cluster to your kubeconfig
aws eks update-kubeconfig --profile "$AWS_PROFILE" --region "$AWS_REGION" \
    --name "smartflow-${ENVIRONMENT_NAME}-eks" \
    --role-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:role/smartflow-${ENVIRONMENT_NAME}-${AWS_REGION}-eks-admin"
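To sanity-check that the new kubeconfig entry works, you can probe the cluster (this assumes the airflow namespace used in the commands below):

# Confirm the context was added and the cluster is reachable
kubectl config current-context
kubectl -n airflow get pods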
Connecting to Smartflow
You can forward the Smartflow GUI port to your local machine with the following command:
kubectl -n airflow port-forward service/airflow-webserver 8080:8080
And then reach the GUI at: http://localhost:8080
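The port-forward runs in the foreground and must stay open while you use the GUI. From a second terminal, a quick way to check that the forward is alive:

# Should return an HTTP response (typically a redirect to the Airflow login page)
curl -I http://localhost:8080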
Launch a KIT_TRAIN_WITH_YAML job
Find the KIT_TRAIN_WITH_YAML job in the main interface. To launch, click the green “Play” button, and choose “Trigger DAG w/ config”. Currently available options:

- smart_version_tag: Defaults to “main”. This is the git tag for the smart repo that you want to run your experiment against. You will want to change this if you have changes in a feature branch you want to try.
- dataset_name: Defaults to “Drop4-BAS”. This is the name of the dataset stored on EFS that your model will be trained against. The assumed full path to the training kwcoco file (at this time) will be /efs/work/greenwell/data/smart_data_dvc/$DATASET_NAME/data_train.kwcoco.json.
- experiment_name: Defaults to “Baseline”. Determines, among other things, where the default logs will write to.
- experiment_yaml_relpath: The relative path within the smart codebase to a yaml file describing your fusion experiment.
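If you prefer the command line, the same trigger can be issued with the Airflow CLI from inside one of the Airflow pods (see the exec instructions below). A sketch, assuming the airflow CLI is on the pod’s PATH; the experiment yaml path is a hypothetical placeholder:

# Trigger the DAG with an explicit config; the JSON keys mirror the options above
airflow dags trigger KIT_TRAIN_WITH_YAML \
    --conf '{"smart_version_tag": "main",
             "dataset_name": "Drop4-BAS",
             "experiment_name": "Baseline",
             "experiment_yaml_relpath": "path/to/your_experiment.yaml"}'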
Copying your own DAG to Smartflow’s S3 Bucket
# DAG_FNAME is the filename of your DAG definition file
aws s3 cp "$DAG_FNAME" "s3://smartflow-023300502152-us-west-2/smartflow/env/kitware-prod-v2/dags/$DAG_FNAME" --profile iarpa
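To confirm the upload landed where Airflow will pick it up, list the dags/ prefix:

# List deployed DAG files
aws s3 ls s3://smartflow-023300502152-us-west-2/smartflow/env/kitware-prod-v2/dags/ --profile iarpa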
Logging into a running DAG step / K8s pod
# List pods in the airflow namespace to find the one running your step
kubectl -n airflow get pods
# Open an interactive shell; POD_ADDR is the pod name from the listing above
kubectl -n airflow exec -it pods/$POD_ADDR -- bash
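If you only need to watch a step’s output rather than poke around inside the pod, streaming its logs is often enough:

# Stream logs from the running pod
kubectl -n airflow logs -f "$POD_ADDR"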