wato_world

Dockerized 3D auto-labeling pipeline for the WATonomous self-driving car (dubbed EVE). It is an offline batch system that turns rosbags (12 cameras + LiDAR + ego-pose) into 3D box tracks with class labels.

See docs/architecture.md (TODO) for the full pipeline description. This README covers infra only.

Layout

wato_world/
├── watod                        # entrypoint (mirrors wato_monorepo/watod)
├── watod-config.sh              # user-editable defaults
├── watod_scripts/               # helpers invoked by watod
├── src/                         # one Python package per pipeline component
│   ├── common/                  # shared lib: storage, schemas, geometry, calib
│   ├── ingest/
│   ├── perception_2d/
│   ├── lidar_preprocessing/
│   ├── proposal_generation/
│   ├── tracking/
│   ├── label_refinement/
│   ├── open_vocab_discovery/
│   └── student_training/
├── docker/                      # one Dockerfile per component + base + template
├── modules/                     # docker-compose.{yaml,infra,dev,gpu}.yaml
├── config/                      # prompts.yaml, pipeline.yaml, component_versions.yaml
├── data/                        # bind-mounted into containers (git-ignored)
└── notebooks/                   # ad-hoc analysis (rerun viewer scripts, etc.)

Quickstart

# 1. Edit defaults if needed.
$EDITOR watod-config.sh
# Optional: cp watod-config.local.sh.example watod-config.local.sh

# 2. Symlink watod into your PATH (one-time).
./watod install

# 3. Bring up a component.
watod -c ingest up

# 4. Run a component on a bag.
watod run ingest my_bag

# 5. Open a dev shell in a component container with source bind-mounted.
watod -c perception_2d:dev up
watod -t perception_2d_dev
> pytest /ws/src/perception_2d/tests

# 6. Tear everything down.
watod down all

Components

Component             Purpose                                                Image base  GPU
ingest                Decode rosbag → frames + lidar + poses + frame_index   CPU         no
perception_2d         GroundingDINO + SAM 2 + DEVA + DINOv2 + x-cam merge    CUDA        yes
lidar_preprocessing   Motion comp, static/dynamic split, ground mesh         CPU         no
proposal_generation   LiDAR detector + Segment-Lift-Fit + fusion             CUDA        yes
tracking              3D Kalman + masklet association + DINOv2 ReID          CUDA        yes (light)
label_refinement      Multimodal LabelFormer (bootstrap → learned)           CUDA        yes
open_vocab_discovery  Rare-class discovery branch                            CUDA        yes
student_training      BEVFusion / TransFusion student training               CUDA        yes

Each component's Python package lives at src/<component>/src/wato_<component>/ and is pip-installed in editable mode inside the container. Components communicate only through artifacts in data/artifacts/ (or s3://wato-world/... in prod); there are no in-process imports across component boundaries.
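
A minimal sketch of that handoff, assuming a hypothetical frame_index.parquet written by ingest under its versioned namespace (the real schemas and the wato_common.storage API live in src/common/ and are not shown here):

import pandas as pd

ARTIFACTS = "/data/artifacts"  # s3://wato-world/... in prod

def load_frame_index(bag_id: str) -> pd.DataFrame:
    # Read the upstream component's versioned artifact; no wato_ingest
    # code is imported, only its on-disk output is consumed.
    path = f"{ARTIFACTS}/ingest/v1/{bag_id}/frame_index.parquet"
    return pd.read_parquet(path)  # pandas resolves s3:// URIs via fsspec/s3fs

frames = load_frame_index("my_bag")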

Storage

  • Artifact store: data/artifacts/ bind-mounted at /data/artifacts. All paths flow through wato_common.storage, which uses fsspec so the same code works against s3://... URIs in prod (see the first sketch after this list).
  • Metadata index: the artifact files themselves. Components write Parquet indexes, JSON manifests, and quality reports under data/artifacts/; no database service is required for the current pipeline.
  • Versioning: each component's output is namespaced by version (perception_2d/v1/...). Bump the version in config/component_versions.yaml whenever the model checkpoint or output schema changes (see the second sketch after this list).
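
First, a sketch of the fsspec pattern described under "Artifact store"; write_manifest is illustrative, not the actual wato_common.storage API:

import fsspec

def write_manifest(uri: str, payload: str) -> None:
    # fsspec.open dispatches on the URI scheme, so a local path and an
    # s3:// URI go through the same code path (s3 needs the s3fs extra).
    with fsspec.open(uri, "w") as f:
        f.write(payload)

write_manifest("data/artifacts/ingest/v1/my_bag/manifest.json", "{}")
# In prod, only the URI changes:
# write_manifest("s3://wato-world/ingest/v1/my_bag/manifest.json", "{}")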
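
Second, a sketch of resolving a versioned output prefix from config/component_versions.yaml; the flat component-to-version key layout is an assumption:

import yaml

with open("config/component_versions.yaml") as f:
    versions = yaml.safe_load(f)  # assumed shape: {"perception_2d": "v1", ...}

prefix = f"data/artifacts/perception_2d/{versions['perception_2d']}/"
print(prefix)  # data/artifacts/perception_2d/v1/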

Configuration

  • watod-config.sh — committed defaults (active components, GPU flag, registry).
  • watod-config.local.sh — optional, git-ignored, sourced after the main config. Use it to override per-host values.
  • modules/.env — auto-generated by watod_scripts/watod-setup-env.sh on every watod invocation. Never edit by hand.

Development

# Lint/format locally.
pip install pre-commit && pre-commit install
pre-commit run --all-files

# Run a component's tests inside its dev container.
watod test ingest

Build order (recommended)

  1. Skeleton + infra (this repo as-is): watod -c all build succeeds.
  2. Ingest end-to-end on one bag.
  3. Host-side rerun viewer (notebooks/).
  4. LiDAR preprocessing (CPU).
  5. 2D perception (heavy GPU pass).
  6. Proposal generation: LiDAR-only detector first; add the Segment-Lift-Fit (SLF) lift second.
  7. Tracking.
  8. Bootstrap label refinement (geometric only) → first auto-labels.
  9. Learned label refinement, open-vocabulary discovery, student training.
