Troubleshooting

This page covers common failure modes across isolated environments, overlays, snapshots, diagnostics, and governed operations.

Use this page when an environment fails to provision, route, restore data, or pass governance checks.

First Triage Commands

microstax env get <env-id>
microstax env status <env-id>
microstax env logs <env-id>
microstax env traces <env-id>
microstax env diagnose <env-id>
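
When you expect to escalate, it can help to capture all five outputs in one pass. The following is a minimal sketch, not part of the CLI: collect_triage and the one-file-per-subcommand layout are hypothetical, and it uses only the subcommands listed above.

```shell
#!/usr/bin/env bash
# Run the five triage subcommands and save each output to its own file.
# collect_triage and the output directory layout are hypothetical helpers.
collect_triage() {
  local env_id="$1"
  local out_dir="${2:-triage-$env_id}"
  local sub
  mkdir -p "$out_dir" || return 1
  for sub in get status logs traces diagnose; do
    # Keep going if one subcommand fails; stderr is captured with stdout.
    microstax env "$sub" "$env_id" >"$out_dir/$sub.txt" 2>&1 \
      || echo "microstax env $sub exited non-zero" >>"$out_dir/$sub.txt"
  done
  # Print the directory so callers can archive or attach it.
  echo "$out_dir"
}
```

Usage: collect_triage <env-id> writes get.txt, status.txt, logs.txt, traces.txt, and diagnose.txt under triage-<env-id>.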

If you need raw cluster detail:

kubectl get pods -n <namespace>
kubectl describe pod <pod-name> -n <namespace>
kubectl get ingress,svc,networkpolicy -n <namespace>

Provisioning Problems

Environment stuck in provisioning

Likely causes:

  • image pull failure
  • invalid probe configuration
  • missing secret or volume
  • snapshot restore job still running or failing

Check:

  • pod events for ErrImagePull or CrashLoopBackOff
  • init or restore jobs
  • Blueprint paths and resource settings
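
One way to work through the first two checks is to filter the pod table for the failure statuses named above. The sketch below is a hypothetical helper (failing_pods is not part of kubectl or the microstax CLI). It assumes the default kubectl get pods column layout, where STATUS is the third column, and it also matches ImagePullBackOff, the back-off state Kubernetes reports after repeated ErrImagePull failures.

```shell
#!/usr/bin/env bash
# Print the name and status of pods whose STATUS column signals an
# image-pull or crash-loop problem. Reads `kubectl get pods` output on stdin.
# The unanchored match also catches init-container forms such as
# "Init:CrashLoopBackOff".
failing_pods() {
  awk 'NR > 1 && $3 ~ /(ErrImagePull|ImagePullBackOff|CrashLoopBackOff)/ { print $1, $3 }'
}

# Usage (requires cluster access):
#   kubectl get pods -n <namespace> | failing_pods
#   kubectl get events -n <namespace> --sort-by=.lastTimestamp
#   kubectl get jobs -n <namespace>
```

The last two commands list recent events in time order and show whether init or restore jobs have completed.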

Environment enters error

Run:

microstax env diagnose <env-id>
microstax env logs <env-id>

Common causes:

  • bad image tag
  • invalid resource spec
  • missing dependency service
  • malformed snapshot or mock configuration

Routing And Overlay Problems

Overlay traffic is not hitting the overlay

Check:

  • the overlay was created with routing.mode: overlay
  • the request includes the routing header, usually x-msx-env, with the correct value
  • overlayId matches the intended routing target
  • propagateHeader is enabled when downstream services must remain in overlay context
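
To test the header check by hand, send a request that carries the routing header and compare it with the same request sent without the header. This is a sketch under assumptions: overlay_request is a hypothetical helper, the header value is assumed to be the overlay ID, and the URL is a placeholder for a service in your environment.

```shell
#!/usr/bin/env bash
# Send a request carrying the overlay routing header.
# Assumption: the x-msx-env header value is the overlay ID.
overlay_request() {
  local overlay_id="$1" url="$2"
  # -i prints response headers with the body; -sS hides progress but keeps errors.
  curl -sS -i -H "x-msx-env: $overlay_id" "$url"
}

# Usage:
#   overlay_request <overlay-id> https://<host>/<path>
#   curl -sS -i https://<host>/<path>    # same request without the header
```

If both responses look identical, the header may not be reaching the router, or the overlay may not have been created with routing.mode: overlay.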

Overlay provisions too much or too little

Possible causes:

  • baseline mismatch
  • service names differ between baseline and overlay
  • provider mappings or inheritance assumptions are wrong

What to verify:

  • service names are stable across Blueprints
  • baselineId points to the intended parent environment
  • only changed services are present in the overlay Blueprint when using sparse workflows
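
A quick way to spot a service-name mismatch is to compare the two name lists directly. The sketch below assumes you have already written the service names from each Blueprint to plain text files, one name per line. It does not parse Blueprints, and overlay_only_services is a hypothetical helper.

```shell
#!/usr/bin/env bash
# List service names that appear in the overlay list but not in the baseline list.
# Both inputs are plain text files with one service name per line.
overlay_only_services() {
  local baseline_file="$1" overlay_file="$2"
  # -F fixed strings, -x whole-line match, -v invert:
  # keep overlay names that have no exact match in the baseline.
  # grep exits 1 when nothing is printed; treat that as success.
  grep -Fxv -f "$baseline_file" "$overlay_file" || [ "$?" -eq 1 ]
}

# Usage:
#   overlay_only_services baseline-services.txt overlay-services.txt
```

Any name this prints is either a new service or a renamed one. A renamed service is a likely reason for an overlay provisioning more than expected.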

Baseline promotion is blocked

This usually means governance policy rejected the action. Check:

microstax governance logs
microstax org compliance

Snapshot And Seed Problems

Snapshot restore fails

Check:

  • engine matches the actual datastore
  • sourceSecretRef points to a real secret and key
  • snapshot storage settings are correct
  • sanitization rules reference valid fields

Seed packages do not run as expected

Check:

  • the seed package exists in the registry
  • target service naming matches the environment
  • the environment is healthy before seed execution starts

If both a snapshot and seeds are configured, expect the snapshot restore to run first and the seeds to be applied additively afterward.

Mocking And Shadow Problems

Mock does not replace or mirror traffic correctly

Verify:

  • mock.enabled: true
  • the behavior mode is valid
  • deployment.mode is one of replace, sidecar, or mirror
  • the referenced OpenAPI or proto file exists
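
If you script these checks, the deployment.mode value is easy to validate before provisioning. The following is a small sketch, not a MicroStax feature: valid_deployment_mode is a hypothetical helper, and it expects you to pass in the value yourself rather than reading it from the Blueprint.

```shell
#!/usr/bin/env bash
# Return 0 when the value is one of the deployment.mode values listed above.
# The comparison is case-sensitive; the helper does not read the Blueprint.
valid_deployment_mode() {
  case "$1" in
    replace|sidecar|mirror)
      return 0
      ;;
    *)
      echo "invalid deployment.mode: '$1' (expected replace, sidecar, or mirror)" >&2
      return 1
      ;;
  esac
}

# Usage:
#   valid_deployment_mode "<mode-from-blueprint>" || exit 1
```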

Behavioral diffs are empty or confusing

Check:

  • traffic is actually reaching the mirrored path
  • mirror percentage is non-zero
  • the compared baseline and shadow environments are the intended pair

API And UI Access Problems

CLI cannot reach the API

Check:

  • the API address passed with --api URL or set in MICROSTAX_API
  • the API health endpoint responds
  • firewall, DNS, or local port assumptions

curl http://localhost:3001/health
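
The curl check above can be made to follow whatever address the CLI is configured with. This sketch assumes MICROSTAX_API holds the API base URL and falls back to the localhost address shown above. api_health_url and check_api_health are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Build the health URL from MICROSTAX_API, or from the localhost default.
# Assumption: MICROSTAX_API is the API base URL, with no path appended.
api_health_url() {
  local base="${MICROSTAX_API:-http://localhost:3001}"
  # Strip one trailing slash so the result never contains "//health".
  printf '%s/health\n' "${base%/}"
}

# Probe the health endpoint.
check_api_health() {
  # -f: non-zero exit on HTTP errors; --max-time: do not hang on dropped packets.
  curl -fsS --max-time 5 "$(api_health_url)"
}

# Usage:
#   check_api_health || echo "API not reachable at $(api_health_url)"
```

A timeout here points at firewall or DNS problems, while an immediate connection refusal usually means the API is not listening on that port.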

Dashboard shows missing logs or stale status

Check:

  • API health
  • WebSocket or polling path availability
  • environment namespace still exists

When To Escalate

Escalate to platform operators when:

  • the cluster is healthy but multiple environments fail the same way
  • governance blocks an action you believe should be allowed
  • snapshot storage or sanitization policies are failing centrally
  • routing problems affect multiple overlays or baselines