@e-gineer
Contributor

🎯 Purpose

This PR introduces experimental tiered CI workflows to address current CI reliability issues. This is a SAFE EXPERIMENT that doesn't affect production CI.

⚠️ Problem Statement

Current CI has critical issues:

  • 50% failure rate (10 failures in last 20 runs)
  • 7-10 minute feedback time on every commit
  • Heavy infrastructure tests in unit test suite causing random failures
  • No test categorization - all tests treated equally
  • Lost productivity: ~2.5 developer-hours/day debugging failures

💡 Solution: Tiered Testing

Tier 1: Quick Checks (< 2 minutes)

Workflow: 10-test-quick-EXPERIMENTAL.yaml

  • Triggers: Every push to test/** branches
  • What: Fast unit tests only (go test -short)
  • Goal: Immediate feedback
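
As a rough illustration of the Tier 1 setup described above, a workflow along these lines would run only on `test/**` branches and execute the short test set. This is a minimal sketch, not the contents of the actual workflow file: the workflow name, the job name, the runner, the action versions, and the presence of a `go.mod` at the repository root are all assumptions.

```yaml
# Sketch only: names, runner, and action versions are assumptions,
# not taken from 10-test-quick-EXPERIMENTAL.yaml.
name: "10 - Test: Quick Checks (EXPERIMENTAL)"  # hypothetical name

on:
  push:
    branches:
      - 'test/**'   # '**' also matches slashes, e.g. test/ci/foo

jobs:
  quick:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version-file: go.mod   # assumes go.mod at the repo root
      # -short sets testing.Short(), so tests that check it can skip
      # their slow paths.
      - run: go test -short ./...
```

Note that `-short` only helps if the expensive tests actually call `testing.Short()` and skip themselves. Tests that never check it run in full either way.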

Tier 2: Standard Suite (< 10 minutes)

Workflow: 11-test-standard-EXPERIMENTAL.yaml

  • Triggers: Manual (workflow_dispatch)
  • What: All unit tests + 4 core acceptance tests
  • Goal: Comprehensive validation

Tier 3: Full Suite (< 20 minutes)

Workflow: 11-test-full-EXPERIMENTAL.yaml

  • Triggers: Manual (workflow_dispatch)
  • What: All 21 acceptance tests
  • Goal: Complete coverage

🔒 Safety

These workflows are EXPERIMENTAL and SAFE:

  • ✅ Only trigger on test/** branches
  • ✅ Don't interfere with production CI
  • ✅ Can be tested in isolation
  • ✅ Easy to roll back (just don't merge)

📊 Expected Outcomes

| Metric | Current | Target | Improvement |
|---|---|---|---|
| Success Rate | 50% | 95%+ | ~2x better |
| Time to Feedback | 7-10 min | < 2 min | 3.5-5x faster |
| Wasted CI Time | 100 min/day | < 10 min/day | 90% reduction |
| Dev Hours Lost | 2.5/day | 0.3/day | ~8x better |

🧪 Testing Plan

  1. This PR tests Quick Checks

    • Quick tests will run automatically
    • Should complete in < 2 minutes
    • We'll push several commits to validate
  2. Manual Standard Test

    • Trigger via Actions tab: "Run workflow"
    • Should complete in < 10 minutes
    • Core tests should catch 90%+ of issues
  3. Manual Full Test

    • Trigger via Actions tab: "Run workflow"
    • Should complete in < 20 minutes
    • All 21 acceptance tests
  4. Metrics Collection

    • Run 5-10 iterations
    • Track: duration, success rate, which tier caught bugs
    • Compare with production CI

📝 Files Changed

New Workflows

  • .github/workflows/10-test-quick-EXPERIMENTAL.yaml - Quick tests
  • .github/workflows/11-test-standard-EXPERIMENTAL.yaml - Standard suite
  • .github/workflows/11-test-full-EXPERIMENTAL.yaml - Full suite

Documentation

See .ai/wip/ci-optimization/ for:

  • Comprehensive analysis
  • Implementation plan
  • Success metrics

🎬 Next Steps

If this experiment succeeds:

  1. Gather feedback from team
  2. Create production branch (remove EXPERIMENTAL)
  3. Modify existing workflows with conditional triggers
  4. Add documentation
  5. Merge to develop

If it doesn't work:

  1. Don't merge this PR
  2. No impact on production
  3. Learn and iterate

🔍 How to Test

Test Quick Workflow (Automatic)

Just push commits to this PR - quick tests run automatically!

Test Standard Workflow (Manual)

  1. Go to Actions tab
  2. Select "11 - Test: Standard Suite (EXPERIMENTAL)"
  3. Click "Run workflow"
  4. Select branch: test/ci-tiered-testing
  5. Click green "Run workflow" button

Test Full Workflow (Manual)

  1. Go to Actions tab
  2. Select "11 - Test: Full Suite (EXPERIMENTAL)"
  3. Click "Run workflow"
  4. Select branch: test/ci-tiered-testing
  5. Click green "Run workflow" button

📚 Background

✅ Success Criteria

  • Quick tests consistently complete in < 2 minutes
  • Quick tests have 95%+ success rate
  • Standard tests catch 90%+ of bugs
  • Full tests maintain 100% coverage
  • At least 5 successful test runs
  • Team approves approach

🤝 Feedback Welcome!

This is an experiment to improve our CI reliability and developer experience. Please:

  • Try running the workflows
  • Report any issues or suggestions
  • Share your experience with feedback times

🤖 Generated with Claude Code

Co-Authored-By: Claude [email protected]

e-gineer and others added 2 commits November 12, 2025 19:23
Add three experimental workflows to test tiered CI approach:

1. Quick Tests (< 2 min)
   - Runs on every push to test/** branches
   - Uses -short flag for fast feedback
   - Workflow: 10-test-quick-EXPERIMENTAL.yaml

2. Standard Suite (< 10 min)
   - Manual trigger only
   - All unit tests + 4 core acceptance tests
   - Workflow: 11-test-standard-EXPERIMENTAL.yaml

3. Full Suite (< 20 min)
   - Manual trigger only
   - All 21 acceptance tests
   - Workflow: 11-test-full-EXPERIMENTAL.yaml

These experimental workflows:
- Only trigger on test/** branches
- Don't interfere with production CI
- Allow us to test the tiered approach safely

Goal: Improve CI success rate from 50% to 95%+ and reduce
feedback time from 7-10 min to < 2 min.

Related to issue: CI reliability improvements
See: .ai/wip/ci-optimization/ for detailed analysis

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Fixed triggers to implement proper tiered testing:

**Tier 1 - Quick Tests** (< 2 min):
- Trigger: Every PR push + push to develop/main
- Goal: Fast feedback on every commit

**Tier 2 - Standard Suite** (< 10 min):
- Trigger: PR ready for review + push to develop/main
- Goal: Comprehensive validation before merge

**Tier 3 - Full Suite** (< 20 min):
- Trigger: Push to develop (post-merge) + nightly + label "test:full"
- Goal: Complete coverage

This matches the original plan: test actual development code with
progressive levels of thoroughness as code moves toward release.

Previous version incorrectly limited triggers to test/** branches only.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

@e-gineer
Contributor Author

🔧 Fix Applied: Proper Tiered Triggers

Just pushed a fix - the original commit had overly conservative triggers (only test/** branches). Updated to implement the proper tiered strategy:

✅ Corrected Trigger Strategy

Tier 1: Quick Checks - 10-test-quick-EXPERIMENTAL.yaml

Triggers:

  • ✅ Every pull_request push (all PRs, not just test branches)
  • ✅ Push to develop and main branches
  • ✅ Manual via workflow_dispatch

What it does:

  • Runs go test -short (< 30s timeout)
  • Skips expensive tests
  • Fast feedback: < 2 minutes
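
The corrected Tier 1 triggers listed above could look roughly like the fragment below. It is a sketch under assumptions rather than the real file: the job name, the runner, the action versions, and the reading of "< 30s timeout" as `go test`'s `-timeout` flag are guesses, and the job-level `timeout-minutes` is one possible way to enforce the 2-minute budget that the comment does not mention.

```yaml
# Sketch only: job layout and timeouts are assumptions.
on:
  pull_request:            # default activity types: opened, synchronize, reopened
  push:
    branches: [develop, main]
  workflow_dispatch:       # manual "Run workflow" button

jobs:
  quick:
    runs-on: ubuntu-latest
    timeout-minutes: 2     # assumption: hard cap matching the < 2 min goal
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version-file: go.mod   # assumes go.mod at the repo root
      # -timeout makes the test binary panic if a run exceeds 30s.
      - run: go test -short -timeout 30s ./...
```

With no `types:` filter, `pull_request` fires for draft PRs too, which fits the "PR Draft → Quick tests" stage of the strategy.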

Tier 2: Standard Suite - 11-test-standard-EXPERIMENTAL.yaml

Triggers:

  • pull_request type ready_for_review (when you mark PR ready)
  • ✅ Push to develop and main branches
  • ✅ Manual via workflow_dispatch

What it does:

  • All unit tests (no -short flag)
  • 4 core acceptance tests (installation, plugin, connection_config, service)
  • Comprehensive: < 10 minutes
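
A possible shape for the Tier 2 workflow is sketched below. The workflow name and the four suite names come from this PR. The job names, the matrix layout, the action versions, and the placeholder `echo` step are hypothetical, since the real acceptance-test command is not shown here.

```yaml
# Sketch only: the acceptance step is a placeholder, not the real runner.
name: "11 - Test: Standard Suite (EXPERIMENTAL)"

on:
  pull_request:
    types: [ready_for_review]   # fires when a draft PR is marked ready
  push:
    branches: [develop, main]
  workflow_dispatch:

jobs:
  unit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version-file: go.mod   # assumes go.mod at the repo root
      - run: go test ./...          # no -short: full unit suite

  acceptance:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false              # let every suite report, even if one fails
      matrix:
        suite: [installation, plugin, connection_config, service]
    steps:
      - uses: actions/checkout@v4
      # Hypothetical placeholder: substitute the project's acceptance command.
      - run: echo "run acceptance suite ${{ matrix.suite }}"
```

One caveat with this trigger: `ready_for_review` fires only on the draft-to-ready transition. Later pushes to an already-ready PR, and PRs opened as ready from the start, would not rerun this tier unless other activity types such as `synchronize` or `opened` are added.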

Tier 3: Full Suite - 11-test-full-EXPERIMENTAL.yaml

Triggers:

  • ✅ Push to develop (after merge)
  • ✅ Nightly at 2 AM UTC (cron: '0 2 * * *')
  • pull_request with label test:full
  • ✅ Manual via workflow_dispatch

What it does:

  • All unit tests
  • All 21 acceptance tests
  • Complete coverage: < 20 minutes
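
The label-gated trigger is the least obvious of the Tier 3 triggers, so here is one way it might be expressed. This is a sketch under assumptions: the job name, the chosen `pull_request` activity types, and the placeholder step are hypothetical, and only the cron expression, the `test:full` label, and the branch name come from this comment.

```yaml
# Sketch only: job name, activity types, and the test step are assumptions.
on:
  push:
    branches: [develop]
  schedule:
    - cron: '0 2 * * *'         # GitHub evaluates cron schedules in UTC
  pull_request:
    types: [labeled, synchronize]
  workflow_dispatch:

jobs:
  full:
    # Run for every non-PR event; for PRs, only when the label is present.
    if: ${{ github.event_name != 'pull_request' || contains(github.event.pull_request.labels.*.name, 'test:full') }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Hypothetical placeholder for the unit + 21 acceptance tests.
      - run: echo "run full suite"
```

Listing `synchronize` alongside `labeled` means a labeled PR reruns the full suite on each new push. Unlabeled PRs still start the workflow, but the job is skipped by the `if` condition. If nightly runs are not wanted, the `schedule` block can be deleted without affecting the other triggers.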

🎯 The Progressive Strategy

PR Draft → Quick tests (2 min) → Fast feedback
    ↓
PR Ready → Standard tests (10 min) → Pre-merge validation  
    ↓
Merge → Full tests (20 min) → Post-merge verification
    ↓
Nightly → Full tests → Regression detection

This matches the original goal: test actual development code with increasing thoroughness as code progresses toward release.

The -EXPERIMENTAL suffix is just so these workflows can coexist with the current production workflow while we validate the approach. Once proven, we'll:

  1. Remove -EXPERIMENTAL suffix
  2. Modify the existing 11-test-acceptance.yaml to be less aggressive
  3. Enjoy faster, more reliable CI! 🎉

Nightly runs are redundant when running full tests on every merge to develop.

Benefits of removing nightly:
- Reduces CI resource usage
- Simplifies maintenance
- Full tests still run on:
  - Every merge to develop (frequent)
  - Manual trigger (when needed)
  - PR with "test:full" label (on-demand)

Nightly only makes sense if:
- Merges are infrequent (days/weeks apart)
- Tests have time-dependent edge cases
- External dependencies need regular validation

For active development with frequent merges, running full tests
on every develop merge provides sufficient coverage.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>