Skip to content

Conversation

@withinfocus
Copy link
Contributor

@withinfocus withinfocus commented Nov 26, 2025

🎟️ Tracking

https://bitwarden.slack.com/archives/C065AUDER62/p1764108324477319

📔 Objective

Adds automated validation to ensure database migrations are added in chronological order.

Changes

  • Added a new check-migration-order job to the test-database.yml workflow
  • For pull requests: Compares migration files against the latest main branch to ensure new migrations come after all existing ones
  • For pushes: Compares against the previous commit to validate chronological ordering
  • Blocks merge if migrations are out of order with clear error messages and remediation steps

Why

Database migrations must execute in chronological order. This check prevents merge conflicts and ensures migrations run in the correct sequence by validating that new migration filenames (format: YYYY-MM-DD_NN_Description.sql) sort alphabetically after existing ones.

⏰ Reminders before review

  • Contributor guidelines followed
  • All formatters and local linters executed and passed
  • Written new unit and / or integration tests where applicable
  • Protected functional changes with optionality (feature flags)
  • Used internationalization (i18n) for all UI strings
  • CI builds passed
  • Communicated to DevOps any deployment requirements
  • Updated any necessary documentation (Confluence, contributing docs) or informed the documentation team

🦮 Reviewer guidelines

  • 👍 (:+1:) or similar for great changes
  • 📝 (:memo:) or ℹ️ (:information_source:) for notes or general info
  • ❓ (:question:) for questions
  • 🤔 (:thinking:) or 💭 (:thought_balloon:) for more open inquiry that's not quite a confirmed issue and could potentially benefit from discussion
  • 🎨 (:art:) for suggestions / improvements
  • ❌ (:x:) or ⚠️ (:warning:) for more significant problems or concerns needing attention
  • 🌱 (:seedling:) or ♻️ (:recycle:) for future improvements or indications of technical debt
  • ⛏ (:pick:) for minor or nitpick changes

@withinfocus withinfocus marked this pull request as ready for review November 26, 2025 19:57
@withinfocus withinfocus requested a review from a team as a code owner November 26, 2025 19:57
@withinfocus withinfocus requested review from a team and dani-garcia November 26, 2025 19:57
@claude
Copy link
Contributor

claude bot commented Nov 26, 2025

Claude finished @withinfocus's task —— View job


Review Update - Recent Changes

  • Analyze previous review findings
  • Check commit history since last review
  • Read workflow implementation
  • Identify new issues and track previous findings
  • Post comprehensive review

Summary of Changes Since Last Review

The author has made 4 commits addressing previous review feedback:

  1. 72aef86 - Made regex permissive to allow past incorrectly-named migrations (addresses Finding 1)
  2. 1ab6ad3 - Removed temporary file cleanup (addresses Finding 4)
  3. 98c2894 - Added clarifying comment about script location
  4. 2f1fe50 - Made string checks more robust

Status of Previous Findings

Finding 1 (Format enforcement) - ❌ PARTIALLY FIXED, NEW ISSUE INTRODUCED

The author made the regex permissive ([0-9]{1,2} instead of [0-9]{2}), which allows existing migrations like 2025-10-3_00_*.sql to pass. However, this creates a new problem: the script enforces strict format only for NEW migrations but allows looser format for existing ones.

Critical flaw: The migration 2025-10-3_00_AddOrganizationIntegration...sql (line 284 in my check) will sort AFTER 2025-10-30_00_OrganizationReport_UpdateMetrics.sql in ASCII/C locale because string comparison is lexicographic: "2025-10-3" > "2025-10-30" (character '3' > character '3' at position 8, comparing "3_" vs "30").

This means:

  • The chronological validation logic is fundamentally broken for the existing incorrectly-named migration
  • New migrations dated after October 30 but before November will fail validation incorrectly
  • The script's assumption that filename sorting equals chronological sorting is violated

Recommendation: Either:

  1. Rename 2025-10-3_00_*.sql to 2025-10-03_00_*.sql (preferred - fixes root cause)
  2. Use date parsing instead of string comparison (complex, error-prone)

Finding 2 (git fetch error handling) - ⚠️ UNFIXED

Line .github/workflows/test-database.yml:284 still lacks error handling:

git fetch origin main:main

If this fails silently, validation proceeds with stale/missing data, producing incorrect results.


Finding 3 (Temporary file cleanup) - ✅ FIXED

The author intentionally removed cleanup per commit 1ab6ad3, noting the files are in /tmp/ and automatically cleaned up by the runner.


Finding 4 (Multiple new migrations validation) - 💭 BEHAVIOR CONFIRMED

The script validates each new migration against LAST_MAIN_MIGRATION but not against each other. This is acceptable behavior - if a PR adds migrations out of order relative to each other, they'll fail the chronological check anyway.


Finding 5 (Enhanced error messages) - 🎨 UNFIXED

Error messages at .github/workflows/test-database.yml:348-354 could still be enhanced with git command examples for developers.


New Critical Finding

Finding 6 - ❌ Broken chronological ordering due to existing non-compliant migration

The repository contains 2025-10-3_00_AddOrganizationIntegration_ReadByTeamsConfigurationTenantIdTeamIdStoredProcedure.sql which lacks the leading zero in the day field.

Problem: Lexicographic string comparison (used by the script) produces different ordering than chronological date ordering:

# String comparison (what the script does):
"2025-10-3" > "2025-10-30"  # TRUE - compares character by character
                             # At position 8: '3' (from "10-3") > '3' (from "10-30") is false
                             # At position 9: '_' (from "3_") > '0' (from "30") is TRUE

# Chronological comparison (what we want):
2025-10-03 < 2025-10-30     # TRUE

Impact:

  • Any new migration with dates between 2025-10-30 and 2025-11-03 will incorrectly fail validation
  • The sorting order of migrations doesn't reflect chronological order
  • Future developers will be confused by cryptic validation failures

Evidence from codebase:

2025-10-28_00_AddOrganizationReportMetricColumns.sql
2025-10-30_00_OrganizationReport_UpdateMetrics.sql
2025-10-3_00_AddOrganizationIntegration_...sql      ← Sorts AFTER 10-30!
2025-11-04_00_BlockClaimedDomainAccountCreationPolicy.sql

Recommended fix: Rename the problematic file before merging this PR:

git mv util/Migrator/DbScripts/2025-10-3_00_AddOrganizationIntegration_ReadByTeamsConfigurationTenantIdTeamIdStoredProcedure.sql \
     util/Migrator/DbScripts/2025-10-03_00_AddOrganizationIntegration_ReadByTeamsConfigurationTenantIdTeamIdStoredProcedure.sql

Good Practices Observed

  • Used LC_ALL=C for consistent locale-independent sorting
  • Comprehensive error messages explaining the required format
  • Handled both PR and push event types with appropriate logic
  • Added helpful examples in error messages

Action Items for Author

  1. CRITICAL: Rename 2025-10-3_00_AddOrganizationIntegration_ReadByTeamsConfigurationTenantIdTeamIdStoredProcedure.sql to include leading zero (2025-10-03_00_...)
  2. HIGH: Add error handling for git fetch command (.github/workflows/test-database.yml:284)
  3. MEDIUM: Consider enhancing error messages with specific git commands to help developers fix issues

@codecov
Copy link

codecov bot commented Nov 26, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 57.12%. Comparing base (d3e4ac7) to head (72aef86).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6652      +/-   ##
==========================================
+ Coverage   53.27%   57.12%   +3.84%     
==========================================
  Files        1906     1906              
  Lines       84973    84973              
  Branches     7639     7639              
==========================================
+ Hits        45268    48538    +3270     
+ Misses      37952    34607    -3345     
- Partials     1753     1828      +75     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions
Copy link
Contributor

github-actions bot commented Nov 26, 2025

Logo
Checkmarx One – Scan Summary & Details61cda105-d16d-4e41-af23-4b3ec347c3d0

Great job! No new security vulnerabilities introduced in this pull request

export LC_ALL=C
# Get list of migrations from main branch
git fetch origin main:main
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finding 1 continued: This fetch command should have error handling. If it fails, the script will continue with a stale or non-existent local main branch.

Suggestion:

if ! git fetch origin main:main; then
  echo "ERROR: Failed to fetch main branch"
  exit 1
fi

MIGRATION_NAME=$(basename "$migration")
# Validate migration filename format (YYYY-MM-DD_NN_Description.sql with leading zeros)
if ! [[ "$MIGRATION_NAME" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}_[0-9]{2}_.+\.sql$ ]]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finding 1 - Format enforcement issue: This regex requires leading zeros ([0-9]{2} for month and day), but the repository already contains at least one migration that violates this:

2025-10-3_00_AddOrganizationIntegration_ReadByTeamsConfigurationTenantIdTeamIdStoredProcedure.sql

This will cause validation failures on PRs that rebase against main if the problematic migration gets included.

Options:

  1. Rename existing non-compliant migrations before merging this PR (preferred for consistency)
  2. Make regex more permissive: ^[0-9]{4}-[0-9]{1,2}-[0-9]{1,2}_[0-9]{2}_.+\.sql$

fi
# Compare migration name with last main migration
if [[ "$MIGRATION_NAME" < "$LAST_MAIN_MIGRATION" ]]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 Finding 5 - Multiple new migrations: When a PR adds multiple migrations (e.g., 2025-11-27_00_A.sql and 2025-11-26_00_B.sql), this only validates each against LAST_MAIN_MIGRATION, not against each other.

The out-of-order new migrations would be caught (B would fail since it's < A alphabetically), but the error message might be confusing since it references the main branch rather than the other new migration.

Is this the intended behavior, or should we also validate ordering within the set of new migrations?

# Save to temporary files for comparison
echo "$MAIN_MIGRATIONS" > /tmp/main_migrations.txt
echo "$CURRENT_MIGRATIONS" > /tmp/current_migrations.txt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Finding 4 - Cleanup: Consider adding cleanup for temporary files at the end of the script:

rm -f /tmp/main_migrations.txt /tmp/current_migrations.txt

echo "New migrations must have dates/names that come after existing migrations."
echo "Expected format: YYYY-MM-DD_NN_Description.sql (with leading zeros)"
echo ""
echo "To fix this issue:"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎨 Finding 6 - Enhanced error message: Consider adding the specific git command to help developers fix the issue:

echo "To fix this issue:"
echo "  1. Rename using git:"
echo "     git mv util/Migrator/DbScripts/$MIGRATION_NAME util/Migrator/DbScripts/YYYY-MM-DD_NN_NewName.sql"
echo "  2. Choose a date after $LAST_MAIN_MIGRATION"
echo "  3. Commit the change: git commit -m 'Rename migration to fix ordering'"

export LC_ALL=C
# Get list of migrations from main branch
git fetch origin main:main
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Missing error handling: If git fetch fails, the script continues with potentially stale or missing data, leading to incorrect validation results.

Suggested change
git fetch origin main:main
if ! git fetch origin main:main; then
echo "ERROR: Failed to fetch main branch"
exit 1
fi

# Validate NEW migration filename format - enforce strict format for new migrations
# Required format: YYYY-MM-DD_NN_Description.sql (with leading zeros in date and sequence)
if ! [[ "$MIGRATION_NAME" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}_[0-9]{2}_.+\.sql$ ]]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical: Regex allows non-compliant migration that breaks sorting

The repository contains 2025-10-3_00_AddOrganizationIntegration_ReadByTeamsConfigurationTenantIdTeamIdStoredProcedure.sql (note: 10-3 not 10-03).

Problem: Lexicographic string comparison produces incorrect ordering:

  • "2025-10-3" sorts AFTER "2025-10-30" in ASCII comparison
  • At character position 9: '3' (from "10-3_") vs '0' (from "10-30")
  • Character '3' has higher ASCII value than '0', so "10-3" > "10-30"

Impact:

  • New migrations dated 2025-10-30 through 2025-11-03 will incorrectly fail validation
  • Chronological order is violated in the migration list
  • Sorting doesn't reflect actual date order

Fix required before merging: Rename the non-compliant migration:

git mv util/Migrator/DbScripts/2025-10-3_00_AddOrganizationIntegration_ReadByTeamsConfigurationTenantIdTeamIdStoredProcedure.sql \
     util/Migrator/DbScripts/2025-10-03_00_AddOrganizationIntegration_ReadByTeamsConfigurationTenantIdTeamIdStoredProcedure.sql

Then restore strict regex validation:

Suggested change
if ! [[ "$MIGRATION_NAME" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}_[0-9]{2}_.+\.sql$ ]]; then
if ! [[ "$MIGRATION_NAME" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}_[0-9]{2}_.+\.sql$ ]]; then

echo " 3. Include a 2-digit sequence number (e.g., _00, _01)"
echo " 4. Have a filename that sorts after the last migration in main"
echo ""
echo "To fix this issue:"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎨 Enhancement: Add specific git commands to help developers fix issues:

Suggested change
echo "To fix this issue:"
echo "To fix this issue:"
echo " 1. Rename using git:"
echo " git mv util/Migrator/DbScripts/$MIGRATION_NAME util/Migrator/DbScripts/YYYY-MM-DD_NN_NewName.sql"
echo " 2. Choose a date after $LAST_MAIN_MIGRATION"
echo " 3. Commit: git commit -m 'Rename migration to fix chronological ordering'"
echo ""

@withinfocus
Copy link
Contributor Author

Had some good back and forth from Claude that I wanted to commit explicitly, but I think this is at a good point. What it doesn't know is we will not ever introduce past migrations, so I don't think we should do anything further here.

@withinfocus withinfocus requested a review from eliykat November 26, 2025 21:08
Copy link
Member

@eliykat eliykat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for following this up! I think it'll be useful.

I'm not sure how to handle the incorrectly named migration, seems like a fair comment by Claude, but fixing the name will cause it to be re-run. I guess that's OK as long as we verify that it's properly idempotent (our scripts should be but this is not tested/enforced).

Comment on lines +318 to +319
echo "ERROR: Migration '$MIGRATION_NAME' does not match required format"
echo "Required format: YYYY-MM-DD_NN_Description.sql"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏 Also a nice improvement here.

echo "SUCCESS: All new migrations are correctly named and in chronological order"
shell: bash

- name: Validate new migrations for push
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume these 2 jobs are identical except for the source of the current migrations? I suggest that the comparison logic could be moved into a script for reuse, e.g.

// job-specific logic to get the migrations and write them to temp files

// then run the comparison/validation
sh verify-migrations.sh /tmp/prev_migrations.txt /tmp/current_migrations.txt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants