Skip to content

Conversation

@MikeMcQuaid
Copy link
Member

WIP but to give an idea so @Rylan12 can take a look

Copilot AI review requested due to automatic review settings November 16, 2025 15:13
Copilot finished reviewing on behalf of MikeMcQuaid November 16, 2025 15:16
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the cask API generation code by extracting it into a reusable generate_cask_api! method in the Homebrew::API module, and adds functionality to generate a new combined internal JSON file that includes both formulae and casks data for each platform tag.

Key changes:

  • Extracted cask API generation logic from generate-cask-api.rb into Homebrew::API.generate_cask_api! method
  • Added generation of combined packages.#{bottle_tag}.json files containing both formulae and casks data
  • Added strip_unneeded_internal_fields method (currently unused) to API module

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.

File Description
Library/Homebrew/dev-cmd/generate-cask-api.rb Simplified to delegate to the new Homebrew::API.generate_cask_api! method
Library/Homebrew/api.rb Added generate_cask_api! method, cask_html_template helper, and strip_unneeded_internal_fields method
Library/Homebrew/dev-cmd/generate-formula-api.rb Calls generate_cask_api! to retrieve cask data and generates combined packages.#{bottle_tag}.json files

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 146 to 147
core_aliases:, core_renames:, core_tap_migrations:,
cask_renames:, cask_tap_migrations: }
Copy link

Copilot AI Nov 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent indentation in hash literal. The hash should have consistent indentation across all lines. Lines 146-147 should be indented to align with line 145.

Suggested change
core_aliases:, core_renames:, core_tap_migrations:,
cask_renames:, cask_tap_migrations: }
core_aliases:, core_renames:, core_tap_migrations:,
cask_renames:, cask_tap_migrations: }

Copilot uses AI. Check for mistakes.
Comment on lines 166 to 207
sig { params(json: T::Hash[String, T.untyped]).returns(T::Hash[String, T.untyped]) }
def self.strip_unneeded_internal_fields(json)
json.slice(*%w[
caveats
uses_from_macos
requirements
specs
_dependencies
desc
homepage
license
revision
version_scheme
urls
versions
no_autobump_msg
bottle
pour_bottle_only_if
keg_only_reason
deprecation_date
deprecation_replacement_formula
deprecation_replacement_cask
disable_date
disable_reason
disable_replacement_formula
disable_replacement_cask
conflicts_with
conflicts_with_reasons
link_overwrite
post_install_defined
service
tap_git_head
oldnames
oldname
aliases
versioned_formulae
ruby_source_path
ruby_source_checksum
ruby_source_sha256
])
end

Copy link

Copilot AI Nov 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The strip_unneeded_internal_fields method is defined but never used in this PR or in the visible code. If this method is not needed yet, consider removing it or adding a comment explaining its future use.

Suggested change
sig { params(json: T::Hash[String, T.untyped]).returns(T::Hash[String, T.untyped]) }
def self.strip_unneeded_internal_fields(json)
json.slice(*%w[
caveats
uses_from_macos
requirements
specs
_dependencies
desc
homepage
license
revision
version_scheme
urls
versions
no_autobump_msg
bottle
pour_bottle_only_if
keg_only_reason
deprecation_date
deprecation_replacement_formula
deprecation_replacement_cask
disable_date
disable_reason
disable_replacement_formula
disable_replacement_cask
conflicts_with
conflicts_with_reasons
link_overwrite
post_install_defined
service
tap_git_head
oldnames
oldname
aliases
versioned_formulae
ruby_source_path
ruby_source_checksum
ruby_source_sha256
])
end

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's worth programmatically aligning this with what's actually called in Formulary::load_formula_from_json!. I know there have been some cases where an item was added to the API but not to the API loader, so I wonder if we can avoid that.

I will brainstorm, but that's definitely something outside the scope of this initial PR

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's worth programmatically aligning this with what's actually called in Formulary::load_formula_from_json!. I know there have been some cases where an item was added to the API but not to the API loader, so I wonder if we can avoid that.

I will brainstorm, but that's definitely something outside the scope of this initial PR

Yeh, I agree. I wonder if something like a Sorbet struct would be best fit here.

Comment on lines 129 to 133
internal_formulae = all_formulae.to_h do |name, formula|
formula = Homebrew::API.merge_variations(formula, bottle_tag:)

[name, formula]
end
Copy link

Copilot AI Nov 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code duplication: internal_formulae is essentially recreating the same data structure as the formulae variable created earlier (lines 78-122), but without the transformation to the condensed format. Consider if this duplication is necessary or if the full formula data structure should be reused from all_formulae directly.

Copilot uses AI. Check for mistakes.
Comment on lines 137 to 147
formulae = internal_formulae
casks = casks_json_contents.fetch(:casks)
core_aliases = aliases
core_renames = renames
core_tap_migrations = tap_migrations
cask_renames = casks_json_contents.fetch(:renames)
cask_tap_migrations = casks_json_contents.fetch(:tap_migrations)

internal_json_contents = { formulae:, casks:,
core_aliases:, core_renames:, core_tap_migrations:,
cask_renames:, cask_tap_migrations: }
Copy link

Copilot AI Nov 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The intermediate variable assignments on lines 137-143 add unnecessary complexity without providing clarity. The variables are only used once in the hash literal below. Consider either using the values directly in the hash or adding comments to explain why these intermediate variables are needed.

Suggested change
formulae = internal_formulae
casks = casks_json_contents.fetch(:casks)
core_aliases = aliases
core_renames = renames
core_tap_migrations = tap_migrations
cask_renames = casks_json_contents.fetch(:renames)
cask_tap_migrations = casks_json_contents.fetch(:tap_migrations)
internal_json_contents = { formulae:, casks:,
core_aliases:, core_renames:, core_tap_migrations:,
cask_renames:, cask_tap_migrations: }
internal_json_contents = {
formulae: internal_formulae,
casks: casks_json_contents.fetch(:casks),
core_aliases: aliases,
core_renames: renames,
core_tap_migrations: tap_migrations,
cask_renames: casks_json_contents.fetch(:renames),
cask_tap_migrations: casks_json_contents.fetch(:tap_migrations),
}

Copilot uses AI. Check for mistakes.
core_aliases:, core_renames:, core_tap_migrations:,
cask_renames:, cask_tap_migrations: }
unless args.dry_run?
puts "api/internal/packages.#{bottle_tag}.json"
Copy link

Copilot AI Nov 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debug output left in code. The puts statement appears to be debug code that should be removed or replaced with proper logging using ohai or similar Homebrew logging methods.

Suggested change
puts "api/internal/packages.#{bottle_tag}.json"
ohai "api/internal/packages.#{bottle_tag}.json"

Copilot uses AI. Check for mistakes.
[name, formula]
end

casks_json_contents = all_casks.fetch(bottle_tag.to_sym)
Copy link

Copilot AI Nov 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using fetch without a default value will raise a KeyError if the key doesn't exist. Since this is called with dry_run: true on line 71, consider verifying that generate_cask_api! returns all expected bottle tags in its hash, or handle the potential KeyError exception.

Suggested change
casks_json_contents = all_casks.fetch(bottle_tag.to_sym)
casks_json_contents = all_casks.fetch(bottle_tag.to_sym, { casks: [], renames: {}, tap_migrations: {} })

Copilot uses AI. Check for mistakes.
File.write("api/cask-source/#{name}.rb", cask_source)
File.write("cask/#{name}.html", html_template_name)
end
rescue
Copy link

Copilot AI Nov 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bare rescue without specifying an exception type will catch all exceptions including SystemExit and SignalException, which can make debugging difficult and prevent proper program termination. Consider rescuing a specific exception type or at minimum StandardError.

Suggested change
rescue
rescue StandardError

Copilot uses AI. Check for mistakes.
end
rescue
onoe "Error while generating data for cask '#{path.stem}'."
Kernel.raise
Copy link

Copilot AI Nov 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent use of Kernel.raise vs raise. On line 415, Kernel.raise is used, but on line 450, a bare raise is used within the rescue block. For consistency and to satisfy Sorbet's strict typing requirements, use Kernel.raise consistently throughout the new method.

Copilot uses AI. Check for mistakes.
Copy link
Member

@Rylan12 Rylan12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will take a closer look tomorrow, but I think this seems like a good approach!

I think we should deprecate generate-cask-api and generate-formula-api in favor of a single generate-package-api that does all of the api files except analytics, unless there's still a good reason to keep them separate (I don't remember what the initial thinking was)

Edit: one potential reason might be to keep generation time low, since the formula/cask jobs could run in parallel?

Comment on lines 166 to 207
sig { params(json: T::Hash[String, T.untyped]).returns(T::Hash[String, T.untyped]) }
def self.strip_unneeded_internal_fields(json)
json.slice(*%w[
caveats
uses_from_macos
requirements
specs
_dependencies
desc
homepage
license
revision
version_scheme
urls
versions
no_autobump_msg
bottle
pour_bottle_only_if
keg_only_reason
deprecation_date
deprecation_replacement_formula
deprecation_replacement_cask
disable_date
disable_reason
disable_replacement_formula
disable_replacement_cask
conflicts_with
conflicts_with_reasons
link_overwrite
post_install_defined
service
tap_git_head
oldnames
oldname
aliases
versioned_formulae
ruby_source_path
ruby_source_checksum
ruby_source_sha256
])
end

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's worth programmatically aligning this with what's actually called in Formulary::load_formula_from_json!. I know there have been some cases where an item was added to the API but not to the API loader, so I wonder if we can avoid that.

I will brainstorm, but that's definitely something outside the scope of this initial PR

@MikeMcQuaid MikeMcQuaid marked this pull request as draft November 17, 2025 08:59
Copilot AI review requested due to automatic review settings November 18, 2025 03:09
Copilot finished reviewing on behalf of Rylan12 November 18, 2025 03:12
@Rylan12
Copy link
Member

Rylan12 commented Nov 18, 2025

I pushed some changes, moving the generation to a new generate-package-api command that can accept --only-core or --only-cask to work on only a particular repo.

It's very much a work in progress, but hopefully you can see the vision. I left several to-do comments where I didn't finish working.

I also haven't yet done any cask filtering, and realized I removed the filtering function you had written, apologies! (It's still in the original commit, I didn't force push. Just not the latest version)

As always, feel free to modify/change direction/revert as desired.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated 14 comments.

Files not reviewed (1)
  • Library/Homebrew/sorbet/rbi/dsl/homebrew/dev_cmd/generate_package_api.rbi: Language not supported

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 221 to 222
# TODO: only include the relevant bottle info
# I think sample code used to live in `generate-formula-api`, but this commit removes it...
Copy link

Copilot AI Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TODO comment indicates uncertainty about how to properly handle bottle information. The old code in generate-formula-api.rb had logic to extract specific bottle fields like pkg_version, version_scheme, rebuild, sha256, and dependencies. This comment suggests that logic may not have been properly ported yet.

The old code contained logic like:

version = Version.new(formula.dig("versions", "stable"))
pkg_version = PkgVersion.new(version, formula["revision"])
# ... more bottle-specific processing

If this processing is missing, the bottle data structure in the internal API might be incorrect or incomplete.

Copilot uses AI. Check for mistakes.
service
tap_git_head
oldnames
oldname
Copy link

Copilot AI Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent field naming: The method includes both oldname (line 199) and oldnames (line 198) in the list. Based on the PROPERTIES definition in InternalFormulaHash, only oldnames (plural) exists as a valid field. The singular oldname appears to be incorrect and should likely be removed unless there's a specific reason for keeping both variants.

Suggested change
oldname

Copilot uses AI. Check for mistakes.
caveats
uses_from_macos
requirements
specs
Copy link

Copilot AI Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The specs field (line 172) is included in the slice list but doesn't appear in the InternalFormulaHash PROPERTIES definition. This field may be unused or incorrectly named. Verify whether this field exists in the actual data structure or if it should be removed from the list.

Suggested change
specs

Copilot uses AI. Check for mistakes.

unless dry_run?
directories = ["_data/#{type}", "api/#{type}", type.to_s]
directories << "api/cask_source" if type == :cask
Copy link

Copilot AI Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The directory cleanup logic is inconsistent. Line 54 shows that "api/cask_source" is added to the directories list when type == :cask, but on line 53 the base directories only include "api/#{type}" which would be "api/cask". However, the actual directory that should be in the list is "api/cask-source" (with a hyphen), not "api/cask_source" (with an underscore).

This mismatch means:

  1. The old code created/cleaned up "api/cask-source" (hyphenated)
  2. The new code would try to clean up "api/cask_source" (underscored)

This should be corrected to:

directories << "api/cask-source" if type == :cask
Suggested change
directories << "api/cask_source" if type == :cask
directories << "api/cask-source" if type == :cask

Copilot uses AI. Check for mistakes.
Comment on lines +72 to +73
# TODO: double check that this is fine for formulae, since they used -1 before for some reason
latest_macos = MacOSVersion.new(HOMEBREW_MACOS_NEWEST_SUPPORTED).to_sym
Copy link

Copilot AI Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment suggests that formulae used -1 before (referring to HOMEBREW_MACOS_NEWEST_UNSUPPORTED - 1), but the code now uses HOMEBREW_MACOS_NEWEST_SUPPORTED for both formulae and casks.

The old formula code used:

MacOSVersion.new((HOMEBREW_MACOS_NEWEST_UNSUPPORTED.to_i - 1).to_s).to_sym

While the old cask code used:

MacOSVersion.new(HOMEBREW_MACOS_NEWEST_SUPPORTED).to_sym

This is a potential breaking change that changes the macOS version used for formula generation. This should either be intentional and the TODO removed, or the original logic should be preserved with different values for formulae vs casks.

Suggested change
# TODO: double check that this is fine for formulae, since they used -1 before for some reason
latest_macos = MacOSVersion.new(HOMEBREW_MACOS_NEWEST_SUPPORTED).to_sym
# Use HOMEBREW_MACOS_NEWEST_UNSUPPORTED - 1 for formulae, HOMEBREW_MACOS_NEWEST_SUPPORTED for casks
latest_macos = if type == :formula
MacOSVersion.new((HOMEBREW_MACOS_NEWEST_UNSUPPORTED.to_i - 1).to_s).to_sym
else
MacOSVersion.new(HOMEBREW_MACOS_NEWEST_SUPPORTED).to_sym
end

Copilot uses AI. Check for mistakes.
versioned_formulae
ruby_source_path
ruby_source_checksum
ruby_source_sha256
Copy link

Copilot AI Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method includes both ruby_source_checksum (line 203) and ruby_source_sha256 (line 204) in the list. Based on the InternalFormulaHash PROPERTIES definition, only ruby_source_checksum exists. The ruby_source_sha256 field appears to be extraneous or incorrectly named and should likely be removed unless there's a specific data structure that uses this alternate field name.

Suggested change
ruby_source_sha256

Copilot uses AI. Check for mistakes.
Comment on lines +167 to +175
redirect_from_string = ("redirect_from: /formula-linux/#{title}\n" if type == :formula)

<<~EOS
---
title: '#{title}'
layout: #{type}
#{redirect_from_string}---
{{ content }}
EOS
Copy link

Copilot AI Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a potential formatting issue with the YAML front matter. When redirect_from_string is nil (for cask type), this will produce an extra blank line before the closing ---, resulting in:

---
title: 'example'
layout: cask

---

This should be:

---
title: 'example'
layout: cask
---

Consider using a conditional to handle the trailing newline, e.g.:

redirect_from_line = "redirect_from: /formula-linux/#{title}" if type == :formula

<<~EOS
  ---
  title: '#{title}'
  layout: #{type}
  #{redirect_from_line}
  ---
  {{ content }}
EOS
Suggested change
redirect_from_string = ("redirect_from: /formula-linux/#{title}\n" if type == :formula)
<<~EOS
---
title: '#{title}'
layout: #{type}
#{redirect_from_string}---
{{ content }}
EOS
lines = [
"---",
"title: '#{title}'",
"layout: #{type}"
]
lines << "redirect_from: /formula-linux/#{title}" if type == :formula
lines << "---"
"#{lines.join("\n")}\n{{ content }}\n"

Copilot uses AI. Check for mistakes.
Comment on lines +86 to +89
# TODO: add cask-source
File.write("_data/#{type}/#{name.tr("+", "_")}.json", "#{json}\n")
File.write("api/#{type}/#{name}.json", json_template(type: type))
File.write("#{type}/#{name}.html", html_template(name, type: type))
Copy link

Copilot AI Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TODO comment indicates that cask source file writing is not implemented. The old code in generate-cask-api.rb included:

File.write("api/cask-source/#{name}.rb", cask_source)

However, this functionality is missing in the new implementation. The cask_source variable would need to be obtained from path.read and written to the appropriate file. This is a regression from the previous implementation.

Suggested change
# TODO: add cask-source
File.write("_data/#{type}/#{name.tr("+", "_")}.json", "#{json}\n")
File.write("api/#{type}/#{name}.json", json_template(type: type))
File.write("#{type}/#{name}.html", html_template(name, type: type))
# Write cask source file if type is :cask
File.write("_data/#{type}/#{name.tr("+", "_")}.json", "#{json}\n")
File.write("api/#{type}/#{name}.json", json_template(type: type))
File.write("#{type}/#{name}.html", html_template(name, type: type))
if type == :cask
# The cask hash should have a :path key with the path to the cask file
cask_path = hash[:path] || hash["path"]
if cask_path && File.exist?(cask_path)
cask_source = File.read(cask_path)
File.write("api/cask_source/#{name}.rb", cask_source)
end
end

Copilot uses AI. Check for mistakes.
tag = Utils::Bottles::Tag.from_symbol(tag)
bottle_collector.add tag, checksum: Checksum.new(data["sha256"]), cellar: :any
end
# odeprecated "brew generate-formula-api", "brew generate-package-api --only-core"
Copy link

Copilot AI Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commented-out odeprecated lines should either be uncommented to properly deprecate these commands, or removed entirely if deprecation is not desired. Leaving them commented suggests the intent was to deprecate but it's currently not functional.

If deprecation is intended:

odeprecated "brew generate-formula-api", "brew generate-package-api --only-core"
Suggested change
# odeprecated "brew generate-formula-api", "brew generate-package-api --only-core"
odeprecated "brew generate-formula-api", "brew generate-package-api --only-core"

Copilot uses AI. Check for mistakes.
renames = {}
variation_casks = all_casks.to_h do |token, cask|
cask = Homebrew::API.merge_variations(cask, bottle_tag:)
# odeprecated "brew generate-cask-api", "brew generate-package-api --only-cask"
Copy link

Copilot AI Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commented-out odeprecated lines should either be uncommented to properly deprecate these commands, or removed entirely if deprecation is not desired. Leaving them commented suggests the intent was to deprecate but it's currently not functional.

If deprecation is intended:

odeprecated "brew generate-cask-api", "brew generate-package-api --only-cask"
Suggested change
# odeprecated "brew generate-cask-api", "brew generate-package-api --only-cask"
odeprecated "brew generate-cask-api", "brew generate-package-api --only-cask"

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings November 20, 2025 04:23
Copilot finished reviewing on behalf of Rylan12 November 20, 2025 04:26
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated 16 comments.

Files not reviewed (1)
  • Library/Homebrew/sorbet/rbi/dsl/homebrew/dev_cmd/generate_package_api.rbi: Language not supported

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +57 to +101
FileUtils.rm_rf "_data/formula_canonical.json" if type == :formula
FileUtils.rm_rf directories
FileUtils.mkdir_p directories
end

Homebrew.with_no_api_env do
tap_migrations_json = JSON.dump(tap.tap_migrations)
File.write("api/#{type}_tap_migrations.json", tap_migrations_json) unless dry_run?

if type == :formula
Formulary.enable_factory_cache!
::Formula.generating_hash!
else
::Cask::Cask.generating_hash!
end

# TODO: double check that this is fine for formulae, since they used -1 before for some reason
latest_macos = MacOSVersion.new(HOMEBREW_MACOS_NEWEST_SUPPORTED).to_sym
Homebrew::SimulateSystem.with(os: latest_macos, arch: :arm) do
all_packages = if type == :formula
formulae(tap)
else
casks(tap)
end

return if dry_run?

all_packages.each do |name, hash|
json = JSON.pretty_generate(hash)

# TODO: add cask-source
File.write("_data/#{type}/#{name.tr("+", "_")}.json", "#{json}\n")
File.write("api/#{type}/#{name}.json", json_template(type: type))
File.write("#{type}/#{name}.html", html_template(name, type: type))
end
end

renames = if type == :formula
tap.formula_renames.merge(tap.alias_table)
else
tap.cask_renames
end

canonical_json = JSON.pretty_generate(renames)
File.write("_data/#{type}_canonical.json", "#{canonical_json}\n") unless dry_run?
Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original generate-cask-api.rb created files at _data/cask_canonical.json but the new code on line 101 uses _data/#{type}_canonical.json. For casks, this would be _data/cask_canonical.json, which matches. However, for formulae it would be _data/formula_canonical.json.

The original generate-formula-api.rb used _data/formula_canonical.json, so this is consistent. However, line 57 only removes _data/formula_canonical.json when type == :formula, but the cleanup on line 58 removes all directories. This could lead to inconsistent state if there's a partial failure.

Consider making the cleanup logic more consistent and ensuring all related files for a type are cleaned up together.

Copilot uses AI. Check for mistakes.
Comment on lines +84 to +90
all_packages.each do |name, hash|
json = JSON.pretty_generate(hash)

# TODO: add cask-source
File.write("_data/#{type}/#{name.tr("+", "_")}.json", "#{json}\n")
File.write("api/#{type}/#{name}.json", json_template(type: type))
File.write("#{type}/#{name}.html", html_template(name, type: type))
Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the generate_api! method, when type == :cask, the code iterates over all_packages (line 84) but doesn't have access to the original path object needed to read the cask source file. The original implementation had:

tap.cask_files.each do |path|
  cask_source = path.read
  File.write("api/cask-source/#{name}.rb", cask_source)

The refactored code calls casks(tap) which returns a hash without path information, making it impossible to implement the cask source file writing. The casks method should either return paths alongside the hashes, or the source writing logic needs to be handled differently.

Suggested change
all_packages.each do |name, hash|
json = JSON.pretty_generate(hash)
# TODO: add cask-source
File.write("_data/#{type}/#{name.tr("+", "_")}.json", "#{json}\n")
File.write("api/#{type}/#{name}.json", json_template(type: type))
File.write("#{type}/#{name}.html", html_template(name, type: type))
if type == :cask
tap.cask_files.each do |path|
name = path.basename(".rb").to_s
hash = all_packages[name]
next unless hash
json = JSON.pretty_generate(hash)
cask_source = path.read
File.write("_data/#{type}/#{name.tr("+", "_")}.json", "#{json}\n")
File.write("api/#{type}/#{name}.json", json_template(type: type))
File.write("#{type}/#{name}.html", html_template(name, type: type))
File.write("api/cask_source/#{name}.rb", cask_source)
end
else
all_packages.each do |name, hash|
json = JSON.pretty_generate(hash)
File.write("_data/#{type}/#{name.tr("+", "_")}.json", "#{json}\n")
File.write("api/#{type}/#{name}.json", json_template(type: type))
File.write("#{type}/#{name}.html", html_template(name, type: type))
end

Copilot uses AI. Check for mistakes.
Comment on lines +142 to +156
# @formulae ||= T.let(T.must(tap.formula_names.slice(0, 2)).to_h do |name|
@formulae ||= T.let(tap.formula_names.to_h do |name|
debug_load!(name, type: :formula)
formula = Formulary.factory(name)
[formula.name, formula.to_hash_with_variations]
rescue
onoe "Error while generating data for formula '#{name}'."
raise
end, T.nilable(T::Hash[String, T.untyped]))
end

sig { params(tap: Tap).returns(T::Hash[String, T.untyped]) }
def casks(tap)
reset_debugging!
# @casks ||= T.let(T.must(tap.cask_files.slice(0, 2)).to_h do |path|
Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commented-out code on lines 142 and 156 should be removed before merging. These appear to be debug/testing code that limits the generation to only 2 items.

If this is intentional for testing during development, consider adding a proper flag or configuration option instead of leaving commented code.

Copilot uses AI. Check for mistakes.
Comment on lines +217 to +251
class InternalCaskHash < T::Struct
include CompactSerializable

# TODO: simplify these types when possible
PROPERTIES = T.let({
artifacts: T::Array[T.untyped],
auto_updates: T::Boolean,
caveats: T::Array[String],
conflicts_with: T::Array[String],
container: T::Hash[String, T.untyped],
depends_on: ::Cask::DSL::DependsOn,
deprecation_date: String,
deprecation_reason: String,
desc: String,
disable_date: String,
disable_reason: String,
homepage: String,
name: T::Array[String],
rename: T::Array[String],
sha256: Checksum,
url: ::Cask::URL,
url_specs: T::Hash[Symbol, T.untyped],
version: String,
}.freeze, T::Hash[Symbol, T.untyped])

PROPERTIES.each do |property, type|
const property, T.nilable(type)
end

sig { params(hash: T::Hash[String, T.untyped], bottle_tag: ::Utils::Bottles::Tag).returns(InternalCaskHash) }
def self.from_hash(hash, bottle_tag:)
hash = Homebrew::API.merge_variations(hash, bottle_tag: bottle_tag).transform_keys(&:to_sym)
new(**hash.slice(*PROPERTIES.keys))
end
end
Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The InternalCaskHash class (lines 217-251) is defined but never used. Line 118 has it commented out in favor of Homebrew::API.merge_variations.

Either implement the usage of InternalCaskHash or remove this dead code. If it's planned for future use, add a comment explaining why it's not yet used.

Suggested change
class InternalCaskHash < T::Struct
include CompactSerializable
# TODO: simplify these types when possible
PROPERTIES = T.let({
artifacts: T::Array[T.untyped],
auto_updates: T::Boolean,
caveats: T::Array[String],
conflicts_with: T::Array[String],
container: T::Hash[String, T.untyped],
depends_on: ::Cask::DSL::DependsOn,
deprecation_date: String,
deprecation_reason: String,
desc: String,
disable_date: String,
disable_reason: String,
homepage: String,
name: T::Array[String],
rename: T::Array[String],
sha256: Checksum,
url: ::Cask::URL,
url_specs: T::Hash[Symbol, T.untyped],
version: String,
}.freeze, T::Hash[Symbol, T.untyped])
PROPERTIES.each do |property, type|
const property, T.nilable(type)
end
sig { params(hash: T::Hash[String, T.untyped], bottle_tag: ::Utils::Bottles::Tag).returns(InternalCaskHash) }
def self.from_hash(hash, bottle_tag:)
hash = Homebrew::API.merge_variations(hash, bottle_tag: bottle_tag).transform_keys(&:to_sym)
new(**hash.slice(*PROPERTIES.keys))
end
end

Copilot uses AI. Check for mistakes.
Comment on lines +19 to +26
module StringifySymbols
sig { params(value: T.nilable(T.any(String, Symbol))).returns(T.nilable(String)) }
def to_symbolized_string(value)
return value unless value.is_a? Symbol

":#{value}"
end
end
Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The StringifySymbols module is defined but the to_symbolized_string method is only called once in the entire file (line 164), where it's called as self.to_symbolized_string. This suggests the module is included in FormulaHash, but that's not clear from the code structure.

The method should be made a private instance method of FormulaHash or the module usage should be clarified with proper documentation.

Copilot uses AI. Check for mistakes.
Comment on lines +82 to +90
return if dry_run?

all_packages.each do |name, hash|
json = JSON.pretty_generate(hash)

# TODO: add cask-source
File.write("_data/#{type}/#{name.tr("+", "_")}.json", "#{json}\n")
File.write("api/#{type}/#{name}.json", json_template(type: type))
File.write("#{type}/#{name}.html", html_template(name, type: type))
Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The early return on line 82 when dry_run? is true means the rest of the method never executes, including the canonical JSON generation (lines 94-101). This is different from the original implementation which would still execute the canonical JSON logic even in dry-run mode (it just wouldn't write files).

This means in dry-run mode, the refactored code doesn't test whether the canonical JSON generation would work. Consider moving the dry-run check inside the file writing operations instead of returning early.

Suggested change
return if dry_run?
all_packages.each do |name, hash|
json = JSON.pretty_generate(hash)
# TODO: add cask-source
File.write("_data/#{type}/#{name.tr("+", "_")}.json", "#{json}\n")
File.write("api/#{type}/#{name}.json", json_template(type: type))
File.write("#{type}/#{name}.html", html_template(name, type: type))
# In dry-run mode, still generate JSON/HTML, but do not write files.
all_packages.each do |name, hash|
json = JSON.pretty_generate(hash)
# TODO: add cask-source
unless dry_run?
File.write("_data/#{type}/#{name.tr("+", "_")}.json", "#{json}\n")
File.write("api/#{type}/#{name}.json", json_template(type: type))
File.write("#{type}/#{name}.html", html_template(name, type: type))
end

Copilot uses AI. Check for mistakes.
Comment on lines +73 to +74
# TODO: double check that this is fine for formulae, since they used -1 before for some reason
latest_macos = MacOSVersion.new(HOMEBREW_MACOS_NEWEST_SUPPORTED).to_sym
Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TODO comment on line 73 questions whether the macOS version logic is correct for formulae. The original code used HOMEBREW_MACOS_NEWEST_UNSUPPORTED.to_i - 1 for formulae but HOMEBREW_MACOS_NEWEST_SUPPORTED for casks.

This difference was intentional in the original code and may represent a functional change. The comment should either be resolved before merging or expanded to explain the decision to unify this behavior.

Suggested change
# TODO: double check that this is fine for formulae, since they used -1 before for some reason
latest_macos = MacOSVersion.new(HOMEBREW_MACOS_NEWEST_SUPPORTED).to_sym
# Use previous macOS version for formulae (as in original code), and newest supported for casks.
latest_macos = if type == :formula
MacOSVersion.new(HOMEBREW_MACOS_NEWEST_UNSUPPORTED.to_i - 1).to_sym
else
MacOSVersion.new(HOMEBREW_MACOS_NEWEST_SUPPORTED).to_sym
end

Copilot uses AI. Check for mistakes.
Comment on lines +205 to +216
module CompactSerializable
extend T::Helpers

requires_ancestor { T::Struct }

sig { params(args: T.untyped).returns(String) }
def to_json(*args)
# TODO: this should recursively remove nils from nested hashes/arrays too
serialize.compact.to_json(*args)
end
end

Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CompactSerializable module is defined twice - once in generator.rb (lines 205-215) and once in formula_hash.rb (lines 8-17). These have different implementations:

  • In generator.rb: overrides to_json to compact and serialize
  • In formula_hash.rb: overrides serialize to use compact_blank

This duplication will cause issues. The module should be defined in only one place and imported where needed.

Suggested change
module CompactSerializable
extend T::Helpers
requires_ancestor { T::Struct }
sig { params(args: T.untyped).returns(String) }
def to_json(*args)
# TODO: this should recursively remove nils from nested hashes/arrays too
serialize.compact.to_json(*args)
end
end

Copilot uses AI. Check for mistakes.
Comment on lines +142 to +164
# @formulae ||= T.let(T.must(tap.formula_names.slice(0, 2)).to_h do |name|
@formulae ||= T.let(tap.formula_names.to_h do |name|
debug_load!(name, type: :formula)
formula = Formulary.factory(name)
[formula.name, formula.to_hash_with_variations]
rescue
onoe "Error while generating data for formula '#{name}'."
raise
end, T.nilable(T::Hash[String, T.untyped]))
end

sig { params(tap: Tap).returns(T::Hash[String, T.untyped]) }
def casks(tap)
reset_debugging!
# @casks ||= T.let(T.must(tap.cask_files.slice(0, 2)).to_h do |path|
@casks ||= T.let(tap.cask_files.to_h do |path|
debug_load!(path.stem, type: :cask)
cask = ::Cask::CaskLoader.load(path)
[cask.token, cask.to_hash_with_variations]
rescue
onoe "Error while generating data for cask '#{path.stem}'."
raise
end, T.nilable(T::Hash[String, T.untyped]))
Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The formulae and casks methods use instance variables for memoization (@formulae and @casks), but this could cause issues when the same Generator instance is used to generate data multiple times with different parameters.

Since these methods are called multiple times in generate_packages_api! (lines 113, 117), the memoization should be safe. However, consider whether a fresh Generator instance should be created for each generation run, or if the memoization is intentional for performance.

Suggested change
# @formulae ||= T.let(T.must(tap.formula_names.slice(0, 2)).to_h do |name|
@formulae ||= T.let(tap.formula_names.to_h do |name|
debug_load!(name, type: :formula)
formula = Formulary.factory(name)
[formula.name, formula.to_hash_with_variations]
rescue
onoe "Error while generating data for formula '#{name}'."
raise
end, T.nilable(T::Hash[String, T.untyped]))
end
sig { params(tap: Tap).returns(T::Hash[String, T.untyped]) }
def casks(tap)
reset_debugging!
# @casks ||= T.let(T.must(tap.cask_files.slice(0, 2)).to_h do |path|
@casks ||= T.let(tap.cask_files.to_h do |path|
debug_load!(path.stem, type: :cask)
cask = ::Cask::CaskLoader.load(path)
[cask.token, cask.to_hash_with_variations]
rescue
onoe "Error while generating data for cask '#{path.stem}'."
raise
end, T.nilable(T::Hash[String, T.untyped]))
@formulae_by_tap ||= {}
tap_key = tap.respond_to?(:name) ? tap.name : tap.to_s
return @formulae_by_tap[tap_key] if @formulae_by_tap.key?(tap_key)
@formulae_by_tap[tap_key] = tap.formula_names.to_h do |name|
debug_load!(name, type: :formula)
formula = Formulary.factory(name)
[formula.name, formula.to_hash_with_variations]
rescue
onoe "Error while generating data for formula '#{name}'."
raise
end
@formulae_by_tap[tap_key]
end
sig { params(tap: Tap).returns(T::Hash[String, T.untyped]) }
def casks(tap)
reset_debugging!
@casks_by_tap ||= {}
tap_key = tap.respond_to?(:name) ? tap.name : tap.to_s
return @casks_by_tap[tap_key] if @casks_by_tap.key?(tap_key)
@casks_by_tap[tap_key] = tap.cask_files.to_h do |path|
debug_load!(path.stem, type: :cask)
cask = ::Cask::CaskLoader.load(path)
[cask.token, cask.to_hash_with_variations]
rescue
onoe "Error while generating data for cask '#{path.stem}'."
raise
end
@casks_by_tap[tap_key]

Copilot uses AI. Check for mistakes.
Comment on lines +210 to +213
sig { params(args: T.untyped).returns(String) }
def to_json(*args)
# TODO: this should recursively remove nils from nested hashes/arrays too
serialize.compact.to_json(*args)
Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The to_json method signature on line 210 uses args: T.untyped but should likely use a more specific type or variadic parameters. The typical signature for to_json would be:

sig { params(args: T.untyped, kwargs: T.untyped).returns(String) }

However, given this is overriding a method, verify what the expected signature should be.

Suggested change
sig { params(args: T.untyped).returns(String) }
def to_json(*args)
# TODO: this should recursively remove nils from nested hashes/arrays too
serialize.compact.to_json(*args)
sig { params(args: T.untyped, kwargs: T.untyped).returns(String) }
def to_json(*args, **kwargs)
# TODO: this should recursively remove nils from nested hashes/arrays too
serialize.compact.to_json(*args, **kwargs)

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants