-
Notifications
You must be signed in to change notification settings - Fork 4.1k
feat: OpenTelemetry configuration and BaseApp instrumentation #25516
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…dk context directly now
see telemetry/readme.md |
|
reviewers: telemetry is automagically instantiated via side-effect import init from this package file: https://github.com/cosmos/cosmos-sdk/blob/cfaffe2b286442a262e506845cdf2d05fb6ed2dc/telemetry/config.go to setup tracers/meters you then do a var/init pattern: var (
tracer = otel.Tracer("cosmos-sdk/baseapp")
meter = otel.Meter("cosmos-sdk/baseapp")
blockCnt metric.Int64Counter
txCnt metric.Int64Counter
)
func init() {
var err error
blockCnt, err = meter.Int64Counter("block.count")
if err != nil {
panic(err)
}
txCnt, err = meter.Int64Counter("tx.count")
if err != nil {
panic(err)
}
}please share opinions on this pattern. the other method we discussed was having some sort of telemetry struct which houses our meters/tracers which gets passed around. but seems more non-standard based on otel docs. |
telemetry/config.go
Outdated
| ) | ||
|
|
||
| const ( | ||
| OtelConfigEnvVar = "OTEL_EXPERIMENTAL_CONFIG_FILE" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We talked previously about how using init avoided atomic reference checks via the delegation field in otel, but I do agree that it would be nice to keep things more idiomatic.
Is there no way to parse the node dir passed via app config or cli and assume that the otel config is in the node's config directory?
Would that require us storing all of the meter variables internal to the app structs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this would just require us to initialize telemetry somewhere after the clientContext has been built with all the flags, envs, tomls, etc.
perhaps if we go this way, we can just chuck a default otel config in the home as part of appd init?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how any code inside of init can safely be aware of anything besides environment variables. We could technically have it parse os.Args, but that would be super brittle IMHO.
Waiting to intialize until after we've parsed flags, toml, etc. would just mean that you always pay the atomic.Value cost even with an otherwise noop config. Not the end of the world and maybe it will never show up in any benchmarks given everything else going on, but there is some non-zero cost
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah i just debugged this, if i initialize this in the start command, it ends up as an siCounter meter, which has the delegate atomic load.
if we just do init, we get the non-delegate loading version. but yeah, not sure how impactful this will be
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update: after finding that the delegate meter is only a few ns (2-5) we aligned on a late-initialization of the otel SDK in favor of improved devx. The otel SDK now reads from ~/.node_home/config/otel.yaml as part of appd start. An empty otel.yaml file is written to ~/.node_home/config during appd init.
| _ "cosmossdk.io/api/cosmos/app/v1alpha1" | ||
| fmt "fmt" | ||
| io "io" | ||
| reflect "reflect" |
Check notice
Code scanning / CodeQL
Sensitive package import Note
| _ "cosmossdk.io/api/cosmos/app/v1alpha1" | ||
| fmt "fmt" | ||
| io "io" | ||
| reflect "reflect" |
Check notice
Code scanning / CodeQL
Sensitive package import Note
| import ( | ||
| fmt "fmt" | ||
| io "io" | ||
| reflect "reflect" |
Check notice
Code scanning / CodeQL
Sensitive package import Note
| _ "cosmossdk.io/api/cosmos/app/v1alpha1" | ||
| fmt "fmt" | ||
| io "io" | ||
| reflect "reflect" |
Check notice
Code scanning / CodeQL
Sensitive package import Note
telemetry/config.go
Outdated
| // InitializeOpenTelemetry initializes the OpenTelemetry SDK. | ||
| // We assume that the otel configuration file is in `~/.<your_node_home>/config/otel.yaml`. | ||
| // An empty otel.yaml is automatically placed in the directory above in the `appd init` command. | ||
| func InitializeOpenTelemetry(homeDir string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer we just pass in a filename here instead of a dir. Then the user can use whatever filename they want. That would be better for testing too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, a more future proof approach here would be either an options struct, ex:
type Options struct {
ConfigFilename string
}
func InitializeOpenTelemetry(Options) error {or vararg Options ex:
type Option interface {
applyOption(*options)
}
func WithConfigFile(filename string) Option { ...}
func InitializeOpenTelemetry(...Option) error {otel go universally uses the ...Option pattern FWIW
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Description
This PR:
telemetry/package via https://pkg.go.dev/go.opentelemetry.io/contrib/otelconf/v0.3.0telemetry/which are based on https://pkg.go.dev/github.com/hashicorp/go-metrics which is under-maintained and requires mutex locks and map lookups on every telemetry methodBaseAppwith OpenTelemetry spans and block and tx counter metricsTestingMainfunction for tests to export telemetry dataconfigures thelog/slogdefault logger to send logs to OpenTelemetry and allows otelslog bridged to be used for logging. NOTE: this leaves a bit of a disconnect between the SDK's existing logging infrastructure which currently just writes to stdout - this can either be default with in this PR or dealt with in a follow upThe OpenTelemetry go libraries are very actively maintained, most vendors in the space are adding OpenTelemetry support and generally it seems like the industry is headed in this direction. Much of our existing telemetry code is to configure basic telemetry exporting, but with otelconf declarative config, we don't need to maintain any of this ourselves and the out of the box experience is quite simple even for usage in testing.
Closes: SDK-427