Skip to content

fix(kafka): raise default protocol version from 0.10.2.0 to 2.1.0#7890

Open
officialasishkumar wants to merge 2 commits intocadence-workflow:masterfrom
officialasishkumar:fix/kafka-default-version-too-low
Open

fix(kafka): raise default protocol version from 0.10.2.0 to 2.1.0#7890
officialasishkumar wants to merge 2 commits intocadence-workflow:masterfrom
officialasishkumar:fix/kafka-default-version-too-low

Conversation

@officialasishkumar
Copy link
Copy Markdown
Contributor

@officialasishkumar officialasishkumar commented Apr 4, 2026

What changed?

Raised the default Kafka protocol version from 0.10.2.0 to 2.1.0 and applied consistent version configuration to the producer path. Closes #7757.

Why?

Sarama ≥ v1.45 performs API-version negotiation when connecting to Kafka brokers. When the client advertises protocol version 0.10.2.0 (the previous hard-coded default), the handshake with modern brokers (Kafka ≥ 2.x) can fail, crashing the Cadence server on startup with kafka: client has run out of available brokers to talk to.

Additionally, the producer creation path (newProducerByTopic) was not setting Config.Version at all, leaving it at Sarama's internal minimum. This meant producers and consumers could negotiate using different protocol versions, leading to inconsistent behavior.

The new default 2.1.0 (released November 2018) is old enough to be universally compatible with any Kafka cluster still in production, yet new enough for Sarama's API negotiation to succeed. Existing deployments that explicitly set the kafka.version config are unaffected — the default only applies when no version is specified.

How did you test it?

go build ./common/messaging/kafka/...
go test -v -run TestDefaultKafkaVersion -count=1 ./common/messaging/kafka/...
go test -v ./common/messaging/kafka/...

Verified that sarama.ParseKafkaVersion("2.1.0") succeeds (covered by the new TestDefaultKafkaVersion test). Both consumer and producer paths now resolve the version identically.

Potential risks

The default version change is backward-compatible: deployments already setting kafka.version in their config are unaffected. The only behavioral difference is for deployments relying on the implicit 0.10.2.0 default, and those deployments are exactly the ones that were broken by the Sarama upgrade (#7757). Kafka brokers running 2.1.0 or newer (released Nov 2018) are supported, which covers all reasonable production deployments.

Release notes

The default Kafka protocol version has been raised from 0.10.2.0 to 2.1.0 to fix startup failures when running Cadence with Sarama ≥ v1.45 against modern Kafka clusters (see #7757). Deployments that explicitly set kafka.version in their configuration are unaffected. Additionally, producers now respect the configured Kafka version (previously only consumers did).

Documentation Changes

No documentation changes needed. The kafka.version config field is set per-deployment in environment-specific config files (not in config/base.yaml, which only contains logging defaults).

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Raises Cadence’s default Kafka protocol version to avoid startup failures with Sarama’s API-version negotiation and ensures both consumer and producer paths apply the same version configuration.

Changes:

  • Introduces a defaultKafkaVersion constant (2.1.0) and uses it when kafka.version is unset.
  • Applies sarama.Config.Version consistently for both consumer and producer creation.
  • Adds a unit test ensuring the default version string is parseable by Sarama.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
common/messaging/kafka/client_impl.go Adds new default Kafka version constant and applies version parsing/configuration to consumer + producer paths.
common/messaging/kafka/client_impl_test.go Adds test to validate the default Kafka version constant is parseable.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread common/messaging/kafka/client_impl.go Outdated
func (c *clientImpl) NewConsumer(app, consumerName string) (messaging.Consumer, error) {
topics := c.config.GetTopicsForApplication(app)
// All defaut values are copied from uber/kafka-clientImpl bo keep the same behavior
// All default values are copied from uber/kafka-clientImpl to keep the same behavior
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment references "uber/kafka-clientImpl", but this project name/path doesn't appear anywhere else in the codebase and is likely a typo (the earlier comment above mentions "uber-go/kafka-client"). Consider correcting the reference to the actual library/module name to avoid confusion for future readers.

Suggested change
// All default values are copied from uber/kafka-clientImpl to keep the same behavior
// All default values are copied from uber-go/kafka-client to keep the same behavior

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, updated the reference to "uber-go/kafka-client".

Comment thread common/messaging/kafka/client_impl.go Outdated
Comment on lines +142 to +150
kafkaVersion := c.config.Version
if kafkaVersion == "" {
kafkaVersion = defaultKafkaVersion
}

version, err := sarama.ParseKafkaVersion(kafkaVersion)
if err != nil {
return nil, err
}
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kafka version resolution logic is duplicated between NewConsumer and newProducerByTopic (read config.Version, fall back to defaultKafkaVersion, parse). Consider extracting this into a small helper (e.g., on clientImpl) so future changes only need to be made in one place and can be unit-tested directly.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion. Extracted a resolveKafkaVersion() helper on clientImpl that both NewConsumer and newProducerByTopic now call.

Comment on lines +39 to +47
// defaultKafkaVersion is the Kafka protocol version used when the user does not
// specify one in the configuration. The previous default ("0.10.2.0") is too low
// for Sarama ≥ v1.45 which performs API-version negotiation: modern brokers
// (Kafka ≥ 1.0) may refuse the handshake when the client advertises such an
// old version, causing "client has run out of available brokers" errors.
// Version 2.1.0 (released Nov 2018) is old enough to be universally supported
// by any Kafka cluster still in production, yet new enough for Sarama's API
// negotiation to succeed.
defaultKafkaVersion = "2.1.0"
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description mentions that config/base.yaml already documents the kafka.version field, but in this repo config/base.yaml currently only contains logging defaults and no Kafka configuration. Either update the description or add/adjust documentation in the appropriate config template so users can discover the kafka.version setting.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, config/base.yaml only has logging defaults. I'll update the PR description to remove that claim.

@neil-xie
Copy link
Copy Markdown
Member

neil-xie commented Apr 7, 2026

Thanks for fixing this!
One quick question, did you try running docker to start the server to check if it can start successfully?

Sarama ≥ v1.45 performs Kafka API-version negotiation on connect.
When the client advertises protocol version 0.10.2.0 (the previous
hard-coded default) against a modern broker (Kafka ≥ 2.x), the
handshake can fail with "client has run out of available brokers".

Raise the fallback to 2.1.0, which is old enough (Nov 2018) to be
compatible with every Kafka cluster still in production, yet new
enough for the negotiation to succeed.

Additionally, the producer path (newProducerByTopic) was not
setting Config.Version at all, leaving it at Sarama's internal
minimum. Apply the same version resolution so that both consumers
and producers use a consistent, configurable protocol version.

Closes cadence-workflow#7757

Signed-off-by: Asish Kumar <officialasishkumar@gmail.com>
Fixed the comment to reference "uber-go/kafka-client" instead of
"uber/kafka-clientImpl". Extracted Kafka version resolution into a
helper method to eliminate duplication between consumer and producer
creation paths.

Signed-off-by: Asish Kumar <officialasishkumar@gmail.com>
@officialasishkumar officialasishkumar force-pushed the fix/kafka-default-version-too-low branch from 3db5980 to 5916dfb Compare April 7, 2026 20:44
@officialasishkumar
Copy link
Copy Markdown
Contributor Author

officialasishkumar commented Apr 7, 2026

i built a local auto-setup image from the branch, and started docker/docker-compose-async-wf-kafka.yml against that image. I then registered test-domain-pr7890, enabled queue1 for it, read the config back successfully, and confirmed in the cadence-worker logs that it created and started the async Kafka consumer on kafka:9092. Cadence and cadence-web both came up cleanly, and I did not hit any Kafka startup errors during the run.

@gitar-bot
Copy link
Copy Markdown

gitar-bot bot commented Apr 7, 2026

Code Review ✅ Approved

Kafka default protocol version upgraded from 0.10.2.0 to 2.1.0 with comment corrections and version helper extraction. No issues found.

Rules ✅ All requirements met

Repository Rules

PR Description Quality Standards: The PR description includes all required sections with substantive content covering what changed, why the version change was necessary, test commands, backward compatibility, and release notes.

2 rules not applicable. Show all rules by commenting gitar display:verbose.

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

Auto-apply Compact
gitar auto-apply:on         
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@officialasishkumar
Copy link
Copy Markdown
Contributor Author

officialasishkumar commented Apr 12, 2026

Are there any other review comments for this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Default Kafka protocol version cap is too low in Cadence 1.3.6

3 participants