0xmirror/grpc-go.git
3 weeks ago internal/xds: change xds_resolver to use dependency manager (#8711)
eshitachandwani [Wed, 3 Dec 2025 07:33:35 +0000 (13:03 +0530)]
internal/xds: change xds_resolver to use dependency manager (#8711)

This change is part of
[A74](https://github.com/grpc/proposal/blob/master/A74-xds-config-tears.md)
implementation.

This PR removes the listener and route watchers from the resolver and
changes it to get the resources from the xDS dependency manager.

RELEASE NOTES:
* xds/resolver:
  * Changes the behavior such that getting no matching virtual host in a
    route resource will now drop any previous resource and report the error.
  * Changes the behavior so that receiving a configuration error (LDS/RDS
    ambient error) after a successful update will now only be logged, and
    the system will continue using the previous resource to avoid transient
    channel failures.
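
The two resolver behaviors described in these release notes can be illustrated with a minimal sketch. All names here (`routeConfig`, `resolverState`, and its methods) are hypothetical and are not the actual grpc-go types; the sketch only models "keep the last good resource on an ambient error" and "drop the resource when no virtual host matches".

```go
package main

import "fmt"

// routeConfig is a hypothetical stand-in for a parsed route resource.
type routeConfig struct{ name string }

// resolverState is a hypothetical holder for the last good configuration.
type resolverState struct {
	current *routeConfig
	logs    []string
}

// onUpdate stores a newly received, valid resource.
func (s *resolverState) onUpdate(rc *routeConfig) { s.current = rc }

// onAmbientError handles an error that arrives while a resource may already
// be cached. If one is cached, the error is only logged and the resource is
// kept. It returns true when the error must be surfaced to the channel.
func (s *resolverState) onAmbientError(msg string) bool {
	if s.current != nil {
		s.logs = append(s.logs, "ignoring ambient error: "+msg)
		return false
	}
	return true
}

// onNoMatchingVirtualHost drops the cached resource; the caller is expected
// to report the error.
func (s *resolverState) onNoMatchingVirtualHost() { s.current = nil }

func main() {
	s := &resolverState{}
	s.onUpdate(&routeConfig{name: "route-1"})
	fmt.Println(s.onAmbientError("bad RDS update"), s.current.name)
	s.onNoMatchingVirtualHost()
	fmt.Println(s.current == nil)
}
```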

3 weeks ago transport/client: Return status code `Unknown` on malformed grpc-status (#8735)
eshitachandwani [Wed, 3 Dec 2025 07:33:17 +0000 (13:03 +0530)]
transport/client: Return status code `Unknown` on malformed grpc-status (#8735)

Fixes: https://github.com/grpc/grpc-go/issues/8713

Unknown is defined as follows in
[grpc/grpc@master/doc/statuscodes.md](https://github.com/grpc/grpc/blob/master/doc/statuscodes.md)

```
Unknown error. For example, this error may be returned when a Status value received
from another address space belongs to an error space that is not known in this
address space. Also errors raised by APIs that do not return enough error information
may be converted to this error.
```
The table in the above doc also mentions returning Unknown for parsing
errors.

We are currently returning Internal for status parsing errors as well,
which is contrary to what is mentioned in the above spec. This PR
changes it to return Unknown.

RELEASE NOTES:
* transport/client: Return status code `Unknown` on malformed
grpc-status.
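
As a rough illustration of the new mapping, here is a minimal sketch. The helper `statusFromHeader` is hypothetical and is not the transport's actual parsing code; only the numeric value of `Unknown` (2) comes from the gRPC status code documentation.

```go
package main

import (
	"fmt"
	"strconv"
)

// codeUnknown is the numeric value of the Unknown status code.
const codeUnknown = 2

// statusFromHeader is a hypothetical helper: it converts the raw grpc-status
// value into a numeric code and maps anything unparsable to Unknown.
func statusFromHeader(raw string) (code uint32, parsed bool) {
	n, err := strconv.ParseUint(raw, 10, 32)
	if err != nil {
		return codeUnknown, false
	}
	return uint32(n), true
}

func main() {
	for _, raw := range []string{"0", "14", "abc", ""} {
		code, parsed := statusFromHeader(raw)
		fmt.Printf("%q -> code=%d parsed=%v\n", raw, code, parsed)
	}
}
```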

3 weeks ago stats/otel: a79 scaffolding to register an async gauge metric and api to record it...
Madhav Bissa [Tue, 2 Dec 2025 11:54:45 +0000 (17:24 +0530)]
stats/otel: a79 scaffolding to register an async gauge metric and api to record it- part 1 (#8731)

Addresses
https://github.com/grpc/proposal/blob/master/A79-non-per-call-metrics-architecture.md
This PR creates scaffolding to register an async gauge metric. It adds
an AsyncMetricsRecorder interface that defines the API for recording an
int64 async gauge metric.

RELEASE NOTES:
* stats/otel: Add scaffolding to register an async gauge metric. Add an
AsyncMetricsRecorder interface that defines the API for recording an
int64 async gauge metric.

3 weeks ago xdsclient/xdsresource: add AutoHostRewrite and Endpoint Hostname support (#8728)
Pranjali-2501 [Mon, 1 Dec 2025 16:13:52 +0000 (21:43 +0530)]
xdsclient/xdsresource: add AutoHostRewrite and Endpoint Hostname support (#8728)

This PR implements the validation logic and extracts per-endpoint
hostname attributes from xDS resources for [gRFC
A81](https://github.com/grpc/proposal/blob/master/A81-xds-authority-rewriting.md).

### Key Changes:

1.  **RDS Resource Validation:**
    * The boolean value of `RouteAction.auto_host_rewrite` is extracted from
      the RDS resource and stored in the route struct.
    * This field is only set to `true` in the parsed route struct if the
      `trusted_xds_server` option is present in the `ServerConfig` and the
      global environment variable for authority overriding is enabled.

2.  **EDS Resource Validation:**
    * The `Endpoint.hostname` field is extracted from the EDS resource and
      stored as a `hostname` string in the parsed endpoint struct. It will
      be changed to a per-endpoint resolver attribute in a follow-up PR.
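
The gating rule for `auto_host_rewrite` described above can be sketched as follows. The `serverConfig` type and the `effectiveAutoHostRewrite` function are hypothetical and only model the three conditions; they are not the actual xdsresource code.

```go
package main

import "fmt"

// serverConfig is a hypothetical, trimmed-down view of a bootstrap server
// entry that only carries its server_features list.
type serverConfig struct {
	serverFeatures []string
}

// trusted reports whether trusted_xds_server is among the server features.
func (sc serverConfig) trusted() bool {
	for _, f := range sc.serverFeatures {
		if f == "trusted_xds_server" {
			return true
		}
	}
	return false
}

// effectiveAutoHostRewrite returns the value to store in the parsed route:
// true only if the RDS field is true, the server is trusted, and the
// environment variable for authority overriding is enabled.
func effectiveAutoHostRewrite(fromRDS bool, sc serverConfig, envEnabled bool) bool {
	return fromRDS && sc.trusted() && envEnabled
}

func main() {
	trusted := serverConfig{serverFeatures: []string{"xds_v3", "trusted_xds_server"}}
	untrusted := serverConfig{serverFeatures: []string{"xds_v3"}}
	fmt.Println(effectiveAutoHostRewrite(true, trusted, true))
	fmt.Println(effectiveAutoHostRewrite(true, untrusted, true))
	fmt.Println(effectiveAutoHostRewrite(true, trusted, false))
}
```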

RELEASE NOTES: None

4 weeks ago cmd/protoc-gen-go-grpc: bump -version to 1.6.0 for release (#8724)
Arjan Singh Bal [Thu, 27 Nov 2025 07:29:16 +0000 (12:59 +0530)]
cmd/protoc-gen-go-grpc: bump -version to 1.6.0 for release (#8724)

Addresses: #8642

RELEASE NOTES: N/A

4 weeks ago xds: make it possible to create a StringMatcher from arguments outside of test code...
Easwar Swaminathan [Wed, 26 Nov 2025 18:56:53 +0000 (10:56 -0800)]
xds: make it possible to create a StringMatcher from arguments outside of test code (#8723)

Changes in this PR:

- Add a new constructor for StringMatcher that can be shared between test and non-test code. This will be used as part of an internal feature to support ext_authz.
- Create new pointers to match strings instead of using the ones from the proto. This ensures that the xDS proto structs (which are usually huge) can be garbage collected earlier than they currently are.
- Fixes a bug involving the regex matcher, which should not be considering the ignore_case field, but was.

RELEASE NOTES:

* xds: Fix a bug in StringMatcher where regexes would match incorrectly when ignore_case is set to true.
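
A minimal sketch of the fixed behavior, assuming a simplified matcher with only exact and regex modes. The `stringMatcher` type and its constructors are hypothetical and much smaller than the real xDS matcher; the point is only that `ignoreCase` is consulted for exact matching and not for regex matching.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// stringMatcher is a hypothetical, reduced matcher with two modes.
type stringMatcher struct {
	exact      *string
	regex      *regexp.Regexp
	ignoreCase bool
}

// newExactMatcher copies the string so the caller's value is not retained.
func newExactMatcher(s string, ignoreCase bool) stringMatcher {
	return stringMatcher{exact: &s, ignoreCase: ignoreCase}
}

// newRegexMatcher panics on an invalid pattern, which is fine for a sketch.
func newRegexMatcher(pattern string, ignoreCase bool) stringMatcher {
	return stringMatcher{regex: regexp.MustCompile(pattern), ignoreCase: ignoreCase}
}

func (m stringMatcher) match(in string) bool {
	switch {
	case m.regex != nil:
		// ignoreCase is deliberately not consulted for regex matching.
		return m.regex.MatchString(in)
	case m.exact != nil:
		if m.ignoreCase {
			return strings.EqualFold(*m.exact, in)
		}
		return *m.exact == in
	}
	return false
}

func main() {
	fmt.Println(newExactMatcher("Foo", true).match("foo"))
	fmt.Println(newRegexMatcher("^Foo$", true).match("foo"))
}
```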

4 weeks ago xds/bootstrap: add `trusted_xds_server` server feature (#8692)
Pranjali-2501 [Tue, 25 Nov 2025 19:54:49 +0000 (01:24 +0530)]
xds/bootstrap: add `trusted_xds_server` server feature (#8692)

This PR implements the Bootstrap config changes for [gRFC
A81](https://github.com/grpc/proposal/blob/master/A81-xds-authority-rewriting.md).

Authority rewriting is a security-sensitive feature that should only be
enabled when the xDS server is explicitly trusted to provide such
configuration. gRFC A81 specifies that this trust is indicated by adding
`trusted_xds_server` to the server_features list for a given server in
the bootstrap file.
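
For illustration, the sketch below parses a bootstrap fragment and lists the servers marked as trusted. The Go types, the `trustedServers` helper, and the server URIs are hypothetical; only the `xds_servers`, `server_uri`, and `server_features` field names and the `trusted_xds_server` value are taken from the bootstrap format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// xdsServer and bootstrap are hypothetical, trimmed-down views of the file.
type xdsServer struct {
	ServerURI      string   `json:"server_uri"`
	ServerFeatures []string `json:"server_features"`
}

type bootstrap struct {
	XDSServers []xdsServer `json:"xds_servers"`
}

// exampleBootstrap uses made-up server URIs.
const exampleBootstrap = `{
  "xds_servers": [
    {"server_uri": "trusted.example.com:443",
     "server_features": ["xds_v3", "trusted_xds_server"]},
    {"server_uri": "other.example.com:443",
     "server_features": ["xds_v3"]}
  ]
}`

// trustedServers returns the URIs of servers that list trusted_xds_server.
func trustedServers(raw string) ([]string, error) {
	var b bootstrap
	if err := json.Unmarshal([]byte(raw), &b); err != nil {
		return nil, err
	}
	var out []string
	for _, s := range b.XDSServers {
		for _, f := range s.ServerFeatures {
			if f == "trusted_xds_server" {
				out = append(out, s.ServerURI)
				break
			}
		}
	}
	return out, nil
}

func main() {
	servers, err := trustedServers(exampleBootstrap)
	fmt.Println(servers, err)
}
```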

RELEASE NOTES: None

4 weeks ago xdsclient: call resourceError methods from serializer (#8725)
eshitachandwani [Tue, 25 Nov 2025 11:36:47 +0000 (17:06 +0530)]
xdsclient: call resourceError methods from serializer (#8725)

Fixes an issue where ResourceError methods were incorrectly called
outside of the serializer's scope.

Updates the documentation for the ResourceWatcher interface to
explicitly state the guarantee that all its defined methods will always
be called from the serializer.

RELEASE NOTES: None

4 weeks ago build(deps): bump golang.org/x/crypto from 0.43.0 to 0.45.0 (#8722)
dependabot[bot] [Tue, 25 Nov 2025 06:45:16 +0000 (12:15 +0530)]
build(deps): bump golang.org/x/crypto from 0.43.0 to 0.45.0 (#8722)

Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from
0.43.0 to 0.45.0.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/golang/crypto/commit/4e0068c0098be10d7025c99ab7c50ce454c1f0f9"><code>4e0068c</code></a>
go.mod: update golang.org/x dependencies</li>
<li><a
href="https://github.com/golang/crypto/commit/e79546e28b85ea53dd37afe1c4102746ef553b9c"><code>e79546e</code></a>
ssh: curb GSSAPI DoS risk by limiting number of specified OIDs</li>
<li><a
href="https://github.com/golang/crypto/commit/f91f7a7c31bf90b39c1de895ad116a2bacc88748"><code>f91f7a7</code></a>
ssh/agent: prevent panic on malformed constraint</li>
<li><a
href="https://github.com/golang/crypto/commit/2df4153a0311bdfea44376e0eb6ef2faefb0275b"><code>2df4153</code></a>
acme/autocert: let automatic renewal work with short lifetime certs</li>
<li><a
href="https://github.com/golang/crypto/commit/bcf6a849efcf4702fa5172cb0998b46c3da1e989"><code>bcf6a84</code></a>
acme: pass context to request</li>
<li><a
href="https://github.com/golang/crypto/commit/b4f2b62076abeee4e43fb59544dac565715fbf1e"><code>b4f2b62</code></a>
ssh: fix error message on unsupported cipher</li>
<li><a
href="https://github.com/golang/crypto/commit/79ec3a51fcc7fbd2691d56155d578225ccc542e2"><code>79ec3a5</code></a>
ssh: allow to bind to a hostname in remote forwarding</li>
<li><a
href="https://github.com/golang/crypto/commit/122a78f140d9d3303ed3261bc374bbbca149140f"><code>122a78f</code></a>
go.mod: update golang.org/x dependencies</li>
<li><a
href="https://github.com/golang/crypto/commit/c0531f9c34514ad5c5551e2d6ce569ca673a8afd"><code>c0531f9</code></a>
all: eliminate vet diagnostics</li>
<li><a
href="https://github.com/golang/crypto/commit/0997000b45e3a40598272081bcad03ffd21b8adb"><code>0997000</code></a>
all: fix some comments</li>
<li>Additional commits viewable in <a
href="https://github.com/golang/crypto/compare/v0.43.0...v0.45.0">compare
view</a></li>
</ul>
</details>
<br />

RELEASE NOTES: None

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: eshitachandwani <emchandwani@google.com>
5 weeks ago dns: drop test depending on invalid URL "dns://::1/foo.bar.com" (#8716)
Damien Neil [Fri, 21 Nov 2025 22:35:25 +0000 (14:35 -0800)]
dns: drop test depending on invalid URL "dns://::1/foo.bar.com" (#8716)

Go 1.26's url.Parse will reject invalid URLs containing unbracketed
colons in the hostname.

For example, Go 1.25 and earlier are willing to parse the URLs
"https://localhost:80:443" (hostname:"localhost:80", port:443)
and "https://::1" (hostname:":", port:1).

The test TestCustomAuthority contains a case which depends on
url.Parse("dns://::1/foo.bar.com") succeeding. In Go 1.26, this
case will fail.

Drop the test as not exercising a useful path: This URL is invalid
and earlier Go versions being willing to parse it was a bug.
The correct URL is "dns://[::1]/foo.bar.com" (which is also
exercised by TestCustomAuthority).
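
The difference between the two forms can be checked with a short sketch. `authorityHost` is a hypothetical helper, not part of the DNS resolver; the result for the unbracketed form depends on the Go version, so the sketch only prints whether parsing failed.

```go
package main

import (
	"fmt"
	"net/url"
)

// authorityHost is a hypothetical helper that returns the host part of a
// target URL with any IPv6 brackets removed.
func authorityHost(target string) (string, error) {
	u, err := url.Parse(target)
	if err != nil {
		return "", err
	}
	return u.Hostname(), nil
}

func main() {
	// The bracketed form is valid and yields the IPv6 loopback address.
	host, err := authorityHost("dns://[::1]/foo.bar.com")
	fmt.Println(host, err)

	// The unbracketed form is invalid; whether url.Parse rejects it depends
	// on the Go version in use, as described above.
	_, err = authorityHost("dns://::1/foo.bar.com")
	fmt.Println("unbracketed form rejected:", err != nil)
}
```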

RELEASE NOTES:
* client: Reject target URLs containing unbracketed colons in the
hostname in Go version 1.26+.

5 weeks ago build(deps): bump golang.org/x/crypto from 0.43.0 to 0.45.0 in /security/advancedtls...
dependabot[bot] [Fri, 21 Nov 2025 07:01:40 +0000 (12:31 +0530)]
build(deps): bump golang.org/x/crypto from 0.43.0 to 0.45.0 in /security/advancedtls/examples (#8717)

Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from
0.43.0 to 0.45.0.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/golang/crypto/commit/4e0068c0098be10d7025c99ab7c50ce454c1f0f9"><code>4e0068c</code></a>
go.mod: update golang.org/x dependencies</li>
<li><a
href="https://github.com/golang/crypto/commit/e79546e28b85ea53dd37afe1c4102746ef553b9c"><code>e79546e</code></a>
ssh: curb GSSAPI DoS risk by limiting number of specified OIDs</li>
<li><a
href="https://github.com/golang/crypto/commit/f91f7a7c31bf90b39c1de895ad116a2bacc88748"><code>f91f7a7</code></a>
ssh/agent: prevent panic on malformed constraint</li>
<li><a
href="https://github.com/golang/crypto/commit/2df4153a0311bdfea44376e0eb6ef2faefb0275b"><code>2df4153</code></a>
acme/autocert: let automatic renewal work with short lifetime certs</li>
<li><a
href="https://github.com/golang/crypto/commit/bcf6a849efcf4702fa5172cb0998b46c3da1e989"><code>bcf6a84</code></a>
acme: pass context to request</li>
<li><a
href="https://github.com/golang/crypto/commit/b4f2b62076abeee4e43fb59544dac565715fbf1e"><code>b4f2b62</code></a>
ssh: fix error message on unsupported cipher</li>
<li><a
href="https://github.com/golang/crypto/commit/79ec3a51fcc7fbd2691d56155d578225ccc542e2"><code>79ec3a5</code></a>
ssh: allow to bind to a hostname in remote forwarding</li>
<li><a
href="https://github.com/golang/crypto/commit/122a78f140d9d3303ed3261bc374bbbca149140f"><code>122a78f</code></a>
go.mod: update golang.org/x dependencies</li>
<li><a
href="https://github.com/golang/crypto/commit/c0531f9c34514ad5c5551e2d6ce569ca673a8afd"><code>c0531f9</code></a>
all: eliminate vet diagnostics</li>
<li><a
href="https://github.com/golang/crypto/commit/0997000b45e3a40598272081bcad03ffd21b8adb"><code>0997000</code></a>
all: fix some comments</li>
<li>Additional commits viewable in <a
href="https://github.com/golang/crypto/compare/v0.43.0...v0.45.0">compare
view</a></li>
</ul>
</details>
<br />

RELEASE NOTES: N/A

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: eshitachandwani <emchandwani@google.com>
5 weeks ago mem: Allocate at 4KiB boundaries in the fallback buffer pool. (#8705)
Chris Carlon [Mon, 17 Nov 2025 22:37:38 +0000 (17:37 -0500)]
mem: Allocate at 4KiB boundaries in the fallback buffer pool. (#8705)

By rounding up to the nearest page, we avoid repeatedly allocating
similar sizes if requests happen to arrive in roughly increasing order.

The GCS client sends messages with 2MiB of data repeatedly when writing
a large object. Therefore it has to repeatedly allocate just over 2MiB.
This ultimately results in many, many allocations in the fallback buffer
pool. In practice rounding up yields at least a 10x reduction in RAM
when running 100 concurrent large writes. This is probably not unique to
GCS: anyone who sends large messages may be affected.

This change in simpleBufferPool seems worthwhile vs. adding a tier. We
use simpleBufferPool for any size greater than 1MiB, so this effectively
lets us discover a reasonably tight tier around any large message size
that comes in frequently. It increases infrequent allocation sizes by no
more than 0.4%.
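
The rounding itself is a one-liner. The sketch below is an illustration, not the actual simpleBufferPool code, and `roundUpToPage` is a hypothetical name. It shows how slightly different sizes just above 2MiB collapse to the same allocation size. The extra space is at most 4095 bytes, which on a 1MiB request is about 0.39%, consistent with the bound above.

```go
package main

import "fmt"

const pageSize = 4 << 10 // 4 KiB

// roundUpToPage rounds n up to the next multiple of pageSize. It relies on
// pageSize being a power of two.
func roundUpToPage(n int) int {
	return (n + pageSize - 1) &^ (pageSize - 1)
}

func main() {
	const twoMiB = 2 << 20
	// All three requests land in the same 2MiB + 4KiB bucket.
	for _, n := range []int{twoMiB + 1, twoMiB + 100, twoMiB + 4096} {
		fmt.Println(n, "->", roundUpToPage(n))
	}
}
```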

RELEASE NOTES:
* mem: round up to nearest 4KiB for pool allocations larger than 1MiB

5 weeks ago internal/xds: move the LDS and RDS watchers to dependency manager (#8651)
eshitachandwani [Sun, 16 Nov 2025 09:24:58 +0000 (14:54 +0530)]
internal/xds: move the LDS and RDS watchers to dependency manager  (#8651)

This PR moves the LDS and RDS watchers to the dependency manager without
changing the current functionality or behaviour. This is part of the
implementation of gRFC
[A74](https://github.com/grpc/proposal/blob/master/A74-xds-config-tears.md).

RELEASE NOTES: None

---------

Co-authored-by: Easwar Swaminathan <easwars@google.com>
6 weeks ago transport/client: Return `Unknown` on missing or unparsable grpc-status (#8702)
Madhav Bissa [Fri, 14 Nov 2025 08:52:32 +0000 (14:22 +0530)]
transport/client: Return `Unknown` on missing or unparsable grpc-status (#8702)

See https://github.com/grpc/grpc/blob/master/doc/statuscodes.md for more
details.

RELEASE NOTES:
* transport/client: Return Unknown on missing or unparsable
grpc-status.

6 weeks ago test: use the connectivity state watcher API (#8708)
Easwar Swaminathan [Fri, 14 Nov 2025 07:54:36 +0000 (23:54 -0800)]
test: use the connectivity state watcher API (#8708)

The current tests are using an LB policy to record the state transitions
of subchannels. But these tests are meant to test the connectivity state
transitions of the channel.

Also, in a follow-up PR I will be making the change to transition the
channel to CONNECTING as soon as it exits IDLE
(https://github.com/grpc/grpc-go/issues/7686). With that change, the
current tests won't work anymore, so we would have to change them
anyway.

So, I took the opportunity to clean them up and use the connectivity
state watcher API to record the state transitions of the grpc channel
and compare them against the expected states.

RELEASE NOTES: none

6 weeks ago client: move tests that rely on `Dial` to a separate file (#8707)
Easwar Swaminathan [Fri, 14 Nov 2025 07:38:16 +0000 (23:38 -0800)]
client: move tests that rely on `Dial` to a separate file (#8707)

This PR only moves some tests around and does not make any changes to
the tests.

Moving tests that rely on the behavior of `Dial` to a separate file
makes them easier to find and work with.

I will be cleaning up these tests a little in a follow-up PR, and will
be adding a new test for https://github.com/grpc/grpc-go/issues/7686.

RELEASE NOTES: none

6 weeks ago priority: add a test helper to override the init timeout in tests (#8704)
Easwar Swaminathan [Thu, 13 Nov 2025 21:56:13 +0000 (13:56 -0800)]
priority: add a test helper to override the init timeout in tests (#8704)

This is an initial cleanup before the actual work required for
https://github.com/grpc/grpc-go/issues/8516

The way the init timeout is currently overridden in existing tests is
unnecessarily complicated and hard to read. This simplifies things and
ensures that the previous pattern is not followed when new tests are
written.

RELEASE NOTES: none

6 weeks ago xdsclient: stop batching writes on the ADS stream (#8627)
Easwar Swaminathan [Tue, 11 Nov 2025 17:40:12 +0000 (09:40 -0800)]
xdsclient: stop batching writes on the ADS stream (#8627)

Fixes https://github.com/grpc/grpc-go/issues/8125

#### The original race in the xDS client:
- Resource watch is cancelled by the user of the xdsClient (e.g.
xdsResolver)
- xdsClient removes the resource from its cache and queues an
unsubscribe request to the ADS stream.
- A watch for the same resource is registered immediately, and the
xdsClient instructs the ADS stream to subscribe (as it's not in cache).
- The ADS stream sends a redundant request (same resources, version,
nonce) which the management server ignores.
- The new resource watch sees a "resource-not-found" error once the
watch timer fires.

#### The original fix:
Delay the resource's removal from the cache until the unsubscribe
request was transmitted over the wire, a change implemented in
https://github.com/grpc/grpc-go/pull/8369. However, this solution
introduced new complications:
- The resource's removal from the xdsClient's cache became an
asynchronous operation, occurring while the unsubscribe request was
being sent.
- This asynchronous behavior meant the state maintained within the ADS
stream could still diverge from the cache's state.
- A critical section was absent between the ADS stream's message
transmission logic and the xdsClient's cache access, which is performed
during subscription/unsubscription by its users.

#### The root cause of the previously seen races comes down to two things:
- Batching of writes for subscribe and unsubscribe calls
  - After batching, it may appear that nothing has changed in the list of
    subscribed resources, even though a resource was removed and added
    again, and therefore the management server would not send any response.
    It is important that the management server see the exact sequence of
    subscribe and unsubscribe calls.
- State maintained in the ADS stream going out of sync with the state
  maintained in the resource cache

#### How does this PR address the above issue?
This PR simplifies the implementation of the ADS stream by removing two
pieces of functionality:
- Stop batching of writes on the ADS stream
  - If the user registers multiple watches, e.g. resource `A`, `B`, and
    `C`, the stream would now send three requests: `[A]`, `[A B]`, `[A B
    C]`.
  - Queue the exact request to be sent out based on the current state
    - As part of handling a subscribe/unsubscribe request, the ADS stream
      implementation will queue the exact request to be sent out. When
      asynchronously sending the request out, it will not use the current
      state, but instead just write the queued request on the wire.
- Don't buffer writes when waiting for flow control
  - Flow control is already blocking reads from the stream. Blocking
    writes as well during this period might provide some additional flow
    control, but not much, and removing this logic simplifies the stream
    implementation quite a bit.
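
A minimal model of the unbatched behavior, assuming a single resource type and ignoring versions and nonces. The `adsStream` type and its methods are hypothetical and far simpler than the real stream; the sketch only shows that each subscribe or unsubscribe call queues the exact resource list at that moment, so the management server sees the full sequence.

```go
package main

import (
	"fmt"
	"sort"
)

// adsStream is a hypothetical, heavily simplified model of one resource type
// on the ADS stream.
type adsStream struct {
	subscribed map[string]bool
	queue      [][]string // each entry is one request, captured at queue time
}

func newADSStream() *adsStream {
	return &adsStream{subscribed: map[string]bool{}}
}

// snapshot returns a sorted copy of the currently subscribed resource names.
func (s *adsStream) snapshot() []string {
	names := make([]string, 0, len(s.subscribed))
	for n := range s.subscribed {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// subscribe queues one request per call instead of batching.
func (s *adsStream) subscribe(name string) {
	s.subscribed[name] = true
	s.queue = append(s.queue, s.snapshot())
}

// unsubscribe also queues its own request, so a remove followed by an add of
// the same resource remains visible on the wire.
func (s *adsStream) unsubscribe(name string) {
	delete(s.subscribed, name)
	s.queue = append(s.queue, s.snapshot())
}

func main() {
	s := newADSStream()
	for _, n := range []string{"A", "B", "C"} {
		s.subscribe(n)
	}
	fmt.Println(s.queue) // [[A] [A B] [A B C]]

	s.unsubscribe("A")
	s.subscribe("A")
	fmt.Println(s.queue[3:]) // [[B C] [A B C]]
}
```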

RELEASE NOTES:
- xdsclient: fix a race in the xdsClient that could lead to
resource-not-found errors

6 weeks ago xds/resolver: Optimize Interceptor Chain Construction (#8641)
Easwar Swaminathan [Tue, 11 Nov 2025 17:34:57 +0000 (09:34 -0800)]
xds/resolver: Optimize Interceptor Chain Construction (#8641)

#### Existing behavior:
- At routing time, when an RPC matches a route and a cluster is
selected, the interceptor chain for that specific RPC is built.
- This chain is built on a per-RPC basis.
- A subsequent RPC that matches the exact same route and cluster will
trigger the entire chain reconstruction again, even if no configuration
has changed.

#### New behavior:
- The interceptor chain is now pre-built for every route and every
pickable cluster associated with that route.
- The chains are constructed once when the config selector is built.

#### Other changes:
- Existing unit tests have been converted to be more e2e style tests.
- This lays the necessary groundwork for upcoming changes to the filter
API, specifically to support filter state retention
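
The change from per-RPC construction to construction at config selector build time can be sketched as below. Every name here (`interceptor`, `routeCluster`, `configSelector`) is hypothetical, and the interceptor is reduced to a string transformation; the real filter API is richer.

```go
package main

import "fmt"

// interceptor is a hypothetical stand-in for a filter's interceptor.
type interceptor func(rpc string) string

// routeCluster identifies one (route, cluster) pair.
type routeCluster struct{ route, cluster string }

// configSelector holds chains that were built once, up front.
type configSelector struct {
	chains map[routeCluster]interceptor
	builds int // number of chains constructed, for illustration only
}

// chain composes filters into a single interceptor.
func chain(filters []interceptor) interceptor {
	return func(rpc string) string {
		for _, f := range filters {
			rpc = f(rpc)
		}
		return rpc
	}
}

// newConfigSelector pre-builds a chain for every route and every cluster
// associated with that route.
func newConfigSelector(routes map[string][]string, filters []interceptor) *configSelector {
	cs := &configSelector{chains: map[routeCluster]interceptor{}}
	for route, clusters := range routes {
		for _, c := range clusters {
			cs.chains[routeCluster{route, c}] = chain(filters)
			cs.builds++
		}
	}
	return cs
}

// selectConfig only looks up the pre-built chain at RPC time.
func (cs *configSelector) selectConfig(route, cluster, rpc string) (string, bool) {
	c, ok := cs.chains[routeCluster{route, cluster}]
	if !ok {
		return "", false
	}
	return c(rpc), true
}

func main() {
	filters := []interceptor{
		func(s string) string { return s + "+f1" },
		func(s string) string { return s + "+f2" },
	}
	cs := newConfigSelector(map[string][]string{"r1": {"c1", "c2"}}, filters)
	for i := 0; i < 3; i++ {
		out, _ := cs.selectConfig("r1", "c1", "rpc")
		fmt.Println(out, "chains built:", cs.builds)
	}
}
```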

RELEASE NOTES: NONE

6 weeks ago credentials/xds: fix goroutine leak in testServer (#8699)
Pranjali-2501 [Mon, 10 Nov 2025 10:31:28 +0000 (16:01 +0530)]
credentials/xds: fix goroutine leak in testServer (#8699)

Fixes #8694

This PR fixes a goroutine leak in `credentials/xds/xds_client_test.go`.

Previously, the `testServer` used standard `Send()` calls. If a test
timed out or failed before reading the expected value, the `testServer`
goroutine would block indefinitely on the channel, causing a leak.

Replaced blocking `Send` calls with `SendContext` in `handleConn`. This
ensures that if the test ends (canceling the context), the `testServer`
stops trying to send and exits its goroutine gracefully.

RELEASE NOTES: None

6 weeks ago test: Fix goroutine leak in TestParsedTarget_WithCustomDialer (#8698)
Pranjali-2501 [Mon, 10 Nov 2025 10:30:18 +0000 (16:00 +0530)]
test: Fix goroutine leak in TestParsedTarget_WithCustomDialer (#8698)

Fixes #8695

Fixes a goroutine leak in clientconn_parsed_target_test.go where
TestParsedTarget_WithCustomDialer() could leave dialer goroutines
blocked on sending to addrCh if the test finished early or stopped
reading.

This change replaces the blocking channel send with a select statement
using a timeout/context to ensure goroutines can always exit.
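
The pattern is roughly the following. This is a generic sketch, not the test's actual code; `sendOrDone` and `sendWithTimeout` are hypothetical helper names.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// sendOrDone tries to send v on ch but gives up once ctx is done, so the
// calling goroutine can always exit.
func sendOrDone(ctx context.Context, ch chan<- string, v string) error {
	select {
	case ch <- v:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// sendWithTimeout wraps sendOrDone with a deadline and reports whether the
// value was delivered.
func sendWithTimeout(ch chan<- string, v string, timeout time.Duration) bool {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return sendOrDone(ctx, ch, v) == nil
}

func main() {
	addrCh := make(chan string) // unbuffered, and nobody is reading

	done := make(chan bool, 1)
	go func() { done <- sendWithTimeout(addrCh, "127.0.0.1:0", 50*time.Millisecond) }()

	// The goroutine exits after the timeout instead of blocking forever.
	fmt.Println("delivered:", <-done)
}
```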

RELEASE NOTES: None

7 weeks ago transport: Set buffer pool in tests (#8688)
Arjan Singh Bal [Tue, 4 Nov 2025 06:11:41 +0000 (11:41 +0530)]
transport: Set buffer pool in tests (#8688)

This PR correctly sets the buffer pool for test clients and servers not
created through the public gRPC API. This allows non-test code to assume
the buffer pool is always present.

RELEASE NOTES: N/A

7 weeks ago xds bootstrap: enable using JWT Call Credentials (part 2 for A97) (#8536)
Dimitar Pavlov [Mon, 3 Nov 2025 11:15:23 +0000 (11:15 +0000)]
xds bootstrap: enable using JWT Call Credentials (part 2 for A97) (#8536)

Part two for https://github.com/grpc/proposal/pull/492 (A97), following
#8431.

What this PR does is:

- update `internal/xds/bootstrap` with support for loading multiple
PerRPCCallCredentials specified in a new `call_creds` field in the
bootstrap file as per A97
- adjust `xds/internal/xdsclient/clientimpl.go` to use the call
credentials when constructing the client
- update `xds/bootstrap` to register the `jwtcreds` call credentials and
make them available if `GRPC_EXPERIMENTAL_XDS_BOOTSTRAP_CALL_CREDS` is
enabled

Relates to https://github.com/istio/istio/issues/53532

RELEASE NOTES:
- xds: add support for loading a JWT from file and use it as Call
Credentials (A97). To enable this feature, set the environment variable
`GRPC_EXPERIMENTAL_XDS_BOOTSTRAP_CALL_CREDS` to `true` (case
insensitive).
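
A case-insensitive check of the environment variable could look like the sketch below. `isTrue` and `callCredsEnabled` are hypothetical helpers and are not the library's own env-config code; only the variable name is taken from the release note.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

const callCredsEnv = "GRPC_EXPERIMENTAL_XDS_BOOTSTRAP_CALL_CREDS"

// isTrue reports whether v equals "true", ignoring case.
func isTrue(v string) bool { return strings.EqualFold(v, "true") }

// callCredsEnabled reads the environment variable; unset means disabled.
func callCredsEnabled() bool { return isTrue(os.Getenv(callCredsEnv)) }

func main() {
	os.Setenv(callCredsEnv, "TRUE")
	fmt.Println(callCredsEnabled())
	os.Unsetenv(callCredsEnv)
	fmt.Println(callCredsEnabled())
}
```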

7 weeks ago transport: Remove buffer copies while writing HTTP/2 Data frames (#8667)
Arjan Singh Bal [Mon, 3 Nov 2025 09:03:17 +0000 (14:33 +0530)]
transport: Remove buffer copies while writing HTTP/2 Data frames (#8667)

This PR removes 2 buffer copies while writing data frames to the
underlying net.Conn: one [within
gRPC](https://github.com/grpc/grpc-go/blob/58d4b2b1492dbcfdf26daa7ed93830ebb871faf1/internal/transport/controlbuf.go#L1009-L1022)
and the other [in the
framer](https://cs.opensource.google/go/x/net/+/master:http2/frame.go;l=743;drc=6e243da531559f8c99439dabc7647dec07191f9b).
Care is taken to avoid any extra heap allocations which can affect
performance for smaller payloads.

A [CL](https://go-review.git.corp.google.com/c/net/+/711620) is out for
review which allows using the framer to write frame headers. This PR
duplicates the header writing code as a temporary workaround. This PR
will be merged only after the CL is merged.

## Results

### Small payloads
Performance for small payloads increases slightly due to the removal of
a `defer` statement.
```
$ go run benchmark/benchmain/main.go -benchtime=60s -workloads=unary \
   -compression=off -maxConcurrentCalls=120 -trace=off \
   -reqSizeBytes=100 -respSizeBytes=100 -networkMode=Local -resultFile="${RUN_NAME}"

$ go run benchmark/benchresult/main.go unary-before unary-after
               Title       Before        After Percentage
            TotalOps      7600878      7653522     0.69%
             SendOps            0            0      NaN%
             RecvOps            0            0      NaN%
            Bytes/op     10007.07     10000.89    -0.07%
           Allocs/op       146.93       146.91     0.00%
             ReqT/op 101345040.00 102046960.00     0.69%
            RespT/op 101345040.00 102046960.00     0.69%
            50th-Lat    833.724µs    830.041µs    -0.44%
            90th-Lat   1.281969ms   1.275336ms    -0.52%
            99th-Lat   2.403961ms   2.360606ms    -1.80%
             Avg-Lat    946.123µs    939.734µs    -0.68%
           GoVersion     go1.24.8     go1.24.8
         GrpcVersion   1.77.0-dev   1.77.0-dev
```

### Large payloads
Local benchmarks show a ~5-10% regression with 1 MB payloads on my dev
machine. The profiles show increased time spent in the copy operation
[inside the buffered
writer](https://github.com/grpc/grpc-go/blob/58d4b2b1492dbcfdf26daa7ed93830ebb871faf1/internal/transport/http_util.go#L334).
Counterintuitively, copying the grpc header and message data into a
larger buffer increased the performance by 4% (compared to master).

To validate this behaviour (extra copy increasing performance) I ran
[the k8s benchmark for 1MB
payloads](https://github.com/grpc/grpc/blob/65c9be86830b0e423dd970c066c69a06a9240298/tools/run_tests/performance/scenario_config.py#L291-L305)
and 100 concurrent streams which showed ~5% increase in QPS without the
copies across multiple runs. Adding a copy reduced the performance.

Load test config file:
[loadtest.yaml](https://github.com/user-attachments/files/23055312/loadtest.yaml)

```
# 30 core client and server
Before
QPS: 498.284 (16.6095/server core)
Latencies (50/90/95/99/99.9%-ile): 233256/275972/281250/291803/298533 us
Server system time: 93.0164
Server user time:   142.533
Client system time: 97.2688
Client user time:   144.542

After
QPS: 526.776 (17.5592/server core)
Latencies (50/90/95/99/99.9%-ile): 211010/263189/270969/280656/288828 us
Server system time: 96.5959
Server user time:   147.668
Client system time: 101.973
Client user time:   150.234

# 8 core client and server
Before
QPS: 291.049 (36.3811/server core)
Latencies (50/90/95/99/99.9%-ile): 294552/685822/903554/1.48399e+06/1.50757e+06 us
Server system time: 49.0355
Server user time:   87.1783
Client system time: 60.1945
Client user time:   103.633

After
QPS: 334.119 (41.7649/server core)
Latencies (50/90/95/99/99.9%-ile): 279395/518849/706327/1.09273e+06/1.11629e+06 us
Server system time: 69.3136
Server user time:   102.549
Client system time: 80.9804
Client user time:   107.103
```

RELEASE NOTES:
* transport: Avoid two buffer copies when writing Data frames.

8 weeks ago transport: Avoid buffer copies when reading Data frames (#8657)
Arjan Singh Bal [Fri, 31 Oct 2025 06:27:49 +0000 (11:57 +0530)]
transport: Avoid buffer copies when reading Data frames (#8657)

This change incorporates changes from
https://github.com/golang/go/issues/73560 to split reading HTTP/2 frame
headers and payloads. If the frame is not a Data frame, it's read
through the standard library framer as before. For Data frames, the
payload is read directly into a buffer from the buffer pool to avoid
copying it from the framer's buffer.
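
The split between header and payload reads can be sketched as follows. This is not the transport's code: the function names are hypothetical, the `alloc` callback stands in for the buffer pool, and padding (the PADDED flag) is ignored. The 9-byte header layout (24-bit length, 8-bit type, 8-bit flags, 31-bit stream ID) is the standard HTTP/2 frame format.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

const (
	frameHeaderLen = 9
	frameTypeData  = 0x0
)

type frameHeader struct {
	length   uint32
	typ      uint8
	flags    uint8
	streamID uint32
}

// readFrameHeader reads only the fixed-size header from r.
func readFrameHeader(r io.Reader) (frameHeader, error) {
	var buf [frameHeaderLen]byte
	if _, err := io.ReadFull(r, buf[:]); err != nil {
		return frameHeader{}, err
	}
	return frameHeader{
		length:   uint32(buf[0])<<16 | uint32(buf[1])<<8 | uint32(buf[2]),
		typ:      buf[3],
		flags:    buf[4],
		streamID: binary.BigEndian.Uint32(buf[5:]) & 0x7fffffff,
	}, nil
}

// readDataPayload reads the payload straight into a buffer obtained from
// alloc, avoiding an intermediate copy.
func readDataPayload(r io.Reader, fh frameHeader, alloc func(int) []byte) ([]byte, error) {
	buf := alloc(int(fh.length))
	_, err := io.ReadFull(r, buf)
	return buf, err
}

// decodeDataFrame ties the two steps together for a raw Data frame.
func decodeDataFrame(raw []byte) (uint32, string, error) {
	r := bytes.NewReader(raw)
	fh, err := readFrameHeader(r)
	if err != nil {
		return 0, "", err
	}
	if fh.typ != frameTypeData {
		return 0, "", fmt.Errorf("not a Data frame: type %d", fh.typ)
	}
	payload, err := readDataPayload(r, fh, func(n int) []byte { return make([]byte, n) })
	if err != nil {
		return 0, "", err
	}
	return fh.streamID, string(payload), nil
}

func main() {
	raw := []byte{0, 0, 5, 0, 1, 0, 0, 0, 3, 'h', 'e', 'l', 'l', 'o'}
	fmt.Println(decodeDataFrame(raw))
}
```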

## Testing
For 1 MB payloads, this results in ~4% improvement in throughput.

```sh
# test command
go run benchmark/benchmain/main.go -benchtime=60s -workloads=streaming \
   -compression=off -maxConcurrentCalls=120 -trace=off \
   -reqSizeBytes=1000000 -respSizeBytes=1000000 -networkMode=Local -resultFile="${RUN_NAME}"

# comparison
go run benchmark/benchresult/main.go streaming-before streaming-after
               Title       Before        After Percentage
            TotalOps        87536        91120     4.09%
             SendOps            0            0      NaN%
             RecvOps            0            0      NaN%
            Bytes/op   4074102.92   4070489.30    -0.09%
           Allocs/op        83.60        76.55    -8.37%
             ReqT/op 11671466666.67 12149333333.33     4.09%
            RespT/op 11671466666.67 12149333333.33     4.09%
            50th-Lat  78.209875ms  75.159943ms    -3.90%
            90th-Lat 117.764228ms   107.8697ms    -8.40%
            99th-Lat 146.935704ms 139.069685ms    -5.35%
             Avg-Lat  82.310691ms  79.073282ms    -3.93%
           GoVersion     go1.24.7     go1.24.7
         GrpcVersion   1.77.0-dev   1.77.0-dev
```

For smaller payloads, the difference is minor.
```sh
go run benchmark/benchmain/main.go -benchtime=60s -workloads=streaming \
   -compression=off -maxConcurrentCalls=120 -trace=off \
   -reqSizeBytes=100 -respSizeBytes=100 -networkMode=Local -resultFile="${RUN_NAME}"

go run benchmark/benchresult/main.go streaming-before streaming-after
               Title       Before        After Percentage
            TotalOps     21490752     21477822    -0.06%
             SendOps            0            0      NaN%
             RecvOps            0            0      NaN%
            Bytes/op      1902.92      1902.94     0.00%
           Allocs/op        29.21        29.21     0.00%
             ReqT/op 286543360.00 286370960.00    -0.06%
            RespT/op 286543360.00 286370960.00    -0.06%
            50th-Lat    352.505µs    352.247µs    -0.07%
            90th-Lat    433.446µs    434.907µs     0.34%
            99th-Lat    536.445µs    539.759µs     0.62%
             Avg-Lat    333.403µs    333.457µs     0.02%
           GoVersion     go1.24.7     go1.24.7
         GrpcVersion   1.77.0-dev   1.77.0-dev
```

RELEASE NOTES:
* transport: Avoid a buffer copy when reading data.

8 weeks agoprotoc-gen-go-grpc: Update supported edition to 2024 (#8685)
Mike Kruskal [Thu, 30 Oct 2025 18:41:22 +0000 (11:41 -0700)]
protoc-gen-go-grpc: Update supported edition to 2024 (#8685)

Fixes: #8642
grpc-go isn't doing anything that should be affected by edition 2024, so
it should already support it. This simply advertises it so protoc will
send it edition 2024 protos without requiring an
`--experimental_editions` flag.

relnotes for cmd/protoc-gen-go-grpc:
* Add support for protobuf edition 2024.

RELEASE NOTES: N/A

8 weeks agoadvancedtls: Apply defaults before version check (#8684)
James O'Gorman [Thu, 30 Oct 2025 16:48:21 +0000 (16:48 +0000)]
advancedtls: Apply defaults before version check (#8684)

Prior to this change, creating an Options like

    tlsOpts := &advancedtls.Options{
        IdentityOptions: advancedtls.IdentityCertificateOptions{IdentityProvider: certProvider},
        // Note: Only MinTLSVersion is set, not MaxTLSVersion
        MinTLSVersion: tls.VersionTLS13,
    }

Would result in error:

    the minimum TLS version is larger than the maximum TLS version

The documentation for the Options struct states that the default for
MaxTLSVersion is TLS 1.3 but the default was being applied after
checking whether MinTLSVersion > MaxTLSVersion.

The defaults are now applied first, prior to any checks.
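
The ordering fix can be illustrated with a small sketch. `versionOptions` and `validate` are hypothetical stand-ins, not the `advancedtls` types. The TLS 1.3 default for the maximum version comes from the description above; the TLS 1.2 default for the minimum version is an assumption made for the example.

```go
package main

import (
	"crypto/tls"
	"errors"
	"fmt"
)

// versionOptions is a hypothetical stand-in for the TLS version fields.
type versionOptions struct {
	MinTLSVersion uint16
	MaxTLSVersion uint16
}

// validate applies defaults first and only then compares the two versions.
// Checking before defaulting would compare MinTLSVersion against a zero
// MaxTLSVersion and wrongly reject the configuration.
func (o *versionOptions) validate() error {
	if o.MinTLSVersion == 0 {
		o.MinTLSVersion = tls.VersionTLS12 // assumed default for this sketch
	}
	if o.MaxTLSVersion == 0 {
		o.MaxTLSVersion = tls.VersionTLS13
	}
	if o.MinTLSVersion > o.MaxTLSVersion {
		return errors.New("the minimum TLS version is larger than the maximum TLS version")
	}
	return nil
}

func main() {
	o := &versionOptions{MinTLSVersion: tls.VersionTLS13} // MaxTLSVersion left unset
	fmt.Println(o.validate())
}
```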

Fixes #8649


RELEASE NOTES: none

8 weeks agoDocumentation: Format file with internal formatter and fix typo in test name (#8681)
Arjan Singh Bal [Thu, 30 Oct 2025 10:51:10 +0000 (16:21 +0530)]
Documentation: Format file with internal formatter and fix typo in test name (#8681)

This change formats a file using `mdformat` to fix findings of an
internal linter.

The following fixes are made:
* Wrap lines at 80 columns.
* Ensure there is only one `h1` block.

This change also fixes a typo in test names.

RELEASE NOTES: N/A

8 weeks agodeps: Bump dependencies for all modules (#8680)
Arjan Singh Bal [Thu, 30 Oct 2025 08:31:04 +0000 (14:01 +0530)]
deps: Bump dependencies for all modules (#8680)

RELEASE NOTES: N/A

8 weeks agoxds/internal: remove Generic resource Decoder and add concrete functions (#8652)
vinothkumarr227 [Thu, 30 Oct 2025 08:00:03 +0000 (08:00 +0000)]
xds/internal: remove Generic resource Decoder and add concrete functions (#8652)

Addresses: https://github.com/grpc/grpc-go/issues/8381

RELEASE NOTES: None

8 weeks agoChange version to 1.78.0-dev (#8679)
Arjan Singh Bal [Thu, 30 Oct 2025 06:06:35 +0000 (11:36 +0530)]
Change version to 1.78.0-dev (#8679)

RELEASE NOTES: N/A

8 weeks agotransport: Reduce heap allocations (#8668)
Arjan Singh Bal [Thu, 30 Oct 2025 05:43:39 +0000 (11:13 +0530)]
transport: Reduce heap allocations (#8668)

## Benchmarks
```sh
# Test command
$  go run benchmark/benchmain/main.go -benchtime=60s -workloads=unary \
   -compression=off -maxConcurrentCalls=200 -trace=off \
   -reqSizeBytes=100 -respSizeBytes=100 -networkMode=Local -resultFile="${RUN_NAME}"

$ go run benchmark/benchresult/main.go unary-before unary-after
               Title       Before        After Percentage
            TotalOps      7801951      7889246     1.12%
             SendOps            0            0      NaN%
             RecvOps            0            0      NaN%
            Bytes/op     10005.90      9911.48    -0.94%
           Allocs/op       146.91       143.91    -2.04%
             ReqT/op 104026013.33 105189946.67     1.12%
            RespT/op 104026013.33 105189946.67     1.12%
            50th-Lat   1.375183ms   1.360319ms    -1.08%
            90th-Lat   2.293816ms   2.249015ms    -1.95%
            99th-Lat   3.162307ms    3.13568ms    -0.84%
             Avg-Lat   1.536462ms   1.519465ms    -1.11%
           GoVersion     go1.24.8     go1.24.8
         GrpcVersion   1.77.0-dev   1.77.0-dev
```

RELEASE NOTES: N/A

8 weeks agodeps: update all dependencies (#8673)
Easwar Swaminathan [Tue, 28 Oct 2025 23:22:20 +0000 (16:22 -0700)]
deps: update all dependencies (#8673)

RELEASE NOTES: none

8 weeks agopickfirst: Remove old pickfirst (#8672)
Arjan Singh Bal [Tue, 28 Oct 2025 16:59:58 +0000 (22:29 +0530)]
pickfirst: Remove old pickfirst (#8672)

Fixes: #8561
Addresses: #6472

The new pickfirst has been the default since [gRPC Go
v1.71.0](https://github.com/grpc/grpc-go/releases/tag/v1.71.0) and all
reported bugs have been fixed. This PR removes the old pickfirst policy
completely.

The exported symbols in the `pickfirstleaf` package are retained with a
deprecation notice for removal after one release.

RELEASE NOTES:
* pickfirst: Remove the old `pick_first` LB policy. The new `pick_first`
has been the default since v1.71.0.

8 weeks agodocumentation: fix typos in benchmark and auth docs (#8674)
eshitachandwani [Tue, 28 Oct 2025 07:59:20 +0000 (13:29 +0530)]
documentation: fix typos in benchmark and auth docs (#8674)

RELEASE NOTES: N/A

8 weeks agomem: Remove Reader interface and export the concrete struct (#8669)
Arjan Singh Bal [Tue, 28 Oct 2025 06:40:03 +0000 (12:10 +0530)]
mem: Remove Reader interface and export the concrete struct (#8669)

This PR changes the exported slice reader from an interface to a
concrete struct.

This approach follows the precedent set by standard library packages,
such as [`bufio`'s `bufio.Reader`](https://pkg.go.dev/bufio#Reader).
This interface was not intended for users to implement, and gRPC does
not plan to provide alternative implementations. Users who require an
interface for abstraction or testing can define one in their own
packages.

This change provides two main advantages:

* Performance: It avoids a couple of heap allocations per stream that
were previously required to hold the interface value.
* Maintainability: Adding new methods to the concrete struct is a
backward-compatible change, whereas adding methods to an interface is a
breaking change.
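
As a sketch of the "define your own interface" suggestion above: the example below uses `bufio.Reader` as the concrete type, since it is the precedent the description cites, rather than the `mem` package itself. `byteSource`, `countUntil`, and `fakeSource` are hypothetical names for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// byteSource is a caller-defined interface listing only the methods this
// code needs. Any concrete reader with a matching method satisfies it.
type byteSource interface {
	ReadByte() (byte, error)
}

// countUntil counts the bytes read before delim or the first error.
func countUntil(src byteSource, delim byte) int {
	n := 0
	for {
		b, err := src.ReadByte()
		if err != nil || b == delim {
			return n
		}
		n++
	}
}

// fakeSource is a test double that satisfies byteSource.
type fakeSource struct{ data []byte }

func (f *fakeSource) ReadByte() (byte, error) {
	if len(f.data) == 0 {
		return 0, io.EOF
	}
	b := f.data[0]
	f.data = f.data[1:]
	return b, nil
}

func main() {
	concrete := bufio.NewReader(strings.NewReader("key=value"))
	fmt.Println(countUntil(concrete, '='))                          // concrete struct
	fmt.Println(countUntil(&fakeSource{data: []byte("ab,c")}, ',')) // test double
}
```

Because the interface lives in the caller's package, the library remains free to add methods to its struct without breaking anyone.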

## Benchmarks
```sh
# test command
$ go run benchmark/benchmain/main.go -benchtime=60s -workloads=unary \
   -compression=off -maxConcurrentCalls=200 -trace=off \
   -reqSizeBytes=100 -respSizeBytes=100 -networkMode=Local -resultFile="${RUN_NAME}"

$ go run benchmark/benchresult/main.go unary-before unary-after
               Title       Before        After Percentage
            TotalOps      7801951      7883976     1.05%
             SendOps            0            0      NaN%
             RecvOps            0            0      NaN%
            Bytes/op     10005.90      9951.01    -0.54%
           Allocs/op       146.91       144.90    -1.36%
             ReqT/op 104026013.33 105119680.00     1.05%
            RespT/op 104026013.33 105119680.00     1.05%
            50th-Lat   1.375183ms   1.359194ms    -1.16%
            90th-Lat   2.293816ms   2.258941ms    -1.52%
            99th-Lat   3.162307ms   3.157381ms    -0.16%
             Avg-Lat   1.536462ms   1.520149ms    -1.06%
           GoVersion     go1.24.8     go1.24.8
         GrpcVersion   1.77.0-dev   1.77.0-dev
```

RELEASE NOTES:
* mem: Replace the `Reader` interface with a struct.

8 weeks agomem: Avoid clearing new buffers and clear buffers from simpleBufferPools (#8670)
Arjan Singh Bal [Tue, 28 Oct 2025 05:33:08 +0000 (11:03 +0530)]
mem: Avoid clearing new buffers and clear buffers from simpleBufferPools (#8670)

RELEASE NOTES:
* mem:
  * Avoid clearing buffers which are newly allocated.
  * Clear large buffers (> 1MB) before re-using.

8 weeks agobenchmark/client: add context for cancellation (#8614)
Ivan Mamaev [Tue, 28 Oct 2025 04:55:12 +0000 (07:55 +0300)]
benchmark/client: add context for cancellation (#8614)

Fixes: #8596
RELEASE NOTES: None

2 months agoxds/googlec2p: support custom bootstrap config per channel. (#8648)
Pranjali-2501 [Mon, 27 Oct 2025 08:21:01 +0000 (13:51 +0530)]
xds/googlec2p: support custom bootstrap config per channel. (#8648)

xds/googlec2p: Fix channel-specific xDS bootstrap configurations by
allowing xdsclient creation with per-target config. Removes global
fallback config usage, enabling multiple distinct xDS clients to coexist
in the same process.

2 months agooutlierdetection: add metrics specified in gRFC A91 (#8644)
Sotiris Nanopoulos [Tue, 21 Oct 2025 19:15:11 +0000 (15:15 -0400)]
outlierdetection: add metrics specified in gRFC A91 (#8644)

Implements gRFC A91:
https://github.com/grpc/proposal/blob/master/A91-outlier-detection-metrics.md

### Notable implementation details
* `grpc.lb.backend_service` is not implemented yet (marked as optional
in the gRFC)
* modifies the tests to make sure we can cover all the cases for
`enforced`/`unenforced` without repeating the test setup.

RELEASE NOTES:

* outlierdetection: add metrics for enforced
(grpc.lb.outlier_detection.ejections_enforced) and unenforced
(grpc.lb.outlier_detection.ejections_unenforced) outlier ejections.

---------

Signed-off-by: sotiris <sotiris.nanopoulos@reddit.com>
Co-authored-by: Pardhu Konakanchi <pardhukonakanchi@berkeley.edu>
Co-authored-by: Pardhu Konakanchi <46410151+PardhuKonakanchi@users.noreply.github.com>
Co-authored-by: eshitachandwani <59800922+eshitachandwani@users.noreply.github.com>
2 months agocredentials/tls: Revert removal of ALPN flag from #8660 (#8664)
Arjan Singh Bal [Tue, 21 Oct 2025 18:47:01 +0000 (00:17 +0530)]
credentials/tls: Revert removal of ALPN flag from #8660 (#8664)

Original PR: https://github.com/grpc/grpc-go/pull/8660

This reverts commit 0037c61d300991605f745bba4d145f406c6e392d.

## Why

There are internal users of this flag that need to be updated. Internal
issue to track removal: b/454048967.

RELEASE NOTES: N/A

2 months agocleanup: replace dial with newclient (#8602)
vinothkumarr227 [Tue, 21 Oct 2025 05:16:23 +0000 (05:16 +0000)]
cleanup: replace dial with newclient  (#8602)

Fixes: https://github.com/grpc/grpc-go/issues/7049
RELEASE NOTES: None

2 months agocredentials/tls: Remove environment variable for disabling ALPN (#8660)
Arjan Singh Bal [Tue, 21 Oct 2025 05:14:02 +0000 (10:44 +0530)]
credentials/tls: Remove environment variable for disabling ALPN (#8660)

Related issue: https://github.com/grpc/grpc-go/issues/434

RELEASE NOTES:
* credentials/tls: Remove the `GRPC_ENFORCE_ALPN_ENABLED` environment
variable. ALPN is now enforced by default. Users who must disable ALPN
enforcement can temporarily use the [experimental transport
credentials](https://pkg.go.dev/google.golang.org/grpc@v1.76.0/experimental/credentials).
These experimental credentials will be removed in an upcoming release;
users who depend on them must vendor this version of gRPC or copy the
relevant code into their own codebase.

2 months agostats: Re-use objects while calling multiple Handlers (#8639)
Arjan Singh Bal [Fri, 17 Oct 2025 06:49:47 +0000 (12:19 +0530)]
stats: Re-use objects while calling multiple Handlers (#8639)

This PR improves performance by eliminating heap allocations when
multiple stats handlers are configured.

Previously, iterating through a list of handlers caused one heap
allocation per handler for each RPC. This change introduces a Handler
that combines multiple Handlers and implements the `Handler` interface.
The combined handler delegates calls to the handlers it contains.

This approach allows gRPC clients and servers to operate as if there
were only a single `Handler` registered, simplifying the internal logic
and removing the per-RPC allocation overhead. To avoid any performance
impact when stats are disabled, the combined `Handler` is only created
when at least one handler is registered.
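
The combined-handler pattern can be sketched as follows. The `handler` interface here is a simplified, hypothetical stand-in with a single method, not the real `stats.Handler` interface, and `newCombined` is an illustrative name.

```go
package main

import "fmt"

// handler is a simplified stand-in for a stats handler interface.
type handler interface {
	HandleRPC(event string)
}

// combinedHandler implements handler and delegates to every wrapped handler,
// so callers can treat any number of handlers as a single one.
type combinedHandler struct {
	handlers []handler
}

func (c *combinedHandler) HandleRPC(event string) {
	for _, h := range c.handlers {
		h.HandleRPC(event)
	}
}

// newCombined returns nil when no handlers are registered, so the
// stats-disabled path pays nothing beyond a nil check.
func newCombined(hs ...handler) handler {
	if len(hs) == 0 {
		return nil
	}
	return &combinedHandler{handlers: hs}
}

// recorder is a trivial handler used for the demo.
type recorder struct{ events []string }

func (r *recorder) HandleRPC(e string) { r.events = append(r.events, e) }

func main() {
	a, b := &recorder{}, &recorder{}
	if h := newCombined(a, b); h != nil {
		h.HandleRPC("begin")
		h.HandleRPC("end")
	}
	fmt.Println(a.events, b.events)
}
```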

# Tested
Since existing benchmarks don't register stats handler, I modified the
benchmark to add 2 stats handlers each on the server and client
(https://github.com/grpc/grpc-go/pull/8639/commits/36ba616d7c40deb1cb79d5c8d1636057f94fc88a).

```sh
# test command
go run benchmark/benchmain/main.go -benchtime=60s -workloads=unary \
   -compression=off -maxConcurrentCalls=200 -trace=off \
   -reqSizeBytes=100 -respSizeBytes=100 -networkMode=Local -resultFile="${RUN_NAME}"

# results
go run benchmark/benchresult/main.go unary-before unary-after
               Title       Before        After Percentage
            TotalOps      7336128      7638892     4.13%
             SendOps            0            0      NaN%
             RecvOps            0            0      NaN%
            Bytes/op     12382.19     11467.03    -7.39%
           Allocs/op       173.93       165.91    -4.60%
             ReqT/op  97815040.00 101851893.33     4.13%
            RespT/op  97815040.00 101851893.33     4.13%
            50th-Lat   1.463345ms   1.403011ms    -4.12%
            90th-Lat   2.557136ms    2.46828ms    -3.47%
            99th-Lat   3.073264ms   3.080081ms     0.22%
             Avg-Lat   1.634153ms   1.569391ms    -3.96%
           GoVersion     go1.24.7     go1.24.7
         GrpcVersion   1.77.0-dev   1.77.0-dev

```

RELEASE NOTES:
* stats: Reduce heap allocations when multiple stats Handlers are
registered.

2 months agodeps: update dependencies for all modules (#8653)
Easwar Swaminathan [Thu, 16 Oct 2025 18:28:41 +0000 (11:28 -0700)]
deps: update dependencies for all modules (#8653)

There is a new version of the envoy protos (v1.35.0) that was released
recently that contains proto changes required for the ext_authz support
that I'm currently working on.

Ran the following command twice (as mentioned in our release docs):
```
for x in $(find . -name 'go.mod' | xargs dirname | sort); do
  pushd "${x}"
  go get -u ./...
  go mod tidy -compat=1.24
  popd
done
```

RELEASE NOTES: none

2 months agotransport: Ensure stream context is cancelled in test (#8647)
Arjan Singh Bal [Tue, 14 Oct 2025 17:41:15 +0000 (23:11 +0530)]
transport: Ensure stream context is cancelled in test (#8647)

Fixes: #8646
The server stream's timer to monitor the deadline is closed when the
stream's cancel method is invoked.

https://github.com/grpc/grpc-go/blob/2d922719c02bb46f34482d592c35e72dc4a9ad92/internal/transport/http2_server.go#L623-L637

The cancel method is called when `closeStream` is called, just before it
calls `deleteStream`.

https://github.com/grpc/grpc-go/blob/2d922719c02bb46f34482d592c35e72dc4a9ad92/internal/transport/http2_server.go#L1347-L1357

The cancel method is not called in
[`deleteStream`](https://github.com/grpc/grpc-go/blob/2d922719c02bb46f34482d592c35e72dc4a9ad92/internal/transport/http2_server.go#L1302).

This change invokes `deleteStream` through `closeStream` in the flaking
test to ensure the stream is always cancelled to avoid leaking timers.

RELEASE NOTES: N/A

2 months agoxdsclient: move listener resource type implementation to match external xdsclient...
Easwar Swaminathan [Tue, 14 Oct 2025 17:18:24 +0000 (10:18 -0700)]
xdsclient: move listener resource type implementation to match external xdsclient API (#8640)

Addresses https://github.com/grpc/grpc-go/issues/8381

This PR *only* changes the implementation of the listener resource type
to adhere to the external xdsclient API. The other resource type
implementations will be handled in subsequent PRs, and once all resource
type implementations have switched to the external xdsclient API, we can
get rid of some of existing APIs.

RELEASE NOTES: none

2 months agodelegatingresolver: add default port to addresses (#8613)
eshitachandwani [Tue, 14 Oct 2025 09:01:27 +0000 (14:31 +0530)]
delegatingresolver: add default port to addresses (#8613)

Fixes: https://github.com/grpc/grpc-go/issues/8607
RELEASE NOTES:
- Fixes a bug where the default port 443 was not added to port-less
addresses sent to the proxy.
- Adds a new environment variable
`GRPC_EXPERIMENTAL_ENABLE_DEFAULT_PORT_FOR_PROXY_TARGET`, enabled by
default, that controls whether a default port is added to addresses
sent to the proxy.
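
A minimal sketch of adding a default port to a port-less address, using only the `net` package. `withDefaultPort` is a hypothetical helper, not the function used by the delegating resolver, and it ignores the environment variable.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// withDefaultPort returns target unchanged when it already has a port, and
// appends defaultPort otherwise. Bracketed and bare IPv6 literals are both
// handled by stripping brackets before net.JoinHostPort re-adds them.
func withDefaultPort(target, defaultPort string) string {
	if _, _, err := net.SplitHostPort(target); err == nil {
		return target
	}
	host := strings.TrimSuffix(strings.TrimPrefix(target, "["), "]")
	return net.JoinHostPort(host, defaultPort)
}

func main() {
	for _, t := range []string{"example.com", "example.com:8080", "[::1]"} {
		fmt.Println(withDefaultPort(t, "443"))
	}
}
```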

2 months agostats/opentelemetry: Add support for optional label `grpc.lb.backend_service` in...
Madhav Bissa [Tue, 14 Oct 2025 06:21:35 +0000 (11:51 +0530)]
stats/opentelemetry: Add support for optional label `grpc.lb.backend_service` in per-call metrics (#8637)

Addresses
[A89](https://github.com/grpc/proposal/blob/master/A89-backend-service-metric-label.md)

RELEASE NOTES:
* stats/opentelemetry: Add support for optional label
`grpc.lb.backend_service` in per-call metrics

2 months agoxdsclient: fix the flaky ADS stream restart test (#8631)
Easwar Swaminathan [Thu, 9 Oct 2025 06:52:26 +0000 (23:52 -0700)]
xdsclient: fix the flaky ADS stream restart test (#8631)

The ADS stream restart test can be flaky for the following reason:
- It requests a CDS resource and unrequests it before the stream breaks.
- And then once the stream restarts, it verifies that this resource is
not requested again.
- But the ACK for this resource may or may not be received at the
management server before the stream breaks. This can falsely cause the
test to conclude that the resource was re-requested after the restart.

This PR changes the test in the following ways:
- Use a single resource
- Verify ACK before the stream is restarted

Ran a million times without flakes on Forge.

RELEASE NOTES: NONE

2 months agotransport: Replace closures with interfaces to avoid heap allocations (#8630)
Arjan Singh Bal [Thu, 9 Oct 2025 04:19:03 +0000 (09:49 +0530)]
transport: Replace closures with interfaces to avoid heap allocations (#8630)

In Go, creating a closure results in a heap allocation if the compiler
determines the closure might outlive the function in which it was
created. This change removes two such closures, replacing them with
interfaces that are implemented by the `ClientStream` and `ServerStream`
structs.

While this pattern may slightly reduce readability, the performance
benefit is worthwhile, as this transport code is executed for every new
stream. This reduces allocs/unary RPC by 2.5%.
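
The closure-to-interface swap can be sketched as below. All names (`windowUpdater`, `recvReader`, `clientStream`) are hypothetical and much simpler than the real transport types. The sketch shows only the shape of the pattern and makes no claim about allocation counts.

```go
package main

import "fmt"

// windowUpdater is a one-method interface that replaces a func(int)
// callback field.
type windowUpdater interface {
	updateWindow(n int)
}

// recvReader notifies its owner after consuming bytes.
type recvReader struct {
	owner windowUpdater
}

func (r *recvReader) consume(n int) { r.owner.updateWindow(n) }

// clientStream implements windowUpdater itself, so wiring up the reader
// needs no separate closure object capturing the stream.
type clientStream struct {
	window int
	reader recvReader
}

func (s *clientStream) updateWindow(n int) { s.window += n }

func newClientStream() *clientStream {
	s := &clientStream{}
	// The closure form would be a field assignment such as:
	//   s.reader.onRead = func(n int) { s.window += n }
	// Storing the stream behind the interface reuses the existing pointer.
	s.reader.owner = s
	return s
}

func main() {
	s := newClientStream()
	s.reader.consume(5)
	s.reader.consume(3)
	fmt.Println(s.window)
}
```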

## Testing
```sh

# test command
 go run benchmark/benchmain/main.go -benchtime=60s -workloads=unary \
   -compression=off -maxConcurrentCalls=500 -trace=off \
   -reqSizeBytes=100 -respSizeBytes=100 -networkMode=Local -resultFile="${RUN_NAME}"   -recvBufferPool=simple

# results
go run benchmark/benchresult/main.go unary-before unary-after
               Title       Before        After Percentage
            TotalOps      7593738      7708364     1.51%
             SendOps            0            0      NaN%
             RecvOps            0            0      NaN%
            Bytes/op     10218.45     10185.84    -0.32%
           Allocs/op       164.85       160.84    -2.43%
             ReqT/op 101249840.00 102778186.67     1.51%
            RespT/op 101249840.00 102778186.67     1.51%
            50th-Lat   3.617561ms   3.568623ms    -1.35%
            90th-Lat   5.218682ms   5.131828ms    -1.66%
            99th-Lat   6.052632ms   5.950261ms    -1.69%
             Avg-Lat   3.948414ms   3.889006ms    -1.50%
           GoVersion     go1.24.4     go1.24.4
         GrpcVersion   1.77.0-dev   1.77.0-dev

```

RELEASE NOTES: N/A

2 months agobenchmark/benchmain: Enable buffer pooling by default (#8638)
Arjan Singh Bal [Thu, 9 Oct 2025 04:03:51 +0000 (09:33 +0530)]
benchmark/benchmain: Enable buffer pooling by default (#8638)

This PR enables buffer pooling in the benchmark test to align with the
library's current default configuration.

The benchmark originally disabled buffer pooling because the feature was
opt-in when introduced (#5862). Since buffer pooling is now enabled by
default (#7356), this change ensures the benchmark accurately measures
the performance of gRPC's default behavior.

RELEASE NOTES: N/A

2 months agoexperimental/stats: Add up down counter for A94 (#8581)
Madhav Bissa [Wed, 8 Oct 2025 20:29:58 +0000 (01:59 +0530)]
experimental/stats: Add up down counter for A94 (#8581)

Part 1 for
[A94](https://github.com/grpc/proposal/blob/master/A94-subchannel-otel-metrics.md)
Adds up-down counter boilerplate code

RELEASE NOTES:

* experimental/stats: Add up down counter in experimental stats

2 months agoclient: ignore http status header for gRPC streams (#8548)
Madhav Bissa [Wed, 8 Oct 2025 20:28:35 +0000 (01:58 +0530)]
client: ignore http status header for gRPC streams (#8548)

Fixes https://github.com/grpc/grpc-go/issues/8486

When a gRPC response is received with content type application/grpc, we
do not expect any information in the HTTP status; the status information
must be conveyed by the gRPC status alone.

If the gRPC status is missing, we now return an Internal error instead
of Unknown, in accordance with https://grpc.io/docs/guides/status-codes/

Changes :
- Ignore http status in case of content type application/grpc
- Change the default rawStatusCode to return Internal for missing grpc
status

RELEASE NOTES:
* client : Ignore the HTTP header status for gRPC streams and return
Internal error for missing gRPC status.

2 months agoxds: Store WeightedClusters as a slice instead of a map inside the Route (#8632)
Easwar Swaminathan [Wed, 8 Oct 2025 18:20:00 +0000 (11:20 -0700)]
xds: Store WeightedClusters as a slice instead of a map inside the Route (#8632)

Reasons for this change:
- The `WeightedClusters` field is never used as a map.
- Weighted clusters are stored as a list in the original envoy proto as
well.
- In tests that require deterministic WRR behavior, we use the
`testutils.NewTestWRR` to get rid of the randomness. But the output
still depends on the order in which items are added to the WRR. Maps in
Go are non-deterministic.
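
A short sketch of why ordering matters here. `weightedCluster`, `namesFromSlice`, and `namesFromMap` are hypothetical names, not the xDS types. A slice hands items to a consumer in a fixed order, whereas a map needs an explicit sort to get a repeatable order.

```go
package main

import (
	"fmt"
	"sort"
)

// weightedCluster is a hypothetical stand-in for a route's cluster entry.
type weightedCluster struct {
	Name   string
	Weight uint32
}

// namesFromSlice returns names in the order given, which is stable across runs.
func namesFromSlice(cs []weightedCluster) []string {
	names := make([]string, 0, len(cs))
	for _, c := range cs {
		names = append(names, c.Name)
	}
	return names
}

// namesFromMap must sort, because Go does not define map iteration order.
func namesFromMap(m map[string]uint32) []string {
	names := make([]string, 0, len(m))
	for name := range m {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	cs := []weightedCluster{{Name: "b", Weight: 75}, {Name: "a", Weight: 25}}
	fmt.Println(namesFromSlice(cs))
	fmt.Println(namesFromMap(map[string]uint32{"b": 75, "a": 25}))
}
```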

RELEASE NOTES: N/A

2 months agotransport: Increment metrics only when the stream is active (#8573)
Elric [Tue, 7 Oct 2025 22:25:07 +0000 (07:25 +0900)]
transport: Increment metrics only when the stream is active (#8573)

Fixes: https://github.com/grpc/grpc-go/issues/8529
This PR increments metrics only when the stream is active, that is,
when it is found in the activeStreams map.

#### as-is
- deleteStream incremented channelz metrics every time it was called,
even when the stream had already been removed from activeStreams or
never existed in it.

#### to-be
- Added check to ensure metrics are only incremented once when a stream
is actually removed from activeStreams.
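
The guard can be sketched as follows. The `transport` type and its counters are hypothetical simplifications, not the real server transport or channelz API. The point is that the counters move only when the map lookup shows the stream was still active.

```go
package main

import (
	"fmt"
	"sync"
)

// transport is a simplified stand-in for a server transport.
type transport struct {
	mu               sync.Mutex
	activeStreams    map[uint32]struct{}
	streamsSucceeded int
	streamsFailed    int
}

// deleteStream removes the stream and updates metrics at most once per
// stream: repeated or spurious calls find nothing in the map and return.
func (t *transport) deleteStream(id uint32, eosReceived bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if _, found := t.activeStreams[id]; !found {
		return
	}
	delete(t.activeStreams, id)
	if eosReceived {
		t.streamsSucceeded++
	} else {
		t.streamsFailed++
	}
}

func main() {
	t := &transport{activeStreams: map[uint32]struct{}{1: {}}}
	t.deleteStream(1, true)
	t.deleteStream(1, true) // second call is a no-op
	t.deleteStream(9, false) // unknown stream is a no-op
	fmt.Println(t.streamsSucceeded, t.streamsFailed)
}
```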

RELEASE NOTES:
* server: Fix a bug that caused overcounting of channelz metrics for
successful and failed streams.

2 months agoclusterimpl: remove unused code in tests (#8634)
Easwar Swaminathan [Tue, 7 Oct 2025 22:09:49 +0000 (15:09 -0700)]
clusterimpl: remove unused code in tests (#8634)

RELEASE NOTES: NONE

2 months agoxds/clusterimpl: Convert existing unit tests to e2e style (3/N) (#8616)
Pranjali-2501 [Tue, 7 Oct 2025 18:57:36 +0000 (00:27 +0530)]
xds/clusterimpl: Convert existing unit tests to e2e style (3/N) (#8616)

2 months agotransport: Reduce pointer usage in Stream structs (#8624)
Arjan Singh Bal [Tue, 7 Oct 2025 09:41:48 +0000 (15:11 +0530)]
transport: Reduce pointer usage in Stream structs (#8624)

The pprof profiles for unary RPC benchmarks indicate significant time
spent in `runtime.mallocgc` and `runtime.gcBgMarkWorker`. This indicates
gRPC is spending significant CPU cycles allocating or garbage
collecting.

This change reduces the number of pointer fields in the structs that
represent client and server streams. This reduces the number of memory
allocations and also reduces pressure on the garbage collector (faster
garbage collections), since the GC doesn't need to scan
non-pointer fields. For structs which were stored as pointers to ensure
values are not copied, a `noCopy` struct is embedded that will cause `go
vet` to fail if copies are performed. Non-pointer fields are also moved
to the end of the struct to improve allocation speed.
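
The `noCopy` technique mentioned above is a common Go idiom: a zero-size type with `Lock` and `Unlock` methods on its pointer receiver, which the `copylocks` check in `go vet` treats like a lock and therefore reports when a containing struct is copied by value. The sketch below uses a hypothetical `streamState` struct, not the real stream types.

```go
package main

import "fmt"

// noCopy is a zero-size marker. Its Lock and Unlock methods exist only so
// that `go vet` flags by-value copies of any struct that contains it.
type noCopy struct{}

func (*noCopy) Lock()   {}
func (*noCopy) Unlock() {}

// streamState is a hypothetical struct stored by value inside a larger
// struct instead of behind a pointer.
type streamState struct {
	noCopy        noCopy
	bytesReceived int
}

// add mutates the state in place through a pointer receiver.
func (s *streamState) add(n int) { s.bytesReceived += n }

func main() {
	s := &streamState{}
	s.add(3)
	s.add(4)
	fmt.Println(s.bytesReceived)
	// Writing `c := *s` here would still compile, but `go vet` would
	// report that the assignment copies a lock value.
}
```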

## Results
There are improvements in QPS, latency and allocs/op for unary RPCs.

```sh
# test command
go run benchmark/benchmain/main.go -benchtime=60s -workloads=unary \
   -compression=off -maxConcurrentCalls=500 -trace=off \
   -reqSizeBytes=100 -respSizeBytes=100 -networkMode=Local -resultFile="${RUN_NAME}"   -recvBufferPool=simple

go run benchmark/benchresult/main.go unary-before unary-after
               Title       Before        After Percentage
            TotalOps      7690250      7991877     3.92%
             SendOps            0            0      NaN%
             RecvOps            0            0      NaN%
            Bytes/op     10218.14     10084.00    -1.31%
           Allocs/op       164.85       151.85    -7.89%
             ReqT/op 102536666.67 106558360.00     3.92%
            RespT/op 102536666.67 106558360.00     3.92%
            50th-Lat    3.57283ms   3.435143ms    -3.85%
            90th-Lat   5.152403ms   4.979906ms    -3.35%
            99th-Lat   5.985282ms   5.827893ms    -2.63%
             Avg-Lat    3.89872ms   3.750449ms    -3.80%
           GoVersion     go1.24.4     go1.24.4
         GrpcVersion   1.77.0-dev   1.77.0-dev
```

## Resources
* go/go/performance?polyglot=open-source#application-spends-too-much-on-gc-or-allocations
* go/go/performance?polyglot=open-source#memory-optimizations

RELEASE NOTES:
* transport: Reduce pointer usage to lower garbage collection pressure
and improve unary RPC performance.

2 months agoxds/clusterimpl: Convert existing unit tests to e2e style (2/N) (#8576)
Pranjali-2501 [Fri, 3 Oct 2025 20:00:14 +0000 (01:30 +0530)]
xds/clusterimpl: Convert existing unit tests to e2e style (2/N) (#8576)

2 months agotesting: SPIFFE Bundle Maps - Swap to a real unsupported key type (#8626)
Gregory Cooke [Fri, 3 Oct 2025 19:32:56 +0000 (15:32 -0400)]
testing: SPIFFE Bundle Maps - Swap to a real unsupported key type (#8626)

EC Keys are actually supported. The test using this file previously
failed because we had `EC` in the `kty` field, but not the associated
`crv`, `x`, and `y` values. Change this test to use an actual
unsupported key type.

RELEASE NOTES: N/A

2 months agoclient: Add error log for missing health package import during health check (#8595)
Arjan Singh Bal [Fri, 3 Oct 2025 04:08:57 +0000 (09:38 +0530)]
client: Add error log for missing health package import during health check (#8595)

Addresses: https://github.com/grpc/grpc-go/issues/8590

When using the health producer for health checks, and the health package
is not imported by the application, a no-op health producer is used
without logging any errors. This PR adds an error log similar to the one
for the old health checks started by the subchannel.

https://github.com/grpc/grpc-go/blob/e35080456c7071b6d55887080ef78809c163640c/clientconn.go#L1475-L1481

RELEASE NOTES: N/A

2 months agopickfirstleaf: fix bug in address de-duplication (#8611)
Arjan Singh Bal [Fri, 3 Oct 2025 04:05:19 +0000 (09:35 +0530)]
pickfirstleaf: fix bug in address de-duplication (#8611)

Due to a bug in the new pickfirst balancer, it wasn't de-duplicating
addresses in the resolver update. The only user visible impact of this
seems to be less frequent picker updates after the first pass in happy
eyeballs and incorrect interleaving of IPv4/IPv6 addresses during the
first happy eyeballs pass.

RELEASE NOTES:
* balancer/pickfirst: Fix a bug where duplicate addresses were not being ignored as intended.

2 months agoexamples/health: fix markdown formatting and improve content (#8625)
Easwar Swaminathan [Thu, 2 Oct 2025 22:30:50 +0000 (15:30 -0700)]
examples/health: fix markdown formatting and improve content (#8625)

This PR fixes some markdown formatting issues flagged by an internal
tool (when attempting to import recent changes into google3). It also
makes minor improvements to the content.

RELEASE NOTES: N/A

2 months agoxdsclient: fix race in ADS stream flow control causing indefinite blocking (#8605)
Easwar Swaminathan [Thu, 2 Oct 2025 18:31:00 +0000 (11:31 -0700)]
xdsclient: fix race in ADS stream flow control causing indefinite blocking (#8605)

Fixes https://github.com/grpc/grpc-go/issues/8594

The above issue clearly describes the condition under which the race
manifests. The changes in this PR are as follows:
- Remove the `readyCh` field in the flow control that was previously
used to block when waiting for flow control. Instead use a condition
variable.
- Have two bits of state inside the flow control:
  - One to indicate if there is a pending update that is awaiting
    consumption by all watchers
  - One to indicate that the stream is closed
- The flow control object no longer needs to be recreated every time a
new stream is created
- The flow control object is stopped when the `adsStreamImpl` is stopped
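
The design in the list above (a condition variable guarding two booleans) can be sketched as follows. This is an illustration under assumptions, not the xdsclient code: the type and method names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// flowControl blocks the reader while an update is pending, and releases it
// when the update is consumed or the stream is stopped.
type flowControl struct {
	mu      sync.Mutex
	cond    *sync.Cond
	pending bool // an update is awaiting consumption by all watchers
	stopped bool // the stream is closed
}

func newFlowControl() *flowControl {
	fc := &flowControl{}
	fc.cond = sync.NewCond(&fc.mu)
	return fc
}

// setPending records that an update was handed to the watchers.
func (fc *flowControl) setPending() {
	fc.mu.Lock()
	fc.pending = true
	fc.mu.Unlock()
}

// onDone records that all watchers consumed the update and wakes waiters.
func (fc *flowControl) onDone() {
	fc.mu.Lock()
	fc.pending = false
	fc.cond.Broadcast()
	fc.mu.Unlock()
}

// stop marks the stream closed and wakes waiters so none block forever.
func (fc *flowControl) stop() {
	fc.mu.Lock()
	fc.stopped = true
	fc.cond.Broadcast()
	fc.mu.Unlock()
}

// wait blocks until no update is pending or the stream is stopped. It
// returns false when stopped, telling the caller not to read further.
func (fc *flowControl) wait() bool {
	fc.mu.Lock()
	defer fc.mu.Unlock()
	for fc.pending && !fc.stopped {
		fc.cond.Wait()
	}
	return !fc.stopped
}

func main() {
	fc := newFlowControl()
	fc.setPending()
	go fc.onDone() // a watcher finishes on another goroutine
	fmt.Println(fc.wait())
	fc.setPending()
	fc.stop()
	fmt.Println(fc.wait())
}
```

Because both flags are checked in a loop under the same mutex, a wake-up that arrives before `wait` is called is not lost, which is the failure mode a one-shot channel signal can have.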

This PR also makes other minor changes:
- Fix a flaky test by ensuring that the test stream implementation
unblocks from a `Recv` call when the underlying stream context is
cancelled
- Couple of logging improvements

RELEASE NOTES:
- xdsclient: fix a race in the ADS stream implementation that could
result in resource-not-found errors, causing the gRPC client channel to
move to `TransientFailure`

2 months agoexamples: improve interceptor example with better markdown formatting (#8612)
Easwar Swaminathan [Wed, 1 Oct 2025 21:26:19 +0000 (14:26 -0700)]
examples: improve interceptor example with better markdown formatting (#8612)

This PR changes the existing example to use fenced code blocks instead
of backticks to show the interceptor signatures. This greatly improves
how the document is rendered. The PR also changes the word `overload` to
`override` because it is the latter that we are showcasing here, and Go
does not support overloading anyways.

RELEASE NOTES: none

---------

Co-authored-by: Doug Fawley <dfawley@google.com>
2 months agoxds/cdsbalancer: change tests to use xds resolver (#8579)
eshitachandwani [Wed, 1 Oct 2025 19:24:25 +0000 (00:54 +0530)]
xds/cdsbalancer: change tests to use xds resolver (#8579)

Change the tests in xds cds balancer to use xds resolver instead of
manual resolver

This change is being done as part of gRFC [A74 : xDS Config
tears](https://github.com/grpc/proposal/blob/master/A74-xds-config-tears.md).
This is to make sure the tests pass after the change too.

RELEASE NOTES: None

2 months agoxds/e2e: Use `Metadata` field from `BackendOptions` only when it is non-nil (#8619)
Elric [Wed, 1 Oct 2025 09:41:00 +0000 (18:41 +0900)]
xds/e2e: Use `Metadata` field from `BackendOptions` only when it is non-nil (#8619)

Fixes: #8545
Check BackendOptions.Metadata field before assigning it to
`v3corepb.Metadata`.

RELEASE NOTES: N/A

2 months agoxds/resolver: minor cleanup in the config selector implementation (#8609)
Easwar Swaminathan [Tue, 30 Sep 2025 23:02:42 +0000 (16:02 -0700)]
xds/resolver: minor cleanup in the config selector implementation (#8609)

This PR injects the dependencies of `configSelector` at creation time,
instead of passing a reference to the `xdsResolver` and having the
former directly access fields from the latter.

RELEASE NOTES: none

2 months agopickfirstleaf: Avoid getting stuck in IDLE on connection breakage (#8615)
Arjan Singh Bal [Tue, 30 Sep 2025 05:34:22 +0000 (11:04 +0530)]
pickfirstleaf: Avoid getting stuck in IDLE on connection breakage (#8615)

Related issue: b/415354418

## Problem

On connection breakage, the pickfirst leaf balancer enters idle and
returns an `Idle picker` that calls the balancer's `ExitIdle` method
only the first time `Pick` is called. The following sequence of events
will cause the balancer to get stuck in `Idle` state:
1. Existing connection breaks, SubConn [requests re-resolution and
reports
IDLE](https://github.com/grpc/grpc-go/blob/bb71072094cf533965450c44890f8f51c671c393/clientconn.go#L1388-L1393).
In turn PF updates the ClientConn state to IDLE with an `Idle picker`.
1. An RPC is made, triggering `balancer.ExitIdle` through the idle
picker. The balancer attempts to re-connect the failed SubConn.
1. The resolver produces a new endpoint list, removing the endpoint used
by the existing SubConn. PF removes the existing SubConn. Since the
balancer didn't update the ClientConn state to CONNECTING yet, pickfirst
thinks that it's still in IDLE and doesn't start connecting to the new
endpoints.
1. New RPC requests trigger the idle picker, but it's a no-op since it
only [triggers the balancer's ExitIdle method
once](https://github.com/grpc/grpc-go/blob/bb71072094cf533965450c44890f8f51c671c393/balancer/pickfirst/pickfirstleaf/pickfirstleaf.go#L663).

## Fix

This change moves the ClientConn into Connecting immediately when the
`ExitIdle` method is called. This ensures that the balancer continues to
re-connect when a new endpoint list is produced by the resolver.
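
The effect of the fix can be sketched with a toy state machine. Everything here (`toyBalancer`, `exitIdle`, `updateEndpoints`) is a hypothetical simplification and not the pickfirst code; it only models why moving to connecting inside `exitIdle` lets a later resolver update restart connection attempts.

```go
package main

import "fmt"

type state int

const (
	idle state = iota
	connecting
)

// toyBalancer is a hypothetical, heavily simplified model of the behavior
// described above.
type toyBalancer struct {
	state     state
	endpoints []string
	dialing   string // endpoint currently being connected to, if any
}

// exitIdle models the fix: the state moves to connecting right away, before
// any connection attempt has reported progress.
func (b *toyBalancer) exitIdle() {
	if b.state != idle {
		return
	}
	b.state = connecting
	b.startConnecting()
}

// updateEndpoints models a resolver update. Connection attempts restart only
// if the balancer is not idle.
func (b *toyBalancer) updateEndpoints(eps []string) {
	b.endpoints = eps
	b.dialing = ""
	if b.state == idle {
		return // stay idle until an RPC triggers exitIdle
	}
	b.startConnecting()
}

func (b *toyBalancer) startConnecting() {
	if len(b.endpoints) > 0 {
		b.dialing = b.endpoints[0]
	}
}

func main() {
	b := &toyBalancer{state: idle, endpoints: []string{"old:443"}}
	b.exitIdle()                           // first RPC after the connection broke
	b.updateEndpoints([]string{"new:443"}) // resolver drops the old endpoint
	fmt.Println(b.state == connecting, b.dialing)
}
```

If `exitIdle` left the state as idle, `updateEndpoints` would return early and nothing would dial the new endpoint, which is the stuck condition described in the problem statement.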

RELEASE NOTES:
* balancer/pickfirst: Fix bug that can cause balancer to get stuck in
`IDLE` state on connection failure.

2 months agotransport: Invoke `net.Conn.SetWriteDeadline` in `http2_client.Close` (#8534)
jgold2-stripe [Mon, 29 Sep 2025 20:45:54 +0000 (13:45 -0700)]
transport: Invoke `net.Conn.SetWriteDeadline` in `http2_client.Close` (#8534)

Fixes: #8425
This PR adds a call to `net.Conn.SetWriteDeadline`, as discussed in
https://github.com/grpc/grpc-go/issues/8425#issuecomment-3057938248.
Additionally, it updates the previous call to `SetReadDeadline` to log
any non-nil error value (this doesn't affect behavior but proved helpful
in some earlier debugging).
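
As an illustration of why a write deadline matters here, the sketch below uses `net.Pipe`, whose writes block until the peer reads. The helper names `closeWithDeadline` and `writeTimesOut` are hypothetical and this is not the `http2_client.Close` code; it only shows that a final write on a stalled connection returns with a deadline error instead of hanging.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"time"
)

// closeWithDeadline is a hypothetical helper, not the grpc-go code. It bounds
// the final write with a deadline so that closing cannot hang on a peer that
// has stopped reading, and it logs any error from setting the deadline.
func closeWithDeadline(conn net.Conn, d time.Duration, last []byte) error {
	if err := conn.SetWriteDeadline(time.Now().Add(d)); err != nil {
		fmt.Println("failed to set write deadline:", err)
	}
	_, werr := conn.Write(last)
	conn.Close()
	return werr
}

// writeTimesOut reports whether a write to a peer that never reads fails with
// a deadline error rather than blocking forever.
func writeTimesOut() bool {
	client, server := net.Pipe() // unbuffered: writes block until read
	defer server.Close()
	err := closeWithDeadline(client, 50*time.Millisecond, []byte("goaway"))
	return errors.Is(err, os.ErrDeadlineExceeded)
}

func main() {
	fmt.Println("write timed out:", writeTimesOut())
}
```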

RELEASE NOTES:
* client: Set a read deadline when closing a transport to prevent it
from blocking indefinitely on a broken connection.

2 months agoxds/clusterimpl: Convert existing unit tests to e2e style (1/N) (#8549)
Pranjali-2501 [Mon, 29 Sep 2025 05:39:47 +0000 (11:09 +0530)]
xds/clusterimpl: Convert existing unit tests to e2e style (1/N) (#8549)

3 months agopickfirstleaf: Fix shuffling of addresses in resolver updates without endpoints ...
Arjan Singh Bal [Fri, 26 Sep 2025 05:38:27 +0000 (11:08 +0530)]
pickfirstleaf: Fix shuffling of addresses in resolver updates without endpoints (#8610)

The new `pick_first`, which is the default, doesn't shuffle the
addresses at all for resolver updates that are missing the `Endpoints`
field. This change fixes that. Since [gRPC automatically sets the
missing
`Endpoints`](https://github.com/grpc/grpc-go/blob/1059e84f885bf7ed65b3b1a4fbe914360d8ab5b1/resolver_wrapper.go#L136-L138),
occurrence of this bug should be uncommon in practice.
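
A rough sketch of the intended behavior, assuming a hypothetical helper `orderAddrs` that receives both forms of a resolver update. It is not the pickfirst implementation; it only shows shuffling being applied to the flat address list when no endpoints are present.

```go
package main

import (
	"fmt"
	"math/rand"
)

// orderAddrs is a hypothetical helper. It returns the addresses in the order
// they would be tried, shuffling when asked to, whether the update carried
// endpoints or only a flat address list.
func orderAddrs(endpoints [][]string, addrs []string, shuffle bool) []string {
	if len(endpoints) > 0 {
		eps := append([][]string(nil), endpoints...)
		if shuffle {
			rand.Shuffle(len(eps), func(i, j int) { eps[i], eps[j] = eps[j], eps[i] })
		}
		var out []string
		for _, ep := range eps {
			out = append(out, ep...)
		}
		return out
	}
	// No endpoints: shuffle the flat list too, which is the case the fix covers.
	out := append([]string(nil), addrs...)
	if shuffle {
		rand.Shuffle(len(out), func(i, j int) { out[i], out[j] = out[j], out[i] })
	}
	return out
}

func main() {
	fmt.Println(orderAddrs(nil, []string{"a:1", "b:1", "c:1"}, true))
}
```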

RELEASE NOTES:
* balancer/pick_first: When configured, shuffle addresses in resolver
updates that lack endpoints. Since gRPC automatically adds endpoints to
resolver updates, this bug should only affect implementers of custom LB
policies that use pick_first for delegation but don't forward the
endpoints.

3 months agoxds: Fix log level and message (#8608)
Arjan Singh Bal [Thu, 25 Sep 2025 18:10:56 +0000 (23:40 +0530)]
xds: Fix log level and message (#8608)

RELEASE NOTES: N/A

3 months agoexamples/features/health: Clarify docs for health import (#8597)
Evan Jones [Thu, 25 Sep 2025 17:53:20 +0000 (13:53 -0400)]
examples/features/health: Clarify docs for health import (#8597)

The google.golang.org/grpc/health package must be imported for client
health checking to work. I somehow missed this, even though it is in the
README, the client example, and the health package docs. Attempt to make
it clearer with a few extra mentions, since it is quite hard to debug
this misconfiguration.

* Remove deprecated grpc.WithBlock function
* Make service config const since it isn't modified

Attempts to clarify Issue #8590.

RELEASE NOTES: N/A

3 months agoxdsclient: improve fallback test involving three servers (#8604)
Easwar Swaminathan [Thu, 25 Sep 2025 17:15:27 +0000 (10:15 -0700)]
xdsclient: improve fallback test involving three servers (#8604)

The existing fallback test that involves three servers is flaky. The
reason for the flake is because some of the resources have the same name
in different servers. The listener resource is expected to have the same
name across the different management servers, but we generally expect
the other resources to have different names.

See the following from the gRFC:
- In
https://github.com/grpc/proposal/blob/master/A71-xds-fallback.md#reservations-about-using-the-fallback-server-data,
we have the following:
```
We have no guarantee that a combination of resources from different xDS servers form a valid cohesive
configuration, so we cannot make this determination on a per-resource basis. We need any given gRPC
channel or server listener to only use the resources from a single server.
```
- In
https://github.com/grpc/proposal/blob/master/A71-xds-fallback.md#config-tears,
we have the following:
```
Config tears happen when the client winds up using some combination of resources from the primary and
fallback servers at the same time, even though that combination of resources was never validated to work
together. In theory, this can cause correctness issues where we might send traffic to the wrong location or
the wrong way, or it can cause RPCs to fail. Note that this can happen only when the primary and fallback
server use the same resource names.
```

This PR ensures that all the different management servers have different
resource names for all resources except the listener. Also, ran the test
on forge 100K times with no failures.

This PR also improves a couple of logs that I found useful when
debugging the failures.

RELEASE NOTES: none

3 months agoopentelemetry: Remove chatty log in client (#8606)
Arjan Singh Bal [Thu, 25 Sep 2025 03:03:46 +0000 (08:33 +0530)]
opentelemetry: Remove chatty log in client (#8606)

Removing this debug log to reduce noise. This log fires on every RPC
call but provides no useful debugging value. The action it logs (adding
callInfo to the context) is part of the normal flow, and the message
contains no helpful variables.

RELEASE NOTES: N/A

3 months agobenchmark: Hold read+write lock while updating server state (#8601)
Arjan Singh Bal [Tue, 23 Sep 2025 19:37:31 +0000 (01:07 +0530)]
benchmark: Hold read+write lock while updating server state (#8601)

The `lastResetTime` and `rusageLastReset` fields in the
`benchmarkServer` are written while holding a read lock. This can result
in concurrent modifications. This change replaces the `RWMutex` with a
regular `Mutex` to avoid such problems. This lock is acquired a couple
of times during the entire test run, so contention is not a major
concern.
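
A minimal sketch of the pattern, with a hypothetical `benchStats` type standing in for `benchmarkServer` and a single `int64` standing in for both fields. The point it illustrates is that a method which may write must hold an exclusive lock, so a plain `sync.Mutex` is the simpler choice.

```go
package main

import (
	"fmt"
	"sync"
)

// benchStats is a hypothetical reduction of the benchmark server state.
type benchStats struct {
	mu        sync.Mutex // guards lastReset
	lastReset int64      // stand-in for lastResetTime and rusageLastReset
}

// mark returns the previous reset point and, when reset is true, moves it to
// now. The possible write is why a read lock is not sufficient here.
func (s *benchStats) mark(now int64, reset bool) int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	prev := s.lastReset
	if reset {
		s.lastReset = now
	}
	return prev
}

func main() {
	s := &benchStats{}
	var wg sync.WaitGroup
	for i := int64(1); i <= 4; i++ {
		wg.Add(1)
		go func(now int64) {
			defer wg.Done()
			s.mark(now, true) // concurrent resets are serialized by mu
		}(i)
	}
	wg.Wait()
	fmt.Println("last reset:", s.mark(0, false))
}
```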

RELEASE NOTES: N/A

3 months agoencoding: Add a test-only function for temporarily registering compressors (#8587)
Arjan Singh Bal [Mon, 22 Sep 2025 19:29:26 +0000 (00:59 +0530)]
encoding: Add a test-only function for temporarily registering compressors (#8587)

Fixes: https://github.com/grpc/grpc-go/issues/7960
This PR adds a function that allows tests to register a compressor with
arbitrary names and un-register them at the end of the test. This
prevents the compressor names from showing up in the encoding header in
subsequent tests. Previously, tests were using the name of the existing
compressor "gzip" and re-registering the original compressor to
work around this problem.

RELEASE NOTES: N/A

3 months agoxdsclient: fix TestConcurrentReportLoad to not run for 10s (#8598)
Easwar Swaminathan [Mon, 22 Sep 2025 17:49:08 +0000 (10:49 -0700)]
xdsclient: fix TestConcurrentReportLoad to not run for 10s (#8598)

While working on the fix for the xDS client unsubscribe/resubscribe
race, I noticed that the tests in the `internal/xds/xdsclient/tests/`
directory were taking about a minute to run. Upon inspection I found
that `TestConcurrentReportLoad` was running for the configured test
timeout duration of `10s`, but was not failing.

This PR fixes the test to run in a short duration. It also makes a
couple of other cleanups that I noticed when fixing this test.

RELEASE NOTES: none

---------

Co-authored-by: eshitachandwani <59800922+eshitachandwani@users.noreply.github.com>
3 months agoxdsclient/tests: move fallback tests to separate directory (#8600)
Easwar Swaminathan [Mon, 22 Sep 2025 17:38:00 +0000 (10:38 -0700)]
xdsclient/tests: move fallback tests to separate directory (#8600)

Currently, tests in the `internal/xds/xdsclient/tests` package can take
close to a minute to run. Almost half of that time is taken by the
fallback tests which actually have to run longer because they have to
wait for connections to go down and come up and for these events to be
detected by the code (before fallback is triggered).

Splitting the fallback tests into a separate directory almost reduces
the time by half since tests from these two packages can now run in
parallel.

We *could* possibly add a way for tests to add some dial options (to be
used when dialing the management server), and thereby reduce the time
spent in exponential backoff before connections are reattempted (during
the fallback process). But this would require a non-trivial amount of
work, and could make the code more complicated. The change in this PR
seems like a good bang for the buck.

RELEASE NOTES: none

3 months agoflowcontrol: change variable names for better understanding (#8578)
Icarus Wu [Fri, 19 Sep 2025 18:29:33 +0000 (02:29 +0800)]
flowcontrol: change variable names for better understanding (#8578)

This PR aims to improve some variable names for better understanding.
Before the change, it took time for users to think about why there's a
`b` variable.

RELEASE NOTES: N/A

Signed-off-by: Icarus Wu <icaruswu66@qq.com>
3 months agobenchmark: Avoid spawning a goroutine per unary call (#8591)
Arjan Singh Bal [Fri, 19 Sep 2025 03:24:11 +0000 (08:54 +0530)]
benchmark: Avoid spawning a goroutine per unary call (#8591)

The benchmark client is presently spawning a new goroutine per unary
call and blocking on its completion. Since the spawning goroutine is
blocked, it is more efficient to do the work in the spawning goroutine
itself. This change has the following effect on the [benchmark
performance](https://grafana-dot-grpc-testing.appspot.com/):
1. Unary 8-core: 184k QPS to 233k QPS (+26%)
2. Unary 30-core: 403k QPS to 624k QPS (+54%)
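
The two patterns being compared can be sketched as follows. `unaryCall` is a hypothetical stand-in for one blocking unary RPC; the sketch shows the shape of the change only and says nothing about the measured numbers above.

```go
package main

import "fmt"

// unaryCall is a hypothetical stand-in for one blocking unary RPC.
func unaryCall(i int) int { return i * 2 }

// spawnAndWait mirrors the old pattern: start a goroutine for the call and
// block until it finishes.
func spawnAndWait(i int) int {
	done := make(chan int)
	go func() { done <- unaryCall(i) }()
	return <-done
}

// inline mirrors the new pattern: make the call on the current goroutine.
func inline(i int) int { return unaryCall(i) }

func main() {
	fmt.Println(spawnAndWait(21), inline(21))
}
```

Both functions return the same result; the second simply avoids a goroutine start and a channel handoff per call.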

## Tested
* Ran the benchmark on the same GKE cluster to repro the results from
the dashboard.
* Created a docker image with the changes in this PR. Re-ran the
benchmark with the new image.

RELEASE NOTES: N/A

3 months agovet: add line numbers of offending lines to the output (#8593)
Easwar Swaminathan [Thu, 18 Sep 2025 20:26:50 +0000 (13:26 -0700)]
vet: add line numbers of offending lines to the output (#8593)

When vet fails because of offending whitespace, the output currently
only lists the offending file. This change adds the line number to the
output to make it easier on the developer to fix the issue.

RELEASE NOTE: n/a

3 months agocredentials: Remove TODO from public godoc (#8589)
Arjan Singh Bal [Thu, 18 Sep 2025 14:43:42 +0000 (20:13 +0530)]
credentials: Remove TODO from public godoc (#8589)

The TODO comment with a Github user's name shows up in the [public
godoc](https://pkg.go.dev/google.golang.org/grpc@v1.75.1/credentials#PerRPCCredentials).
Since this is a stable API, changing it now doesn't seem feasible, so
this change removes it completely.

RELEASE NOTES: N/A

3 months agodeps: update dependencies for all modules (#8588)
Pranjali-2501 [Wed, 17 Sep 2025 19:17:57 +0000 (00:47 +0530)]
deps: update dependencies for all modules (#8588)

3 months agoChange version to 1.77.0-dev (#8586)
Pranjali-2501 [Wed, 17 Sep 2025 08:37:49 +0000 (14:07 +0530)]
Change version to 1.77.0-dev (#8586)

3 months agoclient: minor improvements to log messages (#8564)
Easwar Swaminathan [Wed, 17 Sep 2025 02:30:47 +0000 (19:30 -0700)]
client: minor improvements to log messages (#8564)

Couple of minor improvements to log messages from the gRPC channel

The improvements are:
- Log the target URI when we log a message for the creation of a gRPC
channel
- Separate the channelz identifier (which could be something like
`[Channel #X]` or `[Channel X][Subchannel Y]` etc) from the actual
message being logged with a space

RELEASE NOTES: none

3 months agocredentials: implement file-based JWT Call Credentials (part 1 for A97) (#8431)
Dimitar Pavlov [Tue, 16 Sep 2025 07:20:07 +0000 (08:20 +0100)]
credentials: implement file-based JWT Call Credentials (part 1 for A97) (#8431)

Part one for https://github.com/grpc/proposal/pull/492 (A97).
This is done in a new `credentials/jwt` package to provide file-based
`PerRPCCredentials`. It can be used beyond XDS. The package handles
token reloading, caching, and validation as per A97.

There will be a separate PR which uses it in `xds/bootstrap`.

Whilst implementing the above, I considered `credentials/oauth` and
`credentials/xds` packages instead of creating a new one. The former
package has `NewJWTAccessFromKey` and `jwtAccess` which seem very
relevant at first. However, I think the `jwtAccess` behaviour seems more
tailored towards Google services. Also, the refresh, caching, and error
behaviour for A97 is quite different than what's already there and
therefore a separate implementation would have still made sense.
WRT `credentials/xds`, it could have been extended to both handle
transport and call credentials. However, this is a bit at odds with A97
which says that the implementation should be non-XDS specific and, from
reading between the lines, usable beyond XDS.
I think the current approach makes review easier but because of the
similarities with the other two packages, it is a bit confusing to
navigate. Please let me know whether the structure should change.

Relates to https://github.com/istio/istio/issues/53532

RELEASE NOTES:
- credentials: Add `credentials/jwt` package providing file-based JWT
PerRPCCredentials (A97).

3 months agoxds/resolver_test: fix flaky test ResolverBadServiceUpdate_NACKedWithoutCache (#8521)
Elric [Mon, 15 Sep 2025 12:56:31 +0000 (21:56 +0900)]
xds/resolver_test: fix flaky test ResolverBadServiceUpdate_NACKedWithoutCache (#8521)

Fixes: #8435
### Root cause of the issue:
- I think there was a race condition in the channel communication between
the xDS resolver and the test infrastructure
- Insufficient buffer size: the original channels (`stateCh` and `errCh`) had
a buffer size of only 1
- Blocking sends: when the buffer is full, the resolver would block trying
to send the next update
- Test deadlock: the test infra might be waiting for a specific update while
the resolver was blocked trying to send a different update, creating a
deadlock

### Changes
1) Increased buffer size (1 → 10):
``` go
  stateCh := make(chan resolver.State, 10)
  errCh := make(chan error, 10)
```

2) Non-blocking send pattern:
``` go
  select {
  case stateCh <- s:  // the resolver tries to send the update
  default:            // if the channel is full, drain an old message and retry
      select {
      case <-stateCh:
          stateCh <- s
      default:
      }
  }
```
- This drains old messages, which prevents the resolver from blocking and keeps only the latest updates.

3) Cleanup with draining goroutines:
``` go
  go func() {
      for range stateCh { }  // Drain any remaining messages
  }()
```
- This ensures the resolver never blocks on sends and prevents goroutine leaks during test cleanup.
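
For reference, a self-contained version of the drop-oldest send pattern from change 2. The function name `sendLatest` is hypothetical, and the sketch assumes a buffered channel with a single sender; with several concurrent senders the retry loop could discard more than one old value.

```go
package main

import "fmt"

// sendLatest delivers v without blocking on a full buffer. If the buffer is
// full it discards the oldest queued value and retries. It assumes ch is
// buffered and that there is only one sender.
func sendLatest(ch chan int, v int) {
	for {
		select {
		case ch <- v:
			return
		default:
		}
		select {
		case <-ch: // drop the oldest value, then retry the send
		default:
		}
	}
}

func main() {
	ch := make(chan int, 2)
	for i := 1; i <= 5; i++ {
		sendLatest(ch, i)
	}
	close(ch)
	for v := range ch {
		fmt.Println(v) // prints 4, then 5
	}
}
```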

RELEASE NOTES: N/A

3 months agointernal/buffer: set closed flag when closing channel in the Load method (#8575)
tsukiyoz [Mon, 15 Sep 2025 08:37:10 +0000 (16:37 +0800)]
internal/buffer: set closed flag when closing channel in the Load method (#8575)

## Description

This PR fixes a bug in the `Unbounded.Load()` method where the `closed`
flag was not being set to `true` when the channel was closed.

## Problem

In the `Load()` method, when the condition `b.closing && !b.closed` is
met, the code closes the channel but doesn't update the `closed` flag.
This creates an inconsistent state where:
- The channel is closed (no more data can be sent)
- But `b.closed` remains `false`

This inconsistency could potentially cause issues in code that relies on
the `closed` flag to determine the buffer's state.

## Solution

Added `b.closed = true` before `close(b.c)` in the `else if` branch of
the `Load()` method to ensure the closed flag accurately reflects the
buffer's state.
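
To make the inconsistency concrete, here is a simplified buffer modeled on the description above. The type `unbounded` and its fields are a hypothetical reduction, not a copy of `internal/buffer/unbounded.go`. In this sketch, leaving out the `b.closed = true` line in `Load` would make a second `Load` call close the channel twice and panic.

```go
package main

import (
	"fmt"
	"sync"
)

// unbounded is a simplified, hypothetical buffer; it is not the grpc-go source.
type unbounded struct {
	mu      sync.Mutex
	c       chan int
	backlog []int
	closing bool // Close was called; c is closed once the backlog drains
	closed  bool // c has actually been closed
}

func newUnbounded() *unbounded { return &unbounded{c: make(chan int, 1)} }

// Put hands v to the channel when nothing is queued, else appends to the backlog.
func (b *unbounded) Put(v int) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.closing {
		return
	}
	if len(b.backlog) == 0 {
		select {
		case b.c <- v:
			return
		default:
		}
	}
	b.backlog = append(b.backlog, v)
}

// Load moves the next queued value into the channel. The reader calls it
// after each receive.
func (b *unbounded) Load() {
	b.mu.Lock()
	defer b.mu.Unlock()
	if len(b.backlog) > 0 {
		select {
		case b.c <- b.backlog[0]:
			b.backlog = b.backlog[1:]
		default:
		}
	} else if b.closing && !b.closed {
		b.closed = true // keep the flag in step with the channel
		close(b.c)
	}
}

// Close closes the channel now if nothing is queued, else defers to Load.
func (b *unbounded) Close() {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.closing {
		return
	}
	b.closing = true
	if len(b.backlog) == 0 {
		b.closed = true
		close(b.c)
	}
}

func main() {
	b := newUnbounded()
	b.Put(1)
	b.Put(2)
	b.Close()
	fmt.Println(<-b.c)
	b.Load()
	fmt.Println(<-b.c)
	b.Load() // backlog is empty: closes the channel and sets closed
	b.Load() // safe only because closed is now true
	fmt.Println("closed:", b.closed)
}
```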

## Changes

- **File**: `internal/buffer/unbounded.go`
- **Method**: `Load()`
- **Line**: 86
- **Change**: Added `b.closed = true` before closing the channel

## Testing

- ✅ All existing tests pass
- ✅ No linter errors introduced
- ✅ The fix ensures consistent state between channel closure and closed
flag

## Impact

This is a bug fix that improves the correctness of the `Unbounded`
buffer implementation without changing its public API or behavior from a
user perspective.

Fixes: https://github.com/grpc/grpc-go/issues/8572
RELEASE NOTES: None

3 months agoencoding/proto: enable use cached size option (#8569)
Roy Salame [Mon, 15 Sep 2025 05:21:51 +0000 (01:21 -0400)]
encoding/proto: enable use cached size option (#8569)

Enable UseCachedSize in proto marshal to eliminate redundant size
computation

Fixes: https://github.com/grpc/grpc-go/issues/8570
The proto message size was previously being computed twice: once before
marshalling and again during the marshalling call itself. In
high-throughput workloads, this duplicated computation is expensive.

By enabling `UseCachedSize` on `MarshalOptions`, we reuse the size
calculated immediately before marshalling, avoiding the second call to
`proto.Size`.

In our application, the redundant size call accounted for ~12% of total
CPU time. With this change, we eliminate that overhead while preserving
correctness.

RELEASE NOTES:
- encoding/proto: Avoid redundant message size calculation when
marshalling.

3 months agotransport: avoid slice reallocation during header creation (#8547)
Arjan Singh Bal [Thu, 11 Sep 2025 08:32:34 +0000 (14:02 +0530)]
transport: avoid slice reallocation during header creation (#8547)

This PR improves the size estimate while pre-allocating `headerFields`
to avoid reallocations, which pprof showed were responsible for ~4% of
total memory allocations. This change improves performance, increasing
QPS by 1% while reducing bytes/op by 4% and latencies by 0.3-4%.
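
The general technique is to size the slice from an estimate before appending. The sketch below is an assumption-laden illustration: `headerField`, `buildHeaders`, and the fixed count of two pseudo-headers are hypothetical and do not reflect the actual estimate used in the transport.

```go
package main

import "fmt"

type headerField struct{ Name, Value string }

// buildHeaders is a hypothetical sketch. It sizes the slice up front from an
// estimate of the final length so the appends never grow the backing array.
func buildHeaders(md map[string][]string) []headerField {
	const fixed = 2 // pseudo-headers added below; the real count differs
	n := fixed
	for _, vals := range md {
		n += len(vals)
	}
	fields := make([]headerField, 0, n)
	fields = append(fields,
		headerField{":method", "POST"},
		headerField{":scheme", "http"},
	)
	for k, vals := range md {
		for _, v := range vals {
			fields = append(fields, headerField{k, v})
		}
	}
	return fields
}

func main() {
	h := buildHeaders(map[string][]string{"x-a": {"1", "2"}, "x-b": {"3"}})
	fmt.Println(len(h), cap(h)) // prints 5 5: no reallocation was needed
}
```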

## Tested
```sh
go run benchmark/benchresult/main.go unary-before unary-after
unary-networkMode_Local-bufConn_false-keepalive_false-benchTime_1m0s-trace_false-latency_0s-kbps_0-MTU_0-maxConcurrentCalls_120-reqSize_1024B-respSize_1024B-compressor_off-channelz_false-preloader_false-clientReadBufferSize_-1-clientWriteBufferSize_-1-serverReadBufferSize_-1-serverWriteBufferSize_-1-sleepBetweenRPCs_0s-connections_1-recvBufferPool_simple-sharedWriteBuffer_false
               Title       Before        After Percentage
            TotalOps      6327736      6390728     1.00%
             SendOps            0            0      NaN%
             RecvOps            0            0      NaN%
            Bytes/op     13903.23     13354.55    -3.95%
           Allocs/op       156.22       155.23    -0.64%
             ReqT/op 863946888.53 872547396.27     1.00%
            RespT/op 863946888.53 872547396.27     1.00%
            50th-Lat    1.00991ms   1.006914ms    -0.30%
            90th-Lat   1.678329ms   1.610331ms    -4.05%
            99th-Lat   2.517556ms   2.497122ms    -0.81%
             Avg-Lat   1.136117ms   1.125311ms    -0.95%
           GoVersion     go1.24.4     go1.24.4
         GrpcVersion   1.76.0-dev   1.76.0-dev
```

RELEASE NOTES:
* client: Improve header slice length estimate to reduce re-allocations.

3 months agoRevert "stats/opentelemetry: record retry attempts from clientStream (#8342)" (#8571)
eshitachandwani [Wed, 10 Sep 2025 17:01:05 +0000 (22:31 +0530)]
Revert "stats/opentelemetry: record retry attempts from clientStream (#8342)" (#8571)

This introduced flakiness in a test -
Test/TraceSpan_WithRetriesAndNameResolutionDelay
Failure:
https://github.com/grpc/grpc-go/actions/runs/17614152882/job/50042942932?pr=8547

Related issue: https://github.com/grpc/grpc-go/issues/8299

RELEASE NOTES: None

3 months agostats/opentelemetry: record retry attempts from clientStream (#8342)
vinothkumarr227 [Wed, 10 Sep 2025 06:57:54 +0000 (12:27 +0530)]
stats/opentelemetry: record retry attempts from clientStream (#8342)

Fixes: https://github.com/grpc/grpc-go/issues/8299
RELEASE NOTES:

- stats/opentelemetry: Retry attempts (`grpc.previous-rpc-attempts`) are
now recorded as span attributes for non-transparent client retries.

3 months agoGoogleC2P: remove dependency on metadata server for IPv6 node metadata (#8550)
apolcyn [Mon, 8 Sep 2025 20:05:26 +0000 (13:05 -0700)]
GoogleC2P: remove dependency on metadata server for IPv6 node metadata (#8550)

Remove reliance on the metadata server since its result is no longer
needed; hardcode IPv6 support in node metadata instead.

Related c++ change: https://github.com/grpc/grpc/pull/40571

Note we preserve prior behavior in case experiment `NewPickFirstEnabled`
is disabled, because our testing/qualification has not covered that
being disabled.

Related: internal issue b/407587619

RELEASE NOTES: n/a

3 months agoxds: move env var check for HTTP CONNECT metadata parsing to endpoint and locality...
Easwar Swaminathan [Fri, 5 Sep 2025 22:20:05 +0000 (15:20 -0700)]
xds: move env var check for HTTP CONNECT metadata parsing to endpoint and locality parsing functions (#8551)

Currently, the env var check for parsing HTTP CONNECT metadata (A86) is
inside the function that parses custom metadata,
`validateAndConstructMetadata`.

This PR moves the check to the endpoint and locality parsing functions,
`parseEndpoint` and the top-level `parseEDSRespProto` which is where
localities are parsed. This allows multiple env vars to control
different custom metadata keys. We already support two custom metadata
keys (A76 and A86) and we plan to support more (A83).

This PR also ensures that the custom metadata used for ring_hash key
(A76) uses the recently added `StructMetadataValue` type. This ensures
that metadata parsing happens only once.

Since the location of the env var check is moved, the tests are also
restructured a little. This PR groups the custom metadata parsing tests
into three groups: one for success cases when the env var is turned on,
one for success cases when the env var is turned off, and one for
failure cases when the env var is turned on.

RELEASE NOTES: none

3 months agopriority: use new-style atomic APIs (#8558)
Easwar Swaminathan [Fri, 5 Sep 2025 06:04:46 +0000 (23:04 -0700)]
priority: use new-style atomic APIs (#8558)

Use new-style atomic APIs instead of the old ones in the
`ignoreResolveNowClientConn` type.

The changes made in this PR improve the code in the following ways:
* Ergonomics: Method-based API vs function-based, no pointer management
needed
* Safety: Type safety prevents mixing atomic/non-atomic operations,
eliminates pointer errors
* Clarity: The `atomic.Uint32` type makes atomic intent explicit from
declaration
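
A side-by-side sketch of the two API styles. The types `oldFlag` and `newFlag` are hypothetical and do not mirror `ignoreResolveNowClientConn`; they only contrast the function-based calls with the `atomic.Uint32` methods.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// oldFlag uses the function-based API: the field is a plain uint32, so nothing
// stops a caller from reading it without going through the atomic package.
type oldFlag struct{ v uint32 }

func (f *oldFlag) set(on bool) {
	var n uint32
	if on {
		n = 1
	}
	atomic.StoreUint32(&f.v, n)
}

func (f *oldFlag) get() bool { return atomic.LoadUint32(&f.v) != 0 }

// newFlag uses atomic.Uint32: every access goes through a method, and the
// declaration itself states that the field is atomic.
type newFlag struct{ v atomic.Uint32 }

func (f *newFlag) set(on bool) {
	var n uint32
	if on {
		n = 1
	}
	f.v.Store(n)
}

func (f *newFlag) get() bool { return f.v.Load() != 0 }

func main() {
	o, n := &oldFlag{}, &newFlag{}
	o.set(true)
	n.set(true)
	fmt.Println(o.get(), n.get())
}
```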

RELEASE NOTES: none

3 months agoclient: handle 1xx HTTP status HEADERS (#8518)
vinothkumarr227 [Thu, 4 Sep 2025 20:19:45 +0000 (01:49 +0530)]
client: handle 1xx HTTP status HEADERS (#8518)

Fixes: https://github.com/grpc/grpc-go/issues/8485
RELEASE NOTES:
* client: Ignore http headers with status 1xx and `END_STREAM` flag
unset.
* client: Fail RPCs with status `INTERNAL` instead of `UNKNOWN` on
receiving http headers with status 1xx and `END_STREAM` flag set.
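
The two release notes amount to a small decision rule, sketched below. The function `classify` and the `action` values are hypothetical names chosen for illustration; this is not the transport code.

```go
package main

import "fmt"

type action int

const (
	process      action = iota // handle the headers normally
	ignore                     // skip informational headers and keep waiting
	failInternal               // fail the RPC with status INTERNAL
)

// classify is a hypothetical helper that encodes the rule described above for
// a HEADERS frame with the given HTTP status and END_STREAM flag.
func classify(httpStatus int, endStream bool) action {
	if httpStatus >= 100 && httpStatus < 200 {
		if endStream {
			return failInternal
		}
		return ignore
	}
	return process
}

func main() {
	fmt.Println(classify(103, false) == ignore)
	fmt.Println(classify(103, true) == failInternal)
	fmt.Println(classify(200, false) == process)
}
```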

3 months agogithub,test: fix internal CI build (#8556)
Arjan Singh Bal [Thu, 4 Sep 2025 16:06:30 +0000 (21:36 +0530)]
github,test: fix internal CI build (#8556)

Fix the following errors:
* only use one space to separate words
* use service codegen import name for stream type

RELEASE NOTES: N/A