0xmirror/grpc-go.git
2 months ago xds/clusterimpl: Convert existing unit tests to e2e style (1/N) (#8549)
Pranjali-2501 [Mon, 29 Sep 2025 05:39:47 +0000 (11:09 +0530)]
xds/clusterimpl: Convert existing unit tests to e2e style (1/N) (#8549)

3 months ago pickfirstleaf: Fix shuffling of addresses in resolver updates without endpoints ...
Arjan Singh Bal [Fri, 26 Sep 2025 05:38:27 +0000 (11:08 +0530)]
pickfirstleaf: Fix shuffling of addresses in resolver updates without endpoints (#8610)

The new `pick_first`, which is the default, doesn't shuffle the
addresses at all for resolver updates that are missing the `Endpoints`
field. This change fixes that. Since [gRPC automatically sets the
missing
`Endpoints`](https://github.com/grpc/grpc-go/blob/1059e84f885bf7ed65b3b1a4fbe914360d8ab5b1/resolver_wrapper.go#L136-L138),
this bug should occur rarely in practice.

RELEASE NOTES:
* balancer/pick_first: When configured, shuffle addresses in resolver
updates that lack endpoints. Since gRPC automatically adds endpoints to
resolver updates, this bug should only affect implementers of custom LB
policies that use pick_first for delegation but don't forward the
endpoints.
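
A minimal sketch of the behavior described above, assuming a simplified `update` type and a `shuffled` helper that are both hypothetical (they are not the grpc-go `resolver.State` type or the pick_first implementation). It only illustrates the idea that the flat address list is shuffled when no endpoints are present:

```go
package main

import (
	"fmt"
	"math/rand"
)

// update is a hypothetical stand-in for a resolver update.
type update struct {
	Endpoints [][]string // each endpoint groups one or more addresses
	Addresses []string   // flat list, used when Endpoints is empty
}

// shuffled returns the addresses in the order they would be tried.
// When shuffle is set, endpoints are shuffled if present; otherwise the
// flat address list is shuffled (the case the fix above addresses).
func shuffled(u update, shuffle bool) []string {
	var out []string
	if len(u.Endpoints) > 0 {
		eps := append([][]string(nil), u.Endpoints...)
		if shuffle {
			rand.Shuffle(len(eps), func(i, j int) { eps[i], eps[j] = eps[j], eps[i] })
		}
		for _, ep := range eps {
			out = append(out, ep...)
		}
		return out
	}
	out = append(out, u.Addresses...)
	if shuffle {
		rand.Shuffle(len(out), func(i, j int) { out[i], out[j] = out[j], out[i] })
	}
	return out
}

func main() {
	u := update{Addresses: []string{"10.0.0.1:443", "10.0.0.2:443", "10.0.0.3:443"}}
	fmt.Println(shuffled(u, true))
}
```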

3 months ago xds: Fix log level and message (#8608)
Arjan Singh Bal [Thu, 25 Sep 2025 18:10:56 +0000 (23:40 +0530)]
xds: Fix log level and message (#8608)

RELEASE NOTES: N/A

3 months ago examples/features/health: Clarify docs for health import (#8597)
Evan Jones [Thu, 25 Sep 2025 17:53:20 +0000 (13:53 -0400)]
examples/features/health: Clarify docs for health import (#8597)

The google.golang.org/grpc/health package must be imported for client
health checking to work. I somehow missed this, even though it is in the
README, the client example, and the health package docs. Attempt to make
it clearer with a few extra mentions, since it is quite hard to debug
this misconfiguration.

* Remove deprecated grpc.WithBlock function
* Make service config const since it isn't modified

Attempts to clarify Issue #8590.

RELEASE NOTES: N/A

3 months ago xdsclient: improve fallback test involving three servers (#8604)
Easwar Swaminathan [Thu, 25 Sep 2025 17:15:27 +0000 (10:15 -0700)]
xdsclient: improve fallback test involving three servers (#8604)

The existing fallback test that involves three servers is flaky. The
flake occurs because some of the resources have the same name
in different servers. The listener resource is expected to have the same
name across the different management servers, but we generally expect
the other resources to have different names.

See the following from the gRFC:
- In
https://github.com/grpc/proposal/blob/master/A71-xds-fallback.md#reservations-about-using-the-fallback-server-data,
we have the following:
```
We have no guarantee that a combination of resources from different xDS servers form a valid cohesive
configuration, so we cannot make this determination on a per-resource basis. We need any given gRPC
channel or server listener to only use the resources from a single server.
```
- In
https://github.com/grpc/proposal/blob/master/A71-xds-fallback.md#config-tears,
we have the following:
```
Config tears happen when the client winds up using some combination of resources from the primary and
fallback servers at the same time, even though that combination of resources was never validated to work
together. In theory, this can cause correctness issues where we might send traffic to the wrong location or
the wrong way, or it can cause RPCs to fail. Note that this can happen only when the primary and fallback
server use the same resource names.
```

This PR ensures that all the different management servers have different
resource names for all resources except the listener. The test was also
run on forge 100K times with no failures.

This PR also improves a couple of logs that I found useful when
debugging the failures.

RELEASE NOTES: none

3 months ago opentelemetry: Remove chatty log in client (#8606)
Arjan Singh Bal [Thu, 25 Sep 2025 03:03:46 +0000 (08:33 +0530)]
opentelemetry: Remove chatty log in client (#8606)

Removing this debug log to reduce noise. This log fires on every RPC
call but provides no useful debugging value. The action it logs (adding
callInfo to the context) is part of the normal flow, and the message
contains no helpful variables.

RELEASE NOTES: N/A

3 months ago benchmark: Hold read+write lock while updating server state (#8601)
Arjan Singh Bal [Tue, 23 Sep 2025 19:37:31 +0000 (01:07 +0530)]
benchmark: Hold read+write lock while updating server state (#8601)

The `lastResetTime` and `rusageLastReset` fields in the
`benchmarkServer` are written while holding a read lock. This can result
in concurrent modifications. This change replaces the `RWMutex` with a
regular `Mutex` to avoid such problems. This lock is acquired a couple
of times during the entire test run, so contention is not a major
concern.

RELEASE NOTES: N/A

3 months ago encoding: Add a test-only function for temporarily registering compressors (#8587)
Arjan Singh Bal [Mon, 22 Sep 2025 19:29:26 +0000 (00:59 +0530)]
encoding: Add a test-only function for temporarily registering compressors (#8587)

Fixes: https://github.com/grpc/grpc-go/issues/7960
This PR adds a function that allows tests to register a compressor with
arbitrary names and un-register them at the end of the test. This
prevents the compressor names from showing up in the encoding header in
subsequent tests. Previously, tests were using the name of the existing
compressor "gzip" and re-registering the original compressor to
work around this problem.
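
The register-then-unregister pattern can be sketched as follows. The `registry` map and `registerForTest` function are hypothetical names chosen for illustration and are not the function added by this PR; in a real test the returned function would typically be passed to `t.Cleanup`:

```go
package main

import "fmt"

// registry is a hypothetical compressor registry keyed by name.
var registry = map[string]string{"gzip": "builtin gzip"}

// registerForTest registers impl under name and returns a function that
// restores the registry to its previous state.
func registerForTest(name, impl string) (unregister func()) {
	prev, existed := registry[name]
	registry[name] = impl
	return func() {
		if existed {
			registry[name] = prev
			return
		}
		delete(registry, name)
	}
}

func main() {
	unregister := registerForTest("test-compressor", "fake")
	fmt.Println(len(registry)) // 2 while the test runs
	unregister()
	fmt.Println(len(registry)) // 1 after cleanup
}
```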

RELEASE NOTES: N/A

3 months ago xdsclient: fix TestConcurrentReportLoad to not run for 10s (#8598)
Easwar Swaminathan [Mon, 22 Sep 2025 17:49:08 +0000 (10:49 -0700)]
xdsclient: fix TestConcurrentReportLoad to not run for 10s (#8598)

While working on the fix for the xDS client unsubscribe/resubscribe
race, I noticed that the tests in the `internal/xds/xdsclient/tests/`
directory were taking about a minute to run. Upon inspection I found
that `TestConcurrentReportLoad` was running for the configured test
timeout duration of `10s`, but was not failing.

This PR fixes the test to run in a short duration. It also makes a
couple of other cleanups that I noticed when fixing this test.

RELEASE NOTES: none

---------

Co-authored-by: eshitachandwani <59800922+eshitachandwani@users.noreply.github.com>
3 months ago xdsclient/tests: move fallback tests to separate directory (#8600)
Easwar Swaminathan [Mon, 22 Sep 2025 17:38:00 +0000 (10:38 -0700)]
xdsclient/tests: move fallback tests to separate directory (#8600)

Currently, tests in the `internal/xds/xdsclient/tests` package can take
close to a minute to run. Almost half of that time is taken by the
fallback tests which actually have to run longer because they have to
wait for connections to go down and come up and for these events to be
detected by the code (before fallback is triggered).

Splitting the fallback tests into a separate directory reduces the time
by almost half, since tests from these two packages can now run in
parallel.

We *could* possibly add a way for tests to add some dial options (to be
used when dialing the management server), and thereby reduce the time
spent in exponential backoff before connections are reattempted (during
the fallback process). But this would require a non-trivial amount of
work, and could make the code more complicated. The change in this PR
seems like a good bang for the buck.

RELEASE NOTES: none

3 months ago flowcontrol: change variable names for better understanding (#8578)
Icarus Wu [Fri, 19 Sep 2025 18:29:33 +0000 (02:29 +0800)]
flowcontrol: change variable names for better understanding (#8578)

This PR aims to improve some variable names for better understanding.
Before the change, readers had to stop and work out what the `b`
variable was for.

RELEASE NOTES: N/A

Signed-off-by: Icarus Wu <icaruswu66@qq.com>
3 months ago benchmark: Avoid spawning a goroutine per unary call (#8591)
Arjan Singh Bal [Fri, 19 Sep 2025 03:24:11 +0000 (08:54 +0530)]
benchmark: Avoid spawning a goroutine per unary call (#8591)

The benchmark client is presently spawning a new goroutine per unary
call and blocking on its completion. Since the spawning goroutine is
blocked, it is more efficient to do the work in the spawning goroutine
itself. This change has the following effect on the [benchmark
performance](https://grafana-dot-grpc-testing.appspot.com/):
1. Unary 8-core: 184k QPS to 233k QPS (+26%)
2. Unary 30-core: 403k QPS to 624k QPS (+54%)
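
The two shapes being compared can be sketched as below. `unaryCall` is a hypothetical placeholder for one blocking RPC, and the functions are illustrative only (they are not the benchmark client's code). Both produce the same result; the second avoids creating and scheduling a goroutine whose caller would only wait for it:

```go
package main

import (
	"fmt"
	"sync"
)

// unaryCall is a hypothetical placeholder for one blocking unary RPC.
func unaryCall(i int) int { return i * 2 }

// viaGoroutine spawns a goroutine per call and blocks until it finishes.
func viaGoroutine(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		var wg sync.WaitGroup
		wg.Add(1)
		go func() {
			defer wg.Done()
			sum += unaryCall(i)
		}()
		wg.Wait() // the spawning goroutine is blocked here anyway
	}
	return sum
}

// inline does the same work directly in the calling goroutine.
func inline(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += unaryCall(i)
	}
	return sum
}

func main() {
	fmt.Println(viaGoroutine(1000) == inline(1000)) // true
}
```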

## Tested
* Ran the benchmark on the same GKE cluster to repro the results from
the dashboard.
* Created a docker image with the changes in this PR. Re-ran the
benchmark with the new image.

RELEASE NOTES: N/A

3 months ago vet: add line numbers of offending lines to the output (#8593)
Easwar Swaminathan [Thu, 18 Sep 2025 20:26:50 +0000 (13:26 -0700)]
vet: add line numbers of offending lines to the output (#8593)

When vet fails because of offending whitespace, the output currently
only lists the offending file. This change adds the line number to the
output to make it easier for the developer to fix the issue.
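
To illustrate the difference between reporting a file and reporting `file:line`, here is a small sketch. It assumes "offending whitespace" means trailing spaces or tabs, which the commit message does not specify, and `trailingWhitespace` is a hypothetical helper rather than the vet script itself:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// trailingWhitespace returns "name:line" for every line of content that
// ends in a space or tab. The definition of "offending" is an assumption.
func trailingWhitespace(name, content string) []string {
	var hits []string
	sc := bufio.NewScanner(strings.NewReader(content))
	for n := 1; sc.Scan(); n++ {
		line := sc.Text()
		if line != strings.TrimRight(line, " \t") {
			hits = append(hits, fmt.Sprintf("%s:%d", name, n))
		}
	}
	return hits
}

func main() {
	src := "package x\nvar a = 1 \n\nvar b = 2\t\n"
	fmt.Println(trailingWhitespace("x.go", src)) // [x.go:2 x.go:4]
}
```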

RELEASE NOTE: n/a

3 months ago credentials: Remove TODO from public godoc (#8589)
Arjan Singh Bal [Thu, 18 Sep 2025 14:43:42 +0000 (20:13 +0530)]
credentials: Remove TODO from public godoc (#8589)

The TODO comment with a GitHub user's name shows up in the [public
godoc](https://pkg.go.dev/google.golang.org/grpc@v1.75.1/credentials#PerRPCCredentials).
Since this is a stable API, changing it now doesn't seem feasible, so
this change removes it completely.

RELEASE NOTES: N/A

3 months ago deps: update dependencies for all modules (#8588)
Pranjali-2501 [Wed, 17 Sep 2025 19:17:57 +0000 (00:47 +0530)]
deps: update dependencies for all modules (#8588)

3 months ago Change version to 1.77.0-dev (#8586)
Pranjali-2501 [Wed, 17 Sep 2025 08:37:49 +0000 (14:07 +0530)]
Change version to 1.77.0-dev (#8586)

3 months ago client: minor improvements to log messages (#8564)
Easwar Swaminathan [Wed, 17 Sep 2025 02:30:47 +0000 (19:30 -0700)]
client: minor improvements to log messages (#8564)

A couple of minor improvements to log messages from the gRPC channel.

The improvements are:
- Log the target URI when we log a message for the creation of a gRPC
channel
- Separate the channelz identifier (which could be something like
`[Channel #X]` or `[Channel X][Subchannel Y]` etc) from the actual
message being logged with a space

RELEASE NOTES: none

3 months ago credentials: implement file-based JWT Call Credentials (part 1 for A97) (#8431)
Dimitar Pavlov [Tue, 16 Sep 2025 07:20:07 +0000 (08:20 +0100)]
credentials: implement file-based JWT Call Credentials (part 1 for A97) (#8431)

Part one for https://github.com/grpc/proposal/pull/492 (A97).
This is done in a new `credentials/jwt` package to provide file-based
PerRPCCredentials. It can be used beyond XDS. The package handles
token reloading, caching, and validation as per A97.

There will be a separate PR which uses it in `xds/bootstrap`.

Whilst implementing the above, I considered `credentials/oauth` and
`credentials/xds` packages instead of creating a new one. The former
package has `NewJWTAccessFromKey` and `jwtAccess` which seem very
relevant at first. However, I think the `jwtAccess` behaviour seems more
tailored towards Google services. Also, the refresh, caching, and error
behaviour for A97 is quite different than what's already there and
therefore a separate implementation would have still made sense.
With respect to `credentials/xds`, it could have been extended to handle both
transport and call credentials. However, this is a bit at odds with A97
which says that the implementation should be non-XDS specific and, from
reading between the lines, usable beyond XDS.
I think the current approach makes review easier but because of the
similarities with the other two packages, it is a bit confusing to
navigate. Please let me know whether the structure should change.

Relates to https://github.com/istio/istio/issues/53532

RELEASE NOTES:
- credentials: Add `credentials/jwt` package providing file-based JWT
PerRPCCredentials (A97).

3 months ago xds/resolver_test: fix flaky test ResolverBadServiceUpdate_NACKedWithoutCache (#8521)
Elric [Mon, 15 Sep 2025 12:56:31 +0000 (21:56 +0900)]
xds/resolver_test: fix flaky test ResolverBadServiceUpdate_NACKedWithoutCache (#8521)

Fixes: #8435
### Root cause of issue
- I think there was a race condition in the channel communication between
the xDS resolver and the test infrastructure.
- Insufficient buffer size: the original channels (`stateCh` and `errCh`)
had a buffer size of only 1.
- Blocking sends: when the buffer is full, the resolver blocks trying
to send the next update.
- Test deadlock: the test infra might be waiting for a specific update while
the resolver is blocked trying to send a different update, creating a
deadlock.

### Changes
1) Increased buffer size (1 → 10):
``` go
  stateCh := make(chan resolver.State, 10)
  errCh := make(chan error, 10)
```

2) Non-blocking send pattern:
``` go
  select {
  case stateCh <- s:  // the resolver tries to send the update
  default:            // if the channel is full, drain the oldest message and retry
      select {
      case <-stateCh:
          stateCh <- s
      default:
      }
  }
```
- Draining old messages keeps the resolver from blocking and retains only the latest updates.

3) Cleanup with draining goroutines:
``` go
  go func() {
      for range stateCh { }  // Drain any remaining messages
  }()
```
- This ensures the resolver never blocks on sends and prevents goroutine leaks during test cleanup.
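
A self-contained variant of the drain-and-retry idea from change 2, for experimentation. `sendLatest` is a hypothetical helper; unlike the snippet in the PR description, it also makes the retry itself non-blocking, which is a deliberate variation and not necessarily what the test does:

```go
package main

import "fmt"

// sendLatest never blocks: if ch is full it discards the oldest queued
// value and then tries once more to enqueue v.
func sendLatest(ch chan int, v int) {
	select {
	case ch <- v:
	default:
		select {
		case <-ch: // drop the oldest value to make room
		default:
		}
		select {
		case ch <- v:
		default: // another sender refilled the slot; give up rather than block
		}
	}
}

func main() {
	ch := make(chan int, 2)
	for i := 1; i <= 5; i++ {
		sendLatest(ch, i)
	}
	close(ch)
	for v := range ch {
		fmt.Println(v) // prints 4, then 5
	}
}
```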

RELEASE NOTES: N/A

3 months ago internal/buffer: set closed flag when closing channel in the Load method (#8575)
tsukiyoz [Mon, 15 Sep 2025 08:37:10 +0000 (16:37 +0800)]
internal/buffer: set closed flag when closing channel in the Load method (#8575)

## Description

This PR fixes a bug in the `Unbounded.Load()` method where the `closed`
flag was not being set to `true` when the channel was closed.

## Problem

In the `Load()` method, when the condition `b.closing && !b.closed` is
met, the code closes the channel but doesn't update the `closed` flag.
This creates an inconsistent state where:
- The channel is closed (no more data can be sent)
- But `b.closed` remains `false`

This inconsistency could potentially cause issues in code that relies on
the `closed` flag to determine the buffer's state.

## Solution

Added `b.closed = true` before `close(b.c)` in the `else if` branch of
the `Load()` method to ensure the closed flag accurately reflects the
buffer's state.
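
The state fix can be illustrated with a simplified model. The `unbounded` type below is a hypothetical reduction written for this example and is not the code in `internal/buffer/unbounded.go`; it only shows the `closing`/`closed` bookkeeping described above:

```go
package main

import (
	"fmt"
	"sync"
)

// unbounded is a simplified, hypothetical model of an unbounded buffer.
type unbounded struct {
	mu      sync.Mutex
	c       chan int
	backlog []int
	closing bool // Close was requested while items were still queued
	closed  bool // the channel has actually been closed
}

// load moves the next backlog item to the channel, or closes the channel
// once the backlog is empty and a close was requested.
func (b *unbounded) load() {
	b.mu.Lock()
	defer b.mu.Unlock()
	if len(b.backlog) > 0 {
		select {
		case b.c <- b.backlog[0]:
			b.backlog = b.backlog[1:]
		default:
		}
	} else if b.closing && !b.closed {
		b.closed = true // keep the flag in sync with the channel state
		close(b.c)
	}
}

func main() {
	b := &unbounded{c: make(chan int, 1), closing: true}
	b.load()
	b.load() // no-op; in this sketch, omitting the flag would close twice and panic
	_, ok := <-b.c
	fmt.Println(b.closed, ok) // true false
}
```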

## Changes

- **File**: `internal/buffer/unbounded.go`
- **Method**: `Load()`
- **Line**: 86
- **Change**: Added `b.closed = true` before closing the channel

## Testing

- ✅ All existing tests pass
- ✅ No linter errors introduced
- ✅ The fix ensures consistent state between channel closure and closed
flag

## Impact

This is a bug fix that improves the correctness of the `Unbounded`
buffer implementation without changing its public API or behavior from a
user perspective.

Fixes: https://github.com/grpc/grpc-go/issues/8572
RELEASE NOTES: None

3 months ago encoding/proto: enable use cached size option (#8569)
Roy Salame [Mon, 15 Sep 2025 05:21:51 +0000 (01:21 -0400)]
encoding/proto: enable use cached size option (#8569)

Enable UseCachedSize in proto marshal to eliminate redundant size
computation

Fixes: https://github.com/grpc/grpc-go/issues/8570
The proto message size was previously being computed twice: once before
marshalling and again during the marshalling call itself. In
high-throughput workloads, this duplicated computation is expensive.

By enabling `UseCachedSize` on `MarshalOptions`, we reuse the size
calculated immediately before marshalling, avoiding the second call to
`proto.Size`.

In our application, the redundant size call accounted for ~12% of total
CPU time. With this change, we eliminate that overhead while preserving
correctness.
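
The effect of reusing a cached size can be modeled without the protobuf library. Everything below (`message`, `size`, `marshal`, `sizeCalls`) is a hypothetical stand-in that counts size computations; it is not the protobuf-go API, where the corresponding switch is the `UseCachedSize` field on `MarshalOptions` named in the commit message:

```go
package main

import "fmt"

// message is a hypothetical model, not a protobuf type.
type message struct {
	fields     []string
	cachedSize int
	sizeCalls  int // counts how often the size is computed
}

// size walks the message, caches the result, and returns it.
func (m *message) size() int {
	m.sizeCalls++
	n := 0
	for _, f := range m.fields {
		n += len(f)
	}
	m.cachedSize = n
	return n
}

// marshal recomputes the size unless useCachedSize is set.
func (m *message) marshal(useCachedSize bool) []byte {
	n := m.cachedSize
	if !useCachedSize {
		n = m.size()
	}
	buf := make([]byte, 0, n)
	for _, f := range m.fields {
		buf = append(buf, f...)
	}
	return buf
}

func main() {
	m := &message{fields: []string{"hello", "world"}}
	m.size()        // the caller sizes the message to allocate a buffer
	m.marshal(true) // marshal trusts the size computed just before
	fmt.Println(m.sizeCalls) // 1
}
```

The cached value is only safe to reuse when the message has not changed since the size was computed, which is why the size call sits immediately before the marshal call.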

RELEASE NOTES:
- encoding/proto: Avoid redundant message size calculation when
marshalling.

3 months ago transport: avoid slice reallocation during header creation (#8547)
Arjan Singh Bal [Thu, 11 Sep 2025 08:32:34 +0000 (14:02 +0530)]
transport: avoid slice reallocation during header creation (#8547)

This PR improves the size estimate while pre-allocating `headerFields`
to avoid reallocations, which pprof showed were responsible for ~4% of
total memory allocations. This change improves performance, increasing
QPS by 1% while reducing bytes/op by 4% and latencies by 0.3-4%.
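
The general technique is to count the entries first and allocate the slice with that capacity. The sketch below uses a hypothetical `headerField` type and `buildHeaders` function (not the transport's actual code or its estimate) to show that an exact estimate leaves `append` with nothing to reallocate:

```go
package main

import "fmt"

// headerField is a hypothetical name/value pair.
type headerField struct{ Name, Value string }

// buildHeaders sizes the slice up front from the fixed headers plus every
// metadata value, so the appends below never grow the backing array.
func buildHeaders(fixed []headerField, md map[string][]string) []headerField {
	n := len(fixed)
	for _, vs := range md {
		n += len(vs)
	}
	hf := make([]headerField, 0, n)
	hf = append(hf, fixed...)
	for k, vs := range md {
		for _, v := range vs {
			hf = append(hf, headerField{k, v})
		}
	}
	return hf
}

func main() {
	fixed := []headerField{{":method", "POST"}, {"content-type", "application/grpc"}}
	md := map[string][]string{"x-trace": {"a", "b"}, "x-user": {"u"}}
	hf := buildHeaders(fixed, md)
	fmt.Println(len(hf), cap(hf)) // 5 5
}
```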

## Tested
```sh
go run benchmark/benchresult/main.go unary-before unary-after
unary-networkMode_Local-bufConn_false-keepalive_false-benchTime_1m0s-trace_false-latency_0s-kbps_0-MTU_0-maxConcurrentCalls_120-reqSize_1024B-respSize_1024B-compressor_off-channelz_false-preloader_false-clientReadBufferSize_-1-clientWriteBufferSize_-1-serverReadBufferSize_-1-serverWriteBufferSize_-1-sleepBetweenRPCs_0s-connections_1-recvBufferPool_simple-sharedWriteBuffer_false
               Title       Before        After Percentage
            TotalOps      6327736      6390728     1.00%
             SendOps            0            0      NaN%
             RecvOps            0            0      NaN%
            Bytes/op     13903.23     13354.55    -3.95%
           Allocs/op       156.22       155.23    -0.64%
             ReqT/op 863946888.53 872547396.27     1.00%
            RespT/op 863946888.53 872547396.27     1.00%
            50th-Lat    1.00991ms   1.006914ms    -0.30%
            90th-Lat   1.678329ms   1.610331ms    -4.05%
            99th-Lat   2.517556ms   2.497122ms    -0.81%
             Avg-Lat   1.136117ms   1.125311ms    -0.95%
           GoVersion     go1.24.4     go1.24.4
         GrpcVersion   1.76.0-dev   1.76.0-dev
```

RELEASE NOTES:
* client: Improve header slice length estimate to reduce re-allocations.

3 months ago Revert "stats/opentelemetry: record retry attempts from clientStream (#8342)" (#8571)
eshitachandwani [Wed, 10 Sep 2025 17:01:05 +0000 (22:31 +0530)]
Revert "stats/opentelemetry: record retry attempts from clientStream (#8342)" (#8571)

This introduced flakiness in a test -
Test/TraceSpan_WithRetriesAndNameResolutionDelay
Failure:
https://github.com/grpc/grpc-go/actions/runs/17614152882/job/50042942932?pr=8547

Related issue: https://github.com/grpc/grpc-go/issues/8299

RELEASE NOTES: None

3 months ago stats/opentelemetry: record retry attempts from clientStream (#8342)
vinothkumarr227 [Wed, 10 Sep 2025 06:57:54 +0000 (12:27 +0530)]
stats/opentelemetry: record retry attempts from clientStream (#8342)

Fixes: https://github.com/grpc/grpc-go/issues/8299
RELEASE NOTES:

- stats/opentelemetry: Retry attempts (`grpc.previous-rpc-attempts`) are
now recorded as span attributes for non-transparent client retries.

3 months ago GoogleC2P: remove dependency on metadata server for IPv6 node metadata (#8550)
apolcyn [Mon, 8 Sep 2025 20:05:26 +0000 (13:05 -0700)]
GoogleC2P: remove dependency on metadata server for IPv6 node metadata (#8550)

Remove reliance on the metadata server since its result is no longer
needed; hardcode IPv6 support in node metadata instead.

Related C++ change: https://github.com/grpc/grpc/pull/40571

Note we preserve prior behavior in case experiment `NewPickFirstEnabled`
is disabled, because our testing/qualification has not covered that
being disabled.

Related: internal issue b/407587619

RELEASE NOTES: n/a

3 months ago xds: move env var check for HTTP CONNECT metadata parsing to endpoint and locality...
Easwar Swaminathan [Fri, 5 Sep 2025 22:20:05 +0000 (15:20 -0700)]
xds: move env var check for HTTP CONNECT metadata parsing to endpoint and locality parsing functions (#8551)

Currently, the env var check for parsing HTTP CONNECT metadata (A86) is
inside the function that parses custom metadata,
`validateAndConstructMetadata`.

This PR moves the check to the endpoint and locality parsing functions,
`parseEndpoint` and the top-level `parseEDSRespProto` which is where
localities are parsed. This allows multiple env vars to control
different custom metadata keys. We already support two custom metadata
keys (A76 and A86) and we plan to support more (A83).

This PR also ensures that the custom metadata used for ring_hash key
(A76) uses the recently added `StructMetadataValue` type. This ensures
that metadata parsing happens only once.

Since the location of the env var check is moved, the tests are also
restructured a little. This PR groups the custom metadata parsing tests
into three groups: one for success cases when the env var is turned on,
one for success cases when the env var is turned off, and one for
failure cases when the env var is turned on.

RELEASE NOTES: none

3 months ago priority: use new-style atomic APIs (#8558)
Easwar Swaminathan [Fri, 5 Sep 2025 06:04:46 +0000 (23:04 -0700)]
priority: use new-style atomic APIs (#8558)

Use new-style atomic APIs instead of the old ones in the
`ignoreResolveNowClientConn` type.

The changes made in this PR improve the code in the following ways:
* Ergonomics: Method-based API vs function-based, no pointer management
needed
* Safety: Type safety prevents mixing atomic/non-atomic operations,
eliminates pointer errors
* Clarity: The `atomic.Uint32` type makes atomic intent explicit from
declaration
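
The difference between the two API styles, sketched on hypothetical `oldStyle` and `newStyle` types (not the `ignoreResolveNowClientConn` type itself). The function-based form requires passing a pointer to a plain `uint32` that could also be accessed non-atomically by mistake; the `atomic.Uint32` type (Go 1.19+) only exposes atomic methods:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// oldStyle uses the function-based API on a plain integer field.
type oldStyle struct{ ignore uint32 }

func (o *oldStyle) set(v uint32) { atomic.StoreUint32(&o.ignore, v) }
func (o *oldStyle) get() uint32  { return atomic.LoadUint32(&o.ignore) }

// newStyle uses the typed, method-based API.
type newStyle struct{ ignore atomic.Uint32 }

func (n *newStyle) set(v uint32) { n.ignore.Store(v) }
func (n *newStyle) get() uint32  { return n.ignore.Load() }

func main() {
	var o oldStyle
	var n newStyle
	o.set(1)
	n.set(1)
	fmt.Println(o.get(), n.get()) // 1 1
}
```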

RELEASE NOTES: none

3 months ago client: handle 1xx HTTP status HEADERS (#8518)
vinothkumarr227 [Thu, 4 Sep 2025 20:19:45 +0000 (01:49 +0530)]
client: handle 1xx HTTP status HEADERS (#8518)

Fixes: https://github.com/grpc/grpc-go/issues/8485
RELEASE NOTES:
* client: Ignore HTTP headers with status 1xx and `END_STREAM` flag
unset.
* client: Fail RPCs with status `INTERNAL` instead of `UNKNOWN` on
receiving HTTP headers with status 1xx and `END_STREAM` flag set.
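
The two release notes amount to a small decision table, sketched here. The `action` type and `classify` function are hypothetical names for illustration and do not correspond to identifiers in the client transport:

```go
package main

import "fmt"

// action is a hypothetical description of what the client does with a
// HEADERS frame.
type action int

const (
	process      action = iota // handle as a normal response header
	ignore                     // informational header, keep waiting
	failInternal               // end the RPC with status INTERNAL
)

// classify applies the rules from the release notes above.
func classify(httpStatus int, endStream bool) action {
	if httpStatus < 100 || httpStatus >= 200 {
		return process
	}
	if endStream {
		return failInternal
	}
	return ignore
}

func main() {
	fmt.Println(classify(103, false) == ignore)      // true
	fmt.Println(classify(100, true) == failInternal) // true
	fmt.Println(classify(200, false) == process)     // true
}
```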

3 months ago github,test: fix internal CI build (#8556)
Arjan Singh Bal [Thu, 4 Sep 2025 16:06:30 +0000 (21:36 +0530)]
github,test: fix internal CI build (#8556)

Fix the following errors:
* only use one space to separate words
* use service codegen import name for stream type

RELEASE NOTES: N/A

3 months ago transport: allow stream cancellation on the server when blocked on flow control ...
Arjan Singh Bal [Thu, 4 Sep 2025 06:28:10 +0000 (11:58 +0530)]
transport: allow stream cancellation on the server when blocked on flow control (#8528)

Fixes: #8517
This change allows `t.closeStream()` to be executed even if the stream
state is `done`. This is required to allow streams to be cancelled or
timed out. See the issue for the detailed root cause.

RELEASE NOTES:
* server: Fix bug preventing streams from being cancelled or timed out
when blocked on flow control.

3 months ago xdsclient: Fix race in SetWatchExpiryTimeoutForTesting (#8526)
eshitachandwani [Sat, 30 Aug 2025 14:24:14 +0000 (19:54 +0530)]
xdsclient: Fix race in SetWatchExpiryTimeoutForTesting (#8526)

Fixes: #8525
There is a race in
[SetWatchExpiryTimeoutForTesting](https://github.com/grpc/grpc-go/blob/fa0d6583208033fe4f69d359f80286736fd121d0/internal/xds/clients/xdsclient/xdsclient.go#L121)
which is used to override the watch expiry timeout of XDSClient for
testing. Currently it just sets the watchExpiryTimeout of the XDSClient
to the provided value without a mutex each time we call
[NewClientForTesting](https://github.com/grpc/grpc-go/blob/fa0d6583208033fe4f69d359f80286736fd121d0/internal/xds/xdsclient/pool.go#L116C16-L116C35)
which might or might not create a new XDSClient if one is already there.

Fix: Add a new field `WatchExpiryTimeout` to the xdsclient
[config](https://github.com/grpc/grpc-go/blob/30645d521be375d13fa4cb2baa0d2561ca44c342/internal/xds/clients/xdsclient/xdsconfig.go#L28)
which will now be used instead of `internal.WatchExpiryTimeout`.

RELEASE NOTES: None

3 months ago xds: add metadata registry (#8537)
cjqzhao [Fri, 29 Aug 2025 16:57:00 +0000 (09:57 -0700)]
xds: add metadata registry (#8537)

Following
[A83](https://github.com/grpc/proposal/blob/master/A83-xds-gcp-authn-filter.md)
and
[A86](https://github.com/grpc/proposal/blob/master/A86-xds-http-connect.md),
this adds a registry for custom metadata received in xDS protos for the
purpose of converting the received metadata into internal
representations.

RELEASE NOTES: n/a

3 months ago xds/resolver: change tests to update all resources (#8539)
eshitachandwani [Fri, 29 Aug 2025 03:48:05 +0000 (09:18 +0530)]
xds/resolver: change tests to update all resources (#8539)

Change the tests in xds resolver to update all resources in management
server instead of only listener and route resource.

This change is being done as part of gRFC [A74: xDS Config
tears](https://github.com/grpc/proposal/blob/master/A74-xds-config-tears.md).
This is to make sure the tests pass after the change too.

RELEASE NOTES: None

4 months ago xdsclient: create LRSClient at time of initialisation (#8483)
eshitachandwani [Tue, 26 Aug 2025 05:29:33 +0000 (10:59 +0530)]
xdsclient: create LRSClient at time of initialisation (#8483)

Fixes: https://github.com/grpc/grpc-go/issues/8474
The race is in
[ReportLoad](https://github.com/grpc/grpc-go/blob/9186ebd774370e3b3232d1b202914ff8fc2c56d6/xds/internal/xdsclient/clientimpl_loadreport.go#L35C2-L44C21)
function of clientImpl. The implementation was recently changed as
part of the [xds client
migration](https://github.com/grpc/grpc-go/commit/082a9275c79a9d78fdaa4a93018e5e53a4a3af18).

The
[comment](https://github.com/grpc/grpc-go/blob/85240a5b02defe7b653ccba66866b4370c982b6a/xds/internal/xdsclient/clientimpl.go#L86C2-L87C16)
says that `lrsclient.LRSClient` should be initialized only at creation
time, but that was not the case: it was being initialized when the
`ReportLoad` function was called.

RELEASE NOTES:

- lrsclient:
  - Fix a race condition where the `LRSClient` was not initialized at
    creation time but was instead initialized when the `ReportLoad`
    function was called.
  - Creating an `LRSClient` no longer requires a node ID.

4 months ago client: Roll-forward PR #8278(with changes): Restore the existing behavior to return...
Pranjali-2501 [Mon, 25 Aug 2025 19:24:23 +0000 (00:54 +0530)]
client: Roll-forward PR #8278(with changes): Restore the existing behavior to return io.EOF on repeated RecvMsg() calls for client-streaming RPCs (#8523)

Partially addresses: https://github.com/grpc/grpc-go/issues/7286

This reverts commit
https://github.com/grpc/grpc-go/commit/20bd1e7dfad3535675e3dff3d73fa894256f1f79

Changes:
- Modifies client.RecvMsg() so that successive calls after the stream ends
return io.EOF.
- Adds extra state to track calls to client.recvmsg (required to return
a cardinality violation only in the case of zero responses).

RELEASE NOTES:
* client: Return status code INTERNAL when a server sends 0 response
messages for a unary or client streaming RPC.

4 months ago CONTRIBUTING.md: minor text tweaks for linters (#8533)
Doug Fawley [Fri, 22 Aug 2025 15:49:27 +0000 (08:49 -0700)]
CONTRIBUTING.md: minor text tweaks for linters (#8533)

4 months ago xdsclient: revert #8369: delay resource cache deletion (#8527)
Easwar Swaminathan [Thu, 21 Aug 2025 17:52:40 +0000 (10:52 -0700)]
xdsclient: revert #8369: delay resource cache deletion (#8527)

The change being reverted here (#8369) is a prime suspect for a race
that can show up with the following sequence of events:
- create a new gRPC channel with the `xds:///` scheme
- make an RPC
- close the channel
- repeat (possibly from multiple goroutines)

The observable behavior from the race is that the xDS client thinks that
a Listener resource is removed by the control plane when it clearly is
not. This results in the user's gRPC channel moving to TRANSIENT_FAILURE
and subsequent RPC failures.

The above-mentioned PR is not being rolled back using `git
revert` because the xds directory structure has changed significantly
since the PR was originally merged. Manually performing the
revert seemed much easier.

RELEASE NOTES:
* xdsclient: Revert a change that introduces a race with xDS resource
processing, leading to RPC failures

4 months ago github: add PR template (#8524)
Doug Fawley [Thu, 21 Aug 2025 16:53:13 +0000 (09:53 -0700)]
github: add PR template (#8524)

4 months ago transport: ensure header mutex is held while copying trailers in handler_server ...
Arjan Singh Bal [Thu, 21 Aug 2025 06:50:13 +0000 (12:20 +0530)]
transport: ensure header mutex is held while copying trailers in handler_server (#8519)

Fixes: https://github.com/grpc/grpc-go/issues/8514
The mutex that guards the trailers should be held while copying the
trailers. We do lock the mutex in [the regular gRPC server
transport](https://github.com/grpc/grpc-go/blob/9ac0ec87ca2ecc66b3c0c084708aef768637aef6/internal/transport/http2_server.go#L1140-L1142),
but have missed it in the std lib http/2 transport. The only place where
a write happens in `writeStatus()` is when the status contains a proto.

https://github.com/grpc/grpc-go/blob/4375c784450aa7e43ff15b8b2879c896d0917130/internal/transport/handler_server.go#L251-L252
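
The locking discipline described above can be sketched with a hypothetical `stream` type (not the `handler_server` transport's types): the same mutex is held both when a trailer is written and when the trailers are copied, so a concurrent write cannot race with the copy:

```go
package main

import (
	"fmt"
	"sync"
)

// stream is a hypothetical holder of trailers guarded by hdrMu.
type stream struct {
	hdrMu   sync.Mutex
	trailer map[string][]string
}

// addTrailer writes under the mutex.
func (s *stream) addTrailer(k, v string) {
	s.hdrMu.Lock()
	defer s.hdrMu.Unlock()
	s.trailer[k] = append(s.trailer[k], v)
}

// trailerCopy holds the same mutex while copying, which is the step the
// fix above adds.
func (s *stream) trailerCopy() map[string][]string {
	s.hdrMu.Lock()
	defer s.hdrMu.Unlock()
	out := make(map[string][]string, len(s.trailer))
	for k, v := range s.trailer {
		out[k] = append([]string(nil), v...)
	}
	return out
}

func main() {
	s := &stream{trailer: map[string][]string{}}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); s.addTrailer("grpc-status-details-bin", "details") }()
	go func() { defer wg.Done(); _ = s.trailerCopy() }()
	wg.Wait()
	fmt.Println(len(s.trailerCopy())) // 1
}
```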

RELEASE NOTES:
* transport: Fix a data race while copying headers for stats handlers in
the std lib http2 server transport.

4 months ago deps: bump Go version in Dockerfiles (#8522)
Stanley Cheung [Tue, 19 Aug 2025 21:04:51 +0000 (14:04 -0700)]
deps: bump Go version in Dockerfiles (#8522)

4 months ago xds: move all functionality from `xds/internal` to `internal/xds` (#8515)
eunsang [Tue, 19 Aug 2025 17:05:46 +0000 (02:05 +0900)]
xds: move all functionality from `xds/internal` to `internal/xds` (#8515)

Fixes grpc#7290, ensuring that only user-facing functionality remains in
the top-level xds package.

Updates all import paths and aliases to reference the new internal/xds
package, using aliases (e.g., `internal` → `xds` or `xdsinternal`) where
needed to minimize changes to call sites.

No functional changes intended; this is purely a package path
reorganization.

RELEASE NOTES: none

4 months ago xds/cdsbalancer: increase buffer size of requested resource channel in test (#8467)
eshitachandwani [Mon, 18 Aug 2025 05:15:30 +0000 (10:45 +0530)]
xds/cdsbalancer: increase buffer size of requested resource channel in test (#8467)

RELEASE NOTES: N/A

Fixes: https://github.com/grpc/grpc-go/issues/8462
The main issue was that the requests were getting dropped since we use a
[non-blocking
send](https://github.com/grpc/grpc-go/blob/a5e7cd6d4c2c31b1e6649789c2ddc9a82ad6b5fa/xds/internal/balancer/cdsbalancer/cdsbalancer_test.go#L222C5-L227C6)
for resources in the test along with a buffer size of just
[one](https://github.com/grpc/grpc-go/blob/a5e7cd6d4c2c31b1e6649789c2ddc9a82ad6b5fa/xds/internal/balancer/cdsbalancer/cdsbalancer_test.go#L210)
which resulted in resource request updates being dropped whenever the
receiver was not ready at that exact moment.
Fix:
Changed `setupManagementServer` to take the `listener` and the
`OnStreamReq` function as parameters, and added a blocking send in
`TestWatcher` whenever a cluster resource is requested.

4 months ago grpctest: add test coverages of `ExitIdle` (#8375)
Elric [Fri, 15 Aug 2025 17:57:04 +0000 (02:57 +0900)]
grpctest: add test coverages of `ExitIdle` (#8375)

Fixes: https://github.com/grpc/grpc-go/issues/8118
4 months ago deps: bump go version to 1.24 (#8509)
Kevin Krakauer [Thu, 14 Aug 2025 18:31:22 +0000 (11:31 -0700)]
deps: bump go version to 1.24 (#8509)

4 months ago xdsclient: add an e2e style test for fallback involving more than 2 servers #7817...
vinothkumarr227 [Thu, 14 Aug 2025 05:43:57 +0000 (11:13 +0530)]
xdsclient: add an e2e style test for fallback involving more than 2 servers #7817 (#8427)

Fixes: https://github.com/grpc/grpc-go/issues/7817
4 months ago xdsclient: schedule serializer callback from the authority instead of from the xdsCha...
Easwar Swaminathan [Wed, 13 Aug 2025 18:44:50 +0000 (11:44 -0700)]
xdsclient: schedule serializer callback from the authority instead of from the xdsChannel (#8498)

This is a small code change that simplifies how a callback is scheduled.
The `xdsChannel` will no longer directly access the serializer inside
the `authority` type. Instead, the authority type will now handle the
scheduling itself. This makes the code cleaner and moves the scheduling
logic to where it belongs.

RELEASE NOTES: none

4 months agogrpcsync: use context.AfterFunc to close buffer after context canceled in CallbackSer...
Turfa Auliarachman [Tue, 12 Aug 2025 22:27:23 +0000 (05:27 +0700)]
grpcsync: use context.AfterFunc to close buffer after context canceled in CallbackSerializer (#8489)

[The minimum supported Go version is now
1.23](https://github.com/grpc/grpc-go/blob/62ec29fd9b3f9ea3cea6dc08a31e837aa92678b7/go.mod#L3),
so `context.AfterFunc` is available to everyone using the latest version
of grpc-go. That lets us resolve this pending TODO.

`context.AfterFunc` invokes the given function for both _immediate_
context cancelation and timer-based context cancelation (`WithTimeout`,
`WithDeadline`). So I think this change is safe.

RELEASE NOTES: N/A

4 months agodeps: update github.com/prometheus/client_golang (#8502)
Pranjali-2501 [Tue, 12 Aug 2025 06:41:50 +0000 (12:11 +0530)]
deps: update github.com/prometheus/client_golang (#8502)

This PR updates Prometheus-related dependencies in grpc-go to fix
compatibility issues caused by recent API changes in
github.com/prometheus/otlptranslator.
It complements the broader dependency updates made in PR #8497.

RELEASE NOTES: N/A

4 months agogrpclb: simplify stringifying of IPv6 with net.JoinHostPort (#8503)
Oleksandr Redko [Tue, 12 Aug 2025 06:39:40 +0000 (09:39 +0300)]
grpclb: simplify stringifying of IPv6 with net.JoinHostPort (#8503)

This PR simplifies IP address handling in
`lbBalancer.processServerList`.

From [net.JoinHostPort](https://pkg.go.dev/net#JoinHostPort):

> JoinHostPort combines host and port into a network address of the form
> "host:port". If host contains a colon, as found in literal IPv6
> addresses, then JoinHostPort returns "[host]:port".

RELEASE NOTES: none

4 months agoxdsclient: modify how the resource watch state is retrieved for testing (#8499)
Easwar Swaminathan [Wed, 6 Aug 2025 22:12:34 +0000 (15:12 -0700)]
xdsclient: modify how the resource watch state is retrieved for testing (#8499)

4 months agodeps: update dependencies for all modules (#8497)
Pranjali-2501 [Wed, 6 Aug 2025 17:13:29 +0000 (22:43 +0530)]
deps: update dependencies for all modules (#8497)

4 months agoChange version to 1.76.0-dev
Pranjali-2501 [Wed, 6 Aug 2025 06:37:42 +0000 (12:07 +0530)]
Change version to 1.76.0-dev

4 months agocredentials: fix behavior of grpc.WithAuthority and credential handshake precedence...
Doug Fawley [Tue, 5 Aug 2025 22:04:18 +0000 (15:04 -0700)]
credentials: fix behavior of grpc.WithAuthority and credential handshake precedence (#8488)

4 months agoxds: remove xds client fallback environment variable (#8482)
cjqzhao [Tue, 5 Aug 2025 16:04:11 +0000 (09:04 -0700)]
xds: remove xds client fallback environment variable (#8482)

4 months agogrpc: Fix cardinality violations in non-client streaming RPCs. (#8385)
Pranjali-2501 [Tue, 5 Aug 2025 05:01:52 +0000 (10:31 +0530)]
grpc: Fix cardinality violations in non-client streaming RPCs. (#8385)

4 months agostats: change non-standard units to annotations (#8481)
Arjan Singh Bal [Fri, 1 Aug 2025 04:40:49 +0000 (10:10 +0530)]
stats: change non-standard units to annotations (#8481)

4 months agoupdate deps (#8478)
eshitachandwani [Wed, 30 Jul 2025 08:42:56 +0000 (14:12 +0530)]
update deps (#8478)

4 months agoexamples/opentelemetry: use experimental metrics in example (#8441)
vinothkumarr227 [Wed, 30 Jul 2025 05:52:50 +0000 (11:22 +0530)]
examples/opentelemetry: use experimental metrics in example (#8441)

4 months agoxdsclient: do not process updates from closed server channels (#8389)
Easwar Swaminathan [Tue, 29 Jul 2025 23:26:08 +0000 (16:26 -0700)]
xdsclient: do not process updates from closed server channels (#8389)

4 months agoAllow empty nodeID (#8476)
Sotiris Nanopoulos [Tue, 29 Jul 2025 22:48:53 +0000 (18:48 -0400)]
Allow empty nodeID (#8476)

5 months agocleanup: use slices.Equal to simplify code (#8472)
jishudashu [Thu, 24 Jul 2025 21:25:26 +0000 (05:25 +0800)]
cleanup: use slices.Equal to simplify code (#8472)

5 months agoprotoc-gen-go-grpc: bump golang.org/x/net (#8458)
dependabot[bot] [Thu, 24 Jul 2025 04:44:38 +0000 (10:14 +0530)]
protoc-gen-go-grpc: bump golang.org/x/net (#8458)

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.35.0 to 0.38.0.
- [Commits](https://github.com/golang/net/compare/v0.35.0...v0.38.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-version: 0.38.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Arjan Bal <arjansbal@google.com>

5 months agoxdsclient: delay resource cache deletion to handle immediate re-subscription of same...
Purnesh Dixit [Thu, 24 Jul 2025 04:31:38 +0000 (10:01 +0530)]
xdsclient: delay resource cache deletion to handle immediate re-subscription of same resource (#8369)

5 months agoadvancedtls: avoid txt lookups in test and use test logger instead of Printf (#8469)
Arjan Singh Bal [Tue, 22 Jul 2025 18:41:27 +0000 (00:11 +0530)]
advancedtls: avoid txt lookups in test and use test logger instead of Printf (#8469)

5 months agostats: add DelayedPickComplete and follow correct semantics (#8465)
Doug Fawley [Mon, 21 Jul 2025 18:35:33 +0000 (11:35 -0700)]
stats: add DelayedPickComplete and follow correct semantics (#8465)

5 months agogithub: run arm64 tests without emulation (#8463)
Arjan Singh Bal [Mon, 21 Jul 2025 18:23:22 +0000 (23:53 +0530)]
github: run arm64 tests without emulation (#8463)

5 months agotestutils/roundrobin: Improve validation of WRR distribution (#8459)
Arjan Singh Bal [Mon, 21 Jul 2025 06:03:36 +0000 (11:33 +0530)]
testutils/roundrobin: Improve validation of WRR distribution (#8459)

5 months agotransport: add test case for zero second timeout (#8452)
Doug Fawley [Fri, 18 Jul 2025 16:33:24 +0000 (09:33 -0700)]
transport: add test case for zero second timeout (#8452)

5 months agoxdsclient: typed config better nil checks (#8412)
Burkov Egor [Fri, 18 Jul 2025 15:20:15 +0000 (18:20 +0300)]
xdsclient: typed config better nil checks (#8412)

5 months agoRetract v1.74.0 and v1.74.1 (#8456)
Doug Fawley [Thu, 17 Jul 2025 19:54:20 +0000 (12:54 -0700)]
Retract v1.74.0 and v1.74.1 (#8456)

5 months agoRevert "credentials: allow audience to be configured (#8421) (#8442)" (#8450)
eshitachandwani [Wed, 16 Jul 2025 09:49:22 +0000 (15:19 +0530)]
Revert "credentials: allow audience to be configured (#8421) (#8442)" (#8450)

This reverts commit 7208cdc42397c5ec1cc03f48e2ffd414cc75e635.

5 months agotransport: release mutex before returning on expired deadlines in server streams...
Arjan Singh Bal [Wed, 16 Jul 2025 05:16:15 +0000 (10:46 +0530)]
transport: release mutex before returning on expired deadlines in server streams (#8451)

5 months agoxds: add a test for deadlocks in nested xDS channels (#8448)
Arjan Singh Bal [Tue, 15 Jul 2025 05:55:39 +0000 (11:25 +0530)]
xds: add a test for deadlocks in nested xDS channels (#8448)

5 months agoendpointsharding: shuffle endpoint order before updating children (#8438)
Doug Fawley [Mon, 14 Jul 2025 22:29:30 +0000 (15:29 -0700)]
endpointsharding: shuffle endpoint order before updating children (#8438)

5 months agocredentials: allow audience to be configured (#8421) (#8442)
Chris Staite [Mon, 14 Jul 2025 17:52:09 +0000 (18:52 +0100)]
credentials: allow audience to be configured (#8421) (#8442)

Specifications disagree on whether the method should be included in a JWT audience. For example, #4713 deliberately excluded the method, citing https://google.aip.dev/auth/4111, whereas GCE IAP requires the full URI (https://cloud.google.com/iap/docs/authentication-howto).

To support both, we introduce a new environment variable, GRPC_AUDIENCE_IS_FULL_PATH, that allows the method stripping to be disabled. The default remains the existing behaviour of stripping the method; setting the variable keeps the full path in the audience.
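
The two audience forms can be sketched as follows. This is an illustration only, not the code from this change: the `audience` helper is hypothetical, the method is assumed to be the final path segment of the URI, and parsing the variable as a boolean with `strconv.ParseBool` is an assumption about its accepted values.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
	"path"
	"strconv"
)

// audience returns the JWT audience for a request URI. When fullPath is false,
// the final path segment (assumed here to be the method name) is removed.
func audience(uri string, fullPath bool) (string, error) {
	if fullPath {
		return uri, nil
	}
	u, err := url.Parse(uri)
	if err != nil {
		return "", err
	}
	u.Path = path.Dir(u.Path)
	return u.String(), nil
}

func main() {
	// Assumption: the variable is read as a boolean; unset or unparsable
	// values fall back to false, which keeps the method-stripping default.
	full, _ := strconv.ParseBool(os.Getenv("GRPC_AUDIENCE_IS_FULL_PATH"))

	aud, err := audience("https://example.com/pkg.Service/Method", full)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(aud)
}
```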

5 months agoMove erm-g to Emeritus Maintainer (#8418)
Richard Belleville [Fri, 11 Jul 2025 22:11:52 +0000 (15:11 -0700)]
Move erm-g to Emeritus Maintainer (#8418)

5 months agoRemove inactive maintainers (#8416)
Richard Belleville [Fri, 11 Jul 2025 22:11:39 +0000 (15:11 -0700)]
Remove inactive maintainers (#8416)

5 months agoxds: give up pool lock before closing xdsclient channel (#8445)
Doug Fawley [Fri, 11 Jul 2025 20:28:15 +0000 (13:28 -0700)]
xds: give up pool lock before closing xdsclient channel (#8445)

5 months agoalts: improve alts handshaker error logs (#8444)
Luwei Ge [Thu, 10 Jul 2025 21:07:22 +0000 (14:07 -0700)]
alts: improve alts handshaker error logs (#8444)

* improve alts handshaker error logs

5 months agoserver: allow 0s grpc-timeout header values, as java is known to be able to send...
Doug Fawley [Wed, 9 Jul 2025 14:35:47 +0000 (07:35 -0700)]
server: allow 0s grpc-timeout header values, as java is known to be able to send them (#8439)

5 months agodeps: update dependencies for all modules (#8434)
Arjan Singh Bal [Wed, 9 Jul 2025 05:39:21 +0000 (11:09 +0530)]
deps: update dependencies for all modules (#8434)

5 months agoxds/cdsbalancer: correctly remove the unwanted cds watchers (#8428)
eshitachandwani [Tue, 8 Jul 2025 02:42:35 +0000 (08:12 +0530)]
xds/cdsbalancer: correctly remove the unwanted cds watchers (#8428)

5 months agogrpctest: minor improvements to the test logger implementation (#8370)
Ashesh Vidyut [Mon, 7 Jul 2025 06:25:00 +0000 (11:55 +0530)]
grpctest: minor improvements to the test logger implementation (#8370)

5 months agoxds: cleanup internal testing functions for env vars that have long been removed...
Easwar Swaminathan [Wed, 2 Jul 2025 12:24:14 +0000 (05:24 -0700)]
xds: cleanup internal testing functions for env vars that have long been removed (#8413)

5 months agoxdsclient: relay marshalled bytes of complete resource proto to decoders (#8422)
Purnesh Dixit [Wed, 2 Jul 2025 04:21:43 +0000 (09:51 +0530)]
xdsclient: relay marshalled bytes of complete resource proto to decoders (#8422)

5 months agodns: add environment variable to disable TXT lookups in DNS resolver (#8377)
Doug Fawley [Tue, 1 Jul 2025 21:33:13 +0000 (14:33 -0700)]
dns: add environment variable to disable TXT lookups in DNS resolver (#8377)

5 months agoxds: Avoid error logs when setting fallback bootstrap config (#8419)
Arjan Singh Bal [Tue, 1 Jul 2025 21:08:29 +0000 (02:38 +0530)]
xds: Avoid error logs when setting fallback bootstrap config (#8419)

5 months agoadd grpctester (#8423)
eshitachandwani [Tue, 1 Jul 2025 13:25:23 +0000 (18:55 +0530)]
add grpctester (#8423)

6 months agogithub: delete mergeable configuration (#8415)
Doug Fawley [Thu, 26 Jun 2025 19:36:11 +0000 (12:36 -0700)]
github: delete mergeable configuration (#8415)

6 months agogithub: Restrict repo contents permissions to read-only in pr-validation (#8414)
Doug Fawley [Thu, 26 Jun 2025 19:36:00 +0000 (12:36 -0700)]
github: Restrict repo contents permissions to read-only in pr-validation (#8414)

6 months agoxdsclient: preserve original bytes for decoding when the resource is wrapped (#8411)
Easwar Swaminathan [Wed, 25 Jun 2025 10:50:29 +0000 (03:50 -0700)]
xdsclient: preserve original bytes for decoding when the resource is wrapped (#8411)

6 months agoChange version to 1.75.0-dev (#8409)
eshitachandwani [Wed, 25 Jun 2025 07:14:42 +0000 (12:44 +0530)]
Change version to 1.75.0-dev (#8409)

6 months agoxdsclient: export genericResourceTypeDecoder (#8406)
Easwar Swaminathan [Wed, 25 Jun 2025 04:07:51 +0000 (21:07 -0700)]
xdsclient: export genericResourceTypeDecoder (#8406)

6 months agoxdsclient: make a function to return the supported resource type implementations...
Easwar Swaminathan [Tue, 24 Jun 2025 06:00:14 +0000 (23:00 -0700)]
xdsclient: make a function to return the supported resource type implementations (#8405)

6 months agogrpc: revert #8278: Fix cardinality violations in non-server streaming RPCs (#8404)
Arjan Singh Bal [Mon, 23 Jun 2025 18:50:59 +0000 (00:20 +0530)]
grpc: revert #8278: Fix cardinality violations in non-server streaming RPCs (#8404)

This reverts commit a64d9333afdba82e81468c7a1c8b56070af13ff7.

6 months agoexamples/opentelemetry: demonstrate enabling experimental metrics (#8388)
vinothkumarr227 [Mon, 23 Jun 2025 05:46:35 +0000 (11:16 +0530)]
examples/opentelemetry: demonstrate enabling experimental metrics (#8388)

6 months agooutlierdetection: cleanup temporary pickfirst health listener attribute (#8402)
Arjan Singh Bal [Thu, 19 Jun 2025 05:50:35 +0000 (11:20 +0530)]
outlierdetection: cleanup temporary pickfirst health listener attribute (#8402)

6 months agostub: Add child balancer in stub.BalancerData (#8393)
vinothkumarr227 [Thu, 19 Jun 2025 05:18:14 +0000 (10:48 +0530)]
stub: Add child balancer in stub.BalancerData (#8393)

6 months agoxdsclient_test: Avoid restarting listener in TestServerFailureMetrics_AfterResponseRe...
Arjan Singh Bal [Tue, 17 Jun 2025 04:32:42 +0000 (10:02 +0530)]
xdsclient_test: Avoid restarting listener in TestServerFailureMetrics_AfterResponseRecv (#8399)

6 months agoxds: Fix flaky test HandleListenerUpdate_ErrorUpdate (#8397)
Arjan Singh Bal [Mon, 16 Jun 2025 18:52:57 +0000 (00:22 +0530)]
xds: Fix flaky test HandleListenerUpdate_ErrorUpdate (#8397)