[pick_first] go CONNECTING when selected subchannel goes CONNECTING or TF (#41029)
author Mark D. Roth <roth@google.com>
Fri, 5 Dec 2025 20:24:49 +0000 (12:24 -0800)
committer Copybara-Service <copybara-worker@google.com>
Fri, 5 Dec 2025 20:27:39 +0000 (12:27 -0800)
commit fb7b981977769fe2301d8409c3b8462b8a5223b5
tree 080b2f23c4ed05fbefb91d56b5ba23f1eb38de31
parent 698c7a89a89b592b0bda7425241180f4782d19db

Needed as part of gRFC A105 (https://github.com/grpc/proposal/pull/516).

Currently, when the selected subchannel leaves the READY state, the only state it can move to is IDLE, and pick_first handles that by going IDLE itself.  However, A105 will allow the subchannel to go from READY to either CONNECTING or TRANSIENT_FAILURE (TF), and in those two cases we want pick_first to go back into CONNECTING and start a new happy eyeballs pass.  This PR introduces an experiment that adds that behavior.
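The mapping described above can be sketched roughly as follows. This is an illustrative model only, not the code in `pick_first.cc`: the enum names and `OnSelectedSubchannelLeftReady()` are hypothetical, and the sketch shows only the behavior with the experiment enabled.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the types used in pick_first.cc.
enum class ConnectivityState { kIdle, kConnecting, kReady, kTransientFailure };

enum class PickFirstAction {
  kGoIdle,                  // pick_first itself reports IDLE
  kStartHappyEyeballsPass,  // pick_first reports CONNECTING and starts a new pass
};

// Decides what pick_first does once the selected subchannel has left READY.
// `new_state` is the state the selected subchannel moved to.
PickFirstAction OnSelectedSubchannelLeftReady(ConnectivityState new_state) {
  // The subchannel has left READY, so READY is not a valid new state here.
  assert(new_state != ConnectivityState::kReady);
  switch (new_state) {
    case ConnectivityState::kConnecting:
    case ConnectivityState::kTransientFailure:
      // New with the experiment: go back to CONNECTING.
      return PickFirstAction::kStartHappyEyeballsPass;
    default:
      // READY -> IDLE: existing behavior, pick_first goes IDLE.
      return PickFirstAction::kGoIdle;
  }
}
```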

While I was at it, I noticed an existing misfeature.  There are two cases where pick_first will go IDLE, which is done by calling [`GoIdle()`](https://github.com/grpc/grpc/blob/24b25a0baa72a658cc37d1db28f77513a9670ea2/src/core/load_balancing/pick_first/pick_first.cc#L610):
1. The case mentioned above, where the selected subchannel goes from READY to IDLE (`GoIdle()` is called from [`SubchannelState::OnConnectivityStateChange()`](https://github.com/grpc/grpc/blob/24b25a0baa72a658cc37d1db28f77513a9670ea2/src/core/load_balancing/pick_first/pick_first.cc#L784)).
2. The case where pick_first already has a selected subchannel and receives a new address list, but none of the subchannels in the new list report READY.  In this case, pick_first knows that the currently selected subchannel is for an address that is not present in the new address list, so it unrefs the selected subchannel and goes IDLE (`GoIdle()` is called from [`SubchannelData::OnConnectivityStateChange()`](https://github.com/grpc/grpc/blob/24b25a0baa72a658cc37d1db28f77513a9670ea2/src/core/load_balancing/pick_first/pick_first.cc#L859)).

The code in `GoIdle()` currently requests a re-resolution, which is the right behavior for case 1.  However, a re-resolution is unnecessary in case 2, since we have just received a fresh resolver update in that case.  Therefore, as part of this experiment, I am moving the code that triggers the re-resolution out of `GoIdle()` and directly into `SubchannelState::OnConnectivityStateChange()`, where it will occur only for case 1.
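As a rough sketch of the resulting shape, assuming a toy policy object that only counts side effects: `PickFirstSketch`, its members, and the two handler names below are hypothetical, and the order of the re-resolution request relative to `GoIdle()` is an assumption of the sketch rather than something taken from the real code.

```cpp
#include <cassert>

// Hypothetical toy model; the real logic lives in pick_first.cc.
struct PickFirstSketch {
  int reresolution_requests = 0;
  bool idle = false;

  void RequestReresolution() { ++reresolution_requests; }

  // After the change, GoIdle() only moves the policy to IDLE.
  void GoIdle() { idle = true; }

  // Case 1: the selected subchannel went from READY to IDLE.
  // The re-resolution request now lives at this call site.
  void OnSelectedSubchannelIdle() {
    RequestReresolution();
    GoIdle();
  }

  // Case 2: a new address list arrived and none of its subchannels
  // reported READY. The resolver update is fresh, so no re-resolution.
  void OnNewListWithoutReady() { GoIdle(); }
};
```

In this model, case 1 produces exactly one re-resolution request and case 2 produces none, while both end with the policy in IDLE.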

Closes #41029

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/41029 from markdroth:pick_first_ready_to_connecting fdb6ef68e3a73e0035520149b72a1d21775354c3
PiperOrigin-RevId: 840830927
bazel/experiments.bzl
src/core/lib/experiments/experiments.cc
src/core/lib/experiments/experiments.h
src/core/lib/experiments/experiments.yaml
src/core/load_balancing/pick_first/pick_first.cc
test/core/load_balancing/lb_policy_test_lib.h
test/core/load_balancing/pick_first_test.cc