Conversation

olavloite

Sessions should be closed using an async gRPC call in order to speed up the closing of a large session pool. Instead of using its own executor, which is limited to 8 worker threads, to execute asynchronous delete session calls, the session pool should use the asynchronous call option in gRPC. This allows a larger number of asynchronous delete session calls to be executed in parallel and speeds up closing a session pool with a large number of sessions.

Testing against a real Cloud Spanner database with a session pool containing 1,000 sessions shows the following performance for closing the session pool:

Without this change (3 test runs):
6603ms
8169ms
8367ms

With this change (3 test runs):
1297ms
1710ms
1851ms

Fixes #19

@googlebotgooglebot added the cla: yesThis human has signed the Contributor License Agreement.label Jan 9, 2020
@olavloite

The binary compatibility check fails because of:

[ERROR] 7012: com.google.cloud.spanner.Session: Method 'public com.google.api.core.ApiFuture asyncClose()' has been added to an interface
[ERROR] 7012: com.google.cloud.spanner.spi.v1.SpannerRpc: Method 'public com.google.api.core.ApiFuture asyncDeleteSession(java.lang.String, java.util.Map)' has been added to an interface

Both Session and SpannerRpc are public interfaces, but are only used internally in the client library. There are no public methods returning either a Session or a SpannerRpc.

@olavloiteolavloite added the kokoro:force-runAdd this label to force Kokoro to re-run the tests.label Jan 9, 2020
@yoshi-kokoroyoshi-kokoro removed the kokoro:force-runAdd this label to force Kokoro to re-run the tests.label Jan 9, 2020
Sessions should be closed using an async gRPC call in order
to speed up the closing of a large session pool. Instead of
using its own executor, which is limited to 8 worker threads,
to execute asynchronous delete session calls, the session pool
should use the asynchronous call option in gRPC. This allows a
larger number of asynchronous delete session calls to be executed
in parallel and speeds up closing a session pool with a large
number of sessions.

Testing against a real Cloud Spanner database with a session pool
containing 1,000 sessions shows the following performance for
closing the session pool:

Before (3 runs):
6603ms
8169ms
8367ms

After (3 runs):
1297ms
1710ms
1851ms

Fixes #19.
@olavloiteolavloite force-pushed the close-sessions-async branch from b3cfae4 to a2be792 Compare January 14, 2020 17:34
@olavloiteolavloite added the kokoro:force-runAdd this label to force Kokoro to re-run the tests.label Jan 14, 2020
@yoshi-kokoroyoshi-kokoro removed the kokoro:force-runAdd this label to force Kokoro to re-run the tests.label Jan 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the benchmarking test 👍.

@olavloiteolavloite added semver: majorHint for users that this is an API breaking change.kokoro:force-runAdd this label to force Kokoro to re-run the tests.labels Jan 18, 2020
@yoshi-kokoroyoshi-kokoro removed the kokoro:force-runAdd this label to force Kokoro to re-run the tests.label Jan 18, 2020
@olavloiteolavloite changed the title perf: close sessions async perf!: close sessions async Jan 18, 2020
@olavloiteolavloite added the kokoro:force-runAdd this label to force Kokoro to re-run the tests.label Jan 18, 2020
@yoshi-kokoroyoshi-kokoro removed theAdd this label to force Kokoro to re-run the tests.label Jan 18, 2020
@olavloiteolavloite changed the title perf!: close sessions async perf: close sessions async Jan 18, 2020
@olavloiteolavloite removed the semver: majorHint for users that this is an API breaking change.label Jan 18, 2020
@olavloite

@skuruppu @hengfengli @chingor13
This PR cannot be merged because the Binary Compatibility test fails. This happens because it adds a method to two public interfaces. These interfaces are not really public (as in: returned to users), but must be public because they reside in different packages within the Spanner client library and are used by different parts of the Spanner client lib.

So:

  1. Should we still treat this as a breaking change?
  2. Regardless of whether we do treat is a breaking change or not: What action is needed to make the PR mergeable?

@skuruppu

@skuruppu @hengfengli @chingor13
This PR cannot be merged because the Binary Compatibility test fails. This happens because it adds a method to two public interfaces. These interfaces are not really public (as in: returned to users), but must be public because they reside in different packages within the Spanner client library and are used by different parts of the Spanner client lib.

So:

  1. Should we still treat this as a breaking change?

I don't believe it should be flagged as a breaking change given that we should reserve those when making backwards-incompatible changes that requires users to rebuild their apps.

  1. Regardless of whether we do treat is a breaking change or not: What action is needed to make the PR mergeable?

I'm not sure actually :( @kolea2 would you be able to advice us on whether it's ok to ignore a Binary Compatibility failure and merge this change.

I don't know why the Java 11 test fails so I triggered a rebuild.

@skuruppu

@skuruppu @hengfengli @chingor13
This PR cannot be merged because the Binary Compatibility test fails. This happens because it adds a method to two public interfaces. These interfaces are not really public (as in: returned to users), but must be public because they reside in different packages within the Spanner client library and are used by different parts of the Spanner client lib.
So:

  1. Should we still treat this as a breaking change?

I don't believe it should be flagged as a breaking change given that we should reserve those when making backwards-incompatible changes that requires users to rebuild their apps.

  1. Regardless of whether we do treat is a breaking change or not: What action is needed to make the PR mergeable?

I'm not sure actually :( @kolea2 would you be able to advice us on whether it's ok to ignore a Binary Compatibility failure and merge this change.

I don't know why the Java 11 test fails so I triggered a rebuild.

As discussed offline, it's ok to push this change despite the Binary Compatibility failure since the changes being made are to internal interfaces.

@skuruppuskuruppu merged commit ab25087 into master Jan 23, 2020
@skuruppuskuruppu deleted the close-sessions-async branch January 23, 2020 02:51
@skuruppu

@skuruppu @hengfengli @chingor13
This PR cannot be merged because the Binary Compatibility test fails. This happens because it adds a method to two public interfaces. These interfaces are not really public (as in: returned to users), but must be public because they reside in different packages within the Spanner client library and are used by different parts of the Spanner client lib.
So:

  1. Should we still treat this as a breaking change?

I don't believe it should be flagged as a breaking change given that we should reserve those when making backwards-incompatible changes that requires users to rebuild their apps.

  1. Regardless of whether we do treat is a breaking change or not: What action is needed to make the PR mergeable?

I'm not sure actually :( @kolea2 would you be able to advice us on whether it's ok to ignore a Binary Compatibility failure and merge this change.
I don't know why the Java 11 test fails so I triggered a rebuild.

As discussed offline, it's ok to push this change despite the Binary Compatibility failure since the changes being made are to internal interfaces.

Yikes, it turns out the compatibility check just keeps failing for all future PRs as well. I was hoping that it would stop complaining after it was merged in. Let me check with @kolea2 offline about what we can do to fix this issue.

@hengfengli

Should we revert this change? We can wait until it is resolved so that we can merge it in.

@skuruppu

Should we revert this change? We can wait until it is resolved so that we can merge it in.

Yeah good point. I'll revert it now.

skuruppu added a commit that referenced this pull request Jan 23, 2020
hengfengli pushed a commit that referenced this pull request Jan 23, 2020
@hengfenglihengfengli restored the close-sessions-async branch January 23, 2020 05:37
@release-pleaserelease-please bot mentioned this pull request Jan 23, 2020
skuruppu added a commit that referenced this pull request Jan 24, 2020
skuruppu added a commit that referenced this pull request Jan 24, 2020
asyncClose() was added to com.google.cloud.spanner.Session and
asyncDeleteSession() was added to
com.google.cloud.spanner.spi.v1.SpannerRpc in #24 which resulted in
binary compatibility test failures. This config allows us to ignore the
failure.
olavloite pushed a commit that referenced this pull request Jan 24, 2020
* Revert "Revert "perf: close sessions async (#24)" (#43)"

This reverts commit 809ed88.

* Ignore compatibility check failure in internal interfaces.

asyncClose() was added to com.google.cloud.spanner.Session and
asyncDeleteSession() was added to
com.google.cloud.spanner.spi.v1.SpannerRpc in #24 which resulted in
binary compatibility test failures. This config allows us to ignore the
failure.

* Annotate SpannerRpc and Session classes as @internalapi.

Users shouldn't be implementing these interfaces as they're internal to
the client library implementation.
rajatbhatta pushed a commit to rajatbhatta/java-spanner that referenced this pull request Nov 17, 2022
Sign up for free to join this conversation on . Already have an account? Sign in to comment
cla: yesThis human has signed the Contributor License Agreement.
None yet

Successfully merging this pull request may close these issues.

Spanner - Closing session pool is slow for large pools