configuring the nats target to reconnect forever #16050
Merged
Description
We observed that bucket notification events do not resume sending after the NATS server has been offline for more than 5 minutes. A shorter outage, of only a minute or so, does not trigger this issue.
In the nats.go library, the client's default `MaxReconnect` setting is 60 attempts and `ReconnectWait` is 2 seconds, so it seems the client only attempts to reconnect for a short period (roughly two minutes) before giving up. I found that setting `nats.MaxReconnects(-1)` on the client resolved the issue. I tested after a 20-minute outage, and the messages resumed sending on their own without a minio restart. 👍🏼
Motivation and Context
During testing we found that after an extended network outage was resolved, bucket events do not resume sending until minio is restarted.
How to test this PR?
When `MINIO_NOTIFY_NATS_QUEUE_DIR` is defined, we can observe that messages are stuck queued in the minio folder and are not delivered until a minio restart.
Types of changes
Checklist: