Bug Report: (Atomic) copy fails to detect errors and restarts itself in the middle, causing duplicate key errors #17864
Overview of the Issue
When the atomic copy failed because of #17862, it ended up in a confused state. It thought it was done, so it ended, but the calling context still saw a list of tables left to copy and restarted it. The restarted copy then quickly errored out with duplicate key errors.
Rohit said:
(in vcopier_atomic.go)
What's also interesting to note is that, as you can see below from my start parameters, the copy phase duration is 5 minutes. This translates into the source tablet (by GetVReplicationMaxExecutionTimeQueryHint(), I guess) logging this:
So, even though it tried to set 300000, that didn't work. It was at the 10 minute mark that it failed. See the log output later.
To summarize, I see these issues that need to be fixed:
MAX_EXECUTION_TIME(300000) query hint doesn't seem to work.
max_execution_time global setting may need to be verified before starting.
Reproduction Steps
Perform an atomic copy of a large enough table with a value for max_execution_time on the source short enough for the source to abort the query.
We used MySQL 8.0.35 at AWS RDS as source (to move away from).
Tablet start params:
Binary Version
vttablet version Version: 21.0.1 (Git revision 3d4f41db2fbc32611c7d2ea2af3dc68b9d962415 branch 'HEAD') built on Tue Dec 3 05:39:35 UTC 2024 by runner@fv-az2029-313 using go1.23.3 linux/amd64
Operating System and Environment details
DISTRIB_ID=Ubuntu DISTRIB_RELEASE=24.04 DISTRIB_CODENAME=noble DISTRIB_DESCRIPTION="Ubuntu 24.04.2 LTS"
Log Fragments
Slack discussion
https://vitess.slack.com/archives/C0PQY0PTK/p1740042281348869