Environment
MySQL: 8.0+ (Cloud RDS: Tencent Cloud / Alibaba Cloud)
gh-ost: 1.0.49
Topology: MySQL accessed through cloud proxy / load balancer (e.g. ProxySQL-like architecture)
Description
In a cloud-native deployment where MySQL is accessed via a proxy layer, gh-ost repeatedly fails during cut-over with:
Error 1205: Lock wait timeout exceeded
This occurs even under very low or zero application traffic, and with no observable long-running transactions.
Under the same zero-traffic conditions, a manual atomic RENAME TABLE old TO old_del, ghost TO old succeeds immediately.
Observed Behavior
gh-ost enters cut-over
lock orchestration retries/yields
eventually exits with lock wait timeout (1205)
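To see which session the waiting statement is actually blocked behind, metadata locks can be snapshotted on the backend while the cut-over is waiting. The following is a minimal sketch, not part of gh-ost: it assumes the rows have already been fetched with the query shown (from the standard MySQL 8.0 tables `performance_schema.metadata_locks` and `performance_schema.threads`), and the helper name `find_blockers` and the dictionary keys are hypothetical. The processlist IDs seen on the backend may not match what each client connection reports when a proxy sits in between.

```python
# Query to run directly on the backend MySQL server while the cut-over waits.
# The two performance_schema tables are standard in MySQL 8.0; the column
# aliases are names chosen for this sketch.
METADATA_LOCKS_SQL = """
SELECT ml.OBJECT_SCHEMA AS object_schema,
       ml.OBJECT_NAME   AS object_name,
       ml.LOCK_TYPE     AS lock_type,
       ml.LOCK_STATUS   AS lock_status,
       t.PROCESSLIST_ID AS processlist_id
FROM performance_schema.metadata_locks AS ml
JOIN performance_schema.threads AS t ON t.THREAD_ID = ml.OWNER_THREAD_ID
WHERE ml.OBJECT_TYPE = 'TABLE'
"""


def find_blockers(rows):
    """Pair each PENDING table lock with sessions holding GRANTED locks on
    the same table.

    Returns a list of (waiter_id, holder_id, "schema.table") tuples. These are
    candidates only: not every granted lock type conflicts with a pending one.
    """
    granted = {}
    for row in rows:
        if row["lock_status"] == "GRANTED":
            key = (row["object_schema"], row["object_name"])
            granted.setdefault(key, set()).add(row["processlist_id"])

    pairs = []
    for row in rows:
        if row["lock_status"] != "PENDING":
            continue
        key = (row["object_schema"], row["object_name"])
        for holder in sorted(granted.get(key, ())):
            # A session does not block itself, so skip its own granted locks.
            if holder != row["processlist_id"]:
                pairs.append((row["processlist_id"], holder, ".".join(key)))
    return pairs
```

If every holder returned for the migrated table is one of gh-ost's own connections, that would support the hypothesis that no external transaction is involved in the timeout.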
Expected Behavior
Cut-over should complete successfully under no-contention conditions, similar to the manual atomic RENAME TABLE.
Hypothesis / Possible Root Cause
gh-ost cut-over uses multiple sessions (e.g. one session holding LOCK TABLES WRITE, another issuing RENAME TABLE).
When routed through a proxy, these sessions may not share lock state/ordering semantics exactly as expected (session split-brain / lock visibility inconsistency), causing the RENAME session to wait on locks effectively held by gh-ost’s own companion session.
This can look like a self-deadlock pattern and ends in timeout/yield.
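One inexpensive check for this hypothesis is to open two connections through the proxy and compare what each reports about the server behind it. The sketch below assumes the query is run once per connection and the resulting rows are collected into dictionaries. `@@server_uuid` and `@@hostname` are standard MySQL system variables, while the helper name `sessions_share_backend` is hypothetical. Whether a given cloud proxy forwards these values unmodified is itself an assumption worth verifying.

```python
# Run once on every connection involved (e.g. the lock session and the rename
# session). The column aliases are names chosen for this sketch.
SESSION_IDENTITY_SQL = (
    "SELECT CONNECTION_ID() AS connection_id, "
    "@@server_uuid AS server_uuid, @@hostname AS hostname"
)


def sessions_share_backend(identities):
    """Return True when every session reports the same server_uuid.

    `identities` is a list of dicts, one per client connection, each holding
    the row returned by SESSION_IDENTITY_SQL.
    """
    if not identities:
        raise ValueError("need at least one session identity")
    return len({row["server_uuid"] for row in identities}) == 1
```

A result of False would mean the sessions reached different backends. A result of True does not rule out other proxy effects such as connection multiplexing or statement reordering.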
Suggested Improvement
Add an optional mode for proxy/cloud-RDS environments, e.g.:
--single-session-cut-over
This mode would execute the cut-over lock and the atomic rename in a single session/connection, removing the dependency on multi-session lock orchestration.
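As a rough illustration of what such a mode might send, the sketch below builds the statement sequence for a lock-then-rename cut-over on one connection. It is not gh-ost code, and the flag above is only a proposal. The function names are hypothetical, and the approach assumes MySQL 8.0.13 or later, where RENAME TABLE is permitted on tables the session has WRITE-locked with LOCK TABLES.

```python
def quote_identifier(name):
    """Quote a MySQL identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def single_session_cut_over_statements(original, ghost, retired):
    """Build the statements for a lock-then-rename cut-over on ONE connection.

    Assumption: MySQL 8.0.13+, where a session may RENAME tables it holds
    WRITE-locked. All statements must go over the same connection, in order.
    """
    o, g, r = (quote_identifier(n) for n in (original, ghost, retired))
    return [
        # Both tables are locked because the session can only touch, and only
        # rename, tables it has locked.
        f"LOCK TABLES {o} WRITE, {g} WRITE",
        f"RENAME TABLE {o} TO {r}, {g} TO {o}",
        "UNLOCK TABLES",
    ]
```

With the table names used in this report, the call would be `single_session_cut_over_statements("old", "ghost", "old_del")`. Because the ghost table is WRITE-locked by this session, any remaining changes would have to be applied before the lock is taken or over the same connection; the sketch does not address that.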