Cluster startup failure
Incident Report for Qubole
Resolved
The remaining cluster issues appear to be specific to individual clusters' configurations. The service interruption is now resolved.
Posted Apr 29, 2021 - 06:05 PDT
Update
Devops believes it has identified the issue preventing some individual clusters from coming online. The team is monitoring to confirm that the change applied is a complete fix.
Posted Apr 28, 2021 - 07:42 PDT
Update
We are continuing to monitor for any further issues.
Posted Apr 25, 2021 - 22:53 PDT
Monitoring
A cluster engine restart has resolved this issue. Devops is manually fixing a few remaining cluster redirection issues.
Posted Apr 23, 2021 - 07:50 PDT
Identified
Replacing the tunnel servers uncovered an issue with the discovery server. The discovery server is being replaced and must be back online before clusters can be started.
Posted Apr 22, 2021 - 06:09 PDT
Investigating
We are aware of a tunnel server availability issue on gcp.qubole.com that may prevent clusters from starting. Devops is restarting the tunnel servers. We will update this incident once the restarts are complete.
Posted Apr 16, 2021 - 12:59 PDT
This incident affected: gcp.qubole.com Environment (GCP) (Cluster Operations).