Spark and Presto query failures
Incident Report for Qubole
Resolved
Devops expects the operational issues to be resolved. After restarting discovery, they also needed to augment the client nodes to handle the volume of traffic.
Posted Apr 27, 2021 - 13:45 PDT
Update
We are continuing to monitor for any further issues.
Posted Apr 25, 2021 - 22:51 PDT
Update
An additional incident of stalled operations was reported yesterday evening (4/24); the stalled operations have since cleared. Devops is investigating the root cause of the stall so that a more permanent fix can be applied.
Posted Apr 25, 2021 - 04:47 PDT
Update
We are continuing to monitor for any further issues.
Posted Apr 25, 2021 - 04:44 PDT
Monitoring
Devops is monitoring its latest fix; the issue should now be resolved. Additional information about the resolution will be added after monitoring.
Posted Apr 23, 2021 - 11:07 PDT
Update
We are continuing to investigate this issue.
Posted Apr 22, 2021 - 07:56 PDT
Investigating
Spark and Presto queries run in the in.qubole.com environment may stall and remain in Pending or Queued status. Devops is investigating.
Posted Apr 22, 2021 - 05:52 PDT
This incident affected: in.qubole.com Environment (AWS) (QDS API, Command Processing, Qubole Scheduler, Cluster Operations).