Degraded performance issue Airflow clusters on api.q
Incident Report for Qubole
Resolved
The Airflow issue is resolved. Customers needing to obtain this hotfix AMI should file a ticket with Qubole Support and include the cluster Id(s) they need to be updated.
Posted Oct 31, 2022 - 09:04 PDT
Update
We were able to get a new AMI set up and working. Our next step will be to roll out to all affected customers using Airflow. In the meantime, this new AMI is available and you can override the current Airflow cluster settings to use the new AMI.
Posted Oct 16, 2022 - 08:04 PDT
Update
We are continuing to monitor for any further issues.
Posted Oct 16, 2022 - 05:49 PDT
Update
We are continuing to monitor for any further issues.
Posted Oct 16, 2022 - 02:03 PDT
Update
We are continuing to monitor for any further issues.
Posted Oct 15, 2022 - 22:44 PDT
Update
We are continuing to monitor for any further issues.
Posted Oct 15, 2022 - 19:45 PDT
Monitoring
We are investigating options to resolve the issue. We will continue to post the status.
Posted Oct 15, 2022 - 16:43 PDT
Update
DevOps continues to work on patching an Airflow AMI that can be used to mitigate the issue. Until the issue is resolved, if you are currently running Airflow clusters and they are being utilized do not shut them down. This will prevent the failure on startup condition that we are reporting.
Posted Oct 15, 2022 - 08:12 PDT
Update
We are continuing to work on a resolution.
Posted Oct 15, 2022 - 07:12 PDT
Update
We are continuing to work on a resolution.
Posted Oct 15, 2022 - 03:58 PDT
Update
We are continuing to work on a resolution.
Posted Oct 15, 2022 - 00:31 PDT
Update
We are investigating options to resolve the issue. We will continue to post the status.
Posted Oct 14, 2022 - 19:59 PDT
Identified
The issue has been identified and a fix is being implemented.
Posted Oct 14, 2022 - 17:42 PDT
Update
The library used by airflow relies on a common python package which was recently upgraded in the open source community and is causing the breaking changes. The community is triaging and updating. In the meantime, we are exploring ways to bring in a resolution into QDS airflow that serves as a workaround for this upstream issue.
Posted Oct 14, 2022 - 17:38 PDT
Investigating
For customers using Airflow, there is an issue with clusters that are running v1.10.2. All other aspects of Qubole operations are functioning normally.

These Airflow clusters do not automatically terminate due to the nature of their function. Any currently running Airflow clusters are functioning normally. If an Airflow cluster is terminated or restarted for any reason then it will not come back up as the loading of Airflow will fail.

We are currently working to resolve the issue and will post an update in the next 2 hours or earlier if new information is available or service is restored.
Posted Oct 14, 2022 - 11:33 PDT
This incident affected: api.qubole.com Environment (AWS) (Cluster Operations).