Microsoft Azure SouthCentralUS Outage Impacting Loftware Cloud

Incident Report for Loftware Cloud

Resolved

All customer deployments are now observed as healthy and operational. We do not currently detect any outstanding issues or performance degradation.

Once its investigation concludes, Microsoft will post final details here:
https://azure.status.microsoft/en-us/status/history/

Our apologies for any inconvenience. Kindly contact Support if you have any questions or concerns:
https://www.loftware.com/customer-center/technical-support/submit-a-support-ticket/nicelabel-support
Posted Dec 27, 2024 - 04:05 UTC

Monitoring

All customers previously impacted by the Azure SouthCentralUS outage should now be restored and fully operational. We will continue to monitor service availability and system performance.

The Loftware Cloud 24.2 deployment (dep216) was further impacted by the physical blockers described in the latest Azure status updates:
https://azure.status.microsoft/en-us/status

"Not all storage stamps have successfully recovered, this is blocking the recovery of some services. We are working with an onsite team and will need to manually replace network components aiding in faster recovery."

Although the Azure incident is still outstanding, we have migrated the remaining customers on dep216 to a new deployment that is not impacted by the limited availability zone.
Posted Dec 27, 2024 - 03:30 UTC

Update

Following the latest mitigations provided by Microsoft at approximately 21:00 UTC on Dec 26, we have observed that most customer deployments and backend services are fully operational.

Additional details from Microsoft are available here:
https://azure.status.microsoft/en-us/status

Only a small subset of Loftware Cloud 24.2 customers on a single SouthCentralUS deployment continues to be impacted:
lmscloud-serext-production-dep216.southcentralus.cloudapp.azure.com

You may determine which deployment hosts your instance using commands like the following (replace [customer] with your instance identifier):
"nslookup [customer].onnicelabel.com" or "ping [customer].onnicelabel.com"

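For customers who prefer to script the check, the lookup can be sketched in Python as shown here. This is a minimal sketch, not an official Loftware tool. It assumes that [customer].onnicelabel.com resolves through a CNAME to a deployment hostname shaped like the one quoted in this update, and that the resolver reports that name as the canonical name. The function names and the pattern are hypothetical.

```python
import re
import socket

# Assumed pattern, modeled on the deployment hostname quoted in this update:
# lmscloud-serext-production-dep216.southcentralus.cloudapp.azure.com
DEPLOYMENT_RE = re.compile(r"-(dep\d+)\.([a-z0-9]+)\.cloudapp\.azure\.com\.?$")


def parse_deployment(canonical_name):
    """Return (deployment_id, region) from a deployment hostname, or None."""
    match = DEPLOYMENT_RE.search(canonical_name.lower())
    if match is None:
        return None
    return match.group(1), match.group(2)


def lookup_deployment(customer):
    """Resolve [customer].onnicelabel.com and parse its canonical name.

    Assumes the resolver returns the deployment hostname as the canonical
    name; if it does not, this returns None.
    """
    hostname = f"{customer}.onnicelabel.com"
    canonical, _aliases, _addresses = socket.gethostbyname_ex(hostname)
    return parse_deployment(canonical)
```

Calling lookup_deployment with your instance identifier would return a pair such as a deployment ID and region when the assumptions hold. If it returns None, fall back to the nslookup output and compare it with the hostname listed above.
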
It appears that several cascading effects of the SouthCentralUS data center power outage are still being resolved by Microsoft. We will continue to monitor and verify the performance of related compute, storage, and SQL databases impacted by the incident. Please contact Support if you have any questions or concerns:
https://www.loftware.com/customer-center/technical-support/submit-a-support-ticket/nicelabel-support
Posted Dec 26, 2024 - 22:00 UTC

Identified

Microsoft Azure outage statement:
Active - Storage latency, timeouts, or HTTP 500 errors in South Central US

Impact Statement: Starting at 18:44 UTC on 26 Dec 2024, you have been identified as a customer who was impacted by a power incident in South Central US and may be experiencing a degraded service.

Current Status: There was a power incident in a portion of South Central US AZ03 which affected multiple services. At approximately 20:43 UTC, power was fully restored, and services started to recover. However not all storage stamps have successfully recovered, this is blocking the recovery of some services. We are working with an onsite team and will need to manually replace network components; we expect these services to recover over the next five hours.
Posted Dec 26, 2024 - 21:00 UTC

Investigating

Beginning at approximately 18:00 UTC on Dec 26, a subset of Loftware Cloud customers using Control Center/Web Printing has been impacted by a power outage in Azure's SouthCentralUS data center. We are actively monitoring the progress of the incident and will restore services as soon as possible. Additional updates will be provided as information becomes available.

Further up-to-date details about the outage can be found on Azure's status page:
https://azure.status.microsoft/en-us/status

We have observed that several underlying compute and storage services in SouthCentralUS are degraded or fully unavailable for the impacted customer deployments. In the case of extended outages, disaster recovery measures are in place to temporarily relocate impacted customers to a healthy region. Additional details about incident response plans are available here:
https://help.loftware.com/cloud/en/Cloud/Loftware-Cloud-Infrastructure--Security--and-Maintenance/Disaster-recovery.html

We apologize for any service disruptions caused by this incident. Updates will be posted here when new details are provided by Microsoft. Kindly contact Support if you have any questions or concerns in the meantime:
https://www.loftware.com/customer-center/technical-support/submit-a-support-ticket/nicelabel-support
Posted Dec 26, 2024 - 18:00 UTC
This incident affected: South Central US (SignIn Site, Dashboard, Document Storage, Print, Cloud Trigger, Cloud Print, Labeling Databases, Cloud Automation).