“Customers value cloud services because they are agile and adaptable, scaling and transforming to meet the changing needs of business. Since the velocity of change can work against the tenets of reliability, our Azure engineering teams have evolved their culture, processes, and frameworks to balance the pace of innovation with assurance of performance and quality. Today, I asked Principal Program Manager Anne Hamilton to explore the challenges of developing a culture of reliability through Azure engineering onboarding skills training, as part of our Advancing Reliability blog series.” —Mark Russinovich, CTO, Azure
Like engineering reliability, Azure culture must balance the speed of the new with the stability of the known in the face of tremendous growth and unknowns. New hires bring new ideas and perspectives while veterans bring experience and institutional knowledge. Both contribute to the team culture, which defines how quality and innovation are valued and implemented.
To evolve the best quality outcomes, the Azure engineering team culture must be a place where ideas are openly shared, rigorously challenged, and effectively implemented. It’s a space where ideation and creativity thrive.
Skills, processes, and frameworks can be taught. But can culture be taught? How do you onboard new
“Service incidents like outages are an unfortunate inevitability of the technology industry. Of course, we are constantly improving the reliability of the Microsoft Azure cloud platform. We meet and exceed our Service Level Agreements (SLAs) for the vast majority of customers and continue to invest in evolving tools and training that make it easy for you to design and operate mission-critical systems with confidence.
In spite of these efforts, we acknowledge the unfortunate reality that—given the scale of our operations and the pace of change—we will never be able to avoid outages entirely. During these times we endeavor to be as open and transparent as possible to ensure that all impacted customers and partners understand what’s happening. As part of our Advancing Reliability blog series, I asked Sami Kubba, Principal Program Manager overseeing our outage communications process, to outline the investments we’re making to continue improving this experience.”—Mark Russinovich, CTO, Azure
In the cloud industry, we are committed to bringing our customers the latest technology at scale, keeping customers and our platform secure, and ensuring that our customer experience is always optimal. For this to happen, Azure is subject to a significant amount of change—and in
The economic challenges posed by the global health pandemic continue to affect every organization around the world. During this difficult time, cost optimization has become an especially critical topic. Recently, we provided an overview of how to approach cost optimization on Microsoft Azure, which laid out three focus areas to help you get the most value out of your Azure investment: understanding and forecasting your costs, optimizing your workload costs, and controlling your costs.
Today, we’ll dive more deeply into the second focus area—how you can optimize your Azure workload costs—and show you how guidance in the Microsoft Azure Well-Architected Framework, tools like Azure Advisor, and offers like the Azure Hybrid Benefit and Azure Reservations can help you operate more efficiently on Azure and save.
Design workloads for cost optimization using best practices from the Azure Well-Architected Framework
The Azure Well-Architected Framework is designed to help you build and deploy cloud workloads with confidence, using actionable, easy-to-use technical content, assessments, and reference architectures based on proven industry best practices. You can assess workloads against the five pillars of the Azure Well-Architected Framework—cost optimization, reliability, security, performance efficiency, and operational excellence—to help you focus on
“When I first kicked off this Advancing Reliability blog series in my post last July, I highlighted several initiatives underway to keep improving platform availability, as part of our commitment to provide a trusted set of cloud services. One area I mentioned was fault injection, through which we’re increasingly validating that systems will perform as designed in the face of failures. Today I’ve asked our Principal Program Manager in this space, Chris Ashton, to shed some light on these broader ‘chaos engineering’ concepts, and to outline Azure examples of how we’re already applying these, together with stress testing and synthetic workloads, to improve application and service resilience.” – Mark Russinovich, CTO, Azure
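As a rough illustration of the fault injection idea mentioned above, the following Python sketch wraps a function so that a configurable share of calls fail, then checks that a simple retry policy still serves requests. It is not how Azure implements fault injection. The names `with_fault_injection`, `call_with_retries`, and `fetch_greeting`, the 30 percent failure rate, and the five-attempt retry limit are all assumptions chosen for the example.

```python
import random


class InjectedFault(Exception):
    """Raised by the fault injector to simulate a dependency failure."""


def with_fault_injection(func, failure_rate, rng):
    """Wrap func so that roughly failure_rate of calls fail before reaching it."""
    def wrapper(*args, **kwargs):
        if rng.random() < failure_rate:
            raise InjectedFault("simulated dependency failure")
        return func(*args, **kwargs)
    return wrapper


def call_with_retries(func, attempts):
    """Call func up to `attempts` times (at least once) and return the first success."""
    last_error = None
    for _ in range(max(1, attempts)):
        try:
            return func()
        except InjectedFault as error:
            last_error = error
    raise last_error


def fetch_greeting():
    # Hypothetical stand-in for a call to a real downstream service.
    return "hello"


rng = random.Random(42)  # Seeded so the experiment is repeatable.
flaky_fetch = with_fault_injection(fetch_greeting, failure_rate=0.3, rng=rng)

successes = 0
for _ in range(1000):
    try:
        if call_with_retries(flaky_fetch, attempts=5) == "hello":
            successes += 1
    except InjectedFault:
        pass  # All five attempts hit an injected fault.

print(f"{successes} of 1000 requests succeeded despite injected faults")
```

With these assumed settings, a request fails only when all five attempts hit an injected fault, so nearly every request should succeed. Raising the failure rate or lowering the retry limit shows how quickly that margin erodes.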
Developing large-scale, distributed applications has never been easier, but there is a catch. Yes, infrastructure is provided in minutes thanks to your public cloud, there are many language options to choose from, swaths of open source code available to leverage, and abundant components and services in the marketplace to build upon. Yes, there are good reference guides that help give a leg up on your solution architecture and design, such as the Azure Well-Architected Framework and other resources in the Azure Architecture Center. But while application development
Large enterprise customers running business-critical workloads on Azure manage thousands of subscriptions and use automation for deployment and management of their Azure resources. Expert support for these customers is critical to the success and operational health of their business. Today, customers can keep running their Azure solutions smoothly with self-help resources, such as diagnosing and solving problems in the Azure portal, and by creating support tickets to work directly with technical support engineers.
We have heard feedback from our customers and partners that automating support procedures is key to helping them move faster in the cloud and focus on their core business. Integrating internal monitoring applications and websites with Azure support tickets has been one of their top asks. Customers expect to create, view, and manage support tickets without having to sign in to the Azure portal. This gives them the flexibility to associate the issues they are tracking with the support tickets they raise with Microsoft. The ability to programmatically raise and manage support tickets when an issue occurs is a critical step for them in Azure usability.
We’re happy to share that the Azure Support API is now generally available. With this API, customers can integrate the creation and management of support tickets directly into their
Securing any environment requires multiple lines of defense. Azure Container Registry recently announced the general availability of features like Azure Private Link, customer-managed keys, dedicated data endpoints, and Azure Policy definitions. These features provide tools to secure Azure Container Registry as part of the container end-to-end workflow.
By default, when you store images and other artifacts in an Azure Container Registry, content is automatically encrypted at rest with Microsoft-managed keys.
Choosing Microsoft-managed keys means that Microsoft manages the key’s lifecycle. Many organizations have stricter compliance needs that require them to own and manage the key’s lifecycle and access policies. In such cases, customers can choose customer-managed keys that are created and maintained in a customer’s Azure Key Vault instance. Since the keys are stored in Key Vault, customers can also closely monitor access to these keys using the built-in diagnostics and audit logging capabilities in Key Vault. Customer-managed keys supplement the default encryption capability with an additional encryption layer using keys provided by customers. See details on how you can create a registry enabled for customer-managed keys.
Today, we’ll explore some strategies that you can leverage on Azure to optimize your cloud-native application development process using Azure Kubernetes Service (AKS) and managed databases, such as Azure Cosmos DB and Azure Database for PostgreSQL.
Optimize compute resources with Azure Kubernetes Service
AKS makes it simple to deploy a managed Kubernetes cluster in Azure. AKS reduces the complexity and operational overhead of managing Kubernetes by offloading much of that responsibility to Azure. Because AKS is a managed Kubernetes service, Azure handles critical tasks like health monitoring and maintenance for you.
When you’re using AKS to deploy your container workloads, there are a few strategies to save costs and optimize the way you run development and testing environments.
Create multiple user node pools and enable scale to zero
In AKS, nodes of the same configuration are grouped together into node pools. To support applications that have different compute or storage demands, you can create additional user node pools. User node pools serve the primary purpose of hosting your application pods. For example, you can use these additional user node pools to provide GPUs for compute-intensive applications or access to high-performance SSD storage.
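To make the cost case for scaling a user node pool to zero concrete, here is a minimal arithmetic sketch in Python. It assumes that a pool is billed only for the hours its nodes are running. The node count, the hourly rate, and the 10-hour, 5-day working window are hypothetical values chosen for illustration, not Azure prices, and `weekly_pool_cost` is a helper invented for this example.

```python
HOURS_PER_WEEK = 7 * 24


def weekly_pool_cost(node_count, hourly_rate_per_node, active_hours):
    """Cost of a pool that runs node_count nodes for active_hours in a week."""
    if not 0 <= active_hours <= HOURS_PER_WEEK:
        raise ValueError("active_hours must be between 0 and 168")
    return node_count * hourly_rate_per_node * active_hours


# Hypothetical GPU user node pool used only for development and testing.
gpu_nodes = 3
gpu_rate = 0.90  # Assumed price per node-hour, not a real Azure rate.

# Baseline: the pool never scales down.
always_on = weekly_pool_cost(gpu_nodes, gpu_rate, HOURS_PER_WEEK)

# Alternative: the pool is scaled to zero outside a 10-hour window
# on 5 working days, so it runs 50 hours per week.
working_hours_only = weekly_pool_cost(gpu_nodes, gpu_rate, 10 * 5)

savings = always_on - working_hours_only
print(f"always on:                {always_on:.2f}")
print(f"scaled to zero off-hours: {working_hours_only:.2f}")
print(f"weekly savings:           {savings:.2f} ({savings / always_on:.0%})")
```

The savings fraction depends only on the share of the week the pool is idle, so the same calculation applies to any node size once you substitute your own rates and schedule.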
When you have multiple node pools, which run on virtual
“The COVID-19 pandemic has reset what it means to work, study, and socialize. Like many of us, I have come to rely on Microsoft Teams as my connection to my colleagues. In this post, our friends from the Microsoft Teams product group—Rish Tandon (Corporate Vice President), Aarthi Natarajan (Group Engineering Manager), and Martin Taillefer (Architect)—share some of their learnings about managing and scaling an enterprise-grade, secure productivity app.” – Mark Russinovich, CTO, Azure
Scale, resiliency, and performance do not happen overnight—it takes sustained and deliberate investment, day over day, and a performance-first mindset to build products that delight our users. Since its launch, Teams has experienced strong growth: from launch in 2017 to 13 million daily users in July 2019, to 20 million in November 2019. In April, we shared that Teams has more than 75 million daily active users, 200 million daily meeting participants, and 4.1 billion daily meeting minutes. We thought we were accustomed to the ongoing work necessary to scale service at such a pace given the rapid growth Teams had experienced to date. COVID-19 challenged this assumption; would this experience give us the ability to keep the service running amidst a previously unthinkable growth period?
The global health pandemic continues to impact every organization—large or small—their employees, and the customers they serve. Over the last several months, we have seen firsthand the role that cloud computing plays in sustaining the operations that help us live, work, learn, and play.
During this unparalleled time, all of Microsoft’s cloud services, in particular Azure, Microsoft Teams, Windows Virtual Desktop, and Xbox Live, experienced unprecedented demand. It has been our privilege to provide support and the infrastructure needed to help our customers successfully accelerate their cloud adoption to enable digital transformation during such a critical time.
Over the last 90 days, we have learned a lot and I want to share those observations with you all. The following video has been developed to provide a more technical look at how we scaled Azure as the COVID-19 outbreak rapidly pushed demand for cloud services.
As global organizations across every industry adjust to the new normal, SAP solutions are playing an increasingly vital role in addressing immediate needs and paving a path to a resilient future. Now more than ever, companies are realizing the value of running their SAP solutions in the cloud. While some are using advanced analytics to process their SAP data to make real-time business decisions, others are integrating their SAP and non-SAP data to build stronger supply chains. Whether it’s meeting urgent customer needs, empowering employees to make quick decisions, or planning for the future, customers running SAP solutions in the cloud have been well prepared to face the new reality. Check out how Walgreens delivers superior customer service with SAP solutions on Microsoft Azure.
Many organizations running their SAP solutions on-premises have become increasingly aware of the need to be more agile and responsive to real-time business needs. According to an IDC survey, 54 percent of enterprises expect future demand for cloud software to increase. As global organizations seek agility, cost savings, risk reduction, and immediate insights from their ERP solutions, here are some reasons many of the largest enterprises choose Microsoft Azure as their trusted partner when moving