Category Archives : Monitoring



March 2019 changes to Azure Monitor Availability Testing

Azure Monitor Availability Testing allows you to monitor the availability and responsiveness of any HTTP or HTTPS endpoint that is accessible from the public internet. You don’t have to add anything to the web site you’re testing. It doesn’t even have to be your site; you could test a REST API service you depend on. This service sends web requests to your application at regular intervals from points around the world. It alerts you if your application doesn’t respond, or if it responds slowly.
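To make the idea concrete, here is a minimal sketch of what a single availability probe does: send a request, time it, and classify the result. It is an illustration only, not how the Azure service is implemented. The names `http_status` and `probe`, the two-second `slow_after` cutoff, and the rule that any status of 400 or above counts as a failure are all assumptions made for this example.

```python
import time
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a GET request to `url`."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def probe(
    url: str,
    slow_after: float = 2.0,  # assumed cutoff in seconds, not an Azure default
    fetch: Callable[[str], int] = http_status,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Classify one probe of `url` as 'ok', 'slow', or 'failed'."""
    started = clock()
    try:
        status = fetch(url)
    except Exception:
        # Timeouts, DNS errors, refused connections, and HTTP error
        # responses raised by urllib all count as "did not respond".
        return "failed"
    elapsed = clock() - started
    if status >= 400:
        return "failed"
    return "slow" if elapsed > slow_after else "ok"
```

A real test would call `probe("https://example.com")` on a schedule from several locations and raise an alert on repeated `"failed"` or `"slow"` results. The `fetch` and `clock` parameters exist only so the logic can be exercised without a network.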

At the end of this month we are deploying some major changes to this service. These changes will improve performance and reliability, and allow us to make more improvements to the service in the future. This post highlights some of the changes and describes what you should be aware of to ensure that your tests continue running without interruption.

Reliability improvements

We are deploying a new version of the availability testing service. This new version should improve the reliability of the service, resulting in fewer false alarms. This change also increases the capacity for the creation of new availability tests, which is greatly needed as Application Insights




Securely monitoring your Azure Database for PostgreSQL Query Store

A few months ago, I shared best practices for alerting on metrics with Azure Database for PostgreSQL. Though I was able to cover how to monitor certain key metrics on Azure Database for PostgreSQL, I did not cover how to monitor and alert on the performance of queries that your application relies on heavily. As a PostgreSQL database user, from time to time you will need to investigate whether any queries are running indefinitely on your database. These long-running queries may interfere with overall database performance and are likely stuck on some background process. This blog post covers how you can set up alerting on query performance related metrics using Azure Functions and Azure Key Vault.

What is Query Store?

Query Store is a feature in Azure Database for PostgreSQL, announced in early Fall 2018, that seamlessly enables tracking query performance over time. This simplifies performance troubleshooting by helping you quickly find the longest running and most resource-intensive queries. Learn how you can use Query Store in a wide variety of scenarios by visiting our documentation, “Usage scenarios for Query Store.” Query Store, when enabled, automatically captures a history of query runtime and wait statistics. It




Monitoring on HDInsight Part 1: An Overview

Azure HDInsight offers several ways to monitor your Hadoop, Spark, or Kafka clusters. Monitoring on HDInsight can be broken down into three main categories:

Cluster health and availability
Resource utilization and performance
Job status and logs

Two main monitoring tools are offered on Azure HDInsight: Apache Ambari, which is included with all HDInsight clusters, and optional integration with Azure Monitor logs, which can be enabled on all HDInsight clusters. While these tools contain some of the same information, each has advantages in certain scenarios. Read on for an overview of the best way to monitor various aspects of your HDInsight clusters using these tools.

Cluster health and availability

Azure HDInsight is a high-availability service that has redundant gateway nodes, head nodes, and ZooKeeper nodes to keep your HDInsight clusters running smoothly. While this ensures that a single failure will not affect the functionality of a cluster, you may still want to monitor cluster health so you are alerted when an issue does arise. Monitoring cluster health refers to monitoring whether all nodes in your cluster and the components that run on them are available and functioning correctly. Ambari is the recommended way to monitor the health for any given HDInsight




Create a transit VNet using VNet peering

Azure Virtual Network (VNet) is the fundamental building block for any customer network. VNet lets you create your own private space in Azure or, as I call it, your own network bubble. VNets are crucial to your cloud network as they offer isolation, segmentation, and other key benefits. Read more about VNet’s key benefits in our documentation, “What is Azure Virtual Network?”

With VNets, you can connect your network in multiple ways. You can connect to on-premises using Point-to-Site (P2S), Site-to-Site (S2S) gateways or ExpressRoute gateways. You can also connect to other VNets directly using VNet peering.

A customer network can be expanded by peering virtual networks to one another. Traffic sent over VNet peering is completely private and stays on the Microsoft backbone, with no extra hops and no public internet involved. Customers typically use VNet peering in a hub-and-spoke topology. The hub consists of shared services and gateways, and the spokes comprise business units or applications.

Gateway transit

Today I’d like to do a refresh of a unique and powerful functionality we’ve supported from day one with VNet peering. Gateway transit enables you to use a peered VNet’s gateway for connecting to on-premises instead of creating a new gateway for connectivity.




Announcing Azure Monitor AIOps Alerts with Dynamic Thresholds

We are happy to announce that Metric Alerts with Dynamic Thresholds is now available in public preview. Dynamic Thresholds are a significant enhancement to Azure Monitor Metric Alerts. With Dynamic Thresholds you no longer need to manually identify and set thresholds for alerts. The alert rule leverages advanced machine learning (ML) capabilities to learn metrics’ historical behavior, while identifying patterns and anomalies that indicate possible service issues.
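As a rough intuition for what learning a metric's historical behavior can mean, the sketch below derives a separate band for each hour of the day from past samples and flags values that fall outside the band for their hour. This is a deliberately simple stand-in and not the ML technology Azure Monitor uses. The function names, the `sensitivity` multiplier, and the use of mean and standard deviation are all assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, Iterable, Tuple


def hourly_bands(
    history: Iterable[Tuple[int, float]],
    sensitivity: float = 3.0,  # assumed multiplier; wider bands mean fewer alerts
) -> Dict[int, Tuple[float, float]]:
    """Build a (low, high) band per hour of day from (hour, value) samples."""
    by_hour = defaultdict(list)
    for hour, value in history:
        by_hour[hour].append(value)

    bands = {}
    for hour, values in by_hour.items():
        centre = mean(values)
        spread = pstdev(values)
        bands[hour] = (centre - sensitivity * spread, centre + sensitivity * spread)
    return bands


def is_anomalous(bands: Dict[int, Tuple[float, float]], hour: int, value: float) -> bool:
    """True if `value` falls outside the band learned for `hour`."""
    low, high = bands[hour]
    return value < low or value > high
```

With history where 03:00 traffic sits near 11 requests per second and 14:00 traffic sits near 105, a value of 100 is anomalous at 03:00 but normal at 14:00. A single static threshold cannot express both cases, which is the pain point this kind of per-pattern threshold addresses.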

Metric Alerts with Dynamic Thresholds are supported through a simple Azure portal experience. They also support Azure workload operations at scale by allowing users to configure alert rules through an Azure Resource Manager (ARM) API in a fully automated manner.

Why and when should I apply Dynamic Thresholds to my metrics alerts?

Smart metric pattern recognition – A big pain point with setting a static threshold is that you need to identify patterns on your own and create an alert rule for each pattern. With Dynamic Thresholds, we use a unique ML technology to identify the patterns and come up with a single alert rule that has the right thresholds and accounts for seasonality patterns such as hourly, daily, or weekly. Let’s take the example of HTTP request rate. As you




Modernize alerting using Azure Resource Manager storage accounts

Classic alerts in Azure Monitor will reach retirement this coming June. We recommend that you migrate your classic alert rules defined on your storage accounts, especially if you want to retain alerting functionality with the new alerting platform. If you have classic alert rules configured on classic storage accounts, you will need to upgrade your accounts to Azure Resource Manager (ARM) storage accounts before you migrate alert rules.

For more information on the new Azure Monitor service and classic alert retirement read the article, “Classic alerts in Azure Monitor to retire in June 2019.”

Identify classic alert rules

You should first find all classic alert rules before you migrate. The following screenshot shows how you can identify classic alert rules in the Azure portal. Please note, you can filter by subscription so you can find all classic alert rules without checking on each resource separately.

Migrate classic storage accounts to ARM

New alerts do not support classic storage accounts, only ARM storage accounts. If you configured classic alert rules on a classic storage account you will need to migrate to an ARM storage account.

You can use “Migrate to ARM” to migrate using the storage menu on your classic




Monitor at scale in Azure Monitor with multi-resource metric alerts

Our customers rely on Azure to run large-scale applications and services critical to their business. To run services at scale, you need to set up alerts to proactively detect, notify, and remediate issues before they affect your customers. However, configuring alerts can be hard when you have a complex, dynamic environment with lots of moving parts.

Today, we are excited to release multi-resource support for metric alerts in Azure Monitor to help you set up critical alerts at scale. Metric alerts in Azure Monitor work on a host of multi-dimensional platform and custom metrics, and notify you when the metric breaches a threshold that was either defined by you or detected automatically.

With this new feature, you will be able to set up a single metric alert rule that monitors:

A list of virtual machines in one Azure region
All virtual machines in one or more resource groups in one Azure region
All virtual machines in a subscription in one Azure region

Benefits of using multi-resource metric alerts

Get alerting coverage faster: With a small number of rules, you can monitor all the virtual machines in your subscription. Multi-resource rules set at subscription or resource group level can automatically monitor




Azure Monitor January 2019 updates

Azure Monitor, which now includes Log Analytics and Application Insights, provides sophisticated tools for collecting and analyzing telemetry. It allows you to maximize the performance and availability of your cloud, on-premises resources, and applications. It helps you understand how your applications are performing and proactively identifies issues affecting them and the resources on which they depend.

Learn more about how you can get started with Azure Monitor. Now let’s check out what’s new from the past month.


First, a huge thank you to our customers for once again naming Microsoft a Gartner Peer Insights Customers’ Choice for Application Performance Monitoring Suites for its System Center Operations Manager, Microsoft Azure Application Insights, and System Center Global Service Monitor applications.

Application Insights

Application Insights is the application performance monitoring (APM) service of Azure Monitor, providing observability for Java, .NET, and Node.js web services, plus client-side JavaScript apps.

End-to-end transactions

The end-to-end transactions view now supports time scrubbing. Click and drag over a period of time to filter the view to that time range and analyze it in more detail.

Performance and failures

We’ve squashed a handful of bugs in the performance and failures tools:

The Roles tab now preserves role




Azure Monitor logs in Grafana – now in public preview

We’re happy to introduce the new Grafana integration with Microsoft Azure Monitor logs. This integration is achieved through the new Log Analytics plugin, now available as part of the Azure Monitor data source.

The new plugin continues our promise to make Azure’s monitoring data available and easy to consume. Last year, in the v1 of this data source we exposed Azure Monitor metric data in Grafana. While you can natively consume all logs in Azure Monitor Log Analytics, our customers also requested to make logs available in Grafana. We have heard this request and partnered with Grafana to enable you to use OSS tools more on Azure.

The new plugin allows you to display any data available in Log Analytics, such as logs related to virtual machine performance, security, Azure Active Directory (which has recently integrated with Log Analytics), and many other log types including custom logs.

How can I use it?

The new plugin requires Grafana version 5.3 or newer. After the initial data source configuration, you can easily start embedding Azure Monitor logs in your dashboards and panels. Simply select the service Azure Log Analytics and your workspace, then provide a query. You can reuse any existing queries




Best practices for alerting on metrics with Azure Database for MariaDB monitoring

On December 4, 2018, Microsoft announced the general availability of Azure Database for MariaDB. This blog post shares some guidance and best practices for alerting on the most commonly monitored metrics for MariaDB.

Whether you are a developer, a database analyst, a site reliability engineer, or a DevOps professional at your company, monitoring databases is an important part of maintaining the reliability, availability, and performance of your MariaDB server. There are various metrics available for you in Azure Database for MariaDB to get insights on the behavior of the server. You can also set alerts on these metrics using the Azure portal or Azure CLI.

With modern applications evolving from a traditional on-premises approach to becoming more hybrid or cloud native, there is also a need to adopt some best practices for a successful monitoring strategy on a hybrid/public cloud. Here are some example best practices on how you can use monitoring data on your MariaDB server and areas you can consider improving based on these various metrics.

Active connections

Sample threshold (percentage or value): 80 percent of total connection limit for greater than or equal to 30 minutes, checked every five minutes.
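One way to read this sample threshold is as a sustained-breach check: with one sample every five minutes, 30 minutes corresponds to six consecutive samples, and the alert fires only if all of them are at or above 80 percent of the limit. The sketch below encodes that reading. The function name `sustained_breach` and the all-samples-in-window interpretation are assumptions for illustration; an Azure Monitor alert rule may aggregate the window differently (for example, by averaging).

```python
from typing import Sequence


def sustained_breach(
    samples: Sequence[int],          # active-connection counts, oldest first
    connection_limit: int,           # the server's total connection limit
    threshold_pct: float = 80.0,
    window_minutes: int = 30,
    interval_minutes: int = 5,
) -> bool:
    """True if every sample in the most recent window is at or above the threshold."""
    needed = window_minutes // interval_minutes  # 6 samples for 30 min at 5-min checks
    if len(samples) < needed:
        return False  # not enough history yet to call it sustained
    cutoff = connection_limit * threshold_pct / 100.0
    return all(sample >= cutoff for sample in samples[-needed:])
```

For a server with a limit of 100 connections, six consecutive readings of 80 or more would trigger the alert, while a single dip below 80 inside the window would not. Requiring a sustained breach is what keeps short connection spikes from paging anyone.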

Things to check