Let Application Insights focus on real problems and not missing data (404) in your APIs

Application Insights is a great tool for collecting insights and telemetry for your Azure (or, for that matter, almost any) resources such as websites, console apps, APIs etc. There are cases, however, where the default settings log errors that are undesirable. In the worst case, this can hide the real problems or make them look irrelevant. This blog post illustrates the problem and shows how you can handle it when logging to Application Insights from a .NET Core API. If you are using another language, you can still read this post and then continue with the reference section, which covers some other languages and technologies.

404 may not be an error in APIs

The dreaded 404, feared by many CMS and custom web site developers, signals that someone has tried to access a page that does not exist. It is typically an error that should be corrected, most likely a broken link.

In REST APIs, however, a 404 could simply mean that you are trying to fetch something that does not exist. Consider asking an API for a specific customer, for example https://myserver.com/api/customer/121313. If the customer was not found, most REST developers would say the service should logically return a 404 (NotFound) status code. Another example could be a service looking up the owner of a telephone number, like https://myserver.com/api/phone/owner/0046709123456, and there are many similar scenarios.

A controller method may look like the one below: you call a service to look up an object by id, and if it returns null you return NotFound rather than Ok.

[ProducesResponseType(typeof(MyObject), (int)HttpStatusCode.OK)]
[ProducesResponseType(typeof(string), (int)HttpStatusCode.NotFound)]
[Authorize()]
[HttpGet("{objectid}")]
public async Task<IActionResult> GetObjectById(string objectId)
{
    var result = await myService.GetObjectById(objectId);
    if (result != null)
        return Ok(result);
    else
        return NotFound($"{objectId} was not found");
}

Note that others suggest using response codes like 204 (No Content) for these situations, but from what I have seen it is most common to return 404 in these cases. The advantage of 204 is that it is easier to determine whether the URI of the call does not exist or whether the customer does not exist.

Anyhow, if we assume that you, like me, belong to the people preferring 404, the 404 response is fine. In the examples above there is no action that needs to be taken if a 404 occurs, as it is a perfectly normal response from the API. It seems quite logical, when you think of it, that you get a NotFound if you query a customer or other object that does not exist.

Looking in Application Insights may give a completely different picture though.

Wow – there definitely must be an error here

This may trigger alerts and does not look good at all in our dashboard of the most common errors. In fact, in this particular example, the real errors were ranked so low that we did not even bother to look at them (see the 412 and 500 errors that look pretty irrelevant in comparison).

The real problems were overlooked as the 404s looked much worse

So evidently, for our DevOps team to react to important matters and not to missing data, we would like to ignore these errors in Application Insights.

The solution

What we need to do is tell Application Insights that we don’t want the 404 responses flagged as failed requests. As stated before, the example below assumes that you have an API written in .NET Core, but there are similar examples in the reference section for other code bases like .NET Framework and JavaScript.

So create a class in your API solution, have it implement the ITelemetryInitializer interface, and implement the Initialize method.

using System;
using System.Net;
using Microsoft.ApplicationInsights.Channel;
using Microsoft.ApplicationInsights.DataContracts;
using Microsoft.ApplicationInsights.Extensibility;

public class OverideTelemetry : ITelemetryInitializer
{
    public void Initialize(ITelemetry telemetry)
    {
        // Only request telemetry carries an HTTP response code
        var requestTelemetry = telemetry as RequestTelemetry;
        if (requestTelemetry == null) return;

        // ResponseCode is a string, so parse it before comparing
        int code;
        bool parsed = Int32.TryParse(requestTelemetry.ResponseCode, out code);
        if (!parsed) return;

        switch ((HttpStatusCode)code)
        {
            case HttpStatusCode.NotFound:
                requestTelemetry.Success = true;

                // One can search for the below property in Application Insights
                requestTelemetry.Properties["Overridden404s"] = "true";
                break;
            default:
                // else leave the SDK to set the Success property
                break;
        }
    }
}

The Int32.TryParse call reads the status code from the request telemetry and tries to convert it to an integer, and the switch statement intercepts 404 responses in the HttpStatusCode.NotFound case. The most important line is the one that sets the Success flag to true, which means that the 404 response will no longer be flagged as a failure in Application Insights. The code also sets a custom property named Overridden404s whenever a 404 is treated as OK. This will enable us to find these requests easily in Application Insights later.

Now we just have to tell our .NET code to use the ITelemetryInitializer when processing telemetry for Application Insights. To do that, we edit the ConfigureServices method (typically found in the Startup class).

public void ConfigureServices(IServiceCollection services)
{
    //...

    services.AddSingleton<ITelemetryInitializer, OverideTelemetry>();

    //...
}

Result and how to verify

The first obvious evidence of this working, i.e. that 404 (NotFound) requests are no longer registered as failures, is difficult to visualize. One way is to open Application Insights in the Azure Portal and select Failures. While monitoring the Failures view, deliberately generate some 404 requests and verify in the portal that they are no longer registered as failures.
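The initializer can also be exercised directly in code, without waiting for the portal. Below is a minimal sketch, assuming the Microsoft.ApplicationInsights package is referenced. The names InitializerCheck and VerifyNotFoundIsSuccess are hypothetical and only used for illustration.

```csharp
using System;
using Microsoft.ApplicationInsights.DataContracts;
using Microsoft.ApplicationInsights.Extensibility;

// Hypothetical helper, not part of the Application Insights SDK
public static class InitializerCheck
{
    // Throws if the given initializer does not mark a 404 request as successful
    public static void VerifyNotFoundIsSuccess(ITelemetryInitializer initializer)
    {
        // Simulate the telemetry item the SDK would create for a 404 response
        var request = new RequestTelemetry { ResponseCode = "404" };

        initializer.Initialize(request);

        if (request.Success != true)
            throw new InvalidOperationException("404 was not marked as successful.");

        if (!request.Properties.ContainsKey("Overridden404s"))
            throw new InvalidOperationException("The Overridden404s property was not set.");
    }
}
```

Calling InitializerCheck.VerifyNotFoundIsSuccess(new OverideTelemetry()) from a unit test or a small console app should complete without throwing if the initializer from this post behaves as described. This only checks the initializer in isolation, so the portal is still the place to confirm the end-to-end result.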

With the help of the extra property we can still find these requests in Application Insights. Go to Search and enter your property name, and the calls that returned 404 but were marked as successful will be displayed.

Remember that it can take up to 5 minutes before the telemetry appears in Application Insights. You may also be missing some telemetry due to the sampling behavior (not everything may necessarily be logged), which can be turned on and off through configuration.
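If sampling makes it hard to see your test requests, it can be switched off while you verify the change. The following is a minimal sketch, assuming an ASP.NET Core app that uses the Microsoft.ApplicationInsights.AspNetCore package. The class name TelemetrySetup and the method AddTelemetryWithoutSampling are hypothetical.

```csharp
using Microsoft.ApplicationInsights.AspNetCore.Extensions;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical helper to be called from ConfigureServices
public static class TelemetrySetup
{
    public static void AddTelemetryWithoutSampling(IServiceCollection services)
    {
        var options = new ApplicationInsightsServiceOptions
        {
            // Adaptive sampling is on by default; turn it off so every request is sent
            EnableAdaptiveSampling = false
        };

        services.AddApplicationInsightsTelemetry(options);
    }
}
```

Keep in mind that turning sampling off increases the amount of telemetry that is sent, so you may want to do this only while testing.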

Searching for our property finds the overridden 404 failures (note the blue color)

We can also compare the outcome from before and after the change and it clearly illustrates how Application Insights treats it differently.

Before
After

So using this approach we can now more easily set up alerts and dashboards to react to real problems.

To consider

As far as I know (readers, please correct me if I am wrong), the ITelemetryInitializer cannot be set per controller or method. So if the same solution has some situations where 404 should be treated as an error and some where it should not, you probably have to extend the ITelemetryInitializer code to set requestTelemetry.Success to true only when appropriate.

if (!requestTelemetry.Url.AbsolutePath.ToLower().Contains("apithatshouldflag404aserror"))
{
    requestTelemetry.Success = true;
    requestTelemetry.Properties["Overridden404s"] = "true";
}

The above code will only set Success to true if the request path does not contain a particular string.

You could also consider treating other responses, like 412 (PreconditionFailed), as successful. With a 412 the API says that the preconditions for running it are not fulfilled, for example because of invalid input parameters. No action needs to be taken, so these responses may not need to be listed as failures or be a source of alerts.
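One way to handle several status codes is to keep them in a set and check against it. This is only a sketch of the idea: the class name ExpectedStatusCodeInitializer and the property name OverriddenStatusCode are made up for this example, and which codes belong in the set depends on your API.

```csharp
using System.Collections.Generic;
using System.Net;
using Microsoft.ApplicationInsights.Channel;
using Microsoft.ApplicationInsights.DataContracts;
using Microsoft.ApplicationInsights.Extensibility;

// Hypothetical variant of the initializer that covers more than one status code
public class ExpectedStatusCodeInitializer : ITelemetryInitializer
{
    // Status codes that are normal outcomes for this API (an assumption, adjust as needed)
    private static readonly HashSet<int> ExpectedCodes = new HashSet<int>
    {
        (int)HttpStatusCode.NotFound,
        (int)HttpStatusCode.PreconditionFailed
    };

    public void Initialize(ITelemetry telemetry)
    {
        if (!(telemetry is RequestTelemetry request)) return;
        if (!int.TryParse(request.ResponseCode, out var code)) return;
        if (!ExpectedCodes.Contains(code)) return;

        request.Success = true;

        // Store the original code so the overridden requests can still be searched for
        request.Properties["OverriddenStatusCode"] = request.ResponseCode;
    }
}
```

It would be registered in the same way as the initializer above, with services.AddSingleton<ITelemetryInitializer, ExpectedStatusCodeInitializer>().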

Also consider that while, in our example, the 404 may not need to be treated as a failure in Application Insights for the API, a website calling the API might want to flag it as an error in its own Application Insights if it indicates that the website sends invalid or faulty requests to the API.

If you want to completely ignore the telemetry for some scenarios, without needing to change the Success flag or add properties, you could implement the ITelemetryProcessor interface instead. Read more about this option on the referenced page. Note that Microsoft warns that this may “skew the statistics that you see in the portal and make it difficult to follow related items”. The sample in this post, however, does not remove any telemetry and does not suffer from this issue.
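For comparison, a telemetry processor that drops 404 requests entirely could look roughly like the sketch below. The class name DropNotFoundProcessor is hypothetical, and the warning above applies: the dropped requests will not show up anywhere in Application Insights.

```csharp
using Microsoft.ApplicationInsights.Channel;
using Microsoft.ApplicationInsights.DataContracts;
using Microsoft.ApplicationInsights.Extensibility;

// Hypothetical processor that filters out 404 requests instead of relabeling them
public class DropNotFoundProcessor : ITelemetryProcessor
{
    private readonly ITelemetryProcessor next;

    // The SDK passes in the next processor in the chain
    public DropNotFoundProcessor(ITelemetryProcessor next)
    {
        this.next = next;
    }

    public void Process(ITelemetry item)
    {
        if (item is RequestTelemetry request && request.ResponseCode == "404")
        {
            // Returning without calling the next processor drops the item
            return;
        }

        next.Process(item);
    }
}
```

In an ASP.NET Core app a processor like this is registered in ConfigureServices with services.AddApplicationInsightsTelemetryProcessor<DropNotFoundProcessor>().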

References

The above suggested solution is based on the information from the following page where you can read a bit more about the technical details and get code examples for other programming languages/technologies.

https://docs.microsoft.com/en-us/azure/azure-monitor/app/api-filtering-sampling#add-properties-itelemetryinitializer
