ASP.NET (snapshot 2017) Microsoft documentation and samples

Transient Fault Handling (Building Real-World Cloud Apps with Azure)

by Mike Wasson, Rick Anderson, Tom Dykstra

Download Fix It Project or Download E-book

The Building Real World Cloud Apps with Azure e-book is based on a presentation developed by Scott Guthrie. It explains 13 patterns and practices that can help you be successful developing web apps for the cloud. For information about the e-book, see the first chapter.

When you’re designing a real world cloud app, one of the things you have to think about is how to handle temporary service interruptions. This issue is uniquely important in cloud apps because you’re so dependent on network connections and external services. You can frequently get little glitches that are typically self-healing, and if you aren’t prepared to handle them intelligently, they’ll result in a bad experience for your customers.

Causes of transient failures

In the cloud environment you’ll find that failed and dropped database connections happen periodically. That’s partly because you’re going through more load balancers compared to the on-premises environment where your web server and database server have a direct physical connection. Also, sometimes when you’re dependent on a multi-tenant service you’ll see calls to the service get slower or time out because someone else who uses the service is hitting it heavily. In other cases you might be the user who is hitting the service too frequently, and the service deliberately throttles you – denies connections – in order to prevent you from adversely affecting other tenants of the service.

Use smart retry/back-off logic to mitigate the effect of transient failures

Instead of throwing an exception and displaying a not available or error page to your customer, you can recognize errors that are typically transient, and automatically retry the operation that resulted in the error, in hopes that before long you’ll be successful. Most of the time the operation will succeed on the second try, and you’ll recover from the error without the customer ever having been aware that there was a problem.

There are several ways you can implement smart retry logic.
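As a language-neutral illustration of the idea (not the .NET frameworks this chapter refers to), a minimal retry loop with exponential back-off might look like the following Python sketch. The names `TransientError` and `retry_with_backoff`, and the default attempt count and delays, are assumptions made for this example only.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (timeouts, throttling)."""


def retry_with_backoff(operation, max_attempts=4, base_delay=0.5, max_delay=8.0,
                       sleep=time.sleep):
    """Call operation(); on TransientError wait, then retry with a doubled delay."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                # Out of attempts: let the caller decide what to show the user.
                raise
            # Exponential back-off: 0.5s, 1s, 2s, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay)
```

Only errors classified as transient are retried; anything else propagates immediately, so a genuine bug is not hidden behind repeated attempts. The `sleep` parameter exists so the wait can be replaced in tests.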

Circuit breakers

There are several reasons why you don’t want to retry too many times over too long a period:

Exponential back-off addresses some of these issues by limiting the frequency of retries a service can get from your application. But you also need to have circuit breakers: this means that at a certain retry threshold your app stops retrying and takes some other action, such as one of the following:
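To make the threshold idea concrete, here is a minimal circuit breaker sketch in Python. It is an illustration under stated assumptions rather than a production implementation: the class and exception names are hypothetical, and the cool-down period after which one trial call is allowed again (`reset_after`) is an addition of this example, not something the chapter specifies.

```python
import time


class CircuitOpenError(Exception):
    """Hypothetical error raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # failures before opening
        self.reset_after = reset_after              # assumed cool-down, in seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    def call(self, operation):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Still open: fail fast instead of hitting the service again.
                raise CircuitOpenError("service calls are suspended")
            # Cool-down elapsed: allow one trial call; a single failure re-opens.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit fully
        return result
```

A caller would catch `CircuitOpenError` and take its alternative action at that point, rather than continuing to send requests to a service that is already struggling.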

There is no one-size-fits-all retry policy. You can retry more times and wait longer in an asynchronous background worker process than you would in a synchronous web app where a user is waiting for a response. You can wait longer between retries for a relational database service than you would for a cache service. Here are some sample recommended retry policies to give you an idea of how the numbers might vary. (“Fast First” means no delay before the first retry.)
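The sketch below shows one way such policies could be expressed as data, including the “Fast First” option. The `RetryPolicy` type and every number in it are hypothetical values chosen for illustration; they are not the recommended figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryPolicy:
    max_retries: int
    base_delay: float         # seconds
    fast_first: bool = False  # True: no delay before the first retry

    def delays(self):
        """Return the wait in seconds before each retry, doubling each time."""
        schedule = []
        for retry in range(self.max_retries):
            if self.fast_first:
                # First retry is immediate; back-off starts with the second.
                wait = 0.0 if retry == 0 else self.base_delay * 2 ** (retry - 1)
            else:
                wait = self.base_delay * 2 ** retry
            schedule.append(wait)
        return schedule


# Hypothetical numbers, for illustration only.
INTERACTIVE_WEB = RetryPolicy(max_retries=3, base_delay=0.5, fast_first=True)
BACKGROUND_WORKER = RetryPolicy(max_retries=6, base_delay=2.0)
```

With these assumed values, the interactive policy gives up after a short total wait, while the background worker keeps trying for roughly two minutes, which mirrors the distinction drawn above between a waiting user and an asynchronous process.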

Sample retry policies

For SQL Database retry policy guidance, see Troubleshoot transient faults and connection errors to SQL Database.

Summary

A retry/back-off strategy can help make temporary errors invisible to the customer most of the time, and Microsoft provides frameworks that you can use to minimize your work implementing a strategy whether you’re using ADO.NET, Entity Framework, or the Azure Storage service.

In the next chapter, we’ll look at how to improve performance and reliability by using distributed caching.

