How to Create Cloud Apps With Polly & Lightning Bug


In recent years, all of my development has shifted to the cloud. The “let us manage and maintain that for you” aspect of cloud services is very appealing, and the big three cloud providers all do a fantastic job. The scale of their operations allows them to bring in real experts and optimize each of their services from the massive amounts of reliability data they collect. A handful of IT generalists at your typical company just can’t compete.

Of course, there are trade-offs. Developing a stable web application on cloud services presents some unique challenges compared with one hosted in your data center. An app in the cloud needs to weather small, frequent interruptions that are uncommon in traditional data centers. There are resiliency patterns and best practices for dealing with these issues.

<h2>Cloud Failures</h2>

Cloud services require the same level of disaster recovery planning and redundancy as a traditional service running on a server in your data center. That plan will certainly look different, but the effort is about the same. However, due to the fundamental way cloud services work, <a href='https://docs.microsoft.com/en-us/azure/architecture/best-practices/transient-faults'>we should expect brief, transient faults</a> as well.

The hardware that runs your specific cloud service (database, file system, queue, web server, serverless function, etc.) is chosen seemingly at random, and you probably share it with other cloud subscribers. If your service starts using too many resources, it may be throttled to protect other subscribers sharing that machine. If a patch or update needs to be installed, or a bit of hardware fails, your service may be moved to another machine entirely, usually within a few seconds. In either of these cases, your app might not be able to connect to the service for a brief period.
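To make that failure mode concrete, here is a minimal sketch of a dependency that is unreachable for its first few calls, which is roughly what a failover or throttling event looks like from the app's point of view. `FlakyService` and `TryOnceDemo` are hypothetical names invented for this illustration; they are not part of Polly, LightningBug, or any cloud SDK.

```csharp
using System;

// Hypothetical stand-in for a cloud dependency that is briefly unreachable,
// for example while the provider moves it to another machine.
public class FlakyService
{
    private int _failuresRemaining;

    public FlakyService(int failuresBeforeRecovery)
    {
        _failuresRemaining = failuresBeforeRecovery;
    }

    public string GetGreeting()
    {
        if (_failuresRemaining > 0)
        {
            _failuresRemaining--;
            throw new TimeoutException("Simulated transient fault: service unavailable.");
        }

        return "hello";
    }
}

public static class TryOnceDemo
{
    // Try-once-and-fail: a single attempt, so any blip is surfaced immediately.
    public static string CallOnce(FlakyService service)
    {
        try
        {
            return service.GetGreeting();
        }
        catch (TimeoutException)
        {
            return "error";
        }
    }
}
```

With `new FlakyService(1)`, the first `CallOnce` reports an error even though an identical call a moment later succeeds.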
If we were to use standard try-once-and-fail logic from a traditional application, our users would quickly lose confidence in the stability of our application. This is where resiliency patterns come in. Instead of letting these blips in service impact your users, it’s often <a href='https://docs.microsoft.com/en-us/azure/architecture/best-practices/retry-service-specific#sql-database-using-adonet'>better to simply wait a second and try again</a>.

<h2>Polly and LightningBug.Polly</h2>

<a href='https://www.thepollyproject.org/'>Polly</a> implements a whole collection of resiliency patterns for .NET applications. While these patterns can be useful in most applications, they are critical to building stable cloud applications. In fact, <a href='https://docs.microsoft.com/en-us/azure/architecture/best-practices/retry-service-specific#sql-database-using-adonet'>Microsoft recommends Polly</a> in its best practices for handling SQL Azure transient faults.

With <a href='https://en.wikipedia.org/wiki/SOLID'>SOLID principles</a> and <a href='https://lightningbug.Jason Dentler.com/'>LightningBug</a>, you can easily add Polly and its resiliency patterns to any new or existing .NET application for increased stability and easier, automatic recovery after a failure.

Here’s a simple example where GetBar calls an API and we want to give up on that API after 30 seconds.

<pre class='prettyprint lang-cs'>using System.Net.Http;
using System.Threading.Tasks;
using Newtonsoft.Json;

public interface IFooService
{
    // Give up on the call if it has not completed within 30 seconds.
    [Timeout(TimeoutInSeconds = 30)]
    Task&lt;Bar&gt; GetBar(int barId);
}

public class FooService : IFooService
{
    private readonly HttpClient _httpClient;

    public FooService(HttpClient httpClient)
    {
        _httpClient = httpClient;
    }

    public async Task&lt;Bar&gt; GetBar(int barId)
    {
        var response = await _httpClient.GetAsync($"/api/bar/{barId}");
        var result = await response.Content.ReadAsStringAsync();
        return JsonConvert.DeserializeObject&lt;Bar&gt;(result);
    }
}
</pre>

We can also combine these patterns in powerful ways.
Here we've nested these patterns to retry the request up to 3 times, with each attempt timing out after 10 seconds.

<pre class='prettyprint lang-cs'>public interface IFooService
{
    [Retry(3, Order = 0)]
    [Timeout(TimeoutInSeconds = 10, Order = 1)]
    Task&lt;Bar&gt; GetBar(int barId);
}
</pre>

<h2>Cloud Best Practices</h2>

The specific guidance from Microsoft for Azure SQL calls for retries of specific errors. Here’s what that might look like.

<pre class='prettyprint lang-cs'>public class SqlRetryAttribute : WaitAndRetryAttribute
{
    public SqlRetryAttribute() : base(5, 15, 25) { }
}

public class SqlRetryPolicyProvider : WaitAndRetryPolicyProvider
{
    private readonly ILogger _logger;

    public SqlRetryPolicyProvider(ILogger logger)
    {
        _logger = logger;
    }

    public override bool HandlesException(Exception exception)
    {
        // SqlServerTransientExceptionDetector is defined in the Entity Framework Core source code at
        // https://github.com/aspnet/EntityFrameworkCore/blob/release/2.2/src/EFCore.SqlServer/Storage/Internal/SqlServerTransientExceptionDetector.cs
        return SqlServerTransientExceptionDetector.ShouldRetryOn(exception);
    }

    // Apply the same transient-error check to inner exceptions.
    public override bool HandlesInnerException(Exception exception)
    {
        return HandlesException(exception);
    }
}

public interface IFooDataAccess
{
    [SqlRetry]
    Task&lt;Bar&gt; GetBar(int barId);
}
</pre>

Each cloud provider publishes specific guidance and best practices for each of its services (<a href='https://docs.microsoft.com/en-us/azure/architecture/best-practices/retry-service-specific'>all Azure services</a>, <a href='https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Programming.Errors.html#Programming.Errors.RetryAndBackoff'>AWS DynamoDB</a>, <a href='https://cloud.google.com/storage/docs/exponential-backoff'>Google Cloud Storage</a>). When you use one of these cloud services in a real-world application, seek out these best-practice documents. Your cloud app will be more resilient and reliable.
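Much of that provider guidance, including the DynamoDB and Google Cloud Storage pages linked above, recommends exponential backoff: wait longer after each failed attempt, up to some ceiling, and randomize the wait so that many clients don't retry in lockstep. Here is a minimal sketch of that delay calculation in plain C#. The `Backoff` class, its method names, and the sample values are assumptions made for illustration; they are not taken from Polly, LightningBug, or any provider SDK.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class Backoff
{
    // Delay before retry number `attempt` (1-based):
    // baseDelay * 2^(attempt - 1), capped at maxDelay.
    public static TimeSpan ExponentialDelay(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        if (attempt < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(attempt));
        }

        // Clamp the exponent so the multiplier stays a sane, finite number.
        double factor = Math.Pow(2, Math.Min(attempt - 1, 30));
        double milliseconds = Math.Min(
            baseDelay.TotalMilliseconds * factor,
            maxDelay.TotalMilliseconds);

        return TimeSpan.FromMilliseconds(milliseconds);
    }

    // Random jitter: pick a delay somewhere between zero and the computed delay.
    public static TimeSpan WithJitter(TimeSpan delay, Random random)
    {
        return TimeSpan.FromMilliseconds(random.NextDouble() * delay.TotalMilliseconds);
    }
}
```

With a one-second base and a 30-second ceiling, attempts 1, 2, and 3 wait up to 1, 2, and 4 seconds, and later attempts level off at 30 seconds. The right base, ceiling, and retry count depend on the service, so take them from the provider's own guidance.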
