Optimize the Performance and Reliability of Azure Functions

The following are best practices for building and architecting serverless solutions with Azure Functions.

Avoid long-running functions

Large, long-running functions can cause unexpected timeout issues. A function can become large because of many dependencies. Importing dependencies can also increase load times, which can result in unexpected timeouts. Dependencies are loaded both explicitly and implicitly: a single module loaded by your code may load its own additional modules.

Whenever possible, refactor large functions into smaller function sets that work together and return responses quickly. For example, a webhook or HTTP trigger function might require an acknowledgment response within a certain time limit; it is common for webhooks to require an immediate response.

Mitigation:

Use Durable Functions to break the work into a granular function app structure so that no single function runs for an extended time.
Pass the HTTP trigger payload into a queue to be processed by a queue trigger function. This approach allows you to defer the actual work and return an immediate response.
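
The queue hand-off described above can be sketched in plain Python. This is a minimal illustration, not Azure Functions binding code: the in-memory `work_queue` stands in for a storage queue, and `handle_http_request` and `handle_queue_message` are hypothetical names for the HTTP trigger and queue trigger functions.

```python
import json
import queue
import uuid

# In-memory stand-in for a storage queue (assumption for illustration only).
work_queue: "queue.Queue[str]" = queue.Queue()


def handle_http_request(payload: dict) -> dict:
    """HTTP-trigger stand-in: enqueue the payload and acknowledge right away."""
    job_id = str(uuid.uuid4())
    work_queue.put(json.dumps({"job_id": job_id, "payload": payload}))
    # 202 Accepted signals that the work was queued, not completed.
    return {"status": 202, "job_id": job_id}


def handle_queue_message(message: str) -> str:
    """Queue-trigger stand-in: do the deferred work for one message."""
    job = json.loads(message)
    # Placeholder for the slow processing step.
    return f"processed {job['job_id']}"


response = handle_http_request({"order": 42})
result = handle_queue_message(work_queue.get())
```

The caller receives the acknowledgment before any processing starts, so the response time no longer depends on how long the work takes.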

Cross-function communication

Durable Functions and Azure Logic Apps are built to manage state transitions and communication between multiple functions.

If you’re not using Durable Functions or Logic Apps to integrate multiple functions, it is generally a best practice to use storage queues for cross-function communication. The main reason is that storage queues are cheaper and much easier to provision.

Individual messages in a storage queue are limited in size to 64 KB. If you need to pass larger messages between functions, use an Azure Service Bus queue, which supports message sizes up to 256 KB in the Standard tier and up to 100 MB in the Premium tier.
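
As a rough illustration of the size check, the sketch below picks a transport based on the serialized message size. The `choose_transport` helper and the transport labels are hypothetical, and treating "64 KB" as 64 * 1024 bytes is an assumption; any encoding applied by the queue client (Base64, for instance) would reduce the usable payload further.

```python
import json

# Storage queue limit quoted above; 64 * 1024 bytes is an assumed interpretation.
STORAGE_QUEUE_LIMIT_BYTES = 64 * 1024


def choose_transport(message: dict) -> str:
    """Return 'storage-queue' if the serialized message fits, else 'service-bus'."""
    size = len(json.dumps(message).encode("utf-8"))
    return "storage-queue" if size <= STORAGE_QUEUE_LIMIT_BYTES else "service-bus"


small = choose_transport({"id": 1})
large = choose_transport({"blob": "x" * (100 * 1024)})
```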

Service Bus topics are useful if you require message filtering before processing.

Event Hubs are useful for supporting high-volume communications.

Stateful Functions

The primary use case for Durable Functions is simplifying complex, stateful coordination problems in serverless applications. We will define stateful workflows in a new type of function called an orchestrator function. Here are some of the advantages of orchestrator functions:

They define workflows in code. No JSON schemas or designers are needed.
They can call other functions synchronously and asynchronously. Output from called functions can be saved to local variables.
They automatically checkpoint their progress whenever the function awaits. Local state is never lost if the process recycles or the VM reboots.

Function chaining

Function chaining refers to the pattern of executing a sequence of functions in a particular order. Often the output of one function needs to be applied to the input of another function.
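
The chaining idea can be shown with ordinary Python functions. This is a sketch of the pattern only, not the Durable Functions API; `parse`, `double`, `total`, and `run_chain` are hypothetical names chosen for the example.

```python
from functools import reduce
from typing import Any, Callable, List, Sequence


def parse(raw: str) -> List[int]:
    return [int(part) for part in raw.split(",")]


def double(values: List[int]) -> List[int]:
    return [v * 2 for v in values]


def total(values: List[int]) -> int:
    return sum(values)


def run_chain(steps: Sequence[Callable[[Any], Any]], initial: Any) -> Any:
    """Call each step in order, feeding each output into the next step."""
    return reduce(lambda value, step: step(value), steps, initial)


chained = run_chain([parse, double, total], "1,2,3")  # 12
```

In an orchestrator function, each step would be a separate function call whose result is checkpointed before the next step starts.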

Fan-out/fan-in

Fan-out/fan-in refers to the pattern of executing multiple functions in parallel and then waiting for all to finish. Often some aggregation work is done on results returned from the functions.
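
A minimal sketch of the same shape using Python's `asyncio`, assuming a hypothetical `count_words` function as the unit of parallel work. It illustrates the pattern rather than the Durable Functions API.

```python
import asyncio
from typing import List


async def count_words(document: str) -> int:
    """Stand-in for a function that processes one work item."""
    await asyncio.sleep(0)  # yield control, as a real I/O call would
    return len(document.split())


async def fan_out_fan_in(documents: List[str]) -> int:
    # Fan out: create one coroutine per document; gather runs them concurrently.
    tasks = [count_words(doc) for doc in documents]
    # Fan in: wait for every result, then aggregate.
    counts = await asyncio.gather(*tasks)
    return sum(counts)


total_words = asyncio.run(fan_out_fan_in(["a b c", "d e", "f"]))  # 6
```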

Write functions to be stateless

Functions should be stateless and idempotent if possible. Associate any required state information with your data. For example, a Fee Adjustment being processed would likely have an associated state member. A function could process the Fee Adjustment based on that state while the function itself remains stateless.

Idempotent functions are especially recommended with timer triggers. For example, if a function needs to run once a day, write it so it can run at any time during the day with the same results. The function can exit when there is no work for a particular day. Also, if a previous run failed to complete, the next run should pick up where it left off. This is particularly relevant for any function that reruns work that failed the previous day.
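
One way to picture an idempotent daily job is sketched below. The in-memory `pending_work` and `completed` dictionaries are assumptions standing in for durable storage, and `daily_job` is a hypothetical name.

```python
import datetime as dt
from typing import Dict, List

# In-memory stand-ins for durable storage (assumptions for illustration).
pending_work: Dict[dt.date, List[str]] = {dt.date(2024, 1, 1): ["a", "b", "c"]}
completed: Dict[dt.date, List[str]] = {}


def daily_job(run_date: dt.date) -> int:
    """Process whatever is still outstanding for run_date; return items handled."""
    done = completed.setdefault(run_date, [])
    remaining = [item for item in pending_work.get(run_date, []) if item not in done]
    for item in remaining:
        done.append(item)  # record progress per item so a rerun resumes here
    return len(remaining)


first = daily_job(dt.date(2024, 1, 1))    # handles all three items
second = daily_job(dt.date(2024, 1, 1))   # rerun on the same day: nothing left
no_work = daily_job(dt.date(2024, 1, 2))  # no work scheduled: exits cleanly
```

Because progress lives with the data rather than in the function, running the job twice produces the same end state as running it once.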

Write defensive functions

Assume the function could encounter an exception at any time. Design your functions so they can continue from a previous failure point during the next execution. Consider a scenario that requires the following actions:

Query for 10,000 rows in a database.
Create a queue message for each of those rows to process further down the line.
In this scenario, we will create an Azure Temp Table or, depending on size, use a SQL instance with staging tables.

Depending on how complex your system is, you may encounter downstream services behaving badly, networking outages, quota limits being reached, and so on. Any of these can affect the function at any time.

Fault tolerance will be documented in each individual TDD.

How does the code react if a failure occurs after inserting 5,000 of those items into a queue for processing? Track the items that have been completed in a set. Otherwise, the function might insert them again on the next run. This double insertion can have a serious impact on your workflow.

If a queue item was already processed, allow the function to be a no-op.
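
The 10,000-row scenario can be sketched as follows. The lists and the `enqueued_ids` set are in-memory assumptions; in a real system the progress marker would need to be persisted. The `fail_after` parameter is a hypothetical hook used only to simulate an outage.

```python
from typing import List, Set

rows = list(range(10_000))      # stand-in for the queried rows
queued: List[int] = []          # stand-in for the downstream queue
enqueued_ids: Set[int] = set()  # progress marker; persisted in a real system


def enqueue_rows(fail_after: int = -1) -> None:
    """Enqueue each row once; optionally raise after fail_after new messages."""
    sent = 0
    for row in rows:
        if row in enqueued_ids:
            continue  # already handled in an earlier run: no-op
        if sent == fail_after:
            raise RuntimeError("simulated outage")
        queued.append(row)
        enqueued_ids.add(row)
        sent += 1


try:
    enqueue_rows(fail_after=5_000)  # first run fails partway through
except RuntimeError:
    pass
after_failure = len(queued)
enqueue_rows()                      # retry resumes without duplicates
```

After the retry, every row has been queued exactly once, even though the first run stopped halfway.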

Take advantage of defensive measures already provided for components in the Azure Functions platform. For example, see Handling poison queue messages in the documentation for Azure Storage Queue triggers and bindings.

Scalability best practices

There are several factors that impact how instances of function apps scale. The details are provided in the documentation for function scaling. The following are some best practices to ensure optimal scalability of a function app.

Don’t mix test and production code in the same function app

Functions within a function app share resources. For example, memory is shared. If you are using a function app in production, don’t add test-related functions and resources to it. Doing so can cause unexpected overhead during production code execution.

Be careful what is loaded in production function apps. Memory is averaged across each function in the app.

If there is a shared assembly referenced in multiple .NET functions, put it in a common shared folder. If you use C# script (.csx), reference the assembly with a statement like the following:

#r "..\Shared\MyAssembly.dll"

Otherwise, it is easy to accidentally deploy multiple test versions of the same binary that behave differently between functions.

Don’t use verbose logging in production code. It has a negative performance impact.

Use async code but avoid blocking calls

Asynchronous programming is a recommended best practice. However, always avoid referencing the Result property or calling the Wait method on a Task instance. Blocking on a task in this way can lead to thread exhaustion.
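
The guidance above is phrased for .NET tasks, but the same pitfall exists elsewhere. As a hedged analogy in Python (not the .NET API), the sketch below awaits a non-blocking sleep inside a hypothetical `fetch` coroutine so that five calls overlap instead of running back to back.

```python
import asyncio
import time


async def fetch(delay: float) -> float:
    """Stand-in for an I/O-bound call; awaiting lets other work proceed."""
    # Non-blocking wait. Calling time.sleep(delay) here would stall the event loop.
    await asyncio.sleep(delay)
    return delay


async def run_all() -> float:
    start = time.perf_counter()
    # Five 0.1-second calls overlap because none of them blocks the loop.
    await asyncio.gather(*(fetch(0.1) for _ in range(5)))
    return time.perf_counter() - start


elapsed = asyncio.run(run_all())
```

If each call blocked instead, the five calls would take roughly five times as long and tie up the thread for the whole duration.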

Tip: If planning to use the HTTP or WebHook bindings, plan to avoid port exhaustion that can be caused by improper instantiation of HttpClient. For more information, see How to manage connections in Azure Functions.

Receive messages in batch whenever possible

Some triggers, such as Event Hub, enable receiving a batch of messages in a single invocation. Batching messages has much better performance. You can configure the maximum batch size in the host.json file, as detailed in the host.json reference documentation.

For C# functions, you can change the type to a strongly typed array. For example, instead of EventData sensorEvent, the method signature could be EventData[] sensorEvent. For other languages, you must explicitly set the cardinality property in your function.json to many to enable batching.
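
To show the shape of a batch handler, here is a small Python sketch. In a real function the trigger binding would supply a list of event objects; plain dictionaries with a hypothetical `temperature` field are used here as an assumption so the example stays self-contained.

```python
from typing import Dict, List


def process_batch(events: List[Dict[str, float]]) -> float:
    """Handle a whole batch in one invocation; return the average reading."""
    readings = [event["temperature"] for event in events]
    return sum(readings) / len(readings)


batch = [{"temperature": 20.0}, {"temperature": 22.0}, {"temperature": 24.0}]
average = process_batch(batch)  # 22.0
```

One invocation handles all three events, so per-invocation overhead is paid once rather than three times.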

Configure host behaviors to better handle concurrency

The host.json file in the function app allows for configuration of host runtime and trigger behaviors. In addition to batching behaviors, you can manage concurrency for a number of triggers. Often, adjusting the values in these options can help each instance scale appropriately for the demands of the invoked functions.

Settings in the host.json file apply across all functions within the app, within a single instance of the function app. For example, if you had a function app with two HTTP functions and concurrent requests set to 25, a request to either HTTP trigger would count toward the shared 25 concurrent requests. If that function app scaled to 10 instances, the two functions would effectively allow 250 concurrent requests (10 instances * 25 concurrent requests per instance).
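
The arithmetic in this example can be written out as a short Python sketch; `effective_concurrency` is a hypothetical helper name.

```python
def effective_concurrency(instances: int, per_instance_limit: int) -> int:
    """Total concurrent requests across all instances of the function app."""
    return instances * per_instance_limit


total_requests = effective_concurrency(instances=10, per_instance_limit=25)  # 250
```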