Raptisv Blog

Microsoft Orleans - Introduction

Raptis Evangelos

Oct 03, 2021

The purpose of this article is to provide a smooth introduction to Orleans for developers who are new to concepts like distributed applications, cluster computing and the actor model.

Distributed applications

A typical web application consists of a web server that accepts HTTP requests over the internet and persists data in a database. If we add two or more servers, usually behind a load balancer, the same application is considered distributed. Simple as that. The servers do not talk to each other, so the setup adds little complexity. All instances serve the same purpose and scaling is easy: we just need to add more servers.

Besides scaling, most web applications need to cache data. An in-memory cache within the server is the solution in most cases. But what about critical data that needs to be cached in one place and also needs to be accessible from all servers? This is where a distributed cache is especially useful.

Let's take the "Apples Bucket" game as an example. This is a simple app/game where each player has a number of apples and decides when to sell or when to buy some. Let's assume that the count of apples in each bucket is cached in order to avoid database requests. The cache is invalidated every time apples are moved around. When asking how many apples a user has at any moment, we should get the same answer no matter which server responds to the request. For that reason, we cache this critical piece of information in a distributed cache system.
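The caching scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `AppleStore`, `SharedCache` and `WebServer` are hypothetical names, and the "distributed" cache is simulated by a single in-process object shared by both servers rather than a real cache product.

```python
class AppleStore:
    """Stand-in for the database: the source of truth for apple counts."""

    def __init__(self):
        self.counts = {}
        self.reads = 0  # how many times the "database" was queried

    def get(self, user):
        self.reads += 1
        return self.counts.get(user, 0)

    def add(self, user, delta):
        self.counts[user] = self.counts.get(user, 0) + delta


class SharedCache:
    """Stand-in for a distributed cache that every web server talks to."""

    def __init__(self):
        self.entries = {}


class WebServer:
    def __init__(self, store, cache):
        self.store = store
        self.cache = cache

    def apple_count(self, user):
        # Read through the cache; only hit the database on a miss.
        if user not in self.cache.entries:
            self.cache.entries[user] = self.store.get(user)
        return self.cache.entries[user]

    def trade(self, user, delta):
        # Apples moved: update the database and invalidate the cached count.
        self.store.add(user, delta)
        self.cache.entries.pop(user, None)


store, cache = AppleStore(), SharedCache()
server_a, server_b = WebServer(store, cache), WebServer(store, cache)

server_a.trade("alice", 5)
count_a = server_a.apple_count("alice")  # cache miss, reads the database
count_b = server_b.apple_count("alice")  # cache hit, same answer on another server
```

Both servers return the same count because they share one cache, and the database is read only once until the next trade invalidates the entry.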

A distributed cache is very useful, yet it introduces another single point of failure besides the database. If the distributed cache fails, none of the servers is able to perform its task.

Besides that, the load balancer should be somehow aware of the status of all servers. If a server fails, the balancer should stop sending requests to this server. This task requires a separate monitoring system that informs the balancer of the "sick" servers.
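As a rough sketch of that idea, the balancer below hands out servers in round-robin order and skips any server that the monitoring system has reported as sick. The class name `LoadBalancer`, the `report` method and the server names are assumptions made for this example, not part of any particular product.

```python
class LoadBalancer:
    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0  # index of the next server to try

    def report(self, server, is_healthy):
        # Called by the (hypothetical) monitoring system.
        if is_healthy:
            self.healthy.add(server)
        else:
            self.healthy.discard(server)

    def pick(self):
        # Round-robin over all servers, skipping the ones marked as sick.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


balancer = LoadBalancer(["web-1", "web-2", "web-3"])
balancer.report("web-2", False)  # monitoring flags web-2 as sick
picks = [balancer.pick() for _ in range(4)]
```

While `web-2` is marked as sick, requests alternate between `web-1` and `web-3`. Once the monitor reports it healthy again, it rejoins the rotation.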

There is absolutely nothing wrong with this setup. This is a very common architecture for a distributed web application. Can we do better though? Sure we can!

Clustered applications

A cluster is a group of servers that act like a single system. Some typical features of this kind of application are:

- The servers (aka nodes) communicate within the cluster as they perform their tasks, through a process usually referred to as gossip.
- All nodes are fully aware of which nodes are "healthy". "Sick" nodes are removed from the cluster until they are declared "healthy" again (through gossip).
- Load balancing is handled by the cluster itself. Since the cluster nodes communicate, they are aware of each other's load and act accordingly.
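To make the gossip idea more concrete, here is a minimal heartbeat-based sketch. It is a generic illustration and not how any specific framework implements membership: the `Node` class, the `max_silence` threshold and the "everyone talks to everyone" exchange are simplifying assumptions (real gossip protocols usually pick a few random peers per round).

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.round = 0
        # name -> (highest heartbeat seen, local round in which it last advanced)
        self.view = {name: (0, 0)}

    def tick(self):
        # Advance the local clock and bump our own heartbeat.
        self.round += 1
        beat, _ = self.view[self.name]
        self.view[self.name] = (beat + 1, self.round)

    def receive(self, other_view):
        # Merge a peer's view: keep the highest heartbeat seen for every node.
        for name, (beat, _) in other_view.items():
            known_beat = self.view.get(name, (-1, 0))[0]
            if beat > known_beat:
                self.view[name] = (beat, self.round)

    def healthy(self, max_silence=2):
        # A node is "sick" if its heartbeat has not advanced for too many rounds.
        return sorted(
            name for name, (_, seen) in self.view.items()
            if self.round - seen <= max_silence
        )


def gossip_round(alive):
    for node in alive:
        node.tick()
    for sender in alive:
        snapshot = dict(sender.view)
        for receiver in alive:
            if receiver is not sender:
                receiver.receive(snapshot)


a, b, c = Node("a"), Node("b"), Node("c")
for _ in range(3):
    gossip_round([a, b, c])
before_crash = a.healthy()

for _ in range(3):
    gossip_round([a, b])  # c has crashed and no longer gossips
after_crash = a.healthy()
```

After three rounds without hearing a fresh heartbeat from `c`, node `a` stops listing it as healthy, without any external monitoring system being involved.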

We have now eliminated the need for a load balancer and for a system that monitors healthy and sick servers/nodes. We have not yet eliminated the need for a distributed cache. Let's do that!

What if we could split our application's domain model into completely separate instances, each one containing every piece of business logic it requires, along with its critical data cached close by? If we could do that, each domain instance would be able to respond on behalf of the piece of domain logic for which it is responsible. Such an instance would be in a position to act independently and hold its own state. It should be clear by now that such an instance is an actor.

Actor model

The actor model dates back to 1973 and defines a set of rules under which actors operate. In the actor model, each logical unit is an actor. Actors communicate with each other and can create other actors. Actors have state and use it to process messages (requests). Each actor processes one message at a time (usually, but not exclusively), while pending messages wait in the actor's mailbox.
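The mailbox idea can be illustrated with a small Python sketch built on `asyncio`. This is a toy actor, not an actor framework: `CounterActor` and its `send` method are names invented for the example, and the mailbox is simply an `asyncio.Queue` drained by a single task so that messages are handled one at a time.

```python
import asyncio


class CounterActor:
    def __init__(self):
        self.apples = 0                 # private state, touched only by run()
        self.mailbox = asyncio.Queue()  # pending messages wait here

    async def run(self):
        # Process exactly one message at a time, in arrival order.
        while True:
            delta, reply = await self.mailbox.get()
            self.apples += delta
            reply.set_result(self.apples)

    async def send(self, delta):
        # Append a message to the mailbox and wait for the actor's answer.
        reply = asyncio.get_running_loop().create_future()
        await self.mailbox.put((delta, reply))
        return await reply


async def main():
    actor = CounterActor()
    worker = asyncio.create_task(actor.run())
    # 100 concurrent senders; the actor still applies them one by one.
    results = await asyncio.gather(*(actor.send(1) for _ in range(100)))
    worker.cancel()
    return actor.apples, sorted(results)


final_count, replies = asyncio.run(main())
```

Because only the actor's own loop touches `apples`, no locks are needed even though many callers send messages concurrently.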

Each actor is instantiated at a specific physical location (server) and "lives" at that location until it is no longer required. We do not (and should not) care where that location is. Actor placement and location are handled by the respective framework. We just need to be able to locate the actor and append messages to its mailbox. For that reason, each actor has a unique identifier within the system.

Actors are fault tolerant, meaning that if an actor fails for some reason (e.g. hardware failure), the same actor will be instantiated on another node (server) within the cluster and continue to process messages from its mailbox.

Writing distributed applications is challenging in itself. Programming actors adds extra complexity, since the engineer has to handle the issues that complex distributed systems bring, such as resource management, error recovery and concurrency. Microsoft Orleans attempts to hide that complexity and provide a familiar programming model for distributed applications.

Microsoft Orleans a.k.a. The Virtual Actor Model

Microsoft Orleans introduced the Virtual Actor model some years back. Orleans aims to hide the implementation details and offer a hassle-free programming model that does not require distributed programming expertise. Orleans cannot be considered a pure actor model framework, since its actors do not have the ability to generate new actors. What is really important though is the following:

Orleans takes familiar concepts like objects, interfaces, async/await, and try/catch and extends them to multi-server environments. As such, it helps developers experienced with single-server applications transition to building resilient, scalable cloud services and other distributed applications.

Before proceeding, let's set up the terminology and key concepts.

The Grain is the building block within the Orleans ecosystem. A Grain is the equivalent of an Actor. Grains have:
- Identity. A Grain is unique within the cluster and is referenced by its identity.
- Behavior. A Grain encodes its behavior in the code logic.
- State. A Grain encapsulates its state.
The Silo is where the Grains live. A node in a cluster hosts one or more Silos and Silos host Grains.

In the "Apples Bucket" game, we have finally eliminated the need for a distributed cache. Every user is a Grain instance and the Grain's state is the user's apples bucket. There is only one instance of each user in the cluster and the Orleans framework knows how to talk to it. Every time we ask for a user's apple count, the framework routes that request to that specific Grain, no matter where it is located within the cluster.
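The routing behaviour described above can be mimicked with a toy model in Python. This is deliberately not Orleans code and does not use the Orleans API: `Cluster`, `BucketGrain`, `get_grain`, `locate` and `fail_silo` are hypothetical names, the "fewest grains" placement rule is an arbitrary assumption, and persistence is a plain dictionary. It only shows the shape of the idea: one activation per identity, created on demand, found by its id, and re-created elsewhere after a failure.

```python
class BucketGrain:
    """One user's apples bucket; the grain's state is the apple count."""

    def __init__(self, user_id, apples):
        self.user_id = user_id
        self.apples = apples


class Cluster:
    def __init__(self, silo_names):
        self.silos = {name: {} for name in silo_names}  # silo -> {user_id: grain}
        self.storage = {}                               # persisted apple counts

    def locate(self, user_id):
        for name, grains in self.silos.items():
            if user_id in grains:
                return name
        return None

    def get_grain(self, user_id):
        silo = self.locate(user_id)
        if silo is None:
            # Not active anywhere: activate on the least loaded silo
            # and restore the state from storage.
            silo = min(self.silos, key=lambda name: len(self.silos[name]))
            apples = self.storage.get(user_id, 0)
            self.silos[silo][user_id] = BucketGrain(user_id, apples)
        return self.silos[silo][user_id]

    def add_apples(self, user_id, delta):
        grain = self.get_grain(user_id)
        grain.apples += delta
        self.storage[user_id] = grain.apples  # persist after every change
        return grain.apples

    def fail_silo(self, name):
        # Simulate a crash: every activation on that silo is lost.
        del self.silos[name]


cluster = Cluster(["silo-1", "silo-2"])
cluster.add_apples("alice", 5)
first_home = cluster.locate("alice")

cluster.fail_silo(first_home)
after_failover = cluster.add_apples("alice", 1)
second_home = cluster.locate("alice")
```

The caller only ever uses the identity `"alice"`. Where the activation lives, and the fact that it moved after the failure, stays hidden behind `get_grain`.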

Orleans Clients

Assuming we have set up our Orleans Silo, how do we talk to Grains from our code? For that we need to set up an Orleans Client. There are two types of Orleans Clients.

Co-hosted Client. This type of Client runs within the same process as the Silo. This is the most common Client and is probably the only one you will ever need. It is initialized along with the Silo and can be obtained directly from the hosting application's dependency injection container.

External Client. Client code can run outside of the Orleans cluster where Grain code is hosted. This Client needs to be initialized and pointed to the cluster where the Grains "live". It is the Client's responsibility to locate Grains and route requests. We just need the Grain Interface in order to make the call. There are very specific cases where you might need to set up an External Client, like in the setup depicted below.

That's it! You should now be familiar with the key concepts of Orleans, and it should be clearer whether it is a good fit for your application. The Microsoft Orleans project has great documentation here. Take the time to read it, it is totally worth it! You may also follow this official sample in order to set up your first Orleans project -> Creating a Minimal Orleans Application

Have fun 😏