January 23, 2022


Desktop High-Performance Computing – DZone DevOps


Ever since Amazon Web Services debuted in 2008, builders of complex engineering software systems have had increasingly powerful ways to scale heavy computational workloads in the cloud. 

Calculations that previously would have required the purchase of dozens of expensive servers can now be executed at a fraction of the cost in AWS. Unfortunately, not all engineering software packages are server-based, much less cloud-based. 

Many engineering teams rely on desktop products that only run on Microsoft Windows. Desktop engineering tools that perform tasks such as optical ray tracing, genome sequencing, or computational fluid dynamics often couple graphical user interfaces with complex algorithms that can take many hours to run on traditional workstations, even when powerful CPUs and large amounts of RAM are available. Until recently, there has been no convenient way to scale complex desktop computational engineering workloads seamlessly to the cloud.

Fortunately, the advent of the AWS Cloud Development Kit (CDK), Amazon Elastic Container Service (ECS), and Docker finally makes it easy to scale desktop engineering workloads written in C# and other languages to the cloud.

This article presents an approach for high-performance computing (HPC) for a contrived use case requiring the factoring of large sets of very big integers. Imagine that an engineering team, for some crazy reason, has a desktop program written in C# that finds factors of a group of integers, all larger than one quintillion. 

Even with a beefy processor, factoring just one integer larger than one quintillion can take several seconds. If a user needs to factor one thousand unique integers, say all the numbers from 1,000,000,000,000,000,001 to 1,000,000,000,000,001,000, they will be waiting quite a few minutes for results. 

The reason it takes so long is that for each number the algorithm must perform an enormous number of trial divisions, testing whether each one leaves a remainder. It would be very nice if we could scale these calculations onto elastic compute resources in the cloud while retaining our investment in existing algorithms written in a language like C#.

What We Will Build

We will leverage AWS CDK, C#, Docker, and AWS ECS Fargate to build a simple cloud solution that can be orchestrated from our desktop. The figure below shows the system architecture.

The desktop component first builds and packages a Docker image that can perform the engineering workload (factor an integer). AWS CDK, executing on the desktop, deploys the Docker image to AWS and stands up cloud infrastructure consisting of input/output worker queues and a serverless ECS Fargate cluster (1). 

Once the cloud infrastructure is provisioned, a client program can be executed on the desktop, sending integers to be factored to the cloud (2). Integers are sent to the cloud via the input queue (3) and resulting factors are returned from the ECS cluster (4) to the desktop via the output queue (5). 

The cluster starts with just one ECS task composed of a single Docker container running and is allowed to automatically scale up to five ECS tasks, each running its own Docker container, as load increases. When all the integers have been factored, the desktop client reports success and exits normally.

Prerequisites

To build this system, the following prerequisites are required. For brevity, the reader is referred to the source documentation for installation instructions via the links below:

  • .NET — provides tools for compiling and packaging the C# code.
    Note: the code in the example assumes you will work with .NET 6.0.
  • AWS CDK — enables programmatic provisioning of cloud infrastructure
    Note: be sure to follow all the instructions in the getting started guide, particularly the ones regarding installation and configuration of the CDK prerequisites.
  • Docker Desktop — allows building and management of Docker container images needed for execution on AWS ECS Fargate

Building the System

First, let’s create a new .NET solution containing three projects: one for the worker container image, one for the desktop client, and one for the CDK infrastructure. We will also add NuGet dependencies for Amazon Simple Queue Service (SQS) and AWS CDK. Open a terminal and type the following .NET commands:

dotnet new sln -o DesktopHPC
cd DesktopHPC
dotnet new console -o Client
dotnet sln DesktopHPC.sln add Client/Client.csproj
dotnet new console -o Worker
dotnet sln DesktopHPC.sln add Worker/Worker.csproj
dotnet new console -o CDK
dotnet sln DesktopHPC.sln add CDK/CDK.csproj
dotnet add Client package AWSSDK.SQS
dotnet add Worker package AWSSDK.SQS
dotnet add CDK package Amazon.CDK
dotnet add CDK package Amazon.CDK.AWS.SQS
dotnet add CDK package Amazon.CDK.AWS.ECS
dotnet add CDK package Amazon.CDK.AWS.ECS.Patterns

Now we are ready to add code. Open the generated “Program.cs” file in the “Worker” directory and replace its contents by pasting in the following:

using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Amazon.SQS;
using Amazon.SQS.Model;

public class Worker
{
  static async Task Main(string[] args)
  {
    var sqs = new AmazonSQSClient();
    var input = await sqs.GetQueueUrlAsync("InputQueue");
    var output = await sqs.GetQueueUrlAsync("OutputQueue");
    Console.WriteLine("waiting for numbers to factor on {0}", input.QueueUrl);
    do
    {
      var msg = await sqs.ReceiveMessageAsync(new ReceiveMessageRequest
      {
        QueueUrl = input.QueueUrl,
        MaxNumberOfMessages = 1,
        WaitTimeSeconds = 3
      });

      if (msg.Messages.Count != 0)
      {
        long n2 = Convert.ToInt64(msg.Messages[0].Body);
        Console.WriteLine("received input: {0}", n2);
        string factorsString = GetFactors(n2);
        Console.WriteLine("factors are: {0}", factorsString);
        await sqs.SendMessageAsync(output.QueueUrl, factorsString);
        await sqs.DeleteMessageAsync(input.QueueUrl, msg.Messages[0].ReceiptHandle);
      }
    } while (true);
  }

  static string GetFactors(long n)
  {
    string s = n.ToString() + " = ";
    List<long> factors = new List<long>();
    while (n % 2 == 0)
    {
      n = n / 2;
      factors.Add(2);
    }
    for (long i = 3; i * i <= n; i += 2) // i * i avoids floating-point rounding in Math.Sqrt
    {
      while (n % i == 0)
      {
        factors.Add(i);
        n = n / i;
      }
    }
    if (n > 2)
    {
      factors.Add(n);
    }
    return s + string.Join(" x ", factors);
  }
}

The worker code will be packaged into, and run from, a Docker container. The worker listens on the “InputQueue” for a new integer, factors it, and writes the resulting factors to the “OutputQueue”. Note that the ReceiveMessageRequest sets WaitTimeSeconds to three, so the worker long-polls SQS rather than busy-waiting between messages.

To build the Docker container image we need to add a “Dockerfile” in the same directory. Open a new file in the “Worker” directory called “Dockerfile” and paste in the following:

FROM mcr.microsoft.com/dotnet/runtime:6.0
COPY bin/Release/net6.0/publish/ App/
WORKDIR /App
ENTRYPOINT ["dotnet", "Worker.dll"]

Note again that we are assuming the installation of .NET 6.0. If you are using an older version such as 5.0, adjust the references above accordingly.
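Before deploying anything, it can be worth building and smoke-testing the worker image locally to catch packaging problems early. A sketch, assuming Docker Desktop is running; the `hpc-worker` tag is just an illustrative name:

```shell
# Publish a Release build so the Dockerfile's COPY path exists,
# then build the image from the Worker directory
dotnet publish Worker -c Release
docker build -t hpc-worker ./Worker

# Optional local run: the worker will only resolve the queues if AWS
# credentials are available and the queues have already been created
docker run --rm -e AWS_REGION=us-west-2 hpc-worker
```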

Now we are ready to code the client. Open the generated “Program.cs” file in the “Client” directory and replace its contents with the code below:

using System;
using System.Threading.Tasks;
using Amazon.SQS;
using Amazon.SQS.Model;

class Client
{
  static async Task Main(string[] args)
  {
    var sqs = new AmazonSQSClient();

    var output = await sqs.GetQueueUrlAsync("OutputQueue");
    var input = await sqs.GetQueueUrlAsync("InputQueue");
    long iStart = 1000000000000000001; // one quintillion one

    Console.WriteLine("Input number of large integers to factor, starting with one quintillion one:");

    int count = Convert.ToInt32(Console.ReadLine());

    Console.WriteLine("Sending {0} numbers to the cluster for factoring.", count);
    Console.WriteLine("From {0} to {1}", iStart, iStart + count - 1);

    for (long i = iStart; i < iStart + count; i++)
    {
      await sqs.SendMessageAsync(input.QueueUrl, i.ToString());
    }

    Console.WriteLine("Results:");

    int finishedCount = 0;

    while (finishedCount < count)
    {
      var msg = await sqs.ReceiveMessageAsync(new ReceiveMessageRequest
      {
        QueueUrl = output.QueueUrl,
        MaxNumberOfMessages = 1,
        WaitTimeSeconds = 3
      });

      if (msg.Messages.Count != 0)
      {
        Console.WriteLine(msg.Messages[0].Body);
        await sqs.DeleteMessageAsync(output.QueueUrl, msg.Messages[0].ReceiptHandle);
        finishedCount++;
      }

    }
    Console.WriteLine("Factoring complete, destroy cluster when done.");
  }
}

The client code above provides a simple command-line interface prompting the user for the number of large integers they would like to factor and then sends that number of integers to the ECS Fargate cluster via the “InputQueue”. The client then waits for the same number of results to appear via the “OutputQueue”, prints the results, and terminates once all the results are back.

Now that we have the worker and the client code, we just need the infrastructure CDK code. The nice thing about CDK is that you can write it in TypeScript, JavaScript, Java, or C#. Since the rest of our system is written in C#, we will use it here as well.

In the “CDK” directory, open the generated “Program.cs” file and replace its contents with the code below:

using Amazon.CDK;
using Amazon.CDK.AWS.ECS;
using Amazon.CDK.AWS.ECS.Patterns;
using Amazon.CDK.AWS.SQS;

public class Infrastructure : Stack
{
  static void Main(string[] args)
  {
    var app = new App();
    new Infrastructure(app, "HPCStack");
    app.Synth();
  }

  internal Infrastructure(
    Construct scope,
    string id,
    IStackProps props = null
  ) :
      base(scope, id, props)
  {
    var input =
      new Queue(this,
        "InputQueue",
        new QueueProps { QueueName = "InputQueue" });
    var output =
      new Queue(this,
        "OutputQueue",
        new QueueProps { QueueName = "OutputQueue" });
    var cluster =
      new Cluster(this,
        "HPCCluster",
        new ClusterProps { ClusterName = "HPCCluster" });
    var service =
        new QueueProcessingFargateService(this,
          "HPCService",
          new QueueProcessingFargateServiceProps
          {
            ServiceName = "HPCService",
            Cluster = cluster,
            MinScalingCapacity = 1,
            MaxScalingCapacity = 5,
            Image = ContainerImage.FromAsset("./Worker"),
            MemoryLimitMiB = 2048,
            Queue = input
          });
    output.GrantSendMessages(service.TaskDefinition.TaskRole);
  }
}

The CDK code automatically creates the input and output queues, the ECS Fargate cluster, and the ECS task definition utilizing our Docker image. The cluster is allowed to scale as needed from one worker instance up to a maximum of five. Each instance uses 2 GiB (2,048 MiB) of memory in AWS.
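If you want to inspect what CDK will provision before touching your AWS account, you can synthesize the CloudFormation template locally, passing the same `--app` argument used for deployment:

```shell
# Print the generated CloudFormation template without deploying anything
cdk synth --app "dotnet run --project CDK/CDK.csproj"
```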

Now we are ready to build the client, worker, and infrastructure projects. Navigate to the top-level “DesktopHPC” directory and run the following command to build the whole solution:

dotnet publish -c Release

Next, bootstrap the CDK. Bootstrapping sets up resources in your AWS account that CDK needs in order to deploy; you only have to do this once per account and region. Use the “cdk bootstrap” command with an argument formatted like “aws://ACCOUNT-NUMBER/REGION”. For example, if your AWS account ID is 123456789012 and you want to deploy to the us-west-2 region, type:

cdk bootstrap aws://123456789012/us-west-2

When you are ready to deploy the cloud infrastructure, type:

cdk deploy --app "dotnet run --project CDK/CDK.csproj"

You will be asked to confirm the creation of certain security-related resources. Answer with “y” and CDK will go to work standing up your infrastructure.

After a few minutes, the ECS Fargate cluster should be ready to use.

Log in to the AWS console and navigate to “Elastic Container Service -> Clusters.” You should see your cluster, named “HPCCluster” created and ready as shown below:
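If you prefer the terminal, the same check can be done with the AWS CLI (assuming it is installed and configured for the same account and region):

```shell
# Confirm the cluster and both queues exist
aws ecs describe-clusters --clusters HPCCluster
aws sqs get-queue-url --queue-name InputQueue
aws sqs get-queue-url --queue-name OutputQueue
```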

Now we are ready to generate some workload and see how it scales. On your desktop go back to the command line and run the client from the DesktopHPC directory:

dotnet run --project Client

You will be prompted for the number of integers you want to factor, starting with one quintillion one. Try entering “1000”.

Input number of large integers to factor, starting with one quintillion one:
1000
Sending 1000 numbers to the cluster for factoring.
From 1000000000000000001 to 1000000000000001000
Results:
1000000000000000001 = 101 x 9901 x 999999000001
1000000000000000002 = 2 x 3 x 17 x 131 x 1427 x 52445056723
1000000000000000003 = 1000000000000000003
1000000000000000012 = 2 x 2 x 13 x 487 x 4623217 x 8541289
1000000000000000005 = 3 x 5 x 44087 x 691381 x 2187161
1000000000000000004 = 2 x 2 x 1801 x 246809 x 562425889
1000000000000000011 = 3 x 53 x 389 x 16167887342161

...

As this runs, watch the ECS console. Although you start with just one running task, after a few minutes you will see the task count scale up to five and the factors will start coming in faster.
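You can also watch the scale-out from the terminal; a sketch using the AWS CLI, with the cluster, service, and queue names matching the stack above:

```shell
# How many worker tasks are currently running
aws ecs describe-services --cluster HPCCluster --services HPCService \
  --query "services[0].runningCount"

# How many integers are still waiting to be factored
aws sqs get-queue-attributes \
  --queue-url "$(aws sqs get-queue-url --queue-name InputQueue \
                 --query QueueUrl --output text)" \
  --attribute-names ApproximateNumberOfMessages
```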

Try running the client again with even more numbers to factor.

Cleanup

When you are all done, remember to use CDK to destroy the cloud formation stack so you won’t be charged for a cluster you no longer need.

cdk destroy --app "dotnet run --project CDK/CDK.csproj" -f

Make sure the destroy command really deleted your ECS cluster; use the AWS console to verify that the cluster is no longer running.
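A quick CLI double-check after the destroy finishes; the HPC cluster and both queues should no longer appear in the listings:

```shell
aws ecs list-clusters
aws sqs list-queues
```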

Code

The complete sample code for this article can be obtained from GitHub.
