How SOCI Index Reduced my Image Load Times by 70%

How SOCI Index Reduced my Image Load Times by 70%

Hello my fellow Docker-AWS-developing folks! I got some good stuff to share with you today.

  • Do you have large (1GB+) images that takes ages to pull?
  • Does this image take longer to setup than to run the main process?

Large images are a big problem when you want tasks to be done in a set amount of time. This is especially the case when you use AWS Fargate, where tasks are spun up and shut down each time. You don’t want your tasks to be sitting around waiting for the image to download!

We’re here to save time and money!!!!! And that is exactly what we’re doing today.

I have found a great tool to save up to 70% of time when pulling my Docker images.

Without further ado, let me introduce to you: SOCI Index on AWS!

But before I delve into the intricacies of SOCI indices, let me lay out some background information.

Docker

Docker is a PaaS product that allows developers to package applications in a portable way. Applications in Docker are run inside containers, which can be thought of as lightweight virtual machines. It holds everything that the app needs to run — code, libraries, dependencies and even the OS! This container can be moved and run on any machine that has Docker installed.

Docker images are templates for creating the containers. It has all the instructions required to create the container. Docker images are typically built from a base image and have extra instructions to suit the use case.

ECR

Amazon ECR (Elastic Container Registry) is a fully-managed AWS service that allows users to store Docker images in AWS. You get all the fully-managed AWS benefits: highly available, durable and minimal operational overhead.

Since it is offered in AWS, it integrates with other AWS services very well. For example, you can apply permissions to your IAM users to set access controls to your docker images. You can also deploy your images in ECR to container orchestration services like ECS and EKS very easily. You can even deploy your docker images to Lambda as well.

ECS

Amazon ECS (Elastic Container Service) is a fully-managed AWS service that allows users to run their docker containers. With ECS, its really easy to deploy, manage and scale docker containers on AWS!

You can specify the CPU, memory and networking configurations for your applications, and schedule your containers with ease as well. You can even set it to auto-scale based on your performance metrics too!

Integrate this with ECR to easily access your docker images and run them seamlessly.

Fargate

Amazon Fargate is a compute offering from AWS, available in ECS. It’s a serverless offering, meaning you can run your containers without having to manage EC2 instances. Just specify the CPU and memory and off you go! Don’t worry about setting up servers.

Now that the background information is out of the way, let me discuss my use case.

I have a Data Science team who wants to run some simulations on AWS. They have kindly created a Docker Image to package their application code. I have stored the image to ECR — Its pushed and ready to go.

Our data scientists wants to run their simulation once every 5 minutes. That’s perfect, because we can use ECS with Fargate to spin up the docker container, run the simulation and shut down after its finished. We are only paying for when the simulation is running.

The team also needs to run their simulation within two minutes. That is fine, because when we tested it locally the simulations only took a minute!

…hang on a minute. When I run it on AWS, its taking two and a half minutes! What’s going on?

Diagnosing the problem

The aws cli can give us a hint on what is going on.

aws ecs describe-tasks --tasks <TASK_ID>

The output might look something like this:

{
    "version": "0",
    "detail-type": "ECS Task State Change",
    "source": "aws.ecs",
    "region": "ap-southeast-2",
    "resources": [
        "arn"
    ],
    "detail": {
        "attachments": [
            {
                "type": "eni",
                "status": "DELETED",
                "details": [
                    {
                        "name": "subnetId",
                        "value": "subnet"
                    },
                    {
                        "name": "networkInterfaceId",
                        "value": "eni"
                    },
                    {
                        "name": "macAddress",
                        "value": "mac"
                    },
                    {
                        "name": "privateDnsName",
                        "value": "ip"
                    },
                    {
                        "name": "privateIPv4Address",
                        "value": "ip"
                    }
                ]
            }
        ],
        "attributes": [
            {
                "name": "ecs.cpu-architecture",
                "value": "x86_64"
            }
        ],
        "availabilityZone": "ap-southeast-2b",
        "clusterArn": "arn",
        "connectivity": "CONNECTED",
        "connectivityAt": "date",
        "containers": [
            {
                "containerArn": "arn",
                "exitCode": 0,
                "lastStatus": "STOPPED",
                "name": "name",
                "image": "image",
                "imageDigest": "sha256:",
                "runtimeId": "id",
                "taskArn": "arn",
                "networkInterfaces": [
                    {
                        "attachmentId": "",
                        "privateIpv4Address": ""
                    }
                ],
                "cpu": ""
            }
        ],
        "cpu": "16384",
        "createdAt": "",
        "desiredStatus": "STOPPED",
        "enableExecuteCommand": false,
        "ephemeralStorage": {
            "sizeInGiB": 20
        },
        "executionStoppedAt": "",
        "group": "",
        "launchType": "FARGATE",
        "lastStatus": "STOPPED",
        "memory": "32768",
        "overrides": {
            "containerOverrides": [
                {
                    "name": ""
                }
            ]
        },
        "platformVersion": "1.4.0",
        "pullStartedAt": "2024-04-12T05:56:00.171Z",
        "pullStoppedAt": "2024-04-12T05:57:30.739Z",
        "startedAt": "2024-04-12T05:57:32.323Z",
        "startedBy": "",
        "stoppingAt": "2024-04-12T05:58:35.034Z",
        "stoppedAt": "2024-04-12T05:58:49.111Z",
        "stoppedReason": "Essential container in task exited",
        "stopCode": "EssentialContainerExited",
        "taskArn": "",
        "taskDefinitionArn": "",
        "updatedAt": "2024-04-12T05:58:49.111Z",
        "version": 5
    }
}

We are interested in the time — especially pullStartedAt and pullStoppedAt. If you take a look, it shows that the image took a minute and a half to pull!

That’s not good! We want our simulations to finish in 2 minutes, but its taking 1.5 minutes just to get ready! We need a way to speed up the startup process. This is where SOCI index comes in.

SOCI Index

Seekable Open Container Initiative (SOCI) is an add-on technology within the docker ecosystem. It allows lazy loading of container images, meaning images can be pulled and loaded faster than conventional methods.

When you use SOCI, you first create a SOCI index for your image. The index puts a priority on the files and layers inside your image. This allows only the necessary portions of the image to be downloaded on-demand when setting up the image, rather than downloading the whole thing before starting up.

SOCI can also track the location of files within the compressed image layers. When a file is accessed, only the relevant compressed chunks containing the file are downloaded and decompressed, so that the download and decompression operations are kept at a minimum.

Overall, its a very innovative technology that enable on-demand loading of your images. This improves the startup performance and efficiency for Docker containers.

If you want to read more about SOCI indexes, check out this AWS Blog Post here: https://aws.amazon.com/blogs/containers/under-the-hood-lazy-loading-container-images-with-seekable-oci-and-aws-fargate/

Using it on AWS

In order to utilise the SOCI technology, we first have to create the SOCI indexes from our Docker images.

To do this, we have to install and use some SOCI software to generate the indexes and push it to our image repositories.

Thankfully, AWS has created a Github Repo with Cloudformation stacks that will deploy a SOCI Index Builder on your account.

This stack will deploy Lambda functions and EventBridge rules in your AWS account. These resources will automatically detect new images being pushed into your ECR repositories, create SOCI indexes for them and push the index into the same ECR repository. When using a SOCI indexed image, there is nothing special that you have to do. You can pull images from ECR just like you would normally do and you can see a boost in performance straight away. ECR will automatically use the SOCI index that is tied to your image to improve its performance.

For more information about AWS’s SOCI Index Builder, refer to https://aws-ia.github.io/cfn-ecr-aws-soci-index-builder/

My Use Case

As an example, when I first tried out SOCI indexes, I had a Docker image that was around 1GB in size. When I pull the image normally, my Fargate tasks take about 60 seconds to start up.

When I used SOCI indexes, the time was cut down to 15 seconds!! That is a 75% improvement.

Note that your experience with SOCI Index may vary based on the size of your Docker images. Official documentation recommend using SOCI index for any image larger than 250MB. Anything less and you are not likely to see a time reduction.

Read more

今年読んだ本を振り返る 2024

今年読んだ本を振り返る 2024

早いもので、今年も年の瀬。皆さん、年末いかがお過ごしでしょうか。 いきなりですが、読書歴って人の性格とマイブームを如実に写し出すものだと思うんですね。というわけで、年末というキリのいい時期に、今年読んだ本を振り返って見ようと思います。 「ああ、年初はこんなこと考えてたなぁ」 「こんな本読んでたっけ!?」 思い出にふける自分用の記事になってます。 「最強!」のニーチェ入門 幸福になる哲学 (河出文庫) 去年末、國分 功一郎先生の「暇と退屈の倫理学」を読んでから哲学にハマってました。その流れを汲んで2024年一発目の本は哲学書。「暇と退屈の倫理学」の中でニーチェに触れ、彼がカッコいいと思いニーチェ入門書を購入…みたいな理由だったと思います。 数年前、飲茶先生の著書「史上最強の哲学入門」を読み、"とにかく分かりやすいなー!"と思ったのは覚えてます。哲学の入門書といえば飲茶先生です。 この「ニーチェ入門」も、著者と一般人の対話形式なので凄く分かりやすいです。哲学書によくある難しい言い回しはないですし、ページ数も少ないので一般人でも読みやすい本になってます。 入門書なんですが、

By Roland Thompson
Keychron Q11のすゝめ

Keychron Q11のすゝめ

年がら年中パソコンに張り付いてるネット民の同志の皆様、ごきげんよう。快適なパソコンライフ送ってますか? 私はと言うと、実はとても調子がいいのです! え?なぜかって?実はこいつのおかげでしてね・・・ Keychron Q11!! このスパッと2つに割れたキーボードを使い始めてから体の調子がすこぶる良くて、 * 長年悩まされた肩痛が治った * 姿勢が良くなった * 彼女ができた * 宝くじがあたった * この前、初めてホームランを打ったの! と、いいこと三昧なんですね。 そこで今回はこのKeychron Q11を皆さんに宣伝紹介したいと思います。 私とキーボード 我々の生活にスマホが普及して久しいですが、パソコンもまだまだ現役。読者の皆様は常日頃からパソコンを使う方が多いと思います。 総務省の情報通信白書によると、パソコンの世帯保有率は70%、インターネット利用率は50%前後を推移しています。まだまだ高い水準ですね。 かく言う私もパソコンがないと生きていけない生活を送ってます。 * まず、仕事がエンジニアなので、パソコンがないと飯が食えない * 続

By Roland Thompson