Building a scalable API on AWS Spot Instances

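Hi! My name is Kirill, and I am the CTO of Adapty. Most of our architecture runs on AWS, and today I will explain how we cut our server costs by a factor of 3 by using Spot Instances in production, and how to set them up with auto scaling. First comes an overview of how it all works, then detailed launch instructions.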



What are Spot Instances?



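Spot Instances are EC2 capacity that is currently sitting idle, and it is sold at a steep discount (Amazon advertises up to 90%; in our experience it is about 3x, depending on the region, Availability Zone and instance type). The main difference from regular instances is that they can be shut down at any moment. For that reason, we long believed they were fine for dev environments or for computation jobs that save intermediate results to S3 or a database, but not for production. There are third-party solutions that let you run Spot Instances in production, but they needed too many workarounds for our case, so we did not adopt them. The approach described in this article works entirely within standard AWS functionality, with no additional scripts, cron jobs and so on.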



Below are a few screenshots showing the Spot Instance price history.



m5.large in the eu-west-1 (Ireland) region. The price has been mostly stable for 3 months; the current saving is 2.9x.



[Image]



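m5.large in the us-east-1 (N. Virginia) region. The price has kept changing over 3 months; the current saving ranges from 2.3x to 2.8x depending on the Availability Zone.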



[Image]



t3.small in the us-east-1 (N. Virginia) region. The price has been stable for 3 months; the current saving is 3.4x.



[Image]



Service architecture



The diagram below shows the basic architecture of the service discussed in this article.



[Image]



Application Load Balancer → EC2 Target Group → Elastic Container Service



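The external balancer is an Application Load Balancer (ALB), which sends requests to an EC2 Target Group (TG). The TG links the instance ports that the ALB calls to the ports of containers in Elastic Container Service (ECS). ECS is the AWS analogue of Kubernetes and manages Docker containers.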



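A single instance can run several containers that listen on the same port, so host ports cannot be fixed in advance. ECS tells the TG that it has launched a new task (a pod, in Kubernetes terms), and a free port on the instance is assigned to that task. The TG also regularly checks, via a health check, that the instance and the API on it are working, and if it sees a problem it stops sending requests there.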



EC2 Auto Scaling Groups + ECS Capacity Providers



EC2 Auto Scaling Groups (ASG). , . AWS ECS. ECS , , CPU, RAM . , .



ECS Capacity Providers (ECS CP). ECS ASG, , ( ASG). , ECS CP , ASG, . ECS CP , , , .



EC2 Launch Templates



, , , — EC2 Launch Templates. , , . , , . , . , , ECS .



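This is where the ECS_ENABLE_SPOT_INSTANCE_DRAINING=true setting matters. With it, when ECS receives the notice that a Spot Instance is being taken away, it puts the instance into the Draining status: no new tasks are placed on it, and the tasks already running there are stopped and replaced on other instances. The notice arrives 2 minutes before the instance is shut down, so everything has to finish within those 2 minutes.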



— AWS Elastic File System (EFS) ECS, , , . SIGINT ( Draining) 30 , , ECS_CONTAINER_STOP_TIMEOUT. 2 .





. , . , - . AWS, CloudFormation Terraform. Adapty Terraform.



EC2 Launch Template



, . EC2 -> Instances -> Launch templates.



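Amazon machine image (AMI): the image that all instances are launched from. For ECS it is usually best to use the ECS-optimized image from Amazon, which already includes everything ECS needs. To find the current image ID, open the Amazon ECS-optimized AMIs page, pick your region and copy the AMI ID for it. For example, for us-east-1 the ID at the time of writing was ami-00c7c1cf5bdc913ed. Paste this ID into Specify a custom value.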



Instance type — . , .



Key pair (login): the key pair used to connect to the instances over SSH if needed.



Network settings — . Networking platform Virtual Private Cloud (VPC). Security groups — . , , . 2 , , (inbound) 80 (http) 443 (https), , . (outbound) TCP . , , - .



Storage (volumes) — . , AMI, ECS Optimized — 30 GiB.



Advanced details — .



Purchasing option — . , , Auto Scaling Group, .



IAM instance profile — , . , ECS, , ecsInstanceRole. , , , . .

, , . EBS-optimized instance T2/T3 Unlimited, burstable .



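User data: a script that runs when the instance starts. We use it to write /etc/ecs/ecs.config, the file that holds the ECS agent settings.

An example of user data: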



#!/bin/bash
echo ECS_CLUSTER=DemoApiClusterProd >> /etc/ecs/ecs.config
echo ECS_ENABLE_SPOT_INSTANCE_DRAINING=true >> /etc/ecs/ecs.config
echo ECS_CONTAINER_STOP_TIMEOUT=1m >> /etc/ecs/ecs.config
echo ECS_ENGINE_AUTH_TYPE=docker >> /etc/ecs/ecs.config
echo "ECS_ENGINE_AUTH_DATA={\"registry.gitlab.com\":{\"username\":\"username\",\"password\":\"password\"}}" >> /etc/ecs/ecs.config


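ECS_CLUSTER=DemoApiClusterProd: tells the ECS agent which cluster the instance belongs to, so tasks from that cluster can be placed on it. We have not created the cluster yet, but we will use this name when we do.

ECS_ENABLE_SPOT_INSTANCE_DRAINING=true: when a Spot Instance receives its shutdown notice, all tasks on it are moved to the Draining status.

ECS_CONTAINER_STOP_TIMEOUT=1m: after receiving the SIGINT signal, processes get 1 minute before they are killed.

ECS_ENGINE_AUTH_TYPE=docker: specifies that the docker scheme is used for registry authentication.

ECS_ENGINE_AUTH_DATA=...: the connection parameters for the private container registry that stores your Docker images. If the registry is public, nothing needs to be specified.

For this article I will use a public image from Docker Hub, so the ECS_ENGINE_AUTH_TYPE and ECS_ENGINE_AUTH_DATA parameters are not needed.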



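Good to know: it is worth updating the AMI regularly, because new versions bring updated Docker, Linux, the ECS agent and so on. To avoid forgetting, you can set up notifications about new releases. You can receive them by email and update by hand, or write a Lambda function that automatically creates a new Launch Template version with the updated AMI.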



EC2 Auto Scaling Group



The Auto Scaling Group is responsible for launching and scaling the instances. Go to EC2 -> Auto Scaling -> Auto Scaling Groups.



Launch template — . .



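Purchase options and instance types: the instance settings. Adhere to launch template takes the instance type from the Launch Template. Combine purchase options and instance types lets you configure instance types flexibly. We will use the latter.

Optional On-Demand base: the number of regular, non-Spot instances that will always be running.

On-Demand percentage above base: the ratio of regular to Spot Instances above the base. With 50-50 they are split equally; with 20-80, 4 Spot Instances are launched for every regular one. Splits such as 50-50, 20-80 and 0-100 are all possible.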
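To make the base and percentage settings concrete, here is a minimal sketch of the arithmetic. The function name split_capacity is made up for illustration, and rounding the On-Demand share up is an assumption; the exact rounding the ASG applies may differ.

```shell
# Hypothetical helper: estimate how a group of a given size splits into
# On-Demand and Spot Instances.
#   $1 = desired capacity, $2 = On-Demand base, $3 = On-Demand % above base
# Assumption: the On-Demand share above the base is rounded up.
split_capacity() (
  desired=$1; base=$2; pct=$3
  # The base is filled with On-Demand first and cannot exceed the group size.
  if [ "$base" -gt "$desired" ]; then base=$desired; fi
  above=$(( desired - base ))
  # Integer ceiling of above * pct / 100.
  od_above=$(( (above * pct + 99) / 100 ))
  on_demand=$(( base + od_above ))
  echo "on_demand=${on_demand} spot=$(( desired - on_demand ))"
)

split_capacity 10 0 20   # prints: on_demand=2 spot=8
split_capacity 10 2 50   # prints: on_demand=6 spot=4
```

With a base of 0 and a 20-80 split, a group of 10 ends up with 2 regular instances and 8 Spot Instances, which is the "4 Spot per regular one" ratio.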



Instance types — , . , . , . , )



[Image]



Network — , VPC , .



Load balancing — , , . Health checks .



Group size — . , .



Scaling policies — , , ECS , .



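Instance scale-in protection: protects instances from being removed when the group scales in. It is needed so that the ASG does not terminate a machine that still has running tasks, and it must be enabled for the ECS Capacity Provider to manage termination.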



Add tags — ( Tag new instances). Name, , , , .



[Image]



Advanced configurations, .



Termination policies — , . . , . Launch Template (, AMI, , ). , . .



[Image]



: , Instance Refresh. Lambda- , . instance scale-in protection . , , Instance management.



Application Load Balancer and EC2 Target Group



EC2 → Load Balancing → Load Balancers. Application Load Balancer, .



Listeners — 80 443 80 443 .



Availability Zones — .



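Configure Security Settings: here you choose the SSL certificate for the client; the most convenient option is to issue one in ACM. The Security Policy can be left at the default, ELBSecurityPolicy-2016-08. After the balancer is created you will see its DNS name, and you need to point a CNAME record for your domain at it. This can be done, for example, in Cloudflare.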



[Image]



Security Group — , EC2 Launch Template → Network settings.



Target group — , , . Target type Instance, Protocol Port , HTTPS , . , 80 .



Health checks — . , -, -. , , . Success codes 200-399, Docker , , 304 .



[Image]



Register Targets — , ECS, .



: , S3 . , SQL- S3 Athena. - . S3 .



ECS Task Definition



, , , . ECS → Task Definitions.



Launch type compatibility — EC2.



Task execution IAM role: ecsTaskExecutionRole.

Under Container Definitions, click Add Container.

Image: the link to the image. For this example I will use a public image from Docker Hub, bitnami/node-example:0.0.1.



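Memory Limits: the memory limits for the container. Hard Limit: if the container goes above it, docker kill is executed and the container dies immediately. Soft Limit: the container may go above it, but this value is what counts when tasks are placed on machines. For example, if a machine has 4 GiB of RAM and the container's soft limit is 2048 MiB, the machine can run at most 2 tasks with that container. In reality 4 GiB of RAM is slightly less than 4096 MiB, which you can see on the ECS Instances tab of the cluster. The soft limit cannot be greater than the hard limit.

Port mappings: Host port is set to 0, which means the port is assigned dynamically and tracked by the Target Group. Container Port is the port your application listens on; it is often set in the launch command or in the Dockerfile. For our example we use 3000, because that is what the image's Dockerfile specifies.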



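Health check: the container's own health check parameters, not to be confused with the one configured in the Target Group.

Environment: the environment settings. CPU units works like Memory limits, but for the processor. Each processor core is 1024 units, so if the server has a dual-core processor and the container is set to 512, the server can run 4 tasks with that container. CPU units always match the number of cores exactly; unlike memory, there cannot be slightly fewer of them.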
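The memory and CPU rules above combine into a simple capacity estimate. The sketch below is only an illustration: tasks_per_instance is a made-up helper, it ignores ports and other placement constraints, and the 3884 MiB figure is an invented stand-in for "slightly less than 4096 MiB".

```shell
# Hypothetical helper: how many copies of a task fit on one instance.
#   $1 = memory registered by the instance (MiB)
#   $2 = CPU units of the instance (1024 per core)
#   $3 = task soft memory limit (MiB)
#   $4 = task CPU units
tasks_per_instance() (
  by_mem=$(( $1 / $3 ))   # whole tasks that fit by memory reservation
  by_cpu=$(( $2 / $4 ))   # whole tasks that fit by CPU units
  # The tighter of the two limits decides.
  if [ "$by_mem" -lt "$by_cpu" ]; then echo "$by_mem"; else echo "$by_cpu"; fi
)

tasks_per_instance 4096 2048 2048 512   # prints 2 (memory is the limit)
tasks_per_instance 3884 2048 2048 512   # prints 1 (just under 4096 MiB)
tasks_per_instance 8192 2048 1024 512   # prints 4 (CPU is the limit)
```

The second call shows why the soft limit is worth setting a little below an even fraction of the instance memory: a machine that registers slightly less than 4096 MiB fits only one 2048 MiB task.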



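Command: the command that starts the container, with all arguments separated by commas. It can be gunicorn, npm and so on. If it is left empty, the CMD value from the Dockerfile is used. We specify npm,start.

Environment variables: the container's environment variables. They can be plain text values or references to Secrets Manager or Parameter Store.

Storage and Logging: here we set up logging to CloudWatch Logs (the AWS log service). To do that, tick Auto-configure CloudWatch Logs. Once the Task Definition is created, a log group is created in CloudWatch automatically. By default logs are kept forever, so I recommend changing the Retention period from Never Expire to the period you need. This is done under CloudWatch Log groups.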



[Image]



ECS Cluster and ECS Capacity Provider



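Go to ECS → Clusters to create a cluster. Choose the EC2 Linux + Networking template.

Cluster name: this is very important. The name must match the one given in the ECS_CLUSTER parameter of the Launch Template, in our case DemoApiClusterProd. Tick Create an empty cluster. Optionally, enable Container Insights to see service metrics in CloudWatch. If everything was done correctly, the ECS Instances section will show the machines created in the Auto Scaling group.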



[Image]



Capacity Providers . , , ECS . , .



Auto Scaling group — .



Managed scaling — , .



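Target capacity %: how full the instances should be kept. With 100%, all instances are always busy with running tasks. With 50%, the provider keeps roughly twice as many instances as the tasks need, so half of the capacity is always free for new tasks.

Managed termination protection: enable it so that, when scaling in to reach Target capacity %, the provider removes only instances that have no running tasks.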
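As a rough illustration of Target capacity %, the sketch below estimates how many instances the provider aims for. It assumes the documented reservation metric (instances needed divided by instances running, times 100) is driven toward the target; instances_for_target is a made-up name, and real scaling also depends on step sizes and cooldowns.

```shell
# Hypothetical helper: instances the group is scaled toward.
#   $1 = instances needed to fit all tasks, $2 = Target capacity %
# Assumption: needed / running * 100 is driven toward the target,
# so running = ceil(needed * 100 / target).
instances_for_target() (
  needed=$1; target=$2
  echo $(( (needed * 100 + target - 1) / target ))
)

instances_for_target 4 100   # prints 4: no spare instances
instances_for_target 4 50    # prints 8: half of the capacity stays free
```

A lower target buys faster task scale-out at the price of idle machines, which matters less when most of them are Spot Instances.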



ECS Service



:) , Services.



Launch type: click Switch to capacity provider strategy and select the provider created earlier.



[Image]



Task Definition — Task Definition .



Service name — , , Task Definition.



Service type — Replica.



Number of tasks — . , .



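Minimum healthy percent and Maximum percent: these define how tasks behave during a deploy. The defaults are 100 and 200, which means that during a deploy the number of tasks temporarily doubles and then returns to the desired count. If you run 1 task with min=0, max=100, the deploy first kills it and only then starts a new one, so there will be downtime. If you run 1 task with min=50, max=150, the deploy will not happen at all, because 1 task can neither be halved nor increased by one and a half times.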
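The three cases above can be checked with a few lines of arithmetic. This is a sketch rather than anything ECS-specific you would run in production: deployment_bounds is a made-up name, and it relies on the ECS rule that the minimum is rounded up and the maximum is rounded down.

```shell
# Hypothetical helper: task count bounds during a rolling update.
#   $1 = desired count, $2 = minimum healthy percent, $3 = maximum percent
deployment_bounds() (
  desired=$1; min_pct=$2; max_pct=$3
  lower=$(( (desired * min_pct + 99) / 100 ))  # rounded up
  upper=$(( desired * max_pct / 100 ))         # rounded down
  echo "min_running=${lower} max_running=${upper}"
)

deployment_bounds 2 100 200   # prints: min_running=2 max_running=4
deployment_bounds 1 0 100     # prints: min_running=0 max_running=1
deployment_bounds 1 50 150    # prints: min_running=1 max_running=1
```

In the last case both bounds equal 1, so the scheduler can neither stop the old task nor start a new one, and the deploy stalls.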



Deployment type — Rolling update.



Placement Templates — . AZ Balanced Spread — , , . BinPack — CPU Spread — AZ, CPU. , .



[Image]



Load balancer type — Application Load Balancer.



Service IAM role: ecsServiceRole.



Load balancer name — .



Health check grace period — , 60 .



Container to load balance — Target group name , .



[Image]



Service Auto Scaling: the service scaling settings. Select Configure Service Auto Scaling to adjust your service's desired count, then set the minimum and maximum number of tasks.

IAM role for Service Auto Scaling: AWSServiceRoleForApplicationAutoScaling_ECSService.



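Automatic task scaling policies: the scaling rules. There are 2 types:



  1. Target tracking: tracking a target metric (CPU/RAM usage or the number of requests per task). For example, if we want the average CPU load to be 85%, new tasks are added when it rises above that until it returns to the target. When the load is lower, tasks are removed, unless protection against scaling down is enabled (Disable scale-in).
  2. Step scaling: a reaction to an arbitrary event. Here you can configure a reaction to any event (CloudWatch Alarm): when it fires, you can add or remove a given number of tasks, or set an exact number of tasks.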
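For intuition about target tracking, the sketch below uses a simple proportional model. This is an assumption, not the documented algorithm: the real policy works through CloudWatch alarms and cooldowns, and target_tracking_estimate is a made-up name.

```shell
# Hypothetical helper: rough task count that brings a metric back to target.
#   $1 = current task count, $2 = current metric value (%), $3 = target (%)
# Assumption: load spreads evenly, so tasks scale in proportion to
# metric / target, rounded up.
target_tracking_estimate() (
  current=$1; metric=$2; target=$3
  echo $(( (current * metric + target - 1) / target ))
)

target_tracking_estimate 4 95 85   # prints 5: scale out
target_tracking_estimate 4 40 85   # prints 2: scale in
```

With Disable scale-in turned on, only the first kind of adjustment would be applied.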


, , , .





Docker , .



[Image]



  1. , . .
  2. , , .
  3. , .
  4. , , 3 .
  5. , , .
  6. Capacity Provider, (), .
  7. .


, , email-, .



. , - . 1+ . API, . , - , , .



, ECS - .



, serverless ( ) GitLab CI Terraform Cloud.



, !



