Home - Public Cloud Review

Latest Articles

- Public Cloud Compute Services Review (May 2020 update)
  The objective of this post is to analyze the compute services offered by the four main public cloud providers: AWS, GCP, Azure and Alibaba.

Table of Contents

- Public Cloud Compute Services
  - IaaS (Infrastructure as a Service)
  - CaaS (Container as a Service)
  - AaaS (Application as a Service)
  - FaaS (Function as a Service)
- Public Cloud Compute Services Use Cases & Recommendations
  - IaaS
  - AaaS
  - CaaS
  - FaaS
- AWS Compute Services
  - Amazon Elastic Compute Cloud (Amazon EC2)
    - Technology
    - SLA
    - Machine Types
    - Machine Options
    - Disks (Block & File Devices)
      - Instance store volumes
      - Elastic Block Storage (EBS)
      - Cloud File Storage
    - Auto Scaling
    - Parallel Cluster
    - Billing Model
    - Other
  - Amazon Lightsail
  - EC2 Container Service (ECS) & Elastic Container Service for Kubernetes (EKS)
  - AWS Elastic Beanstalk and AWS Batch
  - AWS Lambda
    - Language Runtimes
    - Events and Triggers
    - AWS Serverless Application Model (AWS SAM) and Serverless Application Repository
    - Timeout
    - SLA
    - Billing Model
- GCP Compute Services
  - Google Compute Engine
    - Technology
    - SLA
    - Machine Types
    - Machine Options
    - Disks (Block & File Devices)
      - Persistent Disk
      - Local Disk
      - File Server
      - RAM Disk
    - Auto Scaling
    - Billing Model
    - Other
  - Google Kubernetes Engine & Registry (and Build)
  - Google App Engine
  - Google Functions
    - Language Runtimes
    - Events and Triggers
    - Timeout
    - SLA
    - Billing Model
- Azure Cloud Compute Services
  - Azure VM
    - Technology
    - SLA
    - Machine Types
    - Machine Options
    - Disks (Block & File Devices)
      - Azure VMs use three types of Disks Storage
      - Azure File Storage
    - Auto Scaling
    - Azure CycleCloud
    - Billing Model
    - Other
  - Azure Kubernetes Service (AKS)
  - Azure Apps Service, Azure Cloud Services and Azure Batch
  - Azure Spring Cloud (preview)
  - Azure Functions
    - Language Runtimes
    - Timeout
    - SLA
    - Billing Model
- Alibaba Cloud Compute Services
  - Alibaba Elastic Compute Service (ECS)
    - Technology
    - SLA
    - Machine Types
    - Machine Options
    - Disks (Block & File Devices)
      - Alibaba Disk Storage
      - Alibaba NAS
    - Auto Scaling
    - Billing Model
    - Other
  - Container Service, Container Service for Kubernetes & Elastic Container Instance (ECI)
  - Simple Application Server & Batch Compute
  - Functions Compute
    - Language Runtimes
    - Events and Triggers
    - Timeout
    - SLA
    - Billing Model
- Public Cloud Compute Services: IaaS Comparison
- Public Cloud Compute Services: CaaS Comparison
- Public Cloud Compute Services: AaaS Comparison
- Public Cloud Compute Services: FaaS Comparison
- Conclusion
  - From the point of view of the layers offered
    - IaaS
    - CaaS
    - AaaS
    - FaaS
  - From the point of view of the developer who has to create an application in the cloud
  - From the point of view of evolution during 2019-2020

Public Cloud Compute Services

We can define Public Cloud Compute Services as the cloud platform or engine that executes your business logic. This is a very generic definition, but the services become clearer at the next level of detail. In general, AWS, GCP, Azure and Alibaba have structured their compute services into four types:

- Infrastructure as a Service (IaaS)
- Container as a Service (CaaS)
- Application as a Service (AaaS)
- Function as a Service (FaaS)

These range from a model with a high level of configurability and access to the underlying infrastructure (IaaS) to a serverless model where the developer only has to take care of the application code (FaaS).

IaaS (Infrastructure as a Service)

IaaS was the first compute service offered by the public cloud providers, and it is now a commodity. IaaS provides the basic building blocks for cloud IT and typically provides access to networking features, computers (virtual or on dedicated hardware), and data storage space. Infrastructure as a Service provides you with the highest level of flexibility and management control over your IT resources and is most similar to existing IT resources that many IT departments are familiar with today.
Over the last few years, all the public cloud providers have aligned their offerings to cover the following features:

- Predefined virtual machines with a wide range of vCPUs and memory depending on your type of workload:
  - Standard or General Purpose
  - High CPU Optimized
  - High Memory Optimized
- Custom virtual machines where you can combine cores and memory to cover your specific needs
- Graphics processing units (GPUs) to accelerate specific workloads on your instances, such as machine learning and data processing
- Linux & Windows support
- SSD and magnetic storage on local or network disks
- Auto scaling support
- Different billing models: On-Demand, Reserved or Preemptible
- Image and instance template management
- Custom or default virtual private network to deploy the VM

They also offer different machine agreement models:

- Dedicated Instances are instances that run in a VPC on hardware that’s dedicated to a single customer. Your Dedicated Instances are physically isolated at the host hardware level from instances that belong to other accounts. Dedicated Instances may share hardware with other instances from the same account that are not Dedicated Instances. You can pay for Dedicated Instances On-Demand, as Reserved Instances, or as Spot Instances.
- On-Demand Instances let you pay for compute capacity by unit of time with no long-term commitments or upfront payments. They are a good fit for:
  - Users that want low cost and flexibility without any upfront payment or long-term commitment
  - Applications with short-term, spiky, or unpredictable workloads that cannot be interrupted
  - Applications being developed or tested for the first time
- Reserved Instances provide you with a capacity reservation and offer a significant discount on the hourly charge for an instance, with 1-year to 3-year terms. They are a good fit for:
  - Applications with steady-state or predictable usage
  - Applications that require reserved capacity
- Spot Instances: With Spot Instances, you can bid for unused capacity in a cloud vendor’s data center.
  You can save up to 90% of the cost compared to On-Demand Instances. However, if someone else bids higher than you, your instance will be taken away. They are a good fit for:
  - Applications that have flexible start and end times
  - Applications that are only feasible at very low compute prices
  - Users with an urgent need for large amounts of additional computing capacity
- Dedicated Hosts are physical servers dedicated for your use. Dedicated Hosts can help you reduce costs by allowing you to use your existing server-bound software licenses.
  - Useful for regulatory requirements that may not support multi-tenant virtualization
  - Great for licensing that does not support multi-tenancy or cloud deployments
  - Can be purchased On-Demand or Reserved
- On Premises: extends the provider’s fully managed IaaS solution to on-premises data centers under a hybrid approach.

CaaS (Container as a Service)

CaaS provides a managed environment for deploying, managing, and scaling your containerized applications. The current trend is to use Docker containers with Kubernetes, a project originally led by Google. Kubernetes is open source software that allows you to deploy and manage containerized applications at scale. Kubernetes manages clusters of the public cloud IaaS compute instances and runs containers on those instances with processes for deployment, maintenance, and scaling. Using Kubernetes, you can run any type of containerized application using the same toolset on-premises and in the cloud.

In the future, vendors will also provide solutions covering the full life cycle, including Continuous Integration and Continuous Delivery customized for a Kubernetes/Docker environment. To date they offer approaches based on open source solutions. Finally, the trend is to start offering CaaS with Kubernetes in a serverless mode.
AaaS (Application as a Service)

AaaS is the next level of abstraction provided by the public cloud providers to simplify deploying web and mobile applications. It offers a fully managed platform that completely abstracts away infrastructure so you focus only on code. In addition to the infrastructure abstraction, AaaS also covers the life cycle management of the application, enabling more robust deployment workflows than deploying your website directly to production. Finally, AaaS also covers the batch engine, which allows you to run applications, long-running scripts, or heavy compute scripts without creating or managing the underlying VM pool.

AaaS seems the optimal solution for new applications; however, there are some issues:

- Each public cloud offers a different approach (with no common standard), which means strong vendor lock-in.
- Public cloud vendors seem to have stopped betting on this initiative and now focus on CaaS and FaaS.
- There are limitations in the use of third-party products, languages and application architecture.

FaaS (Function as a Service)

FaaS is the maximum level of abstraction provided by the public cloud vendors to simplify the deployment of code. FaaS is a serverless execution environment for building cloud services encapsulated in functions. With FaaS you write simple, single-purpose functions that can be used in the following ways:

- As an event-driven compute service, where the function runs in response to events.
- As a compute service that runs your function in response to HTTP requests.

Functions are truly serverless and scale automatically. Although the technology used for FaaS differs between public cloud vendors, the interfaces and features are very similar, so a lightweight architecture is enough to avoid lock-in. The billing model is also very similar: you pay only while your code runs.

However, FaaS is not the perfect solution to develop applications.
It has limitations that must be taken into account, such as:

- Limited execution timeout
- Latency to start the function
- Limited languages

So, to date, functions are appropriate any time you want to use serverless infrastructure to run code snippets that do not need a low-latency response. In addition, some providers are strengthening the serverless model with the concept of serverless applications: a combination of functions and the other resources required to run an application, such as APIs, events, etc.

Public Cloud Compute Services Use Cases & Recommendations

IaaS

AaaS

CaaS

FaaS

AWS Compute Services

Amazon Elastic Compute Cloud (Amazon EC2)

Technology

The technology behind AWS EC2 VMs is Xen.

SLA

Monthly Uptime Percentage to customer of at least 99.99%.

Machine Types

A selection of instance types optimized to fit different use cases.

- Up to 96 vCPUs & 768 GB of memory
- GPU: up to 16 GPUs & 64 GB of GPU memory

Machine Options

- Dedicated Instances are Amazon EC2 instances that run in a VPC on hardware that’s dedicated to a single customer. Your Dedicated Instances are physically isolated at the host hardware level from instances that belong to other AWS accounts. Dedicated Instances may share hardware with other instances from the same AWS account that are not Dedicated Instances. Pay for Dedicated Instances On-Demand, save up to 70% by purchasing Reserved Instances, or save up to 90% by purchasing Spot Instances.
- On-Demand Instances: you pay for compute capacity per hour or per second, depending on which instances you run. No long-term commitments or upfront payments are needed.
- Reserved Instances provide you with a significant discount (up to 75%) compared to On-Demand instance pricing. In addition, when Reserved Instances are assigned to a specific Availability Zone, they provide a capacity reservation, giving you additional confidence in your ability to launch instances when you need them.
- Spot Instances: Amazon EC2 Spot Instances allow you to request spare Amazon EC2 computing capacity for up to 90% off the On-Demand price.
- Dedicated Hosts: a physical EC2 server dedicated for your use. Dedicated Hosts can help you reduce costs by allowing you to use your existing server-bound software licenses.
  - Can be purchased On-Demand (hourly)
  - Can be purchased as a Reservation for up to 70% off the On-Demand price
- On Premises: AWS Outposts allows you to run AWS infrastructure and services on premises for a truly consistent hybrid experience.

Disks (Block & File Devices)

Amazon EC2 supports two types of block devices, instance store volumes (virtual devices whose underlying hardware is physically attached to the host computer for the instance) and EBS volumes (remote storage devices), plus file devices under Cloud File Storage.

Instance store volumes

An instance store provides temporary block-level storage for your instance. This storage is located on disks that are physically attached to the host computer. Instance store is ideal for temporary storage of information that changes frequently, such as buffers, caches, scratch data, and other temporary content, or for data that is replicated across a fleet of instances, such as a load-balanced pool of web servers. It is available as SSD (up to 60 TB) and magnetic (up to 48 TB).

Elastic Block Storage (EBS)

Amazon EBS allows you to create storage volumes and attach them to Amazon EC2 instances. Once attached, you can create a file system on top of these volumes, run a database, or use them in any other way you would use a block device. Amazon EBS volumes are placed in a specific Availability Zone, where they are automatically replicated within the same AZ to protect you from the failure of a single component. You can create EBS General Purpose SSD (gp2), Provisioned IOPS SSD (io1), Throughput Optimized HDD (st1), and Cold HDD (sc1) volumes up to 16 TiB in size.
Cloud File Storage

Cloud file storage is a method for storing data in the cloud that provides servers and applications access to data through shared file systems. This compatibility makes cloud file storage ideal for workloads that rely on shared file systems and provides simple integration without code changes. Amazon Cloud File Storage systems can store petabytes of data.

Auto Scaling

AWS Auto Scaling monitors your applications and automatically adjusts capacity to maintain steady, predictable performance at the lowest possible cost. Features:

- Auto Scaling plans:
  - Maintain current instance levels based on a periodic health check
  - Manual scaling, where you specify the change in the maximum, minimum or desired capacity
  - Scheduled scaling for predictable changes
  - Dynamic scaling based on a policy
- An Auto Scaling group is a collection of EC2 instances managed by the Auto Scaling service, with a minimum, maximum, and desired number of EC2 instances.
- A scaling policy can be associated with CloudWatch alarms.
- The cooldown period is a configurable setting for your Auto Scaling group that helps to ensure that it doesn’t launch or terminate additional instances before the previous scaling activity takes effect.

Parallel Cluster

AWS ParallelCluster is an AWS-supported open source cluster management tool (based on the CfnCluster project) that helps you to deploy and manage High Performance Computing (HPC) clusters in the AWS Cloud.

Billing Model

On-Demand: Pricing is per instance-hour consumed for each instance, from the time an instance is launched until it is terminated or stopped. Each partial instance-hour consumed is billed per second for Linux instances and as a full hour for all other instance types.
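The partial-hour rule above can be sketched as a small calculation. This is only an illustration of the rule as stated in this post: the function name, the `per_second_billing` flag and the hourly rate of 0.10 are hypothetical, and real invoices may include other items (storage, data transfer, minimum charges) that are not modeled here.

```python
import math


def ec2_on_demand_cost(runtime_seconds: int, hourly_rate: float,
                       per_second_billing: bool) -> float:
    """Estimate the On-Demand charge for a single instance run.

    per_second_billing=True models Linux instances (partial hours billed
    per second); False models the other instance types (each partial
    hour billed as a full hour), following the rule described above.
    """
    if runtime_seconds < 0:
        raise ValueError("runtime_seconds must not be negative")
    if per_second_billing:
        billable_hours = runtime_seconds / 3600
    else:
        billable_hours = math.ceil(runtime_seconds / 3600)
    # Round to avoid floating point noise in the result.
    return round(billable_hours * hourly_rate, 6)


# A 90-minute run at a hypothetical rate of 0.10 per instance-hour.
linux_cost = ec2_on_demand_cost(5400, 0.10, per_second_billing=True)
other_cost = ec2_on_demand_cost(5400, 0.10, per_second_billing=False)
print(linux_cost, other_cost)
```

In this sketch the same 90-minute run is charged as 1.5 hours under per-second billing and as 2 full hours otherwise, which is why short or irregular workloads benefit most from per-second billing.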
Discounts:

- Spot Instances: up to 90%
- Reserved Instances: up to 75%
- Dedicated Hosts: depends on how much of your legacy software licensing you can reuse

Other

- Linux & Windows support
- Public and custom image support
- Snapshot support
- Start & termination scripts
- Migration tools and methodology
- VMware Cloud on AWS is an integrated cloud offering jointly developed by AWS and VMware that allows organizations to seamlessly migrate and extend their on-premises VMware vSphere-based environments to the AWS Cloud running on Amazon EC2 bare metal infrastructure.

Amazon Lightsail

Lightsail is a lightweight, simplified AWS product offering. Hard disks are fixed-size EBS SSD volumes, instances are still billable when stopped, security group rules are much less flexible, and only a very limited subset of EC2 features and options is accessible. Lightsail was created for customers who want an easy-to-understand hosting plan for simple websites.

EC2 Container Service (ECS) & Elastic Container Service for Kubernetes (EKS)

AWS offers two options for CaaS:

- EC2 Container Service (ECS). This was the first version of CaaS. It is a highly scalable, fast container management AWS service that makes it easy to run, stop, and manage Docker containers on a cluster.
- Elastic Container Service for Kubernetes (EKS). Amazon EKS runs the Kubernetes management infrastructure. Applications running on any standard Kubernetes environment are fully compatible and can be easily migrated to Amazon EKS.

The original AWS solution for CaaS was ECS; however, due to market pressure around Kubernetes, AWS decided to release EKS as a managed Kubernetes service. Currently the integration of EKS with the rest of the AWS services is not as complete as that of ECS, but that is a matter of time. The winning bet is clearly EKS, given its compatibility with other managed Kubernetes services and with on-premises implementations.

Amazon EKS features:

- AWS load-balancing integration
- Automatic scaling of your cluster’s node instance count
- Automatic upgrades for your cluster’s node software
- Hybrid networking
- Workload portability, on-premises and cloud
- Identity and Access Management integration
- Logging and monitoring

Amazon ECR registries allow you to host your images in a highly available and scalable architecture, so you can deploy containers reliably for your applications. You can use your registry to manage image repositories and Docker images. Each AWS account is provided with a single (default) Amazon ECR registry with these additional features:

- Fine-grained access control
- Existing CI/CD integrations

You pay per hour for each Amazon EKS cluster that you create and for the AWS resources you create to run your Kubernetes worker nodes.

AWS Elastic Beanstalk and AWS Batch

AWS Elastic Beanstalk is an easy-to-use service for deploying and scaling web applications and services developed with Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker on familiar servers such as Apache, Nginx, Passenger, and IIS. Features:

- Wide selection of application platforms: Java, .NET, Node.js, PHP, Ruby, Python, Go, and Docker to deploy your web applications
- Variety of application deployment options (Visual Studio and Eclipse)
- Monitoring, logging, and tracing
- Management and updates
- Scaling
- AWS resources customization

AWS Batch enables developers, scientists, and engineers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. AWS Batch dynamically provisions the optimal quantity and type of compute resources. Features:

- Support for multi-node parallel jobs
- Granular job definitions
- Simple job dependency modeling
- Support for popular workflow engines
- Dynamic compute resource provisioning and scaling
- Priority-based job scheduling
- Dynamic spot bidding
- Integrated monitoring and logging
- Fine-grained access control

There is no additional charge for AWS Elastic Beanstalk and AWS Batch. You pay for AWS resources (e.g.
EC2 instances or S3 buckets) you create to store and run your application.

AWS Lambda

Language Runtimes

AWS Lambda natively supports Java 8 and 11, Go 1.x, PowerShell and C# (.NET Core 3.1 and 2.1), Python 3.8, 3.7, 3.6 and 2.7, Node.js 12 and 10, and Ruby 2.7 and 2.5 code. In addition, you can implement an AWS Lambda runtime in any programming language. A runtime is a program that runs a Lambda function’s handler method when the function is invoked.

Events and Triggers

- HTTP: HTTP requests using Amazon API Gateway or API calls made using AWS SDKs
- Amazon S3
- Amazon DynamoDB
- Amazon Kinesis Data Streams
- Amazon Simple Notification Service
- Amazon Simple Email Service
- Amazon Simple Queue Service
- Amazon Cognito
- AWS CloudFormation
- Amazon CloudWatch Logs
- Amazon CloudWatch Events
- AWS CodeCommit
- Scheduled Events (powered by Amazon CloudWatch Events)
- AWS Config
- Amazon Alexa
- Amazon Lex
- Amazon API Gateway
- AWS IoT Button
- Amazon CloudFront
- Amazon Kinesis Data Firehose
- Other event sources: invoking a Lambda function on demand

AWS Serverless Application Model (AWS SAM) and Serverless Application Repository

AWS SAM is an open-source framework that you can use to build serverless applications (a combination of Lambda functions, event sources, and other resources that work together to perform tasks). It is complemented by a repository for serverless applications.

Timeout

Function execution time is limited by the timeout duration, which you can specify at function deployment time. A function times out after 3 seconds by default, but you can extend this period up to 15 minutes. When function execution exceeds the timeout, an error status is immediately returned.

SLA

Monthly Uptime Percentage of at least 99.95%.

Billing Model

- Requests: Lambda counts a request each time it starts executing in response to an event notification or invoke call, including test invokes from the console. You are charged for the total number of requests across all your functions.
- Duration: calculated from the time your code begins executing until it returns or otherwise terminates, rounded up to the nearest 100 ms. The price depends on the amount of memory you allocate to your function.
- Data transfer out to the internet

The Lambda free tier includes 1M free requests per month and 400,000 GB-seconds of compute time per month.

GCP Compute Services

Google Compute Engine

Technology

The technology behind Google Cloud’s VMs is KVM.

SLA

Monthly Uptime Percentage to customer of at least 99.99%.

Machine Types

- Predefined machine types have a fixed collection of resources (up to 224 vCPUs & 896 GB of memory).
- Custom machine types: up to 416 vCPUs. The memory per vCPU of a custom machine type must be between 0.5 GB and 8 GB. If you require more memory, you must use one of the mega-memory machine types, which allow you to create instances with a total of 1.4 TB per VM instance.
- GPU: up to 8 GPUs & 96 GB of GPU memory

Machine Options

- Dedicated Instances
- On-Demand Instances allow you to pay a fixed per-second rate with no commitment.
- Reserved Instances (committed use discounts): if your workload is stable and predictable, you can purchase a specific amount of vCPUs and memory at a discount off normal prices in return for committing to a usage term of 1 year or 3 years. The discount is up to 57% for most machine types or custom machine types, and up to 70% for memory-optimized machine types.
- Spot Instances (Preemptible VM): an instance that you can create and run at a much lower price than normal instances. However, Compute Engine might terminate (preempt) these instances if it requires access to those resources for other tasks.
  - Up to 79% discount
  - Cannot live migrate or auto-restart
  - 24 hours maximum use, and not covered by the SLA
  - Charged only if the instance has run for at least 10 minutes; shorter use is not billed
  - When you attach a GPU to a preemptible VM, your quota is consumed
  - Compute Engine sends a preemption signal to the VM 30 seconds before preempting it
  - The average preemption rate varies between 5% and 15% per seven days per project
- Shielded VM: offers verifiable integrity of your Compute Engine VM instances, so you can be confident your instances haven’t been compromised by boot- or kernel-level malware or rootkits. Shielded VM’s verifiable integrity is achieved through the use of Secure Boot, virtual trusted platform module (vTPM)-enabled Measured Boot, and integrity monitoring.
- Dedicated Hosts (sole-tenant nodes): sole-tenant nodes are physical Compute Engine servers dedicated to hosting only VM instances from your specific project.
- On Premises (Anthos): GKE on-prem is hybrid cloud software that brings GKE to on-premises data centers.

Disks (Block & File Devices)

By default, each Compute Engine instance has a single boot persistent disk that contains the operating system. When your applications require additional storage space, you can add one or more additional storage options to your instance.

Persistent Disk

- Network storage, attached to the VM through the network interface
- Persistent and independent of the compute instance
- Zonal (or regional, with synchronous replication across two zones in a region)
- Can be used as a boot disk and supports snapshots
- Types: standard (magnetic) up to 64 TB, SSD up to 64 TB
- Can be resized dynamically (even while the instance is running)
- Can be attached to multiple VMs for read-only data
- Automatic encryption; you can choose your own key
- Lower performance than the corresponding local SSD or RAM disk

Local Disk

- Local disks can be attached to a VM
- Ephemeral in nature: data survives a restart but not an instance stop or termination
- Provides high IOPS depending on disk size: up to 680K read and 360K write
- You can attach a maximum of 24 local SSD partitions for a total of 9 TB per instance
- Cannot live migrate
- SCSI or NVMe interface
- Not available for shared-core machine types

File Server

A file server, also called a storage filer, provides a way for applications to read and update files that are shared across machines. Compute Engine works with multiple filer solutions such as Elastifile, Quobyte, Avere, Panzura and others. Elastifile and Quobyte support petabytes of data.

RAM Disk

RAM disks share instance memory with your applications and use the RAM assigned to the VM instance.

- Faster than any other disk option available
- Ephemeral: the data goes away on stop, restart, or termination

Auto Scaling

Autoscaling is a feature of managed instance groups. A managed instance group is a pool of homogeneous instances, created from a common instance template. An autoscaler adds or deletes instances from a managed instance group. Although Compute Engine has both managed and unmanaged instance groups, only managed instance groups can be used with an autoscaler.

Compute Engine offers autoscaling to automatically add or remove virtual machines from an instance group based on increases or decreases in load. This allows your applications to gracefully handle increases in traffic and reduces cost when the need for resources is lower. You just define the autoscaling policy, and the autoscaler performs automatic scaling based on the measured load.

Autoscaling policy and target utilization

To create an autoscaler, you must specify the autoscaling policy and a target utilization level that the autoscaler uses to determine when to scale the group. You can choose to scale using the following policies:

- Average CPU utilization
- HTTP load balancing serving capacity, which can be based on either utilization or requests per second
- Stackdriver Monitoring metrics

Billing Model

All vCPUs, GPUs, and GB of memory are charged for a minimum of 1 minute. After 1 minute, instances are charged in 1-second increments.
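As a rough illustration of the rule above (a 1-minute minimum, then 1-second increments), the billable time for one VM run could be computed as follows. The function names and the hourly rate are hypothetical, and treating a zero-length run as unbilled is an assumption of this sketch rather than something stated in the post.

```python
import math


def gce_billable_seconds(runtime_seconds: float) -> int:
    """Seconds billed for one VM run: 1-minute minimum, then 1-second steps."""
    if runtime_seconds <= 0:
        # Assumption: a VM that never ran is not billed.
        return 0
    # Round partial seconds up, then apply the 60-second minimum.
    return max(60, math.ceil(runtime_seconds))


def gce_cost(runtime_seconds: float, hourly_rate: float) -> float:
    """Estimated charge for one run at a hypothetical hourly rate."""
    return round(gce_billable_seconds(runtime_seconds) * hourly_rate / 3600, 6)


# A 20-second run is billed as a full minute; a 61-second run as 61 seconds.
print(gce_billable_seconds(20), gce_billable_seconds(61))
print(gce_cost(1800, 0.10))
```

The minimum only matters for very short-lived instances, for example autoscaled or preemptible VMs that are started and stopped frequently.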
Discounts:

- Sustained use discounts (when an instance uses a vCPU or a GB of memory for more than 25% of a month): up to 30%
- Committed use discounts: up to 70%
- Preemptible VMs: up to 80%

Other

- Linux & Windows support
- Public and custom image support
- Managed and unmanaged instance group support
- Snapshot support
- Start & termination scripts
- Migration tools and methodology

Google Kubernetes Engine & Registry (and Build)

Google Kubernetes Engine provides a managed environment for deploying, managing, and scaling your containerized applications using Google infrastructure. The environment GKE provides consists of multiple machines (specifically, Google Compute Engine instances) grouped together to form a cluster. GKE clusters are powered by the Kubernetes open source cluster management system and support Docker images, with the following features:

- Google Cloud Platform’s load balancing for Compute Engine instances
- Node pools to designate subsets of nodes within a cluster for additional flexibility
- Multi-zone clusters or regional clusters
- Automatic scaling of your cluster’s node instance count
- Automatic upgrades for your cluster’s node software
- Node auto-repair to maintain node health and availability
- Hybrid networking
- Workload portability, on-premises and cloud
- Dashboard for your project’s GKE clusters and their resources. You can use these dashboards to view, inspect, manage, and delete resources in your clusters
- Identity and Access Management integration
- Logging and monitoring with Stackdriver for visibility into your cluster

Google Container Registry is a private container image registry that runs on Google Cloud Platform. Container Registry supports Docker Image Manifest V2 and OCI image formats, with these additional features:

- Vulnerability analysis
- Fine-grained access control
- Existing CI/CD integrations

Google Cloud Build is a service that executes your builds on Google Cloud Platform’s infrastructure.
Cloud Build can import source code from a variety of repositories or cloud storage spaces, execute a build to your specifications, and produce artifacts such as Docker containers or Java archives.

GKE uses Google Compute Engine instances for nodes in the cluster. You are billed for each of those instances according to Compute Engine’s pricing, until the nodes are deleted.

GKE is supported on premises through Anthos GKE on-prem, which brings Google Kubernetes Engine (GKE) to on-premises data centers. With GKE on-prem, you can create, manage, and upgrade Kubernetes clusters in your on-premises environment.

Google is also pushing the concept of serverless containers with Knative. Knative provides an open API and runtime environment that enables you to run your serverless workloads anywhere you choose: fully managed on Google Cloud, on Anthos on Google Kubernetes Engine (GKE), or on your own Kubernetes cluster.

Batch on GKE (Batch) is the GCP solution for scheduling and managing batch workloads. With Batch, you can leverage the on-demand and flexible nature of the cloud. Batch is based on Kubernetes and containers, so your jobs are portable.

Google App Engine

Google App Engine is a fully managed platform that completely abstracts away infrastructure so you focus only on code. Google offers two environments:

App Engine Flexible Environment

App Engine allows developers to focus on doing what they do best: writing code. Based on Google Compute Engine, the App Engine flexible environment automatically scales your app up and down while balancing the load. Microservices, authorization, SQL and NoSQL databases, traffic splitting, logging, versioning, security scanning, and content delivery networks are all supported natively. In addition, the App Engine flexible environment allows you to customize the runtime and even the operating system of your virtual machine using Dockerfiles.
App Engine Standard Environment

The App Engine standard environment is based on container instances running on Google’s infrastructure. Containers are preconfigured with one of several available runtimes. Applications run in a secure, sandboxed environment, allowing the App Engine standard environment to distribute requests across multiple servers and to scale servers to meet traffic demands. Your application runs within its own secure, reliable environment that is independent of the hardware, operating system, or physical location of the server.

Applications running in the App Engine flexible environment are deployed to virtual machine types that you specify. You are billed for each of those instances according to Compute Engine’s pricing. Applications running in the App Engine standard environment are deployed to instance classes that you specify, which have a cost per hour per instance.

General features:

- Fully managed serverless application platform
- Wide selection of application platforms: Java, PHP, Node.js, Python, C#, .NET, Ruby, Go and Docker to deploy your web applications
- Variety of application deployment options (Cloud Source Repositories, IntelliJ IDEA, Visual Studio)
- Monitoring, logging, and diagnostics
- Application versioning
- Scaling
- GCP resources customization
- Traffic splitting
- Application security

Google App Engine allows scheduling tasks with cron for Python. The App Engine Cron Service allows you to configure regularly scheduled tasks that operate at defined times or regular intervals. This is a basic batch service. For more complex batch workloads you can use Cloud Dataflow. Cloud Dataflow is a fully managed service for transforming and enriching data in stream (real time) and batch (historical) modes with equal reliability and expressiveness.
Cloud Dataflow features:
- Based on Apache Beam (Java & Python)
- Automated Resource Management
- Dynamic Work Rebalancing
- Horizontal Auto-scaling

Applications running in the App Engine flexible environment are deployed to virtual machine types that you specify. You are billed for each of those instances according to Compute Engine’s pricing. Applications running in the App Engine standard environment are deployed to instance classes that you specify, each with a cost per hour per instance. Cloud Dataflow service usage is billed in per-second increments, on a per-job basis.

Google Functions

Language Runtimes
Cloud Functions can be written using the JavaScript (Node.js 6, 8 and 10), Python (Python 3.7.6), or Go (Go 1.11 and 1.13 (beta)) runtimes on Google Cloud Platform, running on Ubuntu, or on Debian for Node.js 6.

Events and Triggers
- HTTP: invoke functions directly via HTTP requests.
- Cloud Storage
- Cloud Pub/Sub
- Cloud Firestore
- Firebase (Realtime Database, Storage, Analytics, Auth)
- Stackdriver Logging: forward log entries to a Pub/Sub topic by creating a sink. You can then trigger the function.

Timeout
Function execution time is limited by the timeout duration, which you can specify at function deployment time. A function times out after 1 minute by default, but you can extend this period up to 9 minutes. When function execution exceeds the timeout, an error status is immediately returned.

SLA
Monthly Uptime Percentage of at least 99.5%

Billing Model
- Invocations: charged at a per-unit rate, excluding the first 2 million free invocations per month, regardless of the outcome of the function or its duration.
- Compute time: measured in 100 ms increments, rounded up to the nearest increment.
- Memory provisioned.
- Outbound data transfer (that is, data transferred from your function out to somewhere else): measured in GB and charged at a flat rate.
Outbound data to other Google APIs in the same region is free, as is inbound data.

Azure Cloud Compute Services

Azure VM

Technology
Azure runs on a customized version of Hyper-V.

SLA
Monthly Uptime Percentage to Customer of at least 99.95%

Machine Types
Selection of instance types optimized to fit different use cases
Up to 416 VCPU & 11.4 TB Memory
GPU: Up to 8 GPU & 96 GB of GPU Memory

Machine Options
Dedicated Instances
- On-Demand Instances (Pay as you go): Pay a fixed rate for compute capacity by the second (or by the hour), with no long-term commitment or upfront payments. Increase or decrease compute capacity on demand. Start or stop at any time and only pay for what you use.
- Reserved Virtual Machine Instances: An Azure Reserved Virtual Machine Instance is an advance purchase of a Virtual Machine for one or three years in a specified region. The commitment is made up front, and in return you get up to 72 percent price savings compared to pay-as-you-go pricing.
- Spot (low-priority VM): Low-priority VMs let you take advantage of Azure’s unused capacity. The amount of available unused capacity can vary based on size, region, time of day, and more. When deploying low-priority VMs in VM scale sets, Azure allocates the VMs if there is capacity available, but there are no SLA guarantees. Whenever Azure needs the capacity back, it evicts low-priority VMs. Low-priority Linux VMs come with an 80% discount, while Windows VMs come with a 60% discount.
Dedicated Hosts (Isolated VM): Azure Compute offers virtual machine sizes that are isolated to a specific hardware type and dedicated to a single customer. These virtual machine sizes are best suited for workloads that require a high degree of isolation from other customers, for example to meet compliance and regulatory requirements.
Customers can also choose to further subdivide the resources of these Isolated virtual machines by using Azure support for nested virtual machines.
On Premises: Azure Stack Portfolio is an extension of Azure to consistently build and run hybrid applications across datacenters, edge locations, remote offices, and cloud. Azure Stack gives customers choice and flexibility based on their solution needs: a consistent hybrid cloud on premises with Azure Stack Hub, which can be connected to or disconnected from the public cloud; high-performance virtualization on premises with Azure Stack HCI; or an Azure managed appliance that provides intelligent compute and AI at the edge with Azure Stack Edge.

Disks (Block & File Devices)

Azure VMs use three types of Disks Storage:
- Operating System Disk (OS Disk): The C drive in Windows or /dev/sda on Linux. This disk is registered as a SATA drive and has a maximum capacity of 2048 gigabytes (GB). This disk is persistent and is stored in Azure Storage.
- Temporary Disk: The D drive in Windows or /dev/sdb on Linux. This disk is used for short-term storage for applications or the system. Data on this drive can be lost during a maintenance event, or if the VM is moved to a different host, because the data is stored on the local disk.
- Data Disk: Registered as a SCSI drive. These disks can be attached to a virtual machine; how many depends on the VM instance size. Data disks have a maximum capacity of 32 TB per disk. These disks are persistent and stored in Azure Storage.

There are two types of disks in Azure: Managed or Unmanaged.

Unmanaged disks
With unmanaged disks you are responsible for ensuring the correct distribution of your VM disks in storage accounts, for capacity planning as well as availability. An unmanaged disk is also not a separate manageable entity. This means that you cannot take advantage of features like role-based access control (RBAC) or resource locks at the disk level.
Managed disks
Managed disks handle storage for you by automatically distributing your disks in storage accounts for capacity and by integrating with Azure Availability Sets to provide isolation for your storage, just like availability sets do for virtual machines. Managed disks also make it easy to change between Standard and Premium storage (HDD to SSD) without the need to write conversion scripts. Azure managed disks currently offer four disk types: ultra solid-state drives (SSD) up to 65 TB, premium SSD, standard SSD, and standard hard disk drives (HDD) that support up to 32 TB.

Azure File Storage:
Azure File Service is a fully managed file share service that offers endpoints for the Server Message Block (SMB) protocol, also known as Common Internet File System (CIFS), versions 2.1 and 3.0. This allows you to create one or more file shares in the cloud (up to 5 TB per share) and use the share much like a regular Windows File Server, for example as shared storage, or for new uses such as part of a lift and shift migration strategy.

Auto Scaling
An Azure virtual machine scale set can automatically increase or decrease the number of VM instances that run your application based on autoscale rules. Autoscale can be configured to make scaling decisions based on:
- Time rule or schedule, to automatically scale the number of VM instances at fixed times
- Resource metric rule (CPU, memory, disk, …)
- Custom metric rule, based on metrics that your application(s) may be emitting
In addition to scaling up or down, Azure can send a notification and invoke a webhook.

Azure CycleCloud
An enterprise-friendly tool for orchestrating and managing High Performance Computing (HPC) environments on Azure. With CycleCloud, users can provision infrastructure for HPC systems, deploy familiar HPC schedulers, and automatically scale the infrastructure to run jobs efficiently at any scale.

Billing Model
Pay as you go: Pay for compute capacity by the second, with no long-term commitment or upfront payments.
Increase or decrease compute capacity on demand. Start or stop at any time and only pay for what you use.
Discounts:
- Spot Instances (low-priority VM): Low-priority Linux VMs come with an 80% discount, while Windows VMs come with a 60% discount
- Reserved Instances: up to 72%
- Dedicated Host: depends on how many of your legacy software licenses you can reuse

Other
- Linux & Windows Support
- Public and Custom Image Support
- Snapshot support
- Start & Termination Script
- Elastic IP Addresses
- Update and Fault Domains
- Azure Stack: hybrid cloud platform that lets you provide Azure services from your datacenter

Azure Kubernetes Service (AKS)
Like AWS, Azure has decided to evolve its container technology toward Kubernetes. In fact, the previous offering, Azure Container Service (ACS), was scheduled to be retired on January 31, 2020, and is no longer recommended for new resources.

Azure Kubernetes Service (AKS) is a hosted Kubernetes service in which Azure handles critical tasks like health monitoring and maintenance for you. The Kubernetes masters are managed by Azure; you only manage and maintain the agent nodes. As a managed Kubernetes service, AKS is free: you only pay for the agent nodes within your clusters, not for the masters.

A Kubernetes cluster is divided into two components:
- Cluster master nodes provide the core Kubernetes services and orchestration of application workloads. This cluster master is provided as a managed Azure resource abstracted from the user.
- Nodes run your application workloads. An AKS cluster has one or more nodes, each of which is an Azure virtual machine (VM) that runs the Kubernetes node components and container runtime.

Azure AKS features: Azure Load-balancing integration.
Automatic scaling of your cluster’s node instance count coordinated application upgrades Hybrid Networking Workload Portability, on-premises and cloud Identity and Access Management Integration Logging and Monitoring Azure Container Registry Simplify container development by easily storing and managing container images for Azure deployments in a central registry with the additional features: Geo-replication Fine-grained access control. Existing CI/CD integrations Azure Service Fabric is a distributed systems platform that makes it easy to package, deploy, and manage scalable and reliable microservices and containers. Service Fabric also addresses the significant challenges in developing and managing cloud native applications. Developers and administrators can avoid complex infrastructure problems and focus on implementing mission-critical, demanding workloads that are scalable, reliable, and manageable. Service Fabric is Microsoft’s container orchestrator deploying microservices across a cluster of machines. Microservices can be developed in many ways from using the Service Fabric programming models, ASP.NET Core, to deploying any code of your choice. Azure Container Instances offers the fastest and simplest way to run a container in Azure, without having to manage any virtual machines and without having to adopt a higher-level service. In any case, for scenarios where you need full container orchestration, including service discovery across multiple containers, automatic scaling, and coordinated application upgrades, the best option is Azure Kubernetes Service (AKS). Azure Kubernetes Service (AKS) is a free container service. You pay only for the virtual machines, and associated storage and networking resources consumed. 
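
As a rough illustration of the billing point above, the sketch below estimates the compute portion of an AKS bill as node count times hourly VM rate times hours, with nothing charged for the managed masters. The function name and the hourly rate are hypothetical placeholders, not Azure prices, and storage and networking charges are left out.

```python
def aks_compute_cost(node_count: int, vm_hourly_rate: float, hours: float) -> float:
    """Estimate the VM portion of an AKS bill.

    Assumes the managed masters cost nothing, so only the agent nodes
    are billed. vm_hourly_rate is a hypothetical placeholder value.
    """
    if node_count < 0 or vm_hourly_rate < 0 or hours < 0:
        raise ValueError("inputs must be non-negative")
    control_plane_cost = 0.0  # masters are managed by Azure at no charge
    return control_plane_cost + node_count * vm_hourly_rate * hours


# Three agent nodes at a made-up rate of 0.10 per hour, for a 720-hour month.
estimate = aks_compute_cost(node_count=3, vm_hourly_rate=0.10, hours=720)
print(f"{estimate:.2f}")
```

The same shape would apply to any managed Kubernetes offering where only the worker nodes are billed; a per-cluster fee could be modeled by replacing the zero control-plane term.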
Azure Apps Service, Azure Cloud Services and Azure Batch

Azure App Service enables you to build and host web apps, mobile back ends, and RESTful APIs in the programming language of your choice without managing infrastructure. It has four components:
- Web Apps: Build and deploy web apps faster at scale
- Web App for Containers: Deploy and run containerized web apps
- Mobile Apps: Build mobile apps for any device
- API Apps: Easily build and consume APIs

Features:
- Wide selection of application platforms: Java, .NET, Node.js, PHP, Python and Docker to deploy your web and mobile applications
- Auto-scaling
- High availability
- Supports both Windows and Linux
- Enables automated deployments from GitHub, Azure DevOps, or any Git repository
- Monitoring, Logging, and Tracing
- Management and Updates

Azure Cloud Services is an example of a platform as a service (PaaS). Like Azure App Service, this technology is designed to support applications that are scalable, reliable, and inexpensive to operate. Azure Cloud Services is hosted on virtual machines (VMs), just as App Service is. However, you have more control over the VMs: you can install your own software on VMs that use Azure Cloud Services, and you can access them remotely.

There are two types of Azure Cloud Services roles. The only difference between the two is how your role is hosted on the VMs:
- Web role: Automatically deploys and hosts your app through IIS.
- Worker role: Does not use IIS, and runs your app standalone.

Both App Service and Cloud Services provide a lot of good features and are a simple way to deploy your applications to the Microsoft Azure cloud. The primary differentiating factor is that Cloud Services offers access to the underlying Azure VMs, and App Service does not.
However, App Service is more convenient for these specific reasons:
- Combine multiple applications together to save money
- Free deployment slots
- Faster deployments

Azure Batch lets you run large-scale parallel and high-performance computing (HPC) batch jobs efficiently in Azure. Azure Batch creates and manages a pool of compute nodes (virtual machines), installs the applications you want to run, and schedules jobs to run on the nodes. There is no cluster or job scheduler software to install, manage, or scale. Instead, you use Batch APIs and tools, command-line scripts, or the Azure portal to configure, manage, and monitor your jobs.

Features:
- Support for multi-node parallel jobs
- Granular job definitions
- Simple job dependency modeling
- Support for popular workflow engines
- Dynamic compute resource provisioning and scaling
- Priority-based job scheduling
- Integrated monitoring and logging
- Fine-grained access control

Azure App Service pricing is per hour, with a cost depending on the plan: Shared (free), Basic, Standard, Premium and Isolated.
Azure Cloud Services pricing is per hour, with a cost depending on the VM chosen.
Azure Batch pricing is per hour, with a cost depending on the VM chosen. You can also select low-priority VMs for higher discounts.

Azure Spring Cloud (preview)
Azure Spring Cloud makes it easy to deploy Spring Boot-based microservice applications to Azure with zero code changes. Spring Cloud provides lifecycle management using comprehensive monitoring and diagnostics, configuration management, service discovery, CI/CD integration, blue-green deployments, and more.
Azure Functions

Language Runtimes
Azure Functions natively supports C# and F# (.NET Framework 4.7, .NET Core 2.2 and 3.1), JavaScript (Node 6, 8, 10 and 12), Java 8, Python 3.6, 3.7 and 3.8, and PowerShell.

Events and Triggers
- HTTP & Webhooks
- Blob Storage
- Cosmos DB
- Event Grid
- Event Hubs
- Microsoft Graph Events
- Queue storage
- Service Bus
- Timer

Timeout
Function execution time is limited by the timeout duration, which you can specify at function deployment time. A function times out after 5 minutes by default, but you can extend this period up to 10 minutes. When function execution exceeds the timeout, an error status is immediately returned. However, with the Premium and App Service plans the timeout can be up to 60 minutes.

SLA
Monthly Uptime Percentage of at least 99.95%

Billing Model
The Azure Functions consumption plan is billed based on per-second resource consumption and executions. Consumption plan pricing includes a monthly free grant of 1 million requests and 400,000 GB-s of resource consumption per month per subscription in pay-as-you-go pricing across all function apps in that subscription. The Azure Functions Premium plan provides enhanced performance and is billed on a per-second basis based on the number of vCPU-s and GB-s your Premium Functions consume. Customers can also run Functions within their App Service plan at regular App Service plan rates.

Alibaba Cloud Compute Services

Alibaba Elastic Compute Service (ECS)

Technology
Transition from Xen to KVM since 2014
ECS Bare Metal Instances have a custom hypervisor with nested virtualization

SLA
Monthly Uptime Percentage to Customer of at least 99.95%

Machine Types
Selection of instance types (families) optimized to fit different use cases
Up to 208 VCPU & 3.8 TB Memory
GPU: Up to 8 GPU & 256 GB of GPU Memory

Machine Options
Dedicated Instances
Pay as you go: A postpaid method in which you pay after using the instance. Instance usage is billed on a minute basis, and the billing unit is US$/hour.
Reserved Virtual Machine Instances (subscription) – A prepaid method that allows you to use an instance only after you make the payment for it. Instance usage is billed on a monthly basis, and the billing unit is USD/month. Spot (Preemptible instances) you can set a maximum price per hour to bid for a specified instance type. If your bid is higher than or equal to the current market price, your instance is created and billed according to the current market price. You can hold a preemptible instance without interruption for at least one hour. After one hour, your bid is compared with the market price. When the market price exceeds your bid or the resource stock is insufficient, the instance is automatically released. Dedicated Hosts – Dedicated Host (DDH) is a host service that allows a tenant to use dedicated hardware resources based on Alibaba Cloud virtual hosting services. This service enables enterprises to achieve custom deployment, bring your own license (BYOL), and security and regulation compliance. DDH supports multiple types of ECS instances. Disks (Block & File Devices) Alibaba Disk Storage Cloud disks that can be attached to only one ECS instance in the same zone of the same region. System disks: have the same life cycle as the ECS instance to which it is mounted. A system disk is created and released at the same time as the instance. Shared access is not allowed. Up to 500GB. Data disks: can be created separately or at the same time as ECS instances. A data disk created with an ECS instance has the same life cycle as the instance, and is created and released along with the instance. Data disks created separately can be released independently or at the same time as the corresponding ECS instances. Shared access is not allowed. Performance-based category; ESSD, SSD, Ultra Cloud Disks and Basic Cloud disks up to 32TB per disk. Shared Block Storage is a block-level data storage service with strong concurrency, high performance, and high reliability. 
It supports concurrent reads from and writes to multiple ECS instances. Shared Block Storage can be mounted to a maximum of 8 ECS instances. SSD and Ultra Cloud Disks up to 32 TB per disk.
Local disks are the disks attached to the physical servers (host machines) on which ECS instances are hosted. They are designed for business scenarios requiring high storage I/O performance. Local disks provide local storage and access for instances, and feature low latency, high random IOPS, high throughput, and cost-effective performance. SSD up to 8 × 1,788 GB and SATA HDD up to 154 TB.

Alibaba NAS
A storage space designed to store massive amounts of unstructured data that can be accessed by using standard file access protocols, such as the Network File System (NFS) protocol for Linux and the Common Internet File System (CIFS) protocol for Windows. You can set permissions to allow different clients to access the same file at the same time. NAS is suitable for business scenarios such as file sharing across departments, non-linear file editing, high-performance computing, and containerization (such as with Docker). It supports petabytes of data.

Auto Scaling
Auto Scaling automatically adjusts the volume of your elastic computing resources to meet your changing business needs. Based on the scaling rules that you set, Auto Scaling automatically adds ECS instances as your business needs grow to ensure that you have sufficient computing capabilities. When your business needs fall, Auto Scaling automatically reduces the number of ECS instances to save on costs. Auto Scaling provides a health check function and automatically monitors the health of ECS instances within scaling groups, so the number of healthy ECS instances in a scaling group does not fall below the minimum value that you set.

Billing Model
Pay as you go: Instance usage is billed on a minute basis, and the billing unit is US$/hour.
Discounts: Spot Instances (Preemptible instances).
The discount depends on the bid price (the maximum hourly price you are willing to pay): greater than 60% (around 80%). Reserved Virtual Machine Instances (monthly subscription): up to 60%. Dedicated Host: depends on how many of your legacy software licenses you can reuse.

Other
- Linux & Windows Support
- Public and Custom Image Support
- Snapshot support
- Start & Termination Script
- Elastic IP Addresses
- Cloud migration tool

Container Service, Container Service for Kubernetes & Elastic Container Instance (ECI)

Container Service for Kubernetes provides a high-performance, scalable container application management service that enables you to manage the lifecycle of enterprise-class containerized applications by using Kubernetes. By simplifying the setup and capability expansion of clusters and integrating with the Alibaba Cloud abilities of virtualization, storage, network, and security, Container Service for Kubernetes makes an ideal cloud environment for running Kubernetes containers, with two modes:
- Classic dedicated Kubernetes mode: You get more fine-grained control over cluster infrastructure and container applications; for example, you can select the host instance specification and the operating system, specify the Kubernetes version, customize Kubernetes attribute switch settings, and more. Alibaba Cloud Container Service for Kubernetes is responsible for creating the underlying cloud resources for the cluster and for upgrades and other automated operations on the cluster. You need to plan, maintain, and upgrade the server cluster. You can add servers to or remove servers from the cluster manually or automatically.
- Serverless Kubernetes mode: You do not need to create the underlying virtualization resources. To launch the application directly, use Kubernetes commands to specify the application container image, the CPU and memory requirements, and the external service methods.

Dedicated Kubernetes cluster: You must create three Master nodes and one or multiple Worker nodes for the cluster.
In addition, you need to plan, maintain, and upgrade the cluster as needed. With such a Kubernetes cluster, you can control cluster infrastructures in a more fine-grained manner. Managed Kubernetes cluster You only need to create Worker nodes for the cluster, and Alibaba Cloud Container Service for Kubernetes creates and manages Master nodes for the cluster. This type of Kubernetes cluster is easy to use, low-cost, and highly available. You can focus on the services supported by the cluster without needing to operate and maintain the Kubernetes cluster Master nodes. Serverless Kubernetes cluster You do not need to create or manage any Master nodes or Worker nodes for the cluster. You can directly use the Container Service console or the command line interface to set container resources, specify container images for applications, set methods to provide services, and start applications. Alibaba Container Service for Kubernetes features: Alibaba Load-balancing integration. Automatic scaling of your cluster’s node instance count Hybrid Networking Workload Portability, on-premises and cloud Identity and Access Management Integration Logging and Monitoring Container Registry allows you to manage images throughout the image lifecycle. It provides secure image management, stable image build creation across global regions, and easy image permission management. This service simplifies the creation and maintenance of the image registry and supports image management in multiple regions. Combined with other cloud services such as Container Service, Container Registry provides an optimized solution for using Docker in the cloud. Alibaba Container Service provides the high-performance and scalable container application management service, which enables you to manage the lifecycle of containerized applications by using Docker and Kubernetes. Container Service provides multiple application release methods and the continuous delivery ability, and supports microservice architecture. 
By simplifying the setup of container cluster and integrating with the Alibaba Cloud abilities of virtualization, storage, network, and security, Container Service makes an ideal running cloud environment for containers. Elastic Container Instance (ECI) is an agile and secure serverless container instance service. You can easily run containers without managing servers. Also you only pay for the resources that have been consumed by the containers. ECI helps you focus on your business applications instead of managing infrastructure. You can quickly and easily deploy containers to the cloud through a two-step procedure. An ECI container group is similar in concept to a pod in Kubernetes. In any case, for scenarios where you need full container orchestration, including service discovery across multiple containers, automatic scaling, and coordinated application upgrades, the best option is Container Service for Kubernetes. Container Service is currently free of charge. Resources used in collaboration with Container Service (including Server Load Balancer and ECS) are charged separately. For ECI you incur charges based on the number of Elastic Container Instances (ECI) you use. Simple Application Server & Batch Compute Simple Application Server suits you well if all you need is a private virtual machine. It provides you the all-in-one solution to launch and manage your application, set up domain name resolution, and build, monitor, maintain your website with just a few clicks. It makes private server building much easier, and it is the best way for beginners to get started with cloud computing. 
Scenarios of the Simple Application Server:
- Building a small-sized website
- Building a personal blog
- Building a forum/community
- Building a knowledge or efficiency management tool
- Building a personal learning environment
- Building a small E-commerce website
- Building a development environment

Batch Compute is a distributed cloud service suitable for processing massive volumes of data concurrently. Batch Compute supports massive concurrent jobs. The system automates resource management, job scheduling, and data loading, and supports billing on a Pay-As-You-Go basis. In layman’s terms, Batch Compute allows you to submit any computing program to be run on multiple Alibaba Cloud virtual machine (VM) instances. The results are then written to a specified persistent storage location (such as Alibaba Cloud OSS or NAS) where you can view them.

Features:
- Support for multi-node parallel jobs
- Granular job definitions
- Job scheduling
- Dynamic compute resource provisioning and scaling
- Integrated monitoring and logging
- Fine-grained access control

Simple Application Server provides a monthly package of resources at a fixed charge and currently supports monthly and yearly prepayment. With Batch Compute, you pay for the compute and storage resources consumed by your jobs or clusters. There is no additional charge for resource management and job scheduling services.

Functions Compute

Language Runtimes
Java 8, Node.js 6 & 8, PHP 7.2 and Python 2.7 & 3.6

Events and Triggers
- HTTP
- Alibaba Cloud Object Storage Service (OSS)
- CDN events
- Timer
- MNS topic
- Table Store
- Log Service

Timeout
The default function timeout is 3 seconds. The function timeout can be set to any value between 1 and 600 seconds.

SLA
Monthly Uptime Percentage of at least 99.95%

Billing Model
Alibaba Cloud Function Compute is billed on a Pay-As-You-Go basis.
The fee consists of three parts:
- The total number of function calls
- Execution duration, which starts when your code begins to run and ends when the result is returned or execution is terminated. The measurement granularity is 100 milliseconds. The duration price depends on the memory size that you have allocated to functions.
- Public network traffic
Pricing includes a monthly free grant of 1 million requests and 400,000 GB-s of resource consumption.

Public Cloud Compute Services: IaaS Comparison

Virtualization Technology
AWS: Xen
GCP: KVM
Azure: Customized version of Hyper-V
Alibaba: Transition from Xen to KVM since 2014

Nested Virtualization
AWS: Partial, in i3.metal instance
GCP: Nested virtualization can only be enabled for L1 VMs running on Haswell processors or later (KVM & Linux)
Azure: Yes, Linux and Windows
Alibaba: Yes, in ECS Bare Metal Instance

SLA (Monthly Uptime Percentage to Customer)
AWS: 99.99%
GCP: 99.99%
Azure: 99.99%
Alibaba: 99.99%

Machine Types and Sizes
AWS: Up to 96 VCPU & 768 GB Memory; up to 16 GPU & 54 GB of GPU Memory
GCP: Up to 416 VCPU & 1.4 TB Memory (mega-memory machine types); up to 8 GPU & 96 GB of GPU Memory
Azure: Up to 416 VCPU & 11.4 TB Memory; up to 8 GPU & 96 GB of GPU Memory
Alibaba: Up to 208 VCPU & 3.8 TB Memory; up to 8 GPU & 256 GB of GPU Memory

Machine Options
AWS:
- Dedicated Instances: On Demand Instances (seconds & hourly); Reserved Instances (1-3 years); Spot Instances
- Dedicated Hosts: On Demand Instances (seconds & hourly); Reserved Instances (1-3 years)
- On Premises: AWS Outposts, which lets you run AWS infrastructure and services on premises
GCP:
- Dedicated Instances: On Demand Instances (seconds); Reserved Instances (1-3 years); Spot Instances (Preemptible VM); Shielded VM
- Dedicated Hosts (Sole-tenant nodes): On Demand Instances (seconds & hourly); Reserved Instances (1-3 years)
- On Premises: Anthos GKE on-prem (GKE on-prem) is hybrid cloud software that brings GKE to on-premises data centers
Azure:
- Dedicated Instances: On Demand Instances (seconds & hourly); Reserved Instances (1-3 years); Spot Instances (low-priority VM)
- Dedicated Hosts (Isolated VM): On Demand Instances (seconds & hourly); Reserved Instances (1-3 years)
- On Premises: Azure Stack Portfolio is an extension of Azure to consistently build and run hybrid applications across datacenters, edge locations, remote offices, and cloud
Alibaba:
- Dedicated Instances: On Demand Instances (minutes); Reserved Instances (monthly); Spot Instances (Preemptible VM)
- Dedicated Hosts: Reserved Instances (monthly)

Disks (Block & File Devices)
AWS:
- Instance store volumes, attached to the host computer for the instance. SSD (up to 60 TB) and Magnetic (up to 48 TB)
- Elastic Block Storage (EBS), attached to any running instance that is in the same Availability Zone. SSD (up to 16 TB) and Magnetic (up to 16 TB)
- Cloud File Storage, which allows access to data through shared file systems (petabytes of data)
GCP:
- Local Disk, attached to the host computer for the instance. SSD (up to 9 TB)
- Persistent Disk, attached to any running instance that is in the same Zone or Region. SSD (up to 64 TB) and Magnetic (up to 64 TB), with the option to reach 257 TB
- File server, which allows access to data through shared file systems (petabytes of data)
- RAM disks, which share instance memory (they use the RAM of the instance)
Azure:
- Azure Disk Storage, a virtual hard disk (VHD) attached to the host computer for the instance. Ultra solid-state drives (SSD) (preview) up to 65 TB, premium SSD, standard SSD, and standard hard disk drives (HDD) that support up to 32 TB
- Azure File Storage, which allows access to data through shared file systems (5 TB per share)
Alibaba:
- Cloud Disk, attached to the host computer for the instance. ESSD, SSD, Ultra Cloud Disks and Basic Cloud disks up to 32 TB per disk
- Shared Block Storage, attached to any running instance that is in the same Availability Zone. SSD and Ultra Cloud Disks up to 32 TB per disk
- Local disks, the disks attached to the physical servers (host machines) on which ECS instances are hosted. SSD up to 8 × 1.78 TB and SATA HDD up to 154 TB
- Alibaba NAS, which allows access to data through shared file systems (petabytes of data)

Autoscaling
AWS:
- Scaling options: Manual; Schedule; Dynamic policies; Monitoring policies
- Cooldowns support
- Shutdown script
- Health check support
- Removal Policy
GCP:
- Scaling options: Dynamic policies; Monitoring policies
- Cooldowns support
- Shutdown script
- Health check support
Azure:
- Scaling options: Manual; Schedule; Dynamic policies; Monitoring policies; Application policies
- Cooldowns support
- Shutdown script (preview)
- Health check support
- Notification & webhooks support
Alibaba:
- Scaling options: Manual; Schedule; Dynamic policies; Monitoring policies
- Cooldowns support
- Shutdown script
- Health check support
- Removal Policy

Billing Model
AWS:
- On Demand: Pricing is per instance-hour (each partial instance-hour consumed is billed per second for Linux instances and as a full hour for all other instance types)
- Discounts: Spot Instances up to 90%; Reserved Instances (1-3 years) up to 75%
GCP:
- On Demand: Pricing is per instance-second (minimum 1 minute)
- Discounts: Spot (Preemptible) Instances up to 80%; Reserved Instances (1-3 years) up to 70%; Sustained use discounts (when an instance uses a vCPU for more than 25% of a month) up to 30%
Azure:
- On Demand: Pricing is per instance-second
- Discounts: Spot (low-priority VM) Instances up to 80% for Linux and 60% for Windows; Reserved Instances (1-3 years) up to 72%
Alibaba:
- On Demand: Pricing is per instance-minute
- Discounts: Spot (Preemptible) 60% to 80%; Reserved Instances (monthly) up to 60%

Other
AWS:
- Linux & Windows Support
- Public and Custom Image Support
- Snapshot support
- Migration tool & methodology
- Lightweight version (Lightsail)
- VMware Cloud on AWS
- Parallel Cluster management based on open source
GCP:
- Linux & Windows Support
- Public and Custom Image Support
- Snapshot support
- Migration tool & methodology
- Managed and unmanaged Instance Groups Support
Azure:
- Linux & Windows Support
- Public and Custom Image Support
- Snapshot support
- Migration tool & methodology
- Update and Fault Domains
- Azure CycleCloud: an enterprise-friendly tool for orchestrating and managing High Performance Computing (HPC) environments on Azure
Alibaba:
- Linux & Windows Support
- Public and Custom Image Support
- Snapshot support
- Cloud migration tool

Public Cloud Compute Services: CaaS Comparison

Custom Container Service
AWS: EC2 Container Service (ECS)
Azure: Azure Container Service (ACS), scheduled to be retired on January 31, 2020
Alibaba: Alibaba Container Service

Kubernetes Container Service
AWS: Elastic Container Service for Kubernetes (EKS)
- AWS Load-balancing integration
- Automatic scaling of your cluster’s node instance count
- Automatic upgrades for your cluster’s node software
- Hybrid Networking
- Workload Portability, on-premises and cloud
- Identity and Access Management Integration
- Logging and Monitoring
GCP: Google Kubernetes Engine (GKE)
- GCP load-balancing integration
- Node pools to designate subsets of nodes within a cluster for additional flexibility
- Multi-zone Clusters or Regional Clusters
- Automatic scaling of your cluster’s node instance count
- Automatic upgrades for your cluster’s node software
- Node auto-repair to maintain node health and availability
- Hybrid Networking
- Workload Portability, on-premises and cloud
- Dashboard for GKE clusters and their resources
- Identity and Access Management Integration
- Logging and Monitoring
Azure: Azure Kubernetes Service (AKS)
- Azure Load-balancing integration
- Automatic scaling of your cluster’s node instance count
- Coordinated application upgrades
- Hybrid Networking
- Workload Portability, on-premises and cloud
- Identity and Access Management Integration
- Logging and Monitoring
Alibaba: Alibaba Container Service for Kubernetes, with 3 options: Dedicated Kubernetes cluster; Managed Kubernetes cluster; Serverless Kubernetes cluster
- Alibaba Load-balancing integration
- Automatic scaling of your cluster’s node instance count
- Hybrid Networking
- Workload Portability, on-premises and cloud
- Identity and Access Management Integration
- Logging and Monitoring

Registry service
AWS: Amazon ECR Registry
- Fine-grained access control
- Existing CI/CD integrations
GCP: Google Container Registry
- Perform vulnerability analysis
- Fine-grained access control
- Existing CI/CD integrations
Azure: Azure Container Registry
- Geo-replication
- Fine-grained access control
- Existing CI/CD integrations
Alibaba: Alibaba Container Registry

Billing Model
AWS: Amazon EKS cluster (per hour), plus the AWS resources you create to run your Kubernetes worker nodes
GCP: Node instances according to VM, Storage and Network pricing
Azure: Node instances according to VM, Storage and Network pricing
Alibaba: Node instances according to VM, Storage and Network pricing

Other Services
GCP: Google Cloud Build executes your builds on Google Cloud Platform’s infrastructure. GKE on premises with Anthos GKE on-prem, which brings Google Kubernetes Engine (GKE) to on-premises data centers. Knative provides an open API and runtime environment that enables you to run your serverless workloads anywhere you choose.
Azure: Service Fabric, Microsoft’s container orchestrator for deploying microservices across a cluster of machines. Microservices can be developed in many ways, from using the Service Fabric programming models or ASP.NET Core to deploying any code of your choice.
Azure Container Instances (ACI) offers the fastest and simplest way to run a container in Azure, without having to manage any virtual machines and without having to adopt a higher-level service. Elastic Container Instance (ECI) is an agile and secure serverless container instance service. You can easily run containers without managing servers. Public Cloud Compute Services: AaaS Comparison AWS GCP Azure Alibaba Web Apps AWS Elastic Beanstalk -Wide Selection of Application Platforms; Java, .NET, Node.js, PHP, Ruby, Python, Go, and Docker to deploy your web applications. -Variety of Application Deployment Option (Visual Studio and Eclipse) -Monitoring, Logging, and Tracing -Management and Updates -Scaling -AWS Resources Customization Google App Engine (Standard and Flexible environment) – Fully managed serverless application platform -Wide Selection of Application Platforms; Java, PHP, Node.js, Python, C#, .Net, Ruby and Go and Docker to deploy your web applications. -Variety of Application Deployment Option (Cloud Source Repositories, IntelliJ IDEA, Visual Studio) -Monitoring, Logging, and Diagnostics -Application Versioning -Scaling -GCP Resources Customization – Traffic Splitting – Application Security Azure App Service –Wide Selection of Application Platforms; Java, .NET, Node.js, PHP, Python and Docker to deploy your web and mobile applications. -Auto-scaling -High availability -Supports both Windows and Linux -Enables automated deployments from GitHub, Azure DevOps, or any Git repository -Monitoring, Logging, and Tracing -Management and Updates Cloud Services Offers access to the underlying Azure VMs Azure Spring Cloud (preview) Azure Spring Cloud makes it easy to deploy Spring Boot-based microservice applications to Azure with zero code changes Simple Application Server It provides you the all-in-one solution to launch and manage your application, set up domain name resolution, and build, monitor, maintain your website with just a few clicks. 
Focus on beginners to get started with cloud computing. Batch Apps AWS Batch –Support for multi-node parallel jobs -Granular job definitions -Simple job dependency modeling -Support for popular workflow engines -Dynamic compute resource provisioning and scaling -Priority-based job scheduling -Dynamic spot bidding -Integrated monitoring and logging -Fine-grained access control App Engine Cron Service (basic batch only scheduling tasks) Batch on GKE A cloud-native solution for scheduling and managing batch workloads. With Batch, you can leverage the on-demand and flexible nature of cloud. Batch is based on Kubernetes and containers so your jobs are portable. Cloud Dataflow – Based on Apache Beam (java & Python) -Automated Resource Management -Dynamic Work Rebalancing -Horizontal Auto-scaling Azure Batch –Support for multi-node parallel jobs -Granular job definitions -Simple job dependency modeling -Support for popular workflow engines -Dynamic compute resource provisioning and scaling -Priority-based job scheduling -Integrated monitoring and logging -Fine-grained access control Batch Compute –Support for multi-node parallel jobs -Granular job definitions -Job scheduling -Dynamic compute resource provisioning and scaling -Integrated monitoring and logging -Fine-grained access control Billing Model You pay only for AWS resources (e.g. EC2 instances or S3 buckets) you create to store and run your application App Engine flexible you pay only for the resources allocated App Engine standard environment are deployed to instance classes that you specify, that have a cost per hour per instance. Cloud Dataflow service usage is billed in per second increments, on a per job basis. Azure App Service pricing is per hour with a cost depending on the plan. Azure Cloud Services pricing is per hour with a cost depending on the VM chosen Azure batch pricing is per hour with a cost depending on the VM chosen. You can also select low priority VM for higher discounts. 
Simple Application Server provides a monthly package of resources at a fixed charge and currently supports monthly and yearly pre-payment payment methods. Batch Compute, you pay for the compute and storage resources consumed by your jobs or clusters. Public Cloud Compute Services: FaaS Comparison AWS GCP Azure Alibaba Language Runtimes – JavaScript (Node.js 12 & 10) – Python 3.8-3.6-3.7-2.7, – Go (1.x) – Java 8-11 – PowerShell – C# (.Net Core 3.1 and 2.1) – Ruby 2.7-2.5 – JavaScript (Node.js 6-8-10) – Python (3.7.6) – Go (1.11 and 1.13(beta)) – JavaScript (Node.js 6,8,10&12) – Python 3.6-3.7-3.8 – Java 8 – C#-F# (.NET Framework 4.7 &.NET Core 2.2 & 3.1) – JavaScript (Node.js 6 & 8) – Python 2.7 & 3.6 – Java 8 – PHP 7.2 SLA (Monthly Uptime Percentage to Customer) <= 99.95% <= 99.5% <= 99.95% <= 99.95% Events and Triggers – HTTP— HTTP requests. – Amazon S3 – Amazon DynamoDB – Amazon Kinesis Data Streams – Amazon Simple Notification Service – Amazon Simple Email Service – Amazon Simple Queue Service – Amazon Cognito – AWS CloudFormation – Amazon CloudWatch Logs – Amazon CloudWatch Events – AWS CodeCommit – Scheduled Events (powered by Amazon CloudWatch Events) – AWS Config – Amazon Alexa – Amazon Lex – Amazon API Gateway – AWS IoT Button – Amazon CloudFront – Amazon Kinesis Data Firehose – Other Event Sources: Invoking a Lambda Function On Demand – HTTP— HTTP requests. – Cloud Storage – Cloud Pub/Sub – Cloud Firestore -Firebase (Realtime Database, Storage, Analytics, Auth) –Stackdriver Logging—forward log entries to a Pub/Sub topic by creating a sink. You can then trigger the function – HTTP & Webhooks – Blob Storage – Cosmos DB – Event Grid – Event Hubs – Microsoft Graph Events – Queue storage – Service Bus – Timer – HTTP— HTTP requests. – Alibaba Cloud Object Storage Service (OSS) – CDN events – Timer – MNS topic – Table Store – Log Service Timeout Default 3 Seconds. Up to 15 Minutes Default 1 Minute. Up to 9 Minutes Default 5 Minute. Up to 10 Minutes. 
(With the Premium and App Service plans, the timeout can be up to 60 minutes.)
Alibaba: Default 3 seconds. Up to 10 minutes

Billing Model
AWS: Number of requests + Execution time + Memory allocated + Networking (outbound data transfer). 1M free requests per month and 400,000 GB-seconds of compute time per month.
GCP: Number of requests + Compute time + Memory allocated + Networking (outbound data transfer). 2M free requests per month regardless of duration.
Azure: Number of requests + Execution time + Memory allocated + Networking (outbound data transfer). 1M free requests per month and 400,000 GB-seconds of compute time per month. Customers can also run Functions within their App Service plan at regular App Service plan rates.
Alibaba: Number of requests + Execution time + Memory allocated + Public Network Traffic. 1M free requests per month and 400,000 GB-seconds of compute time per month.

Conclusion
The cloud computing services offered by AWS, GCP, Azure and Alibaba can be analyzed from three perspectives:
– From the point of view of the services offered in each layer
– From the point of view of the developer who has to create an application in the cloud
– From the point of view of the evolution during 2019-2020

From the point of view of the layers offered

IaaS
Currently the four vendors analyzed offer very similar services, with almost identical pricing and SLA models. The choice of vendor depends on factors such as:
– The presence of the vendor in your country and the level of commitment when making discounts. In this aspect Azure seems to have more freedom when it comes to adapting to the needs of a client.
– The need to use other services of the provider. For example, if a Big Data and Analytics service is required, GCP is probably the best option. If a full-stack cloud service platform is required, AWS could be the best option. If your platform is Microsoft, Azure should be the first vendor to evaluate. If your market is mainly China, Alibaba could be an option.
– The presence of the vendor's data centres in your country, which allows low-latency hybrid solutions.
– The level of support for a hybrid approach, which is clearly the big trend of 2019-2020.
In an IaaS environment it is probably convenient to have two suppliers, to be able to compare services and prices.

CaaS
In my opinion CaaS is the future of cloud computing services, and more specifically Kubernetes, which gives you portability of your solution to other vendors and the ability to define hybrid clusters. In this aspect GCP has the lead, since it was the precursor of Kubernetes. However, AWS, Azure and Alibaba have all pivoted quickly to make Kubernetes their star solution for CaaS. This is where big changes are taking place (as in fact happened in 2019):
– Offer Kubernetes in serverless mode
– Expand the registry services to cover the full application life cycle management
– Include a microservices architecture
– Include batch services (already covered by GKE)
Although GCP is the vendor with the most Kubernetes experience, we must keep an eye on the moves of the other vendors, who have understood the relevance of offering CaaS over Kubernetes.

AaaS
AaaS (Application as a Service) was an attempt by the vendors to offer simplified web development. However, it creates a lock-in with the vendor that is currently not acceptable if we want to ensure the future portability of our applications. The trend is for CaaS with Kubernetes to start offering serverless models and microservices architectures that do not tie you to the provider. Under AaaS there are also batch solutions, where the scheduler and job management model is very similar across vendors, but in my opinion a more portable solution under Kubernetes will be offered soon.
So in summary, try to avoid the AaaS offerings of the vendors (with the exception of batch solutions, which have low lock-in and no alternatives in other layers).

FaaS
Regarding FaaS, just as GCP was with Kubernetes, AWS was the precursor of the serverless function model and offers the most robust and integrated solution. The rest of the providers quickly added this capability with a similar approach. The good news is that with a minimal architecture layer it is possible to develop functions that are easily portable among vendors. The bad news is that functions are not general purpose: they apply only to use cases that do not require guaranteed latency (although some vendors may guarantee latency at an additional cost) and do not contain long-running processes.

From the point of view of the developer who has to create an application in the cloud
If I were a developer, my bet would clearly be a CaaS Docker model orchestrated by Kubernetes, complemented with FaaS for stream processing and some simple microservices. The registry service offered by the vendors is a good starting point, but it is necessary to strengthen the life cycle management with products like:
– Helm (package manager for Kubernetes)
– Spinnaker (continuous delivery platform aligned with Kubernetes)
– Jenkins (continuous integration platform)
In this area GCP is defining a DevOps model for Kubernetes that could also be a reference. Finally, during 2019 GCP released a batch solution over GKE that covers the big gap it previously had.

From the point of view of evolution during 2019-2020
The main evolutions of the compute services of the four providers during 2019-2020 have focused on:
– Provide hybrid cloud capabilities.
Here the forerunner was Azure, which has expanded its reach, followed by AWS with AWS Outposts and GCP with its Kubernetes solution, Anthos GKE on-prem.
– Allow High Performance Computing (HPC) configurations with parallel clusters and management tools
– Add serverless capabilities to container solutions
– Implement microservices alternatives, such as support for Spring Boot-based microservice applications in Azure
– Improve the development life cycle of CaaS and FaaS solutions
– Cover necessary gaps such as Batch support in GCP... Read more...
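The FaaS billing model in the comparison above (number of requests plus execution time multiplied by memory allocated, minus a monthly free allowance) can be sketched as a small calculation. This is only an illustration: the function name and both unit prices are hypothetical placeholders rather than any vendor's published rates, networking charges are left out, and the default free allowance is the 1M requests and 400,000 GB-seconds figure quoted in the comparison.

```python
def faas_monthly_cost(requests, avg_duration_s, memory_gb,
                      price_per_million_requests, price_per_gb_second,
                      free_requests=1_000_000, free_gb_seconds=400_000):
    """Estimate a monthly FaaS bill (networking excluded).

    Prices are caller-supplied assumptions, not real vendor rates.
    """
    # Compute usage is measured as execution time multiplied by memory.
    gb_seconds = requests * avg_duration_s * memory_gb

    # Only usage above the monthly free allowance is billed.
    billable_requests = max(0, requests - free_requests)
    billable_gb_seconds = max(0.0, gb_seconds - free_gb_seconds)

    request_cost = billable_requests / 1_000_000 * price_per_million_requests
    compute_cost = billable_gb_seconds * price_per_gb_second
    return request_cost + compute_cost


# Hypothetical prices: 0.20 per million requests, 0.00001 per GB-second.
cost = faas_monthly_cost(3_000_000, 0.2, 0.5, 0.20, 0.00001)
print(f"Estimated monthly cost: {cost:.2f}")
```

With these made-up prices, three million short invocations stay inside the compute allowance, so only the two million requests above the free tier are charged.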
- Advantages and disadvantages of moving workloads to a Public Cloud
  Advantages of Public Cloud
– No requirement of IT investment
– Pay per Use
– Elasticity/Scalability
– Key services out of the box; increased deployment speed & agility
– Self Management-Automation
– Global Deployment
– Reliability
– Cost
– Security & Compliance
– Ecosystem of additional services

Disadvantages of Public Cloud
– Cost
– Security & Compliance
– SLA responsibility
– Vendor Lock-in
– Latency in Hybrid models

Let's look in detail at each characteristic of the Public Cloud and when it is an advantage or a drawback.

No requirement of IT Investment
This is one of the clearest advantages of the Public Cloud, along with pay per use. It allows new businesses to take off without heavy investments and to go through multiple rounds of trial and error without financial consequences. The savings in IT investment cover not only hardware, software, network and security, but also the need for a data center and all the expenses involved.

Pay per Use
Pay per use is one of the main characteristics of the Public Cloud: you pay exclusively for the use of a resource (time or requests) while it is active and stop paying when it is inactive, without any long-term commitment. The concept of pay per use has many nuances in the Public Cloud and it is fundamental to understand them. For example, while a VM instance is running it is charged by the hour or minute, and when the instance is deleted the charge stops.
But there is also the option to stop the instance without deleting it, in which case the charge also stops (except for the reserved disk). Another example is functions, where pay per use is mapped to the number of invocations regardless of the time that passes between them. Pay per use is an advantage for variable workloads; however, as we are going to see in the Cost section, other options are better for steady workloads.

Elasticity/Scalability
Elasticity/Scalability is one of the best-known features of the Public Cloud, but there is a catch. We are talking mainly about horizontal elasticity/scalability. The vertical scalability provided by the Public Cloud is limited and far from the high-end servers and mainframes of traditional companies. In addition, in most Public Clouds vertical elasticity is also limited, forcing you to relaunch the VM instance to increase or change the assigned CPU. Therefore, to make use of the much-proclaimed elasticity/scalability of the Public Cloud, your applications must be designed to scale horizontally: microservices, NoSQL, stateless...

Key services out of the box; increased deployment speed & agility
The Public Cloud offers out of the box all the services necessary to deploy business applications: Compute, Networking, Storage and Databases, Middleware, Management & Development tools, Identity & Security, Big Data, Machine Learning, ... In addition, together with these services, the providers offer specific patterns for each use case and industry, which greatly accelerates the deployment of applications.

Self Management-Automation
By definition, the Public Cloud is based on a software-defined approach, which implies a high level of automation.
The whole configuration of a cloud solution is defined in a parametrized way, and tools are offered to deploy all the resources required by your application in a declarative format (based on templates to programmatically control what gets deployed). This is a clear advantage over a legacy data center. In addition, all the cloud services and resources are integrated with a common Monitoring, Logging and Error Reporting system.

Global Deployment
The Public Cloud allows applications to be deployed globally by replicating solution configurations in the corresponding regions quickly and economically. The providers also offer storage services that are replicated automatically between regions, allowing users to access content efficiently. Finally, they also offer a global Content Delivery Network (CDN), with edge points distributed around the world, to accelerate content delivery for websites and applications.

Reliability
The Public Cloud allows the deployment of low-cost Disaster Recovery solutions. Since a complete solution configuration can be defined in software, it is not necessary to keep a replica of the data center waiting for a problem. For those cases where fast high availability is required, all Public Clouds offer a global load-balancing system that redistributes the load between different zones and regions. In general, cloud disaster recovery systems can be deployed much more quickly and with better control over your resources.

Cost
The cost of Public Clouds is another topic of debate. There are certain use cases in which the cost of a Public Cloud is clearly better, but other cases require a more detailed analysis. The best use cases, in terms of cost, for moving workloads to the Public Cloud are:
– Those that require a new investment in infrastructure
– Fluctuating workloads
– Global workloads
For 24×7 stable workloads, further analysis is necessary.
In fact, the pay-per-use model is usually not the best option for them, and you have to change to a subscription model with a usage commitment, where discounts can reach 70%. In any case, cost management in a Public Cloud is completely different from a traditional environment, so it is necessary to assign specific staff to monitor invoices and constantly identify options to improve costs by volume or billing model. In addition, exhaustive control of unused resources is required so that they can be removed from the invoice.

Security & Compliance
Security is an area on which all Public Clouds are focusing. Currently the compliance and security level of the Public Cloud is much higher than that of a traditional data center, but unfortunately there are still certain regulatory aspects in each country that require additional and specific approval by the local regulatory entity. In addition, the Public Cloud provider manages security of the cloud, while security in the cloud is the responsibility of the customer. That means that you need to install and configure additional layers of security.

Ecosystem of additional services
Having access to the entire ecosystem of cloud services is one of the main advantages of cloud applications. In addition, this ecosystem is constantly growing and improving, incorporating new trends (IoT, AI, ...).

SLA responsibility
SLA responsibility is another controversial issue in the Public Cloud. Although service levels are well defined, it is not clear what compensation applies when a breach impacts the company. In fact, in the most recent cases of service-level breaches, the lawyers of the Public Cloud providers managed to water down any type of compensation.

Vendor Lock-in
Vendor lock-in is a risk when moving to a specific cloud. However, with a good architecture that isolates applications from the dependencies of each cloud platform, it is possible to reduce it. In fact, using containers and functions together with an architecture layer to reduce dependencies should reduce the lock-in.
You should avoid what the vendors call Apps as a Service; in addition, using a multi-cloud database (like MongoDB) also reduces the level of lock-in at the database layer. Finally, another area in which to reduce lock-in is the deployment tool and language used to provision and configure all the infrastructure resources in your cloud environment. Again, try to use standard solutions (like Chef or Puppet) and build a layer to isolate dependencies.

Latency in Hybrid models
Latency in hybrid models is an aspect to consider when migrating an on-premises solution to the Public Cloud. This latency can be mitigated with dedicated communication lines and by choosing autonomous workloads with low dependence on legacy systems. Additionally, if one of the provider's regions happens to be in the same geographical area as the on-premises data center, it is possible to establish low-latency communications. ... Read more...
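The trade-off described in the Pay per Use and Cost sections (pay per use for variable workloads, a commitment with a discount for steady ones) comes down to simple arithmetic. The sketch below is a simplified model under stated assumptions: the function names, the hourly rate and the 730-hour month are illustrative, and the 70% discount is the upper bound mentioned in the article, not a quoted price. It assumes a commitment is paid for every hour of the month whether or not the instance runs.

```python
HOURS_PER_MONTH = 730  # assumed average number of hours in a month


def monthly_costs(hourly_rate, hours_used, discount):
    """Return (pay_per_use_cost, committed_cost) for one instance."""
    # Pay per use: only the hours the instance actually runs are billed.
    pay_per_use = hourly_rate * hours_used
    # Commitment: discounted rate, but billed for the whole month.
    committed = hourly_rate * (1 - discount) * HOURS_PER_MONTH
    return pay_per_use, committed


def break_even_utilization(discount):
    """Fraction of the month above which the commitment is cheaper."""
    return 1 - discount


# Hypothetical rate of 1.00 per hour with a 70% commitment discount.
for hours in (100, 219, 730):
    ppu, committed = monthly_costs(1.00, hours, 0.70)
    print(f"{hours:>3} h: pay per use {ppu:.2f} vs committed {committed:.2f}")
```

Under this model a 70% discount breaks even at about 30% utilization, which is why a 24×7 workload favors the commitment while a workload running a few hours a day does not.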