The Beginning: Amazon Web Services (AWS), Elastic Compute Cloud (EC2), rented VMs, launched in 2006
Programmers previously optimized when things were too slow, but now we also need to optimize when things are too expensive (cost is not always obvious at the moment you are running a job, so you need to do "back of the envelope" estimates until you get a bill)
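A "back of the envelope" estimate can be as simple as multiplying VM count, hours, and hourly price. The sketch below is only illustrative: the function name `estimate_job_cost` and all the numbers are hypothetical, and real bills usually include storage, network, and other charges that this ignores.

```python
def estimate_job_cost(num_vms: int, hours: float, hourly_rate_usd: float) -> float:
    """Rough compute-only cost: VMs x hours x hourly price per VM."""
    return num_vms * hours * hourly_rate_usd


# Hypothetical numbers for illustration; look up real prices for your provider.
cost = estimate_job_cost(num_vms=20, hours=3.5, hourly_rate_usd=0.10)
print(f"estimated compute cost: ${cost:.2f}")
```

Running an estimate like this before launching a job makes it easier to notice when a design choice (more VMs, longer runtime) changes the cost by an order of magnitude.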
Service Types:
IaaS (Infrastructure as a Service)
EC2, other services that feel closer to raw hardware
virtual disks, virtual network, some storage systems, etc.
cheap+flexible -- you can deploy anything on it (Cassandra, Kafka, etc).
PaaS (Platform as a Service)
Cloud provider has deployed systems on the infrastructure; you pay to use the deployed system
databases, application frameworks/platforms, ML training/deployment systems
less flexible, easier to use
often more expensive (though not necessarily more than doing it yourself, due to efficiencies available to the cloud provider but not to you)
Lock In: If it's hard to move to a competing cloud, you are "locked in".
Compute
Memory
IaaS
memory is often roughly proportional to CPU resources
"memory optimized" VMs skew heavy on RAM (very expensive! at high end >10 TB)
PaaS: often open-source platforms provided as a service.
Storage
VM disks are virtual block devices
Network
Common Pattern
Duration is calculated from the time your code begins executing until it returns or otherwise terminates, rounded up to the nearest 1 ms (check whether you have a large number of small ops that are each getting rounded up)
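To see why the rounding matters for many small operations, here is a minimal sketch of the round-up rule described above. The helper names `billed_ms` and `total_billed_ms` and the 0.2 ms call duration are assumptions made for illustration, not part of any provider's API.

```python
import math


def billed_ms(duration_ms: float, granularity_ms: int = 1) -> int:
    """Round one invocation's duration up to the billing granularity."""
    return math.ceil(duration_ms / granularity_ms) * granularity_ms


def total_billed_ms(durations_ms) -> int:
    """Each invocation is rounded up separately, then summed."""
    return sum(billed_ms(d) for d in durations_ms)


# Hypothetical workload: one million calls that each run for 0.2 ms.
calls = [0.2] * 1_000_000
billed = total_billed_ms(calls)
print(f"billed: {billed / 1000:.0f} s")  # each 0.2 ms call is billed as 1 ms
```

In this made-up workload the billed time is about five times the time actually spent executing, which is the kind of gap worth checking for. Batching several small operations into one invocation is one way to reduce it.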
On-Demand vs. Spot Instances
Scaling and Billing
Google Architecture (early systems)
Google (Papers) => Hadoop (open-source software)
HDFS - Spark - Cassandra - Kafka
Colossus (HDFS)
Colossus is indirectly available to customers via GCS and other services
BigQuery (Spark)
Blurred analytics architecture
For analytics, we'll want a column-oriented format:
successor to ColumnIO at Google
optimized for repeated values
optimization: run-length encoding
optimization: dictionary encoding
protocol buffers (protobufs)
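The two column-format optimizations listed above, run-length encoding and dictionary encoding, can be sketched in a few lines. This is a toy illustration on a Python list, not how any particular file format lays out its bytes; the function names and the sample column are made up for the example.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encoding: collapse consecutive repeats into (value, count)."""
    return [(v, len(list(group))) for v, group in groupby(values)]


def rle_decode(runs):
    """Expand (value, count) pairs back into the original sequence."""
    return [v for v, count in runs for _ in range(count)]


def dict_encode(values):
    """Dictionary encoding: store each distinct value once, plus small int codes."""
    dictionary, index, codes = [], {}, []
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        codes.append(index[v])
    return dictionary, codes


def dict_decode(dictionary, codes):
    """Look each code up in the dictionary to recover the original values."""
    return [dictionary[c] for c in codes]


column = ["WI", "WI", "WI", "IL", "IL", "WI"]
print(rle_encode(column))   # [('WI', 3), ('IL', 2), ('WI', 1)]
print(dict_encode(column))  # (['WI', 'IL'], [0, 0, 0, 1, 1, 0])
```

Both tricks pay off when a column has many repeated values, which is common once values from the same column are stored next to each other. The two can also be combined, for example by run-length encoding the integer codes produced by dictionary encoding.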
BigTable (Cassandra)
Kafka