Memo: Innovation with data
May 15, 2024
This section focuses on GCP data management services, as preparation for the GCP Cloud Digital Leader (CDL) exam.
<Storage>
- Persistent Disk: block storage for VM instances.
- Cloud Storage (GCS): serverless object storage with global edge caching, offered in 4 storage classes:
1. Standard: no minimum storage duration, for frequently accessed data
2. Nearline: 30-day minimum storage duration, for data accessed about once per month
3. Coldline: 90-day minimum storage duration, for data accessed about once per quarter
4. Archive: 365-day minimum storage duration, for rarely accessed data (less than once per year)
- Filestore: file-system storage, a high-performance managed NFS file server
- Storage Transfer Service: migrates data between cloud storage services (e.g. Amazon S3 to GCS)
- Transfer Appliance: a physical device for moving TB/PB of data when shipping it is faster than transferring over the network
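To make the minimum storage durations above easier to remember, here is a minimal sketch that maps an expected retention time to a storage class. The helper names `pick_storage_class` and `days_short_of_minimum` are hypothetical, not part of any GCP library, and the logic considers only retention time, not access frequency or pricing.

```python
# Minimum storage durations (days) per Cloud Storage class, as listed above.
MIN_STORAGE_DAYS = {
    "STANDARD": 0,
    "NEARLINE": 30,
    "COLDLINE": 90,
    "ARCHIVE": 365,
}


def pick_storage_class(expected_days_stored: int) -> str:
    """Return the coldest class whose minimum duration the object will meet.

    Hypothetical helper: it looks only at retention time and ignores
    access frequency, retrieval fees, and per-GB pricing.
    """
    best = "STANDARD"
    # The dict is ordered from warmest to coldest, so the last match wins.
    for name, min_days in MIN_STORAGE_DAYS.items():
        if expected_days_stored >= min_days:
            best = name
    return best


def days_short_of_minimum(storage_class: str, days_stored: int) -> int:
    """Days still owed to the minimum duration if the object is deleted now."""
    return max(0, MIN_STORAGE_DAYS[storage_class] - days_stored)


print(pick_storage_class(45))                 # object kept ~45 days
print(days_short_of_minimum("COLDLINE", 60))  # deleted after 60 days
```

The second helper matters because deleting an object before its class minimum is generally billed as if it had been stored for the full minimum duration, so a colder class is not automatically cheaper for short-lived data.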
<Database>
SQL:
- Cloud SQL: managed relational database service that simplifies database management for MySQL and PostgreSQL
- Cloud Spanner: GCP’s relational database with global distribution, high availability (HA), and horizontal scalability
- BigQuery: serverless, highly scalable data warehouse queried with SQL, with built-in ML (BigQuery ML)
- Database Migration Service (DMS): serverless service for migrating on-premises databases to Cloud SQL
NoSQL:
- Firestore: document database (documents of key-value fields) with real-time sync
- Cloud Bigtable: high-throughput, fully managed wide-column database for large-scale analytic workloads
Cache:
- Memorystore: high-performance in-memory caching (managed Redis and Memcached)
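As a rough illustration of what a cache in front of a database does, the sketch below shows the common cache-aside pattern. It is an assumption-laden toy: `TTLCache` is a hypothetical in-process stand-in for a service like Memorystore, and `load_from_db` is a placeholder for a real database query.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for a cache such as Memorystore (hypothetical)."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._items = {}  # key -> (value, expiry time)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._items[key]
            return None
        return value

    def set(self, key, value):
        self._items[key] = (value, time.monotonic() + self.ttl)


def get_with_cache(cache, key, load_from_db):
    """Cache-aside: try the cache first, fall back to the database on a miss.

    Note: a stored value of None is treated as a miss in this sketch.
    """
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)
        cache.set(key, value)
    return value
```

With this pattern, repeated reads of the same key within the TTL are served from memory, and only the first read reaches the database.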
<Data analytics>
- BigQuery: serverless data warehouse for analytics
- Dataproc: big data batch processing with managed Apache Spark and Hadoop
- Dataprep: cleans and prepares data for analysis
- Dataflow: stream (real-time) and batch processing, e.g. processing logs and aggregating metrics
- Cloud Data Fusion: builds data pipelines
- Pub/Sub: event streaming and messaging
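To picture the "process logs and aggregate metrics" use case, here is a minimal plain-Python sketch of fixed-window aggregation, the kind of computation a streaming pipeline performs on an event stream. It does not use Dataflow or Pub/Sub APIs; the function `count_per_window` and the `(timestamp, status)` event format are assumptions for illustration.

```python
from collections import Counter


def count_per_window(events, window_seconds=60):
    """Count events per (window_start, status) using fixed windows.

    events: iterable of (timestamp_seconds, status) tuples.
    This event format is hypothetical, chosen only for the example.
    """
    counts = Counter()
    for timestamp, status in events:
        # Round the timestamp down to the start of its window.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, status)] += 1
    return dict(counts)


# Toy log stream: (seconds since start, HTTP status).
logs = [(0, "200"), (15, "500"), (59, "200"), (61, "200"), (130, "500")]
for (window_start, status), n in sorted(count_per_window(logs).items()):
    print(f"window starting at {window_start}s, status {status}: {n}")
```

A real pipeline would read events continuously from a stream and handle late or out-of-order data, which this sketch ignores.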
<Operations>
- Cloud Monitoring: visibility into the performance, availability, and health of cloud resources
- Cloud Logging: stores and searches log data, with log-based monitoring and alerting