nexus/#recycle/Technical/Cloud & DevOps/What I know about Cloud Service Delivery.md
2026-03-23 20:57:45 +08:00

## Cloud Service Delivery
Cloud Service Delivery encompasses **the entire lifecycle of making cloud services operational, available, secure, performant, and valuable to end users and customers.**
**In essence, Cloud Service Delivery is the bridge between the raw capabilities of cloud technology (IaaS, PaaS, SaaS) and the reliable, secure, performant, and cost-effective services that businesses and users actually consume.**
Cloud Service Delivery Team:
- Cloud Infrastructure Engineer
- Cloud Operations Engineer (DevOps/SRE)
- Cloud Security Specialist
- Cloud Support Engineer
- Cloud FinOps Engineer
1. **Service Provisioning & Deployment:**
- Setting up cloud infrastructure (servers, storage, networking).
- Automating deployment of applications and platforms.
- Configuring services according to customer requirements.
- Managing resource allocation and scaling.
- Best Practice:
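
The resource allocation and scaling item above can be illustrated with a proportional scaling rule, the same idea the Kubernetes Horizontal Pod Autoscaler documents for its replica calculation. This is a minimal sketch, not a production autoscaler: the function name `desired_replicas`, the metric values, and the min/max bounds are all assumptions made for illustration.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """Scale the replica count by the ratio of observed to target metric.

    All names and default bounds here are hypothetical.
    """
    if current_replicas < 1:
        raise ValueError("current_replicas must be at least 1")
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    # Proportional rule: more load than the target means more replicas.
    raw = math.ceil(current_replicas * current_metric / target_metric)

    # Clamp so a metric spike cannot scale the service without bound.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 4 replicas at 90% average CPU against a 60% target.
    print(desired_replicas(4, 90.0, 60.0))
```

The clamp is a deliberate design choice: scaling decisions feed directly into cost, so an upper bound keeps a noisy metric from turning into an unexpected bill.
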
2. **Infrastructure Management:**
- Monitoring health, performance, and capacity of compute, storage, network resources.
- Patching and updating underlying infrastructure (hypervisors, hosts).
- Managing physical data center aspects (power, cooling, hardware lifecycle) _if using private/hybrid cloud_.
- Ensuring high availability and disaster recovery setups.
- Best Practice:
  - Use AWS CloudWatch as a data source in Grafana for monitoring.
3. **Platform Management (for PaaS):**
- Managing middleware, databases, development tools, and runtime environments.
- Ensuring platform scalability, security, and performance.
- Applying patches and updates to platform components.
4. **Application Operations & Management (for SaaS/IaaS-hosted apps):**
- Monitoring application performance, uptime, and user experience.
- Deploying application updates and bug fixes.
- Managing application configuration and secrets.
- Ensuring application scalability and resilience.
5. **Security & Compliance Management:**
- Implementing and managing security controls (firewalls, IDS/IPS, encryption, IAM).
- Vulnerability scanning and patch management.
- Security incident monitoring and response.
- Ensuring compliance with regulations (GDPR, HIPAA, PCI-DSS, etc.).
- Auditing and logging management.
- Best Practice:
  - Cloud application WAF management
  - IP allowlist support at the tenant level
  - Security scanning
  - Security guidance
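
Tenant-level IP restriction, mentioned in the best practices above, comes down to checking a client address against that tenant's list of permitted networks. The sketch below uses Python's standard `ipaddress` module. The tenant names, CIDR ranges (taken from documentation-reserved address blocks), and the `is_allowed` helper are hypothetical; a real deployment would more likely enforce this in the WAF or load balancer.

```python
import ipaddress

# Hypothetical per-tenant allowlists, using documentation-reserved ranges.
TENANT_ALLOWLISTS = {
    "tenant-a": ["203.0.113.0/24", "198.51.100.7/32"],
    "tenant-b": ["2001:db8::/32"],
}


def is_allowed(tenant_id: str, client_ip: str, allowlists=None) -> bool:
    """Return True only if client_ip falls inside one of the tenant's networks."""
    if allowlists is None:
        allowlists = TENANT_ALLOWLISTS
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Malformed addresses are rejected rather than raising.
        return False

    # Unknown tenants get an empty list, so the default is deny.
    for cidr in allowlists.get(tenant_id, []):
        network = ipaddress.ip_network(cidr)
        if addr.version == network.version and addr in network:
            return True
    return False


if __name__ == "__main__":
    print(is_allowed("tenant-a", "203.0.113.42"))
    print(is_allowed("tenant-a", "192.0.2.1"))
```

Deny-by-default for unknown tenants and malformed input keeps a configuration gap from silently opening access.
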
6. **Performance & Availability Monitoring:**
- 24/7 monitoring of all service components (infrastructure, platform, application).
- Setting and tracking SLAs (Service Level Agreements) and SLOs (Service Level Objectives).
- Proactive detection and resolution of performance bottlenecks and potential failures.
- Managing incident response to outages or degradation.
- Best Practice:
  - Service availability checks (APM/BPM, New Relic, AWS CloudWatch Synthetics, health page)
  - SLA (Service Level Agreement): 99.9% vs 99.99% [uptime](https://uptime.is/)
  - SLO (Service Level Objective)
  - Proactive detection (Grafana Alerting with different severity levels)
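
The difference between a 99.9% and a 99.99% SLA is easier to judge as a downtime budget. The following sketch converts an availability target into allowed downtime per period. The function name and the choice of a 365-day year are assumptions; calculators such as uptime.is may use a slightly different year length.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: float = 365.0) -> float:
    """Return the downtime budget, in minutes, for a given availability target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    # The budget is the fraction of the period the service may be unavailable.
    return total_minutes * (100.0 - availability_percent) / 100.0


if __name__ == "__main__":
    for target in (99.9, 99.99):
        yearly = allowed_downtime_minutes(target)
        monthly = allowed_downtime_minutes(target, period_days=30)
        print(f"{target}%: {yearly:.2f} min/year, {monthly:.2f} min per 30 days")
```

Under these assumptions, 99.9% allows roughly 525.6 minutes (about 8.76 hours) of downtime per year, while 99.99% allows roughly 52.56 minutes. Each extra nine cuts the budget by a factor of ten, which is why the target should be chosen against what the team can realistically detect and recover from.
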
7. **Incident & Problem Management:**
- Responding to alerts and service disruptions.
- Troubleshooting issues across the stack.
- Restoring service quickly (incident management).
- Identifying root causes and implementing permanent fixes (problem management).
- Best Practice:
8. **Change & Configuration Management:**
- Controlling and documenting changes to the cloud environment.
- Managing configurations consistently and securely (Infrastructure as Code - IaC).
- Minimizing risk associated with changes through testing and rollback plans.
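
One practical consequence of managing configuration as code is that the declared state can be compared with what is actually running, so undocumented changes (drift) become visible. The sketch below shows that comparison on plain dictionaries. The `find_drift` helper, the `MISSING` marker, and the example settings are hypothetical; IaC tools implement this with their own state and plan mechanisms.

```python
MISSING = "<missing>"  # marker for a key present on only one side


def find_drift(desired: dict, actual: dict) -> dict:
    """Map each differing key to a (desired_value, actual_value) pair.

    Assumes flat dictionaries with string keys.
    """
    drift = {}
    for key in sorted(desired.keys() | actual.keys()):
        want = desired.get(key, MISSING)
        have = actual.get(key, MISSING)
        if want != have:
            drift[key] = (want, have)
    return drift


if __name__ == "__main__":
    declared = {"instance_type": "m5.large", "min_size": 2, "encrypted": True}
    running = {"instance_type": "m5.xlarge", "min_size": 2, "public_ip": True}
    for key, (want, have) in find_drift(declared, running).items():
        print(f"{key}: declared={want!r} running={have!r}")
```

An empty result means the environment matches its declaration. Anything else is either an unapproved change to roll back or a legitimate change that still needs to be recorded in code.
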
9. **Cost Management & Optimization:**
- Monitoring cloud resource consumption and spending.
- Identifying and eliminating waste (idle resources, over-provisioning).
- Right-sizing resources.
- Utilizing reserved instances or savings plans effectively.
- Providing cost visibility and reporting.
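
Finding waste and right-sizing candidates usually starts from utilization data joined with cost. The sketch below flags resources by average CPU utilization and ranks them by monthly spend. The `ResourceUsage` record, the 5% and 30% thresholds, and the sample fleet are illustrative assumptions, not recommended values. A real review would also look at memory, network, and peak load.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceUsage:
    name: str
    avg_cpu_percent: float  # average CPU utilization over the review window
    monthly_cost: float     # spend attributed to this resource


def classify(usage: ResourceUsage, idle_below: float = 5.0, oversized_below: float = 30.0) -> str:
    """Label a resource as 'idle', 'over-provisioned', or 'ok' (hypothetical thresholds)."""
    if usage.avg_cpu_percent < idle_below:
        return "idle"
    if usage.avg_cpu_percent < oversized_below:
        return "over-provisioned"
    return "ok"


def savings_candidates(fleet, idle_below: float = 5.0, oversized_below: float = 30.0):
    """Return (label, resource) pairs for flagged resources, most expensive first."""
    flagged = []
    for usage in fleet:
        label = classify(usage, idle_below, oversized_below)
        if label != "ok":
            flagged.append((label, usage))
    return sorted(flagged, key=lambda pair: pair[1].monthly_cost, reverse=True)


if __name__ == "__main__":
    fleet = [
        ResourceUsage("web-1", 62.0, 140.0),
        ResourceUsage("batch-old", 1.2, 310.0),
        ResourceUsage("db-replica", 18.0, 520.0),
    ]
    for label, usage in savings_candidates(fleet):
        print(f"{usage.name}: {label}, {usage.monthly_cost:.2f} per month")
```

Sorting by cost rather than by utilization puts the largest potential savings at the top of the report.
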
10. **Customer Onboarding & Support:**
- Guiding new customers/users through setup and access.
- Providing user documentation and training resources.
- Operating a service desk/helpdesk for user issues and requests (ticketing system).
- Handling billing inquiries and account management.
11. **Service Governance & Lifecycle Management:**
- Defining service catalogs and service levels (SLAs).
- Managing the lifecycle of services (introduction, operation, retirement).
- Continuous service improvement based on metrics and feedback.
- Vendor management (for public cloud providers or third-party tools).
- Best Practice:
12. **Backup, Recovery & Disaster Management:**
- Implementing and managing data backup strategies.
- Testing restore procedures.
- Maintaining and testing disaster recovery (DR) plans and infrastructure.
- Executing failover and failback procedures during disasters.
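
Testing restore procedures means proving that restored data matches what was backed up, not just that the restore job exited cleanly. One simple check, sketched below, compares SHA-256 digests of the source file and the restored copy. The helper names and file paths are hypothetical, and this covers only file-level integrity. Database restores typically need application-level validation as well.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches_source(source: Path, restored: Path) -> bool:
    """Return True when the restored file is byte-for-byte identical to the source."""
    return sha256_of(source) == sha256_of(restored)


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "source.bin"
        restored = Path(tmp) / "restored.bin"
        source.write_bytes(b"backup payload" * 1000)
        restored.write_bytes(source.read_bytes())
        print(restore_matches_source(source, restored))
```

Recording the digest at backup time, rather than recomputing it from the source later, also lets the check run after the original data is gone, which is the situation a real disaster recovery exercise has to handle.
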
## Cloud DevOps Maturity Model
## AIOps