Monitor the health of your services and work with developers to increase the velocity of changes using built-in support for service monitoring.
• Select metrics for SLIs, set SLOs, and track error budgets to mitigate risk for the service.
• Use dashboards to aggregate metrics and logs, including the golden signals, so you can reduce MTTR and quickly answer questions about service health.
• Take ownership of platform-related incident management and resolution, ensuring timely communication and effective problem-solving.
• Automate provisioning and maintenance tasks using scripts and automation tools.
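
To make the SLI, SLO, and error-budget item above concrete, here is a minimal sketch of the arithmetic for a request-based SLI. The function name `error_budget`, its parameters, and the returned keys are hypothetical and chosen only for illustration; they do not correspond to any particular monitoring product's API.

```python
def error_budget(slo_target: float, total_events: int, bad_events: int) -> dict:
    """Summarize error-budget usage for a request-based SLI.

    slo_target   -- target fraction of good events, e.g. 0.999 for 99.9%
    total_events -- events observed in the compliance window
    bad_events   -- events that failed the SLI criterion
    All names here are illustrative assumptions.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events < 0 or bad_events < 0 or bad_events > total_events:
        raise ValueError("need 0 <= bad_events <= total_events")

    # The SLI is the measured fraction of good events.
    sli = 1.0 if total_events == 0 else (total_events - bad_events) / total_events

    # The error budget is the number of bad events the SLO tolerates.
    allowed_bad = (1.0 - slo_target) * total_events

    # Fraction of the budget already spent (0.0 when nothing was observed).
    consumed = bad_events / allowed_bad if allowed_bad > 0 else 0.0

    return {
        "sli": sli,
        "allowed_bad_events": allowed_bad,
        "remaining_bad_events": allowed_bad - bad_events,
        "budget_consumed": consumed,
        "slo_met": sli >= slo_target,
    }


if __name__ == "__main__":
    report = error_budget(slo_target=0.999, total_events=1_000_000, bad_events=250)
    for key, value in report.items():
        print(f"{key}: {value}")
```

With a 99.9% target over 1,000,000 requests, the budget is roughly 1,000 failed requests, so 250 failures consume about a quarter of it. A negative `remaining_bad_events` value means the budget is exhausted, which is the usual signal to slow the pace of risky changes.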