
Job Description

Job Description: Datacenter Observability and Site Reliability Engineer
Roles and Responsibilities:
Observability and Monitoring:
• Design, implement, and maintain observability solutions for datacenter infrastructure.
• Develop, deploy, and maintain the operational and reliability components of a large-scale Observability and Telemetry collection platform, emphasizing performance at scale, real-time monitoring, logging, and alerting.
• Participate in and enhance the entire lifecycle of services, from inception and design to deployment, operation, and refinement.
• Develop and optimize monitoring systems to ensure high availability and performance.
• Create and manage dashboards, alerts, and reports to provide visibility into system health and performance.
Site Reliability Engineering (SRE):
• Implement SRE best practices to improve the reliability, scalability, and performance of datacenter services.
• Develop and maintain automation scripts for infrastructure provisioning, monitoring, and management.
• Conduct root cause analysis and post-mortem reviews to prevent recurrence of incidents.
Performance Optimization:
• Analyze and optimize the performance of datacenter systems and applications.
• Implement best practices for resource utilization and efficiency.
Collaboration:
• Work closely with other engineering teams to understand and meet their observability and reliability requirements.
• Collaborate with hardware and software vendors to evaluate and integrate new technologies.
Security and Compliance:
• Ensure that observability and reliability solutions comply with security policies and industry standards.
• Implement and maintain security measures to protect data and infrastructure.
Troubleshooting and Support:
• Provide support for observability and reliability-related issues, including debugging and resolving hardware and software problems.
• Develop and maintain documentation for troubleshooting procedures and best practices.
Continuous Improvement:
• Stay current with advances in observability and SRE technologies and integrate them into the infrastructure.
• Continuously improve the reliability, scalability, and performance of datacenter services.

Job Requirements

Technical Skills:
• Proficiency in observability tools and technologies (e.g., Prometheus, Grafana, ELK Stack).
• Experience with SRE practices and tools (e.g., Kubernetes, Docker, Terraform).
• Strong programming and scripting skills (e.g., Python, Go, Bash).
• Familiarity with cloud platforms (AWS, Azure, GCP) and their observability and reliability services.
Soft Skills:
• Strong problem-solving skills and attention to detail.
• Excellent communication and collaboration skills.
• Ability to work in a fast-paced, dynamic environment.

Working Hours

Weekdays: Monday to Friday

Daily hours: 08:30 to 18:00

Candidate Benefits

- No probationary period: a full-time position at 100% salary.
- Opportunity to work in teams alongside leading domestic and international IT experts.
- Opportunity to work on ambitious projects in many countries, access the latest technologies, and learn from talented colleagues.
- A young, dynamic, modern, and multicultural work environment, with regular social activities and holiday events.
- Opportunity to advance according to ability, with corresponding promotions and salary increases.
- Access to soft skills training courses (logical thinking, creative thinking, communication, project management, negotiation ...) and Japanese language classes.
- And many other attractive benefits...

Work Location

Remote
