We are seeking an experienced Senior Site Reliability Engineer (SRE) to join our team and help build scalable, reliable, and secure infrastructure for our applications. The ideal candidate will have a deep understanding of cloud infrastructure, automation, observability, and incident management, ensuring high availability and optimal performance.

Responsibilities:
- Design, implement, and maintain scalable, resilient infrastructure on cloud platforms (AWS, Azure, or GCP).
- Develop and manage CI/CD pipelines to streamline deployments and improve system reliability.
- Automate infrastructure provisioning, monitoring, and incident response using Terraform, Ansible, or similar tools.
- Monitor system performance and troubleshoot issues to improve uptime and response times.
- Implement observability solutions, including logging, monitoring, and alerting, using tools like Prometheus, Grafana, Datadog, or ELK Stack.
- Establish best practices for incident response and ensure post-mortem analysis is conducted for critical incidents.
- Collaborate with development and operations teams to enhance system reliability and ensure security compliance.
- Optimize cloud costs while maintaining system performance and availability.

Requirements:
- 5+ years of experience in Site Reliability Engineering, DevOps, or a related field.
- Strong expertise in cloud platforms (AWS, Azure, or GCP) and container orchestration (Kubernetes, Docker).
- Proficiency in Infrastructure as Code (IaC) tools such as Terraform, CloudFormation, or Ansible.
- Hands-on experience with CI/CD tools (Jenkins, GitHub Actions, GitLab CI/CD, or ArgoCD).
- Deep understanding of Linux systems, networking, and security best practices.
- Experience with observability tools (Prometheus, Grafana, Datadog, New Relic, or ELK).
- Proficiency in scripting and automation using Python, Bash, or Go.
- Strong knowledge of database administration (SQL and NoSQL databases).
- Familiarity with incident response, root cause analysis, and post-mortem processes.
- Excellent problem-solving skills and ability to work in a fast-paced environment.

Preferred Qualifications:
- Experience with distributed systems, microservices architecture, and event-driven systems.
- Knowledge of security best practices, including IAM, encryption, and compliance standards.
- Understanding of FinOps for cloud cost optimization.
- Prior experience in a high-traffic production environment.
Keyword: Python Development
Contractor Tier: Hourly: $30.00 - $100.00
Price: $30.00
Amazon Web Services, Docker, DevOps, Amazon EC2, Ansible, Kubernetes, CI/CD, System Administration