Senior Site Reliability Engineer (SRE)

$65.00
Expert

We are seeking an experienced Senior Site Reliability Engineer (SRE) to join our team and help build scalable, reliable, and secure infrastructure for our applications. The ideal candidate will have a deep understanding of cloud infrastructure, automation, observability, and incident management, and will apply it to ensure high availability and optimal performance.

Responsibilities:
- Design, implement, and maintain scalable, resilient infrastructure on cloud platforms (AWS, Azure, or GCP).
- Develop and manage CI/CD pipelines to streamline deployments and improve system reliability.
- Automate infrastructure provisioning, monitoring, and incident response using Terraform, Ansible, or similar tools.
- Monitor system performance and troubleshoot issues to improve uptime and response times.
- Implement observability solutions, including logging, monitoring, and alerting, using tools like Prometheus, Grafana, Datadog, or the ELK Stack.
- Establish best practices for incident response and ensure post-mortem analysis is conducted for critical incidents.
- Collaborate with development and operations teams to enhance system reliability and ensure security compliance.
- Optimize cloud costs while maintaining system performance and availability.

Requirements:
- 5+ years of experience in Site Reliability Engineering, DevOps, or a related field.
- Strong expertise in cloud platforms (AWS, Azure, or GCP), containers (Docker), and container orchestration (Kubernetes).
- Proficiency in Infrastructure as Code (IaC) tools such as Terraform, CloudFormation, or Ansible.
- Hands-on experience with CI/CD tools (Jenkins, GitHub Actions, GitLab CI/CD, or ArgoCD).
- Deep understanding of Linux systems, networking, and security best practices.
- Experience with observability tools (Prometheus, Grafana, Datadog, New Relic, or ELK).
- Proficiency in scripting and automation using Python, Bash, or Go.
- Strong knowledge of database administration (SQL and NoSQL databases).
- Familiarity with incident response, root cause analysis, and post-mortem processes.
- Excellent problem-solving skills and ability to work in a fast-paced environment.

Preferred Qualifications:
- Experience with distributed systems, microservices architecture, and event-driven systems.
- Knowledge of security best practices, including IAM, encryption, and compliance standards.
- Understanding of FinOps for cloud cost optimization.
- Prior experience in a high-traffic production environment.

Keyword: Linux

Contractor Tier: Hourly: $30.00 - $100.00

Price: $65.00


Fix Web-App Connection Issue: Clojure-Datomic-PostgreSQL

**Summary of Issue** We are facing a deployment and runtime failure in our **Clojure-based web application** primarily due to **misconfiguration and integration issues between Datomic On-Prem, PostgreSQL, and the Clojure app itself.** --- **Core Problem** The main is...


Multi-Tenant CRM Migration to AWS

I'm seeking a seasoned AWS professional to migrate my multi-tenant CRM from A2 Hosting to AWS Vapor. Ensuring optimal functionality, scalability, and security throughout the process is crucial. Key Responsibilities: - Establish Production and Staging environments...


Installation and setup of license plate reader software, for cam...

Hello, I'm looking to set up OpenALPR, or similar software, on a virtual machine for real-time license plate monitoring. The idea is to use it in a populated area and have it display plates in real time. If possible, it should be able to trigger red or green light relays, ...
