Senior Site Reliability Engineer

2 days ago · Senior · Anywhere · Remote · Full Time · DevOps Jobs by Foxbit

Skills

SecOps, Loki, IaC, Bash, CI, SRE, GitHub Actions, Terraform, Security, Monitoring, Cloud, DevOps, AWS, Grafana, Prometheus, FinOps, Python, Compliance, Go, CD, Docker, Kubernetes, RabbitMQ, Kafka, Automation

About the Role

As a Senior Site Reliability Engineer, you will ensure the reliability, availability and performance of our systems. You will monitor and manage incidents, automate remediation, and implement scalable solutions. You will work with the Development and Infrastructure teams to optimize the lifecycle of applications from design to operation, use Terraform to manage infrastructure as code on AWS, and build dashboards with Grafana and Prometheus to visualize metrics. You will help improve DevOps and SRE practices by promoting automation and fast feedback, and you will keep security and compliance considerations in mind.

Requirements

Minimum of 5 years of experience as a software engineer or SRE with a focus on high availability in financial operations
Proficiency in Kubernetes for container orchestration and cluster management
Strong knowledge of AWS and its services such as EC2, RDS and S3
Experience with Infrastructure as Code (IaC), especially Terraform
Experience with monitoring and observability using Prometheus, Grafana and other metrics tools
Strong experience automating infrastructure, CI/CD and DevOps practices
Experience troubleshooting and resolving complex production issues
Knowledge of security and compliance practices in cloud environments
Desirable: Familiarity with Grafana Loki
Desirable: Experience with Docker and containerized environments
Desirable: Knowledge of scripting languages such as Python, Go or Bash
Desirable: Experience with CI/CD tools like GitHub Actions
Desirable: Knowledge of RabbitMQ or Kafka
Desirable: FinOps practices for cloud cost optimization
Desirable: Knowledge of financial markets and/or cryptocurrencies (a differentiator)
Desirable: Experience with information security applied to operations (SecOps)

Responsibilities

Ensure reliability, availability and performance of systems by automating processes and implementing scalable solutions
Monitor and manage infrastructure incidents, ensuring rapid resolution of critical issues and building automation to prevent recurrence
Collaborate with Development and Infrastructure teams to optimize the lifecycle of applications from design to operation
Implement and maintain infrastructure as code using Terraform
Monitor and optimize AWS resource usage and implement cost efficient practices
Create dashboards and reports with Grafana and Prometheus to visualize metrics
Contribute to continuous improvement of DevOps and SRE practices by promoting automation and fast feedback

Benefits

Health plan (SulAmérica)
Dental plan (SulAmérica)
Life insurance (Prudential)
Swile card – 1168.64 BRL
Transport allowance or home office stipend
Learning incentives such as workshops, courses and language programs after 3 months
Payroll loan after 3 months
One day off in the week of your birthday
Discounts on trading fees
Referral program
PLR – Profit sharing
