Senior Site Reliability Engineer (SRE) – Marketing (ID: 2223)

Truelogic is a leading provider of nearshore staff augmentation services, located in New York. Our team of 500 tech professionals drives digital disruption from Latin America on top projects at U.S. companies. Truelogic helps companies of all sizes achieve their digital transformation goals.

Would you like to make innovation happen? Have you ever dreamed of building products that impact millions of users? Nice! Then we have a seat for you on our team!

 

What are you going to do?

You will work in a forward-thinking, growth-oriented environment at an evolved dating app that is also a home for relationships.

To fill a unique market position, we seek a Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of our cloud infrastructure. You will monitor system health, respond to incidents, optimize cloud resources, and implement best practices for high availability and disaster recovery.

  • Monitor and maintain the reliability and performance of AWS infrastructure for mobile and backend services.
  • Implement and manage monitoring and alerting tools to identify and resolve incidents in real time.
  • Collaborate with the DevOps team to optimize resource usage and ensure cost-effective scaling.
  • Ensure high availability and disaster recovery strategies are in place and regularly tested.
  • Investigate system failures and performance bottlenecks, providing long-term solutions.
  • Conduct post-incident reviews and implement action items to prevent future issues.

What will help you succeed

  • 5+ years of experience as a Site Reliability Engineer (SRE) or in a similar role in a cloud environment.
  • Expertise in AWS cloud services, including EC2, RDS, Lambda, S3, and CloudWatch.
  • Experience with infrastructure-as-code (Terraform, CloudFormation) and automation. This experience is a must-have.