Motivated Site Reliability Engineer with over 7 years of IT experience, specializing in high-availability cloud infrastructure and automation. Experienced in leading support engineers, improving escalation processes, and advancing observability.
Overview
7 years of professional experience
Work History
Site Reliability Engineer
Itransition
08.2021 - Current
Led the integration and management of observability tools (New Relic, Prometheus, Grafana).
Rolled out alert systems and escalation processes, crucial to the observability framework.
Developed incident response strategies and runbooks, reducing Mean Time to Recovery (MTTR) by 50% and enhancing system resilience, in line with established SLOs.
Automated the deployment of monitoring, speeding up operational readiness by 25% and improving response capabilities.
Translated complex business requirements into clear observability and service-level objectives, ensuring systems met both technical and business needs.
Managed AWS cloud infrastructure and Kubernetes, ensuring high availability, security, and scalability for mission-critical operations.
Designed and executed infrastructure automation scripts and tools, significantly reducing manual workloads and enhancing system reliability. Expert in Terraform, Python scripting, and CI/CD tools such as TeamCity, Jenkins, and Azure DevOps.
DevOps Engineer (Performance Test Engineer)
A1QA
09.2019 - 07.2021
Designed and implemented cloud infrastructure using AWS for diverse project domains.
Conducted blameless post-incident reviews to identify root causes and prevent future outages.
Automated incident response processes, leading to a reduction in RTO and increased team productivity.
Conducted load, stress, and endurance tests using JMeter, Gatling, and k6 to simulate realistic user activity.
Improved system scalability, performance, and load balancing, resulting in better application responsiveness during peak usage.
Minimized operational disruptions and enhanced production efficiency by identifying and resolving bottlenecks.
IT Systems Administrator
Department Of Safety And Security
08.2017 - 08.2019
Administered and secured IT infrastructure, focusing on network health and data security.
Implemented and maintained Windows server configurations to ensure high availability and security of business-critical applications.
Developed software to automate the provisioning of infrastructure and software for new users.
Education
Bachelor of Science - Electronics
BSUIR
Minsk, Belarus
07.2017
Skills
AWS, Azure
Terraform, CloudFormation, Bash, Python
CI/CD: Jenkins, TeamCity, Azure DevOps
Docker, Kubernetes
Monitoring with Prometheus, Grafana, New Relic, CloudWatch
Front-end/Back-end, API Services, Web Infrastructure
Linux and Windows administration
Incident Management: OpsGenie, PagerDuty