Develop, manage, and maintain Terraform modules for provisioning and managing Azure IaaS infrastructure
Follow Infrastructure-as-Code (IaC) best practices, emphasizing modular, reusable, and scalable deployments

CI/CD & Release Pipelines
Implement and enhance CI/CD pipelines using GitHub Actions and Jenkins
Automate build, test, and deployment workflows

Container Orchestration
Manage the Docker container lifecycle and deploy applications on Kubernetes clusters
Use Helm for templating, packaging, and managing Kubernetes resources

Configuration & Release Management
Configure and manage systems using Ansible (or similar tools) to maintain environment consistency

Messaging & Database Services
Implement and support Azure Service Bus for messaging in distributed systems
Manage and optimize MongoDB deployments (both self-hosted and managed)

Cloud Infrastructure Operations
Provision, configure, and maintain Azure IaaS resources, including VMs, networking, storage, and identity services

Scripting & Automation
Create automation scripts in Shell (Bash), PowerShell, JavaScript, and Python for infrastructure orchestration

Reliability & Observability
Define and monitor SLIs, SLOs, and error budgets
Implement monitoring, logging, and incident response processes to ensure system reliability and fast recovery

Collaboration & Culture
Collaborate with developers, architects, and support teams to improve system reliability and CI/CD efficiency
Participate in on-call rotations, incident retrospectives, and continuous improvement initiatives

Skills & Qualifications
Required:
Deep expertise in Terraform and IaC workflows
Strong experience with GitHub Actions and Jenkins for CI/CD
Solid experience with Docker, Kubernetes, and Helm
Practical knowledge of Ansible or similar configuration management tools
Working knowledge of Azure IaaS (VMs, networking, storage, identity)
Hands-on experience with Azure Service Bus and MongoDB
Proficient in scripting using Bash, PowerShell, JavaScript, and Python
Strong Linux administration and networking fundamentals
Skilled in configuration, incident resolution, and root cause analysis
Preferred:
Experience with observability platforms such as Prometheus, Grafana, ELK, or Datadog
Bachelor's degree in Computer Science, Engineering, or an equivalent technical field

Ideal Candidate Profile
A DevOps/SRE expert who builds highly automated, stable, and cost-efficient infrastructure
A collaborative engineer who actively improves observability, reduces manual toil, and handles incidents effectively
A proactive technologist committed to continuous improvement in reliability, automation, and performance