Principal Cloud SRE Engineer
Oracle
Location: Casablanca, Morocco (onsite work mode)
Job Summary:
As a Principal Cloud Engineer (SRE), you will play a key role in ensuring the reliability, performance, and scalability of modern cloud-based data platforms. This position involves close collaboration with development, operations, and security teams to automate processes, monitor system health, and maintain optimal uptime for critical production workloads. You will leverage your technical expertise to design, automate, and maintain large-scale data pipelines and lakehouse infrastructure, supporting mission-critical data engineering and analytics initiatives.
Key Responsibilities:
Design, implement, and maintain scalable, secure cloud infrastructure for large data platforms (data lakes, data warehouses, and lakehouse solutions) on OCI, AWS, Azure, or GCP.
Collaborate with Data Engineering teams to build robust, automated ETL/ELT pipelines using tools such as Apache Spark, Databricks, Kafka, or Oracle Cloud Data Integration.
Implement site reliability engineering best practices tailored for data systems: SLO/SLI definition, error budgeting, automated monitoring, data integrity validation, and incident response for data workloads.
Design and optimize data storage solutions leveraging both structured and unstructured storage (object storage, data lake/lakehouse platforms like Delta Lake, Iceberg, etc.).
Automate infrastructure provisioning and CI/CD deployments for data pipelines and analytic workloads with tools like Terraform, Ansible, or CloudFormation.
Instrument and monitor data platform components for performance, availability, resource consumption, and data quality using observability tools (e.g., Grafana, Splunk).
Troubleshoot and resolve complex data pipeline or infrastructure issues, conducting root cause analyses and post-incident reviews.
Advocate for and implement security, governance, and compliance best practices, including data privacy, encryption, and access controls.
Mentor junior team members and promote knowledge sharing around data platform reliability.
Qualifications:
Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related field, or equivalent experience.
6+ years’ experience in cloud engineering, SRE, or DevOps roles, with at least 4 years supporting data engineering initiatives.
Practical experience designing and operating large-scale cloud-based data platforms (data lakes, warehouses, or lakehouses).
Strong hands-on skills with infrastructure-as-code (e.g., Terraform), automation (Python/Scala), and containerization (Kubernetes, Docker).
Familiarity with data processing frameworks (Apache Spark, Databricks, Hadoop) and orchestration tools (Airflow, Oozie, or similar).
Working knowledge of distributed storage, data formats (Parquet, Avro), and modern analytics platforms.
Solid understanding of networking, cloud security, and regulatory compliance for data platforms.
Strong analytical, troubleshooting, and communication skills.
Preferred certifications: Cloud Architect/Engineer (OCI, AWS, Azure, GCP), Databricks, or relevant data engineering credentials.