What can you expect? Work with the best and brightest software engineers! You will bring lots of energy, innovation, and excitement! Be ready to learn and explore cutting-edge technologies in distributed computing and big data environments. You have an automate-anything approach and document your work for the benefit of others. You are an independent, self-directed problem-solver who can deftly handle multiple competing priorities and deliver solutions in a timely manner.
Provide incident resolution for all technical production issues. Create and maintain accurate, up-to-date documentation reflecting configuration. Write justifications, train users in complex topics, write status reports, document procedures, and interact with other Apple staff and management. Provide guidance to improve the stability, security, efficiency, and scalability of systems. Determine future needs for capacity and investigate new products and/or features. Use strong troubleshooting skills daily, taking the initiative to isolate issues and resolve root causes through investigative analysis, even in environments where you have little knowledge, experience, or documentation. Help mentor and coach junior members of the team in process and engineering design. Administer and ensure the proper execution of the backup systems. Provide 24x7 on-call support to handle urgent critical issues.
We are dedicated to the goal of building a culturally diverse and pluralistic team that reflects the multicultural variety of our customers.
BS in Computer Science with 10+ years of related experience.
Development and operational excellence through proper automation and engineering processes using programming languages such as Go, Python, Java, or other JVM languages
Solid experience and knowledge in any one of the big data technologies such as Hadoop, Spark, Kafka, Apache HBase, or Presto.
MS in Computer Science with 5 years of industry experience
10+ years of experience in designing and managing large-scale, production-grade deployments both on-prem and in the cloud (AWS or GCP). Experience with containers and container orchestration platforms such as Docker, Kubernetes, or equivalent.
Strong proficiency with Helm and Kustomize for managing Kubernetes applications and configurations through GitOps practices
Experience with configuration management or Infrastructure as Code (IaC) tools such as Ansible, Terraform, and Crossplane is desired.
Proficient in working with Linux or other POSIX operating systems, shell scripting, and networking technologies.
Highly proactive, with a keen focus on improving the uptime and availability of our mission-critical services
Excellent verbal and written communication skills, able to collaborate cross-functionally with program managers and engineering partners
Comfortable working in a fast-paced environment while continuously evaluating emerging technologies
Familiarity with logging and observability technologies such as Splunk, Prometheus, or similar
Proven software engineering experience in design, testing, source code management, and CI/CD practices
Strong knowledge and experience in managing and monitoring cloud-based services using various probes in Kubernetes, Splunk, Prometheus, Grafana, and Mosaic
Experience with at least one of the publicly available embedding models or large language models, with expertise in pre-training, fine-tuning, and data insights for accuracy and recall
Experience in handling ETL pipelines, designing measurable SLAs for ETLs, and handling upstream and downstream data processing in near real time
Experience in open source software development and CI/CD is desirable
Experience in handling architectural and design considerations such as performance, scalability, reusability and flexibility issues in distributed databases.