Senior Machine Learning Engineer
- Ra'anana
- Permanent position
- Full-time position
- Design and build observability and optimization tools for large-scale GenAI workloads running on Kubernetes
- Develop systems to collect and analyze model performance metrics, logs, and resource usage in real time
- Innovate in the MLOps and AI observability domain by contributing to upstream communities
- Collaborate with product, engineering, and research teams to improve model trust and performance
- Write unit and integration tests and work with quality engineers to ensure product quality
- Use CI/CD best practices to deliver solutions into Red Hat OpenShift AI (RHOAI) as part of our productization efforts
- Contribute to a culture of continuous improvement by sharing technical knowledge and insights
- Communicate effectively with stakeholders and team members to ensure visibility of ML performance
- Represent RHOAI in external engagements including open source communities and customer meetings
- Mentor and guide junior engineers and contribute to team growth
- Experience in machine learning engineering, with a focus on production-grade systems
- Proficiency in Python with a focus on AI/ML infrastructure or tooling
- Experience working with Kubernetes, OpenShift, or other cloud-native platforms
- Familiarity with ML observability tools (e.g., Prometheus, OpenTelemetry, Grafana)
- Hands-on experience with source control tools such as Git
- Passion for open-source technology and collaborative development
- Strong troubleshooting skills and system-level thinking
- Ability to work autonomously and thrive in a fast-paced environment
- Excellent written and verbal communication skills
- Master's degree or higher in computer science, machine learning, or related discipline
- Contributions to open-source projects, especially in the MLOps or ML observability domain
- Experience with public cloud services (AWS, GCP, Azure)
- Background in developing or deploying MLOps platforms or AI monitoring tools
Mploy