Principal Machine Learning Engineer - GenAI
- Holon
- Permanent position
- Full-time
Responsibilities:
- Architect and lead scalable benchmarking pipelines for LLM performance measurement (latency, throughput, accuracy, cost) across multiple serving backends and hardware types (a minimal probe is sketched after this list).
- Build optimization & profiling tools for inference performance, including GPU utilization, memory footprint, CUDA kernel efficiency, and parallelism strategies.
- Develop Validation-as-a-Service platforms with APIs and self-service tools for standardized, on-demand model evaluation.
- Integrate and optimize model serving frameworks (vLLM, TGI, LMDeploy, Triton) and API-based serving (OpenAI, Mistral, Anthropic) in production environments.
- Establish dataset & scenario management workflows for reproducible, comprehensive evaluation coverage.
- Implement observability & diagnostics systems (Prometheus, Grafana) for real-time benchmark and inference performance tracking.
- Deploy and manage workloads in Kubernetes (Helm, Argo CD, Argo Workflows) across AWS/GCP GPU clusters.
- Lead performance engineering efforts to identify bottlenecks, apply optimizations, and document best practices.
- Stay ahead of the GenAI ecosystem by tracking emerging frameworks, benchmarks, and optimization techniques, and integrating them into the platform.
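To give a flavor of the benchmarking bullet above, here is a minimal sketch of a latency/throughput probe against an OpenAI-compatible completions endpoint (the interface vLLM, and TGI via its shim, can expose). The base URL, model id, and prompts are illustrative placeholders, not details from this posting:

```python
"""Minimal sketch of a single-backend LLM latency/throughput probe.

Assumes an OpenAI-compatible /v1/completions endpoint (e.g. vLLM's
built-in server). URL, model id, and prompts are placeholders.
"""
import statistics
import time

import requests  # third-party: pip install requests

BASE_URL = "http://localhost:8000/v1/completions"  # assumed vLLM-style endpoint
MODEL = "my-model"                                 # placeholder model id


def time_request(prompt: str, max_tokens: int = 128) -> tuple[float, int]:
    """Send one completion request; return (wall-clock seconds, completion tokens)."""
    start = time.perf_counter()
    resp = requests.post(
        BASE_URL,
        json={"model": MODEL, "prompt": prompt, "max_tokens": max_tokens},
        timeout=120,
    )
    resp.raise_for_status()
    elapsed = time.perf_counter() - start
    # OpenAI-style responses report token usage alongside the completion.
    tokens = resp.json()["usage"]["completion_tokens"]
    return elapsed, tokens


def run_benchmark(prompts: list[str]) -> None:
    """Run all prompts sequentially and report median latency and token throughput."""
    latencies, token_counts = [], []
    for p in prompts:
        elapsed, tokens = time_request(p)
        latencies.append(elapsed)
        token_counts.append(tokens)
    print(f"p50 latency: {statistics.median(latencies):.3f}s")
    print(f"throughput:  {sum(token_counts) / sum(latencies):.1f} tokens/s")


if __name__ == "__main__":
    run_benchmark(["Explain KV caching in one sentence."] * 10)
```

A production pipeline would add warm-up requests, concurrency sweeps, and per-token streaming latency; this shows only the core measurement loop.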
Requirements:
- Advanced Python for ML/GenAI pipelines, backend development, and data processing.
- Kubernetes (Deployments, Services, Ingress) with Helm for large-scale distributed workloads.
- Deep expertise in LLM serving frameworks (vLLM, TGI, LMDeploy, Triton) and API-based serving (OpenAI, Mistral, Anthropic).
- GPU optimization mastery: CUDA, mixed precision, tensor/sequence parallelism, memory optimization, kernel-level profiling.
- Design and operation of benchmarking/evaluation pipelines with metrics for accuracy, latency, throughput, cost, and robustness.
- Experience with Hugging Face Hub for model/dataset management and integration.
- Familiarity with GenAI tools: OpenAI SDK, LangChain, LlamaIndex, Cursor, Copilot.
- Argo CD and Argo Workflows for reproducible ML orchestration.
- CI/CD (GitHub Actions, Jenkins) for ML workflows.
- Cloud expertise (AWS/GCP) for provisioning, running, and optimizing GPU workloads (A100, H100, etc.).
- Monitoring and observability (Prometheus, Grafana) and database experience (PostgreSQL, SQLAlchemy).
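As a flavor of the observability requirement above, here is a minimal sketch of exporting inference latency to Prometheus with the official prometheus_client library. The metric name, label, and scrape port are illustrative choices, not details from this posting:

```python
"""Minimal sketch of exporting LLM benchmark metrics to Prometheus.

Uses the official prometheus_client library; metric and label names
and the scrape port are illustrative placeholders.
"""
import random
import time

from prometheus_client import Histogram, start_http_server

# Histogram of per-request inference latency, labeled by serving backend.
INFERENCE_LATENCY = Histogram(
    "llm_inference_latency_seconds",
    "End-to-end LLM request latency",
    ["backend"],
)


def observe_request(backend: str) -> None:
    """Simulate one inference call and record its latency in the histogram."""
    with INFERENCE_LATENCY.labels(backend=backend).time():
        time.sleep(random.uniform(0.05, 0.2))  # stand-in for a real model call


if __name__ == "__main__":
    start_http_server(9100)  # Prometheus scrapes http://localhost:9100/metrics
    while True:
        observe_request("vllm")
```

Grafana can then chart latency quantiles with PromQL's histogram_quantile over the scraped llm_inference_latency_seconds_bucket series.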
Nice to have:
- Distributed training across multi-node, multi-GPU environments.
- Advanced model evaluation: bias/fairness testing, robustness analysis, domain-specific benchmarks.
- Experience with OpenShift/RHOAI for enterprise AI workloads.
- Benchmarking frameworks: GuideLLM, HELM (Holistic Evaluation of Language Models), EleutherAI's LM Evaluation Harness.
- Security scanning for ML artifacts and containers (Trivy, Grype).
- Design of tradeoff-analysis tools for model selection and deployment.
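For the tradeoff-analysis bullet directly above, a toy sketch of one common approach: keeping only the Pareto-optimal candidates across accuracy, latency, and cost. The candidate models and their numbers are fabricated for illustration:

```python
"""Toy sketch of cost/latency/accuracy tradeoff analysis for model
selection: compute the Pareto front over three axes. All candidate
names and numbers below are fabricated placeholders.
"""
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float            # higher is better
    p50_latency_s: float       # lower is better
    cost_per_1k_tokens: float  # lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is at least as good as `b` on every axis and strictly better on one."""
    at_least_as_good = (
        a.accuracy >= b.accuracy
        and a.p50_latency_s <= b.p50_latency_s
        and a.cost_per_1k_tokens <= b.cost_per_1k_tokens
    )
    strictly_better = (
        a.accuracy > b.accuracy
        or a.p50_latency_s < b.p50_latency_s
        or a.cost_per_1k_tokens < b.cost_per_1k_tokens
    )
    return at_least_as_good and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates if other is not c)
    ]


if __name__ == "__main__":
    models = [
        Candidate("small-7b", accuracy=0.71, p50_latency_s=0.4, cost_per_1k_tokens=0.02),
        Candidate("mid-13b", accuracy=0.78, p50_latency_s=0.9, cost_per_1k_tokens=0.05),
        Candidate("slow-13b", accuracy=0.77, p50_latency_s=1.4, cost_per_1k_tokens=0.06),
    ]
    for c in pareto_front(models):
        print(c.name)  # prints small-7b and mid-13b; slow-13b is dominated
```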