AI/LLM Infrastructure Engineer
- Ramat HaSharon
- Permanent position
- Full-time
- Deploy, manage, and optimize on-prem LLM infrastructure (e.g. LLaMA, Mistral).
- Build and maintain vector databases and RAG pipelines.
- Design scalable AI/data pipelines and inference services.
- Fine-tune models and refine prompt engineering strategies.
- Operate LLM services on Kubernetes with GPU scheduling.
- Monitor performance, implement CI/CD, and ensure system reliability.
- Hands-on experience with LLM frameworks (vLLM, HF Transformers, DeepSpeed).
- Strong Kubernetes skills, including GPU orchestration.
- Experience with vector DBs (e.g. Weaviate, Qdrant, FAISS).
- Solid understanding of RAG systems and prompt engineering.
- Fluency in English and working proficiency in Hebrew.
Mploy