Microcode SW Engineer for LLM / VLM Applications
- Tel Aviv
- ₪ 16,000 per month
- Permanent position
- Full-time
- Deep dive into our cutting-edge associative HW processing unit
- Design, build, and optimize low-level microcode (including instruction scheduling, memory access patterns, and control flow) for our custom Associative Processing architecture. You'll be working directly with a novel instruction set and hardware behaviors to craft routines that unlock parallelism and maximize throughput. This involves writing cycle-aware logic for compute units, managing hardware state transitions, and tuning for ultra-low latency across deeply pipelined data paths
- Prototype and iterate on diverse workloads, including transformer-based LLM inference, OpenCV pipelines, FFTs, and edge ML use cases, pushing the boundaries of distributed compute and memory co-location
- Team up across disciplines to turn wild ideas into reliable, high-performance code
- Squash bugs and bottlenecks using analyzers, profilers, and trace tools, always backed by data
- Level up our CI, testing, and docs, keeping dev velocity high and friction low
- Adapt fast, dive into unfamiliar tech, and thrive in the pivot-friendly chaos of startup life
- B.Sc. or M.Sc. in Computer Science, Electrical Engineering, or Software Engineering
- Experience Path 1:
- 5+ years of professional C/C++ development focused on low-level programming or microcode for hardware processing units (e.g., CPU, GPU)
- Experience Path 2:
- 5+ years in RTL design/verification plus 2+ years of hands-on C/C++ development
- Proven track record developing and optimizing software algorithms with deep consideration for hardware architecture, memory bandwidth, and system constraints
- Strong understanding of processor architecture fundamentals: caches, pipeline stages, execution units, and memory hierarchies
- Ability to interpret detailed hardware specifications and translate them into robust, efficient software solutions
- Practical experience with microcode development and optimization
- Proficiency in assembly language programming
- Strong understanding of deep learning or computer vision algorithms, architectures, and frameworks
- Demonstrated ability to port and refine complex algorithms in performance-sensitive, low-level environments
- Experience with Python scripting for tool creation, data analysis, and automated testing workflows
- Solid foundation in compiler theory, including design principles and code generation techniques
- Practical experience writing performance-critical code, including firmware, compute kernels, and device drivers