
Senior Site Reliability Engineer - Remote
- Tel Aviv
- Permanent position
- Full-time
- Developing and maintaining the system and its services, automating the deployment process, and ensuring the system scales.
- Investigating and resolving outages, proactively identifying and implementing preventive measures, and collaborating with key stakeholders.
- Taking proactive measures such as capacity planning, performance tuning, and implementing infrastructure as code.
- Working with development teams to ensure that new features and services are designed and deployed in a way that meets reliability and performance goals.
- Building software tools and systems to automate analytical tasks and workflows to increase efficiency and reliability.
- Helping to identify areas for new technology investments.
- Have over 5 years of academic or industry experience and a bachelor's degree in computer science.
- Possess experience with Internet protocols (DNS, HTTP, TLS, TCP).
- Demonstrate coding experience in one or more of the following languages: Python, Perl, Go, Java, SQL.
- Have experience with configuration management and infrastructure-as-code tools such as Ansible, Salt, and Terraform.
- Demonstrate experience with containers and container orchestration using Docker and Kubernetes (K8s).
- Have experience troubleshooting Unix and Linux issues.
- Demonstrate experience with observability tools such as Prometheus and Grafana, and have some familiarity with web applications.