Engineers who find a new job through Functional Works average a 15% increase in salary 🚀

Site Reliability Engineer, ML Ops

Stockholm, Sweden

29 December, 2020

Salary

551K - 725K SEK

Contract type

Full time
Sponsorship offered

Technologies & frameworks

  • Go
  • Cloud
  • Distributed Systems
  • Kubernetes
  • Linux
  • Machine Learning
  • SRE

Benefits & perks

  • Flexible working
  • Pension
  • Sponsorship available
Changing the way hundreds of millions of people interact with each other every day 🤝

Role overview

The Platform team provides the services, tools, and know-how to make the business units operationally independent. The domains we cover are Android/iOS core development and CI/CD, the Data Platform (providing analytics as a service), and back-end core development and infrastructure.

The role

As a Site Reliability Engineer - Machine Learning Operations (ML Ops), you'll be part of the Platform-Infrastructure team, which supports application development and designs, manages, and maintains infrastructure services over the course of their life cycle. You'll contribute operations and development knowledge to solve problems in complex infrastructure, improving performance, visibility, stability, availability, and reliability through codified, scripted, or automated solutions. You'll also be responsible for ML-specific resources within the infrastructure.

What we expect

  • Programming experience in at least one modern programming language such as Go (Java/Scala is a plus).
  • 5+ years of programming and system administration in Linux environments, preferably working on high-throughput, low-latency systems.
  • Extensive experience with public cloud platforms (preferably GCP).
  • Extensive experience with Kubernetes, Ansible and/or Terraform, and Kafka.
  • Experience in ML operations.
  • Excellent understanding of distributed system design across process and site boundaries.
  • Hands-on experience with service orchestration and management, deployment activities, configuration management, and all necessary automation.
  • Good understanding of process isolation, virtualization, and containerization concepts, and the ability to apply them when necessary.
  • Good understanding of the software development lifecycle, including versioning, building, testing, staging, and deployment processes, with a strong continuous delivery mindset.
  • A research-oriented mindset.
  • Strong tendency to keep things simple and maintainable (stick to KISS + YAGNI).
  • Experience in configuring, provisioning, and maintaining high-performance clusters.
  • Experience creating and maintaining the lifecycle of ML models.

What you'll work on

  • Building tooling to ease the provisioning and scaling of infrastructure resources.
  • Continuously improving and extending infrastructure components to handle growth.
  • Splitting your time between Data Platform and Core Infrastructure.
  • Optimizing overall systems performance and investigating production issues for future improvements.
  • Ensuring systems availability, reachability, maintainability, and testability.
  • Working on the AI platform.
  • Contributing to improved model tracking and monitoring.
  • Supporting data engineers and data scientists by creating and managing big data clusters and environments for data exploration and visualization.
  • Working with a modern microservice architecture based on Java/Scala. We use a wide stack of technologies, some of the most important being Kubernetes, Docker, Ansible, Apache Cassandra, ScyllaDB, MySQL, Kafka, Elasticsearch, Redis, Memcached, Prometheus, Grafana, and Debian Linux.
  • Building the necessary instrumentation, tooling, and alarming systems in order to escalate abnormalities.

What we offer

  • International team - 30+ nationalities working together!
  • Competitive salary
  • Medical insurance
  • Learning & sharing environment
  • Flexible working hours
  • Exciting company parties & team activities – Running team, Geek lunch!
  • Start the day with fresh fruit and cereals
  • Stay refreshed: get juice, tea, coffee, and soft drinks


  • 50-249

Truecaller was founded in 2009 in Stockholm, Sweden, by Nami Zarringhalam and Alan Mamedi, with the mission of making communication safer and more efficient in everyone's daily life. The app began when our co-founders were students who wanted to create a service that would easily identify incoming calls from unknown numbers. Today, Truecaller is loved by 200 million daily active users around the world and is the go-to app for Caller ID, spam blocking, and payments. We have our strongest presence in South Asia, the Middle East, and Africa, with our HQ in Sweden. We are backed by some of the most prominent investors in the world, such as Sequoia Capital, Atomico, and Kleiner Perkins Caufield & Byers.
