Quality Engineering (Machine Learning and Distributed Platform)

About us: we build an open source, parallel, distributed, in-memory machine learning platform that lets customers quickly build high-performance, sophisticated ML models on terabytes of data.

The server is implemented in Java with the expertise of our technical founder, who wrote the HotSpot server JIT for JavaSoft. The ML algorithms are developed in consultation with our three Stanford advisors, Trevor Hastie, Rob Tibshirani and Stephen Boyd.

The platform is accessible via R, Python, Scala, Java, a REST API and a notebook-style web interface. Our paying customers include many of the largest insurance companies, banks and healthcare companies, many of the big-name unicorns in tech, startups, and many others.

We are looking for hardcore software developers to work both on the distributed compute platform and on implementing and improving Machine Learning algorithms for it. We support clusters of at least 3200 cores and tens of terabytes of RAM.

Job description

We are looking for members of our Quality Engineering team at both the lead and the individual contributor levels for Machine Learning and Distributed Platform QE. You will work very closely with the teams creating H2O, our open source machine learning platform, as well as on other products built atop and complementing H2O. This job involves the actual design, implementation, and running of black-box and white-box tests that exercise the functionality, performance, scalability and stress behavior of our distributed solution. This is an excellent opportunity to learn about machine learning as a key member of our world-class team. We are looking for hardcore developers, not just testers. The boundary between QE and development is very permeable here.

Machine learning quality engineers will work with the algorithm engineering and data science teams to test the correctness of the ML algorithms by writing self-checking tests in R and/or Python that verify that H2O gets the math right. This includes comparing against the math in published papers, checking the handling of missing values, identifying and testing edge cases, testing for numerical stability, and so on. A strong math/statistics background is essential, as is a good working knowledge of Python, R or both.
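To give a flavor of what a self-checking test can look like, here is a minimal sketch in plain Python. It is only an illustration and does not use the H2O API: the function names (`welford_variance`, `two_pass_variance`, `check_variance`) are hypothetical, and the choice of sample variance as the quantity under test is an assumption made for the example. A one-pass implementation is checked against a straightforward two-pass reference, including missing values and inputs with a large offset that stress numerical stability.

```python
import math


def _is_missing(x):
    # Treat None and NaN as missing values (an assumption for this sketch).
    return x is None or math.isnan(x)


def welford_variance(values):
    """Sample variance via Welford's one-pass update, skipping missing values."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        if _is_missing(x):
            continue
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return m2 / (n - 1) if n > 1 else float("nan")


def two_pass_variance(values):
    """Reference implementation: textbook two-pass sample variance."""
    clean = [x for x in values if not _is_missing(x)]
    n = len(clean)
    if n < 2:
        return float("nan")
    mean = math.fsum(clean) / n
    return math.fsum((x - mean) ** 2 for x in clean) / (n - 1)


def check_variance(values, rel_tol=1e-9):
    """Self-checking test: True if the implementation agrees with the reference."""
    got = welford_variance(values)
    want = two_pass_variance(values)
    if math.isnan(want):
        return math.isnan(got)
    return math.isclose(got, want, rel_tol=rel_tol, abs_tol=1e-12)


if __name__ == "__main__":
    cases = [
        [1.0, 2.0, 3.0, 4.0],                      # ordinary input
        [1.0, None, float("nan"), 3.0],            # missing values
        [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16],    # large offset (stability)
        [5.0],                                     # edge case: too few values
    ]
    for case in cases:
        assert check_variance(case), case
    print("all variance checks passed")
```

The same pattern (an independent reference, explicit edge cases, and a stated tolerance) carries over to tests written in R.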

Distributed systems quality engineers will work with the distributed systems platform engineers on inspecting, testing and improving the core platform Java code for correctness and performance, and with the algorithm engineers on testing and improving the performance of the ML algorithms. This includes trying to break the distributed in-memory data storage and compute layers and verifying their performance, as well as performance regression benchmarking and competitive benchmarking. For these roles, a distributed/parallel systems background is essential. BS-level knowledge of distributed systems (multithreading, locking and races, high-performance network I/O) is necessary and MS-level knowledge is highly desired. Experience with at least one of Java/Scala/C/C++/Haskell/similar is required.


Education and Experience
- Proven programming education, ability, and experience.
- 2-5 years of previous experience in quality engineering/development.
- Desirable: Master’s Degree in Computer Science or related field.
- Desirable: Experience with machine learning algorithms and/or math/statistics and/or distributed/parallel systems.
- (Lead) Proven leadership ability in a previous role.

Skills and Abilities
- Excellent programming ability.
- (Machine Learning) Experience with Python and/or R.
- (Distributed System) Experience with at least one of: Java/Scala/C/C++/Haskell/similar.
- Proven success shipping code which solves difficult systems-level problems.
- Experience with test automation and CI (Jenkins).
- Experience with any/all of Machine Learning, Hadoop, or Spark is desired.
