
Technical Insight Series - Machine Learning Platform Evaluation

November 26, 2020


Machine Learning Platform Evaluation - Wacker and Infineon


“It was remarkable how appliedAI managed this project with two partners from different industries. In a very agile process they were able to be flexible and still stay focused on the overall goal: to provide us with hands-on experience on different ML pipelines and deliver a perfect documentation and overview.” - Dr. Thomas Schröck, Wacker Chemie AG


Reproducing, deploying, versioning, and tracking Machine Learning pipelines is an integral part of the Machine Learning lifecycle and one of the most common challenges for newly minted Machine Learning teams. As working styles in academia and in industry continue to diverge due to the heightened requirements of production-grade systems, the corresponding processes and workflows are also maturing. Traditional software engineering tools such as issue trackers, version control, and collaboration tools do not adequately reflect the added complexity of the Machine Learning lifecycle, nor do Data Science tools such as notebooks or model training libraries. This gap is especially pronounced for managing data and models, which are central to Machine Learning pipelines. A wide array of open-source and commercial providers on the market therefore promise to facilitate the management of Machine Learning pipelines. Since the landscape is immature, no de-facto standard toolchain has been established, and companies that must make strategic decisions about their toolchain face an immediate problem.

appliedAI was approached by its partners Wacker and Infineon, two large German companies with Machine Learning teams, which were facing this problem. Together with these partners, appliedAI conducted an evaluation of end-to-end and modular tools for managing Machine Learning pipelines. In this project, appliedAI was especially interested in tools that facilitate data versioning, model training, model versioning, model deployment, and data & model monitoring. In order to stay inside the scope of the Machine Learning tool evaluation, all tools that were concerned with data wrangling, storage or processing were excluded from this evaluation.

Infineon Technologies AG is a German semiconductor manufacturer headquartered in Neubiberg, Germany. Infineon joined the appliedAI partner network in 2017. The company has over 40,000 employees and is one of the ten largest semiconductor manufacturers worldwide. It is a market leader in automotive and power semiconductors. In fiscal year 2019, the company achieved sales of €8.0 billion.

Wacker Chemie AG is a German multinational chemical company headquartered in Munich, Germany. Wacker joined the appliedAI partner network in 2017. The company has over 14,000 employees and operates 24 production sites in Europe, Asia, and the Americas.

How can a Machine Learning pipeline evaluation be conducted, which steps need to be run through, and which critical aspects did appliedAI need to consider during the ML pipeline development? Read the following article for insights into what the Machine Learning pipeline evaluation process for Wacker and Infineon looked like, which problems the companies were facing while using Machine Learning, and which approach appliedAI took to set up and evaluate ML pipelines.

Executive Summary


Key takeaways from the assessment of over 50 Machine Learning tools designed to facilitate the management of Machine Learning pipelines:

  • Pipelines that combine modular tools without native integrations between them often require a high degree of initiative and research to ensure interoperability. The “glue” between tools is usually not seamless (“the glue is the clue”).
  • As tools are not always reliable and stable, good documentation and responsive tech support are key to using them in production.
  • Companies should focus on tools that make data scientists’ lives easier. End-to-end solutions should be preferred, and CLI-heavy tools without an intuitive user interface should be avoided.
  • Conducting hands-on sessions is critical in order to successfully and diligently evaluate tools for a Machine Learning team.
  • A good data management practice is mission-critical to successfully deploying and using Machine Learning tools.

1) Initial Situation and Problem Statement

Wacker and Infineon had both run their first Machine Learning (ML) projects successfully and used ML to improve the work of different business units. Both had in-house Machine Learning teams and were expanding the scope of their work. However, several shortcomings in their workflow were quickly identified: model tracking and deployment had to be done manually, organizing the many different data and deployment build pipelines was challenging, and provisioning nodes for training Machine Learning models was extremely time-consuming. Furthermore, reproducibility is an important criterion in the sectors in which both companies operate. Since Infineon and Wacker wanted to grow their Machine Learning teams and thus needed a solution to these problems, they approached appliedAI with the intention of a joint project for evaluating different types of Machine Learning pipelines.

2) Approach and Methodology

In order to help Infineon and Wacker reach their goal of a more structured approach to ML projects, tackle the challenge of disorganization, and lower the risk of legal consequences, appliedAI set up an evaluation of different Machine Learning platforms. The project followed a three-step process.

appliedAI started by defining the requirements that were most important to the partners. The team then created a list of tools for each step of the pipeline and researched their applicability against the base requirements. In the second phase, ‘Tools Pre-Assessment’, a short list of 6 tools was created that became the basis for the Machine Learning pipelines. The appliedAI team then contacted the relevant tool providers for support, assistance, and trial licenses for the tools under evaluation. After securing access to all tools, appliedAI deployed them on the initiative’s infrastructure and on Google Cloud. All 6 Machine Learning pipelines were then evaluated with respect to the criteria defined in Phase 1. In the last step, the necessary documentation about processes and pipelines was created, which served as the basis for the evaluation. A more detailed deep dive into the phases can be found below.


Process of Machine Learning Pipeline evaluation

ML Pipeline Project Phases
Source: appliedAI (Wagner, Tu, Oesterle, Münker, Machado, Rumpold)




Phase 1: Requirement Definition

To start, the scope, requirements, and multiple evaluation criteria were defined to capture the enterprise needs of the partner companies. Within that definition, appliedAI scoped five crucial stages in the ML pipeline assessment: data versioning, model training, model versioning, model deployment, and data & model monitoring. Furthermore, different approaches to designing a pipeline were identified, from modular tools to end-to-end platforms. In addition, multiple criteria were defined in a multistep approach to evaluate against the predefined enterprise needs of Wacker and Infineon.

First, appliedAI created a questionnaire for the teams at both partner companies to collect specific requirements and their relative importance. The criteria that each team came up with were then discussed in a group setting and distilled to the most important set.

Secondly, appliedAI and the Machine Learning teams categorized the criteria along specific dimensions, such as whether they act as dealbreakers, meaning they are non-negotiable, or whether they should be rated during tool evaluation. Different criteria were thus applied at different stages of the project; for example, some criteria were used in Phase 2 to filter tools from the long list in order to arrive at the short list. Additionally, the criteria were classified into the five pipeline steps mentioned above.

Thirdly, appliedAI created a preliminary weighting for all criteria with an estimate of their relative importance. This weighting could be adjusted interactively by Wacker and Infineon.
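The combination of dealbreakers and weighted criteria described above can be sketched in a few lines. This is a minimal illustration, not appliedAI's actual scoring code; the criteria names, the 0-5 rating scale, and the weights are assumptions made for the example.

```python
# Illustrative sketch of dealbreaker filtering plus weighted scoring.
# All names, scales, and weights here are hypothetical.

def score_tool(ratings, weights, dealbreakers):
    """ratings: criterion -> rating (0-5); weights: criterion -> weight;
    dealbreakers: criteria that must be rated above 0 to keep the tool."""
    if any(ratings.get(c, 0) == 0 for c in dealbreakers):
        return None  # tool is filtered out before any weighting happens
    total_weight = sum(weights.values())
    return sum(ratings[c] * w for c, w in weights.items()) / total_weight

weights = {"data versioning": 3, "model deployment": 2, "documentation": 1}
ratings = {"data versioning": 4, "model deployment": 3, "documentation": 5}

# Weighted average: (4*3 + 3*2 + 5*1) / 6
print(score_tool(ratings, weights, dealbreakers={"data versioning"}))
```

Making the weights explicit data, rather than hard-coding them, is what allows the kind of interactive adjustment by the partner teams that the text describes.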


Stages in ML Pipeline assessment

Source: appliedAI (Wagner, Tu, Oesterle, Münker, Machado, Rumpold)



In a second step, a long list of more than 50 tools was populated and prioritized, and a criteria set was created. The goal of this pre-assessment was to get an overview of the landscape and decide which tools were worth looking into in more detail. appliedAI conducted intensive desk research in order to create a collection of ML tools covering the five predefined stages.



Phase 2: Tool Pre-Assessment

The long list was then populated based on public information, and the tools were categorized into Tier 1, Tier 2, and a group of not selected tools. Tools categorized as Tier 1 were considered a good fit for the pipeline, offering a good overall impression as well as good criteria and feature coverage. Tools categorized as Tier 2 were of limited suitability for the pipeline and left only an average impression. Tools in the ‘Not selected’ group contained at least one dealbreaker, left a poor overall impression, and therefore could not be selected.
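The tiering logic above amounts to a simple decision rule. The following sketch shows one possible encoding of it; the thresholds, field names, and example tools are assumptions for illustration only and do not reflect appliedAI's actual criteria values.

```python
# Hypothetical illustration of the Tier 1 / Tier 2 / not-selected split.
# Thresholds and field names are assumed, not taken from the project.

def categorize(tool):
    if tool["dealbreakers"] > 0:
        return "Not selected"      # at least one non-negotiable criterion failed
    if tool["impression"] >= 4 and tool["coverage"] >= 0.8:
        return "Tier 1"            # good fit: strong impression and coverage
    return "Tier 2"                # limited suitability, average impression

long_list = [
    {"name": "Tool A", "dealbreakers": 0, "impression": 5, "coverage": 0.9},
    {"name": "Tool B", "dealbreakers": 0, "impression": 3, "coverage": 0.6},
    {"name": "Tool C", "dealbreakers": 2, "impression": 4, "coverage": 0.8},
]
for tool in long_list:
    print(tool["name"], "->", categorize(tool))
```

Checking dealbreakers first mirrors the process in the text: a single failed non-negotiable criterion removes a tool regardless of how well it scores elsewhere.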


Criteria sets for different pipeline steps

Source: appliedAI (Wagner, Tu, Oesterle, Münker, Machado, Rumpold)


Based on this categorization, the team built a short list of tools, downloaded the open-source tools, and acquired trial licences from the commercial tool providers. Eight tools were shortlisted and combined into five ML pipelines, each based either on an end-to-end tool or on a chain of modular tools.


ML Pipeline short list

Source: appliedAI (Wagner, Tu, Oesterle, Münker, Machado, Rumpold)




Phase 3: ML Pipeline Evaluation

In the next step, appliedAI deployed the tools in part on Google Kubernetes Engine and in part on appliedAI’s in-house Kubernetes cluster. This process was supported by a cross-functional project team with ML and DevOps expertise.

Afterwards, appliedAI conducted an evaluation of the five shortlisted ML tools and integrated pipelines. To conduct the evaluation, the team gathered a set of questions to be answered during the pipeline evaluation; criteria were considered on both the tool and the pipeline level to gain a holistic view of the deployed pipelines.

Additionally, the teams at Wacker and Infineon both had access to the tools on appliedAI’s infrastructure and thus the opportunity to try out the tools themselves and conduct their own evaluation. Throughout this process, appliedAI provided support to the Wacker and Infineon teams.


Criteria Catalogue

Source: appliedAI (Wagner, Tu, Oesterle, Münker, Machado, Rumpold)


The set of questions was incorporated into the capabilities of the criteria catalogue to be evaluated. The various capabilities across the ML pipeline were then tested using five hand-designed ML workflows: Training, Retraining, Monitoring, Scalability, and AutoML.

The last step of the process was to discuss the findings and results of the evaluation in the joint project team with Wacker and Infineon.

Assessing pipeline-specific criteria per workflow

Source: appliedAI (Wagner, Tu, Oesterle, Münker, Machado, Rumpold) (image contains exemplary data only)

Throughout the project, the team of experts from appliedAI, Wacker, and Infineon met regularly for project meetings where, in the spirit of an agile process, concrete feedback was given on work steps; changes were then carried out in a goal-oriented and flexible manner according to the results of the discussions and the requirements.

3) Results and Findings

"appliedAI did a great job in managing this multi-customer project. They guided us in an agile manner through the different project phases, always open minded but never losing focus. We gained hands-on experience & could benefit from profound technical experience of appliedAI's project team. Based on that and the jointly developed blue print we feel confident to pursue this topic further." - C. Ortmaier, Infineon Technologies AG


Three findings stood out at the end of the project. First, tools that have a user interface and offer end-to-end solutions often have distinct advantages. Many modular tools relied on complex configuration and had hidden dependencies and version conflicts, which led to a high overall setup effort. On the other hand, modular tools sometimes offered features that were not yet integrated into end-to-end solutions.

Secondly, the integration between tools in the market is currently poor and lags severely behind the toolchains available in other software engineering fields. appliedAI sees huge potential for ML tooling startups, since the market is still up for grabs.

Thirdly, many of the tools showed severe operational issues concerning stability. This reflects the early stage in which the ML tooling field currently finds itself.


ML Pipeline Evaluation Results

Source: appliedAI (Wagner, Tu, Oesterle, Münker, Machado, Rumpold) (image contains exemplary data only)


In contrast to the individual tools, the ratings for the respective pipelines lie significantly closer together, as weaknesses in one category are balanced out by strengths in others; an outperforming “do-it-all” solution does not exist.
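This balancing effect is simple arithmetic: when a pipeline's overall rating is an average over its category scores, two pipelines with very different strength profiles can end up with the same total. The numbers below are toy values for illustration only, not results from the evaluation.

```python
# Toy numbers (exemplary only): different strength profiles, same average.
pipeline_a = {"versioning": 5, "deployment": 2, "monitoring": 4}
pipeline_b = {"versioning": 3, "deployment": 4, "monitoring": 4}

def overall(scores):
    """Unweighted average over category scores."""
    return sum(scores.values()) / len(scores)

print(overall(pipeline_a))  # both average to about 3.67,
print(overall(pipeline_b))  # despite very different profiles
```

This is why the per-category breakdown, not the aggregate rating, carried most of the decision-relevant information in the evaluation.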

By the end of the project, the teams at Infineon and Wacker had gained deep insights into the current tool landscape and understood the pitfalls and challenges of building Machine Learning pipelines as well as the unique selling points of the different toolchains.

Looking back at the ML pipeline evaluation project, appliedAI achieved high customer satisfaction, driven by its in-depth experience with the tools and by efficient teamwork and collaboration. Furthermore, appliedAI was able to offer hands-on access to the tools and a smooth transition from in-person to online meetings.

Authors of the Case Study and ML Pipeline Project Team

Dr. Denise Vandeweijer
Director of Engineering Operations

Stephanie Eschmann
Marketing Communications Manager

Sebastian Wagner
Senior AI Engineer

Moritz Münker
AI Engineer

Alexander Machado
Senior AI Engineer

Adrian Rumpold
Senior AI Engineer

Anica Oesterle
AI Engineer

Patrick Tu
Junior AI Engineer

In cooperation with our partners Infineon and Wacker
