MLProfile

Website of the PROFIL project funded by the ANR

Overview of the project

/!\ Warning: This website is under construction

Abstract

Machine Learning (ML) is widespread, and its use depends on structured workflows. An ML workflow involves a sequence of tasks such as data selection, preprocessing, transformation, model creation, evaluation, and result exploitation. An ML “model” refers to the trained and evaluated decision system tailored to a specific problem.

Several factors have democratized ML across fields: models can generalize beyond their original scope, Python is accessible to non-experts, and ML workflows are easy to implement in notebooks. This accessibility has led to significant improvements in decision-making and problem-solving across diverse domains.

While ML workflows lead to decision models, their design greatly affects model success and quality. Creating such workflows is inherently exploratory and requires expertise in the ML domain, which is highly diverse and constantly evolving. As ML becomes more widespread, that expertise is often not shared among practitioners, which results in uncontrolled code reuse and, frequently, in biased or low-quality outcomes.

Effective reuse of prior workflows is essential, but analyzing variations between them and assessing their quality remains challenging.

The PROFIL project, funded by the French ANR agency, aims to characterize ML workflows embedded in code, their operationalization, their reproduction, and finally their reuse by third parties. For this purpose, we adopt a model-driven engineering approach to abstract the content of machine learning notebooks written in Python, to build the profile of each ML workflow, and to identify inconsistencies in that workflow.
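To give a rough idea of what abstracting a notebook into a workflow profile could look like, here is a minimal sketch using only Python's standard `ast` module. It is not the PROFIL tooling: the `STAGE_BY_CALL` table, the `called_names` and `profile` functions, and the sample cells are all hypothetical names and data chosen for illustration. The sketch parses the source of each code cell without running it, collects the names of the functions and methods being called, and maps the recognized ones to workflow stages.

```python
import ast

# Hypothetical mapping from call names to workflow stages.
# Both the names and the stage labels are illustrative assumptions.
STAGE_BY_CALL = {
    "read_csv": "data selection",
    "dropna": "preprocessing",
    "StandardScaler": "transformation",
    "fit": "model creation",
    "score": "evaluation",
}


def called_names(source):
    """Return the names of called functions/methods, in the order they appear in the text."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name):         # e.g. print(...)
            name = func.id
        elif isinstance(func, ast.Attribute):  # e.g. model.fit(...)
            name = func.attr
        else:                                  # e.g. calling the result of a call
            continue
        # Sort key: position where the called name ends in the source text.
        found.append((func.end_lineno, func.end_col_offset, name))
    return [name for _, _, name in sorted(found)]


def profile(cells):
    """Return (cell index, call name, stage) for every recognized call."""
    rows = []
    for index, source in enumerate(cells):
        for name in called_names(source):
            stage = STAGE_BY_CALL.get(name)
            if stage is not None:
                rows.append((index, name, stage))
    return rows


# Made-up cell sources standing in for a notebook's code cells.
cells = [
    "import pandas as pd\ndf = pd.read_csv('data.csv').dropna()",
    "from sklearn.linear_model import LogisticRegression\n"
    "model = LogisticRegression()\nmodel.fit(X_train, y_train)",
    "print(model.score(X_test, y_test))",
]

for row in profile(cells):
    print(row)
```

Because the cells are only parsed and never executed, the sketch does not need pandas or scikit-learn to be installed. It would fail on cells that contain notebook magics such as `%matplotlib inline`, which are not valid Python, and a lookup by bare call name is far coarser than a model-driven abstraction of the workflow.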

More information

About Publications Contact