Learning Module
This is where the magic happens 🧙🪄
If this is your first time reading the Learning Module documentation, we suggest starting from the Overview Videos.
The Learning Module's architecture
The Learning Module has been redesigned with an updated architecture that follows machine learning best practices. It introduces an external separation of training and testing data to support multi-iteration model training. This approach provides more reliable performance estimates and works around PyCaret's lack of support for external data splitting. The following diagram presents the new architecture:

Enhanced external training and validation
The updated workflow introduces flexible dataset partitioning through multiple validation methods (cross-validation, bootstrapping, etc.), resolving PyCaret's limitation in handling external data splitting. For clarity:
External splits divide the learning set into training/testing data
Internal splits further partition training data for hyperparameter tuning
The figure below illustrates this enhanced validation framework.
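The two levels of splitting can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the Learning Module's implementation: the helper name k_fold_indices is hypothetical, K-fold is used as one example of a validation method, and the learning set is a stand-in list of 20 sample indices.

```python
import random


def k_fold_indices(indices, n_folds, seed=0):
    """Yield (train, held_out) index lists for a shuffled K-fold split.

    Hypothetical helper written for this illustration only.
    """
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled indices into n_folds roughly equal folds.
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    for i, held_out in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, held_out


learning_set = list(range(20))  # stand-in for 20 sample indices

# External split: learning set -> training / testing data.
for ext_iter, (train_idx, test_idx) in enumerate(k_fold_indices(learning_set, n_folds=4)):
    # Internal split: training data only -> tuning / validation folds.
    for tune_idx, val_idx in k_fold_indices(train_idx, n_folds=3, seed=ext_iter):
        # The external test samples never appear in an internal fold.
        assert not set(test_idx) & (set(tune_idx) | set(val_idx))
    print(f"external iteration {ext_iter}: {len(train_idx)} train, {len(test_idx)} test")
```

Each external iteration keeps its test samples out of every internal fold, which is the property that makes the external performance estimate independent of hyperparameter tuning.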

What relies on PyCaret, and what does not use it?
The Learning Module is built on PyCaret's open-source machine learning framework, enhanced with custom-coded components that enable new features such as external data splitting, a capability PyCaret does not support. The figure below highlights which elements use PyCaret's standard functionality and which are custom extensions, giving you a better understanding of the architecture.

A new design is here
The Learning Module features a redesigned interface aligned with the new architecture, offering users a streamlined way to create their scenes and experiments. The updated interface is organized into three boxes:
Initialization: Users begin by selecting their machine learning resources and configuring key experiment parameters, including dataset selection, data preprocessing steps, model choices, and more.
Training: This section enables users to define and manage the model training process, encompassing aspects such as hyperparameter tuning, optimization strategies, and additional features.
Analysis: In the final stage, users can visualize and interpret their model’s performance through a variety of result plots and metrics.

A new color-code system
The intuitive box-based design simplifies pipeline creation by visually guiding users through each step. A color-coding system works alongside the boxes to prevent errors. For example, as illustrated below, if you accidentally attempt to drag a Train Model node into the Initialization box, both the node and box will turn red, immediately alerting you to the mismatch. Each box only accepts compatible node types, ensuring logical connections and reducing setup mistakes.
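The compatibility rule can be pictured as a lookup from each box to the node types it accepts. The sketch below is an assumption made for illustration: apart from Train Model and the three box names, the node names and their assignment to boxes are hypothetical, as are the function name drop_color and the "default" value standing in for the non-error state.

```python
# Hypothetical mapping from each box to the node types it accepts.
# Only the box names and "Train Model" come from the documentation.
ACCEPTED_NODES = {
    "Initialization": {"Dataset", "Clean", "Split", "Model"},
    "Training": {"Train Model", "Combine Models"},
    "Analysis": {"Analyze"},
}


def drop_color(box, node_type):
    """Return 'red' when the node does not belong in the box, else 'default'."""
    accepted = ACCEPTED_NODES.get(box, set())
    return "default" if node_type in accepted else "red"


print(drop_color("Initialization", "Train Model"))  # red
print(drop_color("Training", "Train Model"))        # default
```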

In Results or Analysis modes, a different color-coding is used; read more about it here.
A new scene for experimenting
The Learning Module includes a new Experimental Scene, a minimalistic scene designed for testing machine learning configurations (models, parameters, etc.) before finalizing them in the main production scene.
As shown in the figure below, the Experimental Scene’s minimalistic design focuses attention on core machine learning elements, with all required node types available. The scene serves as a testing ground where users can refine their pipelines before switching to the main scene.

A Redefined Pipeline Structure
In the previous design, a pipeline was defined as any sequence of connected nodes. The updated architecture now defines a pipeline as a complete sequence of nodes that starts from an initial node and terminates at the Analysis Box. This crucial change means that any disconnected node chain or incomplete workflow will not be recognized as a valid pipeline for execution or analysis. By enforcing this complete connection, the platform ensures that users adhere to machine learning best practices and that every pipeline will be analyzed. The figure below illustrates an example of valid and invalid pipelines under this new definition.
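One way to picture the new definition is as a path search over the node graph: only chains that reach the Analysis Box count as pipelines. The following is a minimal sketch under assumptions, not the platform's own validation code. The function name complete_pipelines, the dictionary-of-edges representation, and the example node names are all hypothetical.

```python
def complete_pipelines(edges, start_nodes, analysis="Analysis Box"):
    """Return every node path that runs from a start node to the Analysis Box."""
    found = []

    def walk(node, path):
        path = path + [node]
        if node == analysis:
            found.append(path)
            return
        for nxt in edges.get(node, []):
            if nxt not in path:  # guard against cycles
                walk(nxt, path)

    for start in start_nodes:
        walk(start, [])
    return found


edges = {
    "Dataset": ["Train Model"],
    "Train Model": ["Analysis Box"],
    "Dataset B": ["Clean"],  # chain that never reaches the Analysis Box
}
print(complete_pipelines(edges, ["Dataset", "Dataset B"]))
# [['Dataset', 'Train Model', 'Analysis Box']]
```

The chain starting at "Dataset B" is dropped because it stops before the Analysis Box, which mirrors the invalid case described above.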

Overview Videos
How to create a scene
These steps are summarized in the figure below. Once your scene is created, a folder will be generated that includes the following:
Your scene (.medml file):
A folder for your scene models:
A folder for your scene notebooks:

Module Overview
The following sections provide a comprehensive overview of the scene and its fundamental components. Each numbered element in the main scene figure corresponds to a detailed explanation in the subsequent subsections.
Main scene

1. Scene folder breakdown
Every scene folder is organized as follows:
2. Available Nodes

The links used to explain the PyCaret-specific functions refer to the Classification section of the PyCaret documentation. However, these functions are also present in other machine learning types within PyCaret. For instance, you can find them in the Regression documentation of PyCaret as well.
Available nodes summary table:
This acts as the initial point for all experiments and determines the data your pipeline will use. The available options for this node correspond to the PyCaret setup() function options that are not directly related to data cleaning.
Input: -
Output: Dataset
This node enables you to clean and improve the quality of your dataset. The available options for this node correspond to the PyCaret setup() function options that are directly related to data cleaning.
Input: Dataset
Output: Dataset
This custom-coded node (distinct from PyCaret’s standard functions) gives you precise control over how your dataset is divided for training and evaluation. It serves as the foundation for reliable model validation by ensuring appropriate data separation.
Input: Dataset
Output: Dataset
This node allows you to select a machine learning algorithm from PyCaret's model library and set its associated parameters. It corresponds to the estimator parameter of the PyCaret create_model() function.
Input: -
Output: Model_config
This node allows you to train a model using the selected ML algorithm. The available options for this node correspond to the PyCaret create_model() function options (except the estimator parameter, which is defined through the Model node).
Input: Model_config + Dataset
Output: Model
This node allows you to combine multiple models using different techniques. It is based on PyCaret's blend_models() and stack_models() functions.
Input: Model
Output: Model
This node allows you to train and evaluate the performance of all estimators available in the PyCaret model library using cross-validation. The available options for this node correspond to the PyCaret compare_models() function options.
Input: Dataset
Output: Model(s)
This node allows you to load a model from a file. It takes as input a model from the ones you saved in your scene, displayed in a dropdown selector. The available options for this node are the ones available in the PyCaret load_model() function, except the model name, which is replaced by the selected file.
Input: Dataset
Output: Model
This box allows you to analyze a model. It gathers the analysis and model explainability functions of PyCaret. For now, only the plot_model() function is used in the Learning Module.
-
Model
-
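The input and output types in the summary table suggest a simple rule for which nodes can feed which. The sketch below encodes that reading as data. It is an assumption for illustration only: the node labels are hypothetical names derived from the descriptions, the Analysis box is left out, and the Model(s) output is simplified to Model.

```python
# (accepted input types, output type) per node, as read from the summary table.
# Node labels are illustrative; they are not identifiers used by the application.
NODE_IO = {
    "dataset": (set(), "Dataset"),
    "clean": ({"Dataset"}, "Dataset"),
    "split": ({"Dataset"}, "Dataset"),
    "model": (set(), "Model_config"),
    "train_model": ({"Model_config", "Dataset"}, "Model"),
    "combine_models": ({"Model"}, "Model"),
    "compare_models": ({"Dataset"}, "Model"),
    "load_model": ({"Dataset"}, "Model"),
}


def can_connect(source, target):
    """True if the source node's output type is one of the target's input types."""
    _, produced = NODE_IO[source]
    required, _ = NODE_IO[target]
    return produced in required


print(can_connect("clean", "train_model"))  # True
print(can_connect("model", "clean"))        # False
```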
3. Analysis Mode
The Analysis Mode button, called See Results in Experimental scenes, is used to view the results of the experiment. It is disabled until you run an experiment. After a successful run, a .medmlres file is created in your scene folder, containing the generated results from the experiment. If you quit the app, your generated results will still be available the next time you open the app.
4. Utils Menu
This menu contains different functionalities that can be used to help you build your scene.
Machine Learning type dropdown
This dropdown allows you to select the type of machine learning you want for your experiment. When changing the type, all settings are reset.
Play
This button allows you to run the experiment. You can find additional information about running the experiment here.
Garbage bin
This button allows you to delete all nodes in the scene.
Save
This button allows you to save the scene.
Load
This button allows you to load a scene from a file.
5. Minimap
This minimap allows you to navigate the scene and visualize the nodes present in it.
6. Flow Utils
This menu contains various functionalities that interact with the flow section.
Plus Button
This button allows you to zoom in on the flow section.
Minus Button
This button allows you to zoom out of the flow section.
Square Button
This button allows you to fit the flow section in the view.
Lock Button
This button allows you to lock the flow section. When locked, you can't move the flow section.
Map Button
This button allows you to show/hide the minimap.
7. Scene Boxes
In the main scene, these boxes are part of the new design, helping guide the user through creating their scene and reducing errors when dragging and placing nodes. Read more about these boxes in the next sections of the documentation.