
IFN-Labs: Supervised Learning from a Data Table

By Kardi Teknomo, PhD


The purpose of the IFN virtual lab below is to provide an interactive learning environment for understanding and implementing supervised learning using Ideal Flow Network (IFN) classifiers on data in tabular format. The lab lets you work with the material hands-on, from training IFNs on data tables, to randomly generating input data for a chosen class, to predicting the class of new data. It aims to offer a practical understanding of these concepts that goes beyond theoretical knowledge. It is designed to encourage exploration and experimentation, so that you can see the direct impact of your actions and solidify your understanding of these concepts.

Brief Description

This virtual lab provides an environment for understanding and implementing supervised learning using data tables and Ideal Flow Network (IFN) classifiers. It offers tools that let you train IFNs, generate input data based on classes, and predict class data.

Learning Objectives

Upon completing this lab, you will be able to:
  1. Understand the fundamentals of Supervised Learning and IFN classifiers.
  2. Apply theoretical knowledge to train IFNs using data tables.
  3. Generate input data based on classes after training IFNs.
  4. Predict class data using trained IFNs.
  5. Evaluate the performance of IFNs in classifying data.

Prerequisite

You should be familiar with the concept of IFN. Check the IFN tutorial here. You can also check the full explanation in this YouTube video lecture.

Instruction

  1. Select or input a data table with the first row containing variable names and the last column representing the output category.
  2. Train the IFN using the data table.
  3. Generate input data based on a chosen class after training (with a cloud node).
  4. Use the generated input data to predict the class output.

Experiment and Discussion

  1. Experiment with different data tables to observe the performance of the IFN classifier.
  2. Experiment with the optional parameter, temperature, to observe its effect on the generated input data.
  3. Experiment with different test data to evaluate the accuracy of class prediction.
  4. Suppose the accuracy of the training is less than 100%. Will the training accuracy affect the generated input data? What happens to the accuracy of your prediction when you try to predict the class of generated input data? Is the predicted category the same as the class from which you generated the input data?
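To make the accuracy questions above concrete, here is a minimal sketch of how accuracy can be measured as the fraction of matching labels. The function name `accuracy` and the sample labels are hypothetical and are not taken from the lab's own code.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the actual classes."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy of an empty list")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels, for illustration only.
predicted = ["yes", "no", "yes", "no"]
actual = ["yes", "no", "no", "no"]
print(accuracy(predicted, actual))
```

The same function can be reused for the round-trip question: generate several inputs for one class, predict the class of each, and pass the predictions together with a list that repeats the chosen class.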

Challenge yourself

  1. Use the lab tools to predict class data for a new, unseen data table.
  2. Experiment with different temperature settings to observe their effect on the generated input data and class prediction.
  3. Evaluate the performance of the IFN classifier when trained without a cloud node. Is it possible to generate input data based on classes without a cloud node? If yes, how can it be done?

Lab Tool: Supervised Learning from a Data Table

Training IFN based on Data Table

First, the lab allows you to either select from a set of predefined data tables or input your own data. The data table should consist of multiple variables, with each row representing a record of data. The first row should contain variable names, and the last column should represent the output category.

In the text area below, you can enter your own data table. Each column is a variable and each row is a data record. Enter the variable names, separated by commas, in the first row. The data starts in the second row. If your data has N columns, the first N-1 columns are the inputs and the last column is the category output.
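The table layout described above can be checked offline before pasting it into the lab. The following is a minimal sketch, assuming plain comma-separated text. The function name `parse_data_table` and the sample table are hypothetical and are not part of the lab.

```python
import csv
import io


def parse_data_table(text):
    """Split a comma-separated table into header, input rows, and categories."""
    rows = [
        [cell.strip() for cell in row]
        for row in csv.reader(io.StringIO(text.strip()))
        if row  # skip blank lines
    ]
    header, records = rows[0], rows[1:]
    n = len(header)
    if n < 2:
        raise ValueError("need at least one input column and one category column")
    for i, rec in enumerate(records, start=1):
        if len(rec) != n:
            raise ValueError(f"record {i} has {len(rec)} columns, expected {n}")
    inputs = [rec[:-1] for rec in records]     # first N-1 columns
    categories = [rec[-1] for rec in records]  # last column
    return header, inputs, categories


# Hypothetical example table; not one of the lab's predefined tables.
table = """
outlook,windy,play
sunny,no,yes
rainy,yes,no
sunny,yes,yes
"""
header, X, y = parse_data_table(table)
print(header, X, y)
```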


Second, you can then train the Ideal Flow Network (IFN) using the data table, resulting in an IFN for each class along with the accuracy against the training data. The lab also offers the option to add a cloud node to ensure a trajectory cycle for creating an IFN, or train without a cloud node where the network may not be strongly connected.

Press the button below to train the IFN. You will get the IFN of each class, along with the accuracy against the training data table above. $(x, y) \Rightarrow f, \alpha$

Optionally, you can experiment with how the IFN is trained. By default, a cloud node is added to ensure that each trajectory is a cycle, which is what creates an IFN. Alternatively, you can train without a cloud node, in which case the network is not guaranteed to be strongly connected.
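The role of the cloud node can be illustrated with a simplified, count-based sketch. This is not the lab's training algorithm. It assumes that each record becomes a trajectory of `variable=value` nodes, that the cloud node (labelled `#` here, a hypothetical choice) closes the trajectory into a cycle, and that the flow on a link is the number of times it is traversed. All function names are hypothetical.

```python
from collections import defaultdict

CLOUD = "#"  # hypothetical label for the cloud node


def row_to_trajectory(names, values, with_cloud=True):
    """Turn one record into a node sequence, optionally closed by the cloud node."""
    nodes = [f"{n}={v}" for n, v in zip(names, values)]
    return [CLOUD] + nodes + [CLOUD] if with_cloud else nodes


def train(names, inputs, categories, with_cloud=True):
    """Build one network per class: {class: {(src, dst): traversal count}}."""
    networks = defaultdict(lambda: defaultdict(int))
    for values, category in zip(inputs, categories):
        path = row_to_trajectory(names, values, with_cloud)
        for src, dst in zip(path, path[1:]):
            networks[category][(src, dst)] += 1
    return {c: dict(links) for c, links in networks.items()}


def is_balanced(links):
    """True if, at every node, total inflow equals total outflow."""
    net = defaultdict(int)
    for (src, dst), flow in links.items():
        net[src] -= flow
        net[dst] += flow
    return all(v == 0 for v in net.values())


# Hypothetical data, for illustration only.
names = ["outlook", "windy"]
inputs = [["sunny", "no"], ["rainy", "yes"], ["sunny", "yes"]]
categories = ["yes", "no", "yes"]

with_cloud = train(names, inputs, categories, with_cloud=True)
without_cloud = train(names, inputs, categories, with_cloud=False)
print(is_balanced(with_cloud["yes"]), is_balanced(without_cloud["yes"]))
```

In this sketch, closing every trajectory through the cloud node makes the flow into each node equal the flow out of it. Without the cloud node, the first and last variables of each record are left unbalanced and the network need not be strongly connected.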




Generating Input Data based on Class

Third, you can also generate input data based on a chosen class after training with a cloud node. This feature demonstrates the power of the Ideal Flow Network (IFN) classifier's generalization ability, as it can generate input data that is not present in the original data table. If the training is conducted without a cloud node, the generation of input data based on class is not possible. An optional parameter, temperature, serves as a creativity parameter. At a temperature of 1, links with higher outflows are selected more often. Conversely, at a temperature of 0, links with lower outflows are selected more frequently. A temperature of 0.5 results in a uniform distribution.

After training (with a cloud node), you can randomly generate input data based on your chosen class. This is a random inverse function that recovers $x=f^{-1}(y)$. It also shows the generalization power of the IFN classifier, which can generate input data that is not in the original data table. If the training is done without a cloud node, generating input data based on class is not possible.
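One way to picture this random inverse is as a random walk on the network of the chosen class. The walk starts at the cloud node and stops when it returns there. The sketch below is an illustration under assumptions, not the lab's implementation. It assumes that link flows are stored as `{(src, dst): flow}`, that the cloud node is labelled `#`, and that the next node is chosen with probability proportional to the link flow. The sample network is hypothetical.

```python
import random

CLOUD = "#"  # hypothetical label for the cloud node


def generate_input(links, rng=random, max_steps=1000):
    """Random walk from the cloud node back to the cloud node."""
    out = {}
    for (src, dst), flow in links.items():
        out.setdefault(src, []).append((dst, flow))
    if CLOUD not in out:
        raise ValueError("no cloud node: this sketch has nowhere to start the walk")
    path, node = [], CLOUD
    for _ in range(max_steps):
        targets, flows = zip(*out[node])
        node = rng.choices(targets, weights=flows, k=1)[0]
        if node == CLOUD:
            return path  # the visited nodes form one generated input record
        path.append(node)
    raise RuntimeError("walk did not return to the cloud node")


# Hypothetical class network built from two records:
# (a=1, b=1, c=1) and (a=2, b=1, c=2), each closed through the cloud node.
links = {
    ("#", "a=1"): 1, ("#", "a=2"): 1,
    ("a=1", "b=1"): 1, ("a=2", "b=1"): 1,
    ("b=1", "c=1"): 1, ("b=1", "c=2"): 1,
    ("c=1", "#"): 1, ("c=2", "#"): 1,
}
sample = generate_input(links, random.Random(0))
print(sample)
```

Because both records pass through the shared node `b=1`, the walk can also produce combinations such as `a=1, b=1, c=2`, which do not appear in either original record. The sketch also needs the cloud node as its start and end point, which is one way to see why generation requires training with a cloud node.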

The optional parameter temperature is a kind of creativity parameter. At temperature = 1, links with higher outflows are selected more often. At temperature = 0, links with lower outflows are selected more often. At temperature = 0.5, a uniform distribution is used.
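The lab does not state the formula behind the temperature parameter. As a hedged illustration, the sketch below uses one hypothetical mapping that reproduces the three behaviours described above: each link is weighted by its outflow raised to the power 2T - 1. The weight is then proportional to the flow at T = 1, constant at T = 0.5, and inversely proportional to the flow at T = 0. The function names and the exponent rule are assumptions.

```python
import random


def link_weights(outflows, temperature):
    """Weight each candidate link by flow ** (2T - 1); flows must be positive."""
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0 and 1")
    exponent = 2.0 * temperature - 1.0  # assumed rule, not the lab's formula
    return {node: flow ** exponent for node, flow in outflows.items()}


def choose_next(outflows, temperature, rng=random):
    """Pick the next node at random according to the temperature-adjusted weights."""
    weights = link_weights(outflows, temperature)
    nodes = list(weights)
    return rng.choices(nodes, weights=[weights[n] for n in nodes], k=1)[0]


# Hypothetical outflows from one node to two successors.
outflows = {"a": 4, "b": 1}
for t in (1.0, 0.5, 0.0):
    print(t, link_weights(outflows, t))
```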



Predicting Class Data

Fourth, you can predict class data. Enter your test data in the provided text area. The first row should contain variable names separated by commas, while the second row should contain the test data. If your original data has N columns with the last column being the class, the test data should only include the first N-1 columns as input. After training, the lab lets you predict the class output based on your input data.

Fill the text area below with the test data. Enter the variable names, separated by commas, in the first row, and the test data in the second row. If your original data has N columns, where the last column is the class, do not specify the class in the test data. Enter only the first N-1 input columns.
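The test-data layout can be validated against the training header before it is submitted. The following is a minimal sketch, assuming comma-separated text. The function name `parse_test_data` and the sample values are hypothetical.

```python
def parse_test_data(text, training_header):
    """Parse a two-row test table and check it against the training header."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    if len(lines) != 2:
        raise ValueError("expected exactly two rows: variable names and one test record")
    names = [c.strip() for c in lines[0].split(",")]
    values = [c.strip() for c in lines[1].split(",")]
    expected = list(training_header[:-1])  # drop the class column
    if names != expected:
        raise ValueError(f"expected input columns {expected}, got {names}")
    if len(values) != len(names):
        raise ValueError("test record must have one value per input column")
    return dict(zip(names, values))


# Hypothetical training header with the class in the last column.
training_header = ["outlook", "windy", "play"]
record = parse_test_data("outlook,windy\nsunny,no", training_header)
print(record)
```

If the class column is included by mistake, the header check fails, which matches the rule that the test data contains only the first N-1 columns.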




IFN for Data Science: Classification of Data Table (supervised learning)