Machine Learning Model Analysis Using TensorBoard

Aekam Parmar

Machine Learning is growing by leaps and bounds, with new neural network models coming up regularly. These models are trained on a specific dataset and validated for accuracy and processing speed. Developers need to evaluate ML models and ensure that they meet specific threshold values and function as expected before they are deployed. A lot of experimenting goes into improving model performance, and visualizing differences becomes crucial while designing and training a model. TensorBoard helps visualize the model, making the analysis less complicated, as debugging becomes easier when one can see what the problem is.

General Practice to Train ML Models

The general practice is to use pre-trained models and perform Transfer Learning to re-train them for a similar dataset. In Transfer Learning, a neural network model is first trained on a problem similar to the one being solved. One or more layers from the trained model are then reused in a new model trained on the problem of interest, as sketched below.
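A minimal sketch of this idea in Keras, assuming a hypothetical 5-class image problem and choosing MobileNetV2 pre-trained on ImageNet purely for illustration:

```python
import tensorflow as tf

# MobileNetV2 pre-trained on ImageNet acts as the frozen feature extractor;
# only the new classification head is trained on the problem of interest.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the pre-trained layers

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5, activation="softmax"),  # hypothetical 5-class head
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```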

Most of the time, pre-trained models come in a binary format (SavedModel, protocol buffer), making it difficult to get internal information and immediately start working on them. From an organization's business point of view, it makes sense to have a tool that gives insight into the model and thereby reduces project delivery timelines.

There are a couple of options available to get model information, like the number of layers and their associated parameters. Model Summary and Model Plot are the basic ones. These options are quite simple, requiring only a few lines of code, and provide fundamental details like the number of layers, the type of each layer, and the input/output of each layer.
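For instance, with a Keras model both options take just a line or two; the small CNN below is purely illustrative, and plot_model additionally requires the pydot and graphviz packages:

```python
import tensorflow as tf

# An illustrative sequential CNN.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Text summary: layer types, output shapes, and parameter counts.
model.summary()

# Graphical plot of the same structure, saved as an image file.
tf.keras.utils.plot_model(model, to_file="model.png", show_shapes=True)
```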

However, Model Summary and Model Plot are not that effective for understanding every detail of a large, complex model delivered as a Protocol Buffer. In such scenarios, using TensorBoard, a visualization tool provided by TensorFlow, is more meaningful. It is quite powerful, considering the various visualization options it provides, like the model graph (of course), Scalars and Metrics (training and validation data), Images (from the dataset), Hyperparameter Tuning, etc.

Model Graphs to Visualize Custom Models

This option helps especially when a custom model is received in the form of a protocol buffer and needs to be understood before it is modified or re-trained. As shown in the image below, an overview of a sequential CNN is visualized on the board. Each block represents a separate layer, and selecting one of them opens a window in the top-right corner with its input and output information.

In case further information is required about what is inside an individual block, one can double-click on it to expand it and see more details. Note that a block can contain one or more nested blocks, which can be expanded layer by layer. Selecting any specific operation also shows more information about its associated processing parameters.
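One way to get such a graph onto the board, sketched under the assumption that the received model loads as a Keras model from a hypothetical SavedModel path and takes 28x28x1 inputs, is to run a single short training pass with the Keras TensorBoard callback, which writes the graph definition:

```python
import numpy as np
import tensorflow as tf

# "my_saved_model" and the input/label shapes are placeholders for the
# actual model received for the project.
model = tf.keras.models.load_model("my_saved_model")
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# write_graph=True makes the callback emit the graph definition,
# which then appears under the Graphs tab.
tb = tf.keras.callbacks.TensorBoard(log_dir="logs/graph", write_graph=True)

# A single pass over dummy data is enough to write the graph.
x = np.zeros((8, 28, 28, 1), dtype="float32")
y = np.zeros((8,), dtype="int64")
model.fit(x, y, epochs=1, callbacks=[tb])

# View it with:  tensorboard --logdir logs/graph
```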

Scalar and Metrics to Analyze Model Training and Validation

The second important aspect of Machine Learning is to analyze the training and validation of the given model. Performance, from an accuracy and speed point of view, is quite important for making the model suitable for real-life, practical applications. The accuracy of the model improves with the number of epochs/iterations. If the training and validation metrics are not up to the mark, something is not right. It could be either underfitting or overfitting, and it can be corrected by modifying the layers/parameters, improving the dataset, or both.
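A sketch of how these curves are typically produced: the Keras TensorBoard callback logs per-epoch training and validation loss/accuracy to the Scalars tab. Fashion-MNIST (the dataset whose coat and pullover classes appear in the next section) and the simple architecture here are only an example:

```python
import tensorflow as tf

# Fashion-MNIST is used here purely as an example dataset.
(x_train, y_train), (x_val, y_val) = tf.keras.datasets.fashion_mnist.load_data()
x_train, x_val = x_train / 255.0, x_val / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Per-epoch training and validation metrics are written to logs/fit and
# rendered as curves in the Scalars tab.
tb = tf.keras.callbacks.TensorBoard(log_dir="logs/fit")
model.fit(x_train, y_train,
          validation_data=(x_val, y_val),
          epochs=10,
          callbacks=[tb])

# Training accuracy climbing while validation accuracy stalls suggests
# overfitting; both staying low suggests underfitting.
```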

Image Data to Visualize Images from the Dataset

As the name suggests, this option helps to visualize images. It is not limited to visualizing images from the dataset; it can also show the Confusion Matrix in the form of an image. This matrix indicates how accurately objects of each class are detected. As shown in the image below, the model confuses the coat with the pullover. To overcome this, it is recommended to improve the dataset for those specific classes so that distinguishable features are fed to the model for better learning and, hence, better accuracy.
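A sketch of how a confusion matrix can be written to the Images tab, assuming predicted labels, true labels, and a list of class names are already available; the matrix is drawn with matplotlib and logged via tf.summary.image:

```python
import io
import matplotlib.pyplot as plt
import tensorflow as tf
from sklearn.metrics import confusion_matrix

def log_confusion_matrix(y_true, y_pred, class_names, logdir="logs/images"):
    """Render a confusion matrix with matplotlib and log it as an image."""
    cm = confusion_matrix(y_true, y_pred)

    fig, ax = plt.subplots(figsize=(6, 6))
    ax.imshow(cm, cmap="Blues")
    ax.set_xticks(range(len(class_names)))
    ax.set_xticklabels(class_names, rotation=45)
    ax.set_yticks(range(len(class_names)))
    ax.set_yticklabels(class_names)
    ax.set_xlabel("Predicted")
    ax.set_ylabel("Actual")

    # Convert the figure to a PNG, decode it into a tensor, and log it.
    buf = io.BytesIO()
    fig.savefig(buf, format="png", bbox_inches="tight")
    plt.close(fig)
    buf.seek(0)
    image = tf.expand_dims(tf.image.decode_png(buf.getvalue(), channels=4), 0)

    with tf.summary.create_file_writer(logdir).as_default():
        tf.summary.image("Confusion Matrix", image, step=0)

# Usage (placeholders for the real evaluation results):
# log_confusion_matrix(y_val, model.predict(x_val).argmax(axis=1), class_names)
```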

Hyperparameter Tuning to Achieve Desired Model Accuracy

The model’s accuracy depends on the input dataset, the number of layers, and the associated parameters. In most cases, the initial training would not reach the expected accuracy. It would require playing around with the number of layers, the types of layers, and the associated parameters, apart from the dataset. This process is known as Hyperparameter Tuning.

In this process, a range of hyperparameters is provided, and the model is run with combinations of these parameters. The accuracy of each combination is logged and visualized on the board. This reduces the effort and time that would otherwise be consumed by manually training the model for every possible combination of hyperparameters.
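A sketch of this with the HParams plugin that ships with TensorBoard, sweeping two hypothetical hyperparameters (dense units and dropout rate) over the Fashion-MNIST example data and logging the accuracy of each combination:

```python
import tensorflow as tf
from tensorboard.plugins.hparams import api as hp

# Two hypothetical hyperparameters and their candidate values.
HP_UNITS = hp.HParam("dense_units", hp.Discrete([64, 128]))
HP_DROPOUT = hp.HParam("dropout", hp.Discrete([0.2, 0.5]))

(x_train, y_train), (x_val, y_val) = tf.keras.datasets.fashion_mnist.load_data()
x_train, x_val = x_train / 255.0, x_val / 255.0

def train_run(run_dir, hparams):
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(hparams[HP_UNITS], activation="relu"),
        tf.keras.layers.Dropout(hparams[HP_DROPOUT]),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile("adam", "sparse_categorical_crossentropy", metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=3, verbose=0)
    _, accuracy = model.evaluate(x_val, y_val, verbose=0)

    # Record the hyperparameter combination and its resulting accuracy.
    with tf.summary.create_file_writer(run_dir).as_default():
        hp.hparams(hparams)
        tf.summary.scalar("accuracy", accuracy, step=1)

run = 0
for units in HP_UNITS.domain.values:
    for dropout in HP_DROPOUT.domain.values:
        train_run(f"logs/hparams/run-{run}", {HP_UNITS: units, HP_DROPOUT: dropout})
        run += 1
```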

Profiling Tool to Analyze Model Processing Speed

Apart from accuracy, processing speed is an equally important aspect of any model. It is necessary to analyze the processing time consumed by individual blocks and whether it can be reduced by making some modifications. The Profiling Tool provides a graphical representation of the time consumed by each operation across epochs. With this visualization, one can easily pinpoint the operations that consume the most time. Some known overheads are resizing the input, translating model code from Python, and running code on the CPU instead of the GPU. Taking care of such things helps achieve optimum performance.
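Profiling can be enabled through the same Keras TensorBoard callback by pointing profile_batch at a range of training batches (the tensorboard-plugin-profile package is needed to view the resulting Profile tab); the batch range and data below are placeholders:

```python
import tensorflow as tf

# Capture a profile of training batches 10 through 20 of the first epoch.
tb = tf.keras.callbacks.TensorBoard(log_dir="logs/profile",
                                    profile_batch=(10, 20))

# model.fit(x_train, y_train, epochs=1, callbacks=[tb])  # placeholder data
# Inspect with:  tensorboard --logdir logs/profile
```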

Overall, TensorBoard is a great tool to aid the development and training process. The data from Scalars and Metrics, Image Data, and Hyperparameter Tuning help improve accuracy, while the Profiling Tool helps improve processing speed. TensorBoard also reduces the debugging time involved, which would otherwise be significant.

Author
Aekam Parmar - Principal Engineer, VOLANSYS Technologies

Contributors
Guest Writer
Guest writers are IoT experts and enthusiasts interested in sharing their insights with the IoT industry through IoT For All.