AI Training - Tutorial - Train a model to recognize marine mammal sounds

Objective

The aim of the tutorial is to understand how to train a model with AI Training in order to classify sounds.

This the next step after you have designed the model with AI Notebooks. You can see the Notebook step in the tutorial: Audio analysis and classification with AI.

Info

It's strongly recommended to read the Notebook tutorial before reading this tutorial.

Requirements

Access to the
A Public Cloud project created
The ovhai CLI interface installed on your system (more information here)
Docker installed and configured to build images.
An OCI / Docker image registry. You can use a public registry (such as Docker Hub for example) or a private registry. Refer to the Creating a private registry documentation to create a private registry based on Harbor. To make your registry compatible with AI Solutions usage, follow the Use & manage your registries guide.
Knowledge about building images with Dockerfile

Instructions

Create object storage for data

To train the model you'll need data and a place where to save the trained model. You can reuse the previous object storage used in the Notebook tutorial Audio analysis and classification with AI or follow the step Uploading your dataset on Public Cloud Storage of this same tutorial.

Train your model

To train the model, we will use AI Training. This powerful tool will allow you to automate your pipelines and build fine-tuning phases easily.

AI Training allows you to train models directly from your own Docker images.

First, you need to create a Python script that is in charge of doing the training. You can copy and paste the following code in a file named train-audio-classification.py:

import numpy as np
import pandas as pd
import datetime

# preprocessing
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

# model
import tensorflow as tf

########################################################################################################################################################
# The goal of this script is to train a pre-construct model to recognize marine mammal sound.                                                          #
# See the Notebook "notebook-marine-sound-classification" in the ai-training-examples for                                                              #
# more details : https://github.com/ovh/ai-training-examples/blob/main/notebooks/audio/audio-classification/notebook-marine-sound-classification.ipynb #
# You must mount 2 volumes for the data and the model (the same used for the Notebook for example 😉) :                                                #
#   - /workspace/saved_model where the model is stored                                                                                                 #
#   - /workspace/data/ where store the data for the training                                                                                           #
########################################################################################################################################################

# 🗃 Load pre-transform data
df = pd.read_csv('/workspace/data/data.csv')
# dataframe shape
df.shape
# dataframe types
df.dtypes

# 🔢 Encode the labels (0 => 44) 
class_list = df.iloc[:,-1]
encoder = LabelEncoder()
y = encoder.fit_transform(class_list)
print("y: ", y)

# 🧹 Uniformize data thanks to the initial data 
input_parameters = df.iloc[:, 1:27]
scaler = StandardScaler()
X = scaler.fit_transform(np.array(input_parameters))
print("X:", X)

# ⚗️ Create training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size = 0.2)

# 🧠 Define model architecture
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(512, activation='relu', input_shape=(X_train.shape[1],)),
    tf.keras.layers.Dropout(0.2),

    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dropout(0.2),

    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),

    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dropout(0.2),

    tf.keras.layers.Dense(45, activation='softmax'),
])

print(model.summary())

# 💪 Train the model with data
model.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics = 'accuracy')

# 📈 Add the TensorBoard callback (optional)
print('Model tracking')
log_dir = "/workspace/saved_model/runs/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)

model.fit(X_train, y_train, validation_data = (X_val, y_val), epochs = 100, batch_size = 128, callbacks = [tensorboard_callback])

# 💿 Save the model for future usages
model.save('/workspace/saved_model/my_model2')
print('End of training')

Info

The tensorboard step is not mandatory. It's just a way to monitor your training.

Then, create a requirements.txt file to declare the Python dependencies:

tensorflow
numpy==1.22.4
pandas
scikit-learn
keras

Then, create a Dockerfile compliant with AI Training. You can copy and paste the following code in a file named Dockerfile:

FROM --platform=linux/x86_64 python:3.8

WORKDIR /workspace
ADD . /workspace

RUN pip install -r requirements.txt

# Mandatory to run the jobs in rootless mode
RUN chown -R 42420:42420 /workspace

CMD [ "python3"  , "/workspace/train-audio-classification.py"]

Then, build the Docker image and push it in the registry:

docker build . -f Dockerfile -t <regristry-name>/marine-mammal-job:1.0.0
docker push <regristry-name>/marine-mammal-job:1.0.0

The output should be similar to this:

$ docker build . -f Dockerfile -t my-registry.gra7.container-registry.ovh.net/ai/marine-mammal-job:1.0.0
...
 => => naming to my-registry.gra7.container-registry.ovh.net/ai/marine-mammal-job:1.0.0

$ docker push my-registry.gra7.container-registry.ovh.net/ai/marine-mammal-job:1.0.0                                                     
The push refers to repository [my-registry.gra7.container-registry.ovh.net/ai/marine-mammal-job]
6e5b7acfda9e: Pushed 
..
1.0.0: digest: sha256:72f19493662aafe3d0a3dc35ea5ab76b8472bd6a709de2da1a52e7ebf8ab7ad1 size: 3054

Once your Docker image is created and pushed into the registry, you can directly use the ovhai command to create your model training. You can launch the training specifying more or less GPU depending on the speed you want for your training.

Info

If your images are stored in a private registry, please follow the documentation Registries - Use & manage your registries to add your registry.

ovhai job run \
	--name marine-audio-classification-job \
	--gpu 1 \
	--volume marine-mammal-model@GRA/:/workspace/saved_model:RW:cache \
	--volume marine-mammal-sounds@GRA/csv/:/workspace/data:RO:cache \
	<registry name>/ai/marine-mammal-job:1.0.0

The output should be similar to this:

$ ovhai job run \
        --name marine-audio-classification-job \
        --gpu 1 \
        --volume marine-mammal-model@GRA/:/workspace/saved_model:RW:cache \
        --volume marine-mammal-sounds@GRA/csv/:/workspace/data:RO:cache \
        --unsecure-http \
        registry.gra.ai.cloud.ovh.net/my-project-id/marine-audio-classification-job:1.0.0

Id:         c0c0878c-5564-4660-889a-65724f6e3056
Created At: 04-07-23 14:05:58
Updated At: 04-07-23 14:05:58
User:       my-user
Spec:
  Image:                registry.gra.ai.cloud.ovh.net/my-project-id/marine-mammal-job:1.0.0
 Command:              
  Env Vars:             ~
  Default Http Port:    8080
  Unsecure Http:        true
  Resources:
    Gpu:               1
    Cpu:               13
    Memory:            40.0 GiB
    Public Network:    1.5 Gbps
    Private Network:   0 bps
    Ephemeral Storage: 750.0 GiB
    Gpu Model:         Tesla-V100S
    Gpu Brand:         NVIDIA
    Gpu Memory:        32.0 GiB
    Flavor:            ai1-1-gpu
  Volumes:
  - Source:
      Container: marine-mammal-model
      Alias:     GRA
      Prefix:    
      Archive:   ~
    Target: ~
    Mount:
      Mount Path: /workspace/saved_model
      Permission: Read & Write
    Cache:  true
  - Source:
      Container: marine-mammal-sounds
      Alias:     GRA
      Prefix:    csv/
      Archive:   ~
    Target: ~
    Mount:
      Mount Path: /workspace/data
      Permission: Read Only
    Cache:  true
  Timeout:              0
  Timeout Auto Restart: false
  Shutdown:             ~
  Name:                 marine-audio-classification-job
  Labels:
    ovh/id:   c0c0878c-5564-4660-889a-65724f6e3056
    ovh/type: job
  Ssh Public Keys:      ~
Status:
  State:           QUEUED
  Ip:              ~
  External Ip:     ~
  Info:
    Message: Job submitted
  History:
    STATE      DATE
    QUEUED     20-07-23 16:08:58
  Data Sync:       ~
  Duration:        0s
  Url:             https://c0c0878c-5564-4660-889a-65724f6e3056.job.gra.ai.cloud.ovh.net
  Info Url:        https://ui.gra.ai.cloud.ovh.net/job/c0c0878c-5564-4660-889a-65724f6e3056
  Ssh Url:         ~
  Monitoring Url:  ~
  Volumes:
  - Mount Path:     /workspace/saved_model
    Id:             fbff59c9-abfa-4d7f-ae53-549348e8c53a
    User Volume Id: b51cd2f5-99a0-4cc0-bc32-ce4515ecce6f
  - Mount Path:     /workspace/data
    Id:             2f25d170-5741-459b-919b-fef8d6fc74c4
    User Volume Id: bf7b5c4b-c61e-4d17-b6d8-248de56834b6

You can access to the execution logs of your job with the CLI:

ovhai job logs <job id> -f

The output should be similar to this:

$ ovhai job logs c0c0878c-5564-4660-889a-65724f6e3056 -f

Starting to watch job logs2023-07-04T15:13:08Z [job] 2023-07-04 15:13:08.579602: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-07-04T15:13:08Z [job] 2023-07-04 15:13:08.582765: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
...
2023-07-04T15:13:10Z [job] Skipping registering GPU devices...
2023-07-04T15:13:11Z [job] y:  [ 0  0  0 ... 44 44 44]
2023-07-04T15:13:11Z [job] X: [[-7.35979608e-01  2.06324223e-01  1.16956344e+00 ...  5.84987427e-01
2023-07-04T15:13:11Z [job]    5.93760708e-01 -4.47715498e-01]
2023-07-04T15:13:11Z [job]  [-6.79393058e-01  4.87389689e-01  1.43717311e+00 ...  6.24553399e-01
2023-07-04T15:13:11Z [job]    1.01619027e-01 -4.27615790e-01]
2023-07-04T15:13:11Z [job]  [-6.95846736e-01  1.96218503e-01  1.15618207e+00 ...  5.78436340e-01
2023-07-04T15:13:11Z [job]    9.53233744e-01 -1.29323842e-01]
2023-07-04T15:13:11Z [job]  ...
2023-07-04T15:13:11Z [job]  [-2.82393403e-01  6.98660564e-01 -8.70768342e-01 ...  4.16732718e-01
2023-07-04T15:13:11Z [job]    8.25026056e-01  1.64726948e-01]
2023-07-04T15:13:11Z [job]  [-1.06498353e-01 -1.16538834e-01  8.56199626e-01 ...  2.10513829e-01
2023-07-04T15:13:11Z [job]    1.61386821e-03  5.61172162e-01]
2023-07-04T15:13:11Z [job]  [ 1.77002149e+00 -6.27526483e-01  1.28201588e-02 ...  6.97330140e-01
2023-07-04T15:13:11Z [job]    5.26611477e-01  6.67499260e-01]]
2023-07-04T15:13:11Z [job] Model: "sequential"
2023-07-04T15:13:11Z [job] _________________________________________________________________
2023-07-04T15:13:11Z [job]  Layer (type)                Output Shape              Param #   
2023-07-04T15:13:11Z [job] =================================================================
2023-07-04T15:13:11Z [job]  dense (Dense)               (None, 512)               13824     
2023-07-04T15:13:11Z [job]                                                                  
2023-07-04T15:13:11Z [job]  dropout (Dropout)           (None, 512)               0         
2023-07-04T15:13:11Z [job]                                                                  
2023-07-04T15:13:11Z [job]  dense_1 (Dense)             (None, 256)               131328    
2023-07-04T15:13:11Z [job]                                                                  
2023-07-04T15:13:11Z [job]  dropout_1 (Dropout)         (None, 256)               0         
2023-07-04T15:13:11Z [job]                                                                  
2023-07-04T15:13:11Z [job]  dense_2 (Dense)             (None, 128)               32896     
2023-07-04T15:13:11Z [job]                                                                  
2023-07-04T15:13:11Z [job]  dropout_2 (Dropout)         (None, 128)               0         
2023-07-04T15:13:11Z [job]                                                                  
2023-07-04T15:13:11Z [job]  dense_3 (Dense)             (None, 64)                8256      
2023-07-04T15:13:11Z [job]                                                                  
2023-07-04T15:13:11Z [job]  dropout_3 (Dropout)         (None, 64)                0         
2023-07-04T15:13:11Z [job]                                                                  
2023-07-04T15:13:11Z [job]  dense_4 (Dense)             (None, 45)                2925      
2023-07-04T15:13:11Z [job]                                                                  
2023-07-04T15:13:11Z [job] =================================================================
2023-07-04T15:13:11Z [job] Total params: 189,229
2023-07-04T15:13:11Z [job] Trainable params: 189,229
2023-07-04T15:13:11Z [job] Non-trainable params: 0
2023-07-04T15:13:11Z [job] _________________________________________________________________
2023-07-04T15:13:11Z [job] None
2023-07-04T15:13:11Z [job] Epoch 1/100
82/82 [==============================] - 2s 13ms/step - loss: 0.1473 - accuracy: 0.9586 - val_loss: 0.1022 - val_accuracy: 0.9637
2023-07-04T15:13:13Z [job] Epoch 2/100
82/82 [==============================] - 1s 12ms/step - loss: 0.1220 - accuracy: 0.9597 - val_loss: 0.1044 - val_accuracy: 0.9668
2023-07-04T15:13:14Z [job] Epoch 3/100
...

For more explanations about the CLI command for AI Training, please read this guide: CLI Reference.

Once you have your model ready, deploy the model to use it. This will be done with the AI Deploy tool.

Go further

All the source code is available on the OVHcloud GitHub organization.

To create the application using the trained model, you can follow this tutorial: Deploy an app for audio classification task using Streamlit.

Feedback

Please send us your questions, feedback and suggestions to improve the service:

On the OVHcloud Discord server

War diese Seite hilfreich?