This section describes how to create a machine learning model for object recognition (Darknet YOLO v2 format) and convert it to a format for use with TensorFlow. Currently, that’s what the Edge IoT platform requires to function at its best.
(Please note: TensorFlow model formats work only with Gravio HubKit Version 4.0 or later.)
These are the steps required:
- Collect images for training. The more images you collect, the better the resulting model will be.
- Create the training data from the collected images.
- Create a machine learning model from the training data.
- Convert the machine learning models to a format TensorFlow can understand.
1. Collect the Data Images
Prepare the image data to be used as training data for object recognition:
1. Create a new folder for storing images in the working folder on your computer and name the folder images.
2. Prepare the image data that contains the objects to be recognized and place it in the newly created folder.
Please note the following points.
- The image file format should be JPEG, and the file extension should be “.jpg”. (Mixing different file extensions, such as .jpeg or .png, will result in errors during learning.)
- The image size should be at least 416 × 416 pixels.
- The number of images should be several hundred to several thousand for each object to be recognized.
- Use alphanumeric characters and sequential numbers in file names to make management more convenient.
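The extension and naming points above can be checked with a short script before training starts. The following is a minimal sketch and not part of the original procedure: the helper names `find_bad_extensions` and `sequential_name`, the `image` prefix, and the four-digit numbering are all hypothetical choices for illustration. It only inspects file names and does not verify the 416 × 416 minimum size.

```python
import os


def find_bad_extensions(names):
    """Return the file names whose extension is not exactly '.jpg'.

    Names ending in '.jpeg', '.png', or upper-case '.JPG' are reported,
    because the guide asks for a single, consistent '.jpg' extension.
    """
    return sorted(n for n in names if os.path.splitext(n)[1] != ".jpg")


def sequential_name(index, prefix="image", digits=4):
    """Build an alphanumeric, sequentially numbered name (hypothetical scheme)."""
    return f"{prefix}{index:0{digits}d}.jpg"


if __name__ == "__main__":
    # Assumption: the image folder is called "images" in the current directory.
    folder = "images"
    if os.path.isdir(folder):
        for name in find_bad_extensions(os.listdir(folder)):
            print("unexpected extension:", name)
```

Running the script from the working folder prints every file that would need to be converted or renamed before training.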
2. Create the Training Data
The training data consists of three parts: (1) the prepared image data, (2) class information data, which is a list of the names of the objects to be recognized, and (3) annotation data for each image, which indicates the rectangular area in the image occupied by each object to be recognized. Here is how to add the class information data and the annotation data:
1. Class information data is a list of names of objects to be recognized. Create a new file with a text editor and write one item to be recognized on each line. For example, if you want to build a learning model that recognizes three objects (person, dog, and cat), put each label on its own line. The ID numbers of the objects to be recognized are assigned sequentially, starting at 0 (person=0, dog=1, cat=2).
2. Finish editing the class information data and save it under the working folder, on the same computer as the images folder. You can name the file anything you like; this guide names it obj.names.
3. Next, in the working folder, create a new folder for storing the annotation data files and name it labels. The contents of the working folder will look like this:
4. Annotate the objects in each image under the images folder. The annotation for each object is represented by a rectangular area. In the example photo below, the annotation of “person” is illustrated.
In practice, the annotation data is saved as a text file in the labels folder. Each annotation file must have the same name as the corresponding image file, with the .jpg extension replaced by .txt.
In the annotation data file, each line should contain the following information as annotation data for one object:
- object ID (number),
- coordinates of the center point of the rectangular area (x,y), and
- size of the rectangular area (w,h).
The first line in the next screenshot represents the identifier
0 (=person), (x,y)=(
0.532322). The coordinates (x,y) and size (w,h) of the rectangular area of the object to be recognized are expressed as a ratio to the whole image, and the values range from 0 to 1.
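To make the ratio convention concrete, the sketch below converts a bounding box given in pixels into one annotation line. It is an illustration only: the function name `to_yolo_line` and the corner-based pixel input are assumptions, and annotation tools normally write these lines for you.

```python
def to_yolo_line(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel bounding box into one annotation line 'id x y w h'.

    (xmin, ymin) and (xmax, ymax) are opposite corners in pixels;
    img_w and img_h are the image dimensions in pixels.
    """
    x = (xmin + xmax) / 2 / img_w  # center x as a ratio of the image width
    y = (ymin + ymax) / 2 / img_h  # center y as a ratio of the image height
    w = (xmax - xmin) / img_w      # box width as a ratio of the image width
    h = (ymax - ymin) / img_h      # box height as a ratio of the image height
    for value in (x, y, w, h):
        if not 0.0 <= value <= 1.0:
            raise ValueError("bounding box lies outside the image")
    return f"{class_id} {x:.6f} {y:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    # A box covering the central quarter of a 416 x 416 image, class 0 (person).
    print(to_yolo_line(0, 104, 104, 312, 312, 416, 416))
```

Every value in the resulting line falls between 0 and 1, regardless of the resolution of the source image.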
The following tool is available for adding annotations to an image using a GUI (in English). For more details, please refer to Annotation Data Creation with LabelImg at the end of this document.
LabelImg is available from https://github.com/tzutalin/labelImg
3. Creating a Machine Learning Model
Create a learning model in the following steps:
1. Prepare Darknet
In the terminal environment of the computer where you want to create the training model, create the darknet environment by following the steps below.
git clone https://github.com/pjreddie/darknet
If you are building the training environment on a GPU (CUDA) machine, edit darknet/Makefile and set GPU=1 on the first line before running the make command.
2. For the first training, get the trained weight data from the following location.
3. Set up your training data.
mkdir learning_data (folder of your choice)
Place the image data, annotation data, and class information data prepared in steps 1 and 2 above in that folder:
4. Prepare the training data list and validation data list.
In the two list files, train.txt and test.txt, enter the image file paths from the data folder onwards, as shown below, to avoid duplication of file names. Only the image files listed in train.txt will be used for learning.
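The two lists can also be generated rather than typed by hand. The sketch below is one possible way to do it, under stated assumptions: the path prefix `data/learning_data/images/` is inferred from the paths used elsewhere in this guide, and the 80/20 split, the fixed seed, and the helper names are hypothetical choices.

```python
import random


def split_image_list(names, test_ratio=0.2, seed=0,
                     prefix="data/learning_data/images/"):
    """Shuffle image names and split them into (train, test) path lists.

    Duplicated names are dropped, so no path appears in both lists.
    """
    if not 0.0 <= test_ratio < 1.0:
        raise ValueError("test_ratio must be in [0, 1)")
    paths = [prefix + name for name in sorted(set(names))]
    random.Random(seed).shuffle(paths)  # seeded, so the split is repeatable
    n_test = int(len(paths) * test_ratio)
    return paths[n_test:], paths[:n_test]


def write_list(path, entries):
    """Write one file path per line, as train.txt and test.txt expect."""
    with open(path, "w") as f:
        f.write("\n".join(entries) + "\n")


if __name__ == "__main__":
    names = [f"image{i:04d}.jpg" for i in range(1, 11)]  # hypothetical names
    train, test = split_image_list(names)
    print(len(train), "training images,", len(test), "validation images")
```

Calling `write_list("train.txt", train)` and `write_list("test.txt", test)` then produces the two files.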
5. Prepare the learning configuration file.
Create a new file, darknet/cfg/obj.data, and enter the following:
classes = 3
train = data/learning_data/train.txt
valid = data/learning_data/test.txt
names = data/learning_data/obj.names
backup = backup
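Because the classes value must match the number of lines in the class information file, it can help to derive the configuration from that file. This is a sketch under assumptions: the helper names are hypothetical, and the default folder names simply mirror the values used in this guide.

```python
def read_class_names(path):
    """Read the class information file: one class name per non-empty line."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def render_obj_data(class_names, data_dir="data/learning_data",
                    backup_dir="backup"):
    """Return the text of obj.data for the given list of class names."""
    lines = [
        f"classes = {len(class_names)}",
        f"train = {data_dir}/train.txt",
        f"valid = {data_dir}/test.txt",
        f"names = {data_dir}/obj.names",
        f"backup = {backup_dir}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The three example classes from this guide; IDs follow the list order.
    print(render_obj_data(["person", "dog", "cat"]), end="")
```

Generating the file this way keeps the class count in step with the class list whenever a class is added or removed.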
6. Prepare the model configuration file. In this section, we will copy the already prepared darknet/cfg/yolov2-voc.cfg configuration file and rewrite its contents.
cp darknet/cfg/yolov2-voc.cfg darknet/cfg/obj.cfg
Adjust the newly created obj.cfg as follows. The value of filters is 40 when the number of classes is 3: (3+5)*5 = 40.
Line 3: batch=64
Line 4: subdivisions=8
Line 237: filters=(actual number of classes to be recognized + 5) * 5
Line 242: anchors (*)
Line 244: classes=3 (actual number of classes to be recognized)
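The filters formula is easy to get wrong when the number of classes changes, so a small helper can compute it. This is a minimal sketch; the function name is hypothetical, and the default of 5 for the multiplier is taken from the formula in this guide.

```python
def yolo_v2_filters(num_classes, multiplier=5):
    """Compute the filters value: (number of classes + 5) * 5."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return (num_classes + 5) * multiplier


if __name__ == "__main__":
    for n in (1, 3, 20):
        print(f"classes={n} -> filters={yolo_v2_filters(n)}")
```

For the three-class example used here, the helper returns 40, matching the value given above.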
(*) You can continue learning with the values of anchors as they are, but recalculating the values of anchors using the following tools may improve the accuracy.
Reference: Recalculation tool for anchors values https://github.com/Jumabek/darknet_scripts
7. Start the learning process
In the darknet folder, execute the following to generate the training model data (*.weights) under the darknet/backup/ folder.
./darknet detector train cfg/obj.data cfg/obj.cfg darknet19_448.conv.23
When the training starts, the output will be similar to:
Region Avg IOU: 0.905463, Class: 0.626457, Obj: 0.836158, No Obj: 0.006927, Avg Recall: 1.000000, count: 8
Region Avg IOU: 0.837368, Class: 0.976554, Obj: 0.842887, No Obj: 0.006556, Avg Recall: 1.000000, count: 8
Region Avg IOU: 0.893971, Class: 0.960798, Obj: 0.876742, No Obj: 0.007555, Avg Recall: 1.000000, count: 8
Region Avg IOU: 0.872773, Class: 0.711470, Obj: 0.888209, No Obj: 0.006718, Avg Recall: 1.000000, count: 8
Region Avg IOU: 0.894234, Class: 0.986793, Obj: 0.884681, No Obj: 0.006898, Avg Recall: 1.000000, count: 8
Region Avg IOU: 0.847557, Class: 0.994500, Obj: 0.849292, No Obj: 0.006747, Avg Recall: 1.000000, count: 8
Region Avg IOU: 0.802273, Class: 0.650419, Obj: 0.794552, No Obj: 0.006926, Avg Recall: 1.000000, count: 8
Region Avg IOU: 0.866342, Class: 0.840821, Obj: 0.860806, No Obj: 0.006149, Avg Recall: 1.000000, count: 8
9531: 0.661146, 0.704077 avg, 0.001000 rate, 2.704480 seconds, 609984 images
The number at the beginning of the last line (9531 in this example) is the number of steps. As the learning progresses, the average value of the error (0.704077 in this example) becomes smaller.
Learning can be stopped at any time, but to create a highly accurate recognition model, the average value of the error should be less than 1. In some cases, more than tens of thousands of steps may be required, and depending on the operating environment, it may take several days or weeks to complete.
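When training runs for days, it can be convenient to extract the step count and average error from the output automatically. The sketch below parses the summary line format shown above; the function name and the idea of saving the output to a log file are assumptions, not part of Darknet itself.

```python
import re

# Matches summary lines such as:
# "9531: 0.661146, 0.704077 avg, 0.001000 rate, 2.704480 seconds, 609984 images"
_STEP_LINE = re.compile(r"^\s*(\d+):\s*([\d.]+),\s*([\d.]+) avg")


def parse_step_line(line):
    """Return (step, error, average_error), or None for any other line."""
    match = _STEP_LINE.match(line)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2)), float(match.group(3))


if __name__ == "__main__":
    sample = ("9531: 0.661146, 0.704077 avg, 0.001000 rate, "
              "2.704480 seconds, 609984 images")
    step, error, average = parse_step_line(sample)
    print(f"step {step}: average error {average}")
    if average < 1.0:
        print("average error is below 1")
```

Lines beginning with "Region Avg IOU" do not match the pattern and are skipped.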
8. Resume learning
To resume learning, do the following: Under the backup folder, there will be a number of files with the extension .backup or .weights. If you specify the most recently saved of these files (usually the one with the .backup extension), you can resume training from the last saved training model.
./darknet detector train cfg/obj.data cfg/obj.cfg backup/(the name of the latest file under the backup folder)
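Picking the latest file by hand is error-prone when the backup folder fills up. The following sketch selects the most recently modified .backup or .weights file and prints the resume command; the helper name `pick_latest` is hypothetical, and using the modification time as the meaning of "latest" is an assumption.

```python
from pathlib import Path


def pick_latest(entries):
    """Return the newest checkpoint name from (file_name, mtime) pairs.

    Only names ending in '.backup' or '.weights' are considered.
    Returns None when no such file is present.
    """
    usable = [(mtime, name) for name, mtime in entries
              if name.endswith((".backup", ".weights"))]
    return max(usable)[1] if usable else None


if __name__ == "__main__":
    # Assumption: the script is run from inside the darknet folder.
    backup = Path("backup")
    if backup.is_dir():
        entries = [(p.name, p.stat().st_mtime) for p in backup.iterdir()]
        latest = pick_latest(entries)
        if latest is not None:
            print("./darknet detector train cfg/obj.data cfg/obj.cfg "
                  f"backup/{latest}")
```

The script only prints the command, so you can review it before starting the training run.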
As a result of the above steps, the following three files will be available after the training is completed.
darknet/backup/*.weights (the part marked with * specifies the name of the latest file generated under the backup folder)
4. Converting a Machine Learning Model to the Format Used by TensorFlow
The following example shows how to convert the three files (obj.weights, obj.cfg, and obj.names) into protocol buffers format (*.pb) that can be used by programs written in TensorFlow, using the DarkFlow conversion tool (https://github.com/thtrieu/darkflow):
1. Installing the TensorFlow environment
Follow the official TensorFlow installation procedure to install TensorFlow on the computer where you will work. (A Python 3 example is shown here.)
pip3 install --upgrade pip
pip3 install tensorflow==1.15
(Note: The version of TensorFlow must match the environment in which the converted model will be used.)
2. Install OpenCV for Python.
pip3 install opencv-python
3. Set up DarkFlow with the following steps.
git clone https://github.com/thtrieu/darkflow.git
cd darkflow
sed -i -e 's/self.offset = 16/self.offset = 20/g' darkflow/utils/loader.py
python3 setup.py build_ext --inplace
pip3 install -e .
pip3 install .
4. Create a models folder to store the three files created in section 3, “Creating a Machine Learning Model”. The files to be stored are assumed to be named obj.weights, obj.cfg, and obj.names.
(Store obj.weights, obj.cfg, and obj.names under the models folder)
5. Under the DarkFlow folder, convert each file with the following command.
flow --model models/obj.cfg --load models/obj.weights --labels models/obj.names --savepb
6. The above steps will generate two files under the built_graph folder after the conversion.
obj.pb … Neural network model file for machine learning
obj.meta … Meta information file
These files can be deployed to your Gravio infrastructure for distribution to the edge.