[OpenR8 solution] Image_Age_SSD_Caffe (Using the SSD 512 algorithm for image analysis and Caffe library for age prediction)
  1. Image_Age_SSD_Caffe

 

Image_Age_SSD_Caffe is built on the Caffe deep learning framework. The SSD (Single Shot MultiBox Detector) algorithm is first used to train a model, and the trained model is then used to predict age. The training image size is 512 x 512.

 

The main processes are shown in Fig. 1.

 

First, we need to prepare the images that we want the model to learn from, and draw boxes around the targets in the images to mark their categories.

The purpose is to let the model know which category the object in each image belongs to.

Then, a series of .py files is run to generate two txt list files that tell the model which files will be used for training and which for testing.

The categories are then defined in a label file, and the .py files are run to train the model using the two txt list files.

After training is complete, choose the appropriate .py test script depending on the images you want to test.

 

 

Fig. 1. SSD predictive age process.

 

 

  1. Step 1: Mark areas of interest

 


 

  1. Purpose:

Prepare the images and draw boxes to select and label the areas of interest.

 

  1. Introduction to the content:

In the image, draw a box around the object that you want the model to learn. For example, if you want the model to learn the age of a person's face, prepare face images, draw a box around each face, and label the box with the age. The model will then know that the boxed area is a face of that age. In the same way, whatever you want the model to learn, prepare the images and draw labeled boxes around the targets.

So we need to prepare files containing the following two items:

(1) Images (the images you want the model to learn from, and the images used to test the model's accuracy).

(2) XML files that record the target locations in each image (these are generated automatically when the images are labeled).
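The following is a rough Python sketch (not part of the OpenR8 solution) of how the boxes and age labels could be read back out of such an XML file; LabelImg saves annotations in the Pascal VOC XML style, and the file name used here is hypothetical.

# Sketch: read the boxes and age labels from a LabelImg (Pascal VOC style) XML file.
# The file name "data/train_annotation/example.xml" is hypothetical.
import xml.etree.ElementTree as ET

def read_annotation(xml_path):
    """Return a list of (label, xmin, ymin, xmax, ymax) tuples from one XML file."""
    root = ET.parse(xml_path).getroot()
    boxes = []
    for obj in root.findall("object"):
        label = obj.find("name").text  # e.g. "30" (the age category)
        box = obj.find("bndbox")
        coords = [int(float(box.find(tag).text)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((label, *coords))
    return boxes

for box in read_annotation("data/train_annotation/example.xml"):
    print(box)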

 

  1. Example:

In this Image_Age_SSD_Caffe example:

To predict the age of the face in an image, we use LabelImg.exe to draw a box around the target face (see Fig. 2 below).

In Fig. 3 below, the red box frames the face and labels it with an age category (the face's age).

When labeling is complete, saving automatically produces the .xml file.

Place the images in the specified folder locations, as shown in Fig. 4.

【Training】

Images: place them in OpenR8/solution/Image_Age_SSD_Caffe/data/train_image

XML files: place them in OpenR8/solution/Image_Age_SSD_Caffe/data/train_annotation

【Test】Both the images and the XML files for testing reuse the training sample files.

 

  1. Additional Instructions:

The labeling software labelImg.exe is bundled with OpenR8; its path is OpenR8 > solution > Image_Age_SSD_Caffe > LabelImg.exe (see Fig. 2 below). For usage instructions, refer to the Open Source Robot Club's [ezAI simple AI] labelImg usage guide (Windows version).

 

Fig. 2. Location of the labeling tool.

Fig. 3. Marking the target object in the image.

Fig. 4. Folder locations for the training sample images and the XML files produced by labeling.

 

 

  1. Step 2: Pre-processing - Set up a txt file listing the file locations

 


 

  1. Purpose:

Define the locations of the images and their annotation files used for training and testing.

 

  1. Introduction to the content:

(1) If the computer has a display adapter installed, double-click "R8_Python3.6_GPU.bat" and open "1_prepare_train_txt.py". If the computer does not have a display adapter installed, double-click "R8_Python3.6_CPU.bat" and open "1_prepare_train_txt.py".

 

Fig. 5. Location of R8_Python3.6_GPU.bat and R8_Python3.6_CPU.bat.

Fig. 6. Loading 1_prepare_train_txt.py.

 

(2) Confirm the file paths of the training samples and their annotation files described below. After confirmation, press Run to produce train.txt. When "Press any key to continue ..." appears, train.txt has been produced successfully.

At present, the training sample path is fixed at solution/Image_Age_SSD_Caffe/data/train_image.

At present, the training sample annotation file path is fixed at solution/Image_Age_SSD_Caffe/data/train_annotation.

 

【train.txt】is produced at OpenR8/solution/Image_Age_SSD_Caffe/data/train.txt.

 

Fig. 7. Running 1_prepare_train_txt.py to produce train.txt.

 

(3) Because the test images and their XML files reuse the training sample files, there is no need to produce a separate test.txt file.
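As a rough illustration only, the sketch below builds such a list by pairing every image in train_image with its XML file in train_annotation. The "image_path xml_path" line format follows the common SSD-Caffe convention and is an assumption; the actual format written by 1_prepare_train_txt.py is the one shown in Fig. 9.

# Sketch: pair every training image with its XML annotation and write train.txt.
# The "image_path xml_path" line format is an assumed SSD-Caffe convention.
import os

data_dir = "solution/Image_Age_SSD_Caffe/data"
image_dir = os.path.join(data_dir, "train_image")
anno_dir = os.path.join(data_dir, "train_annotation")

with open(os.path.join(data_dir, "train.txt"), "w") as out:
    for name in sorted(os.listdir(image_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in (".jpg", ".jpeg", ".png"):
            continue
        xml_path = os.path.join(anno_dir, stem + ".xml")
        if os.path.exists(xml_path):  # keep only images that have been labeled
            out.write(os.path.join(image_dir, name) + " " + xml_path + "\n")
        else:
            print("warning: no annotation found for", name)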

 

Fig. 8. Location of train.txt.

Fig. 9. train.txt content diagram.

 

 

  1. Step 3: Pre-processing - Set up the label category file

 


 

  1. Purpose:

The previous step created the list files that tell the model which files are used for training and testing. This step sets up the category file (labelmap.prototxt), which tells the model which categories the data is divided into, so that the model can learn to classify the images.

 

  1. Introduction to the content:

Please create a 【labelmap.prototxt】 file under the OpenR8/solution/Image_Age_SSD_Caffe/data path. The file content format is as follows:

 

item {
  name: "none_of_the_above"
  label: 0
  display_name: "background"
}
item {
  name: "Category label name"
  label: 1
  display_name: "Display name of this category"
}
……
item {
  name: "Category label name"
  label: n
  display_name: "Display name of this category"
}

 

Please note: the "name" and "display_name" values above must match the category names, and the number of categories, used in the XML files produced in Step 1 (Mark areas of interest).

 

For example, if the labels are divided into two categories, cat and dog, then the "name" and "display_name" values above must be consistent with the cat and dog category names, as shown in Fig. 10.

 

Fig. 10. Example of labelmap.prototxt.

 

  1. Example:

In the current Image_Age_SSD_Caffe example, labelmap.prototxt has already been created in the OpenR8 folder, as described below.

 

In the OpenR8/solution/Image_Age_SSD_Caffe/data folder you can find labelmap.prototxt, which can be opened with Notepad++ or Notepad, as shown in Fig. 11.

 

The contents of the file are shown in Fig. 12. The images are classified into none_of_the_above (none of the above) plus the age groups 20, 30, 40, 50, 60, 70, and 80, a total of eight categories, with label numbers 0, 1, 2, 3, 4, 5, 6, and 7 respectively.
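Since these eight categories and their label numbers are fixed, a small Python sketch that writes an equivalent labelmap.prototxt could look like the following. It is purely illustrative (the display names are assumed to equal the category names); the labelmap.prototxt already shipped with the solution is the reference.

# Sketch: write a labelmap.prototxt for the background class plus the seven
# age groups described above (labels 0-7). Illustrative only.
categories = ["20", "30", "40", "50", "60", "70", "80"]

with open("data/labelmap.prototxt", "w") as f:
    f.write('item {\n  name: "none_of_the_above"\n  label: 0\n  display_name: "background"\n}\n')
    for i, name in enumerate(categories, start=1):
        f.write('item {{\n  name: "{0}"\n  label: {1}\n  display_name: "{0}"\n}}\n'.format(name, i))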

 

Fig. 11. labelmap.prototxt file path.

Fig. 12. labelmap.prototxt content diagram.

 

 

  1. Step 4: Pre-processing - annoset_to_lmdb

 


 

This step converts the txt files defined in the previous two steps into the lmdb files required for the subsequent training and testing of the model.

 

  1. Open the 2_annoset_to_lmdb.py file

If the computer has a display adapter installed, "mouse double-click R8_Python3.6_GPU.bat", do not install the display adapter, please "mouse double-click R8_Python3.6_CPU.bat" => click "File" => "Open" => "OpenR8 > solution Under the Image_Age_SSD_Caffe "=>" 2_annoset_to_lmdb.py ". The schematic is as follows Fig. 13Fig. 14Fig. 15Fig. 16.

 

For any questions about starting the software or loading a solution, refer to the "OpenR8 operating manual".

 

Fig. 13. R8_Python3.6_GPU.bat and R8_Python3.6_CPU.bat.

Fig. 14. File Open.

Fig. 15. Selecting 2_annoset_to_lmdb.py in the Image_Age_SSD_Caffe folder.

Fig. 16. Loading 2_annoset_to_lmdb.py.

 

  1. Run the 2_annoset_to_lmdb.py process file

After the parameters are confirmed, click Run. If an old lmdb folder exists, a prompt asking whether to remove it will pop up; click "Yes", as shown in Fig. 17.

 

When execution is complete, train_lmdb and test_lmdb are generated, as shown in Fig. 18 and Fig. 19.

 

Fig. 17. Deleting the old lmdb files.

Fig. 18. Run complete.

Fig. 19. train_lmdb and test_lmdb are generated.

 

  1. annoset_to_lmdb process file parameter description

 

Fig. 20. Input files (left) converted to output files (right).

 

This section sets the paths of the input files and of the output lmdb; the file items are shown in Fig. 20 above. The parameter paths are preset by default and can be left unchanged, so you can simply run the process directly.

 

The following introduces each process block of this py file, in order from top to bottom:

 

►Caffe_Init: Caffe framework initialization.

 

►File_DeleteDir (Fig. 21): When an old lmdb folder already exists, the old files must be removed before a new lmdb is generated. (Used for training.)

As shown below, selecting "File_DeleteDir" in the process area shows this process's parameters in the function list. Clicking "Edit" next to "dirName (String)" displays the folder-name variable in the variable area; its "Value", "data/Train_lmdb/", is the lmdb folder.

 

Fig. 21. File_DeleteDir.

 

►Caffe_ObjectDetect_CreateTrainData (Fig. 22): Generates the lmdb files. (Used for training.)

 

Fig. 22. Caffe_ObjectDetect_CreateTrainData.

 

►The last two blocks, "File_DeleteDir" and "Caffe_ObjectDetect_CreateTrainData": these produce the lmdb used for testing the model and are set up in the same way as the training lmdb above (a conceptual sketch of the folder cleanup follows).
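As a rough illustration of what the File_DeleteDir blocks amount to, the Python sketch below removes a stale lmdb folder before it is regenerated. The folder names are taken from the values mentioned above; this is not the OpenR8 implementation.

# Sketch: remove an old lmdb folder before a new one is generated,
# roughly what the File_DeleteDir process block does. Not the OpenR8 code.
import os
import shutil

def delete_dir(dir_name):
    if os.path.isdir(dir_name):
        shutil.rmtree(dir_name)  # delete the old lmdb folder and its contents
        print("removed old folder:", dir_name)

for lmdb_dir in ("data/train_lmdb/", "data/test_lmdb/"):
    delete_dir(lmdb_dir)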

 

 

  1. Step 5: Training Models - train

 


 

This introduction is divided into two parts: the first describes how to start training, and the second describes the details of each process block for readers who want to know more.

 

  1. When the data preparation is complete, start training the model.

First, use OpenR8 to open the "3_train.py" file and load the 3_train.py, as follows Fig. 23Fig. 24.

Next, verify that the computer supports GPU acceleration (when GPU acceleration is not supported, go to the GPU field of "Caffe_Train", change the value from "All" to ""), press "Run" to start the training model, this step takes a little time to wait for the program to build the model. The following Fig. 25.

 

When execution is complete, a trained model is generated; its model files are produced and placed as follows (see Fig. 26).

 

Path : OpenR8/solution/Image_Age_SSD_Caffe/data/

File name 1 : xxx.caffemodel

File name 2 : xxx.solverstate

 

Fig. 23. 3_train.py file path.

Fig. 24. Opening 3_train.py.

Fig. 25. Running 3_train.py and its execution process.

Fig. 26. The model produced after training is finished.

 

  1. 3_train Process Introduction:

 

►Caffe_Init: Caffe framework initialization.

 

►Caffe_Train:

【CaffeObject】The Caffe object carried over from the previous initialization.

【GPU】Whether the device used to train the model uses GPU acceleration. If so, set the value to "all"; otherwise leave it blank. (If you set "all", make sure the device has a GPU.)

【solverPath】The main parameter. This is the input file solver.prototxt required for Caffe training, which contains the number of training iterations, the training parameters, and the sample lists. See Fig. 27 and Fig. 28, and the sketch after this list.

【continueTrainModelPath】Continue training from a previously trained model. Fill in the path of the model you want to continue training from; the model extension must be .caffemodel.
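Conceptually, the Caffe_Train block corresponds to running the Caffe solver on solver.prototxt. The minimal sketch below uses the standard Caffe Python API rather than the OpenR8 internals; the GPU id and the commented model path are assumptions.

# Sketch: what the Caffe_Train step does conceptually, using the plain Caffe
# Python API. Paths and the GPU id are assumptions, not the OpenR8 code.
import caffe

use_gpu = True  # corresponds to GPU = "all"; set False for the "" (CPU) case
if use_gpu:
    caffe.set_device(0)
    caffe.set_mode_gpu()
else:
    caffe.set_mode_cpu()

solver = caffe.get_solver("data/solver.prototxt")  # iterations, learning rate, sample lists

# Optional, corresponds to continueTrainModelPath:
# solver.net.copy_from("data/previous_model.caffemodel")

solver.solve()  # run training; snapshots produce xxx.caffemodel / xxx.solverstate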

 

Fig. 27. Caffe_Train in the 3_train.py process.

Fig. 28. solver.prototxt content diagram.

 

 

  1. Step 6: Testing models for each training phase

 

  1. Purpose: To test the accuracy of the model at each training stage and find a suitable model, avoiding overfitting to the training samples, which would give poor results on the test samples.

 

  1. Introduction to the content:

(1) Training samples: "Open OpenR8" => "Open and load 4_train_result.py" => "Run the py file" => "Produce train_result.txt".

(2) Select the model with the appropriate number of training iterations:

As shown in Fig. 29, you can observe from train_result.txt whether the model's results at a given number of training iterations are acceptable, or whether they have converged after a certain stage. It is then appropriate to choose the model from that iteration count.
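Purely to illustrate the idea of picking a snapshot from train_result.txt, the sketch below assumes a hypothetical "iteration accuracy" pair on each line; the real layout of the file is the one shown in Fig. 29, and the script would need to be adapted to it.

# Sketch: pick the iteration with the best accuracy from train_result.txt.
# Assumes a hypothetical "iteration accuracy" pair per line; adapt to the
# real layout shown in Fig. 29.
best_iter, best_acc = None, -1.0
with open("data/train_result.txt") as f:
    for line in f:
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            iteration, accuracy = int(parts[0]), float(parts[1])
        except ValueError:
            continue  # skip header or free-text lines
        if accuracy > best_acc:
            best_iter, best_acc = iteration, accuracy

print("candidate model: iteration", best_iter, "accuracy", best_acc)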

 

Fig. 29. train_result.txt content.

 

 

  1. Step 7: Test a trained model – inference_image

 


 

When the model has been trained, you can test it in this step. First, use OpenR8 to open and load "5_inference.py", as shown in Fig. 30 and Fig. 31.

 

  1. Test flow

 

►Select the image you want to test, as shown in Fig. 32. Replacing the selected image with 20-3439312.jpg gives the detection result shown in Fig. 35.

 

►Verify that the computer supports GPU acceleration. If it does not, click "Edit" next to the GPU parameter of "Caffe_ObjectDetect_ReadNet" and change the "all" in "Value" in the variable area to "" to disable GPU acceleration.

 

►Select a well-trained model to detect the image. Click the "..." next to caffeModelPath of "Caffe_ObjectDetect_ReadNet" to select the trained model (.caffemodel file) you want to use.

 

►View the result message. Press Run to detect, through the trained model, the coordinates of the targets in the image and their categories, as shown in Fig. 33.

 

►View the result messages and images. If you want to see the detected boxes drawn on the image, press Debug to display the object detection results, as shown in Fig. 34. The image shows the predicted age of the face and the box drawn around the face.

 

Fig. 30. 5_inference.py.

Fig. 31. Loading 5_inference.py.

Fig. 32. 5_inference: selecting the test image.

Fig. 33. 5_inference run results.

Fig. 34. 5_inference Debug results.

Fig. 35. 5_inference: test result for another image.

 

  1. Introduction to the 5_inference.py process content

 

►Caffe_Init: Caffe framework initialization.

 

►Image_Open:

【imageFileName (String)】The path of the image to be inferred (tested).

【image (Image)】The image read from the file name above is stored in this variable.

 

►Caffe_ObjectDetect_ReadNet

【CaffeObject (Object)】

【GPU (String)】Whether to turn on GPU acceleration.

【deployPath (String)】The network structure definition to read: data/deploy.prototxt.

【caffeModelPath (String)】The trained model to read (extension .caffemodel).

【labelPath (String)】The category file used when labeling with labelImg.exe: data/predefined_classes.txt.

【meanFilePath (String)】

 

►Caffe_ObjectDetect_InferenceImage

【CaffeObject (Object)】

【OutputResult (Json)】The inference results for the image are stored as a message in this variable.

【GPU (String)】Whether to turn on GPU acceleration.

【image (Image)】The image to infer.

【confidenceThreshold (Float)】The confidence threshold.

 

►Json_Print

【json (Json)】Displays the detection results message from Caffe_ObjectDetect_InferenceImage.

 

►Image_DrawRectJson

【image (Image)】The image to infer.

【json (Json)】The inference result message.

【imageDrawRectJson (Image)】The image with the inference result message drawn on it.

 

►Debug_Image

【image (Image)】Displays the image with the inference results drawn on it.

【displayPercentage (Int)】The display percentage of the image. A value of 200 means 200% (the image is shown at twice its size), 50 means 50% (the image is halved), and so on for zooming in and out.

【windowTitle (String)】The title of the displayed image window.
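For readers who want to see the same flow outside OpenR8, the sketch below runs one image through a trained SSD model with the plain Caffe Python API. It assumes the standard SSD deploy network whose output layer is named "detection_out" and whose mean values are the usual (104, 117, 123); the image path and model file name are placeholders, and the exact preprocessing used by Caffe_ObjectDetect_InferenceImage may differ.

# Sketch: single-image SSD inference with the plain Caffe Python API.
# Assumes the standard SSD deploy net ("detection_out" output, mean 104/117/123).
# The image path and .caffemodel name are placeholders.
import caffe
import numpy as np

caffe.set_mode_cpu()  # or caffe.set_mode_gpu() when GPU acceleration is available
net = caffe.Net("data/deploy.prototxt", "data/xxx.caffemodel", caffe.TEST)

image = caffe.io.load_image("data/test.jpg")  # HWC, RGB, float values in [0, 1]
height, width = image.shape[:2]

transformer = caffe.io.Transformer({"data": net.blobs["data"].data.shape})
transformer.set_transpose("data", (2, 0, 1))             # HWC -> CHW
transformer.set_mean("data", np.array([104, 117, 123]))  # subtract the SSD mean
transformer.set_raw_scale("data", 255)                   # [0, 1] -> [0, 255]
transformer.set_channel_swap("data", (2, 1, 0))          # RGB -> BGR
net.blobs["data"].data[...] = transformer.preprocess("data", image)

detections = net.forward()["detection_out"]              # shape (1, 1, N, 7)
for image_id, label, score, xmin, ymin, xmax, ymax in detections[0, 0]:
    if score >= 0.5:                                     # confidenceThreshold
        print(int(label), round(float(score), 3),
              int(xmin * width), int(ymin * height), int(xmax * width), int(ymax * height))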

 

  1. If you want to infer test results for multiple images, open the 6_inference_folder.py file. Place the images to be inferred in a custom folder, as shown in Fig. 36. Set the image folder path to infer, as shown in Fig. 37, and press Run to start inference on multiple images; if you want to see the detected boxes drawn on the images, press Debug. The process settings of 6_inference_folder.py can be configured in the same way as those of 5_inference.py (see the sketch below).
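In the same spirit, looping the single-image sketch above over every image in a folder is enough to reproduce the idea of 6_inference_folder.py; the folder name below is hypothetical.

# Sketch: run the single-image inference above on every .jpg in a folder,
# mirroring the idea of 6_inference_folder.py. The folder name is hypothetical.
import glob
import os

image_folder = "data/test_images"
for image_path in sorted(glob.glob(os.path.join(image_folder, "*.jpg"))):
    print("inferring", image_path)
    # ... load the image and run net.forward() exactly as in the previous sketch ...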

 

Fig. 36. Image folder to infer.

Fig. 37. Inference of test results for multiple images.

 

 

  1. Step 8: Capture webcam images to test the trained model - inference_webcam

 


 

When the model has been trained, you can also test it by capturing webcam images in this step.

 

First use OpenR8 to open "7_inference_webcam.py" and load the file. The following Fig. 38Fig. 39.

 

  1. Test flow

►Confirm the "deviceNumber" of "OpenCV_VideoCapture_Open" webcam (the pen lens defaults to 0, the installed webcam is 1, and so on). The following Fig. 40.

 

►Verify that the computer supports GPU acceleration. If it does not, click "Edit" next to the GPU parameter of "Caffe_ObjectDetect_ReadNet" and change the "all" in "Value" in the variable area to "" to disable GPU acceleration.

 

►Select a well-trained model to detect the image. Click the "..." next to caffeModelPath of "Caffe_ObjectDetect_ReadNet" to select the trained model (.caffemodel file) you want to use.

 

►Press Run to detect, through the trained model, the coordinates and categories of the targets in the captured webcam image, as shown in Fig. 41.

 

Fig. 38. 7_inference_webcam.py.

Fig. 39. Loading 7_inference_webcam.py.

Fig. 40. 7_inference_webcam.

Fig. 41. The result of running 7_inference_webcam.

 

  1. Introduction to the 7_inference_webcam.py process content

►Caffe_Init: Caffe framework initialization.

 

►Caffe_ObjectDetect_ReadNet

【CaffeObject (Object)】

【GPU (String)】Whether to turn on GPU acceleration.

【deployPath (String)】The network structure definition to read: data/deploy.prototxt.

【caffeModelPath (String)】The trained model to read (extension .caffemodel).

【labelPath (String)】The category file used when labeling with labelImg.exe: data/predefined_classes.txt.

【meanFilePath (String)】

 

►OpenCV_VideoCapture_Init: Streaming capture initialization.

 

►OpenCV_VideoCapture_Open: Open the specified webcam.

 

►OpenCV_VideoCapture_Grab: Retrieve the specified webcam image.

 

►OpenCV_VideoCapture_Retrieve: Get the image retrieved by webcam.

 

►Caffe_ObjectDetect_InferenceImage

【CaffeObject (Object)】

【OutputResult (Json)】The inference results for the image are stored as a message in this variable.

【GPU (String)】Whether to turn on GPU acceleration.

【image (Image)】The image to infer.

【confidenceThreshold (Float)】The confidence threshold.

 

►Json_Print 【json (Json)】Displays the detection results message from Caffe_ObjectDetect_InferenceImage.

 

►Image_DrawRectJson

【image (Image)】The image to infer.

【json (Json)】The inference result message.

【imageDrawRectJson (Image)】The image with the inference result message drawn on it.

 

►Image_Show

【image (Image)】Displays the image with the inference results drawn on it.

【displayPercentage (Int)】The display percentage of the image. A value of 200 means 200% (the image is shown at twice its size), 50 means 50% (the image is halved), and so on for zooming in and out.

 

►Image_DestoryAllWindows: Close the image window.

 

►OpenCV_VideoCapture_Release: Close the webcam.
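Outside OpenR8, the OpenCV_VideoCapture_* blocks above correspond to the usual OpenCV capture loop. A minimal sketch follows (device number 0 is the laptop's built-in camera, as assumed above; the per-frame SSD detection call is left as a placeholder).

# Sketch: the webcam capture loop behind the OpenCV_VideoCapture_* blocks,
# written with plain OpenCV. Device 0 = built-in laptop camera (assumption).
import cv2

capture = cv2.VideoCapture(0)                 # OpenCV_VideoCapture_Open, deviceNumber = 0
try:
    while True:
        grabbed, frame = capture.read()       # grab and retrieve one frame
        if not grabbed:
            break
        # ... run the SSD detection on `frame` and draw the boxes here ...
        cv2.imshow("7_inference_webcam", frame)   # rough Image_Show equivalent
        if cv2.waitKey(1) & 0xFF == ord("q"):     # press q to stop
            break
finally:
    capture.release()            # OpenCV_VideoCapture_Release
    cv2.destroyAllWindows()      # close the image windows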

