[OpenR8 solution] Image_Webcam_SSD300_Caffe (Using the Caffe SSD algorithm to detect objects in real-time webcam images)
Chapter 1: Image_Webcam_SSD300_Caffe Introduction

 

This document introduces how to use Caffe SSD to detect and label objects in real-time webcam images.

 

Image_Webcam_SSD300_Caffe shows how Caffe detects objects in real-time images. This document first describes how to label sample categories for Image_Webcam_SSD300_Caffe, and then how to open the Image_Webcam_SSD300_Caffe solution in OpenR8 to produce the databases, train on the images, and run inference on webcam images.

 

The main process is shown in Fig. 1, and each step is described in detail in the following chapters.

 


Fig. 1. The process of labeling training samples.

 

※ If you just want to test real-time webcam detection, you can jump directly to Chapter 6 - Run 3_inference_webcam.flow.

 

※ If you want to increase (or change) the number of samples or the sample categories, see Chapter 3 to Chapter 6 to label the sample categories, recreate the databases, and retrain.

 

※ If you run into problems during execution, refer to Chapter 7 - Additional Instructions.

 

 

Chapter 2: Image_Webcam_SSD300_Caffe Folder Introduction

 

Image_Webcam_SSD300_Caffe is located in the solution folder of OpenR8 and contains:

  1. Folder: “data”.
  2. Flow files: “1_annoset_to_lmdb.flow”, “2_train.flow”, “3_inference_webcam.flow”.
  3. Executable file: labelImg.exe.

 

※ If you are using Caffe for the first time and are unfamiliar with its settings, place your files in the same locations as in Image_Webcam_SSD300_Caffe. Once you are familiar with the paths, you can change the storage locations and contents.

 


Fig. 2. Image_Webcam_SSD300_Caffe location.

 


Fig. 3. Image_Webcam_SSD300_Caffe folder.

 

| name | use and function | contents |
| --- | --- | --- |
| data | Where the sample images, sample categories, and databases are stored. Table 2 explains the contents of the folder. | VOC2007, trainval_lmdb, test_lmdb, deploy.prototxt, labelmap_voc.prototxt, predefined_classes.txt, solver.prototxt, test.prototxt, test.txt, train.prototxt, trainval.txt, XXX.caffemodel file |
| 1_annoset_to_lmdb.flow | Turns the training samples into a database (lmdb). | |
| 2_train.flow | Trains on the samples. | |
| 3_inference_webcam.flow | Detects objects in real-time images. | |
| labelImg.exe | The software used to label the category of each sample. | |

Table 1. Image_Webcam_SSD300_Caffe folder introduction.

 


Fig. 4. Image_Webcam_SSD300_Caffe - data folder.

 

| data folder | use |
| --- | --- |
| test_lmdb folder | Holds the test sample database. |
| trainval_lmdb folder | Holds the training sample database. |
| VOC2007 folder, VOC2012 folder | Where the sample images and sample categories are stored. |
| deploy.prototxt | Network file. |
| labelmap_voc.prototxt | The label number assigned to each category. |
| predefined_classes.txt | The category name file used by labelImg.exe when labeling. |
| solver.prototxt | Parameter settings for Caffe. |
| test.prototxt | Test network. |
| test.txt | Lists the file path of each test sample image and the file path of its category label file. |
| train.prototxt | Training network. |
| trainval.txt | Lists the file path of each training sample image and the file path of its category label file. |
| VGG_VOC0712_SSD_300x300_iter_120000.caffemodel | Pretrained Caffe model. |

Table 2. Image_Webcam_SSD300_Caffe - data folder introduction.

 

The content of "labelmap_voc.prototxt" is shown in Fig. 5. There is one item{} block per category. Label 0 is reserved for the background, so fill in your own category names starting from label 1.

 


Fig. 5. labelmap_voc.prototxt content format.
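The rule described above (one item{} block per category, label 0 for the background, your own categories from label 1) can be sketched in a few lines of Python. This is only an illustration: the field names name, label, and display_name and the background name "none_of_the_above" are assumptions based on the layout commonly used by Caffe SSD label maps, so compare the output with Fig. 5 before using it. The function build_labelmap is a hypothetical helper, not part of OpenR8.

```python
def build_labelmap(class_names):
    """Return label map text: label 0 is the background, classes start at 1."""
    # Assumed background entry; check it against the shipped labelmap_voc.prototxt.
    entries = [("none_of_the_above", 0, "background")]
    for label, name in enumerate(class_names, start=1):
        entries.append((name, label, name))

    blocks = []
    for name, label, display in entries:
        blocks.append(
            "item {\n"
            f'  name: "{name}"\n'
            f"  label: {label}\n"
            f'  display_name: "{display}"\n'
            "}\n"
        )
    return "".join(blocks)
```

For example, build_labelmap(["fish", "frog"]) produces three item{} blocks: the background with label 0, fish with label 1, and frog with label 2.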

 

 

Chapter 3: Prepare Training Samples

 

When training with Caffe, we must first decide what to detect, for example people, animals, or cars. To do this, we need to prepare images of people, animals, cars, and so on, and label them one by one (person, animal, car, ...) so that they can be used for training.

 

Step1: Open labelImg.exe.

Open "labelImg.exe" in the Image_Webcam_SSD300_Caffe folder to label the sample categories we want to train.

 


Fig. 6. labelImg.exe interface.

 

Step2: Select the folder that stores the sample images.

Click “Open Dir” and open the folder that contains the sample images. In Fig. 7 the images are in data/VOC2007/JPEGImages (the folder location depends on where you store your images; for first-time use, we suggest replacing the original images in data/VOC2007/JPEGImages with your own). Then press "Select Folder".

 


Fig. 7. Open the location of the sample.

 

Step3: Select the folder for storing the category labels.

Click “Change Save Dir” and open the folder in which the category label files will be saved. In Fig. 8 the label files are saved in data/VOC2007/Annotations (again, the folder location depends on your setup; for first-time use, we suggest saving to data/VOC2007/Annotations and replacing the original label files). Then press "Select Folder".

 

※ Suggestion: Keep the sample image folder and the sample category folder under the same parent folder. That is, one dataset folder contains exactly two subfolders, one for the sample images and one for the sample categories.

 


Fig. 8. Location of folders stored after selecting a sample label category.

 

Step4: Label the bounding boxes.

Taking Fig. 9 as an example, press “Create RectBox”, drag a box around the cat, and then choose the cat category. An image is not limited to one box or one category; you can draw boxes around several objects. After drawing the boxes, press “Save” to store the label file, and then press “Next Image” to label the next sample image. Repeat until all sample images are labeled.

 


Fig. 9. Sample image.

 


Fig. 10. Click Create RectBox to label.

 


Fig. 11. After selecting the sample category press OK.

 


Fig. 12. Box the sample category and press Save.

 

Fig. 13. Click Next Image to go to the next image.
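To check what was saved in Step 4, the label files can be read back with a short script. The sketch below assumes that labelImg saved the labels in its Pascal VOC XML layout (object, name, and bndbox elements with xmin, ymin, xmax, ymax); the sample XML and its coordinates are made up for illustration, and read_boxes is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the assumed Pascal VOC layout written by labelImg.
SAMPLE_XML = """<annotation>
  <filename>000001.jpg</filename>
  <object>
    <name>cat</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>"""


def read_boxes(xml_text):
    """Return (category, xmin, ymin, xmax, ymax) for every labeled box."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        bnd = obj.find("bndbox")
        coords = tuple(
            int(bnd.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((name,) + coords)
    return boxes
```

Running read_boxes on each file in the Annotations folder is a quick way to spot images that were saved without any box or with an unexpected category name.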

 

After you have labeled all the samples, you need to create two files, "trainval.txt" and "test.txt".

 

  1. trainval.txt:

Each line contains the path of a sample image (in the folder selected with Open Dir) and the path of its sample category file (in the folder selected with Change Save Dir), formatted as shown in Fig. 14.

 


Fig. 14. trainval.txt examples of content.

 

In Fig. 14, VOC2007 is the name of the folder that stores the sample images and sample categories.

JPEGImages is the name of the folder where the sample images are stored.

Annotations is the name of the folder where the sample categories are stored.

000001.jpg is the file name of a sample image.

000001.xml is the file name of its sample category file.

So if there are 100 samples, there will be 100 lines in the format of Fig. 14.

 

  2. test.txt:

Each line contains the path of a test image and the path of its category file (xml file), in the same format as trainval.txt.

 

※ Note: trainval.txt and test.txt must be placed at the same level as the folder (VOC2007 in this example) that contains both the sample images and the sample categories.
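Writing these list files by hand is error-prone when there are many samples, so a small script can help. The sketch below is written under two assumptions that should be checked against Fig. 14: each line holds the image path and the label path separated by a single space, and both paths are relative to the folder that contains VOC2007. The helpers list_lines and write_list are hypothetical names.

```python
from pathlib import Path


def list_lines(data_dir, dataset="VOC2007"):
    """Pair each image with its label file and return one list line per pair."""
    data_dir = Path(data_dir)
    image_dir = data_dir / dataset / "JPEGImages"
    label_dir = data_dir / dataset / "Annotations"
    lines = []
    for image in sorted(image_dir.glob("*.jpg")):
        label = label_dir / (image.stem + ".xml")
        if label.exists():  # skip images that have not been labeled yet
            lines.append(
                f"{dataset}/JPEGImages/{image.name} "
                f"{dataset}/Annotations/{label.name}"
            )
    return lines


def write_list(path, lines):
    """Write the lines to a list file such as trainval.txt or test.txt."""
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
```

A typical use is to call list_lines on the data folder, keep most of the lines for trainval.txt, and write the remaining lines to test.txt. How many samples to reserve for testing is your own choice.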

 

 

Chapter 4: Run 1_annoset_to_lmdb.flow to produce lmdb

 

The OpenR8 folder contains the executable "R8.exe", as shown in Fig. 15. Double-click R8.exe to run it.

 

Fig. 15. Execute R8.exe.

 

Click "File" => "Open", enter the solution folder under OpenR8, select the Image_Webcam_SSD300_Caffe folder, and then select 1_annoset_to_lmdb.flow, as shown in Fig. 16 and Fig. 17.

 

Fig. 16. Select 1_annoset_to_lmdb.flow.

 

Fig. 17. Open 1_annoset_to_lmdb.flow.

 

Press “Run” to generate the databases (prerequisite: trainval.txt and test.txt exist, and their contents and paths are correct), and wait until "Press any key to continue..." is displayed. If the program crashes, refer to Chapter 7 - Additional Instructions.

 

The following is an introduction to the function of the process.

 


Fig. 18. 1_annoset_to_lmdb.flow process.

 

  1. Caffe_Init:

Initializes Caffe at the beginning.

  2. File_DeleteDir:

Deletes the old trainval_lmdb folder (training).

 


Fig. 19. 1_annoset_to_lmdb.flow - Delete the trainval_lmdb folder for training.

 

  3. Caffe_ObjectDetect_CreateTrainData:

Reads the sample images and categories listed in trainval.txt to build a database (for training).

 


Fig. 20. 1_annoset_to_lmdb.flow - Establish a database for training.

 

  4. File_DeleteDir:

Deletes the old test_lmdb folder (test).

 

Fig. 21. 1_annoset_to_lmdb.flow - Delete the test_lmdb folder for test.

 

  5. Caffe_ObjectDetect_CreateTrainData:

Reads the sample images and categories listed in test.txt to build a database (for test).

 


Fig. 22. 1_annoset_to_lmdb.flow - Establish a database for test.
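The order of the steps above (delete the old database folder, then build a new one, first for training and then for test) can be summarized in Python. This is a sketch of the control flow only: create_database is a hypothetical callable that stands in for Caffe_ObjectDetect_CreateTrainData, and the file and folder names are assumed to sit directly in the data folder as listed in Table 2.

```python
import shutil
from pathlib import Path


def rebuild_databases(data_dir, create_database):
    """Delete each stale database folder, then rebuild it from its list file."""
    data_dir = Path(data_dir)
    jobs = (
        ("trainval.txt", "trainval_lmdb"),  # training database
        ("test.txt", "test_lmdb"),          # test database
    )
    for list_name, db_name in jobs:
        target = data_dir / db_name
        if target.exists():
            shutil.rmtree(target)  # corresponds to the File_DeleteDir step
        # Hypothetical stand-in for Caffe_ObjectDetect_CreateTrainData.
        create_database(data_dir / list_name, target)
```

Deleting the folder first matters because leftover files from an earlier run could otherwise be mixed with the newly written database.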

 

 

Chapter 5: Run 2_train.flow to train the samples

 

After executing 1_annoset_to_lmdb.flow, open 2_train.flow, as shown in Fig. 23 and Fig. 24.

 

Fig. 23. Select 2_train.flow.

 

Fig. 24. Open 2_train.flow.

 

※ If the computer does not have a GPU, or you do not want to train on the GPU, go to the "GPU" field of "Caffe_Train" and change "all" to "" (leave it empty).

 

Press “Run” to start training (prerequisite: the databases have been generated).

 

  1. Caffe_Init:

Initializes Caffe at the beginning.

  2. Caffe_Train:

Reads the solver.prototxt file (which contains the parameter settings for Caffe) and sets whether to use the GPU. After that, the flow is ready to run, as shown in Fig. 25.

 

Fig. 25. 2_train.flow - Caffe_Train.
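Before training, it can be useful to look at the settings in solver.prototxt. The sketch below reads simple "key: value" lines into a dictionary. The field names in the sample (train_net, test_net, base_lr, max_iter, snapshot, snapshot_prefix, solver_mode) are standard Caffe solver fields, but the values are illustrative assumptions and are not taken from the solver.prototxt shipped with this solution. The helper read_solver is hypothetical and does not handle nested blocks.

```python
# Illustrative values only; open the real solver.prototxt to see its settings.
SAMPLE_SOLVER = """\
train_net: "data/train.prototxt"
test_net: "data/test.prototxt"
base_lr: 0.001
max_iter: 120000
snapshot: 10000
snapshot_prefix: "data/VGG_VOC0712_SSD_300x300"
solver_mode: GPU
"""


def read_solver(text):
    """Parse flat 'key: value' solver lines into a dict of strings."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings
```

For instance, read_solver(SAMPLE_SOLVER)["max_iter"] gives the string "120000", which tells you how many iterations the training run will perform.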

 

 

Chapter 6: Run 3_inference_webcam.flow to detect webcam images

 

After running 2_train.flow, open 3_inference_webcam.flow, as shown in Fig. 26, Fig. 27.

 

Fig. 26. Select 3_inference_webcam.flow.

 

Fig. 27. Open 3_inference_webcam.flow.

 

※ If the computer doesn’t have a GPU, go to the "enableGPU" field of “Caffe_ObjectDetect_ReadNet” and change "all" to "" (nothing is filled).

 

※ If you have retrained the model, go to the “caffeModelPath” field of “Caffe_ObjectDetect_ReadNet” and select the new Caffe model file.
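After retraining there may be several .caffemodel files to choose from. The sketch below picks the one with the highest iteration number. It assumes the files follow the naming pattern seen in the pretrained model name (a prefix, then _iter_, then the iteration count); where your training run saves its files depends on your solver settings, so the folder you pass in is your own choice. The helper latest_caffemodel is hypothetical.

```python
import re
from pathlib import Path

# Assumed naming pattern: <prefix>_iter_<number>.caffemodel
ITER_PATTERN = re.compile(r"_iter_(\d+)\.caffemodel$")


def latest_caffemodel(folder):
    """Return the .caffemodel path with the highest iteration, or None."""
    best = None
    for path in Path(folder).glob("*.caffemodel"):
        match = ITER_PATTERN.search(path.name)
        if match is None:
            continue  # ignore files without an iteration number
        iteration = int(match.group(1))
        if best is None or iteration > best[0]:
            best = (iteration, path)
    return None if best is None else best[1]
```

The returned path is the one to select in the “caffeModelPath” field.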

 

Press “Run” to open the webcam and start object detection, as shown in Fig. 28.

 


Fig. 28. Open webcam object detection.

 

※ To switch to another webcam, change the “deviceNumber” field of “OpenCV_VideoCapture_Open”.

 

The following is a functional introduction to the process:

 

  1. R7_EnableWxWidgets:

Create a window (wxwidgets version).

 

  2. Caffe_Init:

Initialize Caffe.

 

  3. Caffe_ObjectDetect_ReadNet:

CaffeObject: Select the object from "Caffe_Init".

enableGPU: Select whether to use GPU acceleration.

deployPath: The relative path of “deploy.prototxt”.

caffeModelPath: The relative path of the “XXX.caffemodel” file produced by the previous 2_train.flow.

labelPath: The relative path of “predefined_classes.txt”, which stores the names of all categories.

meanFilePath: The file used to calculate the image mean.

 

Fig. 29. 3_inference_webcam.flow - Caffe_ObjectDetect_ReadNet.

 

  4. OpenGL_NewWindow:

Creates a new window for displaying images.

OpenGLWindow: The created window object.

OpenGLWindowTitle: The title displayed on the window.

 


Fig. 30. 3_inference_webcam.flow - OpenGL_NewWindow.

 

  5. OpenGL_ShowWindow:

Fill in the OpenGL window object created in the previous step to display the window.

 

  6. R7_ProcessWxPendingEvents:

Processes pending wxWidgets events.

 

  7. OpenCV_VideoCapture_Init:

Initializes OpenCV VideoCapture, which is used to retrieve the images.

 

  8. OpenCV_VideoCapture_Open:

videoCaptureObject: The object from the previous "OpenCV_VideoCapture_Init".

deviceNumber: Specifies which webcam to use: 0 for the first, 1 for the second, and so on.

 


Fig. 31. 3_inference_webcam.flow - OpenCV_VideoCapture_Open.

 

  9. Loop:

As shown in Fig. 32, an infinite loop is used to keep capturing webcam images.

 


Fig. 32. 3_inference_webcam.flow - Loop.

 

  10. OpenCV_VideoCapture_Grab:

Grabs the current frame from the webcam.

 

  11. OpenCV_VideoCapture_Retrieve:

Retrieves the image grabbed by "OpenCV_VideoCapture_Grab" in the previous step.

 


Fig. 33. 3_inference_webcam.flow - OpenCV_VideoCapture_Retrieve.

 

  12. Caffe_ObjectDetect_InferenceImage:

Fill in the image from the previous “OpenCV_VideoCapture_Retrieve” step in the image field of this function to determine the categories of the objects in the image.

CaffeObject: The Caffe object from step 3, "Caffe_ObjectDetect_ReadNet".

enableGPU: Whether to use GPU acceleration.

image: The image from the previous "OpenCV_VideoCapture_Retrieve" step.

ConfidenceThreshold: The confidence threshold for detection.

 


Fig. 34. 3_inference_webcam.flow - Caffe_ObjectDetect_InferenceImage.

 

  13. Image_DrawRectJson:

If a category is detected in the previous step, its position is output as JSON. Here, the JSON is read in and the boxes are drawn on the image from “OpenCV_VideoCapture_Retrieve” in step 11.

 


Fig. 35. 3_inference_webcam.flow - Image_DrawRectJson.

 

  14. OpenGL_ShowImage:

The result image of the previous "Image_DrawRectJson" step is displayed in the window from step 5, "OpenGL_ShowWindow".

 

Fig. 36. 3_inference_webcam.flow - OpenGL_ShowImage.

 

  15. Loop_End:

The loop body is: retrieve the image => detect the categories => display the image. So after the image is displayed, a "Loop_End" is needed to mark where the loop ends.

 

  16. OpenCV_VideoCapture_Release:

Closes the webcam.
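The loop part of the flow (grab, retrieve, detect, draw, show, and finally release the webcam) can be sketched in Python to make the order of the steps easier to follow. Everything here is a stand-in: camera, infer, draw, and show are hypothetical objects supplied by the caller, and the JSON layout (a list of objects with "label" and "confidence" fields) is an assumption, since this document does not specify the JSON format. In the flow itself the threshold is a parameter of Caffe_ObjectDetect_InferenceImage; it is shown as a separate function here only for clarity.

```python
import json


def keep_confident(detections_json, threshold):
    """Keep detections whose confidence reaches the threshold (assumed schema)."""
    detections = json.loads(detections_json)
    return [d for d in detections if d["confidence"] >= threshold]


def run_loop(camera, infer, draw, show, threshold, max_frames=None):
    """Grab -> retrieve -> detect -> draw -> show, then release the camera."""
    frames = 0
    try:
        while max_frames is None or frames < max_frames:
            if not camera.grab():        # like OpenCV_VideoCapture_Grab
                break
            image = camera.retrieve()    # like OpenCV_VideoCapture_Retrieve
            boxes = keep_confident(infer(image), threshold)
            show(draw(image, boxes))     # like Image_DrawRectJson + OpenGL_ShowImage
            frames += 1
    finally:
        camera.release()                 # like OpenCV_VideoCapture_Release
    return frames
```

The try/finally block reflects the last step of the flow: the webcam is released even if detection fails partway through.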

 

 

Chapter 7: Additional Instructions

 

Here are some of the issues you might encounter:

 

Before the database is generated:

  1. Before you open labelImg.exe, decide the names of the categories you want to classify and fill them in “predefined_classes.txt”, located in the “data” folder.

Example: Suppose you want to distinguish five categories: fish, frogs, snakes, sparrows, and people.

The content of predefined_classes.txt is then as shown in Fig. 37.

 

※ Category names must not contain spaces. If a name contains spaces, such as Passer montanus saturatus, use underscores (Passer_montanus_saturatus) or another naming method instead.

 


Fig. 37. In predefined_classes.txt modify category name.

 

  2. After the categories are labeled, check whether trainval.txt and test.txt have been created.
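The rule about spaces in category names from the first item above can be applied automatically before the names are written to predefined_classes.txt. The sketch below replaces runs of spaces with underscores and drops empty or repeated names. It assumes the file holds one category name per line; normalize_class_names is a hypothetical helper, and dropping duplicates is an extra convenience that is not required by this document.

```python
def normalize_class_names(names):
    """Replace spaces with underscores and drop empty or repeated names."""
    cleaned = []
    for name in names:
        name = "_".join(name.split())  # also trims leading/trailing spaces
        if name and name not in cleaned:
            cleaned.append(name)
    return cleaned
```

For example, "Passer montanus saturatus" becomes "Passer_montanus_saturatus". Joining the result with newlines gives the text to save in predefined_classes.txt.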

 

When the database is generated:

  1. If the image or label file paths in trainval.txt and test.txt are filled in incorrectly, the database cannot be generated. For the path format, see Fig. 14.

  2. Check whether the categories in “labelmap_voc.prototxt” have been updated. Example:

Taking Fig. 37 as an example, the content of labelmap_voc.prototxt needs to be changed to that of Fig. 38.

 

Fig. 38. Modify category names in labelmap_voc.prototxt.

 

  3. If you train on a large number of images or on large images, the resulting database will be relatively large. If the process crashes midway, free up disk space before producing the database again (for example, more than 10,000 PNG files produce a database of more than 1 GB).
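Incorrect paths in trainval.txt or test.txt, the first problem in the list above, can be found before running the flow. The sketch below reports every line that does not hold exactly two paths or that points to a missing file. It assumes the two paths on each line are separated by whitespace and are relative to the folder that contains the list file, so it will not work for paths that contain spaces; find_bad_lines is a hypothetical helper.

```python
from pathlib import Path


def find_bad_lines(list_file, root=None):
    """Return (line_number, problem) pairs for lines that would fail."""
    list_file = Path(list_file)
    root = Path(root) if root is not None else list_file.parent
    problems = []
    text = list_file.read_text(encoding="utf-8")
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        parts = line.split()
        if len(parts) != 2:
            problems.append((number, "expected two paths"))
            continue
        for part in parts:
            if not (root / part).is_file():
                problems.append((number, f"missing file: {part}"))
    return problems
```

An empty result means every listed image and label file was found.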

 

When training samples:

  1. Check whether trainval_lmdb and test_lmdb have been produced.

  2. Check whether your computer can use the GPU for training. If it cannot, an error message like Fig. 39 appears.

 


Fig. 39. No GPU error message.

 

When detecting objects using the webcam:

  1. Check whether the computer can use GPU acceleration. If it cannot, the error message looks like Fig. 39.
