[OpenR8 solution] Image_OCR_Caffe_FC (Optical character recognition of handwritten digits using Caffe)
Chapter 1: Image_OCR_Caffe_FC Introduction

 

Image_OCR_Caffe_FC is a solution that uses Caffe to recognize MNIST handwritten digits: given an image, it identifies the digit the image contains, as shown in Fig. 1.

 

This document first shows how to train the model, then how to choose a test image and view the result, as shown in Fig. 2.

 

Fig. 1. The goal of the solution.

 

Fig. 2. Solution flow.

 

※ If you are interested in deep learning, see the other deep learning documents in “Open Robot Club”.

Chapter 2: Image_OCR_Caffe_FC Folder Introduction

 

Image_OCR_Caffe_FC is located in the solution folder of OpenR8, as shown in Fig. 3. It contains one folder and two flow files, as shown in Fig. 4.

 

  1. Folder: the “data” folder stores the files that Caffe uses.
  2. Flow files: “1_train.flow” is used for training, and “2_inference.flow” is used for testing images.

 

Fig. 3. Image_OCR_Caffe_FC location.

 

Fig. 4. Image_OCR_Caffe_FC folder.

 

The contents of the “data” folder are as follows.

Item in the data folder: Use

test_lmdb folder: Holds the test sample database.

trainval_lmdb folder: Holds the training sample database.

image folder: Stores the test images.

deploy.prototxt: Defines the neural network structure that is read for inference.

label.txt: Lists the category labels, the digits 0 to 9.

solver.prototxt: Caffe's parameter settings.

train.prototxt: Defines the neural network structure that is read for training.

FC_28x28_iter_10000.caffemodel: Trained Caffe model.

FC_28x28_iter_10000.solverstate: Solver state saved by training.
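Before running either flow, it can help to confirm that the data folder still holds the entries listed above. The following is a minimal sketch, not part of OpenR8: the function name `missing_entries` is hypothetical, and the two `FC_28x28_iter_10000` files are left out of the check on the assumption that training can regenerate them.

```python
from pathlib import Path

# Entries named in the listing above. The trained model and solver state
# are not checked here because training writes them (an assumption).
EXPECTED_DIRS = ["test_lmdb", "trainval_lmdb", "image"]
EXPECTED_FILES = ["deploy.prototxt", "label.txt", "solver.prototxt", "train.prototxt"]


def missing_entries(data_dir):
    """Return the expected folders and files that are absent from data_dir."""
    root = Path(data_dir)
    missing = [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (root / name).is_file()]
    return missing
```

An empty list means nothing from the listing is missing; any names returned point to files that were moved or renamed.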

 

※ If you are using Caffe for the first time and are unfamiliar with its settings, it is recommended that you do not change the file locations or names.

 

 

Chapter 3: Caffe training

 

Click “File” => “Open”, go to the solution folder under OpenR8, open the Image_OCR_Caffe_FC folder, and select 1_train.flow, as shown in Fig. 5 and Fig. 6.

 

Fig. 5. Select 1_train.flow.

 

Fig. 6. Open 1_train.flow.

 

Press “Run” to start training, as shown in Fig. 7.

 

Fig. 7. Run the 1_train.flow solution.

 

After training is complete, continue to the next chapter, which explains how to select an image and view the result.

 

※ The database for Image_OCR_Caffe_FC has been generated in advance. If you need to add training samples, generate your own database.

 

※ If the no-GPU error message shown in Fig. 8 appears, clear the value of DeviceMode so that it is empty.

 

Fig. 8. No GPU error message.

 

 

Chapter 4: Select an image to see the results

 

Click “File” => “Open”, go to the solution folder under OpenR8, open the Image_OCR_Caffe_FC folder, and select 2_inference.flow, as shown in Fig. 9 and Fig. 10.

 

Fig. 9. Select 2_inference.flow.

 

Fig. 10. Open 2_inference.flow.

 

Select an image to test.

 

Fig. 11. Select a test sample.

 

Press “Run” to view the results.

 

Fig. 12. Run the 2_inference.flow solution.

 

Fig. 13. Description of running results.

 

Fig. 14. Test result for a handwritten digit.

 

※ An image that is too large loses some accuracy. If you test with your own image, make sure it is not too large.
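One plausible reason for this note is that the image has to be reduced to the network's input size before inference. The sketch below assumes a 28x28 input (inferred from the model file name and from MNIST) and nearest-neighbor sampling; how OpenR8 actually resizes images is not stated in this document, and `resize_nearest` is a hypothetical helper.

```python
def resize_nearest(pixels, out_h=28, out_w=28):
    """Shrink a grayscale image (list of rows) by nearest-neighbor sampling."""
    in_h, in_w = len(pixels), len(pixels[0])
    return [
        [pixels[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def vertical_stroke(size, first_col, width):
    """Build a size x size black image with one white vertical stroke."""
    return [
        [255 if first_col <= c < first_col + width else 0 for c in range(size)]
        for _ in range(size)
    ]


# In a 280x280 image only every 10th column is sampled, so a stroke that is
# 1 pixel wide can vanish, while a stroke that is 10 pixels wide survives.
thin = resize_nearest(vertical_stroke(280, 143, 1))
thick = resize_nearest(vertical_stroke(280, 140, 10))
```

Under these assumptions, fine strokes in a large image are the first thing to be lost, which is consistent with the advice to keep test images small.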

 

※ If the no-GPU error message shown in Fig. 15 appears, clear the value of DeviceMode so that it is empty.

 

Fig. 15. No GPU error message.

 

 

Chapter 5: 1_train.flow

 

  1. Caffe_Init:

Initializes Caffe.

  2. Caffe_Train:

Reads the solver.prototxt file (which contains parameter settings for Caffe) and sets whether a GPU is used. Once both are set, the flow is ready to run, as shown in Fig. 16.

 

 

Fig. 16. 1_train.flow - Caffe_Train.
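For readers who want to relate these two steps to Caffe's own Python interface, the following is a rough sketch of the equivalent calls. It assumes pycaffe is installed, and it is not a description of how OpenR8 implements Caffe_Train. The helper `snapshot_files` and the prefix "FC_28x28" are assumptions inferred from the file names in the data folder.

```python
def snapshot_files(prefix, iteration):
    """Names of the model and solver-state pair Caffe writes at an iteration."""
    stem = f"{prefix}_iter_{iteration}"
    return f"{stem}.caffemodel", f"{stem}.solverstate"


def train(solver_path, use_gpu=False):
    """Train from a solver file with pycaffe (assumed to be installed)."""
    import caffe  # imported lazily so the helper above works without Caffe

    # Equivalent of choosing whether a GPU is used.
    if use_gpu:
        caffe.set_mode_gpu()
    else:
        caffe.set_mode_cpu()

    solver = caffe.get_solver(solver_path)  # reads solver.prototxt
    solver.solve()  # runs until the iteration limit set in the solver file


print(snapshot_files("FC_28x28", 10000))
# prints ('FC_28x28_iter_10000.caffemodel', 'FC_28x28_iter_10000.solverstate')
```

Calling `train("data/solver.prototxt")` would start training on the CPU; the path is illustrative and depends on where the solution folder is located.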

 

 

Chapter 6: 2_inference.flow

 

 

  1. Image_Open:

Opens the image you want to test.

imageFileName: The image file to open.

image: Outputs the image that was read.

 

Fig. 17. 2_inference.flow - Image_Open.

 

  2. Caffe_Init:

Initializes Caffe.

 

Fig. 18. 2_inference.flow - Caffe_Init.

 

  3. Caffe_ReadModel:

CaffeObject: Select the same object as “Caffe_Initialize”.

GPU: Whether to use GPU acceleration.

deployPath: Path to the “deploy.prototxt” file (in the data folder).

caffeModePath: Path to the “FC_28x28_iter_10000.caffemodel” file (in the data folder), which is generated by running 1_train.flow.

labelPath: Path to the “label.txt” file (in the data folder), which holds the names of all categories.

meanFilePathOrMeanValueList: The mean file path, or the list of mean values, used for the image mean.

 

Fig. 19. 2_inference.flow - Caffe_ReadModel.
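In Caffe pipelines, an image mean is commonly subtracted from the input before inference. The sketch below illustrates that idea for a single-channel image. The comma-separated format of the mean value list is an assumption, not the documented OpenR8 format, and both function names are hypothetical.

```python
def parse_mean_values(text):
    """Parse a comma-separated mean value list such as "104,117,123"."""
    return [float(part) for part in text.split(",") if part.strip()]


def subtract_mean(pixels, mean):
    """Subtract one mean value from every pixel of a single-channel image."""
    return [[value - mean for value in row] for row in pixels]


# A grayscale digit image has one channel, so one mean value is enough.
mean = parse_mean_values("33")[0]
centered = subtract_mean([[0, 255], [33, 66]], mean)
```

Whichever form is supplied, the same mean has to be used at inference time as was used during training.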

 

  4. Caffe_InferenceImage:

Passes the image output by “Image_Open” to Caffe, which infers the category the image belongs to.

CaffeObject: Select the same object as “Caffe_Initialize”.

inferenceLabel: Outputs the category the image belongs to (here, a digit from 0 to 9).

inferenceProbabilty: Outputs the similarity score.

GPU: Whether to use GPU acceleration.

image: The image output by “Image_Open”.

 

Fig. 20. 2_inference.flow - Caffe_InferenceImage.
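The relationship between the two outputs can be sketched as follows: the network produces one score per category, the scores are turned into probabilities, and the highest one gives both the label and its score. This assumes a softmax over the ten outputs, which this document does not confirm for OpenR8. The scores below are made up, and `top_prediction` is a hypothetical helper.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum keeps exp() from overflowing
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(scores, labels):
    """Return the label with the highest probability and that probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


labels = [str(d) for d in range(10)]  # the categories label.txt is said to hold
scores = [0.1, 0.2, 0.1, 0.3, 0.1, 0.2, 0.1, 4.0, 0.2, 0.1]  # made-up values
label, prob = top_prediction(scores, labels)
```

Here `label` corresponds to inferenceLabel and `prob` to the similarity score; with these made-up scores the label is "7".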

 

  5. Image_Show:

Displays the image read by “Image_Open”.

image: Select the image output by “Image_Open”.

displayPercentage: The zoom percentage of the displayed image. Defaults to 100 (100%, normal size) when left empty.

windowTitle: The title of the display window. No title is shown when left empty.

 

Fig. 21. 2_inference.flow - Image_Show.

 

  6. Print:

Prints "InferenceString =" in the DOS window.

 

Fig. 22. 2_inference.flow - Print.

 

  7. Println:

Prints the category in the DOS window, followed by a line break.

 

Fig. 23. 2_inference.flow - Println.

 

  8. Print:

Prints " InferenceProbability =" in the DOS window.

 

Fig. 24. 2_inference.flow - Print.

 

  9. Println:

Prints the similarity score in the DOS window, followed by a line break.

 

Fig. 25. 2_inference.flow - Println.
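The four printing steps above alternate between a print without a line break and a print with one. A minimal sketch of the same sequence follows; the function name `report` is hypothetical and the exact spacing OpenR8 produces is not specified here, so the strings are copied from the step descriptions as they appear.

```python
import sys


def report(label, probability, out=sys.stdout):
    """Write the label and score the way the Print/Println steps are described."""
    out.write("InferenceString =")        # Print: text only, no line break
    out.write(f"{label}\n")               # Println: value, then a line break
    out.write(" InferenceProbability =")  # Print: text only, no line break
    out.write(f"{probability}\n")         # Println: value, then a line break


report("7", 0.98)
```

Each result therefore occupies two lines in the console window, one for the category and one for the score.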

 

  10. WaitKey:

WaitKey must be added after Image_Show for the image to be visible. It also sets how many milliseconds the image stays on screen.

keyCode: The output signal.

milliSencond: To keep the image open until any key is pressed, set 0. To display the image for 1 second, set 1000.

※ 1 second = 1000 milliseconds.

 

Fig. 26. 2_inference.flow - WaitKey.
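The value to enter for milliSencond follows directly from the conversion above. The helper below is a small hypothetical sketch of that arithmetic; it is not an OpenR8 function.

```python
def wait_argument(seconds=None):
    """Value for milliSencond: 0 waits for a key press, otherwise milliseconds."""
    if seconds is None:
        return 0  # keep the image open until any key is pressed
    # Never round a positive duration down to 0, which would wait for a key.
    return max(1, int(round(seconds * 1000)))


print(wait_argument())     # prints 0
print(wait_argument(1))    # prints 1000
print(wait_argument(0.5))  # prints 500
```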

 

  11. Image_DestoryAllWindows:

Closes all image windows displayed by Image_Show.

