Case Study – problem statement:
Many problems in computer vision have been getting solved in recent years with the advances in deep learning techniques. One is image classification, where we have seen good progress, and object detection to some extent. But the problem becomes more complex when we have custom objects related to our business domain, such as medical images or security-related images for malware detection. In such cases it is very useful to have a model that can detect our own set of objects. So, let's start.
All these steps are specific to Windows OS
Install the Python3+ version of Anaconda with the link below
Install Tensorflow using the command below
- conda install -c conda-forge tensorflow=1.14
Install all the other dependencies:
- pip install --user Cython
- pip install --user contextlib2
- pip install --user pillow
- pip install --user lxml
- pip install --user jupyter
- pip install --user matplotlib
Create a folder named Project (or any other name). We will place all the files we download in it and use this directory throughout.
Add this directory to the Python environment variables, otherwise python will not be recognised as an internal or external command.
Download Git/Git bash for Windows
Now open Git Bash, go to the Project directory, and clone the TensorFlow models repository from GitHub.
After cloning, a folder named models-master appears. For our convenience, rename it to models.
2.Download the Protocol Buffers:
Select the appropriate zip file for your 32-bit or 64-bit Windows version. If you are on 64-bit, download the file named “protoc-3.11.2-win64.zip”; if you are on 32-bit, download “protoc-3.11.2-win32.zip”.
Extract the protobuf in the same folder.
Now, in the command prompt, go to the research directory inside the models folder; you should be able to see the object_detection folder there. Then run the Protobuf compiler, giving the location of the “protoc.exe” (or “protoc”) binary, which is in the bin folder you got after extracting the protoc zip file.
C:\Users\Akhilesh\Desktop\Project\models\research>C:\Users\Akhilesh\Desktop\Project\bin\protoc object_detection\protos\*.proto --python_out=.
Add libraries to the Python Path
The models/research & models/research/slim directories should be added to the Python path. Open the System environment variables, select New, add a new variable named PYTHONPATH containing all these paths, and save it.
To make sure, repeat the same in the command prompt. Open the command prompt and set the paths:
- set PYTHONPATH=C:\Users\Akhilesh\Desktop\Project\models;C:\Users\Akhilesh\Desktop\Project\models\research;C:\Users\Akhilesh\Desktop\Project\models\research\slim
- set PATH=%PATH%;%PYTHONPATH%
To check the path is set correctly you can use echo:
- echo %PYTHONPATH%
In the same folder, now create a folder named images and put your entire image collection in it.
Install the tool LabelImg so that we can label our images and have the dataset ready for training & testing. Use the GitHub link below.
After installing, open LabelImg using the command
- python labelImg.py
Now open the folder containing our images to be labelled
Select the PascalVOC format. Here you begin to annotate with the Create RectBox button: draw your box, add the object name, and hit OK. Save the image (Ctrl+S), hit Next Image, and repeat the process.
After annotating all the images, take 80% of the XML files for training and 20% for testing. Create train & test folders inside the images directory and place the train and test XML files in their respective folders, making sure each image stays alongside its XML file as before.
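The 80/20 split described above can be sketched in a few lines of Python. This is an illustrative helper (the function name, ratio default, and fixed seed are my own choices, not from any TensorFlow script):

```python
import random

def split_dataset(xml_files, train_ratio=0.8, seed=42):
    """Shuffle the annotation file names and split them into train/test lists."""
    files = list(xml_files)
    random.Random(seed).shuffle(files)  # fixed seed keeps the split reproducible
    cut = int(len(files) * train_ratio)
    return files[:cut], files[cut:]
```

You would then copy each XML file (and its matching image) into the train or test folder according to these lists.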
Take the xml_to_csv script from the link below and change it accordingly.
This script should be placed in a directory from which it can access the images folder; otherwise, change the paths for images as well as train & test inside it. Also create a folder named “training” in the models/research/object_detection directory.
Create a folder named “data” alongside the images folder, in the same parent directory.
Now run the script using the command
- python xml_to_csv.py
If the above command runs successfully, you should now see the train_labels.csv & test_labels.csv files in the data folder.
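At its core, the xml_to_csv script walks the PascalVOC XML files written by LabelImg and emits one CSV row per bounding box. A minimal sketch of that parsing step (the function name is illustrative; the column order matches the common filename/width/height/class/xmin/ymin/xmax/ymax layout):

```python
import xml.etree.ElementTree as ET

def xml_to_rows(xml_string):
    """Extract one CSV row per bounding box from a PascalVOC annotation."""
    root = ET.fromstring(xml_string)
    filename = root.find("filename").text
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)
    rows = []
    for obj in root.findall("object"):  # one <object> element per labelled box
        box = obj.find("bndbox")
        rows.append((filename, width, height, obj.find("name").text,
                     int(box.find("xmin").text), int(box.find("ymin").text),
                     int(box.find("xmax").text), int(box.find("ymax").text)))
    return rows
```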
Now we need to convert these CSV files (together with the images) into TFRecord files using the script given in the link below.
Replace the label map with your custom object names:
You can add object names with different row label names according to your dataset.
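In the generate_tfrecord.py script this mapping usually lives in a class_text_to_int function. A sketch with two placeholder class names (“cat” and “dog” are examples only; replace them with your own labels, and keep the IDs in step with your label map):

```python
def class_text_to_int(row_label):
    """Map a class name from the CSV to its numeric class ID."""
    if row_label == 'cat':      # placeholder class 1
        return 1
    elif row_label == 'dog':    # placeholder class 2
        return 2
    else:
        return None             # unknown labels are skipped
```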
Execute these commands below
- python generate_tfrecord.py --csv_input=data/train_labels.csv --output_path=data/train.record
- python generate_tfrecord.py --csv_input=data/test_labels.csv --output_path=data/test.record
Once done, let us look at the folder structure:
images
—-train (train images & XML files)
—-test (test images & XML files)
The images should be present in the images folder, and the respective images & XML files in the train & test subfolders.
Copy the “data” folder and paste it into the object_detection folder under models/research. Along with this, copy a few of the images you would like to test into the “test_images” folder, which is in the “models/research/object_detection” directory.
4.Create Label Map:
The label map tells the trainer what each object is by defining a mapping from class names to class ID numbers. Use a text editor to create a new file and save it as labelmap.pbtxt. Make sure it is not saved as a plain “.txt” file; it has to be a “.pbtxt” file. The content has to be in this format. Place this file in the models/research/object_detection/training folder.
The ID numbers and names in this label map must match the ones we defined in the generate_tfrecord.py file.
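For example, with the same two placeholder classes used earlier (“cat” and “dog” are illustrative names; IDs start from 1, since 0 is reserved for the background), labelmap.pbtxt would look like:

```
item {
  id: 1
  name: 'cat'
}
item {
  id: 2
  name: 'dog'
}
```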
TensorFlow provides pre-trained models with checkpoint files and configuration files in its model zoo. Download the Faster R-CNN Inception V2 COCO model.
Now extract the downloaded model into the models/research/object_detection directory, and place its config file in the training folder we created earlier. For our setup the file is named “faster_rcnn_inception_v2_coco.config”.
Then open this Faster R-CNN config file with a text editor; several changes need to be made in it. In line 10, change num_classes to your number of classes. If you have 4 object classes in total, replace it with 4, and so on.
Now go to the fine_tune_checkpoint line, just below “gradient_clipping_by_norm”. We are going to change this path: point it at the model.ckpt file inside the extracted Faster R-CNN folder in the models/research/object_detection directory. The path has to be in double quotes (“ ”), not single quotes (‘ ’).
Next, in lines 122/123, change the input_path to point to the train.record file in the data folder.
In line 124 we have label_map_path; change it to the path of the labelmap.pbtxt file in the training folder.
Next, change lines 136 & 138 to point to test.record and labelmap.pbtxt respectively.
By default the model will train for 20k steps; if you want to train for a different number, change num_steps in line 113 to your desired number and save the file.
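Putting the edits above together, the relevant fields of the config end up looking roughly like this (only the changed fields are shown, line numbers are approximate, and the Windows paths here are placeholders for your own):

```
num_classes: 4        # your number of object classes
num_steps: 20000      # total training steps

fine_tune_checkpoint: "C:/Users/Akhilesh/Desktop/Project/models/research/object_detection/faster_rcnn_inception_v2_coco_2018_01_28/model.ckpt"

train_input_reader: {
  tf_record_input_reader {
    input_path: "C:/Users/Akhilesh/Desktop/Project/models/research/object_detection/data/train.record"
  }
  label_map_path: "C:/Users/Akhilesh/Desktop/Project/models/research/object_detection/training/labelmap.pbtxt"
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "C:/Users/Akhilesh/Desktop/Project/models/research/object_detection/data/test.record"
  }
  label_map_path: "C:/Users/Akhilesh/Desktop/Project/models/research/object_detection/training/labelmap.pbtxt"
}
```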
Now that everything is configured successfully, we are good to go!
6.Run the training:
In the command prompt, navigate to the models/research/object_detection directory and execute the command below
python legacy/train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_coco.config
If everything is set up correctly, TensorFlow will initialize the training. Initialization can take up to 30 seconds before the actual training begins. When the training starts it looks like this.
Each training step reports the loss. It starts high and gradually decreases as the number of iterations increases. Ideally the loss should settle around 0.1-0.5, so train the model until the loss gets very near this range. It will not always get there: if, after a certain number of steps, the loss remains roughly constant for 100-200 iterations, we can assume it has reached a local minimum and stop the training.
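The stopping rule described above can be sketched as a simple plateau check on the reported losses. This is an illustrative helper, not part of the TensorFlow scripts; the window size and tolerance are assumptions you would tune:

```python
def has_plateaued(losses, window=200, tolerance=0.01):
    """Return True when the loss stayed within `tolerance` of its mean
    over the last `window` reported steps."""
    if len(losses) < window:
        return False
    recent = losses[-window:]
    mean = sum(recent) / window
    return all(abs(x - mean) <= tolerance for x in recent)
```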
7.Export Inference Graph:
After training, the three most recent checkpoints are saved in the training folder. We can use any one of them; ideally, we take the last one. If we trained for around 2133/2134 steps, we take the checkpoint files for step 2133. There should be three files with the 2133 number: the .meta, .data, and .index files.
All three files have to be present; otherwise, choose an earlier checkpoint such as 2132 or 2131.
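Picking the newest checkpoint that has all three files can be automated. A hypothetical sketch (the function name is mine; it only inspects a directory listing):

```python
import re

def latest_complete_checkpoint(filenames):
    """Return the highest step number whose .meta, .index, and .data
    files all appear in the training folder listing, or None."""
    steps = {}
    for name in filenames:
        m = re.match(r"model\.ckpt-(\d+)\.(meta|index|data)", name)
        if m:
            steps.setdefault(int(m.group(1)), set()).add(m.group(2))
    complete = [s for s, kinds in steps.items() if kinds == {"meta", "index", "data"}]
    return max(complete) if complete else None
```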
Now create a new folder named “Output”; the .pb file will be saved in this folder. Next, point the export script at this model.ckpt file to generate the frozen inference graph (.pb file). From the object_detection folder, execute this command.
- python export_inference_graph.py --input_type image_tensor --pipeline_config_path training/faster_rcnn_inception_v2_coco.config --trained_checkpoint_prefix training/model.ckpt-2133 --output_directory Output/mymodel_inference_graph
Now the frozen graph .pb file is present in the “Output” folder. This .pb file is our classifier.
8.Testing the object detection classifier:
The “object_detection_tutorial” notebook, present in the object_detection folder, has to be customised to work in our environment. With the help of this notebook we will test the model on our test set.
Use only the pieces shown below from the notebook, and add the new inference function that runs under “detection_graph.as_default()”. The function shown below has to be written by hand; it is not present in the tutorial notebook.
Now, if we execute all these steps, we should be able to see an output image with the objects detected on it, as well as a count of the objects successfully detected with a confidence threshold set to 70%.
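The counting step is just a thresholded tally over the detection scores the model returns. A minimal sketch (the function name is illustrative; in the notebook the scores come out of the session run as a float array):

```python
def count_detections(scores, threshold=0.7):
    """Count detections whose confidence score meets the 70% threshold."""
    return sum(1 for s in scores if s >= threshold)
```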
We have built an object detection system that can be applied across various fields involving the detection of multiple custom objects using recent deep learning techniques, and it can also serve as a pre-processing step in a larger pipeline.
Akhilesh Gandhe (Data Scientist)