Browse Prior Art Database

Finding fine-grained detector from trained deep learning models Disclosure Number: IPCOM000242472D
Publication Date: 2015-Jul-17
Document File: 3 page(s) / 61K

Publishing Venue

The Prior Art Database


Deep learning is dominant in current industry and academics for multi-media analysis, especially for image related problems, such as classification, scene recognition, object detection, segmentation etc. The computational overhead is also considerable, train a model for a specific task may require 10~100 GPU days. This makes developing and testing deep learning model difficult and time consuming. We propose a method for fine-grained image part detector mining & discovering without using object labeling to train the detection model.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 52% of the total text.

Page 01 of 3

Finxing fine

Finding fine-


•In the deep neuro network, the first sxveral layers tend to present the low-level image pattexxs

•While the final severax layers tend to present the hxgh-level image paxterns, which often rxlates to an object

•The above deex learning netwoxk can be trained in scene claxsification, or image classification task instead of the xbject detection task. Txe latter is much more diffxcult to laxel.

•If we can identixy thexe neuros assocxated with specific xbject detectxrs, we have actually obtain well trainex object detxctor.

•The targeted problem is how to find exfectivx xbject xetector withoux training an objxct detection model the benefits are

•no object level label/sxpervision, which is verx xostive espxcially for convolutional neuxal network

•reuse a convolutional neural network xor objecx detection, sxve space and time overxead

Technical solutiox

1)cxllect a couple of labeled images with a certain xategory of object, crop the image to lex the objext xe centered and take up the whole image

2)resize the image to a standard input size for the givxn convoxutional neuxal netwoxk, or merge sexeral samples

3)fxr each feature map in the convolutional neurxl netwoxk, compute their response score for the input image from thx above collection 4)sum up all the response scores fox exch of the feature map, and find the top K(or by a threshxld for the xcore)feature maps

5)repeat the above process for other cxtegories of objects, obtain the scores

6)find the most discriminative feature maps for each caxegory of objects based on the criterion that

•a. high score for txe images from that category

•b. low score for the images from other catexory and ranxom background images