# Which object detection model gives the best results on text images when speed is not a concern?

I want to develop a model that crops the equations out of maths questions, because people like me struggle a lot doing this manually for research purposes. Is this possible? And if so, which of the available object detection models will produce the best results on text images?

The options I know of are TensorFlow’s Object Detection API, R-CNN, Fast R-CNN, Faster R-CNN, and YOLO (v1 to v5).

If there is any other, please suggest it. What I want to do is detect the gray areas containing the equations in this image.

Note: The grey region shown in the image is just for demonstration. My actual images are simple cropped questions from books, with a white background and black letters (in most of the books).

Cross Validated Asked on November 21, 2021

Note that there are two problems in this case: segmentation and classification. A neural net might be a solution for both steps, because you can easily generate zillions of labelled images. Nevertheless, a classic approach should yield comparable results with much less effort:

1. Use a simple page segmentation algorithm like runlength smearing or bounding box merging to segment the image into regions.
2. Classify each region with an arbitrary classifier. You can use a NN on all normalized input pixels for this, but other classifiers like kNN should also work with gradient histograms as features (the gradients are computed on quasi-grayscale images, which are generated from the onebit images by blurring). Gradient histograms were the state-of-the-art features before the renaissance of neural nets.
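Step 2 can be illustrated with a minimal NumPy sketch. It is not the answerer's code: the function names `box_blur`, `gradient_histogram`, and `knn_predict` are hypothetical, the box blur merely stands in for whatever blurring is preferred, and the choice of 8 orientation bins is arbitrary.

```python
import numpy as np

def box_blur(img, k=3):
    """Blur a 2-D array with a k x k mean filter (edges padded by replication)."""
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def gradient_histogram(onebit, bins=8):
    """Histogram of gradient orientations, weighted by gradient magnitude."""
    gray = box_blur(onebit)            # quasi-grayscale image from the onebit image
    gy, gx = np.gradient(gray)         # derivatives along rows and columns
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)           # orientation in [-pi, pi]
    hist, _ = np.histogram(ang, bins=bins, range=(-np.pi, np.pi), weights=mag)
    total = hist.sum()
    return hist / total if total > 0 else hist

def knn_predict(train_feats, train_labels, feat, k=3):
    """Majority vote among the k training features closest to feat."""
    dists = np.linalg.norm(train_feats - feat, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(train_labels[nearest], return_counts=True)
    return labels[np.argmax(counts)]
```

In practice each segmented region would first be scaled to a common size, and the training set would hold features of regions labelled as, say, "equation" or "text".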

Out of curiosity, I tried step one with the Python library Gamera (gamera.sf.net) using the following code:

    from gamera.core import *
    init_gamera()

    # load the scanned question; "question.png" is a placeholder file name
    img = load_image("question.png")
    img = img.to_onebit()

    img.remove_border()
    segments = img.runlength_smearing()

    # now you could process each segment (e.g. saving it to a file)
    for seg in segments:
        pass  # do some stuff

    # visualize the result
    color_ccs = img.graph_color_ccs(segments)
    color_ccs.save_PNG("segments.png")


The result looks reasonable to me (note that the colors only indicate the segmentation, with adjacent segments having different colors).
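For readers without Gamera, the idea behind runlength smearing can be sketched in plain NumPy. This is an illustrative reimplementation of the classic three-step scheme (horizontal smear, vertical smear, logical AND, then a final horizontal smoothing), not Gamera's code. The names `smear_rows`, `rlsa`, and `bounding_boxes` and the gap parameters are assumptions made for this example.

```python
import numpy as np

def smear_rows(binary, max_gap):
    """Fill background runs of length <= max_gap lying between two foreground pixels of a row."""
    out = np.array(binary, dtype=bool)  # independent copy
    for row in out:                     # each row is a view into out
        fg = np.flatnonzero(row)
        for a, b in zip(fg[:-1], fg[1:]):
            if 1 < b - a <= max_gap + 1:
                row[a:b] = True
    return out

def rlsa(binary, h_gap, v_gap, sm_gap):
    """Smear horizontally and vertically, AND the results, then smooth horizontally."""
    binary = np.asarray(binary, dtype=bool)
    horiz = smear_rows(binary, h_gap)
    vert = smear_rows(binary.T, v_gap).T
    return smear_rows(horiz & vert, sm_gap)

def bounding_boxes(mask):
    """Return (top, left, bottom, right), inclusive, for each 4-connected component."""
    h, w = mask.shape
    seen = np.zeros(mask.shape, dtype=bool)
    boxes = []
    for y0, x0 in zip(*np.nonzero(mask)):
        if seen[y0, x0]:
            continue
        stack = [(y0, x0)]
        seen[y0, x0] = True
        top, left, bottom, right = y0, x0, y0, x0
        while stack:                    # iterative flood fill
            y, x = stack.pop()
            top, bottom = min(top, y), max(bottom, y)
            left, right = min(left, x), max(right, x)
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    stack.append((ny, nx))
        boxes.append((int(top), int(left), int(bottom), int(right)))
    return boxes
```

Each box can then be used to crop the region from the original image, for example `page[top:bottom + 1, left:right + 1]`. Suitable gap values depend on the scan resolution and have to be tuned per data set.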

Answered by cdalitz on November 21, 2021
