It's not too hard: while they share some mechanics, the underlying use cases and requirements are very different.

_______ Optical character recognition:

1. You have a set of predefined patterns of interest which are well-known.

2. You're trying your best to find all occurrences of those patterns. If a letter appears only once, you still need to detect it.

3. You don't care much about visual similarity within a category. The letter "B" written in extremely different fonts is the same letter.

4. You care strongly about the boundaries between categories. For example, "B+" must resolve to two known characters in sequence.

5. You want to keep details of exactly where something was found, or at the least in what order they were found. You're creating a layer of new details, which may be added to the artifact.
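To make the list concrete, here's a minimal sketch of template-matching "OCR" against a fixed, known alphabet. Everything here is invented for illustration: the 3x3 bitmaps stand in for font templates, and exact pixel matching stands in for a real classifier. The point is structural: the patterns are predefined (point 1), every occurrence is detected even if it appears once (point 2), and each hit is recorded with its position so "B+" resolves to two characters in order (points 4 and 5).

```python
import numpy as np

# Hypothetical 3x3 bitmap "font": the predefined, well-known patterns.
TEMPLATES = {
    "B": np.array([[1, 1, 0],
                   [1, 1, 1],
                   [1, 1, 0]]),
    "+": np.array([[0, 1, 0],
                   [1, 1, 1],
                   [0, 1, 0]]),
}

def ocr(page, w=3):
    """Scan every horizontal position and report each recognized
    character with its location; even a glyph that appears only
    once is still detected."""
    hits = []
    for x in range(page.shape[1] - w + 1):
        patch = page[:, x:x + w]
        for label, tpl in TEMPLATES.items():
            # Exact match against a known category; a real system
            # would use a classifier tolerant of fonts and noise.
            if np.array_equal(patch, tpl):
                hits.append((x, label))
    # Sorted by position so "B+" resolves to two characters in sequence.
    return sorted(hits)

page = np.hstack([TEMPLATES["B"], TEMPLATES["+"]])
print(ocr(page))
```

Note that the output is a new layer of detail (labels plus positions) layered on top of the image, not a smaller version of it.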

_______ "Glyph compression":

1. You don't have a predefined set of patterns; the algorithm is probably trying to dynamically guess at patterns which are sufficiently similar and frequent.

2. You aren't trying to find all occurrences, only sufficiently similar and common ones, to maximize compression. If a letter appears only once, it can be ignored.

3. You do care strongly about visual similarity within a category; you don't want to mix-and-match fonts.

4. You don't care about clear category lines; if "B+" becomes its own glyph, that's no problem.

5. You're discarding detail from the artifact, to make it smaller.
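By contrast, a toy sketch of glyph binning, with no predefined alphabet: patches are greedily grouped by pixel similarity, and only bins with repeats are worth a shared dictionary glyph. The threshold and the greedy strategy are assumptions for illustration; real JBIG2 symbol coding is considerably more involved.

```python
import numpy as np

def bin_glyphs(patches, threshold=1):
    """Greedily group patches whose pixel difference from a bin's
    representative is within `threshold` (an assumed similarity
    measure). No prior knowledge of what the patches depict."""
    bins = []         # each bin: [representative patch, member count]
    assignments = []  # bin index chosen for each input patch
    for p in patches:
        for i, (rep, _) in enumerate(bins):
            if np.sum(rep != p) <= threshold:  # "sufficiently similar"
                bins[i][1] += 1
                assignments.append(i)
                break
        else:
            bins.append([p, 1])
            assignments.append(len(bins) - 1)
    # Only bins with repeats pay for themselves as shared glyphs;
    # singletons can be stored verbatim (or skipped entirely).
    shared = [i for i, (_, n) in enumerate(bins) if n > 1]
    return bins, assignments, shared

a = np.ones((3, 3), dtype=int)       # two near-identical glyphs...
b = a.copy(); b[2, 2] = 0
c = np.zeros((3, 3), dtype=int)      # ...and a one-off glyph
bins, assignments, shared = bin_glyphs([a, b, c])
print(assignments, shared)           # a and b share a bin; c is a singleton
```

Replacing every member of a bin with the bin's representative is exactly the "discarding detail" in point 5: the one-pixel difference between the two similar glyphs is gone after decoding.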

reply
Glyph binning looks for any chunks in the image that are similar to each other, regardless of what they are: letters, eyeballs, pennies, triangles, etc., without caring what each one is. OCR looks specifically for characters (i.e., it starts with knowledge of an alphabet, then looks for things in the image that look like those letters).

If the image is actually text, both of them can end up finding things. Binning will identify "these things look almost the same", while OCR will identify "these look like the letter M".

reply
JBIG2 dynamically pulls reference chunks out of the image, which makes it more likely to have insufficient separation between the target shapes.

It also gives a false sense of security: the output keeps dirty-looking pixels that still clearly show a specific digit, so you think you're basically looking at the original scan.
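A small sketch of that failure mode, with invented bitmaps: if two genuinely different shapes fall within the similarity threshold, the decoder silently substitutes the stored reference glyph, producing output that looks crisp and trustworthy but is wrong. The 3x3 "digits" and the pixel-count threshold here are stand-ins, not the actual JBIG2 matching criterion.

```python
import numpy as np

# Hypothetical 3x3 "digits" that differ by a single pixel --
# stand-ins for the kind of 6/8 confusions seen in scanned documents.
SIX   = np.array([[1, 1, 1],
                  [1, 0, 0],
                  [1, 1, 1]])
EIGHT = np.array([[1, 1, 1],
                  [1, 0, 1],
                  [1, 1, 1]])

def compress(patches, dictionary, threshold=1):
    """Replace each patch with the first dictionary glyph within
    `threshold` differing pixels; otherwise keep the patch as-is."""
    out = []
    for p in patches:
        for rep in dictionary:
            if np.sum(rep != p) <= threshold:
                out.append(rep)  # substitute the stored reference glyph
                break
        else:
            out.append(p)
    return out

# The dictionary happened to store only the SIX; a scanned EIGHT is one
# pixel away, so it is silently replaced by a clean, readable "6".
decoded = compress([EIGHT], [SIX])
print(np.array_equal(decoded[0], SIX))  # the output looks fine but is wrong
```

The danger is that nothing in the decoded image signals the substitution: the glyph is sharp and unambiguous, just not the one that was on the page.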

reply
That's a description of JBIG2, not a description of OCR.

JBIG2 is an OCR algorithm that doesn't assume the document comes from a pre-existing alphabet.

reply
You asked what the difference was, and I said the difference. Was it unclear that, to fit the phrasing of your question, we add "OCR doesn't"? I would not personally call JBIG2 OCR.
reply
> You asked what the difference was, and I said the difference.

Take another look at my comment.

reply
Let me try rephrasing to make the response to your original comment as clear as possible.

Question: "How can we describe OCR that wouldn't match this definition exactly?"

Answer: This definition largely fits OCR, but "reference to a single instance" is a weird way to phrase it. A better definition of OCR would include how it uses built-in knowledge of glyphs and text structure, unlike JBIG2, which looks for examples dynamically. And that difference in technique gives you a significant difference in the end results.

Is that better?

The definition you quoted is not an "exact" fit to OCR, it's a mildly misleading fit to OCR, and clearing up the misleading part makes it no longer fit both.

reply