Optical character recognition
OCR stands for optical character recognition, a widely used technique for converting scanned or photographed text into machine-readable text. It identifies and extracts the text in an image or PDF and outputs a text document, which makes it convenient to verify user information or edit the content directly.
A typical OCR pipeline has five stages: input, image processing, text detection, text recognition, and output. Each stage relies on its own algorithms, and the stages have to work closely together, so getting from an image to text output involves the more detailed steps listed below.
OCR technical process
Image input: read image files in various formats.
Image preprocessing: mainly binarization, denoising, and tilt (skew) correction.
Layout analysis: divide the document image into paragraphs and lines.
Character segmentation: cut each line into individual characters. Simple cutting often fails because the strokes of neighboring characters touch or a character's own strokes are broken, so this step has to handle both cases.
Character feature extraction: extract multidimensional features from each character image.
Character recognition: compare the feature vector extracted from the current character against a feature template library, first with coarse template classification and then with fine template matching, to identify the character.
Page recovery: identify the layout of the original document and write the recognition results to the text document in that layout.
Post-processing correction: correct the recognition results using the context of the specific language.
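To make the middle steps concrete, here is a minimal Python sketch of binarization, character segmentation, and template-based recognition on a tiny synthetic image. It is an illustration under stated assumptions, not how any particular OCR product works: the 3x5 glyph templates, the function names, and the sample image are all hypothetical, Otsu's method is used as one common choice of binarization, a simple column projection stands in for character cutting, and raw pixels stand in for the multidimensional features a real system would extract.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # rows of pixel values

# Hypothetical 3x5 glyph templates ("#" marks ink); a real template
# library would hold feature vectors for thousands of characters.
TEMPLATE_ROWS = {
    "L": ["#..", "#..", "#..", "#..", "###"],
    "O": ["###", "#.#", "#.#", "#.#", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}
TEMPLATES: Dict[str, Image] = {
    ch: [[1 if c == "#" else 0 for c in row] for row in rows]
    for ch, rows in TEMPLATE_ROWS.items()
}


def otsu_threshold(gray: Image) -> int:
    """Binarization: pick the gray level that maximizes between-class variance."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(256):
        w0 += hist[t]            # pixels at or below t (dark class)
        sum0 += t * hist[t]
        w1 = total - w0          # pixels above t (light class)
        if w0 == 0:
            continue
        if w1 == 0:
            break
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def split_columns(binary: Image) -> List[Tuple[int, int]]:
    """Character cutting: split wherever a column contains no ink.

    This naive projection fails on touching or broken characters,
    which is exactly the difficulty the segmentation step describes.
    """
    width = len(binary[0])
    profile = [sum(row[x] for row in binary) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(profile):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, width))
    return spans


def recognize(glyph: Image, templates: Dict[str, Image]) -> str:
    """Recognition: return the template with the fewest differing pixels.

    Assumes the glyph and every template have the same size.
    """
    def distance(a: Image, b: Image) -> int:
        return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return min(templates, key=lambda ch: distance(glyph, templates[ch]))


def ocr_line(gray: Image, templates: Dict[str, Image]) -> str:
    t = otsu_threshold(gray)
    binary = [[1 if v <= t else 0 for v in row] for row in gray]  # 1 = ink
    return "".join(
        recognize([row[start:end] for row in binary], templates)
        for start, end in split_columns(binary)
    )


# Synthetic 5x13 grayscale line reading "LOT": ink = 30, paper = 220.
rows = ["." + ".".join(TEMPLATE_ROWS[ch][r] for ch in "LOT") + "." for r in range(5)]
gray = [[30 if c == "#" else 220 for c in row] for row in rows]
gray[2][6] = 60  # one dark noise pixel in the middle of the "O"

print(ocr_line(gray, TEMPLATES))  # prints LOT
```

The noise pixel shows why nearest-template matching is used instead of exact comparison: the damaged "O" no longer equals its template, but it is still closer to "O" than to "L" or "T". A post-processing step would then check the recognized string against the language context to catch errors that matching alone cannot.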