NLM's Le Receives Patent for Image Processing Method
"A very inventive fellow," is how Dr. George Thoma, chief of the National Library of Medicine's Communications Engineering Branch, describes his colleague Dr. Daniel Le.
It's an apt characterization. In January, Le, a research electronics engineer, received the first patent ever awarded to an NLM employee for his invention of a technique to improve the accuracy and versatility of automated document scanners and optical character recognition (OCR).
Here's the problem his discovery solves: In the past, when an automated scanning machine "read" a piece of paper to store its contents in an electronic format, the scanner and other document imaging processes (such as optical character recognition) could only handle characters if the page was straight up and down, in what's called a "portrait" format. If the page was slightly at an angle when scanned (by an inexperienced operator, for example), this skew of the text would prevent accurate optical character recognition. Similarly, a table or other item laid out sideways (in "landscape" format) couldn't be picked up by the scanner or other document imaging processes at all.
"I thought it was important to figure out a new way to preprocess the document image because we are using these scanning machines more and more," explained Le. "The best way to access, and perhaps to preserve, a document is in electronic format -- then you can use it a million times. A paper journal, if it is used many times, gets degraded."
Le's "Automated Portrait/Landscape Mode Detection on a Binary Image," as it is identified in documentation for U.S. Patent Number 5,592,572, rests on an algorithm he developed. The algorithm combines an analysis of projection profiles, the vertical and horizontal variances on a page, and a technique to reduce the impact of nontextual data such as graphics and line art. Techniques other than Le's prove to be less accurate because the graphics portions of a page also enter into their detection of page orientation (portrait/landscape).
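For readers curious how projection profiles and variances can reveal page orientation, here is a minimal Python sketch of the general idea. It is not the patented method: it leaves out the step that reduces the influence of graphics and line art, and the function names, the 0/1 list-of-lists image format, and the tie-breaking rule are all assumptions made for illustration. The reasoning is that on a portrait page, rows of text alternate with blank gaps, so the count of black pixels per row swings widely while the count per column stays comparatively even. On a landscape page the roles are reversed.

```python
from statistics import pvariance


def projection_profiles(image):
    """Return (horizontal, vertical) profiles of a binary image.

    `image` is assumed to be a non-empty rectangular list of rows,
    each a list of 0 (white) or 1 (black) pixels. Each profile entry
    is the fraction of black pixels in one row or one column, so the
    two profiles are comparable even when the page is not square.
    """
    height = len(image)
    width = len(image[0])
    horizontal = [sum(row) / width for row in image]          # one value per row
    vertical = [sum(col) / height for col in zip(*image)]     # one value per column
    return horizontal, vertical


def detect_orientation(image):
    """Guess 'portrait' or 'landscape' by comparing profile variances.

    Hypothetical rule for this sketch: text lines that run left to
    right make the row profile vary more than the column profile.
    Ties (for example, a blank page) fall back to 'portrait', which
    is an arbitrary choice.
    """
    horizontal, vertical = projection_profiles(image)
    if pvariance(horizontal) >= pvariance(vertical):
        return "portrait"
    return "landscape"


# Synthetic 8 x 12 "page": every other row is a solid line of text.
page = [[1] * 12 if r % 2 == 0 else [0] * 12 for r in range(8)]
# The same page turned on its side (transposed).
sideways = [list(col) for col in zip(*page)]

print(detect_orientation(page))
print(detect_orientation(sideways))
```

A real page is far messier than this synthetic one. A large photograph or chart can swamp the profiles, which is the weakness of other techniques that the article describes and the reason Le's method includes a step for discounting nontextual regions.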
"It took me about 2 months to get through the thinking process on this project, and to review all back literature," Le recounted. "I looked for similar things in document imaging technology, to see if they would help me, but unfortunately their error rates were high.
"Then, I went back to some of the books I had used in school and reviewed all the math," he explained with a slightly pained chuckle. "The hardest part, and the most time-consuming, was the testing of our ideas on almost 12,000 medical journal pages from the NLM's collection. Finally, we achieved an accuracy rate of 99.3 percent and knew we had found what we were looking for."
Le has worked in NLM's Communications Engineering Branch, part of the Lister Hill National Center for Biomedical Communications, since 1990. He left his native Vietnam in 1981, emigrated to Hong Kong and arrived in the U.S. in 1982.
1997 has already been a banner year for him. Besides his patent, he was recently awarded a Ph.D. in computer science from George Mason University.
Le probably won't see huge financial rewards from his patent, for which he's a co-assignee with the National Library of Medicine. But there have been other satisfying results.
"I'm glad I can contribute something to my adopted country," he said. "I'm just one of many people working on document imaging, but I hope my invention will make a small difference."
To the thousands who rely on accurately scanned documents, today and in the future, his invention will make a big difference.
A paper on Le's invention is available online at http://archive.nlm.nih.gov/pubs/doc_class/prword.html.