Saturday, May 23, 2009

How To: Use Office 2007 OCR Using C#

I'll show you how to read text from an image.

If you have Office 2007 installed, its OCR component is available for you to use. The only dependency this adds to your software is Office 2007 itself. Requiring Office 2007 may or may not fit your situation, but if your client can guarantee that the machines your software runs on have Office 2007 installed, you're gold. I've encountered many situations where this is the case, and even a few where clients were willing to install Office 2007 in order to use my applications.
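
Since the dependency is an installed COM component, it can help to check for it at startup instead of failing on first use. Here is a minimal sketch, assuming the imaging component registers the ProgID "MODI.Document" (the name commonly used to create it from script). The names ModiCheck and IsModiInstalled are made up for this example.

```csharp
using System;

static class ModiCheck
{
    // Returns true when a COM class is registered under the ProgID "MODI.Document".
    // Assumption: the Office Document Imaging component registers this ProgID.
    public static bool IsModiInstalled()
    {
        // GetTypeFromProgID returns null when the ProgID is not registered.
        return Type.GetTypeFromProgID("MODI.Document") != null;
    }
}
```

You could call ModiCheck.IsModiInstalled() when the form loads and, if it returns false, ask the user to add the imaging component through the Office 2007 installer.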

Steps:

1- Make sure the Office 2007 imaging component is installed:

The name of the COM object that you need to add as a reference is Microsoft Office Document Imaging 12.0 Type Library. By default, Office 2007 doesn't install it. You'll need to make sure that it's added by using the Office 2007 installation program. Just run the installer, click the Continue button with the "Add or Remove Features" selection made, and ensure that the imaging component is installed as shown in the figure to the right.

2- Create a Windows Application using C# and add the reference:

In the Visual Studio Solution Explorer >> right-click References >> Add Reference >> select the COM tab >> then select Microsoft Office Document Imaging 12.0 Type Library.

3- Put a Button on your Form, then put the following code in its Click handler:

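// Let the user pick an image; stop if the dialog is cancelled.
OpenFileDialog openFileDialog = new OpenFileDialog();
if (openFileDialog.ShowDialog() != DialogResult.OK)
    return;

MODI.Document md = new MODI.Document();
md.Create(openFileDialog.FileName);
try
{
    // Recognize English text; auto-rotate and straighten the page first.
    md.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);

    // Images is zero-based; this reads the first page only.
    MODI.Image image = (MODI.Image)md.Images[0];
    MessageBox.Show(image.Layout.Text, "The Selected Image Text is:");
}
finally
{
    // Close the document without saving changes so the file is released.
    md.Close(false);
}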

4- Run the application, then press the button and select any image that has text.
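
The code in step 3 reads only the first page (Images[0]). For a multi-page file such as a scanned TIFF, the same calls can be wrapped in a loop over md.Images. This is only a sketch under the same assumptions as step 3 (the MODI reference is added and the text is English). The names ModiOcr and ReadAllText are made up for this example.

```csharp
using System.Text;

static class ModiOcr
{
    // Runs OCR on every page of the file and returns the combined text.
    public static string ReadAllText(string path)
    {
        MODI.Document md = new MODI.Document();
        md.Create(path);
        try
        {
            md.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);

            StringBuilder text = new StringBuilder();
            for (int i = 0; i < md.Images.Count; i++)
            {
                // Each entry in Images is one page of the document.
                MODI.Image page = (MODI.Image)md.Images[i];
                text.AppendLine(page.Layout.Text);
            }
            return text.ToString();
        }
        finally
        {
            // Close without saving so the file is not left locked.
            md.Close(false);
        }
    }
}
```

From the button handler you could then call MessageBox.Show(ModiOcr.ReadAllText(openFileDialog.FileName)). COM failures surface in C# as System.Runtime.InteropServices.COMException, so wrapping the call in a try/catch for that type lets you show a friendly message instead of crashing.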

Conclusion:

I made a larger sample application for Office OCR. If anyone is interested, you can contact me at:

waleed.hussein.eg@gmail.com

6 comments:

  1. this code gives an error while loading the image in Visual Studio 2008

  2. could you describe your error or exception?

  3. thanks a lot ..its working superbly..great article

  4. plz help man, I have an error when I'm loading the images ... could u help me pleaaaaaaase :)

  5. I have Microsoft Office 2007 installed, but in my COM references I am unable to find "Microsoft Office Document Imaging 12.0"

    Can you help me out..

  6. I'm not a developer, I always use this free online OCR service to convert image to text.
