Friday, June 26, 2009

Office 2007 OCR Sample Code Using C#

This sample code does the following:

1- Scans a specified directory for image files (JPG in this sample).
2- Reads the text from these images.
3- Saves the text from each image to a text file automatically.
4- Handles problems with images.
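
Steps 1 and 3 are ordinary file handling and do not involve the OCR library. As a minimal sketch, assuming only JPG files are wanted and each text file should sit next to its image, the selection and naming logic could look like the following. The class and method names (OcrFileHelper, IsJpeg, GetTextFilePath) are made up for illustration and are not part of the original sample.

```csharp
using System;
using System.IO;

// Hypothetical helper: the names are illustrative only.
public static class OcrFileHelper
{
    // True when the path ends in ".jpg", ignoring case (so ".JPG" matches too).
    public static bool IsJpeg(string path)
    {
        string extension = Path.GetExtension(path);
        return string.Equals(extension, ".jpg", StringComparison.OrdinalIgnoreCase);
    }

    // Same directory and file name as the image, with the extension swapped for ".txt".
    public static string GetTextFilePath(string imagePath)
    {
        return Path.ChangeExtension(imagePath, ".txt");
    }
}
```

For example, OcrFileHelper.GetTextFilePath("scan.jpg") returns "scan.txt". Using Path.ChangeExtension only touches the final extension, whereas replacing the extension string anywhere in the path could also alter a directory name that happens to contain ".jpg".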

Sample Code:

// Required namespaces. The project also needs a COM reference to the
// "Microsoft Office Document Imaging 12.0 Type Library" (MODI).
using System;
using System.IO;
using System.Windows.Forms;

public void CheckFileType(string directoryPath)
{
    foreach (string filePath in Directory.GetFiles(directoryPath))
    {
        // Get the file extension and check for the JPG file format.
        // The comparison ignores case so that ".JPG" is also accepted.
        string fileExtension = Path.GetExtension(filePath);
        if (!string.Equals(fileExtension, ".jpg", StringComparison.OrdinalIgnoreCase))
        {
            continue;
        }

        // The text file gets the same name as the image file, with a .txt extension.
        string textFilePath = Path.ChangeExtension(filePath, ".txt");

        try
        {
            // OCR operations
            MODI.Document md = new MODI.Document();
            md.Create(filePath);
            md.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);
            MODI.Image image = (MODI.Image)md.Images[0];

            // Save the image text in the text file.
            // An existing text file from an earlier run is overwritten.
            File.WriteAllText(textFilePath, image.Layout.Text);

            // Close the document without saving changes to the image.
            md.Close(false);
        }
        catch (Exception)
        {
            MessageBox.Show("This image has no text or has a problem.",
                "OCR Notifications",
                MessageBoxButtons.OK, MessageBoxIcon.Information);
        }
    }
}
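
If creating MODI.Document fails with COM error 80040154, the HRESULT is REGDB_E_CLASSNOTREG ("class not registered"). This usually means either that the Microsoft Office Document Imaging component is not installed, or that a 64-bit process is trying to load the 32-bit component (Office 2007 ships only as 32-bit). The sketch below shows one way to tell this case apart from other COM failures. The class and method names (ModiDiagnostics, IsClassNotRegistered, Explain) and the message wording are assumptions for illustration, not part of MODI or of the original sample.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper: the names and messages are illustrative only.
public static class ModiDiagnostics
{
    // HRESULT 0x80040154 is REGDB_E_CLASSNOTREG ("class not registered").
    private const int ClassNotRegistered = unchecked((int)0x80040154);

    public static bool IsClassNotRegistered(COMException ex)
    {
        return ex.ErrorCode == ClassNotRegistered;
    }

    // Turns the COM failure into a hint about what to check first.
    public static string Explain(COMException ex, bool is64BitProcess)
    {
        if (!IsClassNotRegistered(ex))
        {
            return "Unexpected COM error: 0x" + ex.ErrorCode.ToString("X8");
        }

        return is64BitProcess
            ? "MODI is not registered for 64-bit processes; try building the project for x86."
            : "MODI is not registered; check that Microsoft Office Document Imaging is installed.";
    }
}
```

A caller could wrap the line that creates MODI.Document in try/catch (COMException ex) and pass Environment.Is64BitProcess (available from .NET Framework 4; on older versions IntPtr.Size == 8 gives the same information) as the second argument. If the x86 build still reports the error, the likely remaining cause is that the component was not selected when Office was installed.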

8 comments:

  1. I like your example; it is useful. Thank you!

  2. Hello,
    I get this error when I run your program:

    Retrieving the COM class factory for component with CLSID {40942A6C-1520-4132-BDF8-BDC1F71F547B} failed due to the following error: 80040154.

    What should I do? Thanks.
    rahmanian@gmail.com

  3. Same problem :/
    kubis.jan@gmail.com

  4. Compile the project for x86 (set this in the Project properties).

  5. I changed the project to x86, but it still does not run; I get the same problem. Please help me!

  6. Retrieving the COM class factory for component with CLSID {40942A6C-1520-4132-BDF8-BDC1F71F547B} failed due to the following error: 80040154.

  7. I'm not a developer; I always use this free online OCR service.
