Automating OCR Testing for Web Applications with Tesseract and Selenium in Java

by RabinsXP Team | December 26, 2025 | Articles |

Additional Info:

For teams working with Selenium in C# instead of Java, IronOCR provides a native .NET solution that integrates smoothly with Selenium WebDriver. Unlike Tess4J, which requires external Tesseract installation and tessdata configuration, IronOCR bundles everything you need.

using IronOcr;
using OpenQA.Selenium;


var ocr = new IronTesseract();
using var input = new OcrInput();
input.LoadImage("screenshot.png");
var result = ocr.Read(input);
Assert.IsTrue(result.Text.Contains("Expected Text"));

IronOCR includes built-in image correction for screenshots with low contrast or noise, which is common when capturing web elements. This reduces the preprocessing work typically needed before running OCR on dynamic web content.

Learn more: https://ironsoftware.com/csharp/ocr/

Step 3:

Use Selenium to navigate to the web page containing the image with text that needs to be OCR tested.
Use Selenium to locate the image element, take a screenshot of it, and save the image to disk.
Use Tess4J to perform OCR on the saved image and get the recognized text.
Compare the recognized text with the expected text using an assertion or comparison method.

Here’s an example code snippet that demonstrates how to perform OCR testing using Tesseract OCR and Selenium in Java:

The goal is to use Selenium to open the https://www.wiley.com/en-us page, take a screenshot of its logo, and check whether the text in the image matches an expected value.

import net.sourceforge.tess4j.ITesseract;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;
import org.openqa.selenium.By;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.io.FileHandler;

import java.io.File;
import java.io.IOException;
import java.time.Duration;

public class OCRTest {

    public static void main(String[] args) throws TesseractException, IOException {

        // Start Chrome in headless mode
        ChromeOptions options = new ChromeOptions();
        options.addArguments("--headless=new");
        WebDriver driver = new ChromeDriver(options);
        driver.manage().window().maximize();
        driver.manage().timeouts().implicitlyWait(Duration.ofSeconds(10));

        // Navigate to the page that contains the logo
        driver.get("https://www.wiley.com/en-us");

        // Locate the logo and save a screenshot of just that element
        WebElement imageElement = driver.findElement(By.xpath("//img[@alt='Wiley Consumer Logo']"));
        File src = imageElement.getScreenshotAs(OutputType.FILE);
        String filePath = System.getProperty("user.dir") + File.separator + "image.png";
        FileHandler.copy(src, new File(filePath));

        // Run OCR on the saved image
        ITesseract tesseract = new Tesseract();
        // Point Tess4J at the tessdata folder of the local Tesseract installation
        // (adjust this path for your machine)
        tesseract.setDatapath("C:\\Program Files\\Tesseract-OCR\\tessdata");
        String recognizedText = tesseract.doOCR(new File(filePath));
        // Strip newline characters from the OCR output
        recognizedText = recognizedText.replaceAll("\\n", "");

        // Compare the recognized text with the expected text
        String expectedText = "WILEY";
        if (recognizedText.equals(expectedText)) {
            System.out.println("OCR test successful.");
        } else {
            System.out.println("OCR test failed.\n Expected text: " + expectedText + "\n Recognized text: " + recognizedText);
        }

        // Close the browser
        driver.quit();
    }
}

OCR test failed: the text recognized from the logo does not match the expected text.
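
When a small logo screenshot is misread like this, one thing that sometimes helps is converting the image to grayscale and enlarging it before running OCR. The following is only a minimal sketch under that assumption, using the standard java.awt classes; the class name OcrPreprocess, the method grayscaleAndScale, and the choice of scale factor are hypothetical and not part of Tess4J or of the original example.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper for illustration; not part of Tess4J or Selenium.
final class OcrPreprocess {

    private OcrPreprocess() {}

    // Returns a grayscale copy of the source image, enlarged by the given integer factor.
    static BufferedImage grayscaleAndScale(BufferedImage source, int factor) {
        if (factor < 1) {
            throw new IllegalArgumentException("factor must be >= 1");
        }
        int width = source.getWidth() * factor;
        int height = source.getHeight() * factor;

        // Drawing into a TYPE_BYTE_GRAY image converts the pixels to grayscale.
        BufferedImage result = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = result.createGraphics();
        try {
            // Bicubic interpolation gives smoother edges when enlarging.
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            g.drawImage(source, 0, 0, width, height, null);
        } finally {
            g.dispose();
        }
        return result;
    }
}
```

In the test above, the saved screenshot could be loaded with ImageIO.read(new File(filePath)), passed through this helper, and the resulting BufferedImage handed to tesseract.doOCR, which also accepts a BufferedImage. Whether this actually improves recognition for a given logo has to be checked case by case.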

Why did I use the following statement?

recognizedText = recognizedText.replaceAll("\\n", "");

The replaceAll() method is called on the recognizedText string. The first argument is the regular expression to match, which is "\\n" in this case (the backslash is doubled in Java source so that the regex engine receives \n, which matches a newline character). The second argument is the replacement string, which is an empty string ("") in this case.

After this statement runs, the recognizedText string no longer contains any newline characters. This matters because Tesseract's output usually ends with a line break, so a direct equals() comparison against "WILEY" would fail without it.
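
To see the effect in isolation, here is a small sketch that needs neither Selenium nor Tess4J. The class OcrText and its methods are hypothetical names introduced for illustration; in addition to "\n" the sketch also removes "\r" and trims surrounding spaces, which goes slightly beyond the single replaceAll() call used in the test above.

```java
// Hypothetical helper for illustration; not part of Tess4J.
final class OcrText {

    private OcrText() {}

    // Removes carriage returns and newlines, then trims leading/trailing whitespace.
    static String normalize(String raw) {
        return raw.replaceAll("[\\r\\n]+", "").trim();
    }

    // Compares normalized OCR output with the expected text, ignoring case.
    static boolean matches(String recognized, String expected) {
        return normalize(recognized).equalsIgnoreCase(expected);
    }

    public static void main(String[] args) {
        String recognized = "WILEY\n"; // sample value standing in for real OCR output
        System.out.println(recognized.equals("WILEY"));          // raw comparison
        System.out.println(normalize(recognized).equals("WILEY")); // after normalization
    }
}
```

The case-insensitive comparison in matches() is a design choice for tolerant checks; for a strict test, compare normalize(recognized) with equals() instead.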
