
BlinkOCR SDK for Android


BlinkOCR SDK for Android is an SDK that enables you to easily add near real-time OCR functionality to your app. With the provided camera management you can easily create an app that scans receipts, e-mails and much more. As of version 1.8.0 you can also scan barcodes when using custom UI integration. You can also scan images stored as Android Bitmaps, loaded either from the gallery, the network or the SD card.

With BlinkOCR you can scan free-form text or specialized formats like dates, amounts, e-mails and much more. Using specialized formats yields much better scanning quality than using free-form text mode.

Using BlinkOCR in your app requires a valid license key. You can obtain a trial license key by registering on the Microblink dashboard. After registering, you will be able to generate a license key for your app. The license key is bound to the package name of your app, so please make sure you enter the correct package name when asked.

See below for more information about how to integrate BlinkOCR SDK into your app, and also check the latest [Release notes](Release notes.md).

Android BlinkOCR integration instructions

The package contains an Android Archive (AAR) with everything you need to use the BlinkOCR library. This AAR is also available in a Maven repository for easier integration into your app. For more information about the Maven integration procedure, check the Maven integration section.

Besides the AAR, the package also contains a demo project with the following modules:

  • BlinkOCRSegmentDemo shows how to use the simple Intent-based API to scan small text segments. It also shows how to create a custom scan activity for scanning small text segments.
  • BlinkOCRFullScreen shows how to perform full camera frame generic OCR, how to draw OCR results on screen and how to obtain the OcrResult object for further processing. This app also shows how to scan a Code128 or Code39 barcode on the same screen that is used for OCR.
  • BlinkOCRDetectorDemo demonstrates how to perform document detection and obtain a dewarped image of the detected document.
  • BlinkOCRDirectAPI shows how to perform OCR of Android Bitmaps.
  • BlinkOCRCombination shows how to perform OCR of a camera frame, obtain that same camera frame and process it again with the Direct API. You can test this app with the PDF document in the demo app folder.
  • BlinkOCRRandomScanDemo demonstrates the usage of the provided RandomScanActivity and the random scan feature, which is similar to segment scan but does not force the user to scan text segments in a predefined order.

The source code of all demo apps is given to you to show how to integrate the BlinkOCR SDK into your app. You can use this source code and all resources as you wish. You can use the demo apps as a basis for creating your own app, or you can copy/paste code and/or resources from the demo apps into your app and use them without even asking us for permission.

BlinkOCR is supported on Android SDK version 10 (Android 2.3.3) or later.

The library contains two activities:

  • SegmentScanActivity is responsible for camera control and recognition of small text segments. It is ideal if you need to quickly scan small text segments, such as a date, amount or e-mail address.
  • RandomScanActivity is similar to SegmentScanActivity, but it does not force the user to scan text segments in a predefined order.

For advanced use cases, you will need to embed RecognizerView into your activity and pass the activity's lifecycle events to it; it will then control the camera and the recognition process. For more information, see Embedding RecognizerView into custom scan activity.

Quick Start

Quick start with demo app

  1. Open Android Studio.
  2. In Quick Start dialog choose Import project (Eclipse ADT, Gradle, etc.).
  3. In File dialog select BlinkOCRDemo folder.
  4. Wait for the project to load. If Android Studio asks you to reload the project on startup, select Yes.

Integrating BlinkOCR into your project using Maven

The Maven repository for BlinkOCR SDK is: http://maven.microblink.com. If you do not want to perform the integration via Maven, simply skip to the Android Studio integration instructions or Eclipse integration instructions.

Using gradle or Android Studio

In your build.gradle you first need to add the BlinkOCR Maven repository to the repositories list:

repositories {
	maven { url 'http://maven.microblink.com' }
}

After that, you just need to add BlinkOCR as a dependency to your application (make sure transitive is set to true):

dependencies {
    compile('com.microblink:blinkocr:2.8.0@aar') {
    	transitive = true
    }
}

If you plan to use ProGuard, add the following lines to your proguard-rules.pro:

-keep class com.microblink.** { *; }
-keepclassmembers class com.microblink.** { *; }
-dontwarn android.hardware.**
-dontwarn android.support.v4.**

Import Javadoc to Android Studio

The current version of Android Studio will not automatically import Javadoc from a Maven dependency, so you have to do that manually. To do that, follow these steps:

  1. In the Android Studio project sidebar, ensure the project view is enabled
  2. Expand the External Libraries entry (usually the last entry in project view)
  3. Locate the blinkocr-2.8.0 entry, right click on it and select Library Properties...
  4. A Library Properties pop-up window will appear
  5. Click the second + button in the bottom left corner of the window (the one that contains + with a little globe)
  6. A window for defining the documentation URL will appear
  7. Enter the following address: https://blinkocr.github.io/blinkocr-android/
  8. Click OK

Using android-maven-plugin

Android Maven Plugin v4.0.0 or newer is required.

Open your pom.xml file and add these directives as appropriate:

<repositories>
    <repository>
        <id>MicroblinkRepo</id>
        <url>http://maven.microblink.com</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>com.microblink</groupId>
        <artifactId>blinkocr</artifactId>
        <version>2.8.0</version>
        <type>aar</type>
    </dependency>
</dependencies>

Android Studio integration instructions

  1. In Android Studio menu, click File, select New and then select Module.

  2. In the new window, select Import .JAR or .AAR Package, and click Next.

  3. In the File name field, enter the path to LibRecognizer.aar and click Finish.

  4. In your app's build.gradle, add dependencies on LibRecognizer and appcompat-v7:

    dependencies {
    	compile project(':LibRecognizer')
    	compile "com.android.support:appcompat-v7:24.2.0"
    }
    
  5. If you plan to use ProGuard, add the following lines to your proguard-rules.pro:

    -keep class com.microblink.** { *; }
    -keepclassmembers class com.microblink.** { *; }
    -dontwarn android.hardware.**
    -dontwarn android.support.v4.**
    

Import Javadoc to Android Studio

  1. In the Android Studio project sidebar, ensure the project view is enabled
  2. Expand the External Libraries entry (usually the last entry in project view)
  3. Locate the LibRecognizer-unspecified entry, right click on it and select Library Properties...
  4. A Library Properties pop-up window will appear
  5. Click the + button in the bottom left corner of the window
  6. A window for choosing a JAR file will appear
  7. Find and select the LibRecognizer-javadoc.jar file, which is located in the root folder of the SDK distribution
  8. Click OK

Eclipse integration instructions

We do not provide Eclipse integration demo apps. We encourage you to use Android Studio. We also do not test integrating BlinkOCR with Eclipse. If you are having problems with BlinkOCR, make sure you have tried integrating it with Android Studio prior to contacting us.

However, if you still want to use Eclipse, you will need to convert the AAR archive to the Eclipse library project format. You can do this as follows:

  1. In Eclipse, create a new Android library project in your workspace.
  2. Clear the src and res folders.
  3. Unzip the LibRecognizer.aar file. You can rename it to .zip and then unzip it using any tool.
  4. Copy classes.jar to the libs folder of your Eclipse library project. If the libs folder does not exist, create it.
  5. Copy the contents of the jni folder to the libs folder of your Eclipse library project.
  6. Replace the res folder of the library project with the res folder from the LibRecognizer.aar file.

You’ve already created the project that contains almost everything you need. Now let’s see how to configure your project to reference this library project.

  1. In the project in which you want to use the library (henceforth, "target project"), add the library project as a dependency.
  2. Open the AndroidManifest.xml file inside LibRecognizer.aar and make sure to copy all permissions, features and activities to the AndroidManifest.xml file of the target project.
  3. Copy the contents of the assets folder from LibRecognizer.aar into the assets folder of the target project. If the assets folder does not exist in the target project, create it.
  4. Clean and rebuild your target project.
  5. If you plan to use ProGuard, add the same statements as in the Android Studio guide to your ProGuard configuration file.
  6. Add the appcompat-v7 library to your workspace and reference it from the target project (the modern ADT plugin for Eclipse does this automatically for all new Android projects).

Performing your first segment scan

  1. You can start the recognition process by starting the SegmentScanActivity activity with an Intent initialized in the following way:

    // Intent for SegmentScanActivity Activity
    Intent intent = new Intent(this, SegmentScanActivity.class);
    
    // set your licence key
    // obtain your licence key at http://microblink.com/login or
    // contact us at http://help.microblink.com
    intent.putExtra(SegmentScanActivity.EXTRAS_LICENSE_KEY, "Add your licence key here");
    
    // setup array of scan configurations. Each scan configuration
    // contains 4 elements: resource ID for title displayed
    // in SegmentScanActivity activity, resource ID for text
    // displayed in activity, name of the scan element (used
    // for obtaining results) and parser setting defining
    // how the data will be extracted.
    // For more information about parser setting, check the
    // chapter "Scanning segments with BlinkOCR recognizer"
    ScanConfiguration[] confArray = new ScanConfiguration[] {
    	new ScanConfiguration(R.string.amount_title, R.string.amount_msg, "Amount", new AmountParserSettings()),
    	new ScanConfiguration(R.string.email_title, R.string.email_msg, "EMail", new EMailParserSettings()),
    	new ScanConfiguration(R.string.raw_title, R.string.raw_msg, "Raw", new RawParserSettings())
    };
    intent.putExtra(SegmentScanActivity.EXTRAS_SCAN_CONFIGURATION, confArray);
    
    // Starting Activity
    startActivityForResult(intent, MY_REQUEST_CODE);
  2. After the SegmentScanActivity activity finishes the scan, it will return to the calling activity and call its onActivityResult method. You can obtain the scanning results in that method.

    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
    	super.onActivityResult(requestCode, resultCode, data);
    	
    	if (requestCode == MY_REQUEST_CODE) {
    		if (resultCode == SegmentScanActivity.RESULT_OK && data != null) {
    			// perform processing of the data here
    			
    			// for example, obtain parcelable recognition result
    			Bundle extras = data.getExtras();
    			Bundle results = extras.getBundle(SegmentScanActivity.EXTRAS_SCAN_RESULTS);
    			
    			// results bundle contains result strings in keys defined
    			// by scan configuration name
    			// for example, if set up as in step 1, then you can obtain
    			// e-mail address with following line
    			String email = results.getString("EMail");
    		}
    	}
    }

Performing your first random scan

  1. For random scan, use the provided RandomScanActivity activity with an Intent initialized in the following way:

    // Intent for RandomScanActivity Activity
    Intent intent = new Intent(this, RandomScanActivity.class);
    
    // set your licence key
    // obtain your licence key at http://microblink.com/login or
    // contact us at http://help.microblink.com
    intent.putExtra(RandomScanActivity.EXTRAS_LICENSE_KEY, "Add your licence key here");
    
    // setup array of random scan elements. Each scan element
    // holds following scan settings: resource ID (or string) for title displayed
    // in RandomScanActivity activity, name of the scan element (used
    // for obtaining results, must be unique) and parser setting defining
    // how the data will be extracted. In random scan, all scan elements should have
    // distinct parser types.
    // For more information about parser setting, check the
    // chapter "Scanning segments with BlinkOCR recognizer"
    
    RandomScanElement date = new RandomScanElement(R.string.date_title, "Date", new DateParserSettings());
    // element can be optional, which means that result can be returned without scanning that element
    date.setOptional(true);
    RandomScanElement[] elemsArray = new RandomScanElement[] {
    	new RandomScanElement(R.string.iban_title, "IBAN", new IbanParserSettings()),
    	new RandomScanElement(R.string.amount_title, "Amount", new AmountParserSettings()),
    	date};
    intent.putExtra(RandomScanActivity.EXTRAS_SCAN_CONFIGURATION, elemsArray);
    
    // Starting Activity
    startActivityForResult(intent, MY_REQUEST_CODE);
  2. You can obtain the scanning results in the onActivityResult method of the calling activity.

    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
    	super.onActivityResult(requestCode, resultCode, data);
    	
    	if (requestCode == MY_REQUEST_CODE) {
    		if (resultCode == Activity.RESULT_OK && data != null) {
    			// perform processing of the data here
    			
    			// for example, obtain parcelable recognition result
    			Bundle extras = data.getExtras();
    			Bundle results = extras.getBundle(RandomScanActivity.EXTRAS_SCAN_RESULTS);
    			
    			// results bundle contains result strings in keys defined
    			// by scan element names
    			// for example, if set up as in step 1, then you can obtain
    			// IBAN with following line
    			String iban = results.getString("IBAN");
    		}
    	}
    }

Advanced BlinkOCR integration instructions

This section covers more advanced details of BlinkOCR integration. The first part discusses the methods for checking whether BlinkOCR is supported on the current device. The second part covers the possible customization of the built-in SegmentScanActivity activity, the third part describes how to embed RecognizerView into your activity, and the fourth part describes how to use the Direct API to recognize Android Bitmaps directly, without the need for a camera.

Checking if BlinkOCR is supported

BlinkOCR requirements

Even before starting the scan activity, you should check if BlinkOCR is supported on the current device. In order to be supported, the device needs to have a camera.

Android 2.3 is the minimum Android version on which BlinkOCR is supported. For best performance and compatibility, we recommend Android 5.0 or newer.

Camera video preview resolution also matters. In order to perform successful scans, the camera preview resolution cannot be too low. BlinkOCR requires a minimum of 480p camera preview resolution in order to perform a scan. Note that camera preview resolution is not the same as video recording resolution, although on most devices they are the same. However, some devices allow recording of HD video (720p resolution) but do not allow a high enough camera preview resolution (for example, the Sony Xperia Go supports video recording at 720p, but its camera preview resolution is only 320p - BlinkOCR does not work on that device).

BlinkOCR is a native library, written in C++ and available for multiple platforms. Because of this, BlinkOCR cannot work on devices with obscure hardware architectures. We have compiled the BlinkOCR native code only for the most popular Android ABIs. See Processor architecture considerations for more information about native libraries in BlinkOCR and instructions on how to disable certain architectures in order to reduce the size of the final app.

Checking for BlinkOCR support in your app

You can check whether BlinkOCR is supported on the device in the following way:

// check if BlinkOCR is supported on the device
RecognizerCompatibilityStatus status = RecognizerCompatibility.getRecognizerCompatibilityStatus(this);
if(status == RecognizerCompatibilityStatus.RECOGNIZER_SUPPORTED) {
	Toast.makeText(this, "BlinkOCR is supported!", Toast.LENGTH_LONG).show();
} else {
	Toast.makeText(this, "BlinkOCR is not supported! Reason: " + status.name(), Toast.LENGTH_LONG).show();
}

Customization of SegmentScanActivity activity

SegmentScanActivity intent extras

This section discusses the parameters that can be sent over the Intent for the SegmentScanActivity activity to customize its default behaviour. There are several intent extras that can be sent to the SegmentScanActivity activity:

  • SegmentScanActivity.EXTRAS_SCAN_CONFIGURATION - with this extra you must set the array of ScanConfiguration objects. Each ScanConfiguration object defines a specific scan configuration that will be performed. ScanConfiguration defines two string resource IDs - the title of the scanned item and the text that will be displayed above the field where the scan is performed. Besides that, it defines the name of the scanned item and an object defining the OCR parser settings. More information about parser settings can be found in the chapter Scanning segments with BlinkOCR recognizer. The important point here is that each scan configuration represents a single parser group, and SegmentScanActivity ensures that only one parser group is active at a time. After defining the scan configuration array, you need to put it into the intent extra with the following code snippet:

     intent.putExtra(SegmentScanActivity.EXTRAS_SCAN_CONFIGURATION, confArray);
  • SegmentScanActivity.EXTRAS_SCAN_RESULTS - you can use this extra in the onActivityResult method of the calling activity to obtain a bundle with recognition results. The bundle will contain only strings representing the scanned data, under the keys defined by each scan configuration. If you also need to obtain the OCR result structure, then you need to perform advanced integration. You can use the following snippet to obtain the scan results:

     Bundle results = data.getBundle(SegmentScanActivity.EXTRAS_SCAN_RESULTS);
  • SegmentScanActivity.EXTRAS_HELP_INTENT - with this extra you can set a fully initialized intent that will be sent when the user clicks the help button. You can put any extras you want into your intent - all will be delivered to your activity when the user clicks the help button. If you do not set the help intent, the help button will not be shown in the camera interface. To set the intent for the help activity, use the following code snippet:

     /** Set the intent which will be sent when user taps help button. 
      *  If you don't set the intent, help button will not be shown.
      *  Note that this applies only to the default camera UI.
      * */
     intent.putExtra(SegmentScanActivity.EXTRAS_HELP_INTENT, new Intent(this, HelpActivity.class));
  • SegmentScanActivity.EXTRAS_CAMERA_VIDEO_PRESET - with this extra you can set the video resolution preset that will be used when choosing the camera resolution for scanning. For more information, see the javadoc. For example, to use the 720p video resolution preset, use the following code snippet:

     intent.putExtra(SegmentScanActivity.EXTRAS_CAMERA_VIDEO_PRESET, (Parcelable)VideoResolutionPreset.VIDEO_RESOLUTION_720p);
  • SegmentScanActivity.EXTRAS_LICENSE_KEY - with this extra you can set the license key for BlinkOCR. You can obtain your license key from the Microblink website or you can contact us at http://help.microblink.com. Once you obtain a license key, you can set it with the following snippet:

     // set the license key
     intent.putExtra(SegmentScanActivity.EXTRAS_LICENSE_KEY, "Enter_License_Key_Here");

    The licence key is bound to the package name of your application. For example, if you have a licence key that is bound to the com.microblink.ocr app package, you cannot use the same key in other applications. However, if you purchase a Premium licence, you will get a licence key that can be used in multiple applications. That licence key will not be bound to the package name of the app. Instead, it will be bound to a licensee string that needs to be provided to the library together with the licence key. To provide the licensee string, use the EXTRAS_LICENSEE intent extra like this:

     // set the license key
     intent.putExtra(SegmentScanActivity.EXTRAS_LICENSE_KEY, "Enter_License_Key_Here");
     intent.putExtra(SegmentScanActivity.EXTRAS_LICENSEE, "Enter_Licensee_Here");
  • SegmentScanActivity.EXTRAS_SHOW_OCR_RESULT - with this extra you can define whether the OCR result should be drawn on the camera preview as it arrives. This is enabled by default; to disable it, use the following snippet:

     // disable showing of OCR result
     intent.putExtra(SegmentScanActivity.EXTRAS_SHOW_OCR_RESULT, false);
  • SegmentScanActivity.EXTRAS_SHOW_OCR_RESULT_MODE - if the OCR result should be drawn on the camera preview, this extra defines how it will be drawn. Here you need to pass an instance of ShowOcrResultMode. By default, ShowOcrResultMode.ANIMATED_DOTS is used. You can also enable ShowOcrResultMode.STATIC_CHARS to draw recognized characters instead of dots. To set this extra, use the following snippet:

     // display colored static chars instead of animated dots
     intent.putExtra(SegmentScanActivity.EXTRAS_SHOW_OCR_RESULT_MODE, (Parcelable) ShowOcrResultMode.STATIC_CHARS);
  • SegmentScanActivity.EXTRAS_IMAGE_LISTENER - with this extra you can set your implementation of the ImageListener interface that will obtain the images that are being processed. Make sure that your ImageListener implementation correctly implements the Parcelable interface with a static CREATOR field. Without this, you might encounter a runtime error. For more information and an example, see Using ImageListener to obtain images that are being processed. By default, the ImageListener will receive all possible images that become available during the recognition process. This introduces a performance penalty, because most of those images will probably not be used, so sending them just wastes time. To control which images should become available to the ImageListener, you can also set ImageMetadataSettings with SegmentScanActivity.EXTRAS_IMAGE_METADATA_SETTINGS.

  • SegmentScanActivity.EXTRAS_IMAGE_METADATA_SETTINGS - with this extra you can set the ImageMetadataSettings which define which images will be sent to the ImageListener interface given via the SegmentScanActivity.EXTRAS_IMAGE_LISTENER extra. If an ImageListener is not given via the Intent, this extra has no effect. You can see example usage of ImageMetadataSettings in the chapter Obtaining various metadata with MetadataListener, in the provided demo apps, and in the sketch below.
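
A minimal sketch of combining these two extras might look as follows. This assumes that MetadataSettings.ImageMetadataSettings (shown in the chapter Obtaining various metadata with MetadataListener) implements Parcelable so it can travel in an Intent, and MyImageListener stands for an ImageListener implementation like the one shown later in this document:

// restrict the images delivered to the ImageListener to dewarped images only
MetadataSettings.ImageMetadataSettings ims = new MetadataSettings.ImageMetadataSettings();
ims.setDewarpedImageEnabled(true);

// MyImageListener is your Parcelable ImageListener implementation
intent.putExtra(SegmentScanActivity.EXTRAS_IMAGE_LISTENER, new MyImageListener());
intent.putExtra(SegmentScanActivity.EXTRAS_IMAGE_METADATA_SETTINGS, ims);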

Customization of RandomScanActivity activity

RandomScanActivity accepts intent extras similar to SegmentScanActivity, with a few differences listed below.

  • RandomScanActivity.EXTRAS_SCAN_CONFIGURATION - with this extra you must set the array of RandomScanElement objects. Each RandomScanElement holds the following information about a scan element: the title of the scanned item, the name of the scanned item and an object defining the OCR parser settings. Additionally, it is possible to set the parser group for the parser that is responsible for extracting the element data by using the setParserGroup(String groupName) method on the RandomScanElement object. If all parsers are in the same parser group, recognition will be faster, but sometimes the merged OCR engine options may cause some parsers to be unable to extract valid data from the scanned text. Putting each parser into its own group gives better accuracy, but performs OCR of the image once per parser, which can consume a lot of processing time. By default, if parser groups are not defined, all parsers are placed in the same parser group (see the sketch after this list). More information about parser settings can be found in the chapter Scanning segments with BlinkOCR recognizer.

  • RandomScanActivity.EXTRAS_SCAN_MESSAGE - with this extra it is possible to change the default scan message that is displayed above the scanning window. You can use the following code snippet to set the scan message string:

    intent.putExtra(RandomScanActivity.EXTRAS_SCAN_MESSAGE, message);
  • RandomScanActivity.EXTRAS_BEEP_RESOURCE - with this extra you can set the resource ID of the sound to be played when a scan element is recognized. You can use the following snippet to set this extra:

    intent.putExtra(RandomScanActivity.EXTRAS_BEEP_RESOURCE, R.raw.beep);
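
For illustration, here is a minimal sketch of putting two parsers into separate parser groups via setParserGroup; the group names are arbitrary:

// putting each parser into its own group trades speed for accuracy,
// because OCR is then performed once per group
RandomScanElement iban = new RandomScanElement(R.string.iban_title, "IBAN", new IbanParserSettings());
iban.setParserGroup("iban_group");

RandomScanElement amount = new RandomScanElement(R.string.amount_title, "Amount", new AmountParserSettings());
amount.setParserGroup("amount_group");

intent.putExtra(RandomScanActivity.EXTRAS_SCAN_CONFIGURATION, new RandomScanElement[] { iban, amount });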

Embedding RecognizerView into custom scan activity

This section will discuss how to embed RecognizerView into your scan activity and perform scan.

  1. First, make sure that RecognizerView is a member field in your activity. This is required because you will need to pass all activity lifecycle events to RecognizerView.
  2. It is recommended to keep your scan activity in one orientation, such as portrait or landscape. Setting sensor as the scan activity's orientation will trigger a full restart of the activity whenever the device orientation changes. This provides a very poor user experience, because both the camera and the BlinkOCR native library have to be restarted every time. There are ways to mitigate this behaviour, discussed later.
  3. In your activity's onCreate method, create a new RecognizerView, define its settings and listeners and then call its create method. After that, add the views that should be laid out on top of the camera view.
  4. Override your activity's onStart, onResume, onPause, onStop and onDestroy methods and call RecognizerView's lifecycle methods start, resume, pause, stop and destroy. This will ensure correct camera and native resource management. If you plan to manage RecognizerView's lifecycle independently of the host activity's lifecycle, make sure the order of calls to the lifecycle methods is the same as with activities (i.e. you should not call resume if create and start were not called first).

Here is a minimal example of integrating RecognizerView as the only view in your activity:

public class MyScanActivity extends Activity implements ScanResultListener, CameraEventsListener {
	private static final int PERMISSION_CAMERA_REQUEST_CODE = 69;
	private RecognizerView mRecognizerView;
		
	@Override
	protected void onCreate(Bundle savedInstanceState) {
		super.onCreate(savedInstanceState);
		// create RecognizerView
		mRecognizerView = new RecognizerView(this);
		   
		RecognitionSettings settings = new RecognitionSettings();
		// setup array of recognition settings (described in chapter "Recognition 
		// settings and results")
		RecognizerSettings[] settArray = setupSettingsArray();
		if(!RecognizerCompatibility.cameraHasAutofocus(CameraType.CAMERA_BACKFACE, this)) {
			settArray = RecognizerSettingsUtils.filterOutRecognizersThatRequireAutofocus(settArray);
		}
		settings.setRecognizerSettingsArray(settArray);
		mRecognizerView.setRecognitionSettings(settings);
		
		try {
		    // set license key
		    mRecognizerView.setLicenseKey(this, "your license key");
		} catch (InvalidLicenceKeyException exc) {
		    finish();
		    return;
		}
		
		// scan result listener will be notified when scan result gets available
		mRecognizerView.setScanResultListener(this);
		// camera events listener will be notified about camera lifecycle and errors
		mRecognizerView.setCameraEventsListener(this);
		
		// set camera aspect mode
		// ASPECT_FIT will fit the camera preview inside the view
		// ASPECT_FILL will zoom and crop the camera preview, but will use the
		// entire view surface
		mRecognizerView.setAspectMode(CameraAspectMode.ASPECT_FILL);
		   
		mRecognizerView.create();
		
		setContentView(mRecognizerView);
	}
	
	@Override
	protected void onStart() {
	   super.onStart();
	   // you need to pass all activity's lifecycle methods to RecognizerView
	   mRecognizerView.start();
	}
	
	@Override
	protected void onResume() {
	   	super.onResume();
	   	// you need to pass all activity's lifecycle methods to RecognizerView
       mRecognizerView.resume();
	}

	@Override
	protected void onPause() {
	   	super.onPause();
	   	// you need to pass all activity's lifecycle methods to RecognizerView
		mRecognizerView.pause();
	}

	@Override
	protected void onStop() {
	   super.onStop();
	   // you need to pass all activity's lifecycle methods to RecognizerView
	   mRecognizerView.stop();
	}
	
	@Override
	protected void onDestroy() {
	   super.onDestroy();
	   // you need to pass all activity's lifecycle methods to RecognizerView
	   mRecognizerView.destroy();
	}

	@Override
	public void onConfigurationChanged(Configuration newConfig) {
	   super.onConfigurationChanged(newConfig);
	   // you need to pass all activity's lifecycle methods to RecognizerView
	   mRecognizerView.changeConfiguration(newConfig);
	}
		
    @Override
    public void onScanningDone(RecognitionResults results) {
    	// this method is from ScanResultListener and will be called when scanning completes
    	// RecognitionResults may contain multiple results in array returned
    	// by method getRecognitionResults().
    	// This depends on settings in RecognitionSettings object that was
    	// given to RecognizerView.
    	// For more information, see chapter "Recognition settings and results".
    	
    	// After this method ends, scanning will be resumed and recognition
    	// state will be retained. If you want to prevent that, then
    	// you should call:
    	// mRecognizerView.resetRecognitionState();

		// If you want to pause scanning to prevent receiving recognition
		// results, you should call:
		// mRecognizerView.pauseScanning();
		// After scanning is paused, you will have to resume it with:
		// mRecognizerView.resumeScanning(true);
		// boolean in resumeScanning method indicates whether recognition
		// state should be automatically reset when resuming scanning
    }
    
    @Override
    public void onCameraPreviewStarted() {
        // this method is from CameraEventsListener and will be called when camera preview starts
    }
    
    @Override
    public void onCameraPreviewStopped() {
        // this method is from CameraEventsListener and will be called when camera preview stops
    }

    @Override
    public void onError(Throwable exc) {
        /** 
         * This method is from CameraEventsListener and will be called when 
         * opening of camera resulted in exception or recognition process
         * encountered an error. The error details will be given in exc
         * parameter.
         */
    }
    
    @Override
    @TargetApi(23)
    public void onCameraPermissionDenied() {
    	/**
    	 * Called on Android 6.0 and newer if camera permission is not given
    	 * by user. You should request permission from user to access camera.
    	 */
    	 requestPermissions(new String[]{Manifest.permission.CAMERA}, PERMISSION_CAMERA_REQUEST_CODE);
    	 /**
    	  * Please note that user might have not given permission to use 
    	  * camera. In that case, you have to explain to user that without
    	  * camera permissions scanning will not work.
    	  * For more information about requesting permissions at runtime, check
    	  * this article:
    	  * https://developer.android.com/training/permissions/requesting.html
    	  */
    }
    
    @Override
    public void onAutofocusFailed() {
	    /**
	     * This method is from CameraEventsListener and will be called when camera focusing has failed. 
	     * Camera manager usually tries different focusing strategies and this method is called when all 
	     * those strategies fail to indicate that either object on which camera is being focused is too 
	     * close or ambient light conditions are poor.
	     */
    }
    
    @Override
    public void onAutofocusStarted(Rect[] areas) {
	    /**
	     * This method is from CameraEventsListener and will be called when camera focusing has started.
	     * You can utilize this method to draw focusing animation on UI.
	     * Areas parameter is array of rectangles where focus is being measured. 
	     * It can be null on devices that do not support fine-grained camera control.
	     */
    }

    @Override
    public void onAutofocusStopped(Rect[] areas) {
	    /**
	     * This method is from CameraEventsListener and will be called when camera focusing has stopped.
	     * You can utilize this method to remove focusing animation on UI.
	     * Areas parameter is array of rectangles where focus is being measured. 
	     * It can be null on devices that do not support fine-grained camera control.
	     */
    }
}

Scan activity's orientation

If the activity's screenOrientation property in AndroidManifest.xml is set to sensor, fullSensor or similar, the activity will be restarted every time the device changes orientation from portrait to landscape and vice versa. While restarting the activity, its onPause, onStop and onDestroy methods will be called and then a new activity will be created from scratch. This is a potential problem for a scan activity, because in its lifecycle it controls both the camera and the native library - restarting the activity triggers a restart of both. Changing orientation will therefore be very slow, degrading the user experience. We do not recommend this setting.

For this reason, we recommend setting your scan activity to either portrait or landscape mode and handling device orientation changes manually. To help you with this, RecognizerView supports adding child views that will be rotated regardless of the activity's screenOrientation. You add a view you wish to be rotated (such as a view that contains buttons, status messages, etc.) to RecognizerView with the addChildView method. The second parameter of the method is a boolean that defines whether the view you are adding will be rotated with the device. To define the allowed orientations, implement the OrientationAllowedListener interface and set it on RecognizerView with the setOrientationAllowedListener method. This is the recommended way of rotating the camera overlay; a sketch is given below.
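
For illustration, a minimal sketch might look as follows. The isOrientationAllowed method name and the Orientation constant used here are assumptions based on the SDK's naming conventions (this document itself only shows Orientation.ORIENTATION_LANDSCAPE_RIGHT), so check the javadoc for the exact signatures; R.layout.camera_overlay is a hypothetical layout:

// add an overlay that rotates with the device, independently of the activity
View overlay = getLayoutInflater().inflate(R.layout.camera_overlay, null);
mRecognizerView.addChildView(overlay, true);

// allow only the portrait orientation (method and constant names are assumptions)
mRecognizerView.setOrientationAllowedListener(new OrientationAllowedListener() {
    @Override
    public boolean isOrientationAllowed(Orientation orientation) {
        return orientation == Orientation.ORIENTATION_PORTRAIT;
    }
});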

However, if you really want to set the screenOrientation property to sensor or similar and want Android to handle orientation changes of your scan activity, then we recommend setting the configChanges property of your activity to orientation|screenSize. This tells Android not to restart your activity when the device orientation changes. Instead, the activity's onConfigurationChanged method will be called so that the activity can be notified of the configuration change. In your implementation of this method, you should call the changeConfiguration method of RecognizerView so it can adapt its camera surface and child views to the new configuration. Note that on Android versions older than 4.0, changing the configuration requires a restart of the camera, which can be slow.

RecognizerView reference

The complete reference of RecognizerView is available in Javadoc. The usage example is provided in BlinkOCRFullScreen demo app provided with SDK. This section just gives a quick overview of RecognizerView's most important methods.

create() should be called in the activity's onCreate method. It will initialize RecognizerView's internal fields and the camera control thread. This method must be called after all other settings are already defined, such as listeners and recognition settings. After calling this method, you can add child views to RecognizerView with the method addChildView(View, boolean).

start() should be called in the activity's onStart method. It will initialize the background processing thread and start native library initialization on that thread.

resume() should be called in the activity's onResume method. It will trigger background initialization of the camera. After the camera is loaded, camera frame recognition will start, unless the scanning loop is paused.

pause() should be called in the activity's onPause method. It will stop the camera, but keep the native library loaded.

stop() should be called in the activity's onStop method. It will deinitialize the native library, terminate the background processing thread and free all resources that are no longer necessary.

destroy() should be called in the activity's onDestroy method. It will free all resources allocated in create() and terminate the camera control thread.

changeConfiguration(Configuration) should be called in the activity's onConfigurationChanged method. It will adapt the camera surface to the new configuration without restarting the activity. See Scan activity's orientation for more information.

With this method you can define which camera on the device will be used. The default is the back-facing camera.

setAspectMode(CameraAspectMode) defines the aspect mode of the camera. If set to ASPECT_FIT (default), the camera preview will be letterboxed inside the available view space. If set to ASPECT_FILL, the camera preview will be zoomed and cropped to use the entire view space.

With this method you can define the video resolution preset that will be used when choosing the camera resolution for scanning.

setRecognitionSettings(RecognitionSettings) sets the recognition settings that define what will be scanned and how the scan will be performed. For more information about recognition settings and results see Recognition settings and results. This method must be called before create().

With this method you can reconfigure the recognition process while the recognizer is active. Unlike setRecognitionSettings, this method must be called while the recognizer is active (i.e. after resume has been called). For more information about recognition settings see Recognition settings and results.

setOrientationAllowedListener(OrientationAllowedListener) sets a listener which will be asked whether the current orientation is allowed. If an orientation is allowed, it will be used to rotate rotatable views and it will be passed to the native library so that recognizers can be aware of the new orientation. If you do not set this listener, recognition will be performed only in the orientation defined by the current activity's orientation.

setScanResultListener(ScanResultListener) sets a listener which will be notified when recognition completes. After recognition completes, RecognizerView pauses its scanning loop, and to continue scanning you will have to call the resumeScanning method. In this listener you can obtain data from the scanning results. For more information see Recognition settings and results.

setCameraEventsListener(CameraEventsListener) sets a listener which will be notified when various camera events occur, such as when the camera preview has started, autofocus has failed, or there has been an error while using the camera or performing the recognition.

pauseScanning() pauses the scanning loop, but keeps both the camera and the native library initialized. Pause and resume scanning methods count the number of calls, so if you called pauseScanning() twice, you will have to call resumeScanning twice to actually resume scanning.

resumeScanning(boolean) resumes the paused scanning loop. If called with true, it implicitly calls resetRecognitionState(). If called with false, the old recognition state is not reset, so it can be reused to boost the recognition result; this may not always be the desired behaviour. Pause and resume scanning methods count the number of calls, so if you called pauseScanning() twice, you will have to call resumeScanning twice to actually resume the scanning loop, as illustrated below.
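
A short sketch of the call counting:

// paused twice, e.g. by two independent UI components
mRecognizerView.pauseScanning();
mRecognizerView.pauseScanning();

mRecognizerView.resumeScanning(true); // still paused - pause count drops to 1
mRecognizerView.resumeScanning(true); // scanning actually resumes here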

With this method you can set up RecognizerView so that it does not automatically resume scanning the first time resume is called. An example use case is when you want to display onboarding help the first time the camera opens and want to prevent scanning in the background while the onboarding is displayed over the camera preview.

resetRecognitionState() resets the internal recognition state. State is usually kept to improve recognition quality over time, but sometimes you might get poorer results without resetting it (for example, if you scan one object and then another without resetting the state, you might end up with a result that contains properties from both scanned objects).
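
For example, to make every scan start from a clean state, you could reset the state whenever a result is delivered:

@Override
public void onScanningDone(RecognitionResults results) {
    // process results here ...

    // drop the accumulated recognition state so the next scan starts fresh
    mRecognizerView.resetRecognitionState();
}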

addChildView(View, boolean) adds your own view on top of RecognizerView. RecognizerView will ensure that your view is laid out exactly above the camera preview surface (which can be letterboxed if the aspect ratio of the camera preview size does not match the aspect ratio of RecognizerView and the camera aspect mode is set to ASPECT_FIT). The boolean parameter defines whether your view should be rotated with device orientation changes. The rotation is independent of the host activity's orientation changes, and the allowed orientations are determined by the OrientationAllowedListener. See also Scan activity's orientation for more information on why you should rotate your views independently of the activity.

This method returns true if the camera thinks it has focused on an object. Note that the camera has to be active for this method to work. If the camera is not active, it returns false.

This method requests the camera to perform autofocus. If the camera does not support autofocus, the method does nothing. Note that the camera has to be active for this method to work.

This method returns true if the camera supports torch flash mode. Note that the camera has to be active for this method to work. If the camera is not active, it returns false.

If torch flash mode is supported by the camera, this method can be used to enable or disable it. After the operation is performed, the SuccessCallback will be called with a boolean indicating whether the operation succeeded. Note that the camera has to be active for this method to work and that the callback might be called on a background non-UI thread.

You can use this method to define the scanning region, and whether this scanning region will be rotated with the device when the OrientationAllowedListener determines that an orientation is allowed. This is useful if you have your own camera overlay on top of RecognizerView that is set as a rotatable view - you can thus synchronize the rotation of the view with the rotation of the scanning region that the native code will scan.

The scanning region is defined as a Rectangle. The first parameter of the rectangle is the x-coordinate represented as a percentage of the view width, the second parameter is the y-coordinate represented as a percentage of the view height, the third parameter is the region width represented as a percentage of the view width and the fourth parameter is the region height represented as a percentage of the view height.

View width and height are defined in the current context, i.e. they depend on screen orientation. If you allow your ROI view to be rotated, then in portrait the view width will be smaller than the height, whilst in landscape the width will be larger than the height. This complies with the view designer preview. If you choose not to rotate your ROI view, then your ROI view will be laid out either in portrait or landscape, depending on the setting for your scan activity in AndroidManifest.xml.

Note that the scanning region only affects the native code - it does not have any impact on the user interface. You are required to create a matching user interface that visualizes the same scanning region you set here.
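
For illustration, a sketch of defining a centered scanning region might look like this, assuming the method is named setScanningRegion(Rectangle, boolean) - check the javadoc for the exact signature:

// scan only a horizontally centered band: 10% inset from the left and right,
// starting at 40% of the view height and covering 20% of it;
// 'false' means the region is not rotated with the device
mRecognizerView.setScanningRegion(new Rectangle(0.1f, 0.4f, 0.8f, 0.2f), false);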

This method can only be called when the camera is active. You can use it to define the regions which the camera will use to perform metering for focus, white balance and exposure corrections. On devices that do not support metering areas, this will be ignored. Some devices support multiple metering areas and some support only one. If the device supports only one metering area, only the first rectangle from the array will be used.

Each region is defined as a Rectangle. The first parameter of the rectangle is the x-coordinate represented as a percentage of the view width, the second parameter is the y-coordinate represented as a percentage of the view height, the third parameter is the region width represented as a percentage of the view width and the fourth parameter is the region height represented as a percentage of the view height.

View width and height are defined in the current context, i.e. they depend on the current device orientation. If you have a custom OrientationAllowedListener, then the device orientation will be the last orientation that you allowed in your listener. If you have not set one, the orientation will be the orientation of the activity as defined in AndroidManifest.xml. In portrait orientation the view width will be smaller than the height, whilst in landscape the width will be larger than the height. This complies with the view designer preview.

The second boolean parameter indicates whether the metering areas should be automatically updated when the device orientation changes.
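
A sketch, assuming the method is named setMeteringAreas(Rectangle[], boolean) - check the javadoc for the exact signature:

// meter focus, white balance and exposure on the central quarter of the view;
// 'true' requests automatic updates when the device orientation changes
mRecognizerView.setMeteringAreas(new Rectangle[] { new Rectangle(0.25f, 0.25f, 0.5f, 0.5f) }, true);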

setMetadataListener(MetadataListener, MetadataSettings) sets a metadata listener that will obtain various metadata from the current recognition process. Which metadata will be available depends on the metadata settings. For more information and examples, check the demo applications and the section Obtaining various metadata with MetadataListener.

setLicenseKey sets the license key that unlocks all features of the native library. You can obtain your license key from the Microblink website.

Use this method to set a license key that is bound to a licensee, not to the application package name. You will use this method when you obtain a license key that allows you to use the BlinkOCR SDK in multiple applications. You can obtain your license key from the Microblink website.
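
A sketch, assuming the licensee-aware overload takes the licensee string followed by the licence key and throws InvalidLicenceKeyException like the single-key variant shown earlier - check the javadoc for the exact signature:

try {
    // assumed overload: licensee first, then the licence key
    mRecognizerView.setLicenseKey("Enter_Licensee_Here", "Enter_License_Key_Here");
} catch (InvalidLicenceKeyException exc) {
    // the key is malformed or does not match the licensee
    finish();
}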

Using direct API for recognition of Android Bitmaps

This section describes how to use the Direct API to recognize Android Bitmaps without the need for a camera. You can use the Direct API anywhere in your application, not just from activities.

  1. First, you need to obtain a reference to the Recognizer singleton using getSingletonInstance.
  2. Second, you need to initialize the recognizer.
  3. After initialization, you can use the singleton to process images. You cannot process multiple images in parallel.
  4. Do not forget to terminate the recognizer after use (it is a shared resource).

Here is a minimal example of using the Direct API to recognize an Android Bitmap:

public class DirectAPIActivity extends Activity implements ScanResultListener {
	private Recognizer mRecognizer;
		
	@Override
	protected void onCreate(Bundle savedInstanceState) {
		super.onCreate(savedInstanceState);
		// initialize your activity here
	}
	
	@Override
	protected void onStart() {
	   super.onStart();
	   try {
		   mRecognizer = Recognizer.getSingletonInstance();
		} catch (FeatureNotSupportedException e) {
			Toast.makeText(this, "Feature not supported! Reason: " + e.getReason().getDescription(), Toast.LENGTH_LONG).show();
			finish();
			return;
		}
	   try {
	       // set license key
	       mRecognizer.setLicenseKey(this, "your license key");
	   } catch (InvalidLicenceKeyException exc) {
	       finish();
	       return;
	   }
		RecognitionSettings settings = new RecognitionSettings();
		// setupSettingsArray method is described in chapter "Recognition 
		// settings and results")
		settings.setRecognizerSettingsArray(setupSettingsArray());
		mRecognizer.initialize(this, settings, new DirectApiErrorListener() {
			@Override
			public void onRecognizerError(Throwable t) {
				Toast.makeText(DirectAPIActivity.this, "There was an error in initialization of Recognizer: " + t.getMessage(), Toast.LENGTH_SHORT).show();
				finish();
			}
		});
	}
	
	@Override
	protected void onResume() {
	   super.onResume();
		// start recognition
		Bitmap bitmap = BitmapFactory.decodeFile("/path/to/some/file.jpg");
		mRecognizer.recognize(bitmap, Orientation.ORIENTATION_LANDSCAPE_RIGHT, this);
	}

	@Override
	protected void onStop() {
	   super.onStop();
	   mRecognizer.terminate();
	}

    @Override
    public void onScanningDone(RecognitionResults results) {
    	// this method is from ScanResultListener and will be called 
    	// when scanning completes
    	// RecognitionResults may contain multiple results in array returned
    	// by method getRecognitionResults().
    	// This depends on settings in RecognitionSettings object that was
    	// given to Recognizer.
    	// For more information, see chapter "Recognition settings and results".
    	    	
    	finish(); // in this example, just finish the activity
    }
    
}

Understanding DirectAPI's state machine

DirectAPI's Recognizer singleton is actually a state machine which can be in one of 4 states: OFFLINE, UNLOCKED, READY and WORKING.

  • When you obtain a reference to the Recognizer singleton, it will be in OFFLINE state.
  • First you need to unlock the Recognizer by providing a valid licence key using the setLicenseKey method. If you attempt to call setLicenseKey while the Recognizer is not in OFFLINE state, you will get an IllegalStateException.
  • After successful unlocking, the Recognizer singleton will move to UNLOCKED state.
  • Once in UNLOCKED state, you can initialize the Recognizer by calling the initialize method. If you call initialize while the Recognizer is not in UNLOCKED state, you will get an IllegalStateException.
  • After successful initialization, the Recognizer will move to READY state. Now you can call any of the recognize* methods.
  • When starting recognition with any of the recognize* methods, the Recognizer will move to WORKING state. If you attempt to call these methods while the Recognizer is not in READY state, you will get an IllegalStateException.
  • Recognition is performed on a background thread, so it is safe to call all of the Recognizer's methods from the UI thread.
  • When recognition is finished, the Recognizer first moves back to READY state and then returns the result via the provided ScanResultListener.
  • Please note that ScanResultListener's onScanningDone method will be called on the background processing thread, so make sure you do not perform UI operations in this callback.
  • By calling the terminate method, the Recognizer singleton will release all of its internal resources and request the processing thread to terminate. Note that even after calling terminate you might receive an onScanningDone event if there was work in progress when terminate was called.
  • The terminate method can be called from any of the Recognizer singleton's states.
  • You can observe the Recognizer singleton's state with the method getCurrentState (see the sketch after this list).
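
A minimal sketch of guarding a recognize* call with getCurrentState; the RecognizerState enum name and the way its constants are referenced here are assumptions, so check the javadoc for the exact types:

// only start recognition when the Recognizer is READY,
// otherwise recognize* would throw IllegalStateException
if (mRecognizer.getCurrentState() == RecognizerState.READY) { // assumed enum name
    mRecognizer.recognize(bitmap, Orientation.ORIENTATION_LANDSCAPE_RIGHT, this);
}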

Using DirectAPI while RecognizerView is active

Both RecognizerView and the Direct API recognizer use the same internal singleton that manages the native code. This singleton handles initialization and termination of the native library and propagates recognition settings to it. It is possible to use RecognizerView and the Direct API together, as the internal singleton will ensure correct synchronization and that the correct recognition settings are used. If you run into problems while using the Direct API in combination with RecognizerView, let us know!

Obtaining various metadata with MetadataListener

This section gives an example of how to use a MetadataListener to obtain various metadata, such as the object detection location, the images that are being processed and much more. Which metadata will be obtainable is configured with MetadataSettings. You must set both the MetadataSettings and your implementation of MetadataListener before calling the create method of RecognizerView. Setting them afterwards causes undefined behaviour.

The following code snippet shows how to configure MetadataSettings to obtain the detection location, the video frame that was used to obtain a valid scan result, and the dewarped image of the document being scanned (NOTE: the availability of metadata depends on the currently active recognizers and their settings. Not all recognizers can produce all types of metadata. Check the Recognition settings and results article for more information about recognizers and their settings):

// this snippet should be in onCreate method of your scanning activity

MetadataSettings ms = new MetadataSettings();
// enable receiving of detection location
ms.setDetectionMetadataAllowed(true);

// ImageMetadataSettings contains settings for defining which images will be returned
MetadataSettings.ImageMetadataSettings ims = new MetadataSettings.ImageMetadataSettings();
// enable returning of dewarped images, if they are available
ims.setDewarpedImageEnabled(true);
// enable returning of image that was used to obtain valid scanning result
ims.setSuccessfulScanFrameEnabled(true);

// set ImageMetadataSettings to MetadataSettings object
ms.setImageMetadataSettings(ims);

// this line must be called before mRecognizerView.create()
mRecognizerView.setMetadataListener(myMetadataListener, ms);

The following snippet shows one possible implementation of MetadataListener:

public class MyMetadataListener implements MetadataListener {

	/**
	 * Called when metadata is available.
	 */
    @Override
    public void onMetadataAvailable(Metadata metadata) {
    	// detection location will be available as DetectionMetadata
        if (metadata instanceof DetectionMetadata) {
        	// DetectionMetadata contains DetectorResult which is null if object detection
        	// has failed and non-null otherwise
        	// Let's assume that we have a QuadViewManager which can display animated frame
        	// around detected object (for reference, please check javadoc and demo apps)
            DetectorResult dr = ((DetectionMetadata) metadata).getDetectionResult();
            if (dr == null) {
            	// animate frame to default location if detection has failed
                mQuadViewManager.animateQuadToDefaultPosition();
            } else if (dr instanceof QuadDetectorResult) {
            	// otherwise, animate frame to detected location
                mQuadViewManager.animateQuadToDetectionPosition((QuadDetectorResult) dr);
            }
        // images will be available inside ImageMetadata
        } else if (metadata instanceof ImageMetadata) {
        	// obtain image
        	
        	// Please note that Image's internal buffers are valid only
        	// until this method ends. If you want to save image for later,
        	// obtain a cloned image with image.clone().
        	
            Image image = ((ImageMetadata) metadata).getImage();
            // to convert the image to Bitmap, call image.convertToBitmap()
            
            // after this line, image gets disposed. If you want to save it
            // for later, you need to clone it with image.clone()
        }
    }
}


Using ImageListener to obtain images that are being processed

There are two ways of obtaining the images that are being processed: by implementing the ImageListener interface, as described in this section, or by implementing a MetadataListener, as described in the previous section.

This section gives an example of how to implement the ImageListener interface to obtain the images that are being processed. ImageListener has only one method that needs to be implemented: onImageAvailable(Image). This method is called whenever the library has an image available for the current processing step. Image is a class that contains all information about the available image, including a buffer with the image pixels. An image can be in one of several formats and of several types. ImageFormat defines the pixel format of the image, while ImageType defines the type of the image. The ImageListener interface extends Android's Parcelable interface, so it is possible to send implementations via intents.

Here is an example implementation of the ImageListener interface. This implementation saves all images into the myImages folder on the device's external storage:

public class MyImageListener implements ImageListener {

   /**
    * Called when library has image available.
    */
    @Override
    public void onImageAvailable(Image image) {
        // we will save images to 'myImages' folder on external storage
        // image filenames will be 'imageType - currentTimestamp.jpg'
        String output = Environment.getExternalStorageDirectory().getAbsolutePath() + "/myImages";
        File f = new File(output);
        if(!f.exists()) {
            f.mkdirs();
        }
        DateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd-HH-mm-ss");
        String dateString = dateFormat.format(new Date());
        String filename = null;
        switch(image.getImageFormat()) {
            case ALPHA_8: {
                filename = output + "/alpha_8 - " + image.getImageName() + " - " + dateString + ".jpg";
                break;
            }
            case BGRA_8888: {
                filename = output + "/bgra - " + image.getImageName() + " - " + dateString + ".jpg";
                break;
            }
            case YUV_NV21: {
                filename = output + "/yuv - " + image.getImageName()+ " - " + dateString + ".jpg";
                break;
            }
        }
        Bitmap b = image.convertToBitmap();
        FileOutputStream fos = null;
        try {
            fos = new FileOutputStream(filename);
            boolean success = b.compress(Bitmap.CompressFormat.JPEG, 100, fos);
            if(!success) {
                Log.e(this, "Failed to compress bitmap!");
                if(fos != null) {
                    try {
                        fos.close();
                    } catch (IOException ignored) {
                    } finally {
                        fos = null;
                    }
                    new File(filename).delete();
                }
            }
        } catch (FileNotFoundException e) {
            Log.e(this, e, "Failed to save image");
        } finally {
            if(fos != null) {
                try {
                    fos.close();
                } catch (IOException ignored) {
                }
            }
        }
        // after this line, image gets disposed. If you want to save it
        // for later, you need to clone it with image.clone()
    }

    /**
     * ImageListener interface extends Parcelable interface, so we also need to implement
     * that interface. The implementation of Parcelable interface is below this line.
     */

    @Override
    public int describeContents() {
        return 0;
    }

    @Override
    public void writeToParcel(Parcel dest, int flags) {
    }

    public static final Creator<MyImageListener> CREATOR = new Creator<MyImageListener>() {
        @Override
        public MyImageListener createFromParcel(Parcel source) {
            return new MyImageListener();
        }

        @Override
        public MyImageListener[] newArray(int size) {
            return new MyImageListener[size];
        }
    };
}

Note that an ImageListener can only be given to SegmentScanActivity via Intent, while for RecognizerView you need to give a MetadataListener and MetadataSettings that define which metadata should be obtained. When you give an ImageListener to SegmentScanActivity via Intent, the activity internally registers a MetadataListener that enables obtaining of all available image types and invokes the ImageListener given via Intent with the result. For more information and examples of how to use MetadataListener for obtaining images, refer to the demo applications.
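
For illustration, here is a minimal sketch of attaching the MyImageListener implementation from above when starting SegmentScanActivity. The extra name EXTRAS_IMAGE_LISTENER is an assumption based on the SDK's naming conventions, and other required extras (such as the licence key) are omitted, so check the javadoc and demo apps for the exact constants:

// a sketch; EXTRAS_IMAGE_LISTENER is an assumed extra name - check the javadoc.
// Other required extras (e.g. the licence key) are omitted for brevity.
Intent intent = new Intent(this, SegmentScanActivity.class);
intent.putExtra(SegmentScanActivity.EXTRAS_IMAGE_LISTENER, new MyImageListener());
startActivityForResult(intent, 100); // the request code is chosen arbitrarily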

Recognition settings and results

This chapter will discuss various recognition settings used to configure different recognizers and scan results generated by them.

Recognition settings define what will be scanned and how the recognition process will be performed. Here is a list of the most relevant methods:

setAllowMultipleScanResultsOnSingleImage(boolean)

Sets whether outputting of multiple scan results from the same image is allowed. If set to true, it is possible to return multiple recognition results produced by different recognizers from the same image. However, a single recognizer can still produce only a single result from a single image. If this option is false, the array of BaseRecognitionResults will contain at most one element. The upside of setting this option to false is speed: if you enable lots of recognizers, the recognition chain is terminated as soon as the first recognizer succeeds in scanning, and the other recognizers do not get a chance to analyze the image. The downside is that you are then unable to obtain multiple results from different recognizers for a single image. By default, this option is false.

setNumMsBeforeTimeout(int)

Sets the number of milliseconds BlinkOCR will attempt to perform the scan before it exits with a timeout error. On timeout, the returned array of BaseRecognitionResults inside RecognitionResults might be null or empty, or may contain only elements that are not valid (isValid returns false) or are empty (isEmpty returns true).

NOTE: Please be aware that the time counting does not start from the moment when scanning starts. Instead, it starts from the moment when at least one BaseRecognitionResult becomes available that is non-empty but not yet valid.

The reason for this is a better user experience in cases when, for example, the timeout is set to 10 seconds and the user starts scanning, leaves the device lying on the table for 9 seconds and only then points it towards the object to scan: in such a case it is better to let the user scan the object than to complete the scan with an empty result as soon as the 10-second timeout expires.

setFrameQualityEstimationMode(FrameQualityEstimationMode)

Sets the mode of frame quality estimation. Frame quality estimation is the process of estimating the quality of a video frame so that only the best-quality frames are chosen for processing, and no time is wasted on processing frames of too poor quality to contain any meaningful information. It is not used when performing recognition of Android Bitmaps via the Direct API. You can choose between three frame quality estimation modes: automatic, always on and always off.

  • In automatic mode (default), frame quality estimation will be used if the device has multiple processor cores, or if, on a single-core device, at least one active recognizer requires frame quality estimation.
  • In always-on mode, frame quality estimation will always be used, regardless of device or active recognizers.
  • In always-off mode, frame quality estimation will always be disabled, regardless of device or active recognizers. This setting is not recommended because it can significantly decrease the quality of the scanning process.

setRecognizerSettingsArray(RecognizerSettings[])

Sets the array of RecognizerSettings that defines which recognizers should be activated and how they should be set up. The list of available RecognizerSettings and their specifics is given below.
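
Putting these options together, a minimal configuration sketch might look as follows; the RecognitionSettings class name and the setter names are taken from the descriptions above, but treat them as assumptions and verify them against the javadoc:

// a sketch; class, setter and enum constant names assumed from the
// descriptions above - verify them against the javadoc
RecognitionSettings settings = new RecognitionSettings();
// return at most one result per image (default behaviour)
settings.setAllowMultipleScanResultsOnSingleImage(false);
// give up 10 seconds after the first non-empty partial result appears
settings.setNumMsBeforeTimeout(10000);
// let the library decide whether to use frame quality estimation
settings.setFrameQualityEstimationMode(FrameQualityEstimationMode.AUTOMATIC);
// activate the recognizers configured in a setupSettingsArray() method,
// as shown in the sections below
settings.setRecognizerSettingsArray(setupSettingsArray());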

Scanning segments with BlinkOCR recognizer

This section discusses setting up the BlinkOCR recognizer and obtaining results from it. You should also check the demo apps for examples.

Setting up BlinkOCR recognizer

The BlinkOCR recognizer consists of one or more parsers that are grouped in parser groups. Each parser knows how to extract a certain element from an OCR result and also knows which OCR engine options are best for performing OCR on the image. Parser groups contain one or more parsers and are responsible for merging the required OCR engine options of each parser in the group, performing OCR only once and then letting each parser in the group parse the data. Thus, you can make your own tradeoff between speed and accuracy: putting each parser into its own group gives the best accuracy, but performs one OCR pass of the image per parser, which can consume a lot of processing time. On the other hand, putting all parsers into the same group performs only one OCR pass, but with settings that are combined for all parsers in the group, possibly reducing parsing quality.

Let's see this with an example: assume we have two parsers at our disposal: AmountParser and EMailParser. AmountParser knows how to extract amounts from an OCR result and requires the OCR engine to recognise only digits, periods and commas, and to ignore letters. On the other hand, EMailParser knows how to extract e-mails from an OCR result and requires the OCR engine to recognise letters, digits, '@' characters and periods, but not commas.

If we put both AmountParser and EMailParser into the same parser group, the merged OCR engine settings will require recognition of all letters, all digits, the '@' character, and both the period and the comma. Such an OCR result will contain all the characters EMailParser needs to properly parse an e-mail, but it might confuse AmountParser if the OCR engine misclassifies some letters as digits.

If we put AmountParser in one parser group and EMailParser in another, OCR will be performed for each parser group independently, thus preventing the AmountParser confusion, but two OCR passes over the image will be performed, which can have a performance impact.

To sum it up, the BlinkOCR recognizer performs OCR of the image once for each available parser group, then runs all parsers in that group on the obtained OCR result and saves the parsed data.

By definition, each parser produces a string that represents the parsed data. The parsed string is stored under the parser's name, which has to be unique within the parser group. So, when defining settings for the BlinkOCR recognizer and adding parsers, you need to provide a name for each parser (you will use that name for obtaining the result later) and optionally provide a name for the parser group into which the parser will be put.

To activate the BlinkOCR recognizer, you need to create BlinkOCRRecognizerSettings, add some parsers to it and add it to the RecognizerSettings array. You can use the following code snippet to do that:

private RecognizerSettings[] setupSettingsArray() {
	BlinkOCRRecognizerSettings sett = new BlinkOCRRecognizerSettings();
	
	// add amount parser to default parser group
	sett.addParser("myAmountParser", new AmountParserSettings());
	
	// now add sett to recognizer settings array that is used to configure
	// recognition
	return new RecognizerSettings[] { sett };
}
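
To illustrate the AmountParser/EMailParser discussion above, the following sketch puts the two parsers into separate parser groups, trading speed for accuracy. It assumes an addParser overload that takes the parser group name as the first argument, so verify the exact signature in the javadoc:

private RecognizerSettings[] setupSettingsArrayWithGroups() {
	BlinkOCRRecognizerSettings sett = new BlinkOCRRecognizerSettings();

	// each parser gets its own group: best accuracy, but one OCR pass
	// per group (the group-name overload of addParser is an assumption)
	sett.addParser("amountGroup", "myAmountParser", new AmountParserSettings());
	sett.addParser("emailGroup", "myEMailParser", new EMailParserSettings());

	// now add sett to recognizer settings array that is used to configure
	// recognition
	return new RecognizerSettings[] { sett };
}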

The following is a list of available parsers:

  • Amount parser - represented by AmountParserSettings

    • used for parsing amounts from OCR result
  • IBAN parser - represented by IbanParserSettings

    • used for parsing International Bank Account Numbers (IBANs) from OCR result
  • E-mail parser - represented by EMailParserSettings

    • used for parsing e-mail addresses
  • Date parser - represented by DateParserSettings

    • used for parsing dates in various formats
  • Raw parser - represented by RawParserSettings

    • used for obtaining raw OCR result
  • Vehicle Identification Number (VIN) parser - represented by VinParserSettings

    • used for parsing vehicle identification numbers
  • License Plates parser - represented by LicensePlatesParserSettings

    • used for parsing license plate numbers
  • Regex parser - represented by RegexParserSettings

    • used for parsing arbitrary regular expressions
    • please note that some features, like back references, match grouping and certain regex metacharacters, are not supported. See the javadoc for more info.
  • Mobile coupons parser - represented by MobileCouponsParserSettings

    • used for parsing prepaid codes from mobile phone coupons

Obtaining results from BlinkOCR recognizer

The BlinkOCR recognizer produces a BlinkOCRRecognitionResult. You can use the instanceof operator to check if an element in the results array is an instance of the BlinkOCRRecognitionResult class. See the following snippet for an example:

@Override
public void onScanningDone(RecognitionResults results) {
	BaseRecognitionResult[] dataArray = results.getRecognitionResults();
	for (BaseRecognitionResult baseResult : dataArray) {
		if (baseResult instanceof BlinkOCRRecognitionResult) {
			BlinkOCRRecognitionResult result = (BlinkOCRRecognitionResult) baseResult;

			// you can use getters of BlinkOCRRecognitionResult class to
			// obtain scanned information
			if (result.isValid() && !result.isEmpty()) {
				// use the parser name provided to BlinkOCRRecognizerSettings to
				// obtain the parsed result provided by the given parser;
				// here we obtain the result of "myAmountParser" in the default parsing group
				String parsedAmount = result.getParsedResult("myAmountParser");
				// note that the parsed result can be null or empty even if the result
				// is marked as non-empty and valid
				if (parsedAmount != null && !parsedAmount.isEmpty()) {
					// do whatever you want with the parsed result
				}
				// obtain the OCR result for the default parsing group
				// the OCR result exists if the result is valid and non-empty
				OcrResult ocrResult = result.getOcrResult();
			} else {
				// not all relevant data was scanned, ask the user
				// to try again
			}
		}
	}
}

Available getters are:

boolean isValid()

Returns true if the scan result contains at least one OCR result in at least one parser group.

boolean isEmpty()

Returns true if the scan result is empty, i.e. nothing was scanned. All getters should return null for an empty result.

String getParsedResult(String parserName)

Returns the parsed result produced by the parser named parserName that was added to the default parser group. If a parser named parserName does not exist in the default parser group, returns null. If the parser exists but has failed to parse any data, it returns an empty string.

String getParsedResult(String parserGroupName, String parserName)

Returns the parsed result produced by the parser named parserName that was added to the parser group named parserGroupName. If a parser named parserName does not exist in the parser group named parserGroupName, or if that parser group does not exist, returns null. If the parser exists but has failed to parse any data, it returns an empty string.

Object getSpecificParsedResult(String parserName)

Returns the specific parser result for the parser with the given name in the default parser group. For example, the date parser, which is configured with DateParserSettings, can return the parsed date as a Date object. It is always possible to obtain the parsed result as a raw string by using the getParsedResult(String) or getParsedResult(String, String) method. If a parser named parserName does not exist in the default parser group, returns null. If the parser exists but has failed to parse any data, it returns null or an empty string.

Object getSpecificParsedResult(String parserGroupName, String parserName)

Returns the specific parser result for the parser with the given name in the given parser group. For example, the date parser, which is configured with DateParserSettings, can return the parsed date as a Date object. It is always possible to obtain the parsed result as a raw string by using the getParsedResult(String) or getParsedResult(String, String) method. If a parser named parserName does not exist in the parser group named parserGroupName, or if that parser group does not exist, returns null. If the parser exists but has failed to parse any data, it returns null or an empty string.
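
For example, if a date parser was added under the hypothetical name "myDateParser" to the default parser group, its strongly-typed result could be read as follows (per the description above, the date parser returns a Date object):

// sketch: "myDateParser" is a hypothetical parser name chosen when
// configuring the parsers; the date parser returns a java.util.Date
Date parsedDate = (Date) result.getSpecificParsedResult("myDateParser");
if (parsedDate != null) {
	// use the strongly-typed date instead of parsing the raw string
}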

OcrResult getOcrResult()

Returns the OCR result structure for the default parser group.

OcrResult getOcrResult(String parserGroupName)

Returns the OCR result structure for the parser group named parserGroupName.

Scanning templated documents with BlinkOCR recognizer

This section discusses setting up the BlinkOCR recognizer for scanning templated documents. Please check the demo app for examples.

A templated document is any document which is defined by its template. The template contains the information about how the document should be detected, i.e. found on the camera scene, and about which part of the document contains which useful information.

Defining how document should be detected

Before performing OCR of the document, BlinkOCR first needs to find its location on the camera scene. To perform detection, you need to define DetectorSettings, which will be used to instantiate a detector that performs document detection. You can set the detector settings with the method setDetectorSettings(DetectorSettings). If you do not set detector settings, the BlinkOCR recognizer will work in segment scan mode.

You can find more information about the detectors that can be used in the section Detection settings and results.

Defining how document should be recognized

After the document has been detected, it will be recognized. This is done in the following way (see the sketch after this list):

  1. the detector produces a DetectorResult which contains one or more detection locations.
  2. based on the array of DecodingInfos that were defined as part of the concrete DetectorSettings (see the setDecodingInfos(DecodingInfo[]) method of QuadDetectorSettings), the following is performed for each element of the array:
    • the location defined in the DecodingInfo is dewarped to an image of the height defined within the DecodingInfo
    • a parser group with the same name as the current DecodingInfo is searched for, and if it is found, the optimal OCR settings for all parsers from that parser group are calculated
    • using the optimal OCR settings, OCR of the dewarped image is performed
    • finally, the OCR result is parsed with each parser from that parser group
    • if a parser group with the same name as the current DecodingInfo cannot be found, no OCR will be performed; however, the image will be reported via MetadataListener if receiving of DEWARPED images has been enabled
  3. if no DocumentClassifier has been given with setDocumentClassifier(DocumentClassifier), recognition is done. If a DocumentClassifier exists, its method classify(BlinkOCRRecognitionResult) is called to determine which type of document has been detected
  4. if the classifier returned a string equal to one previously used to set up parser decoding infos, that array of DecodingInfos is obtained and step 2 is performed again with it
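
Here is a minimal sketch of the settings side of this flow. It assumes that setupDetector() returns detector settings whose DecodingInfos include one named "firstName", and that addParser has an overload taking the parser group name first; verify both against the javadoc and the demo app:

private RecognizerSettings[] setupTemplatedSettings() {
	BlinkOCRRecognizerSettings sett = new BlinkOCRRecognizerSettings();

	// detector that locates the document; its DecodingInfos define which
	// parts of the document get dewarped (setupDetector() is assumed to
	// return settings whose DecodingInfos include one named "firstName")
	sett.setDetectorSettings(setupDetector());

	// the parser group name must equal the DecodingInfo name so that the
	// OCR result of that dewarped location is parsed by this group
	// (the group-name overload of addParser is an assumption)
	sett.addParser("firstName", "myNameParser", new RawParserSettings());

	return new RecognizerSettings[] { sett };
}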

When to use DocumentClassifier?

If you plan to scan several different documents of the same size, for example different ID cards which are all of 85x54 mm (credit card) size, then you need to use a DocumentClassifier to classify the type of the document, so that the correct DecodingInfo array can be used for obtaining the relevant information. An example would be the case where you need to scan the front sides of both Croatian and German ID cards: the locations of the first and last names are not the same on the two documents. Therefore, you first need to classify the document based on some discriminative features.

If you plan to support only a single document type, you do not need to use a DocumentClassifier.

How to implement DocumentClassifier?

DocumentClassifier is an interface that should be implemented to support classification of documents that cannot be differentiated by the detector. The classification result is used to determine which set of decoding infos will be used to extract classification-specific data. This interface extends the Parcelable interface, so parcelization has to be implemented as well. Besides that, the following method has to be implemented:

String classify(BlinkOCRRecognitionResult result)

Based on the BlinkOCRRecognitionResult, which contains the data extracted from the decoding infos inherent to the detector, classifies the document. For each document type that you want to support, the returned result string has to be equal to the name of the corresponding set of DecodingInfo objects defined for that document type. Named decoding info sets should be defined using the setParserDecodingInfos(DecodingInfo[], String) method.
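
A minimal sketch of such a classifier is given below. It assumes two decoding info sets were registered via setParserDecodingInfos under the names "croatianId" and "germanId", and that a hypothetical parser named "documentTypeParser" was configured on the decoding infos inherent to the detector:

public class MyDocumentClassifier implements DocumentClassifier {

	@Override
	public String classify(BlinkOCRRecognitionResult extractionResult) {
		// "documentTypeParser" is a hypothetical parser configured on the
		// decoding infos inherent to the detector
		String documentType = extractionResult.getParsedResult("documentTypeParser");
		if (documentType != null && documentType.contains("REPUBLIKA HRVATSKA")) {
			// must match a name given to setParserDecodingInfos(DecodingInfo[], String)
			return "croatianId";
		} else {
			return "germanId";
		}
	}

	// DocumentClassifier extends Parcelable, so parcelization is required
	@Override
	public int describeContents() {
		return 0;
	}

	@Override
	public void writeToParcel(Parcel dest, int flags) {
	}

	public static final Creator<MyDocumentClassifier> CREATOR = new Creator<MyDocumentClassifier>() {
		@Override
		public MyDocumentClassifier createFromParcel(Parcel source) {
			return new MyDocumentClassifier();
		}

		@Override
		public MyDocumentClassifier[] newArray(int size) {
			return new MyDocumentClassifier[size];
		}
	};
}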

How to obtain recognition results?

Just like when using the BlinkOCR recognizer in segment scan mode, the same principles apply here: use the approach discussed in Obtaining results from BlinkOCR recognizer. Just keep in mind to use parser group names that are equal to the decoding info names. Check the demo app delivered with the SDK for a detailed example.

Scanning PDF417 barcodes

This section discusses the settings for setting up the PDF417 recognizer and explains how to obtain results from it.

Setting up PDF417 recognizer

To activate the PDF417 recognizer, you need to create Pdf417RecognizerSettings and add it to the RecognizerSettings array. You can do this using the following code snippet:

private RecognizerSettings[] setupSettingsArray() {
	Pdf417RecognizerSettings sett = new Pdf417RecognizerSettings();
	// disable scanning of white barcodes on black background
	sett.setInverseScanning(false);
	// allow scanning of barcodes that have invalid checksum
	sett.setUncertainScanning(true);
	// disable scanning of barcodes that do not have quiet zone
	// as defined by the standard
	sett.setNullQuietZoneAllowed(false);

	// now add sett to recognizer settings array that is used to configure
	// recognition
	return new RecognizerSettings[] { sett };
}

As can be seen from the example, you can tweak PDF417 recognition parameters with methods of Pdf417RecognizerSettings.

setUncertainScanning(boolean)

By setting this to true, you will enable scanning of non-standard elements, but there is no guarantee that all data will be read. This option is used when multiple rows are missing (e.g. the whole barcode is not printed). Default is false.

setNullQuietZoneAllowed(boolean)

By setting this to true, you will allow scanning of barcodes which don't have a quiet zone surrounding them (e.g. text concatenated with the barcode). This option can significantly increase recognition time. Default is false.

setInverseScanning(boolean)

By setting this to true, you will enable scanning of barcodes with inverse intensity values (i.e. white barcodes on dark background). This option can significantly increase recognition time. Default is false.

Obtaining results from PDF417 recognizer

The PDF417 recognizer produces a Pdf417ScanResult. You can use the instanceof operator to check if an element in the results array is an instance of the Pdf417ScanResult class. See the following snippet for an example:

@Override
public void onScanningDone(RecognitionResults results) {
	BaseRecognitionResult[] dataArray = results.getRecognitionResults();
	for (BaseRecognitionResult baseResult : dataArray) {
		if (baseResult instanceof Pdf417ScanResult) {
			Pdf417ScanResult result = (Pdf417ScanResult) baseResult;

			// getStringData getter will return the string version of barcode contents
			String barcodeData = result.getStringData();
			// isUncertain getter will tell you if scanned barcode is uncertain
			boolean uncertainData = result.isUncertain();
			// getRawData getter will return the raw data information object of barcode contents
			BarcodeDetailedData rawData = result.getRawData();
			// BarcodeDetailedData contains information about barcode's binary layout, if you
			// are only interested in raw bytes, you can obtain them with getAllData getter
			byte[] rawDataBuffer = rawData.getAllData();
		}
	}
}

As you can see from the example, obtaining data is rather simple. You just need to call several methods of the Pdf417ScanResult object:

String getStringData()

This method will return the string representation of the barcode contents. Note that a PDF417 barcode can contain binary data, so sometimes it makes little sense to obtain only the string representation of the barcode data.

boolean isUncertain()

This method will return a boolean indicating whether the scanned barcode is uncertain. This can return true only if scanning of uncertain barcodes is allowed, as explained earlier.

BarcodeDetailedData getRawData()

This method will return an object that contains information about the barcode's binary layout. You can find information about that object in the javadoc. However, if you only need to access the byte array containing the raw barcode data, you can call the getAllData method of the BarcodeDetailedData object.

Quadrilateral getPositionOnImage()

Returns the position of the barcode on the image. Note that the returned coordinates are in the image's coordinate system, which is not related to the view coordinate system used for the UI.

Scanning one dimensional barcodes with BlinkOCR's implementation

This section discusses the settings for setting up the 1D barcode recognizer that uses BlinkOCR's implementation of scanning algorithms and explains how to obtain results from that recognizer. Henceforth, the 1D barcode recognizer that uses BlinkOCR's implementation of scanning algorithms will be referred to as the "Bardecoder recognizer".

Setting up Bardecoder recognizer

To activate the Bardecoder recognizer, you need to create BarDecoderRecognizerSettings and add it to the RecognizerSettings array. You can do this using the following code snippet:

private RecognizerSettings[] setupSettingsArray() {
	BarDecoderRecognizerSettings sett = new BarDecoderRecognizerSettings();
	// activate scanning of Code39 barcodes
	sett.setScanCode39(true);
	// activate scanning of Code128 barcodes
	sett.setScanCode128(true);
	// disable scanning of white barcodes on black background
	sett.setInverseScanning(false);
	// disable slower algorithm for low resolution barcodes
	sett.setTryHarder(false);

	// now add sett to recognizer settings array that is used to configure
	// recognition
	return new RecognizerSettings[] { sett };
}

As can be seen from the example, you can tweak Bardecoder recognition parameters with methods of BarDecoderRecognizerSettings.

setScanCode128(boolean)

Method activates or deactivates the scanning of Code128 1D barcodes. Default (initial) value is false.

setScanCode39(boolean)

Method activates or deactivates the scanning of Code39 1D barcodes. Default (initial) value is false.

setInverseScanning(boolean)

By setting this to true, you will enable scanning of barcodes with inverse intensity values (i.e. white barcodes on dark background). This option can significantly increase recognition time. Default is false.

setTryHarder(boolean)

By setting this to true, you will enable scanning of lower resolution barcodes, at the cost of additional processing time. This option can significantly increase recognition time. Default is false.

Obtaining results from Bardecoder recognizer

The Bardecoder recognizer produces a BarDecoderScanResult. You can use the instanceof operator to check if an element in the results array is an instance of the BarDecoderScanResult class. See the following snippet for an example:

@Override
public void onScanningDone(RecognitionResults results) {
	BaseRecognitionResult[] dataArray = results.getRecognitionResults();
	for (BaseRecognitionResult baseResult : dataArray) {
		if (baseResult instanceof BarDecoderScanResult) {
			BarDecoderScanResult result = (BarDecoderScanResult) baseResult;

			// getBarcodeType getter will return a BarcodeType enum that will define
			// the type of the barcode scanned
			BarcodeType barType = result.getBarcodeType();
			// getStringData getter will return the string version of barcode contents
			String barcodeData = result.getStringData();
			// getRawData getter will return the raw data information object of barcode contents
			BarcodeDetailedData rawData = result.getRawData();
			// BarcodeDetailedData contains information about barcode's binary layout, if you
			// are only interested in raw bytes, you can obtain them with getAllData getter
			byte[] rawDataBuffer = rawData.getAllData();
		}
	}
}

As you can see from the example, obtaining data is rather simple. You just need to call several methods of the BarDecoderScanResult object:

String getStringData()

This method will return the string representation of barcode contents.

BarcodeDetailedData getRawData()

This method will return an object that contains information about the barcode's binary layout. You can find information about that object in the javadoc. However, if you only need to access the byte array containing the raw barcode data, you can call the getAllData method of the BarcodeDetailedData object.

String getExtendedStringData()

This method will return the string representation of the extended barcode contents. This is available only if a barcode that supports extended encoding mode (e.g. Code 39) was scanned.

BarcodeDetailedData getExtendedRawData()

This method will return an object that contains information about the barcode's binary layout when decoded in extended mode. You can find information about that object in the javadoc. However, if you only need to access the byte array containing the raw barcode data, you can call the getAllData method of the BarcodeDetailedData object. This is available only if a barcode that supports extended encoding mode (e.g. Code 39) was scanned.
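
Continuing from the result object in the previous snippet, the two extended getters could be used like this (a sketch; the assumption that they return null when extended mode is unavailable should be verified in the javadoc):

// sketch: extended content is available only for barcodes that support
// extended encoding mode (e.g. Code 39); null-return behaviour is assumed
String extendedData = result.getExtendedStringData();
BarcodeDetailedData extendedRaw = result.getExtendedRawData();
if (extendedData != null) {
	// use the extended string representation of the barcode contents
}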

getBarcodeType()

This method will return a BarcodeType enum that defines the type of barcode scanned.

Scanning barcodes with ZXing implementation

This section discusses the settings for setting up the barcode recognizer that uses ZXing's implementation of scanning algorithms and explains how to obtain results from it. BlinkOCR uses ZXing's C++ port to support barcodes for which we do not yet have our own scanning algorithms. Also, since ZXing's C++ port is no longer maintained, we provide updates and bugfixes to it inside our codebase.

Setting up ZXing recognizer

To activate the ZXing recognizer, you need to create ZXingRecognizerSettings and add it to the RecognizerSettings array. You can do this using the following code snippet:

private RecognizerSettings[] setupSettingsArray() {
	ZXingRecognizerSettings sett = new ZXingRecognizerSettings();
	// disable scanning of white barcodes on black background
	sett.setInverseScanning(false);
	// activate scanning of QR codes
	sett.setScanQRCode(true);

	// now add sett to recognizer settings array that is used to configure
	// recognition
	return new RecognizerSettings[] { sett };
}

As can be seen from the example, you can tweak ZXing recognition parameters with methods of ZXingRecognizerSettings. Note that some barcodes, such as Code 39, are also available for scanning with BlinkOCR's implementation. You can choose to use only one implementation or both (just put both settings objects into the RecognizerSettings array). Using both implementations increases the chance of correct barcode recognition, but requires more processing time. Of course, we recommend using BlinkOCR's implementation for the barcodes it supports.

setScanAztecCode(boolean)

Method activates or deactivates the scanning of Aztec 2D barcodes. Default (initial) value is false.

setScanCode128(boolean)

Method activates or deactivates the scanning of Code128 1D barcodes. Default (initial) value is false.

setScanCode39(boolean)

Method activates or deactivates the scanning of Code39 1D barcodes. Default (initial) value is false.

setScanDataMatrixCode(boolean)

Method activates or deactivates the scanning of Data Matrix 2D barcodes. Default (initial) value is false.

setScanEAN13Code(boolean)

Method activates or deactivates the scanning of EAN 13 1D barcodes. Default (initial) value is false.

setScanEAN8Code(boolean)

Method activates or deactivates the scanning of EAN 8 1D barcodes. Default (initial) value is false.

setScanITFCode(boolean)

Method activates or deactivates the scanning of ITF 1D barcodes. Default (initial) value is false.

setScanQRCode(boolean)

Method activates or deactivates the scanning of QR 2D barcodes. Default (initial) value is false.

setScanUPCACode(boolean)

Method activates or deactivates the scanning of UPC A 1D barcodes. Default (initial) value is false.

setScanUPCECode(boolean)

Method activates or deactivates the scanning of UPC E 1D barcodes. Default (initial) value is false.

setInverseScanning(boolean)

By setting this to true, you will enable scanning of barcodes with inverse intensity values (i.e. white barcodes on dark background). This option can significantly increase recognition time. Default is false.

setSlowThoroughScan(boolean)

Use this method to enable slower, but more thorough scan procedure when scanning barcodes. By default, this option is turned on.

Obtaining results from ZXing recognizer

The ZXing recognizer produces a ZXingScanResult. You can use the instanceof operator to check if an element in the results array is an instance of the ZXingScanResult class. See the following snippet for an example:

@Override
public void onScanningDone(RecognitionResults results) {
	BaseRecognitionResult[] dataArray = results.getRecognitionResults();
	for (BaseRecognitionResult baseResult : dataArray) {
		if (baseResult instanceof ZXingScanResult) {
			ZXingScanResult result = (ZXingScanResult) baseResult;

			// getBarcodeType getter will return a BarcodeType enum that will define
			// the type of the barcode scanned
			BarcodeType barType = result.getBarcodeType();
			// getStringData getter will return the string version of barcode contents
			String barcodeData = result.getStringData();
		}
	}
}

As you can see from the example, obtaining data is rather simple. You just need to call several methods of the ZXingScanResult object:

String getStringData()

This method will return the string representation of barcode contents.

getBarcodeType()

This method will return a BarcodeType enum that defines the type of barcode scanned.

Performing detection of various documents

This section will discuss how to set up a special kind of recognizer called the Detector Recognizer, whose only purpose is to perform detection of a document and return the position of the detected document on the image or video frame.

Setting up Detector Recognizer

To activate the Detector Recognizer, you need to create DetectorRecognizerSettings and add it to the RecognizerSettings array. When creating DetectorRecognizerSettings, you need to initialize it with already prepared DetectorSettings. Check this chapter for more information about available detectors and how to configure them.

You can use the following code snippet to create DetectorRecognizerSettings and add it to RecognizerSettings array:

private RecognizerSettings[] setupSettingsArray() {
	DetectorRecognizerSettings sett = new DetectorRecognizerSettings(setupDetector());
	
	// now add sett to recognizer settings array that is used to configure
	// recognition
	return new RecognizerSettings[] { sett };
}

Please note that the snippet above assumes the existence of a method setupDetector() which returns fully configured DetectorSettings, as explained in the chapter Detection settings and results.

Obtaining results from Detector Recognizer

The Detector Recognizer produces a DetectorRecognitionResult. You can use the instanceof operator to check if an element in the results array is an instance of the DetectorRecognitionResult class. See the following snippet for an example:

@Override
public void onScanningDone(RecognitionResults results) {
	BaseRecognitionResult[] dataArray = results.getRecognitionResults();
	for (BaseRecognitionResult baseResult : dataArray) {
		if (baseResult instanceof DetectorRecognitionResult) {
			DetectorRecognitionResult result = (DetectorRecognitionResult) baseResult;

			// you can use getters of DetectorRecognitionResult class to
			// obtain the detection result
			if (result.isValid() && !result.isEmpty()) {
				DetectorResult detection = result.getDetectorResult();
				// the type of DetectorResult depends on the type of the detector
				// configured when setting up the DetectorRecognizer
			} else {
				// not all relevant data was scanned, ask the user
				// to try again
			}
		}
	}
}

Available getters are:

boolean isValid()

Returns true if the detection result is valid, i.e. if all required elements were detected with good confidence and can be used. If false is returned, some crucial data is missing; you should ask the user to try scanning again. If you keep getting false (i.e. invalid data) for a certain document, please report that as a bug to help.microblink.com and include high resolution photographs of the problematic document.

boolean isEmpty()

Returns true if the scan result is empty, i.e. nothing was scanned. All getters should return null for an empty result.

DetectorResult getDetectorResult()

Returns the DetectorResult generated by the detector that was used to configure the Detector Recognizer.

Detection settings and results

This chapter will discuss the various detection settings used to configure the different detectors that some recognizers can use to perform object detection prior to recognizing the detected object's contents.

Each detector has its own version of DetectorSettings, which derives from the DetectorSettings class. Likewise, each detector produces its own version of DetectorResult, which derives from the DetectorResult class. Appropriate recognizers, such as the Detector Recognizer, require DetectorSettings for their initialization and provide a DetectorResult in their recognition result.

The abstract DetectorSettings class contains the following setter, which is inherited by all derived settings objects:

setDisplayDetectedLocation(boolean)

Defines whether the detection location will be delivered as detection metadata to the MetadataListener. For this to work, you need to set a MetadataListener on the RecognizerView and allow receiving of detection metadata in MetadataSettings.
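
As a sketch of the wiring this requires (the MetadataSettings setter name and the exact way the settings are handed to RecognizerView are assumptions; check the javadoc and demo apps):

// sketch: enable delivery of detection locations to our MetadataListener;
// setDetectionMetadataAllowed is an assumed setter name - check the javadoc
MetadataSettings metadataSettings = new MetadataSettings();
metadataSettings.setDetectionMetadataAllowed(true);
mRecognizerView.setMetadataListener(myMetadataListener, metadataSettings);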

The abstract DetectorResult class contains the following getter, which is inherited by all derived result objects:

DetectionCode getDetectionCode()

Returns the DetectionCode which indicates the status of the detection (failed, fallback or success).

Detection of documents with Machine Readable Zone

This section discusses how to use the MRTD detector to perform detection of machine readable zones used in various Machine Readable Travel Documents (MRTDs - ID cards and passports). This detector is used internally by the Machine Readable Travel Documents recognizer to detect the Machine Readable Zone (MRZ) prior to performing OCR and data extraction.

Setting up MRTD detector

To use the MRTD detector, you need to create MRTDDetectorSettings and give it to the appropriate recognizer. You can use the following snippet to do that:

private DetectorSettings setupDetector() {
	MRTDDetectorSettings settings = new MRTDDetectorSettings();

	// with following setter you can control whether you want to detect
	// machine readable zone only or full travel document
	settings.setDetectFullDocument(false);
	
	return settings;
}

As you can see from the snippet, MRTDDetectorSettings can be tweaked with the following method:

setDetectFullDocument(boolean)

This method allows you to enable detection of full Machine Readable Travel Documents. The position of the document is calculated from the location of the detected Machine Readable Zone. If this is set to false (default), only the location of the Machine Readable Zone will be returned.

Obtaining MRTD detection result

The MRTD detector produces an MRTDDetectorResult. You can use the instanceof operator to check if the obtained DetectorResult is an instance of the MRTDDetectorResult class. See the following snippet for an example:

public void handleDetectorResult(DetectorResult detResult) {
	if (detResult instanceof MRTDDetectorResult) {
		MRTDDetectorResult result = (MRTDDetectorResult) detResult;
		Quadrilateral pos = result.getDetectionLocation();
	}
}

The available getters of MRTDDetectorResult are as follows:

Quadrilateral getDetectionLocation()

Returns the Quadrilateral containing the position of the detection. If the position is empty, all four Quadrilateral points will have coordinates (0,0).

int[] getElementsCountPerLine()

Returns an array of integers defining the number of char-like elements in each line of the detected machine readable zone.

MRTDDetectionCode getMRTDDetectionCode()

Returns the MRTDDetectionCode enum defining the type of the detection, or null if nothing was detected.

Detection of documents with Document Detector

This section discusses how to use the Document detector to perform detection of documents with certain aspect ratios. This detector can be used to detect cards, cheques, A4-sized documents, receipts and much more.

Setting up Document Detector

To use the Document Detector, you need to create DocumentDetectorSettings. When creating DocumentDetectorSettings, you need to specify at least one DocumentSpecification, which defines how a specific document should be detected. A DocumentSpecification can be created directly or from a preset (recommended). Please refer to the javadoc for more information about document specifications.

In the following snippet, we show how to set up DocumentDetectorSettings to perform detection of credit cards:

private DetectorSettings setupDetector() {
	DocumentSpecification cardDoc = DocumentSpecification.createFromPreset(DocumentSpecificationPreset.DOCUMENT_SPECIFICATION_PRESET_ID1_CARD);

	DocumentDetectorSettings settings = new DocumentDetectorSettings(new DocumentSpecification[] {cardDoc});

	// require at least 3 subsequent close detections (in 3 subsequent
	// video frames) to treat the detection as 'stable'
	settings.setNumStableDetectionsThreshold(3);
	
	return settings;
}

As you can see from the snippet, DocumentDetectorSettings can be tweaked with the following methods:

setNumStableDetectionsThreshold(int)

Sets the number of subsequent close detections that must occur before a document detection is treated as stable. Default is 1. A larger number guarantees more robust document detection, at the price of slower performance.

setDocumentSpecifications(DocumentSpecification[])

Sets the array of document specifications that define the documents that can be detected. See the javadoc for DocumentSpecification for more information about document specifications.

Obtaining document detection result

The Document detector produces a DocumentDetectorResult. You can use the instanceof operator to check if the obtained DetectorResult is an instance of the DocumentDetectorResult class. See the following snippet for an example:

public void handleDetectorResult(DetectorResult detResult) {
	if (detResult instanceof DocumentDetectorResult) {
		DocumentDetectorResult result = (DocumentDetectorResult) detResult;
		Quadrilateral pos = result.getDetectionLocation();
	}
}

The available getters of DocumentDetectorResult are as follows:

Quadrilateral getDetectionLocation()

Returns the Quadrilateral containing the position of the detection. If the position is empty, all four Quadrilateral points will have coordinates (0,0).

double getAspectRatio()

Returns the aspect ratio of the detected document. This will be equal to the aspect ratio of one of the DocumentSpecification objects given to DocumentDetectorSettings.

ScreenOrientation getScreenOrientation()

Returns the orientation of the screen that was active at the moment the document was detected.

Detection of faces with Face Detector

This section discusses how to use the Face detector to perform detection of faces on various documents.

Setting up Face Detector

To use the Face Detector, you need to create FaceDetectorSettings and give it to the appropriate recognizer. You can use the following snippet to do that:

private DetectorSettings setupDetector() {
	// following constructor initializes FaceDetector settings
	// and requests height of dewarped image to be 300 pixels
	FaceDetectorSettings settings = new FaceDetectorSettings(300);
	return settings;
}

FaceDetectorSettings can be tweaked with the following methods:

setDecodingInfo(DecodingInfo)

This method allows you to control how the detection will be dewarped. DecodingInfo contains a Rectangle, which defines the interesting part of the detected location, expressed as a rectangle relative to the detected rectangle, together with the height to which the detection will be dewarped. For more info, check out DecodingInfo.

setDecodingInfo(int)

This method allows you to control how the detection will be dewarped (same as creating a DecodingInfo containing a Rectangle initialized with (0.f, 0.f, 1.f, 1.f) and the given dewarp height).
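
For illustration, the two overloads could be used like this (a sketch; the DecodingInfo and Rectangle constructors are assumptions, so verify them in the javadoc):

// sketch: dewarp only the central part of the detection to a height of
// 200 pixels (the DecodingInfo and Rectangle constructors are assumptions)
settings.setDecodingInfo(new DecodingInfo(new Rectangle(0.25f, 0.25f, 0.5f, 0.5f), 200));
// or simply request the dewarp height for the whole detected area
settings.setDecodingInfo(200);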

Obtaining face detection result

The Face Detector produces a FaceDetectorResult. You can use the instanceof operator to check if the obtained DetectorResult is an instance of the FaceDetectorResult class. See the following snippet for an example:

public void handleDetectorResult(DetectorResult detResult) {
	if (detResult instanceof FaceDetectorResult) {
		FaceDetectorResult result = (FaceDetectorResult) detResult;
		Quadrilateral[] locations = result.getDetectionLocations();
	}
}

The available getters of FaceDetectorResult are as follows:

Quadrilateral[] getDetectionLocations()

Returns the locations of detections in the coordinate system of the image on which detection was performed, or null if detection was not successful.

Quadrilateral[] getTransformedDetectionLocations()

Returns the locations of detections in the normalized coordinate system of the visible camera frame, or null if detection was not successful.

Combining detectors with MultiDetector

This section discusses how to use the Multi detector to combine multiple different detectors.

Setting up Multi Detector

To use the Multi Detector, you need to create MultiDetectorSettings. When creating MultiDetectorSettings, you need to specify at least one other DetectorSettings object that will be wrapped by the Multi Detector. In the following snippet, we demonstrate how to create a Multi detector that wraps both the MRTD detector and the Document detector, and is thus able to detect either a Machine Readable Zone or a card document:

private DetectorSettings setupDetector() {
	DocumentSpecification cardDoc = DocumentSpecification.createFromPreset(DocumentSpecificationPreset.DOCUMENT_SPECIFICATION_PRESET_ID1_CARD);
	DocumentDetectorSettings dds = new DocumentDetectorSettings(new DocumentSpecification[] {cardDoc});

	MRTDDetectorSettings mrtds = new MRTDDetectorSettings(100);

	MultiDetectorSettings mds = new MultiDetectorSettings(new DetectorSettings[] {dds, mrtds});
	
	return mds;
}

Obtaining results from Multi Detector

The Multi detector produces a MultiDetectorResult. You can use the instanceof operator to check if the obtained DetectorResult is an instance of the MultiDetectorResult class. See the following snippet for an example:

public void handleDetectorResult(DetectorResult detResult) {
	if (detResult instanceof MultiDetectorResult) {
		MultiDetectorResult result = (MultiDetectorResult) detResult;
		DetectorResult[] results = result.getDetectionResults();
	}
}

As you can see from the snippet, MultiDetectorResult contains one getter:

getDetectionResults()

Returns the array of detection results contained within. You can iterate over the array to inspect each detection result's contents.

Processor architecture considerations

BlinkOCR is distributed with ARMv7, ARM64, x86 and x86_64 native library binaries.

The ARMv7 architecture gives the ability to take advantage of hardware accelerated floating point operations and SIMD processing with NEON. This gives BlinkOCR a huge performance boost on devices with ARMv7 processors. Most newer devices (all since 2012) have an ARMv7 processor, so it makes little sense not to take advantage of the performance boost those processors can give. Note, however, that some devices with ARMv7 processors do not support the NEON instruction set; the most popular of these, devices based on NVIDIA Tegra 2, fall into this category. Since these devices are old by today's standards, BlinkOCR does not support them.

ARM64 is a newer processor architecture that some high end devices use. ARM64 processors are very powerful and can also take advantage of the new NEON64 SIMD instruction set to quickly process multiple pixels with a single instruction.

The x86 architecture gives the ability to obtain native speed on x86 Android devices, like the Prestigio 5430. Without it, BlinkOCR would either not work on such devices or would run on top of the ARM emulator that is shipped with the device - this would incur a huge performance penalty.

The x86_64 architecture gives better performance than x86 on devices that use a 64-bit Intel Atom processor.

However, there are some issues to be considered:

  • the ARMv7 build of the native library cannot run on devices that do not have an ARMv7-compatible processor (a list of those old devices can be found here)
  • ARMv7 processors do not understand the x86 instruction set
  • x86 processors understand neither the ARM64 nor the ARMv7 instruction set
  • however, some x86 Android devices ship with a built-in ARM emulator; such devices can run ARM binaries, but with a performance penalty. There is also a risk that the built-in ARM emulator will not understand some specific ARM instruction and will crash
  • ARM64 processors understand the ARMv7 instruction set, but ARMv7 processors do not understand ARM64 instructions
  • if an ARM64 processor executes ARMv7 code, it does not take advantage of modern NEON64 SIMD operations and does not take advantage of its 64-bit registers - it runs in emulation mode
  • x86_64 processors understand the x86 instruction set, but x86 processors do not understand the x86_64 instruction set
  • if an x86_64 processor executes x86 code, it does not take advantage of 64-bit registers and uses two instructions instead of one for 64-bit operations

The LibRecognizer.aar archive contains ARMv7, ARM64, x86 and x86_64 builds of the native library. By default, when you integrate BlinkOCR into your app, the app will contain native builds for all processor architectures. Thus, BlinkOCR will work on ARMv7, ARM64, x86 and x86_64 devices, and will use ARMv7 features on ARMv7 devices and ARM64 features on ARM64 devices. However, the size of your application will be rather large.

Reducing the final size of your app

If your final app is too large because of BlinkOCR, you can decide to create multiple flavors of your app - one flavor for each architecture. With gradle and Android Studio this is very easy - just add the following code to the build.gradle file of your app:

android {
  ...
  splits {
    abi {
      enable true
      reset()
      include 'x86', 'armeabi-v7a', 'arm64-v8a', 'x86_64'
      universalApk true
    }
  }
}

With these build instructions, gradle will build a separate APK for each processor architecture, plus one universal APK that contains all architectures. In order for Google Play to accept multiple APKs of the same app, you need to ensure that each APK has a different version code. This can easily be done by defining a version code prefix that depends on the architecture and adding the real version code number to it, as in the following gradle script:

// map for the version code
def abiVersionCodes = ['armeabi-v7a':1, 'arm64-v8a':2, 'x86':3, 'x86_64':4]

import com.android.build.OutputFile

android.applicationVariants.all { variant ->
    // assign different version code for each output
    variant.outputs.each { output ->
        def filter = output.getFilter(OutputFile.ABI)
        if(filter != null) {
            output.versionCodeOverride = abiVersionCodes.get(output.getFilter(OutputFile.ABI)) * 1000000 + android.defaultConfig.versionCode
        }
    }
}

For more information about creating APK splits with gradle, check this article from Google.

After generating multiple APKs, you need to upload them to Google Play. For a tutorial and the rules about uploading multiple APKs to Google Play, please read the official Google article about multiple APKs.

Removing processor architecture support in gradle without using APK splits

If you will not be distributing your app via Google Play, or if for some other reason you want a single APK of smaller size, you can completely remove support for a certain CPU architecture from your APK. This is not recommended, as it has the consequences described below.

To remove a certain CPU architecture, add the following statement to the android block inside your build.gradle:

android {
	...
	packagingOptions {
		exclude 'lib/<ABI>/libBlinkOCR.so'
	}
}

where <ABI> represents the CPU architecture you want to remove:

  • to remove ARMv7 support, use exclude 'lib/armeabi-v7a/libBlinkOCR.so'
  • to remove x86 support, use exclude 'lib/x86/libBlinkOCR.so'
  • to remove ARM64 support, use exclude 'lib/arm64-v8a/libBlinkOCR.so'
  • to remove x86_64 support, use exclude 'lib/x86_64/libBlinkOCR.so'

You can also remove multiple processor architectures by specifying the exclude directive multiple times. Just bear in mind that removing a processor architecture will have side effects on the performance and stability of your app. Please read this for more information.

Removing processor architecture support in Eclipse

This section assumes that you have set up and prepared your Eclipse project from LibRecognizer.aar as described in the chapter Eclipse integration instructions.

If you are using Eclipse, removing processor architecture support gets more complicated. Eclipse does not support build flavors, so you will either need to remove support for some processors or create several different library projects from LibRecognizer.aar - one for each specific processor architecture.

Native libraries in the Eclipse library project are located in the libs subfolder:

  • libs/armeabi-v7a contains native libraries for the ARMv7 processor architecture
  • libs/x86 contains native libraries for the x86 processor architecture
  • libs/arm64-v8a contains native libraries for the ARM64 processor architecture
  • libs/x86_64 contains native libraries for the x86_64 processor architecture

To remove support for a processor architecture, simply delete the appropriate folder inside the Eclipse library project:

  • to remove ARMv7 support, delete folder libs/armeabi-v7a
  • to remove x86 support, delete folder libs/x86
  • to remove ARM64 support, delete folder libs/arm64-v8a
  • to remove x86_64 support, delete folder libs/x86_64

Consequences of removing processor architecture

Removing a processor architecture has some consequences:

  • by removing ARMv7 support, BlinkOCR will not work on devices that have ARMv7 processors
  • by removing ARM64 support, BlinkOCR will not use ARM64 features on ARM64 devices
  • by removing x86 support, BlinkOCR will not work on devices that have x86 processors, except when the device has an ARM emulator - in that case, BlinkOCR will work, but will be slow
  • by removing x86_64 support, BlinkOCR will not use 64-bit optimizations on x86_64 processors, but if x86 support is not removed, BlinkOCR should still work

Our recommendation is to include all architectures in your app - it will work on all devices and provide the best user experience. However, if you really need to reduce the size of your app, we recommend releasing a separate version of your app for each processor architecture. The easiest way to do that is with APK splits.

Combining BlinkOCR with other native libraries

If you are combining the BlinkOCR library with other libraries that contain native code in your application, make sure you match the architectures of all native libraries. For example, if a third-party library has only ARMv7 and x86 versions, you must use exactly the ARMv7 and x86 versions of BlinkOCR with that library, but not ARM64. Mismatching the architectures will crash your app in the initialization step, because the JVM will try to load all of its native dependencies in the same preferred architecture and will fail with an UnsatisfiedLinkError.

Troubleshooting

Integration problems

In case of problems with the integration of the SDK, first make sure that you have tried integrating it into Android Studio by following the integration instructions. Although we do provide Eclipse ADT integration instructions, we officially no longer support Eclipse ADT. For any other IDE, unfortunately, you are on your own.

If you have followed the Android Studio integration instructions and are still having integration problems, please contact us at help.microblink.com.

SDK problems

In case of problems with using the SDK, proceed as described in the following subsections:

Licencing problems

If you are getting an "invalid licence key" error or having other licence-related problems (e.g. some feature is not enabled that should be, or there is a watermark on top of the camera), first check the ADB logcat. All licence-related problems are logged to the error log, so it is easy to determine what went wrong.

Once you have determined what the licence-related problem is, or if you simply do not understand the log, contact us at help.microblink.com. When contacting us, please make sure you provide the following information:

  • the exact package name of your app (from your AndroidManifest.xml and/or your build.gradle file)
  • the licence key that is causing the problems
  • please stress that you are reporting a problem related to the Android version of the BlinkOCR SDK
  • if unsure about the problem, also provide an excerpt from the ADB logcat containing the licence error

Other problems

If you are having problems with scanning certain items, undesired behaviour on specific device(s), crashes inside BlinkOCR or anything else not mentioned above, please do as follows:

  • enable logging to see what the library is doing. To enable logging, put this line in your application:

     com.microblink.util.Log.setLogLevel(com.microblink.util.Log.LogLevel.LOG_VERBOSE);

    After this line, the library will display as much information about its work as possible. Please save the entire log of the scanning session to a file that you will send to us. It is important to send the entire log, not just the part where the crash occurred, because crashes are sometimes caused by unexpected behaviour in the early stages of library initialization.

  • contact us at help.microblink.com describing your problem and provide the following information:

    • the log file obtained in the previous step
    • a high resolution scan/photo of the item that you are trying to scan
    • information about the device that you are using - we need the exact model name of the device. You can obtain that information with this app
    • please stress that you are reporting a problem related to the Android version of the BlinkOCR SDK

Additional info

The complete API reference can be found in the Javadoc.

For any other questions, feel free to contact us at help.microblink.com.
