Over the last two years, many companies have had to ask their customers to sign SEPA Direct Debit Mandates. The established procedure is to send out a form pre-filled with the customer’s data (the SEPA Mandate). The customer signs the mandate and sends it back to the company.
One of our customers, an insurance company, uses Kofax Capture and Kofax Transformation Modules (KTM) for mailroom automation. In this context the SEPA Mandates had to be recognized by KTM and the appropriate business process had to be triggered.
Until then, two processes for SEPA Mandates had been established:
- The customer has signed the mandate: the flag ‘SEPA Mandate was granted’ is set. No further action is needed.
- The customer did not sign the mandate: further administrative processing must be started.
Within KTM, the signature is recognized by an advanced zone locator that evaluates the blackness values of its zones.
Over time this concept proved insufficient. On the one hand, customers changed or supplemented the pre-filled customer data with handwritten comments because the data was wrong or incomplete. On the other hand, some customers received blank SEPA Mandates, which they filled in by hand.
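The idea behind a blackness check can be illustrated outside KTM. The following is only a minimal Python sketch, not the way the advanced zone locator is implemented: it assumes a binarized image given as rows of 0 (white) and 1 (black) pixels, and the names `zone_blackness` and `is_signed` as well as the 2 % threshold are hypothetical.

```python
from typing import Sequence, Tuple

Image = Sequence[Sequence[int]]  # rows of pixels, 0 = white, 1 = black


def zone_blackness(image: Image, left: int, top: int, width: int, height: int) -> float:
    """Return the fraction of black pixels inside a rectangular zone."""
    rows = image[top:top + height]
    pixels = [px for row in rows for px in row[left:left + width]]
    if not pixels:
        return 0.0  # zone lies completely outside the image
    return sum(pixels) / len(pixels)


def is_signed(image: Image, zone: Tuple[int, int, int, int], threshold: float = 0.02) -> bool:
    """Treat the signature zone as signed if it is darker than the threshold.

    The threshold is an assumed value and would have to be tuned on real scans.
    """
    left, top, width, height = zone
    return zone_blackness(image, left, top, width, height) > threshold
```

An empty signature field stays close to 0, while pen strokes raise the value. The threshold has to sit above the noise level of the scanner.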
Thus the insurance company needed a third process for SEPA Mandates:
- The customer has signed the mandate, but handwritten notes were added within the ‘customer data’ region of the mandate: a process to change or add customer data must be started.
The challenge for KTM was to recognize whether there are handwritten notes in defined regions of the SEPA Mandate.
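The resulting decision between the three processes can be summarized in a few lines. This is a hedged Python sketch of the routing only: the names `MandateProcess` and `route_mandate` are made up for illustration, and sending an unsigned mandate with handwritten notes to administrative processing is an assumption, since that combination is not described here.

```python
from enum import Enum


class MandateProcess(Enum):
    GRANTED = "set flag 'SEPA Mandate was granted'"
    ADMIN_PROCESSING = "start further administrative processing"
    DATA_CHANGE = "start process to change or add customer data"


def route_mandate(signed: bool, handwritten_notes: bool) -> MandateProcess:
    """Pick the business process for a returned SEPA Mandate."""
    if not signed:
        # Assumption: an unsigned mandate always goes to administrative
        # processing, whether or not it carries handwritten notes.
        return MandateProcess.ADMIN_PROCESSING
    if handwritten_notes:
        return MandateProcess.DATA_CHANGE
    return MandateProcess.GRANTED
```

The two inputs correspond to the signature check and the handwriting check, so each can be developed and tested separately.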
This is an example of a filled-in mandate that was signed by the customer:
And now an example with handwritten changes by the customer:
To identify the handwritten notes you can use the OCR engine ‘Mixed Print’. This engine is designed to read both typescript and handwriting on a document. However, we are not interested in the content of the handwritten notes; we just want to know whether there are handwritten notes at all. The ‘Mixed Print’ engine won’t give good results for the content of the notes anyway, because in these cases typescript and handwriting often overlap.
But the ‘Mixed Print’ engine does provide the information whether there was handwriting at all. Candidates for handwritten notes are marked with so-called ‘boxes’. You can view these ‘boxes’ with the XDOC browser, which comes with the KTM installation. First, run the ‘Mixed Print’ engine on the mandate document (you can do that in the KTM Project Builder). Then start the XDOC browser and open the xdc file of the mandate document:
‘Representation 0’ (the ‘Mixed Print’ engine) has three ‘boxes’. Each box stands for a region with candidates for handwriting. These ‘boxes’ can be retrieved by KTM scripting. Once you have selected the region of the mandate in which to look for ‘boxes’, you have everything you need to judge whether somebody scribbled on your form.
To define the ‘search region’ you could use the words ‘one-off payment’ (upper right corner, defines the upper bound of the search region) and ‘By signing this mandate form’ (text underneath the customer data, defines the lower bound of the search region). To find these words you could use format locators or search directly within the OCR result of the document. The following scripting example looks directly into the OCR result. The function Is_handwritten returns True if at least one ‘box’ is found within the search region.
The example script needs a reference to ‘Kofax memphis Forms 4.0’. So please add this reference in your KTM script:
The underlying KTM project uses RecoStar or FineReader for OCR by default. To check whether somebody scribbled on the mandate you may use the following function:
    Function Is_handwritten(pXDoc As CASCADELib.CscXDocument) As Boolean
        'Checks whether something handwritten is in a region of the page

        Dim i As Integer
        Dim BoxAnzahl As Integer 'number of boxes found in the search region
        Dim StartTOP As Long     'upper bound of the search region
        Dim EndeTOP As Long      'lower bound of the search region

        BoxAnzahl=0
        StartTOP=0
        EndeTOP=0
        Is_handwritten=False

        'Search 'one-off payment' and add 80 to TOP. Only look south.
        For i=0 To pXDoc.TextLines.Count-1
            If InStr(LCase(pXDoc.TextLines(i).Text),"one-off payment")>0 Then
                StartTOP=pXDoc.TextLines(i).Top
                StartTOP=StartTOP+80 '~ line height
                Exit For
            End If
        Next

        'Search 'By signing this mandate form'. Only look north of this.
        'The search string must be lower case, because the line text is lowercased by LCase.
        For i=0 To pXDoc.TextLines.Count-1
            If InStr(LCase(pXDoc.TextLines(i).Text),"by signing this mandate form")>0 Then
                EndeTOP=pXDoc.TextLines(i).Top
                Exit For
            End If
        Next

        'Re-OCR with engine 'Mixed Print'
        FullPageRecognition_1(pXDoc, "", "Mixed Print")

        'only count boxes south of StartTOP
        'only count boxes north of EndeTOP
        'Box.width>200 to avoid 'dirt'
        'Box.left>275 to leave out the left border (holes, barcodes)

        For i=0 To pXDoc.Boxes.Count-1
            If pXDoc.Boxes.ItemByIndex(i).Top>StartTOP And pXDoc.Boxes.ItemByIndex(i).Width>200 And pXDoc.Boxes.ItemByIndex(i).Left>275 And pXDoc.Boxes.ItemByIndex(i).Top<EndeTOP Then
                BoxAnzahl=BoxAnzahl+1
            End If
        Next

        'OCR back to RecoStar or FineReader, for standard processing
        FullPageRecognition_1(pXDoc, "", "RecoStar")

        If BoxAnzahl>0 Then 'at least one box: there was some handwriting!
            Is_handwritten=True
        Else
            Is_handwritten=False
        End If
    End Function
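The region test in Is_handwritten can also be tried outside KTM, for example on coordinates copied from the XDOC browser. The following Python sketch models only the geometry and the text search on plain data. The `TextLine` and `Box` tuples and the helper `find_top` are hypothetical stand-ins for the XDoc objects, not part of any Kofax API. Both sides of the text comparison are lowercased, because a mixed-case search string can never match lowercased line text.

```python
from typing import NamedTuple, Optional, Sequence


class TextLine(NamedTuple):
    text: str
    top: int


class Box(NamedTuple):
    left: int
    top: int
    width: int


def find_top(lines: Sequence[TextLine], needle: str) -> Optional[int]:
    """Top coordinate of the first line containing needle (case-insensitive)."""
    needle = needle.lower()
    for line in lines:
        if needle in line.text.lower():
            return line.top
    return None


def is_handwritten(lines: Sequence[TextLine], boxes: Sequence[Box],
                   line_height: int = 80, min_width: int = 200,
                   min_left: int = 275) -> bool:
    """True if at least one box lies inside the search region."""
    start = find_top(lines, "one-off payment")
    end = find_top(lines, "by signing this mandate form")
    # As in the script, a missing anchor leaves the bound at 0.
    start_top = start + line_height if start is not None else 0
    end_top = end if end is not None else 0
    return any(
        start_top < box.top < end_top    # between the two anchor lines
        and box.width > min_width        # ignore small specks ('dirt')
        and box.left > min_left          # skip left border (holes, barcodes)
        for box in boxes
    )
```

Note that if the lower anchor line is not found, the lower bound stays at 0 and no box can qualify, so a failed text search looks the same as a clean form. Depending on the business process it may be safer to route such documents to manual review.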
And finally the called procedure FullPageRecognition_1, which performs the re-OCR:
    Public Sub FullPageRecognition_1(ByVal pXDoc As CscXDocument, ByVal ImageCleanProfile As String, ByVal OCRProfile As String)
        'remove existing OCR results and perform OCR on page one with profile OCRProfile
        Dim i As Integer
        Dim oPRP As IMpsPageRecogProfile
        Dim oPR As New MpsPageRecognizing

        'OCR only on page 1
        pXDoc.CDoc.Pages(0).SuppressOCR=False

        '# Remove any representations, before proceeding to perform full page recognition
        For i = pXDoc.Representations.Count -1 To 0 Step -1
            pXDoc.Representations.Remove (i)
        Next

        Set oPRP = Project.RecogProfiles.ItemByName(OCRProfile) '# Use the page recognition profile OCRProfile
        oPR.Recognize(pXDoc, oPRP, 0) '# Perform recognition on the first page

        '# At design time the text lines need to be analysed. At runtime this will be done automatically
        If Project.ScriptExecutionMode = CscScriptExecutionMode.CscScriptModeServerDesign Then pXDoc.Representations(0).AnalyzeLines
    End Sub