<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<TITLE> [Mageia-discuss] Reading payment forms with a scanner
</TITLE>
<LINK REL="Index" HREF="index.html" >
<LINK REL="made" HREF="mailto:mageia-discuss%40mageia.org?Subject=Re%3A%20%5BMageia-discuss%5D%20Reading%20payment%20forms%20with%20a%20scanner&In-Reply-To=%3C511C9E90.2020102%40unige.ch%3E">
<META NAME="robots" CONTENT="index,nofollow">
<META http-equiv="Content-Type" content="text/html; charset=us-ascii">
<LINK REL="Previous" HREF="009185.html">
<LINK REL="Next" HREF="009188.html">
</HEAD>
<BODY BGCOLOR="#ffffff">
<H1>[Mageia-discuss] Reading payment forms with a scanner</H1>
<B>Juergen Harms</B>
<A HREF="mailto:mageia-discuss%40mageia.org?Subject=Re%3A%20%5BMageia-discuss%5D%20Reading%20payment%20forms%20with%20a%20scanner&In-Reply-To=%3C511C9E90.2020102%40unige.ch%3E"
TITLE="[Mageia-discuss] Reading payment forms with a scanner">juergen.harms at unige.ch
</A><BR>
<I>Thu Feb 14 09:21:36 CET 2013</I>
<P><UL>
<LI>Previous message: <A HREF="009185.html">[Mageia-discuss] Mageia Cauldron ksplash theme
</A></li>
<LI>Next message: <A HREF="009188.html">[Mageia-discuss] Reading payment forms with a scanner
</A></li>
<LI> <B>Messages sorted by:</B>
<a href="date.html#9187">[ date ]</a>
<a href="thread.html#9187">[ thread ]</a>
<a href="subject.html#9187">[ subject ]</a>
<a href="author.html#9187">[ author ]</a>
</LI>
</UL>
<HR>
<!--beginarticle-->
<PRE>I just had a nasty experience with an ebanking payment that was
rejected because of a typo (and I was not sent any note about it).
Does anybody have experience/advice on using a normal scanner for
reading the essential fields of payment forms to make ebanking more
efficient and less error-prone?
I just did some googling and quick checks along the following lines:
- tile the payment forms on the scanner (I have an Epson 1260), so that
only the reading zone at the bottom of each form is visible,
- scan with xsane (selecting adequate settings - different from those I
ordinarily use)
- if necessary, use gimp to cut away zones with garbage that upsets the
OCR conversion by tesseract (this can be avoided by properly setting
the reading area in xsane)
- use tesseract to do OCR
- filter the output to throw away garbage lines, and to correct
characters that frequently get mis-interpreted (e.g. B->8, Z->2, O->0,
D->0 etc.)
- output the data thus produced, formatted for copy-paste into the
ebanking form
That works surprisingly well, but is excessively complicated to handle
(easy to make handling mistakes, not fit to give to my wife). Are
there tools that help automate these steps and integrate them into a
single tool? If not, it should not be too difficult to do some
scripting (but I don't want to re-invent things). (And yes, some years
ago I tried one of these small reading sticks that you slide over the
form. I ditched it: it only works on Windows and produces an excessive
number of errors.)
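
To make the tesseract and filter steps concrete, here is a minimal
sketch of how they might be scripted in Python. It is only a starting
point under assumptions of mine, not a tested tool: the function names,
the thresholds MIN_LENGTH and MIN_DIGIT_RATIO, and the rule "a usable
reading-zone line is mostly digits" are all hypothetical, and the call
"tesseract IMAGE stdout" assumes a tesseract version that accepts
"stdout" as output name. Only the four corrections listed above are
applied.

```python
import subprocess

# Letter-to-digit corrections named in the message; extend as needed.
CORRECTIONS = str.maketrans({"B": "8", "Z": "2", "O": "0", "D": "0"})

# Assumptions: a usable reading-zone line is fairly long and mostly digits.
MIN_LENGTH = 10
MIN_DIGIT_RATIO = 0.8


def run_tesseract(image_path):
    """Run the tesseract CLI on one image and return the recognised text."""
    result = subprocess.run(
        ["tesseract", image_path, "stdout"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def clean_line(line):
    """Apply the character corrections and collapse runs of whitespace."""
    return " ".join(line.translate(CORRECTIONS).split())


def looks_like_reading_zone(line):
    """Return True if the line is long enough and mostly digits."""
    chars = line.replace(" ", "")
    if MIN_LENGTH > len(chars):
        return False
    digits = sum(c.isdigit() for c in chars)
    return digits / len(chars) >= MIN_DIGIT_RATIO


def filter_ocr_output(text):
    """Correct every line, then keep only those that look like data."""
    cleaned = (clean_line(raw) for raw in text.splitlines())
    return [line for line in cleaned if looks_like_reading_zone(line)]


if __name__ == "__main__":
    import sys
    # Usage: python readzone.py scan1.png scan2.png ...
    for path in sys.argv[1:]:
        for line in filter_ocr_output(run_tesseract(path)):
            print(line)
```

The corrections are applied blindly to every character, which is only
reasonable because the reading zone is expected to be numeric; the
script would mangle any field that legitimately contains letters.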
Juergen
</PRE>
<!--endarticle-->
<hr>
<a href="https://www.mageia.org/mailman/listinfo/mageia-discuss">More information about the Mageia-discuss
mailing list</a><br>
</body></html>