This example loads data from a tagged format, Internet email in this
case. Each field starts on a new line and is preceded by a tag of the
form "Tag: ". Records are delimited by the leading "From " line of
each message.
database /tmp/testdb
table mail
recdelim >>\n=\n\F\RFrom\x20
multiple
firstmatch
datefmt x, dd mmm yyyy HH:MM:SS
# Name Type Tag Default value
field From varchar From ''
field To varchar To ''
field Subject varchar Subject ''
field Date date Date
field Text varchar -
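To make the behavior described above concrete, the following Python sketch mimics it outside Texis: it splits mailbox text into records at each line beginning with "From " and takes the first "Tag:" line for each field. It is an illustration only, not how the importer is implemented. The function names split_records and tagged_fields are hypothetical, and returning '' for every missing tag (including Date, which has no default in the schema) is an assumption made for simplicity.

```python
import re

def split_records(mailbox: str) -> list[str]:
    """Split mailbox text into messages at each line that starts with 'From '."""
    # With MULTILINE, ^ matches at the start of the text and after every newline.
    starts = [m.start() for m in re.finditer(r"^From ", mailbox, flags=re.MULTILINE)]
    ends = starts[1:] + [len(mailbox)]
    return [mailbox[s:e] for s, e in zip(starts, ends)]

def tagged_fields(record: str, tags=("From", "To", "Subject", "Date")) -> dict[str, str]:
    """Return the value of the first 'Tag: value' line for each tag."""
    fields = {}
    for tag in tags:
        # "From:" (with a colon) is the header; the "From " record delimiter
        # line has no colon and therefore never matches here.
        m = re.search(rf"^{re.escape(tag)}:[ \t]*(.*)$", record, flags=re.MULTILINE)
        fields[tag] = m.group(1) if m else ""  # assumption: '' when absent
    return fields
```

Note that a body line that happens to start with "From " would also begin a new record in this sketch, which is why mailbox writers conventionally escape such lines.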
Here's another schema that imports more of the email fields and handles more variants of the format.
table mail
recdelim >>\n=\n\F\RFrom\x20
recsize 2000000
firstmatch
trimspace
datefmt x, dd mmm yyyy HH:MM:SS
# Name Type Tag Default
field Headers varchar(512) />>\n\n\P=\RFrom\x20=!$$+ ''
field Body varchar(1024) />>\n\n\RFrom\x20=!$$\P+!\n\n\RFrom\x20+ ''
field From varchar(80) />>\x0a\RFrom:=\P[\x20\x09]*!\x0a[^\x0a\x20\x09]+\F\x0a[^\x0a\x20\x09] ''
field To varchar(80) />>\x0a\RTo:=\P[\x20\x09]*!\x0a[^\x0a\x20\x09]+\F\x0a[^\x0a\x20\x09] ''
field Subject varchar(80) />>\x0a\RSubject:=\P[\x20\x09]*!\x0a[^\x20\x09]+\F\x0a[^\x20\x09] ''
field Date date />>\x0a\RDate:=\P[\x20\x09]*!\x0a[^\x0a\x20\x09]+\F\x0a[^\x0a\x20\x09]
field Returnpath varchar(80) />>\x0a\RReturn-Path:=\P[\x20\x09]*!\x0a[^\x0a\x20\x09]+\F\x0a[^\x0a\x20\x09] ''
field Msgid varchar(80) />>\x0a\RMessage-ID:=\P[\x20\x09]*!\x0a[^\x0a\x20\x09]+\F\x0a[^\x0a\x20\x09] ''
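As far as can be read from the expressions in this schema, each header field skips the "Name:" tag and any spaces or tabs after it, then captures text up to the next newline that is followed by a character other than a space or tab. That lets a value continue onto indented lines. The Python sketch below imitates that idea with ordinary regular expressions; it is not a translation of the REX syntax, and the function names split_headers_body and header_value are hypothetical. It also assumes the headers end at the first blank line of the message.

```python
import re

def split_headers_body(record: str) -> tuple[str, str]:
    """Split a message at the first blank line into (headers, body)."""
    headers, _, body = record.partition("\n\n")
    return headers, body

def header_value(headers: str, name: str) -> str:
    """Return a header's value, including indented continuation lines."""
    # Match "\nName:", skip leading spaces/tabs, then take text lazily up to
    # a newline followed by a non-blank character, or the end of the headers.
    pattern = rf"\n{re.escape(name)}:[ \t]*(.*?)(?=\n[^ \t\n]|\Z)"
    # Prepend a newline so a header on the very first line can match too.
    m = re.search(pattern, "\n" + headers, flags=re.DOTALL)
    return m.group(1).strip() if m else ""
```

The match is case-sensitive, so "From:" is found but "from:" is not; returning '' for a missing header mirrors the '' defaults in the schema.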
Here is an example record for the above schemas. You could also use your own mailbox as input.
From jsmith@somesite.com Mon Dec 08 14:33:33 1997
From: "Smith, John" <jsmith@somesite.com>
To: jdoe@thunderstone.com
Subject: I want to purchase Texis
Date: Mon, 8 Dec 1997 14:33:32 -0500

I have looked at all of the information about Texis on your
web site and would like to place an order. I will be contacting
you by phone tomorrow.
Sincerely,
John Smith
Database Administrator
Smith Consultants
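When preparing your own input, it can help to confirm independently that a Date header has the layout the schemas' datefmt line expects (weekday, day, month name, year, time). The short Python sketch below does this with the standard library's email.utils.parsedate_to_datetime; it is a sanity check separate from Texis, and the helper name parse_mail_date is hypothetical.

```python
from email.utils import parsedate_to_datetime

def parse_mail_date(value: str):
    """Parse an email Date header into a timezone-aware datetime."""
    return parsedate_to_datetime(value)

# The Date header from the example record above.
sent = parse_mail_date("Mon, 8 Dec 1997 14:33:32 -0500")
print(sent.isoformat())
```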