Example Schema: Variable Width Columns (Web Server Log)

This schema loads a standard web server access log into Texis. The data contains variable-width fields with no consistent delimiter, and one of the fields is optional, so the schema uses a single Rex expression that matches an entire record.

database /tmp/testdb
table    htlog
recdelim \x0a
multiple
trimspace
datefmt dd/mmm/yyyy:HH:MM:SS
# The recexpr below is wrapped for display. Enter it as a single line,
# joining the two parts with nothing added between them.
recexpr  >>^\P=[^ ]+ +-?[^ ]* +-?[^ ]* +\[=[^\]]+]= +"=[^ ]+ +
[^ "]+ HTTP/?\digit?\.?\digit?"= +[^ ]+ +[^ \x0a]+
#       Name            Type            Tag     Default value
field   Client          varchar(40)     2
field   Ident           varchar(40)     5
field   User            varchar(20)     8
field   Date            date            11
field   Method          varchar(10)     15
field   Request         varchar(100)    17
field   Protocol        varchar(10)     18-21
field   Status          integer         24
field   Bytes           integer         26

Here is sample data for the above schema. Long records are wrapped here for display; in the log file each record occupies a single line.

198.49.220.90 - - [03/Sep/1996:13:34:25 -0400] "GET /jump/
Demonstrations.html HTTP/1.0" 200 2622
index.thunderstone.com - - [03/Sep/1996:13:34:57 -0400] "GET
/hrline.gif HTTP/1.0"
thunder.thunderstone.com - - [03/Sep/1996:14:18:01 -0400] "GET
/" 200 1857
thunder.thunderstone.com rfc931 - [03/Sep/1996:14:18:02 -0400]
"GET /" 200 1857
thunder.thunderstone.com rfc931 mw [03/Sep/1996:14:18:03 -0400]
"GET /" 200 1857
thunder.thunderstone.com - mw [03/Sep/1996:14:18:04 -0400] "GET /"
200 1857
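
To make the field mapping easier to follow, here is a minimal Python sketch that approximates the record expression with Python's re module. It is not Texis or Rex code, and Rex syntax differs from Python regular expressions. The tag numbers in the schema appear to correspond to the numbered subexpressions of the recexpr; the sketch uses named groups instead. The names LOG_RE and parse_line are hypothetical, the sample lines are the wrapped records above joined back into single lines, and the numeric-only Status and Bytes patterns are a simplifying assumption.

```python
import re

# Approximation of the schema's recexpr in Python regex syntax (an assumption,
# not a translation guaranteed to match Rex behavior in every case).
LOG_RE = re.compile(
    r'^(?P<Client>[^ ]+) +'
    r'-?(?P<Ident>[^ ]*) +'            # a lone "-" leaves Ident empty
    r'-?(?P<User>[^ ]*) +'             # a lone "-" leaves User empty
    r'\[(?P<Date>[^\]]+)\] +'
    r'"(?P<Method>[^ ]+) +(?P<Request>[^ "]+)'
    r'(?: (?P<Protocol>HTTP/\d\.\d))?" +'   # the optional field
    r'(?P<Status>\d+) +(?P<Bytes>\d+)$'
)


def parse_line(line):
    """Return a dict of fields for one log record, or None if it does not match."""
    match = LOG_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    record = match.groupdict()
    # Status and Bytes are integer columns in the schema.
    record["Status"] = int(record["Status"])
    record["Bytes"] = int(record["Bytes"])
    return record


if __name__ == "__main__":
    samples = [
        '198.49.220.90 - - [03/Sep/1996:13:34:25 -0400] '
        '"GET /jump/Demonstrations.html HTTP/1.0" 200 2622',
        'thunder.thunderstone.com rfc931 mw [03/Sep/1996:14:18:03 -0400] '
        '"GET /" 200 1857',
    ]
    for sample in samples:
        print(parse_line(sample))
```

In this sketch, a record without a protocol yields None for Protocol, and a record that lacks the status and byte count does not match at all, so parse_line returns None for it. How Texis itself treats such a record is not covered here.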


Copyright © Thunderstone Software     Last updated: Apr 15 2024