Date Formats

Where dates appear in the text in a predictable format, they can be captured as a date field using the data type date, rather than as characters only. This is shown in the example above with the log file. You may also use the data type datestr to perform date parsing but store the results in a varchar field (datestr was added in version 2.12, Oct 2 1998). Timport allows for some flexibility in the manner in which the dates might appear.

The keyword datefmt specifies the format in which date fields are expected. The default is Texis style:

yyyy-mm-dd[ HH[:MM[:SS]]]

where the first 4 digits represent the year, then 2 digits for the month, 2 digits for the day, and optionally 2 digits each for hours, minutes, and seconds. The scanner in Timport will treat all punctuation and space as delimiters.

In the above schema file, the datefmt keyword is defined to match the way dates appear in the log file:

datefmt  yy-mm-dd HHMM

This will match dates as above:

95-04-21 1300
95-04-21 1301
95-04-21 1800

Use these specifications to define the expected date format as the value for datefmt. Specify:

y for year digits
m for month digits or month name
d for day of month digits
j for day of year digits
H for hour digits
M for minute digits
S for second digits
p for "am" or "pm" string
x for junk

The scanner will read up to the next delimiter or the number of digits you specify, whichever comes first.
Any non-digit is a delimiter for the digit-only types.
'p' will only check for 'a' or 'p', then skip all trailing alphabetics.
'x' will skip all alphabetics.

1900 will be added to 2-digit year specs greater than 69 (70 through 99 become 1970 through 1999). 2000 will be added to 2-digit year specs less than 70 (00 through 69 become 2000 through 2069).
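
The 2-digit year rule can be illustrated with a short Python sketch. This is not Timport code; the function name expand_two_digit_year is hypothetical and only restates the rule given above.

```python
def expand_two_digit_year(yy):
    """Apply the pivot rule described above to a 2-digit year (hypothetical helper)."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a 2-digit year, got %r" % (yy,))
    # Greater than 69: add 1900. Less than 70: add 2000.
    return yy + 1900 if yy > 69 else yy + 2000


print(expand_two_digit_year(95))  # 1995
print(expand_two_digit_year(69))  # 2069
```

Note that under this rule a 2-digit year can never mean a date before 1970 or after 2069; a 4-digit year spec is needed for anything outside that window.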

Examples:

Format                 Matches                  Means
yy-mm-dd HHMM          95-04-27 16:54           1995-04-27 16:54:00
dd-mm-yyyy HH:MM:SS    27/04/1995 16:54:32      1995-04-27 16:54:32
yyyymmdd HHMMSS p      19950427 045432 pm       1995-04-27 16:54:32
x, dd mmm yy HH:MM:SS  Thu, 27 Apr 95 16:55:56  1995-04-27 16:55:56
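
To make the scanning rules concrete, here is a rough Python sketch of a scanner that follows the behavior described in this section and reproduces the four examples in the table. It is an illustration only, not Timport's implementation. The names split_format and scan_date are hypothetical, and several details are assumptions: month names are matched on their first three letters, a missing field falls back to an arbitrary default, and 12 am / 12 pm are handled in the conventional way.

```python
from datetime import datetime, timedelta

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
SPEC_LETTERS = "ymdjHMSpx"


def split_format(fmt):
    """Group a datefmt string into [letter, width] runs; other characters only separate runs."""
    fields, prev = [], None
    for ch in fmt:
        if ch in SPEC_LETTERS:
            if ch == prev:
                fields[-1][1] += 1
            else:
                fields.append([ch, 1])
        prev = ch
    return fields


def scan_date(fmt, text):
    """Scan text according to fmt and return a datetime (illustrative sketch)."""
    # Defaults for fields the format does not mention are an assumption.
    vals = {"y": 1970, "m": 1, "d": 1, "H": 0, "M": 0, "S": 0}
    meridiem = day_of_year = None
    pos = 0
    for letter, width in split_format(fmt):
        # All punctuation and space are delimiters: skip them before each field.
        while pos < len(text) and not text[pos].isalnum():
            pos += 1
        if letter in "px" or (letter == "m" and text[pos:pos + 1].isalpha()):
            start = pos
            while pos < len(text) and text[pos].isalpha():
                pos += 1  # 'p' and 'x' skip all (trailing) alphabetics
            word = text[start:pos].lower()
            if letter == "p":
                if word[:1] not in ("a", "p"):
                    raise ValueError("expected am/pm at position %d" % start)
                meridiem = word[0]
            elif letter == "m":
                vals["m"] = MONTHS.index(word[:3]) + 1  # ValueError if unknown
            continue
        # Digit field: read up to the next non-digit or `width` digits, whichever comes first.
        start = pos
        while pos < len(text) and pos - start < width and text[pos].isdigit():
            pos += 1
        if pos == start:
            raise ValueError("expected digits for %r at position %d" % (letter, start))
        number = int(text[start:pos])
        if letter == "y" and width == 2:
            number += 1900 if number > 69 else 2000  # 2-digit year rule
        if letter == "j":
            day_of_year = number
        else:
            vals[letter] = number
    hour = vals["H"]
    if meridiem == "p" and hour < 12:
        hour += 12
    elif meridiem == "a" and hour == 12:
        hour = 0
    result = datetime(vals["y"], vals["m"], vals["d"], hour, vals["M"], vals["S"])
    if day_of_year is not None:
        # Day-of-year overrides month and day: count from January 1.
        result = result.replace(month=1, day=1) + timedelta(days=day_of_year - 1)
    return result


print(scan_date("x, dd mmm yy HH:MM:SS", "Thu, 27 Apr 95 16:55:56"))  # 1995-04-27 16:55:56
```

Because delimiters are skipped rather than matched literally, the format yy-mm-dd HHMM accepts both 95-04-27 16:54 and 95-04-21 1300 in this sketch, which agrees with the examples above.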

Capturing dates as date values allows greater-than (>) and less-than (<) comparisons, so documents can be selected by date range, adding to the power of the database.
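
The difference between comparing dates as text and comparing them as date values can be seen in a small Python illustration. It uses Python's own datetime.strptime rather than Timport, and the sample strings are made up for the example.

```python
from datetime import datetime

# Made-up sample values in dd/mm/yyyy order, as they might appear in raw text.
raw = ["27/04/1995 16:54:32", "01/05/1995 09:00:00", "15/03/1996 12:00:00"]


def to_date(s):
    """Parse one sample string into a real date value."""
    return datetime.strptime(s, "%d/%m/%Y %H:%M:%S")


# Compared as characters, the strings sort by day digits, not chronologically.
text_order = sorted(raw)

# Compared as date values, they sort chronologically.
date_order = sorted(raw, key=to_date)

# A date range test with > and < only makes sense on date values.
start, end = datetime(1995, 5, 1), datetime(1996, 1, 1)
in_range = [s for s in raw if start < to_date(s) < end]

print(text_order)
print(date_order)
print(in_range)
```

Here the text ordering puts the 1996 entry between the two 1995 entries, while the date ordering is chronological and the range test selects only the May 1995 entry.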