NPM, the Numeric Pattern Matcher, is one of several pattern
matchers that the user can call when sending out a Metamorph
query. It is signified by a pound sign `#' in the starting
position, in the same way that the tilde `~' calls SPM,
a percent sign `%' calls XPM, a forward slash `/' calls REX,
and no special character in the first position
(where there are equivalences) calls PPM or SPM.
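The prefix convention above can be pictured as a simple lookup on the first character of each query term. The following is only an illustrative sketch in Python, not Metamorph's own code; the names `PREFIX_TO_MATCHER` and `matcher_for` are hypothetical, and the choice between PPM and SPM for unprefixed terms is left unresolved because it depends on whether equivalences exist.

```python
# Hypothetical illustration of the first-character convention described
# above. This is not Metamorph's actual dispatch logic.
PREFIX_TO_MATCHER = {
    "#": "NPM",  # numeric pattern matcher
    "~": "SPM",
    "%": "XPM",
    "/": "REX",
}


def matcher_for(term: str) -> str:
    """Return the matcher implied by the first character of a query term."""
    if not term:
        raise ValueError("empty query term")
    # No special first character: PPM or SPM, depending on equivalences.
    return PREFIX_TO_MATCHER.get(term[0], "PPM or SPM")


if __name__ == "__main__":
    for term in ["cosmetic", "sales", "$", "#>1000000"]:
        print(term, "->", matcher_for(term))
```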
There are still many numeric patterns that are best located with a
REX expression to match the range of characters desired.
However, when you need the program to interpret your query as a
numeric quantity, use NPM. NPM does number crunching
through all possible numbers found in the text to locate those numbers
which are in the specified range of desired numbers. Therefore, where
a lot of numeric searching is being done, you may find that a math
co-processor can speed up such searches.
Since all numbers in the text become items to be checked for numeric
validity, one should tie down the search by specifying one or more
other items along with the NPM item. For example, you might
enter on the query line:
cosmetic sales $ #>1000000
Such a search would locate a sentence like:
Income produced from lipstick brought the company $4,563,000 last year.
In this case "income" is located by PPM as a match to "sales",
"lipstick" is located by PPM as a match to "cosmetic", the English
character "$" signifying "dollars" is located by SPM as a match to
"$", and the numeric quantity represented in the text as "4,563,000"
is located by NPM as a match to "#>1000000" (a number greater than
one million). Another example:
cosmetic sales $ #>million
Even though one can locate the same sentence by entering the above
query, it is strongly recommended that numeric quantities on the query
line be entered as precise digits. The true intent of NPM is to make
it possible to locate, and treat as a numeric value, information in
text which was not entered as such.
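To make the idea of treating text as a numeric value concrete, here is a rough Python sketch that pulls quantities out of a sentence and compares them with a threshold. It is an assumption-laden simplification, not NPM's parser: the regular expression, the `MAGNITUDES` table, and the function names are all hypothetical, and it handles only digits with optional commas, a decimal part, and a trailing magnitude word.

```python
import re

# Assumed magnitude words for this sketch only; NPM's real vocabulary is
# not specified here.
MAGNITUDES = {"thousand": 1e3, "million": 1e6, "billion": 1e9}

# Digits with optional thousands separators, an optional decimal part,
# and an optional magnitude word such as "million".
NUMBER_RE = re.compile(
    r"(\d{1,3}(?:,\d{3})+|\d+)(\.\d+)?(?:\s+(thousand|million|billion))?",
    re.IGNORECASE,
)


def quantities(text):
    """Return every numeric quantity found in text as a float."""
    values = []
    for match in NUMBER_RE.finditer(text):
        whole, fraction, word = match.groups()
        value = float(whole.replace(",", "") + (fraction or ""))
        if word:
            value *= MAGNITUDES[word.lower()]
        values.append(value)
    return values


def has_quantity_over(text, threshold):
    """True if any quantity in text is greater than threshold."""
    return any(value > threshold for value in quantities(text))


if __name__ == "__main__":
    sentence = ("Income produced from lipstick brought the company "
                "$4,563,000 last year.")
    print(quantities(sentence), has_quantity_over(sentence, 1000000))
```

Note that the dollar sign is skipped by the pattern, which mirrors the point that the unit is a separate query item from the quantity.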
You would find the above sentence even without specifying the string
"$", but realize that the dollar sign ($) in the text
is not part of the numeric quantity located by NPM. There may
be cases where it is important to specify both the quantity and the
unit. For example, if you are looking for quantities of coal, you
wouldn't want to find coal pricing information by mistake. Compare
these two searches:
Query1: Australia coal tons #>500
Query2: Australia coal $ #>500
The first would locate the sentence:
Petroleum Consolidated mined 1200 tons of coal in Australia.
The second would locate the sentence:
From dividends paid out of the $3.5 million profit in the coal industry, they were able to afford a vacation in Australia.
Some units, such as kilograms, milliliters, and nanoamps, are
understood by NPM to be their true value; that is, in the first case,
1000 grams. Use NPMP to find out which units are understood and
how they will be interpreted. The caret mark (^) shows where the parser
stops understanding valid numeric quantities. Note that an abbreviation such
as "kg" is not understood as a quantity, but only as a unit; therefore,
"5 kilograms" has a numeric quantity of 5000 (grams), whereas
"5 kg" has a numeric quantity of 5 (kg's).
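The difference between "5 kilograms" and "5 kg" can be sketched as a prefix-scaling step that applies only to spelled-out units. The Python below is a hedged illustration: the `PREFIX_SCALE` and `BASE_UNITS` tables are assumptions covering just the three units named above, and `interpret` is a hypothetical helper. NPMP remains the authority on which units NPM actually understands.

```python
# Assumed tables for illustration only; run NPMP to see which units NPM
# really understands and how it interprets them.
PREFIX_SCALE = {"kilo": 1e3, "milli": 1e-3, "nano": 1e-9}
BASE_UNITS = {"gram", "liter", "amp"}


def interpret(number, unit):
    """Scale number by a spelled-out metric prefix, if the unit has one.

    Abbreviations such as "kg" are returned unchanged, matching the
    behavior described above where "5 kg" stays a quantity of 5.
    """
    for prefix, scale in PREFIX_SCALE.items():
        if unit.startswith(prefix):
            base = unit[len(prefix):].rstrip("s")
            if base in BASE_UNITS:
                return number * scale, base + "s"
    return number, unit


if __name__ == "__main__":
    print(interpret(5, "kilograms"))
    print(interpret(5, "kg"))
```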
Beware of entering something that doesn't make sense. For example, a
quantity cannot be less than 6 and greater than 10 at the same time, and
therefore "#<6>10" will result in a controlfile that the engine is
unable to process.
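One way to catch a contradictory specification before it is sent is to turn the bounds into an interval and check that the interval is not empty. The sketch below is an assumption about how such a check could look, not a description of Metamorph's behavior; `parse_range` and `in_range` are hypothetical names, and only digit quantities with the operators >, >=, <, and <= are handled.

```python
import re

# One comparison operator followed by a plain number.
BOUND_RE = re.compile(r"(>=|<=|>|<)(\d+(?:\.\d+)?)")


def parse_range(spec):
    """Parse a spec like '#>500' into (low, low_incl, high, high_incl).

    Raises ValueError if the spec has no bounds or describes an empty
    range, such as '#<6>10'.
    """
    if not spec.startswith("#"):
        raise ValueError("NPM items start with '#'")
    bounds = BOUND_RE.findall(spec[1:])
    if not bounds:
        raise ValueError("no numeric bounds found in " + spec)
    low, high = float("-inf"), float("inf")
    low_incl = high_incl = False
    for op, number in bounds:
        value = float(number)
        if op in (">", ">="):
            low, low_incl = value, op == ">="
        else:
            high, high_incl = value, op == "<="
    if low > high or (low == high and not (low_incl and high_incl)):
        raise ValueError("empty range: " + spec)
    return low, low_incl, high, high_incl


def in_range(value, bounds):
    """True if value satisfies the parsed bounds."""
    low, low_incl, high, high_incl = bounds
    above = value >= low if low_incl else value > low
    below = value <= high if high_incl else value < high
    return above and below


if __name__ == "__main__":
    print(in_range(1200, parse_range("#>500")))
    try:
        parse_range("#<6>10")
    except ValueError as err:
        print("rejected:", err)
```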
Do not enter ambiguity on the query line; NPM is intended to deal with
ambiguity in the text, not in the query. The safest way to enter NPM
searches is by specifying the accurate numeric quantity desired. Example:
date #>=1980<=1989
This query will locate lines containing a date specification and a year, where
one wants only those entries from the 1980s. It would also locate dates that
are spelled out, as in legal documents. Example:
retirement benefits age #>50<80
This query will locate passages about insurance benefits which mention
age 54, 63, and so on. Reflecting the truer intent of NPM, a sentence like
the following could also be retrieved:
At fifty-five one is awarded the company's special Golden Age program.
If a numeric string contains a space, it must be enclosed in quotes
to be interpreted correctly. So, although it is strongly discouraged, one
could enter the following:
revenue "#>fifty five"
This query can locate references like the following:
Their corporate gross income was $1.4 million before they merged with Acme Industrial.
Keep in mind that an NPM search done within the context of
Metamorph relies upon occurrences of intersections of found items
inside the specified text delimiters, just as any Metamorph search does.
It is still not a database tool. The Engine will retrieve any hit
which satisfies all search requirements, including hits which contain
additional numeric information beyond what was called for.
In an application where Metamorph Hit Markup has been enabled, exactly what was found will be highlighted. This is the easiest way to get feedback on what was located to satisfy search requirements. If there are any questions about results, review basic program theory and compare to the other types of searches as given elsewhere in this chapter.