By combining the lookup()
function with a GROUP BY
, a
column may be grouped into bins or ranges - e.g. for price-range
grouping - instead of distinct individual values. Syntax:
lookup(keys, ranges[, names])
The keys argument is one (or more, e.g. strlst
) values to
look up; each is searched for in the ranges argument, which is
one (or more, e.g. strlst
) ranges. All range(s) that the given
key(s) match will be returned. If the names argument is given,
the corresponding names value(s) are returned instead; this
allows ranges to be renamed into human-readable values. If names is given, the number of its values must equal the number of
ranges.
Each range is a pair of values (lower and upper bounds) separated by ".." (two periods). The range is optionally surrounded by square (bound included) or curly (bound excluded) brackets. E.g.:
[10..20}
denotes the range 10 to 20: including 10 ("[") but not including ("}") 20. Both an upper and lower bracket must be given if either is present (though they need not be the same type). The default if no brackets are given is to include the lower bound but exclude the upper bound; this makes consecutive ranges non-overlapping, if they have the same upper and lower bound and no brackets (e.g. "0..10,10..20"). Either bound may be omitted, in which case that bound is unlimited. Each range's lower bound must not be greater than its upper bound, nor equal if either bound is exclusive.
If a ranges value is not varchar/char, or does not
contain "..", its entire value is taken as a single inclusive
lower bound, and the exclusive upper bound will be the next ranges value's lower bound (or unlimited if no next value).
E.g. the varint
lower-bound list:
0,10,20,30
is equivalent to the strlst
range list:
[0..10},[10..20},[20..30},[30..]
By using the lookup()
function in a GROUP BY
, a column may
be grouped into ranges. For example, given a table Products
with the following SKUs and float
prices:
SKU Price
------------
1234 12.95
1235 5.99
1236 69.88
1237 39.99
1238 29.99
1239 25.00
1240 50.00
1241 -2.00
1242 499.95
1243 19.95
1244 9.99
1245 125.00
they may be grouped into price ranges (with most-products first) with this SQL:
SELECT lookup(Price, convert('0..25,25..50,50..,', 'strlst', 'lastchar'),
convert('Under $25,$25-49.99,$50 and up,', 'strlst', 'lastchar'))
PriceRange, count(SKU) NumberOfProducts
FROM Products
GROUP BY lookup(Price, convert('0..25,25..50,50..,', 'strlst', 'lastchar'),
convert('Under $25,$25-49.99,$50 and up,', 'strlst', 'lastchar'))
ORDER BY 2 DESC;
or this Vortex:
<$binValues = "0..25" "25..50" "50..">
<$binDisplays = "Under $$25" "$$25-49.99" "$$50 and up">
<sql row "select lookup(Price, $binValues, $binDisplays) PriceRange,
count(SKU) NumberOfProducts
from Products
group by lookup(Price, $binValues, $binDisplays)
order by 2 desc">
<fmt "%10s: %d\n" $PriceRange $NumberOfProducts>
</sql>
which would give these results:
PriceRange NumberOfProducts
------------+------------+
Under $25, 4
$50 and up, 4
$25-49.99, 3
1
The trailing commas in PriceRange values are due to them being
strlst
values, for possible multiple ranges. Note the empty
PriceRange for the fourth row: the -2 Price matched
no ranges, and hence an empty PriceRange was returned for
it.
CAVEATS
The lookup()
function as described above was added in Texis
version 7.06.1528745000 20180611.
A different version of the lookup()
function was first added in
version 7.01.1386980000 20131213: it only took the second range syntax
variant (single lower bound); range values had to be in ascending
order (by keys type); only the first matching range was
returned; and if a key did not match any range the first range was
returned.
SEE ALSOlookupCanonicalizeRanges
, lookupParseRange