lookup

By combining the lookup() function with a GROUP BY, a column may be grouped into bins or ranges - e.g. for price-range grouping - instead of distinct individual values. Syntax:

lookup(keys, ranges[, names])

The keys argument is one (or more, e.g. strlst) values to look up; each is searched for in the ranges argument, which is one (or more, e.g. strlst) ranges. All range(s) that the given key(s) match will be returned. If the names argument is given, the corresponding names value(s) are returned instead; this allows ranges to be renamed into human-readable values. If names is given, the number of its values must equal the number of ranges.

Each range is a pair of values (lower and upper bounds) separated by ".." (two periods). The range is optionally surrounded by square (bound included) or curly (bound excluded) brackets. E.g.:

[10..20}

denotes the range 10 to 20: including 10 ("[") but not including ("}") 20. Both an upper and lower bracket must be given if either is present (though they need not be the same type). The default if no brackets are given is to include the lower bound but exclude the upper bound; this makes consecutive ranges non-overlapping, if they have the same upper and lower bound and no brackets (e.g. "0..10,10..20"). Either bound may be omitted, in which case that bound is unlimited. Each range's lower bound must not be greater than its upper bound, nor equal if either bound is exclusive.

If a ranges value is not varchar/char, or does not contain "..", its entire value is taken as a single inclusive lower bound, and the exclusive upper bound will be the next ranges value's lower bound (or unlimited if no next value). E.g. the varint lower-bound list:

0,10,20,30

is equivalent to the strlst range list:

[0..10},[10..20},[20..30},[30..]

By using the lookup() function in a GROUP BY, a column may be grouped into ranges. For example, given a table Products with the following SKUs and float prices:

SKU    Price
    ------------
    1234   12.95
    1235    5.99
    1236   69.88
    1237   39.99
    1238   29.99
    1239   25.00
    1240   50.00
    1241   -2.00
    1242  499.95
    1243   19.95
    1244    9.99
    1245  125.00

they may be grouped into price ranges (with most-products first) with this SQL:

SELECT   lookup(Price, convert('0..25,25..50,50..,', 'strlst', 'lastchar'),
     convert('Under $25,$25-49.99,$50 and up,', 'strlst', 'lastchar'))
       PriceRange, count(SKU) NumberOfProducts
FROM Products
GROUP BY lookup(Price, convert('0..25,25..50,50..,', 'strlst', 'lastchar'),
     convert('Under $25,$25-49.99,$50 and up,', 'strlst', 'lastchar'))
ORDER BY 2 DESC;

or this Vortex:

<$binValues =   "0..25"      "25..50"     "50..">
<$binDisplays = "Under $$25" "$$25-49.99" "$$50 and up">
<sql row "select lookup(Price, $binValues, $binDisplays) PriceRange,
              count(SKU) NumberOfProducts
          from Products
          group by lookup(Price, $binValues, $binDisplays)
          order by 2 desc">
  <fmt "%10s: %d\n" $PriceRange $NumberOfProducts>
</sql>

which would give these results:

PriceRange NumberOfProducts
------------+------------+
Under $25,              4
$50 and up,             4
$25-49.99,              3
                        1

The trailing commas in PriceRange values are due to them being strlst values, for possible multiple ranges. Note the empty PriceRange for the fourth row: the -2 Price matched no ranges, and hence an empty PriceRange was returned for it.


CAVEATS
The lookup() function as described above was added in Texis version 7.06.1528745000 20180611.

A different version of the lookup() function was first added in version 7.01.1386980000 20131213: it only took the second range syntax variant (single lower bound); range values had to be in ascending order (by keys type); only the first matching range was returned; and if a key did not match any range the first range was returned.


SEE ALSO
lookupCanonicalizeRanges, lookupParseRange


Copyright © Thunderstone Software     Last updated: Oct 5 2023
Copyright © 2024 Thunderstone Software LLC. All rights reserved.