The SUBSET operator allows subset queries, and is typically
used with multi-value (i.e. strlst
) fields that are treated as
sets. It is true if the left-side is a subset of the right-side,
i.e. if there are no values on the left-side that are missing from the
right-side. Duplicates count, i.e. they must match one-to-one from
left side to right.
For example, suppose the table Users
contains one row per
user (UserName
), and has a strlst
column
GroupMembership
that lists all the groups that row's user is a
member of. To find all users that are members of groups
"Management
", "Sales
" and
"Marketing
", a SUBSET query can be used:
SELECT UserName
FROM Users
WHERE ('Management', 'Sales', 'Marketing')
IS SUBSET OF GroupMembership;
(Syntactically, SUBSET is always used as part of the phrase IS SUBSET OF, as it is only valid in WHERE clauses.) The above query will return the users that are members of all three groups - including any users that may also be members of additional groups.
Note that SUBSET is not commutative, i.e. if the left- and
right-sides are reversed, the meaning is changed (unlike e.g. INTERSECT). If A
is a subset of B
, then B
is
not necessarily a subset of A
; B
is a subset of
A
if and only if both sets contain the same values. E.g. this
query:
SELECT UserName
FROM Users
WHERE GroupMembership
IS SUBSET OF ('Management', 'Sales', 'Marketing');
while merely the reversed version of the earlier query, behaves differently: it would list the users whose are in zero or more of the Management, Sales or Marketing groups - and are not in any other groups.
In set logic the empty set is a subset of any set; thus if there are no values on the left-side, SUBSET is true no matter what the right-side value(s) are. Note that SUBSET interprets an empty varchar value as empty-set, not single-item empty-string set set (as IN does). See here for details, as well as additional behaviors that IN, SUBSET and INTERSECT share in common. SUBSET was added in Texis version 7.
A SUBSET query can often utilize a regular (B-tree) index to
increase performance. Generally the index should be created with
indexvalues
set to splitstrlst
(the default), as this
enables individual values of strlst
s to be accessed as needed.
There are some limitations and caveats for SUBSET and indexes
however:
strlst
column (either side):
Queries with empty-set parameters (i.e. zero-item strlst
,
or empty varchar
) and a strlst
column cannot use an
indexvalues=splitstrlst
index, regardless of which side of
SUBSET the parameter and column are on. An index with
indexvalues=all
can be used however. It may be created in
addition to the normal indexvalues=splitstrlst
index, and
the Texis optimizer will choose the appropriate one at search
time.strlst
column left-side, non-empty parameter
right-side:
With a strlst
column on the left-side, and a non-empty
parameter on the right, empty rows will not be returned if an
index is used - even though they properly match (as empty set is
a subset of any set).
These caveats are due to limitations in indexvalues=strlst
indexes; see here for more information.