So far, the examples presented have shown how to retrieve and manipulate values from individual rows in a table. In this section, we will illustrate how summary information can be obtained from groups of rows in a table.
Often we find it useful to group data by some characteristic of the
group, such as department or division, or benefit level, so that
summary statistics about the group (totals, averages, etc.) can be
calculated. For example, to calculate average departmental salaries,
the user could group the salaries of all employees by department. In
Texis, the GROUP BY clause is used to divide the rows of a table into
groups that have matching values in one or more columns. The form of
this clause is:
GROUP BY column-name1 [,column-name2] ...
and it fits into the SELECT
expression in the following manner.
SELECT column-name1 [,column-name2] ...
FROM table-name
[WHERE search-condition]
[GROUP BY column-name1 [,column-name2] ... ]
[ORDER BY column-name1 [DESC] [,column-name2] [DESC] ] ... ;
The column(s) listed in the GROUP BY clause are used to form groups. The grouping is based on rows with the same value in the specified column or columns being placed in the same group. It is important to note that grouping is conceptual; the table is not physically rearranged.
As an extension Texis also allows the GROUP BY clause to consist of
expressions instead of just column names. This should be used with
caution, and the same expression should be used in the SELECT
as
in the GROUP BY clause. This is especially true if the expression
will fold multiple values together, such as dividing a number by
1000 to group quantities together if they are in the same 1000. If
you select SALARY, and GROUP BY SALARY/1000 you will see one sample
salary from the matching group.
The GROUP BY clause is normally used along with five built-in, or "aggregate" functions. These functions perform special operations on an entire table or on a set, or group, of rows rather than on each row and then return one row of values for each group.
Table here lists the aggregate functions available with Texis.
Function Name | Meaning | Example |
SUM(column name) | Total of the values in a numeric column | SUM(SALARY) |
AVG(column name) | Average of the values in a column | AVG(SALARY) |
MAX(column name) | Largest value in a column | MAX(SALARY) |
MIN(column name) | Smallest value in a column | MIN(SALARY) |
COUNT(*) | Count of the number of rows selected | COUNT(*) |
Aggregate functions are used in place of column names in the SELECT
statement. The form of the function is:
Function name ([DISTINCT] argument)
In all situations the argument represents the column name to which the function applies. For example, if the sum of all salaries is needed, then the function SUM is used and the argument is the column SALARY. When COUNT is used an asterisk (*) can be placed within the parentheses instead of a column name to count all the rows without regard to field.
If the DISTINCT keyword is used then only the unique values are processed. This is most useful with COUNT to find the number of unique values. If you use DISTINCT then you must supply a column name. DISTINCT will work with the other aggregate functions, although there is typically very little need for them. The DISTINCT feature was added in version 4.00.1002000000
Example: What is the average salary paid in each department?
Enter this statement:
SELECT DEPT, AVG(SALARY)
FROM EMPLOYEE
GROUP BY DEPT ;
Syntax Notes:
AVG
is the aggregate function name.(SALARY)
is the column on which the average is computed.DEPT
is the column by which the rows will be grouped.
The above statement will produce the following results:
DEPT AVG(SALARY)
MKT 33500
MGT 45000
LIB 22000
RND 27500
FIN 42000
In this query, all rows in the EMPLOYEE
table that have the same
department codes are grouped together. The aggregate function AVG is
calculated for the salary column in each group. The department code
and the average departmental salary are displayed for each department.
A SELECT
clause that contains an aggregate function cannot contain any
column name that does not apply to a group; for example:
The statement:
SELECT ENAME, AVG(SALARY)
FROM EMPLOYEE
GROUP BY DEPT ;
results in the message
Error at Line 1: Not a GROUP BY Expression
It is not permissible to include column names in a SELECT
clause that
are not referenced in the GROUP BY clause. The only column names that
can be displayed, along with aggregate functions, must be listed in
the GROUP BY clause. Since ENAME
is not included in the GROUP BY
clause, an error message results.
Example: The chair of the Marketing Department plans to participate in a national salary survey for employees in Marketing Departments. Determine the average salary paid to the Marketing Department employees.
This statement:
SELECT COUNT(*), AVG(SALARY)
FROM EMPLOYEE
WHERE DEPT = 'MKT'
Results in:
COUNT(*) AVG(SALARY)
2 33500
In this example, the aggregate function AVG is used in a SELECT
statement that has a WHERE
clause. Texis selects the rows that
represent Marketing Department employees and then applies the
aggregate function to these rows.
You can divide the rows of a table into groups based on values in more than one column. For example, you might want to compute total salary by department and then, within a department, want subtotals by benefits classification.
Example: What is the total salary paid by benefits classification in each department?
Enter this statement:
SELECT DEPT, BENEFITS, SUM(SALARY)
FROM EMPLOYEE
GROUP BY DEPT, BENEFITS ;
In this example, we are grouping by department, and within department,
by benefits classification.
We'll get the following results:
DEPT BENEFITS SUM(SALARY)
FIN FULL 42000
LIB PART 22000
MGT FULL 45000
MKT FULL 67000
RND FULL 30000
RND PART 25000
In this query, the rows are grouped by department and, within each
department, employees with the same benefits are grouped so that
totals can be computed. Notice that the columns DEPT and BENEFITS can
appear in the SELECT
statement since both columns appear in the GROUP
BY clause.
If the GROUP BY clause is omitted when an aggregate function is used, then the entire table is considered as one group, and the group function displays a single value for the entire table.
Example: What is the total salary paid to all employees?
The statement:
SELECT SUM(SALARY)
FROM EMPLOYEE ;
results in:
SUM(SALARY)
231000