7.8 Minimizing the work done on user interaction
To deliver rapid results to the user there are three things you can do: use more hardware, as we've just been discussing; do the work that needs to be done as efficiently as possible, which we've also spent some time discussing; and finally, don't do work you don't need to do. The simplest form of the last is caching, which we'll cover in a little more depth later. Suffice it to say that we don't want to do the same thing over and over again for every user if we can do it once and store the results.
There are other variations on this theme that can be used in many circumstances. The basic premise is that work done once as the database is being built is better than the same work done millions of times as the database is searched. The two most common techniques are preordering the data and denormalizing the database.
If the data in a table doesn't change, and will most often be selected in a particular order, it makes sense to load the table in that order, which avoids the need for an ORDER BY later on, keeping in mind the ORDER BY caveats from earlier. First, LIKEP and LIKER results will be reordered, although records of equal rank will still be in table order. Second, if you have a query such as Date > '-2 days' as the only condition in the WHERE clause, and an index on Date (which you should have), then you will get the results in Date order, regardless of table order. For Metamorph queries, or more complicated WHERE clauses, the results will be in table order.
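As a rough sketch of the idea (the table and column names here are hypothetical, and whether your loading process honors an ORDER BY on the load depends on how you populate the table), loading a static archive in the order it is usually read, and indexing the Date field, might look like this:

     -- Hypothetical example: load the static table in the order it is
     -- usually read, so most queries need no ORDER BY at all.
     INSERT INTO article_static (id, Title, Date, Body)
       SELECT id, Title, Date, Body
       FROM   article_staging
       ORDER  BY Date DESC;

     -- With an index on Date, a date-only WHERE clause returns rows in
     -- Date order regardless of the table order above.
     CREATE INDEX article_static_date ON article_static(Date);

     SELECT id, Title
       FROM  article_static
       WHERE Date > '-2 days';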
Denormalizing the database can help by dramatically reducing the number of table reads required to display the results. Consider a simple case where the search is against a single table that has a many-to-one relationship with two other tables used to display the resulting information. If you are displaying 25 rows on a result screen, you have gone from 25 record reads to 75 to display a single page, which, taking into account the index reads needed to find those records, has probably tripled the work to display the page. If you are actively searching against fields in multiple tables the problem can get worse.
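To make the comparison concrete (the article, author, and section tables and their columns are hypothetical), the normalized query and its denormalized equivalent might look like this:

     -- Normalized: three record reads per result row
     -- (article, plus its author and section rows).
     SELECT a.Title, au.Name, s.Heading
       FROM  article a, author au, section s
       WHERE a.author_id  = au.id
         AND a.section_id = s.id
         AND a.Body LIKE 'power failure';

     -- Denormalized: the author name and section heading were copied
     -- into the article table at load time, so each result row is a
     -- single read.
     SELECT Title, AuthorName, SectionHeading
       FROM  article_denorm
       WHERE Body LIKE 'power failure';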
Normalized tables certainly have their uses, as they avoid the problems of duplicated data getting out of sync, which makes denormalizing a tradeoff that needs to be considered. If the joined data changes frequently, so that denormalizing would incur frequent updates of many rows, then keeping the join would be sensible. However, the basic principle we are trying to follow is to locate the information the user wants as quickly as possible, and get it to them.
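The maintenance cost of the duplicated data is easy to see with the same hypothetical tables as above: a change that is a single-row update in the normalized schema fans out to every copy in the denormalized table.

     -- Normalized: one row changes.
     UPDATE author SET Name = 'J. Smith' WHERE id = 42;

     -- Denormalized: every article row carrying that author's name
     -- must be rewritten as well.
     UPDATE article_denorm SET AuthorName = 'J. Smith' WHERE author_id = 42;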