7.7 Multiple machines sharing the load


When the load is more than the system can handle, there are several strategies you can use. Some, which are discussed elsewhere, include checking that the appropriate indices exist, minimizing the work done at each interaction, and caching frequently performed operations. If those fail to produce the desired performance, you may be able to add memory or CPU to the machine, or add machines to support the application.

If you decide that you need several machines to support the load you are seeing, there are two ways to split it, and they can be combined: replication and distributing the load. With replication, more than one machine holds the same data, and a query can go to any of them and produce the same result. With a distributed system, different portions of the task are split between machines, for example current article searches on one machine and archive searches on another.

Replication has the advantage that it is relatively simple to implement if the application allows it, and it also provides an inherent hot backup. If one machine fails, the rest will still be running, albeit at a higher load. For replication to work well, the system must not allow frequent user updates. Batch or data updates can be applied to one machine and copied to the others, or applied to all machines.
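The hot-backup behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ReplicaPool` class and the two stand-in machine functions are hypothetical, and each replica is modelled as a plain callable that either returns rows or raises `ConnectionError`. A real application would wrap its own database driver in the same way.

```python
import random


class ReplicaPool:
    """Send each read to one of several identical replicas.

    Hypothetical helper: `replicas` maps a machine name to a callable
    that runs a query and returns rows, or raises ConnectionError if
    that machine is down.
    """

    def __init__(self, replicas):
        self.replicas = dict(replicas)

    def query(self, sql):
        # Try the replicas in random order so the load is spread out.
        names = list(self.replicas)
        random.shuffle(names)
        last_error = None
        for name in names:
            try:
                return self.replicas[name](sql)
            except ConnectionError as exc:
                # This machine failed; the remaining ones carry the load.
                last_error = exc
        raise RuntimeError("no replica available") from last_error


# Stand-ins for two machines holding the same data (assumptions).
def healthy_machine(sql):
    return [("row for", sql)]


def failed_machine(sql):
    raise ConnectionError("machine is down")


pool = ReplicaPool({"db1": failed_machine, "db2": healthy_machine})
rows = pool.query("SELECT title FROM articles")
```

Because every replica holds the same data, the caller does not need to know which machine answered. The only visible effect of a failure is that the surviving machines see more queries.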

Left alone, a user update would be applied only to the machine the user was connected to. If the volume of updates is low enough, you could use either triggers or a script that submits each update to all the machines. Care needs to be taken when a machine is taken offline and later returned to use, to make sure its data is synchronized with the others. When the volume of updates is high you would be frequently updating all the machines, which negates the benefits of having multiple machines.
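A script that submits each update to all the machines might look roughly like the following. It is a minimal sketch, not a recommended implementation: the `UpdateBroadcaster` class, the `articles` table, and the use of in-memory SQLite databases as stand-ins for separate machines are all assumptions made so the example runs on its own. Updates missed by an offline machine are queued and replayed when it returns, which is one simple way to handle the synchronization concern.

```python
import sqlite3


class UpdateBroadcaster:
    """Apply every update to all machines (hypothetical helper)."""

    def __init__(self, connections):
        self.connections = dict(connections)  # name -> DB connection
        self.offline = set()
        self.missed = {name: [] for name in self.connections}

    def apply(self, sql, params=()):
        for name, conn in self.connections.items():
            if name in self.offline:
                # Remember the update so it can be replayed later.
                self.missed[name].append((sql, params))
                continue
            with conn:  # commits on success, rolls back on error
                conn.execute(sql, params)

    def take_offline(self, name):
        self.offline.add(name)

    def bring_online(self, name):
        # Replay missed updates in order before serving queries again.
        conn = self.connections[name]
        for sql, params in self.missed[name]:
            with conn:
                conn.execute(sql, params)
        self.missed[name].clear()
        self.offline.discard(name)


def make_machine():
    # In-memory SQLite database standing in for one machine (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
    return conn


machines = {"a": make_machine(), "b": make_machine()}
broadcaster = UpdateBroadcaster(machines)
insert = "INSERT INTO articles (id, title) VALUES (?, ?)"

broadcaster.apply(insert, (1, "first"))
broadcaster.take_offline("b")
broadcaster.apply(insert, (2, "second"))  # queued for machine b
broadcaster.bring_online("b")             # b catches up here
```

Note that every update costs one write per machine, so this approach only pays off while updates are rare compared with reads.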

With a distributed system the work is apportioned among a number of machines. Where the split occurs will depend on the application. You might have relatively independent portions of the application that can be put on different machines. For example, if you were doing news archive searches, you might have recent news on one machine and archives on another.
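The news example above could be routed with a small function that picks a machine from the date being searched. The host names, the 30-day window, and the function names below are all assumptions chosen for illustration. The right split point depends entirely on the application.

```python
from datetime import date, timedelta

# Assumed values for illustration only.
RECENT_WINDOW = timedelta(days=30)
RECENT_HOST = "recent-news-host"
ARCHIVE_HOST = "archive-host"


def pick_machine(article_date, today):
    """Return the machine that holds articles published on article_date."""
    if today - article_date <= RECENT_WINDOW:
        return RECENT_HOST
    return ARCHIVE_HOST


def machines_for_range(start, end, today):
    """Return every machine a search over [start, end] must touch."""
    cutoff = today - RECENT_WINDOW
    targets = []
    if start < cutoff:
        targets.append(ARCHIVE_HOST)
    if end >= cutoff:
        targets.append(RECENT_HOST)
    return targets


today = date(2024, 6, 30)
recent_target = pick_machine(date(2024, 6, 20), today)
archive_target = pick_machine(date(2023, 1, 1), today)
both_targets = machines_for_range(date(2024, 1, 1), date(2024, 6, 15), today)
```

A search that spans the cutoff has to query both machines and merge the results, so the split works best when most searches fall cleanly on one side.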
