10.3.1 Using ZIP Codes

One method of delimiting a geographic search is to add location-specific keywords to the text being searched. For example, the city, state, and ZIP code of our tourist attractions could be text-searched along with the usual descriptions. Let's use the following table schema:

  • Name
    The attraction's name

  • Desc
    A brief description of the attraction

  • City
    The city where the attraction is located

  • State
    The two-letter state

  • Zip
    The (integer) ZIP code

We create a Metamorph index not only on Name and Desc, but on the other fields as well:

create metamorph inverted index xgeo on tourist(Name\Desc\City\State\Zip)

Now we search against those fields with LIKE in our script:


  <SQL "select Name
        from tourist
        where Name\Desc\City\State\Zip like $query">
  </SQL>

passing our user's $query to the search.

Since the state, city, and ZIP code information is part of the searched text, our users can geographically delimit their searches with queries like this:

Wyoming bed and breakfast

NASA rockets in Huntsville

restaurant 440*

giving a state, city, or ZIP code, or combinations thereof. Note the wildcard in the last query: it matches every ZIP code that begins with 440, broadening the search to several nearby ZIP codes.
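To make the effect of the trailing wildcard concrete, here is a minimal sketch in plain Python (not Texis or Vortex code). The rows, the attraction names, and the helper functions zip_matches and names_for are hypothetical, and rendering the integer Zip as a five-digit string is an assumption of the sketch rather than a statement about how Texis indexes the field.

```python
# Hypothetical sample rows: (Name, Zip). Zip is an integer, as in the schema above.
ROWS = [
    ("Lakeside Diner", 44012),
    ("Harbor Grill", 44095),
    ("Downtown Bistro", 44113),
    ("Desert Cafe", 85001),
]


def zip_matches(term: str, zip_code: int) -> bool:
    """Return True if zip_code satisfies a query term such as '440*' or '44012'."""
    text = f"{zip_code:05d}"  # assumption: compare against a five-digit string
    if term.endswith("*"):
        # Trailing wildcard: any ZIP code sharing the prefix matches.
        return text.startswith(term[:-1])
    # No wildcard: the whole ZIP code must match.
    return text == term


def names_for(term: str) -> list:
    """List the names of all rows whose ZIP code matches the term."""
    return [name for name, zip_code in ROWS if zip_matches(term, zip_code)]


print(names_for("440*"))   # rows whose ZIP code starts with 440
print(names_for("44113"))  # exact ZIP code only
```

The prefix term selects both rows whose ZIP codes start with 440 and skips 44113, which shares only the first two digits. A shorter prefix widens the match and a longer one narrows it, which is the only kind of "zoom" this approach offers.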

Advantages

The biggest advantage of this type of geographic search is that the query can probably be resolved entirely with one index, the Metamorph index. This can greatly speed up the search compared with, say, an AND clause that would require post-processing.

Also, we can limit the search to an exact political region, say a city, without needing to know its latitude/longitude.

Disadvantages

The biggest disadvantage is that we can't easily zoom in or out to cover a larger area in the search, nor cover a fixed region centered on a location. For example, searching for hotels in Rhode Island may leave out ones just a few miles away in a neighboring state. We can't tie the search to, say, a 100x100 mile area centered on Providence, regardless of state/city boundaries.

ZIP codes that have the same prefix might not really be adjacent, making the exact region ambiguous.

Having the same state in many rows in the table makes the index larger, and that "word" will become noisy in the query, slowing the search.
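The noise problem can be illustrated with a small Python sketch (again, not Texis code, and it does not model the Metamorph index itself). The rows and the helper rows_per_word are hypothetical; each string stands in for the combined text of one attraction. The sketch simply counts how many rows contain each word, since a word found in nearly every row does little to narrow a search.

```python
from collections import Counter

# Hypothetical rows: each string stands for the searchable text of one attraction.
ROWS = [
    "Space museum with NASA rockets Huntsville AL",
    "Botanical garden and butterfly house Huntsville AL",
    "Civil rights history museum Birmingham AL",
    "Gulf coast seafood restaurant Mobile AL",
]


def rows_per_word(rows):
    """Count, for each word, the number of rows that contain it."""
    counts = Counter()
    for row in rows:
        # set() ensures a word is counted once per row, even if repeated.
        counts.update(set(row.lower().split()))
    return counts


counts = rows_per_word(ROWS)
print(counts["al"], counts["huntsville"], counts["rockets"])
```

In this sample the state abbreviation occurs in all four rows, the city in two, and "rockets" in one, so the state term contributes the most candidate rows while discriminating the least.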

To resolve some of these issues, we can use latitude and longitude data, as in our next example.

Copyright © 2024 Thunderstone Software LLC. All rights reserved.