Ignore Selectors

Syntax: one or more CSS selectors

These CSS selectors define portions of HTML documents to ignore (e.g. boilerplate text). Text from matching elements will be removed from the searchable text. A matched element includes the open tag through the matching balanced close tag (if not a void element). Only valid HTML 5 elements may be matched. Links will be unaffected. A matching tag with no close tag given in the document will generally match through the closing implied by HTML (e.g. end of parent element). The default is "nav, header, footer", which ignores <nav>, <header>, and <footer> elements' text.
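The effect of the default setting can be illustrated with a minimal Python sketch. It is not the product's implementation. It assumes only bare tag names (the default "nav, header, footer") and well-formed markup with explicit close tags; the names IgnoreTextExtractor and searchable_text are hypothetical.

```python
from html.parser import HTMLParser

# Assumption: only bare tag names are handled, mirroring the default
# setting "nav, header, footer". Richer selectors are out of scope here.
IGNORED_TAGS = {"nav", "header", "footer"}


class IgnoreTextExtractor(HTMLParser):
    """Collects document text, skipping text nested in ignored elements."""

    def __init__(self, ignored_tags=IGNORED_TAGS):
        super().__init__()
        self.ignored_tags = set(ignored_tags)
        self.depth = 0      # how many ignored elements are currently open
        self.chunks = []    # text kept for searching

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so matching is case-insensitive.
        if tag in self.ignored_tags:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.ignored_tags and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0:
            self.chunks.append(data)


def searchable_text(html):
    """Return whitespace-normalized text outside ignored elements."""
    parser = IgnoreTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())
```

For example, searchable_text("<header>Site</header><p>Hello <b>world</b></p>") keeps only the paragraph text. Implied close tags, which the setting handles as described above, are not modeled in this sketch.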

A limited subset of CSS selector syntax is supported. Each setting entry must be a selector as defined by the following pseudo grammar. "!" indicates at least one of the preceding parenthetical group's components must be given. An optional item/group is suffixed with "?"; "*" indicates zero or more occurrences of the item/group may appear; "+" indicates one or more. Fixed-font indicates literal text, including e.g. square brackets ([]) and quotes. A non-fixed-font pipe character (|) separates alternatives.

  • selector = complex-selector-list

  • complex-selector-list = complex-selector ( , complex-selector )*

  • complex-selector = compound-selector ( combinator compound-selector )*

  • compound-selector = ( type-selector? subclass-selector* )!

  • combinator = whitespace | > | + | ~

  • type-selector = tag | *

  • subclass-selector = ( # id ) | ( . class ) | attribute-selector

  • attribute-selector = ( [ attr ] ) | ( [ attr attr-matcher ( value | string-token ) attr-modifier? ] )

  • attr-matcher = ~= | |= | ^= | $= | *= | =

  • attr-modifier = i | s

  • string-token = "value" | 'value'

  • whitespace = ( space | tab | CR | LF | FF )+
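The grammar above can be approximated with a regular expression, which is useful for checking a setting value before saving it. The following Python sketch is an illustration only, under several assumptions: identifiers (tag, id, class, attr, value) use a simplified pattern, tags are not checked against the HTML 5 element list, and comments are not handled. The function name is_valid_selector is hypothetical.

```python
import re

WS = r"[ \t\r\n\f]"                       # space | tab | CR | LF | FF
IDENT = r"[A-Za-z_][A-Za-z0-9_-]*"        # assumption: simplified identifier
STRING = r'"[^"]*"' + "|" + r"'[^']*'"    # string-token, no backslash escapes
MATCHER = r"~=|\|=|\^=|\$=|\*=|="         # attr-matcher

# attribute-selector = [ attr ] | [ attr attr-matcher value-or-string modifier? ]
ATTR = (
    rf"\[{WS}*{IDENT}{WS}*"
    rf"(?:(?:{MATCHER}){WS}*(?:{IDENT}|{STRING}){WS}*(?:[is]{WS}*)?)?\]"
)
SUBCLASS = rf"#{IDENT}|\.{IDENT}|{ATTR}"

# compound-selector: a type-selector, subclass-selectors, or both (the "!" rule)
COMPOUND = rf"(?:(?:{IDENT}|\*)(?:{SUBCLASS})*|(?:{SUBCLASS})+)"
COMBINATOR = rf"{WS}*[>+~]{WS}*|{WS}+"
COMPLEX = rf"{COMPOUND}(?:(?:{COMBINATOR}){COMPOUND})*"
SELECTOR = re.compile(rf"{WS}*{COMPLEX}(?:{WS}*,{WS}*{COMPLEX})*{WS}*")


def is_valid_selector(text):
    """Return True if text fits the simplified grammar sketched above."""
    return SELECTOR.fullmatch(text) is not None
```

Under these assumptions, is_valid_selector accepts values such as "nav, header, footer" and ".myClass > span", and rejects syntax outside the grammar, such as the pseudo-class in "a:hover" or a dangling combinator in "div >".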

Examples:

#myId                Elements with id attribute equal to myId
div.myClass          div elements with class attribute containing myClass token
div.myClass p        p elements that are descendants of myClass-class div elements
.A, .B               Elements with class token A or B
.myClass > span      span elements that are children of myClass-class elements
div[myAttr=myVal]    div elements with an attribute myAttr whose value is myVal

Whitespace is permitted around (before/after) a selector; around (and as) a combinator; around a comma operator; and between the parts of an attribute-selector inside the square brackets. Comments (delimited by /* */) may appear between/around any parts in the grammar. Matches are case-insensitive, except for attribute-selector values, which match case-sensitively (unless the i attr-modifier is given). Backslash escapes are not supported. A tag must be an HTML 5 tag. Setting added in version 25.0.0. See also Keep Selectors.
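The case-sensitivity rule for attribute values can be illustrated with a short Python sketch. This page lists the attr-matcher operators without defining them, so the sketch assumes they carry their standard CSS meanings; the function attr_matches is hypothetical and is not part of the product.

```python
def attr_matches(actual, matcher, expected, modifier=None):
    """Return True if attribute value `actual` satisfies `matcher expected`.

    Values compare case-sensitively unless the i modifier is given.
    Operator meanings are assumed to follow standard CSS.
    """
    if modifier == "i":
        actual, expected = actual.lower(), expected.lower()
    if matcher == "=":       # exact value
        return actual == expected
    if matcher == "~=":      # one of the whitespace-separated tokens
        return expected in actual.split()
    if matcher == "|=":      # exact value, or value followed by a hyphen
        return actual == expected or actual.startswith(expected + "-")
    if matcher not in ("^=", "$=", "*="):
        raise ValueError(f"unknown attr-matcher: {matcher!r}")
    if not expected:         # assumption (standard CSS): empty operand never matches
        return False
    if matcher == "^=":      # prefix
        return actual.startswith(expected)
    if matcher == "$=":      # suffix
        return actual.endswith(expected)
    return expected in actual    # "*=": substring
```

With this sketch, attr_matches("myVal", "=", "myval") is False because the comparison is case-sensitive by default, while attr_matches("myVal", "=", "myval", "i") is True.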


Copyright © Thunderstone Software     Last updated: Oct 10 2023