Customization of the rules

All terms (1-word and multi-word) are unknown to the term checker until they are specified in the disambiguation files. If you do not add your organization's technical names and the technical verbs to the term checker, the term checker cannot fully analyze your text.

For help with writing the rules, refer to http://wiki.languagetool.org. Use an XML editor or a text editor that has syntax highlighting (https://en.wikipedia.org/wiki/Syntax_highlighting).

You can write complex rules that use regular expressions. For help with regular expressions, refer to www.regular-expressions.info.

For guidance about terminology management, refer to Case study: text simplification for shipping procedures (www.techscribe.co.uk/techw/text-simplification-for-shipping-procedures.htm).

To buy customization services, contact TechScribe. TechScribe can do these tasks for you:

To customize the term checker

  1. Make sure that you did the installation procedure 'Download the templates for your project terms'.
  2. Add technical names and technical verbs to disambiguation-projectterms.xml.
  3. Add unapproved terms and misused terms to grammar-projectterms.xml.
  4. Local files version only. The term checker identifies some capitalized text as a proper noun. You can change rule 1.1 so that it does not ignore proper nouns.
  5. To make sure that complex rules are correct, use 'testrules'.

Add terms to disambiguation-projectterms.xml

The term checker contains technical names from these sources:

The term checker contains most technical verbs that are in rule 1.12.

For each approved term that is not in the term checker, add each inflection of the term. Use the rules that are in disambiguation-projectterms.xml as templates. For example, if you have a technical name 'disclaimer' (rule 1.5.15), you can add the singular inflection to the token in the rule PROJECT_TN_NOUN_SINGULAR_1_WORD (the order of the terms in the token is not important):

        <token regexp="yes">disclaimer|keyboard|password</token>

As an alternative, you can make one rule for each inflection.

Some organizations have thousands of technical terms. A good method is to use a different XML rule for each ASD-STE100 rule number.

Multi-word terms are more difficult to add, because a separate rule for each inflection is necessary. If you keep the technical terms in a spreadsheet, you can use a script to convert the data to XML. TechScribe uses a series of regular expressions in PowerGREP (www.powergrep.com) to convert the terms to XML.
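
The patterns that follow are a minimal sketch of why each inflection of a multi-word term needs its own rule. The sketch assumes standard LanguageTool pattern syntax, in which each token element matches one word. The term 'top cover' is a hypothetical technical name. Only the patterns are shown. Copy the rule id, the rule name, and the disambig element from the templates in disambiguation-projectterms.xml.

```xml
<!-- Singular inflection of the hypothetical term 'top cover'. Each token matches one word. -->
<pattern>
  <token>top</token>
  <token>cover</token>
</pattern>

<!-- Plural inflection 'top covers'. Only the last word changes, but the pattern goes into a different rule. -->
<pattern>
  <token>top</token>
  <token>covers</token>
</pattern>
```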

If a term is approved for only 1 meaning, and if you want to give guidelines to technical writers, then add a grammar rule for that term.
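
A grammar rule of this type can look like the sketch that follows. The sketch assumes that 'monitor' is approved in your organization only as a technical name for a display device. The term, the rule id, and the message text are hypothetical. Use the rules that are in grammar-projectterms.xml as the primary templates.

```xml
<!-- Hypothetical: 'monitor' is approved only as a technical name for a display device. -->
<rule id="PROJECT_CHECK_MEANING_monitor" name="Project term, check meaning: monitor">
  <pattern>
    <token regexp="yes">monitors?</token>
  </pattern>
  <message>Make sure that '\1' means a display device. For other meanings, use a different term.</message>
  <short>Project Dictionary. Check meaning: monitor</short>
  <!-- The rule finds each occurrence of the term, thus this example has type="incorrect". -->
  <example type="incorrect">Connect the <marker>monitor</marker> to the computer.</example>
  <example type="correct">Connect the display to the computer.</example>
</rule>
```

The rule shows the message for each occurrence of the term, because the software cannot know which meaning the writer intended.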

Simplify noun clusters (part of rule 2.2)

To simplify a noun cluster, you can "use hyphens (-) between words that are used as a single unit." Sometimes, hyphens in different locations are possible. For example, for the noun cluster filter unit top cover, hyphens in these locations are possible:

Add terms to grammar-projectterms.xml

To give guidelines to technical writers, add terms to grammar-projectterms.xml. Typically, add rules for these:

For examples of the types of rules that you can write, refer to grammar-projectterms.xml.

In English, many words have more than one part of speech. To prevent unwanted warnings, you can make a rule that shows a message only if a term has (or does not have) a specified part of speech. This example is from Managing terminology with term checker, Jake Cahill, 2018:


<rule id="PROJECT_NOT_APPROVED_screen" name="Project Not Approved noun: screen">
  <pattern>
    <token regexp="yes">screens?<exception postag="IS_VERB"/></token>
  </pattern>
  <message>The noun '\1' is not approved. Possible replacements: <suggestion><match no="1" postag_regexp="yes" postag="(NNS?)" postag_replace="$1">page</match></suggestion></message>
  <short>Project Dictionary. Not approved noun: screen</short>
  <example correction="page" type="incorrect">This <marker>screen</marker> displays the results.</example>
  <example correction="pages" type="incorrect">If the <marker>screens</marker> do not show these messages, stop the test.</example>
  <example type="correct">On this <marker>page</marker> you can enter a new name.</example>
  <example type="correct">When you <marker>screen</marker> the drugs for side-effects...</example>
  <example type="correct">Who <marker>screens</marker> the drugs for side-effects?</example>
  <example type="triggers_error">When the medical technicians <marker>screen</marker> the drugs for side-effects...</example><!-- False positive -->
</rule>

This line in the rule tells the term checker to find the words screen and screens except if they are verbs:

<token regexp="yes">screens?<exception postag="IS_VERB"/></token>

You can use these values with the postag attribute: IS_ADJECTIVE, IS_NOUN, IS_NNP (=proper noun), IS_VERB.
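
For the opposite condition (the term has a specified part of speech), you can put the postag attribute on the token. The line that follows is a sketch that assumes standard LanguageTool token syntax. It is not from the term checker files.

```xml
<!-- Finds 'screen' and 'screens' only if the word has the postag IS_NOUN. -->
<token regexp="yes" postag="IS_NOUN">screens?</token>
```

If a word has more than one postag, this token and a token that has an exception can possibly give different results. Add example sentences to the rule to make sure that the rule finds only the text that you want.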

In the term checker, the noun 'screen' is approved as a technical name. The word is unknown as a verb. Thus, until you add the verb 'screen' and its approved inflections to disambiguation-projectterms.xml, you will see a message that tells you not to use a technical name (TN) as a verb. (You can deactivate the rule.)

Change grammar-ste7.xml rule STE_RULE_1_1_USE_APPROVED_WORDS so that it does not ignore proper nouns

This section is applicable only to the local files version of the term checker.

In LanguageTool, proper nouns have the postag NNP. Examples: London, Saudi Arabia, Tuesday, September, HTML, John Smith.

The term checker has rules that identify capitalized text as a proper noun. The term checker gives these proper nouns the postag IS_NNP. To prevent the term checker from using these rules, in grammar-ste7.xml rule STE_RULE_1_1_USE_APPROVED_WORDS, put this exception into comments:


            <exception postag="IS_NNP"/><!-- Proper nouns are from disambiguation rulegroup STE_TN_NOUN-PROPER. -->
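
XML comments cannot contain other comments. Thus, close the new comment before the start of the comment that is already on the line. After the change, the line can look like this sketch:

```xml
<!-- <exception postag="IS_NNP"/> --><!-- Proper nouns are from disambiguation rulegroup STE_TN_NOUN-PROPER. -->
```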

To make sure that complex rules are correct, use testrules

If you write complex rules, use 'testrules' (http://wiki.languagetool.org/development-overview#toc7) to make sure that the rules are correct.

STE rule 1.6 shows that an unapproved STE term can be an approved project term. For example, the word 'regulation' is not approved as a noun, but rule 1.5.15 and the example in people (n) show that it can be a technical name. The word is in the term checker, and a rule tells you to make sure that it has the correct meaning. Not all the unapproved STE terms that can be technical names or technical verbs are in the term checker. For example, the word 'route' as a noun and as a verb is not approved in ASD-STE100, and it is not in the term checker as a technical name or a technical verb. If 'route' is an approved term in your organization and you add the approved inflections of 'route' to disambiguation-projectterms.xml, testrules will give an error message. (The term checker will correctly ignore 'route', because the rules in grammar-ste7.xml ignore approved project terms.)

Local files version only. To prevent the error message, put the STE rule into comments or delete the rule from grammar-ste7.xml and change the postags in the applicable rules in disambiguation-ste7.xml.

If you think that the unapproved STE term is a technical name or a technical verb that is applicable to most users of the TechScribe term checker, please tell Mike Unwalla at TechScribe. Possibly, TechScribe will add the term to the term checker.

Other customization

You can customize the rules to make other types of language quality-assurance software such as these:
