Customization of the rules

All your technical terms (but not some proper nouns) are unknown to the term checker until you add them to the disambiguation files. If you do not add your organization's technical names and technical verbs to the term checker, the term checker cannot fully analyse your text.

To customize the term checker, you must know these:

To learn how to write rules, refer to https://dev.languagetool.org/development-overview. Use an XML editor or a text editor that has syntax highlighting (https://en.wikipedia.org/wiki/Syntax_highlighting).

You can write complex rules that use regular expressions. To learn about regular expressions, refer to www.regular-expressions.info.

For information about terminology management, refer to Case study: text simplification for shipping procedures (www.techscribe.co.uk/techw/text-simplification-for-shipping-procedures.htm).

To buy customization services, contact TechScribe. TechScribe can do these tasks for you:

To customize the term checker

  1. Make sure that you did the installation procedure 'Download the templates for your project terms'.
  2. Add technical names and technical verbs to disambiguation-projectterms.xml.
  3. Add not-approved terms and misused terms to grammar-projectterms.xml.
  4. To make sure that complex rules are correct, use testrules.

Add terms to disambiguation-projectterms.xml

The term checker contains technical names from these sources:

The term checker contains most technical verbs that are in rule 1.12.

For each approved term that is not in the term checker, add each inflection of the term. Use the rules that are in disambiguation-projectterms.xml as examples.

Multi-word terms are more difficult to add than 1-word terms because a separate rule for each inflection is necessary. If you keep the technical terms in a spreadsheet, you can use a script to convert the data to XML. TechScribe uses a series of regular expressions in PowerGREP (www.powergrep.com) to convert the terms to XML.
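
The spreadsheet-to-XML conversion can also be done with a short script. The Python sketch below is only an illustration, not the PowerGREP method that TechScribe uses. The column names (term, postag), the rule id scheme, and the IS_NOUN postag value are assumptions. Copy the structure and the postags from the rules in disambiguation-projectterms.xml before you use the output.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical spreadsheet export: one row for each inflection of an approved term.
# The column names and the postag values are assumptions for this sketch.
SAMPLE_CSV = """term,postag
gearbox,IS_NOUN
gearboxes,IS_NOUN
filter unit,IS_NOUN
filter units,IS_NOUN
"""


def rule_for(term, postag):
    """Build one disambiguation rule for one inflection of a term."""
    words = term.split()
    # Assumed id scheme: PROJECT_ plus the words in upper case.
    rule_id = "PROJECT_" + "_".join(word.upper() for word in words)
    rule = ET.Element("rule", {"id": rule_id, "name": "Project term: " + term})
    pattern = ET.SubElement(rule, "pattern")
    for word in words:
        ET.SubElement(pattern, "token").text = word
    disambig = ET.SubElement(rule, "disambig", {"action": "add"})
    for _ in words:
        # One <wd> for each token, as in the multi-token rules in this document.
        ET.SubElement(disambig, "wd", {"pos": postag})
    return rule


def csv_to_rules(csv_text):
    """Return one rule element for each row (each inflection) in the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [rule_for(row["term"].strip(), row["postag"].strip()) for row in reader]


def render(rules):
    """Serialize the rules as indented XML text (ET.indent needs Python 3.9+)."""
    for rule in rules:
        ET.indent(rule)
    return "\n".join(ET.tostring(rule, encoding="unicode") for rule in rules)


if __name__ == "__main__":
    print(render(csv_to_rules(SAMPLE_CSV)))
```

The sketch writes a separate rule for each row, which agrees with the requirement that each inflection of a multi-word term has its own rule. The sketch does not write example elements. Add them manually if you want testrules to check the rules.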

If a term is approved for only 1 meaning, and if you want to give guidelines to technical writers, add a grammar rule for that term.

If you write a complex disambiguation rule, do not use the immunize attribute. Immunization can cause a grammar rule not to find text.

To simplify noun clusters (part of rule 2.2)

To simplify a noun cluster, you can "use hyphens (-) between words that are used as a single unit." Sometimes, hyphens in different locations are possible. For example, for the noun cluster filter unit top cover, hyphens in these locations are possible:

You must make a decision about where to put the hyphens.
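
To see all the candidates before you make the decision, you can generate them mechanically. The Python sketch below is an illustration only. It puts a hyphen or a space between each pair of adjacent words, and it does not know which variants have a sensible meaning. The function name hyphen_variants is hypothetical.

```python
from itertools import product


def hyphen_variants(cluster):
    """Return each version of a noun cluster that contains a minimum of one hyphen."""
    words = cluster.split()
    variants = []
    # Each gap between two adjacent words gets a space or a hyphen.
    for separators in product(" -", repeat=len(words) - 1):
        if "-" not in separators:
            continue  # Skip the original cluster, which has no hyphens.
        text = words[0]
        for separator, word in zip(separators, words[1:]):
            text += separator + word
        variants.append(text)
    return variants


if __name__ == "__main__":
    for variant in hyphen_variants("filter unit top cover"):
        print(variant)
```

A cluster of four words has three gaps, thus the sketch gives seven candidates. A person must select the candidate that shows the correct relation between the words.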

To add project terms that are not-approved in STE

A 1-word not-approved term in the STE dictionary can be a technical term, if the meaning of the technical term is different from the meaning of the not-approved term. Examples are in the table:

A not-approved term can be a technical term
Term | Not-approved meaning | Project approved meaning
case (n) | condition | a type of bag: briefcase, suitcase
chip (n) | particle | semiconductors: integrated circuit, microchip
collapse (v) | close, fall | astronomy: for a star to fall in on itself
compile (v) | make a list, record, collect | software development: to change a high-level programming language to binary code to make an executable program
deposit (n) | particle, contamination | geophysics: a natural underground layer of rock or other material
route (n, v) | noun: routing [direction of cables and pipes]; verb: put | logistics: noun: the course to go from a start location to an end location; verb: to calculate or to specify the course of a transport vehicle

To ignore the default rule for the not-approved term, do one of these tasks:

For example, the noun case is not approved. You will see different messages for the noun case and the verb case with the two sentences that follow:
Each passenger is permitted to put 1 case in the overhead locker.
To prevent an accident, case the gun after you use it.

The not-approved noun 'case' has a different message to the unknown verb 'case'.

If you add the singular noun case to disambiguation-projectterms.xml, the term checker will ignore the rule for the not-approved noun case and will use the rules for technical names and technical verbs.

The technical name 'case' has different messages to the not-approved noun 'case'.

To tell writers to use the term with the approved meaning, add a grammar rule for the term.

Most rules for not-approved terms have an exception for words that have a different part of speech. Thus, the rule STE_NOT_APPROVED_case_CASE does not give a warning for the verb case. Most rules for terms that are not approved with more than one part of speech do not have exceptions for other parts of speech. For example, the word route is not approved as a noun and as a verb. The rule has an exception only for multi-word project terms and for proper nouns.

If you add the noun route to disambiguation-projectterms.xml, the term checker will give a correct analysis, but because rule STE_NOT_APPROVED_route_ROUTE contains examples, if you use testrules, you will see an error message that contains this text:

Errors expected: 1
Errors found   : 0

To specify that text is part of a list item that starts a sentence

In a regular expression, the ^ character (caret) matches the start of a string. In LanguageTool, the largest possible string is a sentence. LanguageTool has a postag SENT_START, which is equivalent to a caret.

Technical documentation frequently contains numbered instructions. The term checker gives the postag NLI_SENT_START to the last token in some basic number patterns. Examples: 2), 3.c). The term checker cannot identify all possible number sequences. For example, Step n) is an unknown number sequence. The term checker gives warnings for the sentence that follows:
Step 3) Open the window.

The term checker gives a warning for the undefined list number 'Step 3)'.

If Step n) is an approved number sequence, to make the term checker ignore the number sequence and give a correct analysis for the words at the start of a sentence after the number sequence, add this rule to disambiguation-projectterms.xml:


<rule id="PROJECT_SENTENCE_START" name="Project sentence start: Step n)">
  <pattern>
    <token postag="SENT_START"/>
    <marker>
      <token case_sensitive="yes">Step</token>
      <token regexp="yes">[1-9]|1[0-9]</token><!-- The approved number range is 1 to 19 -->
      <token>)</token>
    </marker>
  </pattern>
  <disambig action="add">
    <wd pos="IS_NOUN"/><!-- In the context of the pattern, disambiguate the word to assert that it is a noun -->
    <wd pos="IS_NOUN"/><!-- In STE, a number is a noun -->
    <wd pos="NLI_SENT_START"/>
  </disambig>
  <example type="untouched">Step 99) If necessary, change the number range.</example>
  <example type="untouched">STEP 5) Use initial capitals only.</example>
  <example type="untouched">Before you do Step 5) Open the window.</example>
  <example type="ambiguous" inputform=")[)]" outputform=")[)/NLI_SENT_START]">Step 3<marker>)</marker> Open the window.</example>
</rule>
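
Before you change the number range in the rule, you can test the regular expression outside the term checker. The Python sketch below is an illustration that assumes that a token regular expression must match the full token, which re.fullmatch imitates. The function name is_approved_step_number is hypothetical.

```python
import re

# The same expression as in the rule: the approved number range is 1 to 19.
STEP_NUMBER = re.compile(r"[1-9]|1[0-9]")


def is_approved_step_number(token):
    """Return True if the full token is a number in the approved range."""
    # fullmatch rejects tokens such as '99', where only a part of the token matches.
    return STEP_NUMBER.fullmatch(token) is not None


def approved_numbers(upper_limit=100):
    """Return each integer below upper_limit that the expression accepts."""
    return [n for n in range(upper_limit) if is_approved_step_number(str(n))]


if __name__ == "__main__":
    print(approved_numbers())
```

If you change the expression, do the test again and compare the result with the range that your organization approves.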

Add terms to grammar-projectterms.xml

To give guidelines to technical writers, add terms to grammar-projectterms.xml. Typically, add rules for these:

For examples of the types of rules that you can write, refer to grammar-projectterms.xml.

In English, many words have more than one part of speech. To prevent unwanted warnings, you can make a rule that shows a message only if a term has (or does not have) a specified part of speech. This example is from Managing terminology with term checker, Jake Cahill, 2018:


<rule id="PROJECT_NOT_APPROVED_screen" name="Project Not Approved noun: screen">
  <pattern>
    <token regexp="yes">screens?<exception postag="IS_VERB"/></token>
  </pattern>
  <message>The noun '\1' is not approved. Possible replacements: <suggestion><match no="1" postag_regexp="yes" postag="(NNS?)" postag_replace="$1">page</match></suggestion></message>
  <short>Project Dictionary. Not approved noun: screen</short>
  <example correction="page" type="incorrect">This <marker>screen</marker> displays the results.</example>
  <example correction="pages" type="incorrect">If the <marker>screens</marker> do not show these messages, stop the test.</example>
  <example type="correct">On this <marker>page</marker> you can enter a new name.</example>
  <example type="correct">When you <marker>screen</marker> the drugs for side-effects...</example>
  <example type="correct">Who <marker>screens</marker> the drugs for side-effects?</example>
  <example type="triggers_error">When the medical technicians <marker>screen</marker> the drugs for side-effects...</example><!-- False positive -->
</rule>

This line in the rule tells the term checker to find the words screen and screens except if they are verbs:

<token regexp="yes">screens?<exception postag="IS_VERB"/></token>

In the term checker, the noun screen is approved as a technical name. The word is unknown as a verb. Thus, until you add the verb screen and its approved inflections in disambiguation-projectterms.xml, you will see a message that tells you not to use a TN as a verb. (You can deactivate the rule.)

In grammar-projectterms.xml, you can use these values with the postag attribute:

Notes:

To find the part of speech that a word has

  1. In LanguageTool, select Text Checking>Tag Text.
  2. The Tagger Result screen shows the parts of speech that a word has:
    Tagger Result screen shows the parts of speech for words in the sentence: The word 'disambiguation' is unknown.

To make sure that complex rules are correct, use testrules

If you write complex rules, use testrules (https://dev.languagetool.org/development-overview#testing-rules) to make sure that the rules are correct.

STE rule 1.6 shows that a not-approved STE term can be an approved project term. For example, the word regulation is not approved as a noun, but rule 1.5.15 and the example in people (n) show that it can be a technical name. The word is in the term checker and a rule tells you to make sure that it has the correct meaning.

Not all the not-approved STE terms that can be technical names or technical verbs are in the term checker. For example, the word route as a noun and as a verb is not approved in ASD-STE100 and it is not in the term checker as a technical name or a technical verb. If route is an approved term in your organization and you add the approved inflections of route to disambiguation-projectterms.xml, testrules will give an error message because the term checker has a dictionary rule for the word. The dictionary rule contains an example of incorrect text and it also has an exception for project terms. Thus, there is a conflict, which testrules finds.

A testrules error occurs only if a conflict occurs in an example in a rule. For example, if you add the technical verb compile (rule 1.12.2.c), testrules does not give an error message. Although compile is a not-approved verb in the dictionary, no rule in the term checker has an example which contains <marker>compile</marker>.
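
Before you add a term, you can search a grammar file for examples that mark the term. The Python sketch below is an illustration only. The sample XML, the rule ids, and the function name rules_with_marked_term are hypothetical. The sketch assumes that the file is XML that xml.etree.ElementTree can parse without special entity definitions.

```python
import xml.etree.ElementTree as ET

# Hypothetical grammar text. The rule ids and the contents are not from the term checker.
SAMPLE_GRAMMAR = """<rules>
  <rule id="DEMO_NOT_APPROVED_ROUTE" name="Demo: route">
    <pattern><token regexp="yes">routes?</token></pattern>
    <message>The word is not approved.</message>
    <example type="incorrect">Make sure that the <marker>route</marker> is clear.</example>
  </rule>
  <rule id="DEMO_NO_MARKED_TERM" name="Demo: no marked term">
    <pattern><token>collect</token></pattern>
    <message>The word is not approved.</message>
    <example type="correct">Record the data.</example>
  </rule>
</rules>"""


def rules_with_marked_term(grammar_xml, inflections):
    """Return the ids of rules that have an example with a marker around a given word."""
    wanted = {word.lower() for word in inflections}
    root = ET.fromstring(grammar_xml)
    hits = []
    for rule in root.iter("rule"):
        for marker in rule.iterfind("example/marker"):
            if (marker.text or "").strip().lower() in wanted:
                hits.append(rule.get("id"))
                break  # One hit for each rule is sufficient.
    return hits


if __name__ == "__main__":
    print(rules_with_marked_term(SAMPLE_GRAMMAR, ["route", "routes"]))
    print(rules_with_marked_term(SAMPLE_GRAMMAR, ["compile", "compiles"]))
```

A rule id in the result shows only that a conflict is possible. Use testrules to find whether the conflict occurs.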

Local files version only. To prevent the testrules error message, put the STE rule into comments or delete the rule from grammar-ste8.xml and change the postags in the applicable rules in disambiguation-ste8.xml.

Other customization

You can customize the rules to make other types of language quality-assurance software such as these:
