The example in this section shows how to create a new entity type using a user-defined rule. Rules are defined using a regular-expression-based syntax. The rule is added to an extraction policy, and will then be applied whenever that policy is used.
The rule will identify increases, for example, in a stock index. There are many ways to express an increase. We want our rule to match any of the following expressions:
climbed by 5%
increased by over 30 percent
jumped 5.5%
Therefore, we will create a regular expression that matches any of these, and define a new entity type for it. The names of user-defined entity types must start with the letter "x", so we will call ours "xPositiveGain":
  ctx_entity.add_extract_rule( 'mypolicy', 1,
    '<rule>'                                                          ||
      '<expression>'                                                  ||
         '((climbed|gained|jumped|increasing|increased|rallied)'      ||
         '( (by|over|nearly|more than))* \d+(\.\d+)?( percent|%))'    ||
      '</expression>'                                                 ||
      '<type refid="1">xPositiveGain</type>'                          ||
    '</rule>');
Note the use of refid in the example. It identifies which part of the regular expression is returned as the entity, by referencing a pair of parentheses within it. In our case, we want the entire expression, so we reference the outermost (and first occurring) pair of parentheses, which is refid="1".
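The pattern and the meaning of refid="1" can be checked outside the database before the rule is added. The following is a minimal sketch in Python, assuming that Python's re module treats this particular pattern the same way the rule engine does (the two regular-expression dialects are not guaranteed to be identical). The helper name find_gain is hypothetical and is not part of CTX_ENTITY.

```python
import re

# Same pattern as in the rule above. Python's `re` is only a stand-in for
# the rule engine here; dialect differences are possible.
GAIN_PATTERN = re.compile(
    r"((climbed|gained|jumped|increasing|increased|rallied)"
    r"( (by|over|nearly|more than))* \d+(\.\d+)?( percent|%))"
)


def find_gain(text):
    """Return (matched text, offset, length) for group 1, or None.

    Group 1 is the outermost pair of parentheses, which is what
    refid="1" refers to in the rule.
    """
    match = GAIN_PATTERN.search(text)
    if match is None:
        return None
    return match.group(1), match.start(1), len(match.group(1))


if __name__ == "__main__":
    for phrase in ("climbed by 5%", "increased by over 30 percent", "jumped 5.5%"):
        print(find_gain(phrase))
```

Referencing a different group number would return only that fragment; for example, group 2 holds just the verb ("climbed").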
Because the policy now contains a user-defined rule, it must be compiled with CTX_ENTITY.COMPILE before it is used:
  ctx_entity.compile('mypolicy');
Then we can use it as before:
  ctx_entity.extract('mypolicy', mydoc, null, myresults);
The (abbreviated) output of this is:
<entities>
  ...
  <entity id="6" offset="72" length="18" source="UserRule" ruleid="1">
    <text>climbed by over 5%</text>
    <type>xPositiveGain</type>
  </entity>
</entities>
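Once the result has been fetched into a client program, it can be processed with any XML parser. The following is a minimal sketch in Python, assuming the result has the shape shown above. SAMPLE_RESULT is reduced to the single entity from the abbreviated output, and parse_entities is a hypothetical helper, not part of the product.

```python
import xml.etree.ElementTree as ET

# Sample result: only the one entity shown in the abbreviated output above.
SAMPLE_RESULT = """<entities>
  <entity id="6" offset="72" length="18" source="UserRule" ruleid="1">
    <text>climbed by over 5%</text>
    <type>xPositiveGain</type>
  </entity>
</entities>"""


def parse_entities(result_xml, entity_type=None):
    """Return the entities as a list of dicts, optionally filtered by type."""
    entities = []
    for node in ET.fromstring(result_xml).iter("entity"):
        entity = {
            "text": node.findtext("text"),
            "type": node.findtext("type"),
            "offset": int(node.get("offset")),
            "length": int(node.get("length")),
            "source": node.get("source"),
        }
        if entity_type is None or entity["type"] == entity_type:
            entities.append(entity)
    return entities


if __name__ == "__main__":
    for gain in parse_entities(SAMPLE_RESULT, "xPositiveGain"):
        print(gain["offset"], gain["length"], gain["text"])
```

Filtering on the type makes it easy to separate user-defined entities such as xPositiveGain from the built-in ones returned in the same result.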
Finally, we will add another user-defined entity type, this time using a dictionary. We want to recognize "Dow Jones Industrial Average" as an entity of type xIndex. We will add "S&P 500" as well. To do that, we create an XML file containing the following:
<dictionary>
  <entities>
    <entity>
      <value>dow jones industrial average</value>
      <type>xIndex</type>
    </entity>
    <entity>
      <value>S&amp;P 500</value>
      <type>xIndex</type>
    </entity>
  </entities>
</dictionary>
Case is not significant in this file, but note how the "&" in "S&P" must be specified as the XML entity &amp;. Otherwise, the XML would not be valid.
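One way to avoid escaping mistakes is to generate the dictionary file with an XML library rather than by hand. The following is a minimal sketch in Python, assuming the file layout shown above. The function name build_dictionary is hypothetical, and the output is written on a single line without indentation.

```python
import xml.etree.ElementTree as ET


def build_dictionary(entries):
    """Build the <dictionary> document from (value, type) pairs.

    ElementTree escapes special characters such as "&" when serializing,
    so "S&P 500" is written as "S&amp;P 500".
    """
    root = ET.Element("dictionary")
    entities = ET.SubElement(root, "entities")
    for value, entity_type in entries:
        entity = ET.SubElement(entities, "entity")
        ET.SubElement(entity, "value").text = value
        ET.SubElement(entity, "type").text = entity_type
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_dictionary([
        ("dow jones industrial average", "xIndex"),
        ("S&P 500", "xIndex"),
    ]))
```

Parsing the generated text back with the same library is a quick check that the file is well-formed before it is loaded.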
This XML file is loaded into the system using the CTXLOAD utility. If the file were called dict.load, we would use the following command:
ctxload -user username/password -extract -name mypolicy -file dict.load
After loading the dictionary, you must compile the policy again using CTX_ENTITY.COMPILE.