Currently, when defining an entity value in pattern mode, the pattern matching is always case sensitive.

Since we are handling user-generated content (utterances) here, we cannot rely on correct spelling, and in particular not on correct uppercase/lowercase. This is especially a challenge in German, where grammar requires every noun to start with an uppercase letter. In the regular expressions we would therefore have to explicitly add two variants for every word that could be spelled with either an uppercase or a lowercase initial letter.
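To make the workaround concrete, here is a minimal sketch in Python. It uses Python's `re` module as a stand-in, which is an assumption: the engine behind WA may differ in details. The nouns and the `initial_case_variants` helper are hypothetical and only illustrate how each word has to be expanded by hand.

```python
import re

# Hypothetical entity values; German nouns normally start with an uppercase letter.
NOUNS = ["Rechnung", "Konto"]


def initial_case_variants(word: str) -> str:
    """Return a case-sensitive pattern that accepts either case for the first letter."""
    first, rest = word[0], word[1:]
    return f"[{first.upper()}{first.lower()}]{re.escape(rest)}"


# Case-sensitive workaround: every noun is expanded into two spellings.
workaround = re.compile(
    r"\b(?:" + "|".join(initial_case_variants(n) for n in NOUNS) + r")\b"
)

print(workaround.pattern)
print(bool(workaround.search("wo ist meine rechnung")))   # lowercase initial is covered
print(bool(workaround.search("WO IST MEINE RECHNUNG")))   # all caps is still missed
```

Even with the expansion, only the first letter is covered, so an utterance typed in all caps still falls through.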

Most regular expression engines accept a flag for case-insensitive matching. This flag should be exposed so the WA user can choose between case-sensitive and case-insensitive matching for each regular expression individually.

The benefit would be a significant reduction of the complexity (and length) of the regular expressions.

  • Guest
  • Jul 13 2018
  • Shipped
    August 02, 2018 03:51

    Just put (?i) at the start of the pattern if you want it to be case insensitive. For example, here's a pattern that matches "1 Month " and "5 MON ":
    (?i)\d+ ?(month|mon|mth|m)\b
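
    The pattern above can be tried outside the tool with a short Python sketch. Python's `re` module is used here only as an assumption for illustration; it also accepts the inline (?i) flag at the start of a pattern, but it is not necessarily the engine the product uses.

```python
import re

# Pattern from the reply above, with the inline case-insensitive flag.
INSENSITIVE = r"(?i)\d+ ?(month|mon|mth|m)\b"
# Same pattern without the flag, for comparison.
SENSITIVE = r"\d+ ?(month|mon|mth|m)\b"

for utterance in ["1 Month ", "5 MON "]:
    with_flag = re.search(INSENSITIVE, utterance)
    without_flag = re.search(SENSITIVE, utterance)
    print(
        repr(utterance),
        "| with (?i):", with_flag.group(0) if with_flag else None,
        "| without:", without_flag.group(0) if without_flag else None,
    )
```

    With the flag both utterances match; without it, neither does, because the alternatives are written in lowercase only.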