NLP / WIC Benchmark:
There are now 3700 words (+1400) and 900 WIC pattern sentences (+800). I re-added spell-checking, so the full WIC test now takes about 2.5 seconds to complete.
The scale and pickup are actually immense. Each of the 900 WIC pattern sentences has 4-10 Symbolic Words, and each Symbolic Word represents 10-500 words, so each sentence can pick up a very high number of variations.
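To put a rough number on "very high", here is a minimal sketch of the arithmetic. It assumes the Symbolic Word slots in a pattern are independent and that every combination counts as a variation, which probably overstates what is meaningful in practice. The function name `variation_count` is made up for illustration and is not part of the actual benchmark code.

```python
from math import prod


def variation_count(slot_sizes):
    """Return how many distinct sentences one pattern could cover.

    slot_sizes holds one entry per Symbolic Word slot: the number of
    concrete words that slot stands for. Assumes slots vary independently.
    """
    return prod(slot_sizes)


# Bounds taken from the figures above: 4-10 slots, 10-500 words per slot.
lower = variation_count([10] * 4)    # fewest slots, smallest vocabularies
upper = variation_count([500] * 10)  # most slots, largest vocabularies

print(f"lower bound per pattern: {lower:,}")
print(f"upper bound per pattern: {upper:.2e}")
```

Under those assumptions a single pattern sentence covers somewhere between 10,000 (10^4) and roughly 9.8 x 10^26 (500^10) word combinations, before multiplying by the 900 patterns.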
One side effect is that I'll need to drop the old "Intention" categories used for the chatbot and use the new WIC categories instead, as these pick up a much better variety. There are about 50 different groups (I will be merging some), along the lines of:
"person or thing started to move / person or thing has him..."
"the object/concept of a had-thing"
"had the concept when..."
"a motion was taken / apply a rule / have-take the concept-chance to..."
"i play/avoid the / objects moved/ordered/fell to the..."
"logic-action an object"
"moving-action the object"
"an object of objects / vivid objects/objectives of"
These categories will be a better fit for the chatbot.