Sunday, 24 May 2020

NLP:

This is a project I've been working on for a few years. In 2019, I tested it on Pandora Bots. This year I'm converting it to C/C++.

The main goal is to be small, fast, and white-box, rather than relying on complex algorithms, large knowledge bases, or self-learning.

Generally it works as a word and sentence compressor.

Words are matched against a pre-defined list in a lookup table. A match returns three 8-bit character symbols. [March 2023] There are 49 groups in total for the first symbol, and up to 256 values for the second and third. The first symbol is used in pattern sentences; the second and third add context and uniqueness.
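As a rough illustration, here is a minimal sketch in C of the lookup idea, assuming a small in-memory table; the struct layout, names and sample entries are hypothetical, and only the word-to-three-symbol scheme comes from the description above.

```c
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *word;     /* pre-defined word, lowercase */
    unsigned char group;  /* first symbol: one of 49 groups, used in pattern sentences */
    unsigned char ctx1;   /* second symbol: added context (up to 256 values) */
    unsigned char ctx2;   /* third symbol: uniqueness (up to 256 values) */
} WordEntry;

static const WordEntry words[] = {
    { "english", 'L', 1, 7 },  /* hypothetical language-group entry */
    { "speak",   'V', 3, 2 },  /* hypothetical verb-group entry */
    { "you",     'P', 0, 1 },  /* hypothetical pronoun-group entry */
};

/* Unknown words return NULL: they are not compressible and are simply dropped. */
static const WordEntry *find_word(const char *w)
{
    for (size_t i = 0; i < sizeof words / sizeof words[0]; ++i)
        if (strcmp(words[i].word, w) == 0)
            return &words[i];
    return NULL;
}

int main(void)
{
    const WordEntry *e = find_word("speak");
    if (e)
        printf("speak -> %c %d %d\n", e->group, e->ctx1, e->ctx2);
    return 0;
}
```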

Sentences are made of 4-10 symbols (out of the 49 possible group symbols). Each symbol covers hundreds to thousands of words, so each pattern sentence can match hundreds of millions of possible input sentences. Pattern sentences are stored in groups, which compress intention down to a single symbol.
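A minimal sketch of the matching step, assuming each word has already been reduced to its first (group) symbol; the pattern strings and intention symbols are made up, and the toy patterns are shorter than the real 4-10 symbol sentences.

```c
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *pattern;  /* sequence of group symbols, e.g. "QPVL" */
    char intention;       /* one-character compressed intention */
} PatternEntry;

static const PatternEntry patterns[] = {
    { "QPVL", 'A' },  /* e.g. question-pronoun-verb-language: "can you speak english" */
    { "VL",   'D' },  /* e.g. verb-language, a direction: "speak english" */
};

/* Compress a sentence (already mapped to group symbols) down to one intention symbol. */
static char match_intention(const char *symbols)
{
    for (size_t i = 0; i < sizeof patterns / sizeof patterns[0]; ++i)
        if (strcmp(symbols, patterns[i].pattern) == 0)
            return patterns[i].intention;
    return '?';  /* no pattern matched */
}

int main(void)
{
    printf("intention: %c\n", match_intention("QPVL"));
    return 0;
}
```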

For chatbots, the developer has access to the one-character intentions (compressed sentences) and optional per-word context information, and uses these to write white-box responses. Output may be further modified using various lookup tables to add randomness to the text/audio response. 50-100 white-box literal responses are enough to cover a broad range of sentence intentions, and make changing the chatbot's personality very easy.
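A minimal sketch of white-box response selection, assuming one-character intention symbols; the intentions, response strings and use of rand() are illustrative, not the project's actual tables.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

typedef struct {
    char intention;             /* compressed sentence symbol */
    const char *alternates[3];  /* literal responses written by the developer */
    int count;                  /* how many alternates are filled in */
} Response;

static const Response responses[] = {
    { 'A', { "Yes, I speak English.", "English works for me." }, 2 },
    { 'D', { "Okay, switching to English.", "English it is." }, 2 },
};

/* Pick a random alternate for the detected intention. */
static const char *reply(char intention)
{
    for (size_t i = 0; i < sizeof responses / sizeof responses[0]; ++i)
        if (responses[i].intention == intention)
            return responses[i].alternates[rand() % responses[i].count];
    return "Sorry, I didn't catch that.";
}

int main(void)
{
    srand((unsigned)time(NULL));
    printf("%s\n", reply('A'));
    return 0;
}
```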

Sentence compression test with chatbot output - size and response times:
In 2019 on Pandora Bots (1000 words, 200 sentences): size ~500 KB, response time ~1 second.
In 2020 in C on an Arm Cortex-M4 @ 120 MHz: total size ~159 KB, 15-100 ms per sentence.
In 2021 in C on a PC @ 2.6 GHz with binary searching (set-up time of 70 ms): ~1 ms per sentence.
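The binary searching mentioned for the 2021 test could look something like the sketch below: sort the word table once at start-up (the one-off set-up cost), then binary-search it per word. The WordEntry layout and function names are hypothetical, not the project's actual API.

```c
#include <stdlib.h>
#include <string.h>

typedef struct { const char *word; unsigned char group, ctx1, ctx2; } WordEntry;

/* Comparator for sorting the table by word at start-up. */
static int cmp_entries(const void *a, const void *b)
{
    return strcmp(((const WordEntry *)a)->word, ((const WordEntry *)b)->word);
}

/* Comparator for binary-searching a word in the sorted table. */
static int cmp_key(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const WordEntry *)elem)->word);
}

/* One-off set-up: sort the word table so later lookups are O(log n). */
void build_index(WordEntry *table, size_t n)
{
    qsort(table, n, sizeof *table, cmp_entries);
}

/* Per-word lookup after set-up. */
const WordEntry *lookup(const char *word, const WordEntry *table, size_t n)
{
    return bsearch(word, table, n, sizeof *table, cmp_key);
}
```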

Other chatbot features:
Total size is < 500 KB including word databases.
One sentence takes less than 1 ms to process.
Limited chatbot responses make it easy to record an actor's voice and change personality.
Private information is stripped during word compression (words that aren't in the pre-defined list are not recognised and therefore not compressible/recoverable).

Fine differentiation of intentions, e.g. between Wondering, Questions, and Directions: "can you speak english", "do you speak english", "speak english".

Can count occurrences of emotional words, logical words, burning-concept words, and light-sense words to better reply in kind.
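A minimal sketch of that counting, assuming each word's group symbol marks its category; the 'E' and 'G' symbols are hypothetical, and the counts would then bias which response alternate is chosen.

```c
#include <stdio.h>

typedef struct {
    int emotional;
    int logical;
} ToneCounts;

/* Count category symbols in a sentence already mapped to group symbols. */
static ToneCounts count_tone(const char *group_symbols)
{
    ToneCounts t = { 0, 0 };
    for (const char *p = group_symbols; *p; ++p) {
        if (*p == 'E') t.emotional++;  /* hypothetical emotional-word group */
        if (*p == 'G') t.logical++;    /* hypothetical logical-word group */
    }
    return t;
}

int main(void)
{
    ToneCounts t = count_tone("EPGVE");
    printf("emotional=%d logical=%d\n", t.emotional, t.logical);
    return 0;
}
```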

For chatbots in experiences or games, and/or on CPU-restricted platforms, it solves:
Too much data or processing power required.
Cannot change personality/no personality.
Cannot change language/only one language.
Chatbot escaping the topic due to bad intention reading.
Chatbot returning bad views or bad knowledge calculations (only pre-determined responses are used).
Chatbot terrible voice synthesis (a voice actor can record all lines including random alternates).

Speech Recognition (benchmark of "yes"/"no"): C version/Console: Redesigned Volume Normalisation to improve clarity abov...