texnoforge - Words of Power devlog #9: TexnoLatin Alphabet in Training

It's my immense pleasure to finally announce the alpha version of the TexnoLatin alphabet!

TexnoLatin initially consists of 24 symbols:

Each symbol contains:

an SVG image (vector-based and thus infinitely scalable)
20 to 50 drawings (training data)
a working Gaussian Mixture Model (GMM) for symbol recognition
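
For illustration, here is a minimal sketch of how one GMM per symbol can be used for recognition: score the drawing's points against every model and pick the best one. This is not the actual TexnoMagic code. The symbol names, the hand-written mixture parameters, the diagonal covariances, and the use of an average per-point log-likelihood as the score are all assumptions made for the example.

```python
import math

# Hypothetical toy models: each symbol is a mixture of 2D Gaussians with
# diagonal covariance, given as (weight, (mean_x, mean_y), (var_x, var_y)).
# Real models would be fitted to training drawings, not written by hand.
MODELS = {
    "horizontal_bar": [
        (1 / 3, (0.2, 0.5), (0.01, 0.01)),
        (1 / 3, (0.5, 0.5), (0.01, 0.01)),
        (1 / 3, (0.8, 0.5), (0.01, 0.01)),
    ],
    "vertical_bar": [
        (1 / 3, (0.5, 0.2), (0.01, 0.01)),
        (1 / 3, (0.5, 0.5), (0.01, 0.01)),
        (1 / 3, (0.5, 0.8), (0.01, 0.01)),
    ],
}


def component_log_pdf(point, mean, var):
    """Log density of a Gaussian with diagonal covariance at `point`."""
    log_p = 0.0
    for x, m, v in zip(point, mean, var):
        log_p += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return log_p


def point_log_likelihood(point, gmm):
    """Log of the weighted sum of component densities (log-sum-exp)."""
    terms = [
        math.log(weight) + component_log_pdf(point, mean, var)
        for weight, mean, var in gmm
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def drawing_score(points, gmm):
    """Average per-point log-likelihood of a drawing under one model."""
    return sum(point_log_likelihood(p, gmm) for p in points) / len(points)


def recognize(points, models):
    """Return the name of the model that scores the drawing highest."""
    return max(models, key=lambda name: drawing_score(points, models[name]))
```

Calling recognize(points, MODELS) with points normalized to the unit square returns the name of the best-matching toy symbol. Averaging over points keeps scores comparable between short and long drawings.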

I've strived to create a set of distinct symbols of similar drawing complexity while providing as much semantic consistency as possible:

Symbols with opposite meanings are visually similar but contrasting, usually anti-symmetrical.
Symbols with "positive" meanings tend to point up, symbols with "negative" meanings tend to point down.
Spell shape symbols tend to go from left to right.
Most symbols reuse existing human associations, so they should be quick to learn.

Even though symbol recognition using Gaussian Mixture Models already works surprisingly well, some symbols are simply annoying to draw. I'll keep gathering feedback and iterate as needed, possibly replacing offending symbols with better alternatives.

However, TexnoLatin already meets the minimal requirements for use in the Words of Power game, and it's only going to get better.

To celebrate this milestone, let me invoke a strong fire ball, or fortis ignis sphaera, in TexnoLatin:

TexnoLatin source repo: github.com/texnoforge/texnolatin
view TexnoLatin online: wop.texnoforge.dev/abc/texnolatin

Focusing Magic Contributions

It's been 2 years since the previous Words of Power devlog, and I've been busy learning the wonderful Godot engine through game jams and various prototypes.

After hundreds of hours of game and tool development in Godot across dozens of projects, I've confirmed that I can implement all the systems needed for the first Words of Power game in Godot. The time has come once again to return to the underlying TexnoMagic symbol recognition tech and get it production ready.

I've managed to persuade a few friends (and one random internet citizen called Dan) to actually draw me an arbitrary alphabet of 26 symbols. Thank you all, dear Progenitors of Magic ♥ As useful and valuable as this was, I must admit it's silly to expect people to willingly spend their time creating a made-up lexicon of symbols.

Furthermore, it takes a fair amount of data to train reliable symbol recognition models, so even if users were interested in creating their own alphabets, a single individual is unlikely to generate a sufficient training data set.

In other words, it takes a lot of effort and collaboration to create a magic alphabet. Instead of splitting effort, I decided to focus on a single reference alphabet first and foremost.

People are also generally unwilling to install obscure software on their machines, not to mention how painful it is to support various platforms. I'll try to make the collaboration easy for users by providing online tools which can be run from any device and system with a modern web browser without the need to install anything. It should be as easy as clicking a link.

Gathering Drawings for Model Training

In order to achieve reliable symbol recognition, a considerable amount of training data is required to train robust symbol models. Ideally, this data (drawings) should come from various users, machines, systems, and pointing devices to ensure models are general enough.

I've devised a web-based solution to collect anonymous user drawings from any device with a modern web browser thanks to Godot web exports and the power of Python.

First, I've created "perfect" symbol images in SVG format (as you can see above) to serve as desired symbol shapes.

Next, I wrote a simple Godot web app called Words of Power Trainer (woptrainer), available online at train.texnoforge.dev, for the singular purpose of collecting user symbol drawings.

The user is tasked with simply drawing the symbol shown on the screen:

Upon pressing the GOOD or BAD button, the drawing is anonymously sent to the new Words of Power Vault (wopvault) Python server living at wopvault.texnoforge.dev, where it's stored as a CSV file for later processing. You can view the online Vault report on collected drawings. At the time of writing, there are more than 1400 user-submitted drawings in the Vault, with roughly 70% imported into TexnoLatin.
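
To make the "stored as a CSV file" step more concrete, here is a minimal sketch of writing a drawing to CSV and reading it back. The column layout (stroke, x, y) and the function names are assumptions for illustration only; the actual wopvault file format may well differ.

```python
import csv
import io


def drawing_to_csv(strokes):
    """Serialize a drawing (a list of strokes, each a list of (x, y)
    points) into CSV text with one point per row.

    The stroke,x,y column layout is a hypothetical format.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["stroke", "x", "y"])
    for index, stroke in enumerate(strokes):
        for x, y in stroke:
            writer.writerow([index, x, y])
    return buf.getvalue()


def drawing_from_csv(text):
    """Parse CSV text produced by drawing_to_csv back into strokes."""
    strokes = {}
    for row in csv.DictReader(io.StringIO(text)):
        point = (float(row["x"]), float(row["y"]))
        strokes.setdefault(int(row["stroke"]), []).append(point)
    return [strokes[index] for index in sorted(strokes)]
```

Keeping one point per row makes the files easy to inspect by hand and to load later for training.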

Finally, the new Words of Power Laboratory (woplab) tool makes it convenient to import the collected data from the Vault into TexnoLatin with its woplab vault export command.

Existing symbol models are used to judge the quality of Vault drawings, and only drawings with a recognition score above a certain threshold are exported. This allows continuous auto-import of user drawings without worrying about introducing bad data.
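
The threshold-based filtering can be sketched roughly as follows. This is only an illustration under stated assumptions: the drawing representation, the select_for_import function, and the toy_score stand-in (which replaces a real model score) are hypothetical and not part of woplab.

```python
def select_for_import(drawings, score, threshold):
    """Split drawings into (accepted, rejected) lists.

    `score` is any callable returning a recognition score for a drawing;
    only drawings scoring strictly above `threshold` are accepted.
    """
    accepted, rejected = [], []
    for drawing in drawings:
        if score(drawing) > threshold:
            accepted.append(drawing)
        else:
            rejected.append(drawing)
    return accepted, rejected


def toy_score(drawing):
    """Hypothetical stand-in for a model score: negative mean squared
    vertical distance of the points from the line y = 0.5."""
    points = drawing["points"]
    return -sum((y - 0.5) ** 2 for _, y in points) / len(points)
```

Because the scorer is passed in as a parameter, the same filter works unchanged when the toy function is swapped for a real per-symbol model score.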

TexnoMagic received a new colorful CLI and a bunch of new features and improvements related to the new functionality. Check out TexnoMagic News (which is brand new as well).

Help by Drawing 🧙‍♂️

train.texnoforge.dev

There is no such thing as too much training data - thank you for your contribution ♥
