October 07, 2020 08:40
As a first post on this new blog I wanted to take the time to let you all know what goes on behind the scenes at FPC (almost) every day. The thing is that the list of brands and inks doesn’t just magically appear; keeping it in shape is a process that requires some manual work.
For example, let’s look at the ink page for Akkerman #24 Zuiderpark Blauw-Groen:
In the lower half you can see that people have used many different names for this ink. I want to make sure that they are all combined into one entry, so that the ink autocomplete is nicer (it only includes the most popular name for each ink), color values get applied to all entries, and the ink leaderboard works. The system also needs this when you compare your list of inks with someone else’s. There are more features that will take advantage of this data set, but for now these are the most important ones.
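To make the "most popular name" idea concrete, here is a minimal sketch in Python. It is not the actual FPC code: the function name `most_popular_name` and the sample entries are made up for illustration, and it assumes a cluster is simply a list of the names people typed in.

```python
from collections import Counter


def most_popular_name(entries):
    """Return the name that occurs most often in one ink cluster.

    `entries` is a list of the raw names users entered for the same ink
    (a hypothetical representation, not the real FPC data model).
    """
    counts = Counter(entries)
    # most_common(1) returns a one-element list holding a (name, count) pair.
    name, _count = counts.most_common(1)[0]
    return name


# Made-up entries that were all grouped into the same ink.
cluster = [
    "Akkerman #24 Zuiderpark Blauw-Groen",
    "Akkerman Zuiderpark Blauw-Groen",
    "Akkerman #24 Zuiderpark Blauw-Groen",
    "Akkerman 24 Zuiderpark",
]
canonical = most_popular_name(cluster)
```

With the names grouped like this, an autocomplete only has to offer `canonical` instead of every spelling in the cluster.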
Unfortunately this process of combining inks can’t be fully automated. The naming conventions people use are just too different, and sometimes an ink simply has multiple names (especially Japanese ones), which no algorithm can reliably sort out. So whenever someone enters an ink that doesn’t match one that is already in the system (there are some minor cleanups happening, like stripping out spaces and making everything lowercase) I get an email and I go into the admin interface, which then looks like this:
The part above the thick black bar is the ink (or inks, as in this case there are multiple that the system has already pre-clustered) that the system doesn’t know what to do with. I can then assign it to an already existing ink using the “assign” button, or I can create a new ink. There’s some clever sorting going on to move the most likely candidates to the top of the list, but there is also a search function for the cases where the correct entry has a completely different name. In the example shown above I would assign to the first existing entry, but for the example below I might create a new entry:
On any given day there are somewhere between 5 and maybe 20 entries to look at, but of course I first had to go through all 19k entries (!) and that took me a while. The most important thing to note however is that I’m sure I’ve made quite a few mistakes when I grouped (or didn’t group) some of the entries. So if you find a mistake, use the “report an error” button on the ink page and let me know!
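For the curious, the cleanup and the candidate sorting described above could be sketched roughly like this in Python. This is only an illustration under assumptions: the post doesn’t say how the real sorting works, so the sketch swaps in a plain string similarity from Python’s standard `difflib` module, and the names `normalize`, `needs_review` and `rank_candidates` as well as the sample inks are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Apply the minor cleanups: lowercase and strip out all whitespace."""
    return "".join(name.lower().split())


def needs_review(new_name, known_names):
    """True if the cleaned-up name matches nothing already in the system."""
    known = {normalize(k) for k in known_names}
    return normalize(new_name) not in known


def rank_candidates(new_name, known_names, limit=5):
    """Sort existing inks so the most similar names come first.

    Assumption: similarity is the SequenceMatcher ratio on the cleaned-up
    names. The real system may use something entirely different.
    """
    key = normalize(new_name)
    scored = [
        (SequenceMatcher(None, key, normalize(k)).ratio(), k)
        for k in known_names
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _score, name in scored[:limit]]


# Made-up list of inks that already exist in the system.
known = [
    "Pilot Iroshizuku Kon-peki",
    "Diamine Oxblood",
    "Akkerman #24 Zuiderpark Blauw-Groen",
]

if needs_review("Akkerman Zuiderpark", known):
    candidates = rank_candidates("Akkerman Zuiderpark", known)
```

A score like this only orders the list. Deciding whether the top candidate really is the same ink is still the manual step, which is why the search box exists for inks that go by a completely different name.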
Thank you for reading this article. If you are using this service and enjoy it, please consider becoming a Patreon supporter. This will ensure that I have the means to keep this site running independently of any third party. Feel free to reach out to me if you have any questions. Your friendly FPC software developer.