Wednesday, December 5, 2018

Feature Spotlight: Gym Matching

This is the first in a series of blog posts where I break down a feature that is either new or improved in Meowth 3.0, providing details and insights for users and a break from coding for myself. There's no better place to start than with the most-requested feature of all time for Meowth: gym matching.


Introduction


For the uninitiated, "gym matching" is what we call the process of matching a location string supplied by a user to an actual gym in our database. For a very long time, Meowth did not have a database at all because I didn't know anything about databases or how to build them. Luckily for us, Scragly did a lot of invaluable work both in building a database interface and in helping me learn how to use it. As a result, Meowth 3.0 does have a database, and one of the tables in that database is for gyms.


Storing the Data


A description of the gyms table

Each gym in the table has, at minimum:
  • a unique ID (generated by a sequence automatically as gyms are added)
  • the gym's name in Pokemon Go (supplied by a server administrator)
  • the gym's latitude and longitude (also supplied by a server administrator)
  • the ID of the Level 10 S2 Cell containing the gym
  • the ID of the Discord server that submitted the gym (note: in the current build this can be NULL, but in the final build it will be required)
There are also currently columns for:
  • a nickname
  • a True/False value indicating whether the gym is an EX Raid Gym or not (this information will have to be included by the server administrator)
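
As a rough illustration of the columns listed above, here is a minimal sketch of such a table. It is not Meowth's actual schema: the column names (`lat`, `lon`, `l10`, `guild_id`, `exraid`), the sample row, and the use of SQLite (chosen only so the example runs on its own) are all assumptions.

```python
import sqlite3

# Hypothetical column names; the post describes the columns but does not name them.
SCHEMA = """
CREATE TABLE gyms (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,  -- unique ID, generated automatically
    name      TEXT    NOT NULL,                   -- the gym's name in Pokemon Go
    lat       REAL    NOT NULL,                   -- latitude
    lon       REAL    NOT NULL,                   -- longitude
    l10       TEXT    NOT NULL,                   -- ID of the Level 10 S2 cell
    guild_id  INTEGER,                            -- submitting server; NULL allowed for now
    nickname  TEXT,                               -- optional
    exraid    INTEGER NOT NULL DEFAULT 0          -- 1 if the gym is an EX Raid Gym
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up sample data, for illustration only.
conn.execute(
    "INSERT INTO gyms (name, lat, lon, l10, guild_id, nickname, exraid) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("First Baptist Church", 35.0, -90.0, "8875", 1234, "FBC", 0),
)
row = conn.execute("SELECT id, name, nickname FROM gyms").fetchone()
```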

Implementation


Building a table that holds this information is not too difficult. At Meowth's intended scale, though, some problems arise. For instance, many gyms share a name (e.g. "Starbucks", "First Baptist Church", and others), and sometimes several gyms with the same name sit within a single server's playing area. However, there are many ways to narrow the list down to only the gyms relevant to an area when trying to find a match.

One of the important pieces has to do with the reporting channel. In Meowth 3.0, every reporting channel has an associated circular region. The server administrator who configures the channel for reporting must supply a center and radius for the region. Then, when a user reports a raid in that channel, the following steps are taken:
  • Build a query of the gyms table, selecting the 'id', 'name', and 'nickname' columns.
  • Filter out all gyms added by other servers.
  • Generate a covering of Level 10 S2 Cells for the circular region associated with the channel.
  • Filter out all gyms that are not located inside the covering.
  • Retrieve the data for all of the remaining gyms.
  • Check the location argument against the nickname list.
  • If there's not a match, check it against the name list.
  • Pick the best match as judged by a fuzzy-matching module.
This is the basic algorithm as it stands right now. Before release, there will be a step added that asks for clarification if there are multiple close matches.
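
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not Meowth's code: the gyms are a plain list of dicts instead of a database query, the covering is passed in as a ready-made set of cell IDs, the standard library's `difflib` stands in for whatever fuzzy-matching module Meowth uses, and the 0.8 threshold and all function names are made up.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(query, candidates, threshold=0.8):
    """Return (gym_id, score) for the closest candidate, or None if none is close enough."""
    scored = [(gym_id, similarity(query, text)) for gym_id, text in candidates if text]
    if not scored:
        return None
    gym_id, score = max(scored, key=lambda pair: pair[1])
    return (gym_id, score) if score >= threshold else None


def match_gym(location, gyms, guild_id, covering):
    """Return the ID of the gym that best matches `location`, or None."""
    # Drop gyms added by other servers and gyms outside the channel's covering.
    nearby = [g for g in gyms if g["guild_id"] == guild_id and g["l10"] in covering]
    # Check nicknames first, then fall back to names.
    hit = best_match(location, [(g["id"], g["nickname"]) for g in nearby])
    if hit is None:
        hit = best_match(location, [(g["id"], g["name"]) for g in nearby])
    return hit[0] if hit else None


# Made-up sample data: two same-named gyms in different cells, plus another server's gym.
gyms = [
    {"id": 1, "name": "Starbucks", "nickname": "Main St Starbucks", "guild_id": 1, "l10": "A"},
    {"id": 2, "name": "Starbucks", "nickname": None, "guild_id": 1, "l10": "B"},
    {"id": 3, "name": "First Baptist Church", "nickname": "FBC", "guild_id": 2, "l10": "A"},
]
```

With this data, `match_gym("starbucks", gyms, 1, {"B"})` picks gym 2, because the covering rules out the other Starbucks, while `match_gym("fbc", gyms, 1, {"A"})` finds nothing, because that gym belongs to a different server.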

Benefits


Under the old system of locations, Meowth had a very unsophisticated process for guessing where a gym was - take the location string, add it to the "city string" in the configuration for the channel, shove all those words in a Google Maps URL, and hope Google knows about it. This worked sometimes but not all the time - unsurprising given the relative obscurity of some landmarks. So keeping an actual gym database with saved locations is obviously going to improve the directions Meowth gives. But there are other benefits as well. For one, detecting duplicate raids is now actually possible. After determining which gym a user is reporting a raid at, Meowth checks the raids table to see if a raid has already been reported there. If one has, Meowth sends the information for the existing raid, including the channel it's being coordinated in.
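
To make the duplicate check concrete, here is a small sketch. It assumes active raids are tracked in a dict keyed by gym ID, where Meowth uses a raids table; the function name and the fields in each raid record are hypothetical.

```python
def report_raid(active_raids, gym_id, boss, new_channel_id):
    """Return (raid, is_duplicate) for a report at the given gym."""
    existing = active_raids.get(gym_id)
    if existing is not None:
        # A raid is already reported here: hand back its info, including its channel.
        return existing, True
    raid = {"gym_id": gym_id, "boss": boss, "channel_id": new_channel_id}
    active_raids[gym_id] = raid
    return raid, False


raids = {}
first, first_is_dup = report_raid(raids, 7, "Kyogre", 111)
second, second_is_dup = report_raid(raids, 7, "Kyogre", 222)
```

The second report comes back flagged as a duplicate and points at the first raid's channel. Note that this only works once reports are resolved to a gym ID; with free-text locations there was nothing reliable to key on.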

Another benefit is that it is now possible for Meowth to make raid channels that are visible to multiple reporting regions. Raids will be visible to any reporting channel whose circular region contains any part of the gym's Level 10 S2 cell. This means that if you report a raid that is on or near the border of two channels' regions, the resulting raid channel will be visible to users from both regions.
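
One way to picture the visibility rule is shown below. The sketch assumes each reporting channel's covering is already available as a set of Level 10 cell IDs, so that "the region contains any part of the cell" reduces to "the cell is in the covering". The channel names and cell IDs are invented.

```python
def channels_seeing(gym_cell, channel_coverings):
    """Return the reporting channels whose covering includes the gym's cell."""
    return sorted(
        channel for channel, cells in channel_coverings.items() if gym_cell in cells
    )


# Made-up coverings: cell "B" sits on the border shared by both regions.
coverings = {
    "north-raids": {"A", "B"},
    "south-raids": {"B", "C"},
}
border = channels_seeing("B", coverings)
interior = channels_seeing("A", coverings)
```

Here a gym in cell "B" yields a raid channel visible from both reporting channels, while a gym in cell "A" is visible only from "north-raids".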

Drawbacks


This system is highly dependent on quality data from server administrators. Since an official gym map tool does not exist, we rely on user reports and have no way of vetting the data that is put in. Because of this, each server has access only to the gyms it has imported, and only server administrators can add gyms to or remove them from that list. If the data is incomplete, the results may be inconsistent. In the event that Meowth cannot find a match for the gym, it will simply fall back to the old way of creating a Google Maps search for the string.
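
For reference, the fallback amounts to something like the sketch below. The exact URL Meowth builds is not shown in this post; the example uses the search form documented for Google Maps URLs, and the function name and sample strings are assumptions.

```python
from urllib.parse import urlencode


def maps_search_url(location, city=""):
    """Build a Google Maps search link from the location string and the channel's city string."""
    query = " ".join(part for part in (location, city) if part)
    return "https://www.google.com/maps/search/?" + urlencode({"api": 1, "query": query})


url = maps_search_url("Starbucks Main St", "Springfield")
```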

Preparing Your Data


This is a link to a spreadsheet template you can use to gather your server's gym data. You'll need to make a copy of it to use it. When Meowth 3.0 launches, you will be able to import a CSV that matches the template to add your server's gym data in bulk. Thereafter, there will be commands for adding to, removing from, and modifying your server's gym list.
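
If you want to sanity-check your data before importing it, something along these lines may help. The column headers used here (`name`, `nickname`, `lat`, `lon`, `exraid`) are placeholders and may not match the template, so adjust them to whatever the template actually uses. This is not the importer Meowth will ship.

```python
import csv
import io

# Placeholder headers; swap in the template's real column names.
REQUIRED = ("name", "lat", "lon")


def parse_gym_csv(text):
    """Parse CSV text into a list of gym dicts, raising ValueError on incomplete rows."""
    gyms = []
    # Data rows start on line 2, after the header row.
    for line_no, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        if any(not (row.get(col) or "").strip() for col in REQUIRED):
            raise ValueError(f"line {line_no}: missing one of {REQUIRED}")
        gyms.append({
            "name": row["name"].strip(),
            "nickname": (row.get("nickname") or "").strip() or None,
            "lat": float(row["lat"]),
            "lon": float(row["lon"]),
            "exraid": (row.get("exraid") or "").strip().lower() == "true",
        })
    return gyms


# Made-up sample rows.
sample = (
    "name,nickname,lat,lon,exraid\n"
    "Starbucks,Main St Starbucks,35.1,-90.2,TRUE\n"
    "First Baptist Church,,35.2,-90.3,FALSE\n"
)
parsed = parse_gym_csv(sample)
```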

Final Thoughts


This system is not likely to change drastically between now and release. As always, I welcome all constructive feedback! Let me know what you think in the comments or in the Meowth Discord Server!

Changes to Research

More details on today's changes to the research command are below. A new optional argument has been added to the research command. Thi...