Task 3 – Finding the nearest frog neighbour
The next step was to generate the ID vectors for all frog images in our database. To do this, we fed all saved (row, photo) pairs to our InferenceModel and saved the resulting vectors.
Since our goal was to use these vectors for nearest neighbour search, I chose to save them in a Facebook AI Similarity Search (FAISS) Index. This is a data structure that allows fast and precise similarity searches. After saving all the ID vectors to this Index, we can query this database with new ID vectors and receive the list of the K nearest neighbours for each query.
We needed to rerank the outputs of the identification model because it compares frogs using visual information only (e.g. it might give a high rank to two similar-looking frogs that have widely different actual sizes or come from different locations). A straightforward solution to this problem was to use some of the measurements the rangers take as features: SVL (mm), the Snout to Vent Length; Weight (g), the weight of the frog; and Capture photo code, a special frog encoding based on joints.
To rerank the nearest neighbour results, we decided to use a StandardScaler from the scikit-learn package to model the mean and variance of the SVL and Weight features. Then, at query time, we use the same scaler to transform both the query's and the nearest neighbours' features to have zero mean and unit variance. We then calculate the L2 norm of the difference between the scaled features of the query and of each nearest neighbour. Because we rescaled the features to have zero mean and unit variance, SVL and Weight contribute equally to this distance even though they were originally on completely different scales (SVL being in the 20s and Weight close to 0).
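The scaling and distance step could look roughly like the sketch below. The measurement values are invented for illustration, and the variable names are hypothetical; only StandardScaler and the L2 norm come from the description above.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Invented [SVL (mm), Weight (g)] measurements standing in for the database.
database_features = np.array([
    [22.0, 0.9],
    [25.0, 1.2],
    [28.0, 1.6],
    [21.0, 0.8],
])

# Fit once on the database so the scaler learns each feature's mean and variance.
scaler = StandardScaler().fit(database_features)

# One query frog, plus three neighbours in the order the visual search returned them.
query_features = np.array([[24.0, 1.1]])
neighbour_features = database_features[[2, 0, 1]]

# Apply the same fitted scaler to both sides.
query_scaled = scaler.transform(query_features)
neighbours_scaled = scaler.transform(neighbour_features)

# L2 norm of the difference, one value per neighbour.
measurement_distance = np.linalg.norm(neighbours_scaled - query_scaled, axis=1)
print(measurement_distance)
```

In this toy data the third neighbour (25.0 mm, 1.2 g) ends up closest to the query, although the visual search listed it last.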
We also computed a simple edit distance between the Capture photo code values of the query and of each nearest neighbour. This feature looks like a four-digit binary number, e.g. “1100”, “1010”, “0011”. The digits represent the marks on a frog’s joints, and comparing them was the de facto method of comparing frogs before we produced our solution. Our distance increases as the coding differs between the two photos. We add this distance to the difference we calculated from the other features.
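One simple way to realise such a distance is to count the positions at which two codes differ (a Hamming distance). This is an assumption about what "simple edit distance" means here, and the function name is hypothetical.

```python
def capture_code_distance(code_a: str, code_b: str) -> int:
    """Count the positions at which two Capture photo codes differ.

    Assumes both codes have the same length (four digits in our data).
    """
    if len(code_a) != len(code_b):
        raise ValueError("Capture photo codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b))


print(capture_code_distance("1100", "1100"))  # identical codes
print(capture_code_distance("1100", "1010"))  # two digits differ
print(capture_code_distance("1100", "0011"))  # all four digits differ
```

With four-digit codes this distance ranges from 0 (identical marks) to 4 (every joint mark differs).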
After calculating the difference along these features, we re-sort the nearest neighbour results by this difference in ascending order, so that the closest results are shown first. This means that the first results will not just be visually similar to the query frog, but will also have a similar SVL, Weight, and Capture photo code.
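A minimal sketch of this final re-sorting step is shown below. The Candidate class, the frog IDs, and the distance values are hypothetical; the sketch only assumes that the measurement distance and the Capture photo code distance are added without extra weighting, as described above.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    frog_id: str                 # hypothetical identifier of a neighbour
    measurement_distance: float  # L2 distance on scaled SVL and Weight
    code_distance: int           # distance between Capture photo codes


def rerank(candidates: list[Candidate]) -> list[Candidate]:
    """Sort neighbours by their combined difference, smallest first."""
    return sorted(
        candidates,
        key=lambda c: c.measurement_distance + c.code_distance,
    )


# Neighbours in the order returned by the visual search (made-up values).
candidates = [
    Candidate("frog_A", 2.17, 0),
    Candidate("frog_B", 0.97, 2),
    Candidate("frog_C", 0.49, 0),
]

reranked = rerank(candidates)
print([c.frog_id for c in reranked])
```

Here frog_C moves to the top because both its measurements and its code are close to the query, while frog_B drops to the bottom despite similar measurements because its code differs in two digits.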