In this post, I describe how we developed the Pepeketua ID app, an open-source tool to help identify individual Archey’s frogs, a critically endangered species.
The app combines three machine learning models to rotate and standardise the photos, find specific frog landmarks, and identify each frog based on its skin patterns. We developed Pepeketua ID to have an intuitive Graphical User Interface (GUI), a highly efficient data management system and an easy installation process. These characteristics make Pepeketua ID an ideal tool for biologists to quickly analyse frog monitoring data, which in turn assists the preservation of Archey’s frogs.
Screenshot of the Pepeketua ID app.
Archey’s frogs are one of New Zealand’s endemic and threatened species. Unfortunately, they are currently classified as Critically Endangered by the IUCN Red List. The New Zealand Department of Conservation (DOC) monitors these tiny amphibians to better understand how to preserve them.
The frog monitoring program consists of two main stages. The first stage takes place once a year for each site: DOC rangers search for frogs in the wild for four consecutive nights and take pictures of the frogs they find. The second stage is to categorise each frog’s photo as either a recaptured frog (i.e. previously photographed) or a new individual (i.e. photographed for the first time). This categorisation is done by hand and can last several months. The monitoring program allows DOC to track the frog population: knowing when it grows (new frogs) or dwindles (missing frogs) from year to year.
Archey’s frog in the field.
Photo by James Reardon.
Isolated ML models
Wildlife.ai has been developing a set of machine learning models to help DOC biologists focus on the right set of pictures to compare, instead of using other manual systems to select which frogs might be similar.
The data rangers Gal Gozes, Dror Asaf, Shahar Gigi, Chen Yoffe and Guy Hay developed three machine learning models to:
1) Rotate and standardise the photos,
2) Find specific frog landmarks, and
3) Identify each frog based on its skin patterns, creating an image-matching tool that removes the need to match photos manually.
Most of the models above were developed in isolation (i.e. they were not connected and there was no straightforward way for biologists to use these algorithms).
Example of the output of the landmark detection model in an image of Archey’s frog.
A Quest for Conservation and Technology
After learning about the project and finding out how much work it takes to preserve Archey’s frogs, it was clear to me that any work I could take off the biologists’ plates would directly benefit the frogs. Instead of biologists painstakingly searching through the photos to re-identify frogs in new pictures, I could build an app on top of the previous models to show them only relevant pictures of similar frogs and cut down the search time significantly, freeing them to focus on other duties.
In addition, I really wanted to be involved in an environmental project. Not only would it be a chance for me to learn new technologies and gain further experience in my field, I would also be supporting nature preservation and these cute frogs.
The ideal frog identification app
Implementing machine learning models into production can be one of the most difficult parts of a data science project. Data quality, security and privacy are some of the challenges to overcome while building an app to use machine learning models. To understand how to design the interface, I had to first understand my target audience.
After discussing this with Victor and the biologists, I came to understand that the GUI must be simple both visually and conceptually. Presenting it as a website instead of a local app, for example, eliminates the need to install it. Every setting added to the interface complicates the app and prolongs the user’s work. If installing or using the app is difficult, biologists could get discouraged and choose not to use it; the task might even take longer than manually comparing the photos! In addition to a simplified GUI, it was important to show the frog photos as big as possible, given the importance of viewing the details on their backs, sides and snouts. I put all of these design considerations together and created a list of goals for the app.
Overview of the frog monitoring using the Pepeketua id app.
Pepeketua ID design goals
In designing the Pepeketua ID software, we aimed to achieve three main goals:
- Clean and save previous capture data to a temporary database
- Cleaning out unusable rows: We are provided with large Excel sheets filled out manually with frog capture information (see the previous post about this process). Each row represents a (temporarily) captured frog that has been identified, but due to the manual nature of this work, many rows contain errors or are empty and cannot be used.
- Matching each row to its corresponding frog capture photo: In order for us to create a usable database of frog photos, we must know which captures match which photos.
- Save information to temporary databases: For our app to have access to this data, we must save it in SQL and LMDB databases.
- Extract identity vectors for all previous capture images
- Applying the three models to get ID vectors from all photos: We must apply the Rotation model, then the Landmark model, then the Identity model to get the ID vector for each photo.
- Save all vectors to a searchable data structure: Saving the vectors to a searchable data structure allows querying it with the ID vectors of new photos, so the app can return each query’s nearest neighbours.
- Build GUI to query previous capture data
- Upload and compare new capture photos to previous capture photos from the same grid: As per the requirements above, the GUI must allow the user to compare the uploaded pictures to ones saved from the same grid. At this point we do not assist the user in labelling the uploaded photos; they must do that manually after deciding on a frog’s classification.
- Make it look good and run smoothly: The GUI design must be inviting, intuitive and easy to use. Furthermore, it must run fast and smoothly to facilitate ease of use.
- In addition, the app must be able to rebuild its internal databases when captures are updated or added, allowing new photos to be introduced to the database the queries are compared against.
Building the app
Once the team agreed on the goals of the app, I divided the development process into five tasks and tackled them one at a time as described in the following sections.
Task 1 – Connecting the ML models
Utilising previously trained models: I exported the three models from the relevant codebases (1, 2) to TensorFlow SavedModel format, so that I wouldn’t have to copy the classes used to initialise the models into my code. In this format, I just needed the model weights to load them.
I refactored the data-processing code from the previous model repositories into dedicated directories in my codebase. I then created a class that runs inference on all three models sequentially to produce the desired ID vector from any frog photo (the “inference_model” dir).
Outline of InferenceModel class used to perform inference on the frog photos.
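As a rough illustration of that class (the structure is simplified and the names are illustrative, not the exact project code), each stage would be a TensorFlow SavedModel loaded from disk; here the stages are plain callables so the sketch stays self-contained:

```python
class InferenceModel:
    """Chains the rotation, landmark, and identity models to turn a frog
    photo into an ID vector. Illustrative sketch: in the real app each
    stage would be a SavedModel loaded with tf.saved_model.load();
    here they are interchangeable callables."""

    def __init__(self, rotation_model, landmark_model, identity_model):
        self.rotation_model = rotation_model
        self.landmark_model = landmark_model
        self.identity_model = identity_model

    def predict(self, photo):
        # 1) Rotate and standardise the photo.
        rotated = self.rotation_model(photo)
        # 2) Find the frog landmarks on the standardised photo.
        landmarks = self.landmark_model(rotated)
        # 3) Align using the landmarks and extract the ID vector.
        return self.identity_model(rotated, landmarks)
```

The point of wrapping the three stages in one class is that the rest of the app only ever sees photo-in, ID-vector-out.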
Task 2 – Linking rows to photos
It’s very important to save the previous capture data, as it accompanies the photos and allows the rangers to identify the frogs more easily. For example, if a retrieved frog has an SVL (Snout to Vent Length) of 22 mm but the query frog has an SVL of 17 mm, it’s probably not a good match, even if the two look similar!
To make the previous capture data persistent and more accessible, I chose to import it into a local PostgreSQL server. The photos themselves I saved in a Lightning Memory-Mapped Database (LMDB), a fast local key-value store. I preferred this over reading directly from disk because it allows faster development and less file-system-related overhead.
Once I had my servers lined up, I cleaned the tabular data and created the “filepath” column which contains the path of the photo of each capture.
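As an illustration of this step (the column names and values here are hypothetical, not the actual sheet schema), the cleaning boils down to dropping unusable rows and attaching each capture to the path of its photo:

```python
import pandas as pd

# Toy stand-ins for the manually filled capture sheet and the photo listing.
captures = pd.DataFrame({
    "Capture #": [1, 2, 3],
    "SVL (mm)": [22.0, None, 19.5],   # capture 2 is missing its measurement
})
photos = pd.DataFrame({
    "Capture #": [1, 3],
    "filepath": ["photos/frog_0001.jpg", "photos/frog_0003.jpg"],
})

# Drop rows missing key fields, then attach each photo's path;
# an inner join keeps only captures that actually have a photo.
clean = captures.dropna(subset=["SVL (mm)"]).merge(
    photos, on="Capture #", how="inner"
)
```

Only the rows that survive both the cleaning and the photo match end up in the PostgreSQL table the app queries.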
Capture excel sheet sample, truncated for brevity.
Task 3 – Finding the nearest frog neighbour
The next step was to generate the ID vectors for all frog images in our database. To do this we simply fed all saved (row, photo) pairs to our InferenceModel and saved the resulting vectors.
Since our goal was to use these vectors for nearest neighbour search, I chose to save them in a Facebook AI Similarity Search (FAISS) Index. This is a data structure that allows fast and precise similarity searches. After saving all the ID vectors to this Index, we can query this database with new ID vectors and receive the list of the K nearest neighbours for each query.
We needed to rerank the outputs of the identification model because it uses only visual information to compare frogs (e.g. it might rank as similar two frogs with widely different actual sizes, or frogs from different locations). A straightforward solution was to use some of the measurements the rangers take: SVL (mm), the snout-to-vent length; Weight (g), the weight of the frog; and Capture photo code, a special frog encoding based on joints.
To rerank the nearest neighbour results, we decided to use a StandardScaler from scikit-learn to model the mean and variance of the SVL and Weight features. At query time, we use the same scaler to transform both the query’s and the nearest neighbours’ features to zero mean and unit variance, and then calculate the L2 norm of the difference between the query and each nearest neighbour. Because of this rescaling, SVL and Weight carry the same weight even though they were originally on completely different scales (SVL being in the 20s and Weight close to 0).
We also computed a simple edit distance between the Capture photo code values of the query and the nearest neighbour samples. This code looks like a four-digit binary number, e.g. “1100”, “1010”, “0011”. The digits represent the marks on a frog’s joints, and were the de facto method of comparing frogs before our solution. The distance grows as the codes of the two photos diverge, and we add it to the difference calculated from the other features.
After calculating the difference along these features, we re-sort the nearest neighbour results according to this difference in ascending order to show the closest results first. This means that the first results will not just be visually similar to the query frogs, but also with similar SVL, Weight, and Capture photo code.
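The reranking logic can be sketched as follows (the feature values, code-distance definition and function signature are illustrative assumptions, not the project's exact code):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

def code_distance(code_a, code_b):
    """Simple edit distance between two equal-length capture photo codes:
    the number of joint-mark positions where the codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))

# Fit the scaler on SVL (mm) and Weight (g) from all previous captures
# (values here are made up for illustration).
all_features = np.array([[22.0, 3.1], [19.5, 2.4], [25.0, 3.8], [21.0, 2.9]])
scaler = StandardScaler().fit(all_features)

def rerank(query_feats, query_code, neighbour_feats, neighbour_codes, neighbour_ids):
    """Re-sort nearest-neighbour results by scaled feature distance plus
    capture-code distance, ascending, so the best matches come first."""
    q = scaler.transform([query_feats])[0]
    n = scaler.transform(neighbour_feats)
    feature_dist = np.linalg.norm(n - q, axis=1)  # L2 norm per neighbour
    code_dist = np.array([code_distance(query_code, c) for c in neighbour_codes])
    order = np.argsort(feature_dist + code_dist)
    return [neighbour_ids[i] for i in order]
```

A neighbour with identical measurements and an identical capture code gets a total distance of zero and rises to the top of the results, regardless of its rank in the purely visual search.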
Two examples of visually different Archey’s frogs.
Photos by James Reardon.
Task 4 – Developing a slick GUI
I selected Streamlit as the engine to build this app. Streamlit is an open-source Python library used for building interactive web applications for data science and machine learning. With Streamlit, developers can easily create interactive visualisations, charts, and graphs, and incorporate machine learning models into their applications. The library offers a wide range of widgets and components that can be used to build complex and responsive user interfaces.
The app presents the settings in the sidebar, the query photos in the centre, and the nearest neighbour results on the right. Expanders at the bottom of each photo show the Excel information for each frog for easy comparison.
Screenshot of the frog photo search app, built with Streamlit.
Task 5 – Dockerizing Pepeketua ID
To make running this app easy and stable, I wrapped everything in Docker images.
Using Docker Compose, I used a PostgreSQL image to run the SQL server we needed inside the Docker network, and created a Python environment image in two stages: first ghostcow/pepeketua:base_image, containing the Ubuntu packages needed to run the project, then ghostcow/pepeketua:python_env, built on the base image, containing the code and pip packages needed to run it.
With these images at my disposal, I set up the bind mount and dedicated volumes needed to store all of the databases. Commands such as docker compose run execute the preprocessing stage, and docker compose up starts the Streamlit application.
Docker compose definitions file used to define dockers in this project.
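For context, a stripped-down sketch of what such a compose file could look like (service names, ports, paths and variables are illustrative, not the project’s actual file):

```yaml
services:
  postgres:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - pgdata:/var/lib/postgresql/data   # dedicated volume for SQL data

  app:
    image: ghostcow/pepeketua:python_env
    depends_on:
      - postgres
    ports:
      - "8501:8501"          # Streamlit's default port
    volumes:
      - ./data:/app/data     # bind mount with the photos and Excel sheets
      - lmdb_data:/app/lmdb  # dedicated volume for the LMDB store

volumes:
  pgdata:
  lmdb_data:
```

Keeping the databases in named volumes means the containers can be rebuilt or upgraded without losing the preprocessed data.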
We are working with the biologists from the DOC to set up this app so that they can classify new frogs by searching for nearest neighbours among the previously captured frogs. Every time they gather new data, they will update their Excel sheets and quickly rebuild the Pepeketua ID databases so that the app’s search covers the new captures.
We have created a dynamic tool that brings Deep Learning models into the process of identifying and conserving Archey’s frogs in the rainforests of New Zealand.
In the future, we could expand this app in several ways:
- Expanding the app to different species: This app can easily be modified to perform nearest neighbour searches within any photo database. Instead of Archey’s frog, we can apply it to any species, given photos, metadata, and models that extract features from images of that species.
- Running a public instance of the app: We could run a public instance of this app in case we would need the public’s help in identifying frogs or members of some other species. In that case, the query (new) photos would be pre-uploaded to the server, and members of the public would try to identify those frogs using the app and report back to us. This could take even more work off the biologists, as the public would help prepare more focused lists of options for them. This was done successfully at an earlier stage of development, where the Zooniverse platform was used to have the public annotate keypoints on frog pictures.
- Improving the Deep Learning models: To improve the recall of our solution (i.e. the proportion of correct frogs found within the nearest neighbours of a query), we could retrain the models on new data or develop a different algorithmic solution. Once finished, the changes can easily be incorporated by updating InferenceModel.
GitHub repo: Pepeketua Interface