
For this quarter’s Hot Topics in primate welfare, ASP is featuring the use of Artificial Intelligence for research and enrichment across settings and species. We interviewed three experts who use Artificial Intelligence in their research and welfare programs: Dr. Fay Clark (Senior Lecturer, School of Psychological Science, University of Bristol, UK), Otto Brookes (PhD Candidate, School of Computer Science, University of Bristol, UK), and Dr. Daniel Schofield (Schmidt AI in Science Fellow, Visual Geometry Group, University of Oxford, UK).
Dr. Fay Clark (FC): Observing primates by eye and recording everything by hand is a fatiguing and error-prone process, and zoos are full of distractions. I started using technology to overcome some of these challenges. I have used it to observe animals unobtrusively when my presence would disturb them, when their behavior was too fast or detailed to record in person, and when I wanted to record behavior 24/7. I use hidden surveillance equipment enabled by AI: small cameras and pressure/motion sensors embedded within tasks. For example, Gorilla Game Lab was a collaboration between the University of Bristol and Bristol Zoo (2018-2021) which integrated computer vision technology with cognitive enrichment for gorillas.
Otto Brookes (OB): My work focuses on developing deep learning models—a branch of machine learning—and addressing the technical challenges of applying them to primate research, such as dealing with unfamiliar environments or detecting rare behaviors. While the focus has been on model development so far, I’m now beginning to apply these methods in practice through a collaboration with the Wild Chimpanzee Foundation. The aim is for these tools to support real-world welfare and conservation efforts in the near future.
Dr. Daniel Schofield (DS): I use computer vision—a branch of AI that processes visual data—to detect and track primates in the wild. These models can detect and track animals, identify species and individuals, classify behaviors, and even map social interactions over time. By combining different models, we can study complex behavioral and social systems at scale. This opens new possibilities for research and conservation—for example, tracing cultural evolution across video archives and large camera trap datasets, tasks that would be extremely difficult to analyze manually.
Gorilla Game Lab integrates computer vision technology with cognitive enrichment for gorillas.
Project by Fay Clark and Otto Brookes.
OB: Visual AI is trained to recognize individual primates by being shown many examples—pairs of images or videos along with the correct ID—so it can learn what each one looks like. Over time, it learns to recognize patterns in shapes, colors, and motion that allow it to distinguish one individual from another.
DS: Deep neural networks are a type of AI algorithm loosely inspired by how neurons work in the brain. They consist of many interconnected layers that gradually extract patterns from data. To detect a primate in an image—or to recognize which species or individual it is—we train the network by feeding it many labeled examples. Over time, the model learns which visual features (such as coloration, facial shape, or posture) are useful for making accurate predictions. It starts by identifying simple patterns like edges or contours and, through repeated exposure, builds up to more complex representations. With enough examples, the network becomes increasingly reliable at detecting and classifying primates in new, unseen images or videos.
OB: Yes, many of these models are open-source, but they still need to be retrained with project-specific data. That means collecting and carefully annotating images or videos of the individuals you want the system to recognize. These models can be highly reliable in controlled environments like zoos, where backgrounds and individuals stay the same. However, they face challenges in more dynamic settings—such as when individuals age or new ones are introduced—since the models need to be updated or retrained to adapt to those changes.
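To make the training and retraining described above more concrete, here is a minimal sketch (Python with PyTorch/torchvision, not code from any of the projects mentioned here) of fine-tuning an open-source, ImageNet-pretrained classifier on a project-specific set of labeled images, with one folder per individual. The folder layout, number of epochs, and other settings are illustrative assumptions.

```python
# Minimal sketch: fine-tune a pretrained ResNet to identify individual primates.
# Assumes a hypothetical directory layout of data/train/<individual_name>/*.jpg.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

train_set = datasets.ImageFolder("data/train", transform=transform)
loader = DataLoader(train_set, batch_size=16, shuffle=True)

# Start from weights learned on ImageNet, then replace the final layer so the
# network predicts one class per known individual.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, len(train_set.classes))

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

for epoch in range(5):            # a handful of epochs is often enough when fine-tuning
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: last-batch loss {loss.item():.3f}")
```

When new individuals join a group, or appearances change as animals age, the same script can be rerun on the updated folders; starting from pretrained weights keeps retraining relatively quick.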
DS: We’ve seen rapid progress in AI performance, with models now matching or even surpassing human ability on many tasks. Just a few years ago, reliably detecting and tracking nonhuman primates—especially in the wild—was extremely difficult due to challenges like variable lighting, motion blur, and changes in pose. While some of these conditions remain challenging, performance has improved dramatically—enough to be practical for scientific research. Most computer vision models and datasets are open-source, which has helped accelerate progress across the field of machine learning. However, primatology is still catching up. One major limitation has been the lack of naturalistic, scientifically grounded datasets labeled by domain experts—but this is beginning to change.
FC: Electrical components like piezoelectric sensors and Raspberry Pi computers are very affordable these days. But for safety (especially if a system will be used outside working hours), it’s vital to involve people with expertise in electronics, and their time needs to be budgeted for.
OB: As Fay mentioned, the main cost comes from the time and effort required for data collection and annotation—setting up cameras, managing footage, and labeling hundreds of images per individual primate. While the software is open-source, training the model requires access to a machine with a powerful GPU, and similar hardware is needed to run the model afterward, especially for processing video. Thankfully, cloud platforms like Hugging Face offer this affordably. If your goal is to collect data automatically over a long period using computer vision, then the upfront effort is well worth it—it can save a huge amount of time in the long run.
DS: While the software is often free, there are costs associated with storing large video or audio datasets and using graphics cards (GPUs) to train models. Cloud storage and compute can become expensive at scale—especially with terabytes of data—but prices are dropping. Today, researchers can often access entry-level cloud servers for small-scale training or experimentation for just a few dollars a month, without investing in high-end hardware. Increasingly, large companies are releasing pre-trained, generalizable models that can be used directly for inference—eliminating the need for costly training in many cases.
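To give a sense of what using a pre-trained model directly for inference looks like, the sketch below loads an off-the-shelf, COCO-trained detector from torchvision and prints its detections for one image. COCO has no primate-specific class, so this only demonstrates the workflow; for real work you would swap in a wildlife-trained or fine-tuned model. The file name and confidence threshold are assumptions.

```python
# Minimal sketch: run an off-the-shelf pretrained detector on a single image.
import torch
from torchvision.io import read_image
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn_v2, FasterRCNN_ResNet50_FPN_V2_Weights)

weights = FasterRCNN_ResNet50_FPN_V2_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn_v2(weights=weights).eval()
preprocess = weights.transforms()

image = read_image("camera_trap_frame.jpg")       # hypothetical file name
with torch.no_grad():
    prediction = model([preprocess(image)])[0]

categories = weights.meta["categories"]
for box, label, score in zip(prediction["boxes"],
                             prediction["labels"],
                             prediction["scores"]):
    if score > 0.8:                               # arbitrary confidence threshold
        print(categories[label.item()],
              [round(v) for v in box.tolist()],
              round(score.item(), 2))
```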
FC: It wasn’t possible to mess around with the installed technology very often, so we had to practice with mockups. We used something called the ‘Wizard of Oz’ design method. You can make it appear like a fully functional, automated system to the user (gorilla), but really, one of our research team acts as the ‘wizard’ behind the scenes. This allowed us to simulate the technology and gather feedback early in the design process, before investing in the final, fully functional system. Another challenge is that zoos are full of humans! We underestimated how often the AI would detect visitors, researchers and keepers in the background of our videos. For privacy, visitor images had to be deleted, and this took a lot of extra processing time.
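On the privacy point, the same kind of detector can help automate the clean-up: a person detector can flag frames containing humans so they are set aside before analysis or sharing. The sketch below is a generic illustration of that idea (it reuses the detector from the previous sketch and is not the Gorilla Game Lab pipeline); the folder names and score threshold are assumptions, and in torchvision's COCO-trained detectors category 1 corresponds to 'person'.

```python
# Minimal sketch: quarantine frames that contain people, for privacy review.
# Folder names and the confidence threshold are hypothetical.
import shutil
from pathlib import Path

import torch
from torchvision.io import read_image
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn_v2, FasterRCNN_ResNet50_FPN_V2_Weights)

weights = FasterRCNN_ResNet50_FPN_V2_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn_v2(weights=weights).eval()
preprocess = weights.transforms()

quarantine = Path("frames_with_people")                # hypothetical output folder
quarantine.mkdir(exist_ok=True)

for frame_path in Path("raw_frames").glob("*.jpg"):    # hypothetical input folder
    with torch.no_grad():
        pred = model([preprocess(read_image(str(frame_path)))])[0]
    # COCO category 1 is "person"; 0.7 is an arbitrary confidence cut-off.
    person_found = ((pred["labels"] == 1) & (pred["scores"] > 0.7)).any()
    if person_found:
        shutil.move(str(frame_path), str(quarantine / frame_path.name))
```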
DS: One of the biggest challenges has been the lack of labelled data for some of the tasks such as individual or action recognition, so we had to start from scratch. While human-focused models are already available and trained on millions of images, datasets for primates are far more limited. To address this, we developed workflows to streamline the labelling process—using detection and tracking models to automatically cluster images of the same individual, which can then be labelled in bulk using custom lightweight annotation software. Collaborating with computer vision researchers and software engineers has been immensely helpful in designing these tools. Their technical expertise has accelerated development, and the process has also been a valuable learning experience—bringing advanced machine learning techniques into primate research in a way that’s practical and scalable.
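One way to picture the clustering step Daniel describes: extract a feature embedding for each cropped detection with a pretrained network, then group similar crops so that an expert can label a whole cluster with one identity at a time. The sketch below is a generic illustration rather than the Oxford workflow; the crop folder and the use of DBSCAN with cosine distance (and its parameters) are assumptions.

```python
# Minimal sketch: embed image crops with a pretrained network, then cluster them
# so that groups of (probably) the same individual can be labelled in bulk.
from pathlib import Path

import numpy as np
import torch
from sklearn.cluster import DBSCAN
from torchvision import models
from torchvision.io import read_image

weights = models.ResNet18_Weights.DEFAULT
backbone = models.resnet18(weights=weights)
backbone.fc = torch.nn.Identity()                 # drop the classifier; keep 512-d features
backbone.eval()
preprocess = weights.transforms()

paths = sorted(Path("crops").glob("*.jpg"))       # hypothetical folder of primate crops
with torch.no_grad():
    embeddings = np.stack([
        backbone(preprocess(read_image(str(p))).unsqueeze(0)).squeeze(0).numpy()
        for p in paths
    ])

# Group visually similar crops; each cluster can then be labelled with a single ID,
# and noise points (cluster -1) left for manual review.
clusters = DBSCAN(eps=0.3, min_samples=5, metric="cosine").fit_predict(embeddings)
for cluster_id in sorted(set(clusters)):
    members = [p.name for p, c in zip(paths, clusters) if c == cluster_id]
    print(f"cluster {cluster_id}: {len(members)} crops")
```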
FC: In zoos, we often focus our research during core opening hours, but animals may be most active at dawn or overnight. I think all keepers and researchers have experienced days when their animals just seem ‘off’. Which leads you to think, ‘What happened before I got here? Has something disturbed their sleep?’ Assuming you can set surveillance camera systems up safely, machine learning can be used to monitor animal behavior and welfare 24/7 without a human being present. This isn’t about being lazy or complacent researchers – it’s about leveraging advances in computer vision to collect more data and assess it automatically, thereby collating more evidence to support management decisions.
OB: One of the biggest practical impacts computer vision is starting to have in primatology—especially in the field—is drastically reducing the time needed for analysis. For example, we can now process hundreds of thousands of camera trap videos to detect the presence of primates in just a few hours, a task that previously took months. While this capability exists, it’s not yet fully integrated into routine field pipelines. However, it holds great potential to accelerate studies of abundance, density, and conservation outcomes.
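To give a feel for what such a batch pipeline can look like, here is a minimal sketch that samples frames from every video in a folder and writes one row per video recording whether anything was detected. The detector itself is left as a clearly hypothetical placeholder (`contains_primate`), which you would replace with a real model, for example one fine-tuned as in the earlier sketch; the folder name, frame-sampling interval, and output file are assumptions.

```python
# Minimal sketch: scan a folder of camera-trap videos and record, per video,
# whether any sampled frame triggers the detector. Paths and intervals are assumptions.
import csv
from pathlib import Path

import cv2  # OpenCV, used here only to decode video frames


def contains_primate(frame) -> bool:
    """Hypothetical placeholder: replace with a real detector or classifier."""
    return False


with open("detections.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["video", "primate_present"])
    for video_path in sorted(Path("camera_trap_videos").glob("*.mp4")):
        cap = cv2.VideoCapture(str(video_path))
        present = False
        frame_index = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # Check roughly one frame per second for 30 fps footage.
            if frame_index % 30 == 0 and contains_primate(frame):
                present = True
                break
            frame_index += 1
        cap.release()
        writer.writerow([video_path.name, present])
```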
DS: Computer vision is transforming how we collect, analyze, and interpret primate behavior at scale. With tools like camera traps, passive audio recorders, and continuous monitoring in captive settings, we can gather vast datasets over long timeframes. AI enables us to process this data efficiently and reproducibly—generating behavioral records that support both scientific discovery and animal welfare. For example, we can detect early signs of illness, monitor activity patterns and social relationships, and track changes over time—all non-invasively. Crucially, AI also enhances transparency and reproducibility: we can trace exactly how observations were generated, compare algorithms, and revisit original data. I firmly believe many of the future breakthroughs in the field will be made possible by these advances in AI.
FC: An extension of Gorilla Game Lab we have discussed (and would love to identify collaborators for!) is embedding technology within cognitive tasks and enrichment for covert health monitoring. Some great apes are beautifully trained to cooperate with routine husbandry procedures like offering their arm for blood pressure monitoring, or their eyes to check for cataracts. But some individuals find it quite stressful. Taking inspiration from some gamified eye testing applications for humans, we would like to develop ways to monitor health in zoo apes, and maybe also their wild counterparts.
DS: What’s exciting is how these models are becoming increasingly generalizable. We’re seeing a convergence of methods that can combine multiple modalities—visual, auditory, and even language—into unified systems. These models are becoming more flexible and data-efficient, able to perform tasks with minimal training. This opens up the potential to uncover patterns and structures in behavior that may not be obvious to human observers. For example, the application of AI to interpret communicative signals is a rapidly growing area, with the potential to reveal new insights into the meaning and structure of primate communication.
FC: I am currently interested in ‘flow state’ – a human state of high task focus that could also be present in great apes. Tied into the health monitoring discussed above, I am excited to try using AI not only to develop highly engaging games for great apes but also to measure their behavioral and physiological responses.
OB: Yes, we’ve developed several AI models and large-scale datasets focused on extracting primate behavior from video footage collected in the wild. Our goal is not only to advance our own research but also to support the wider community by creating datasets that enable other researchers to build and apply similar capabilities. We definitely plan to continue developing and refining these methods to better understand primate behavior and support conservation efforts.
DS: Yes – I’m starting a Schmidt AI in Science Fellowship, and one of my main goals is to scale these tools and primate models for research and conservation. As part of this, I’m working to make AI more accessible so that more researchers and conservationists can track populations and behaviors without relying on expensive equipment or acquiring large annotated datasets. Our lab is developing some exciting new software tools for managing large camera trap datasets and for helping to identify individuals in unhabituated populations. By building generalizable, low-barrier tools, I hope to widen the impact of AI in primatology, conservation and related fields.
FC: My number one tip for zoos interested in computer vision, machine learning, AI etc. is to fully collaborate with computer scientists and engineers. There is little advantage to taking on the whole burden yourself (learning how to code and program, etc.), and I bet there are incredible potential collaborators at your local university or company, or further afield (Google ‘Animal Computer Interaction’). I have made some lifelong friendships in the process of working on these technology projects. Always remind yourself ‘what is the point?’ of this technological application and avoid adding cameras and computers to animal enclosures without a strong ethical justification. Technology should be used cautiously and with respect for the animals and their requirements.
OB: I completely agree with Fay’s point about the importance of collaboration. The community working at the intersection of AI and animal research is very welcoming, and it’s well worth reaching out to others to discuss your project ideas. Many of the challenges in using computer vision are highly project-specific, so getting advice early can save a lot of time. Since these methods often require a significant upfront effort—especially around data collection and labeling—it’s important to start on the right track and design your workflow carefully from the beginning.
DS: Computer vision and AI are tools that anyone can start using—you don’t need a background in computer science or mathematics. A basic understanding of what AI can do, and its limitations, is enough to get started. Begin with a simple, well-defined task such as object detection or counting animals in images. Open-source tools and demos make it easy to generate results and learn through hands-on exploration without getting bogged down in technical details. For more complex projects, collaborating with engineers or AI researchers can be incredibly valuable. But these tools are becoming more accessible every day, and now is a great time to explore what’s possible. For example, we’ve created an interactive notebook that lets users try out our detection and tracking workflows with no coding required: https://www.robots.ox.ac.uk/~vgg/software/follow-things-around/
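For readers who want to try that kind of simple first task, the short sketch below counts detected objects in a folder of images using an open-source object-detection pipeline from Hugging Face. The model choice, folder name, and confidence threshold are illustrative assumptions, and a COCO-trained model has no primate class, so treat this purely as a way to get hands-on before moving to a wildlife-specific or fine-tuned model.

```python
# Minimal starter sketch: count detected objects per image with an open-source
# Hugging Face pipeline. Model, folder, and threshold are example assumptions.
from pathlib import Path

from transformers import pipeline

detector = pipeline("object-detection", model="facebook/detr-resnet-50")

for image_path in sorted(Path("images").glob("*.jpg")):    # hypothetical folder
    detections = detector(str(image_path))
    kept = [d for d in detections if d["score"] > 0.8]     # arbitrary confidence cut-off
    print(image_path.name, len(kept), [d["label"] for d in kept])
```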
Face recognition article: https://www.science.org/doi/full/10.1126/sciadv.aaw0736
Action recognition article: https://www.science.org/doi/full/10.1126/sciadv.abi4883
Measuring sociality with computer vision article: https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.14181
Visual Geometry Group, University of Oxford: https://www.robots.ox.ac.uk/~vgg/
Visual AI project software: https://www.robots.ox.ac.uk/~vgg/projects/visualai/software.html
Detection and tracking workflow: https://www.robots.ox.ac.uk/~vgg/software/follow-things-around/
New beta notebook for tracking all animals: https://colab.research.google.com/github/ox-vgg/follow-things-around/blob/v2_beta1/follow-things-around-v2-beta.ipynb#scrollTo=S4Hqz0eZ5ulO