NSF NEON, Operated by Battelle

Case Study

What's That Beetle? Ask the Algorithm

May 26, 2021


Can machine learning be used for accurate species identification of beetles and other invertebrates? Dr. Katie Marshall and Jarrett Blair at the University of British Columbia (UBC) sought to answer this question using carabid beetle data from the NEON program. Eventually, they hope to leverage machine learning to identify other species caught in the NEON beetle pitfall traps. Machine learning could one day be used to classify unidentified species in the NEON bycatch (species caught other than the target species) and answer new questions about invertebrate diversity and abundance across North America.

Classifying Carabids at NEON Field Sites

Carabids are a large family of insects commonly known as ground beetles. With more than 2,000 known species in North America and 40,000 worldwide, the Carabidae is one of the most species-rich taxa of invertebrate animals. They are found in nearly every ecosystem in North America. Their diversity and abundance make them a great taxon to study; shifts in carabid species diversity and range can provide important information about how ecosystems are changing.

That's why the NEON program collects and classifies carabids as part of the terrestrial observational sampling system. Ground beetles are collected in pitfall traps filled with a colorless and unscented preservative solution. A variety of other insects and arthropods also end up in the traps as bycatch. Carabids are identified to the species level wherever possible. Bycatch is not identified or classified but is preserved in the NEON Biorepository for interested researchers.


Specimen sample at the NEON Biorepository, from a pitfall trap at the OAES field site in Oklahoma (D11).

Dr. Katie Marshall, an assistant professor in the Department of Zoology at the University of British Columbia (UBC), and Jarrett Blair, a Zoology Ph.D. candidate at UBC, are among those interested researchers. While their ultimate interest is in the non-carabid species collected in the bycatch, they first needed to test their machine learning algorithm using known species. Because the carabids collected by the NEON program and available from the NEON Biorepository were already identified to the species level, they provided an excellent opportunity to test the algorithm.

Their results were published in Ecology and Evolution in November 2020: "." The study was funded through a .

Putting Machine Learning to the Test with Carabids

Blair and Marshall worked with collaborators from the University of Oklahoma (OU) to develop the machine learning algorithm for carabid identification. Machine learning is a form of artificial intelligence that looks for patterns in large datasets. The algorithm is fed training data: in this case, image data for carabids that had already been identified down to the species or subspecies level. By looking at large datasets, the algorithm learns which features in the data (such as body morphology and color patterns) are associated with different species.

"Once the program is trained," Blair explains, "the idea is that when you feed it new data, it will be able to make the identification fairly accurately. The goal is to classify beetle species with an accuracy that mirrors or exceeds that of humans."


CPER pitfall trap samples at the NEON Biorepository.

For this project, they fed the algorithm extracted image data (such as specimen size, shape, and color) rather than raw pixel data. Dr. Michael Weiser at OU photographed carabid species from the NEON Biorepository and provided the extracted image data for the training set. An open-source software program was used to automate feature extraction from the raw images.

"The advantage of using extracted morphology data rather than raw images is that it provided important information such as body size," Marshall explains. "It also allows us to control the data that the program is learning from. If we just feed the model raw images, we don't really know what data it is extracting from those images to learn from." Using extracted image data also simplifies the dataset, allowing the program to run with much less processing power. "One of the goals was to create a model that would be accessible for other researchers. With this, you don't need a supercomputer to run the model; you can do it on your laptop."
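As a rough illustration of learning from extracted measurements rather than raw pixels, the sketch below trains a nearest-centroid classifier on a handful of feature vectors. It is not the algorithm used in the study. The choice of classifier, the three features (body length, length-to-width ratio, mean brightness), the numeric values, and the labels `species_a` and `species_b` are all hypothetical.

```python
from collections import defaultdict
from math import dist


def fit_centroids(rows):
    """Average the feature vectors of each label.

    rows: list of (feature_tuple, label) pairs.
    Returns a dict mapping each label to its mean feature tuple.
    """
    grouped = defaultdict(list)
    for features, label in rows:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*vectors))
        for label, vectors in grouped.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical training set:
# (body length in mm, length/width ratio, mean brightness from 0 to 1)
TRAINING = [
    ((8.0, 2.1, 0.20), "species_a"),
    ((8.4, 2.0, 0.25), "species_a"),
    ((14.9, 2.6, 0.10), "species_b"),
    ((15.3, 2.7, 0.12), "species_b"),
]

centroids = fit_centroids(TRAINING)
print(predict(centroids, (8.2, 2.05, 0.22)))
```

Because each specimen is reduced to a few numbers, a model like this trains and predicts almost instantly on a laptop. In practice the features would usually be standardized first so that body length does not dominate the distance.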

Algorithm development was led by Blair, expanding on earlier work. As an undergraduate in 2017, he and a friend developed a startup company creating apps that use machine vision and machine learning for insect identification. Specifically, they were developing a program to help farmers identify pest insects on crops. "This project brings together my interests in entomology and machine learning," he says.

The program uses taxonomic classification. If it runs across a species it cannot identify (for example, a rare species that was not present in its original training set), it will attempt to classify it to a higher taxonomic level, such as genus or subfamily. Human taxonomists can then complete the classification and feed the species data back to the algorithm, so it can continue learning over time. The algorithm can classify hundreds of unknown individuals nearly instantaneously, greatly expanding the ability of ecologists to make use of currently unclassified preserved specimens.
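One way to picture the fallback to a higher taxonomic level is a confidence threshold: if no single species is likely enough, the species probabilities are summed by genus, then by subfamily. The article does not say how the program decides when to fall back, so the threshold of 0.8, the `TAXONOMY` table, and all taxon names below are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical taxonomy: species -> (genus, subfamily)
TAXONOMY = {
    "species_a1": ("genus_a", "subfamily_x"),
    "species_a2": ("genus_a", "subfamily_x"),
    "species_b1": ("genus_b", "subfamily_x"),
    "species_c1": ("genus_c", "subfamily_y"),
}


def classify_with_fallback(species_probs, threshold=0.8):
    """Return (rank, taxon) at the finest rank that meets the threshold.

    species_probs: dict mapping species name to predicted probability.
    Probabilities are summed within each genus and subfamily, so a
    specimen that is ambiguous between two congeners can still be
    assigned to their shared genus.
    """
    levels = [
        ("species", lambda s: s),
        ("genus", lambda s: TAXONOMY[s][0]),
        ("subfamily", lambda s: TAXONOMY[s][1]),
    ]
    for rank, to_taxon in levels:
        totals = defaultdict(float)
        for species, prob in species_probs.items():
            totals[to_taxon(species)] += prob
        best = max(totals, key=totals.get)
        if totals[best] >= threshold:
            return rank, best
    # Nothing was confident enough: leave it for a human taxonomist.
    return "unresolved", None


ambiguous = {"species_a1": 0.45, "species_a2": 0.40,
             "species_b1": 0.10, "species_c1": 0.05}
print(classify_with_fallback(ambiguous))
```

Specimens that come back at genus or subfamily level, or unresolved, are the natural candidates to pass to human taxonomists for final identification.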

In tests using NEON carabid data, the best-performing algorithm reached a species identification accuracy of ~85% when presented with unidentified image data. Providing location data improved accuracy to more than 95% at the species level and ~99% at the subfamily level. Human field technicians are expected to exceed 80% accuracy in species identification for their site locations. Blair says, "Eventually, we could turn this into a classification pipeline for the NEON program. Right now, the bottleneck is at the imaging and data extraction stage. But once that is done, it would take a fraction of a second to identify thousands of specimens."
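The article does not describe how location data was supplied to the model. As one possible illustration of why location helps, the sketch below down-weights species that have not been recorded at a site and renormalizes the remaining probabilities. The function name `apply_site_prior`, the `floor` weight of 0.01, and the species labels are hypothetical, and this is not presented as the study's method.

```python
def apply_site_prior(species_probs, site_species, floor=0.01):
    """Reweight image-based probabilities using a site species list.

    species_probs: dict mapping species name to predicted probability.
    site_species: set of species previously recorded at the site.
    floor: weight given to species not on the list, so that a new
           arrival at the site is made unlikely rather than impossible.
    """
    weighted = {
        species: prob * (1.0 if species in site_species else floor)
        for species, prob in species_probs.items()
    }
    total = sum(weighted.values())
    return {species: weight / total for species, weight in weighted.items()}


image_only = {"species_a": 0.50, "species_b": 0.45, "species_c": 0.05}
with_location = apply_site_prior(image_only, {"species_b", "species_c"})
print(max(with_location, key=with_location.get))
```

Here the image data alone slightly favors `species_a`, but that species is not on the site list, so the reweighted result favors `species_b`.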

Beyond Beetles: Looking at the Bycatch

The study serves as a proof-of-concept for using machine learning for invertebrate identification. Marshall and Blair next plan to expand their studies to the NEON program bycatch. The bycatch contains many additional invertebrate species of interest, including ants, spiders, and other arthropods, both common and rare. A better understanding of invertebrate populations will provide insights into ecosystems as a whole.

Marshall explains, "Invertebrates are the unsung heroes of an ecosystem. They may not be flashy or exciting, but, as E.O. Wilson said, they are 'the little things that run the world.' They do a lot of work, decomposing, making bionutrients available; without them, we would be losing a lot of biodiversity in our ecosystems. This is a great opportunity to study them and get a better understanding of the roles they play in different ecosystems."

To create an identification algorithm for bycatch from the NEON program, researchers will first need to create a training data set. This will involve manual species identification for the diverse species found in the NEON bycatch, a daunting proposition. At first, they are only aspiring to identify specimens to the family, subfamily, or genus level. This analysis will provide insights into the presence and relative abundance of different types of invertebrates at the NEON field sites.

To assist with identification, Blair and Marshall plan to join forces with another team using environmental DNA (eDNA) analysis. This team, headed by Dr. Cameron Siler at OU, is sequencing DNA extracted from the preservative fluid in the bycatch sample tubes to identify the arthropod species present. Combining eDNA methods with the imaging and machine learning work headed up by Marshall and Blair could bring together the best of both worlds. eDNA provides evidence of which species are present, but not the relative abundance of each species in the sample tube. The imaging process could use the eDNA evidence to improve species identification while providing a count of how many individuals of each species are present.
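The division of labor described above, with eDNA indicating which taxa are present and imaging supplying the counts, could look something like the following sketch. The function `count_with_edna`, the per-specimen scores, and the taxon names are hypothetical; the planned pipeline itself is not described in detail in the article.

```python
from collections import Counter


def count_with_edna(specimen_scores, edna_taxa):
    """Count individuals per taxon, restricted to eDNA-detected taxa.

    specimen_scores: list of dicts, one per imaged specimen, mapping
                     taxon name to an image-based score.
    edna_taxa: set of taxa detected in the preservative fluid.
    """
    counts = Counter()
    for scores in specimen_scores:
        allowed = {t: s for t, s in scores.items() if t in edna_taxa}
        # If eDNA rules out every candidate, fall back to the image scores.
        candidates = allowed or scores
        counts[max(candidates, key=candidates.get)] += 1
    return counts


specimens = [
    {"ant": 0.60, "spider": 0.40},
    {"ant": 0.30, "spider": 0.70},
    {"beetle": 0.55, "ant": 0.45},  # eDNA found no beetle in this tube
]
tube_counts = count_with_edna(specimens, {"ant", "spider"})
print(tube_counts)
```

The third specimen shows the intended benefit: the image model leans toward a taxon that the eDNA evidence does not support, so the count goes to the best supported alternative instead.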

Geographical Ecology in Action

This team's project is a great example of geographical ecology: the study of how organisms are distributed over space. Large-scale ecological data like that gathered by the NEON program allows researchers to look for patterns in which species live in different regions and habitats and how that is changing over time.

Machine learning has the potential to greatly expand the ability to make use of the data collected, especially the bycatch that is not currently identified by NEON field researchers. "Some people ask, 'will machine learning steal my job?'" says Blair. "But this is not a replacement for human researchers. The goal is to create a machine-assisted data pipeline, where the algorithm will provide an initial classification that can be verified and refined by human experts. This will vastly speed up the identification process so researchers can get to the more interesting part of the job: asking and answering questions about ecosystems."

They expect that the NEON program will continue to play a large part in geographical ecology research. Marshall says, "I know of no other organization that collects so much systematic data on such a broad spatial scale. There is no other comparable dataset that would allow you to answer these questions across such a wide range of habitats, from tropical to tundra. It's really an incredible resource. I don't know how else you would ask these kinds of questions."


Copyright © Battelle, 2025

The National Ecological Observatory Network is a major facility fully funded by the U.S. National Science Foundation.

Any opinions, findings and conclusions or recommendations expressed in this material do not necessarily reflect the views of the U.S. National Science Foundation.