The CalBug Project

Digitizing California Terrestrial Arthropod Collections

Principal Investigator: Rosemary Gillespie

Project Coordinators: Gordon Nishida and Peter Oboyski

Graduate Student Researchers: Joanie Ball and Meghan Culpepper

Collaborators: California State Collection of Arthropods, UC Davis Bohart Museum, California Academy of Sciences, Santa Barbara Museum of Natural History, UC Santa Cruz Museum of Natural History, LA County Museum, San Diego Natural History Museum, and the UC Riverside Entomology Research Museum

UC Berkeley's CalBug Team
Front Row: Ginger Haight, Asia Kwan, Meghan Culpepper, Hanna Huynh, Skyler Valle
Back Row: Gordon Nishida, Pete Oboyski, Frank Ngo, Hannah Shin, Kent Nguyen

Worldwide, natural history collections house over a billion insect specimens collected over several centuries. Specimen labels provide data denoting species, location, and date that can be used to study biogeographic patterns, the spread of invasive species, and responses to environmental changes. However, access to these data is impractical for most of the research community. Because of enormous collection sizes, entomology has lagged behind other disciplines in digitizing collections.

In 2010, the Essig Museum, along with eight other California museums, began a collaborative five-year project with the goal of digitizing and geographically referencing over one million specimens from target groups and localities. A major goal of the project is streamlining data capture while maintaining data integrity and specimen safety. Digital imaging of labels reduces the need for repeated handling of specimens, allows enlargement of difficult-to-read text, and enables outsourcing of data transcription. Transcription is aided by a web-based citizen science program developed in collaboration with the Citizen Science Alliance. Data from each label are keyed by multiple volunteers, and the resulting transcriptions are compared for consistency. Data are then vetted in-house, normalized for variations in locality descriptions, and repatriated to an online, open-access MySQL data cache. Georeferencing of localities will occur later using semi-automated services, such as BioGeomancer, following standard protocols.
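As an illustration of how multiple volunteer transcriptions of one label might be compared for consistency, here is a minimal Python sketch. It is not the project's actual pipeline: the function names, the whitespace-and-case normalization, and the two-vote agreement threshold are all assumptions made for the example.

```python
from collections import Counter


def normalize(text):
    # Assumed normalization: collapse runs of whitespace and ignore case.
    return " ".join(text.split()).lower()


def consensus(transcriptions, min_agreement=2):
    """Return (most common normalized value, whether enough volunteers agreed).

    min_agreement is a hypothetical threshold, not a documented project setting.
    """
    if not transcriptions:
        return None, False
    counts = Counter(normalize(t) for t in transcriptions)
    value, votes = counts.most_common(1)[0]
    return value, votes >= min_agreement


# Three volunteers key the same locality field; two agree after normalization.
value, agreed = consensus(["Berkeley,  CA", "berkeley, ca", "Berkley, CA"])
print(value, agreed)
```

Labels whose transcriptions fail to reach the threshold could be set aside for the in-house vetting step described above.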

Photocapture of one of Essig's specimens, the European honeybee

The greatest bottleneck in digitizing collections is the handling of individual specimens and labels. Using high-efficiency workstations, labels are removed from pins, digitally imaged with unique identifiers, repinned and returned to their trays. Annotation of images and naming of files are done with batch processing software.
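Batch naming of image files can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SPEC` prefix, the seven-digit zero padding, and the function name `plan_renames` are hypothetical and do not reflect the identifier scheme or software the project actually uses.

```python
from pathlib import Path


def plan_renames(image_names, prefix, start):
    """Map each camera file name to a name built from a sequential identifier.

    prefix and start are hypothetical parameters: the identifier format here
    is an assumption, not the project's real catalog numbering.
    """
    plan = {}
    # Sort so that identifiers follow the order in which images were captured,
    # assuming the camera numbers its files sequentially.
    for offset, name in enumerate(sorted(image_names)):
        suffix = Path(name).suffix.lower()
        plan[name] = f"{prefix}{start + offset:07d}{suffix}"
    return plan


plan = plan_renames(["IMG_0002.JPG", "IMG_0001.JPG"], prefix="SPEC", start=42)
for old_name, new_name in plan.items():
    print(old_name, "->", new_name)
```

The sketch only builds a rename plan and does not touch the file system, so the mapping can be reviewed before any files are changed.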

For more information, visit the CalBug website.
To get involved as a volunteer, contact Gordon Nishida.