In precision agriculture, the detection and recognition
of insects play an essential role in the ability of crops to
grow healthy and produce a high-quality yield. Current machine vision models require a large volume of data
to achieve high performance. However, there are approximately 5.5 million different insect species in the world.
None of the existing insect datasets covers more than a small fraction of them, owing to varying geographic locations and acquisition costs. In this paper, we introduce a novel “Insect1M” dataset, a large-scale resource for training insect-related foundation models. Covering
a vast spectrum of insect species, our dataset, including 1
million images with dense identification labels of taxonomy
hierarchy and insect descriptions, offers a panoramic view
of entomology, enabling foundation models to comprehend
visual and semantic information about insects like never
before. Then, to efficiently establish an Insect Foundation
Model, we develop a micro-feature self-supervised learning method with a Patch-wise Relevant Attention mechanism capable of discerning the subtle differences among insect images. In addition, we introduce a Description Consistency loss to improve micro-feature modeling via insect
descriptions. Through our experiments, we illustrate the
effectiveness of our proposed approach in insect modeling and achieve state-of-the-art performance on standard
benchmarks of insect-related tasks. Our Insect Foundation
Model and Dataset promise to empower the next generation
of insect-related vision models, bringing them closer to the
ultimate goal of precision agriculture.