Benchmarking tree species classification from proximally-sensed laser scanning data: introducing the FOR-species20K dataset

Authors: Puliti, Stefano; Lines, Emily R.; Müllerová, Jana; Frey, Julian; Schindler, Zoe; Straker, Adrian; Allen, Matthew J.; Winiwarter, Lukas; Rehush, Nataliia; Hristova, Hristina; Murray, Brent; Calders, Kim; Terryn, Louise; Coops, Nicholas; Höfle, Bernhard; Junttila, Samuli; Krůček, Martin; Krok, Grzegorz; Král, Kamil; Levick, Shaun R.; Luck, Linda; Missarov, Azim; Mokroš, Martin; Owen, Harry J. F.; Stereńczak, Krzysztof; Pitkänen, Timo P.; Puletti, Nicola; Saarinen, Ninni; Hopkinson, Chris; Torresan, Chiara; Tomelleri, Enrico; Weiser, Hannah; Astrup, Rasmus
Publication year: 2024
Subject:
Document type: Working Paper
Description: Proximally-sensed laser scanning offers significant potential for automated forest data capture, but challenges remain in automatically identifying tree species without additional ground data. Deep learning (DL) shows promise for automation, yet progress is slowed by the lack of large, diverse, openly available labeled datasets of single tree point clouds. This has limited the robustness of DL models and the ability to establish best practices for species classification. To overcome these challenges, the FOR-species20K benchmark dataset was created, comprising over 20,000 tree point clouds from 33 species, captured using terrestrial (TLS), mobile (MLS), and drone laser scanning (ULS) across various European forests, with some data from other regions. This dataset enables the benchmarking of DL models for tree species classification, including both point cloud-based (PointNet++, MinkNet, MLP-Mixer, DGCNNs) and multi-view image-based methods (SimpleView, DetailView, YOLOv5). 2D image-based models generally performed better (average overall accuracy, OA = 0.77) than 3D point cloud-based models (average OA = 0.72), with consistent results across different scanning platforms and sensors. The top model, DetailView, was particularly robust, handling data imbalances well and generalizing effectively across tree sizes. The FOR-species20K dataset, available at https://zenodo.org/records/13255198, is a key resource for developing and benchmarking DL models for tree species classification using laser scanning data, providing a foundation for future advancements in the field.
Database: arXiv