Airborne laser scanning (ALS) is a widely adopted remote sensing technology, renowned for its efficient and precise modeling of forests. This is attributed to its capability to accurately describe the geometric features of trees within a forest. However, automating the identification of individual trees and their species from ALS data poses a formidable challenge. Traditional closed-form clustering algorithms yield inaccurate segmentation results, and deep learning-based methods demand substantial amounts of labeled training data, which is impractical to establish manually.
This project aims to tackle the challenges associated with object labeling and accuracy by employing unsupervised and self-supervised approaches. Unsupervised methods will be used to obtain a preliminary segmentation of the ALS data. Subsequently, these roughly segmented tree examples will be hand-labeled and used to train a classifier that identifies well-segmented individual trees. In the final step, these labels will be used to calibrate and refine state-of-the-art segmentation and classification algorithms, employing a semi-supervised approach.
Task Description and Methodology
The focus of this phase of the project is to develop an active learning framework for the task of point cloud segmentation.
Active learning is a paradigm that involves an iterative process where the model actively selects the most informative data points for annotation. The goal is to improve the model’s performance with minimal labeled data, focusing on instances that are challenging or uncertain.
In the context of point cloud segmentation, active learning aims to enhance the accuracy and efficiency of segmentation models by strategically selecting points from the point cloud for annotation. Segmentation involves assigning a label to each point in the cloud, indicating the object or surface it belongs to.
Active learning will be applied to the point cloud segmentation task as follows:
Initial Model Training:
- Train an initial segmentation model on a small labeled dataset derived via an unsupervised method.
- Use this model to make predictions on the unlabeled data.
- Employ uncertainty estimation techniques to identify points where the model is uncertain or likely to make errors. This uncertainty can arise from ambiguous shapes, occlusions, or other challenging scenarios.
- Develop a query strategy to select points that contribute the most to reducing model uncertainty. Common strategies include choosing points with high prediction entropy or low-confidence predictions.
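
As an illustration of the entropy-based query strategy mentioned above, the following is a minimal sketch, not the project's implementation. It assumes the segmentation model outputs a per-point class probability array of shape `(n_points, n_classes)`; the function names `prediction_entropy` and `select_queries` and the `budget` parameter are hypothetical and chosen only for this example.

```python
import numpy as np

def prediction_entropy(probs, eps=1e-12):
    """Shannon entropy per point; probs has shape (n_points, n_classes)."""
    p = np.clip(probs, eps, 1.0)  # avoid log(0)
    return -np.sum(p * np.log(p), axis=1)

def select_queries(probs, budget):
    """Return the indices of the `budget` points with the highest entropy."""
    entropy = prediction_entropy(probs)
    order = np.argsort(-entropy, kind="stable")  # most uncertain first
    return order[:budget]

# Toy class probabilities for four points and three classes (made-up values).
probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # nearly uniform: most uncertain
    [0.60, 0.30, 0.10],
    [0.50, 0.50, 0.00],
])
queried = select_queries(probs, budget=2)
```

In this toy input, the nearly uniform row receives the highest entropy, so it is selected first for annotation. A low-confidence criterion could be obtained in the same way by ranking points on `1 - probs.max(axis=1)` instead.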
Annotation and Model Update:
- The selected points will be relabeled by hand.
- Update the model with the newly annotated data and repeat the training process.
- Repeat the process iteratively, selecting new points for annotation based on the updated model.
- Over time, the model should become more accurate with fewer labeled examples.
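
The iterative procedure above can be sketched as a small loop. This is only a toy illustration under strong assumptions: the "point features" are synthetic Gaussian blobs, a nearest-centroid classifier stands in for the segmentation model, the initial labels are taken from the ground truth rather than from an unsupervised method, and the hand annotation step is simulated by looking up the true label. All names (`fit_centroids`, `predict_proba`, the batch size of 10, the 5 rounds) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for per-point features: two Gaussian blobs (two classes).
X = np.vstack([rng.normal(-1.0, 1.0, (200, 3)), rng.normal(1.0, 1.0, (200, 3))])
y_true = np.repeat([0, 1], 200)  # plays the role of the human annotator

def fit_centroids(X, y):
    """'Train' the stand-in model: one centroid per class."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict_proba(centroids, X):
    """Softmax over negative distances to each class centroid."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    logits = -d
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

# Small initial labeled set containing both classes.
labeled = np.zeros(len(X), dtype=bool)
labeled[[0, 1, 2, 200, 201, 202]] = True

accuracies = []
for round_idx in range(5):
    # Train on the current labels, then predict on all points.
    centroids = fit_centroids(X[labeled], y_true[labeled])
    probs = predict_proba(centroids, X)
    accuracies.append(float(np.mean(probs.argmax(axis=1) == y_true)))

    # Uncertainty: prediction entropy, ignoring already-labeled points.
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    entropy[labeled] = -np.inf

    # Query the 10 most uncertain points and "annotate" them.
    query = np.argsort(-entropy)[:10]
    labeled[query] = True
```

After the loop, `labeled.sum()` gives the total annotation budget spent and `accuracies` records how the stand-in model performed in each round. In a real setting the annotation step would be a manual labeling session and the model a point cloud segmentation network.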
Available Data
- 10 cm GSD airborne orthophoto / LiDAR point cloud (~20 points/m²)
- ~700 in-situ localized tree species observations
- 10 cm imaging spectroscopy / multispectral image data
Deliverables
- Report summarizing the findings of the investigation
- Code implementation published to the lab GitLab account
- Output data prepared in a format readable by standard GIS software
Prerequisites
- Experience with Python, deep learning concepts, common deep learning frameworks (PyTorch, TensorFlow, etc.), and computer vision.
Interested candidates are kindly asked to email us their CV/GitHub profile and a short motivation statement.
Jesse Lahaye, Laurent Jospin