Samsung Solve for Tomorrow
Team Lead
The Problem
Earth has lost half its topsoil in the last 150 years. In Pleasanton, where I grew up, erosion has caused entire backyards to collapse into creeks. Current detection relies on manually sampling soil over years, which limits coverage to accessible areas and catches problems late. We wanted to build something that could survey large areas quickly and flag erosion before it caused damage.
Our team of six worked on this across our sophomore and junior years. Researchers at Lawrence Livermore National Laboratory mentored us throughout; their feedback shaped the two-stage architecture and helped define what counts as actionable erosion versus normal soil variation. We also validated the problem by talking with local landowners and city officials to confirm it was real and underserved.
Aerial Detection
The first stage uses a DJI Phantom 2 running DroneLink for automated survey flights. The drone captures aerial imagery with GPS coordinates, and those images go into a Bayesian CNN. The model outputs erosion probabilities rather than binary classifications; ambiguous regions get flagged for ground verification instead of forcing a decision on low-confidence predictions.
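The deferral logic described above can be sketched as a thin decision layer over repeated stochastic forward passes (e.g. Monte Carlo dropout). The function name and the 0.3/0.7 thresholds here are illustrative assumptions, not values from the project:

```python
import numpy as np

def classify_with_uncertainty(passes, p_low=0.3, p_high=0.7):
    """Aggregate Monte Carlo forward passes into one decision.

    passes: erosion probabilities from T stochastic forward passes
    over the same aerial tile (dropout left on at inference).
    Returns (mean_p, std_p, label), where label is "erosion",
    "stable", or "flag_for_ground_check".
    """
    passes = np.asarray(passes, dtype=float)
    mean_p, std_p = passes.mean(), passes.std()
    if mean_p >= p_high:
        label = "erosion"
    elif mean_p <= p_low:
        label = "stable"
    else:
        # Ambiguous region: defer to the ground rover
        label = "flag_for_ground_check"
    return mean_p, std_p, label
```

Tiles whose mean probability falls between the two thresholds become waypoints for the ground stage rather than hard classifications.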
The main bottleneck was training data. Labeled aerial erosion datasets are scarce, and erosion is visually subtle from altitude, which makes it hard to produce confident labels even with manual annotation. We trained on a manually labeled dataset with an 80/20 train/validation split. The Bayesian approach helped here; instead of forcing the model to classify uncertain images, it quantifies its own uncertainty and defers to ground truth collection where needed.
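A reproducible 80/20 split of the labeled tiles is straightforward; this is a generic sketch (the function name and fixed seed are illustrative, not from the project):

```python
import random

def split_dataset(items, train_frac=0.8, seed=42):
    """Shuffle labeled tiles deterministically, then split them
    into training and validation sets by train_frac."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    items = list(items)
    rng.shuffle(items)
    cut = int(len(items) * train_frac)
    return items[:cut], items[cut:]
```

A fixed seed matters with small manually labeled datasets: it keeps the validation set stable across training runs, so metric changes reflect the model rather than the split.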
Ground Verification
Flagged coordinates go to a ground rover for physical confirmation. The rover was designed with GPS navigation, close-range imaging, and soil sampling, using differential steering to handle uneven terrain. The design was fully mapped out, but funding was redirected to the school before the autonomous navigation system could be fabricated.
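Differential steering turns a forward speed and a turn rate into independent left and right wheel speeds. This is the standard differential-drive kinematics, not code from the rover design; the track width value is an assumption:

```python
def differential_drive(v, omega, track_width=0.3):
    """Convert body velocity commands to wheel speeds.

    v: forward speed (m/s)
    omega: turn rate (rad/s, counter-clockwise positive)
    track_width: distance between the wheel centerlines (m)
    Returns (v_left, v_right) in m/s.
    """
    v_left = v - omega * track_width / 2.0
    v_right = v + omega * track_width / 2.0
    return v_left, v_right
```

Driving straight gives equal wheel speeds; turning in place (v = 0) spins the wheels in opposite directions, which is what lets a skid-steer rover pivot on uneven ground.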
Ground-level data either confirms or rejects the aerial classification, and over time that feedback improves the CNN. Surveying from the air first and only sending the rover to flagged locations keeps the cost per acre low compared to sampling an entire site manually. The final output was designed as a color-coded map overlay, so non-technical users could see which zones need attention without interpreting model outputs directly.
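One common way to build such an overlay is to map each tile's erosion probability onto a green-to-yellow-to-red ramp. This is a generic sketch of that mapping, not the project's actual rendering code:

```python
def risk_color(p):
    """Map an erosion probability in [0, 1] to an RGB triple:
    green (stable) through yellow (uncertain) to red (high risk)."""
    p = min(max(p, 0.0), 1.0)  # clamp out-of-range inputs
    if p <= 0.5:
        # Ramp green -> yellow by raising the red channel
        return (int(510 * p), 255, 0)
    # Ramp yellow -> red by lowering the green channel
    return (255, int(255 * (1.0 - p) * 2), 0)
```

Applied per tile, this yields a heat-map layer a landowner can read at a glance: green zones are fine, red zones need attention, and yellow zones are the ones the rover would visit.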
Results
The project reached the California state finals in Samsung's Solve for Tomorrow competition, placing in the top 1% of over 30,000 teams nationwide. The $5,000 grant went to our school's engineering program.
Reflection
If this project were revisited today, the core idea would be more viable than ever; satellite resolution and ML tooling have both improved significantly. The bottleneck would still be dataset quality: labeled aerial erosion data remains genuinely scarce, and erosion's visual subtlety from altitude makes annotation difficult. Most of the upfront effort would need to go into building a reliable labeling pipeline, probably combining satellite datasets with field-validated ground truth, before training the model. The architecture is sound; the data was always the hard part.