Solar Panel Detection with Faster RCNN 🛰️🌎🤖

May 31, 2022

Hello everyone! In this work we are going to detect solar panels in high-resolution satellite images. For this we will apply Faster R-CNN, a deep learning object detector that uses convolutional layers to extract features from an image and find the position of the objects in it.

Solar power is simply usable energy generated from the sun in the form of electric or thermal energy. Solar energy is captured in a variety of ways, the most common of which is with a photovoltaic solar panel system, or PV system, that converts the sun’s rays into usable electricity. Aside from using photovoltaics to generate electricity, solar energy is commonly used in thermal applications to heat indoor spaces or fluids. Residential and commercial property owners can install solar hot water systems and design their buildings with passive solar heating in mind to fully take advantage of the sun’s energy with solar technology.

Object detection

Object detection is a computer vision technique for locating instances of objects in images or videos. Object detection algorithms typically leverage machine learning or deep learning to produce meaningful results. When humans look at images or video, we can recognize and locate objects of interest within a matter of moments. The goal of object detection is to replicate this intelligence using a computer.

Object detection is commonly confused with image recognition, so before we proceed, it’s important that we clarify the distinctions between them.

Image recognition assigns a label to an image. A picture of a dog receives the label “dog”. A picture of two dogs still receives the label “dog”. Object detection, on the other hand, draws a box around each dog and labels each box “dog”. The model predicts where each object is and what label should be applied. In that way, object detection provides more information about an image than recognition does.

Here’s an example of how this distinction looks in practice:

Deep learning-based object detection models typically have two parts. An encoder takes an image as input and runs it through a series of blocks and layers that learn to extract statistical features used to locate and label objects. Outputs from the encoder are then passed to a decoder, which predicts bounding boxes and labels for each object.

The simplest decoder is a pure regressor. The regressor is connected to the output of the encoder and predicts the location and size of each bounding box directly. The output of the model is the X, Y coordinate pair for the object and its extent in the image. Though simple, this type of model is limited. You need to specify the number of boxes ahead of time. If your image has two dogs, but your model was only designed to detect a single object, one will go unlabeled. However, if you know the number of objects you need to predict in each image ahead of time, pure regressor-based models may be a good option.
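To make the fixed-box limitation concrete, here is a minimal sketch of how the flat output of a pure regressor could be decoded. The names `NUM_BOXES` and `decode_regressor_output` and the example numbers are ours, chosen only for illustration; they do not come from any particular library.

```python
from typing import List, Tuple

Box = Tuple[float, ...]  # (x, y, width, height)

# Chosen when the model is designed; it cannot change per image (hypothetical value).
NUM_BOXES = 1


def decode_regressor_output(output: List[float], num_boxes: int = NUM_BOXES) -> List[Box]:
    """Split a flat output vector into num_boxes (x, y, width, height) tuples."""
    if len(output) != 4 * num_boxes:
        raise ValueError(f"expected {4 * num_boxes} values, got {len(output)}")
    return [tuple(output[i:i + 4]) for i in range(0, len(output), 4)]


# A model built for one box always yields exactly one box,
# even if the image contains two dogs.
boxes = decode_regressor_output([120.0, 80.0, 40.0, 30.0])
```

Because the output vector has a fixed length, the only way to detect a second object is to redesign the model with a larger `num_boxes`.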

An extension of the regressor approach is a region proposal network. In this decoder, the model proposes regions of an image where it believes an object might reside. The pixels belonging to these regions are then fed into a classification subnetwork that either assigns a label or rejects the proposal. The benefit of this method is a more accurate, flexible model that can propose an arbitrary number of regions that may contain an object. The added accuracy, though, comes at the cost of computational efficiency.
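The label-or-reject step can be sketched in a few lines. This is only an illustration: the function `label_proposals`, the 0.5 threshold, and the boxes and scores are made up for the example, and in a real network the scores would come from the classification subnetwork.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def label_proposals(
    proposals: List[Box],
    class_scores: List[Dict[str, float]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the article
) -> List[Tuple[Box, str, float]]:
    """Keep every proposal whose best class is not background and scores high enough."""
    detections = []
    for box, scores in zip(proposals, class_scores):
        label, score = max(scores.items(), key=lambda item: item[1])
        if label != "background" and score >= threshold:
            detections.append((box, label, score))
    return detections


# Three proposed regions with made-up scores: two dogs and one empty patch.
proposals = [(10, 10, 50, 40), (200, 120, 260, 170), (300, 300, 320, 310)]
class_scores = [
    {"background": 0.10, "dog": 0.90},
    {"background": 0.20, "dog": 0.80},
    {"background": 0.95, "dog": 0.05},
]
detections = label_proposals(proposals, class_scores)
```

Unlike the pure regressor, the number of returned boxes depends on the image: here two proposals are labeled and one is rejected.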

Faster R-CNN

Faster R-CNN, first published in 2015, is the most widely used version of the R-CNN family. Across the R-CNN family of papers, each version mainly improved computational efficiency (by integrating the different training stages), reduced test time, and raised detection performance (mAP). These networks usually consist of: a) a region proposal algorithm that generates “bounding boxes”, or locations of possible objects in the image; b) a feature generation stage that extracts features of these objects, usually with a CNN; c) a classification layer that predicts which class each object belongs to; and d) a regression layer that makes the coordinates of the object’s bounding box more precise.
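The regression layer in d) does not output a box directly. In the R-CNN papers it predicts four offsets relative to the proposal, which are then applied to the proposal's center and size. A minimal sketch of that decoding step, with a function name and example values of our own:

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (center_x, center_y, width, height)


def refine_box(proposal: Box, deltas: Tuple[float, float, float, float]) -> Box:
    """Apply predicted offsets (tx, ty, tw, th) to a proposal box.

    The center moves by a fraction of the proposal's size, and the
    width and height are rescaled through an exponential.
    """
    cx, cy, w, h = proposal
    tx, ty, tw, th = deltas
    return (cx + tx * w, cy + ty * h, w * math.exp(tw), h * math.exp(th))


# Illustrative values: shift the box right by a quarter of its width and
# up by half of its height, keeping the size unchanged (tw = th = 0).
refined = refine_box((100.0, 50.0, 40.0, 20.0), (0.25, -0.5, 0.0, 0.0))
```

Predicting relative offsets rather than absolute pixels keeps the regression targets on a similar scale for small and large objects.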

The only stand-alone portion of the network left in Fast R-CNN was the region proposal algorithm. Both R-CNN and Fast R-CNN use CPU-based region proposal algorithms, e.g. Selective Search, which takes around 2 seconds per image. The Faster R-CNN paper fixes this by using another convolutional network, the Region Proposal Network (RPN), to generate the region proposals. This not only brings the region proposal time down from 2 s to 10 ms per image but also lets the region proposal stage share layers with the detection stages that follow, improving the overall feature representation. In the rest of the article, “Faster R-CNN” usually refers to a detection pipeline that uses the RPN as the region proposal algorithm and Fast R-CNN as the detector network.
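For readers curious how the RPN proposes regions: in the Faster R-CNN paper it scores a fixed set of reference boxes, called anchors, at every feature-map location, using 3 scales and 3 aspect ratios (9 anchors per location). The sketch below generates those anchors for a single location. The function name and the example center are ours, and the default scales follow the paper rather than anything stated in this article.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def make_anchors(
    center_x: float,
    center_y: float,
    scales: Tuple[float, ...] = (128, 256, 512),
    ratios: Tuple[float, ...] = (0.5, 1.0, 2.0),  # ratio = height / width
) -> List[Box]:
    """Return one anchor per (scale, ratio) pair, centered on the given point.

    Each anchor covers roughly scale * scale pixels; the ratio only changes
    its shape.
    """
    anchors = []
    for scale in scales:
        for ratio in ratios:
            width = scale / math.sqrt(ratio)
            height = scale * math.sqrt(ratio)
            anchors.append((
                center_x - width / 2,
                center_y - height / 2,
                center_x + width / 2,
                center_y + height / 2,
            ))
    return anchors


# Anchors around an arbitrary point; some extend past the image border.
anchors = make_anchors(400.0, 300.0)
```

The RPN then predicts, for each anchor, an objectness score and a set of box offsets, and the best-scoring anchors become the region proposals.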

Further reading: “R-CNN, Fast R-CNN, Faster R-CNN, YOLO: Object Detection Algorithms” (towardsdatascience.com)

Data

The data used in this work consists of high-resolution RGB satellite images with a size of 901x791 pixels, together with annotations giving the class (solar panel) and the bounding-box coordinates.
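As an illustration of what one annotation might look like, here is a small sketch with a sanity check against the image size. The dictionary layout, the `(x1, y1, x2, y2)` corner format, the reading of 901x791 as width x height, and the example coordinates are all assumptions made for this example, not the actual format of the dataset.

```python
from typing import Dict

IMAGE_WIDTH, IMAGE_HEIGHT = 901, 791  # assuming 901x791 means width x height


def is_valid_annotation(
    annotation: Dict,
    width: int = IMAGE_WIDTH,
    height: int = IMAGE_HEIGHT,
) -> bool:
    """Check the class name and that the box is non-empty and inside the image."""
    x1, y1, x2, y2 = annotation["bbox"]
    inside = 0 <= x1 < x2 <= width and 0 <= y1 < y2 <= height
    return annotation["label"] == "solar panel" and inside


# Made-up annotation for illustration only.
example = {"label": "solar panel", "bbox": (412, 230, 468, 275)}
ok = is_valid_annotation(example)
```

A check like this is cheap to run over the whole dataset before training and catches boxes that were saved with swapped corners or that fall outside the image.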

Training

To train the object detection model, we used the PyTorch implementation of Faster R-CNN. Training ran on Google Colab Pro, which gives access to a GPU and a good amount of RAM. The model was trained for 100 epochs.
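A training setup along these lines could look like the sketch below. It assumes torchvision's Faster R-CNN with a ResNet-50 FPN backbone (torchvision 0.13 or later), which may differ from the exact implementation, backbone, and pretrained weights used in this work; the function names are ours.

```python
def build_model(num_classes=2):
    """Create an untrained Faster R-CNN with a ResNet-50 FPN backbone.

    num_classes includes the background, so 2 = background + solar panel.
    """
    # Imported here so the helpers below can be reused without torchvision.
    from torchvision.models.detection import fasterrcnn_resnet50_fpn

    # weights=None and weights_backbone=None skip all pretrained downloads.
    return fasterrcnn_resnet50_fpn(
        weights=None, weights_backbone=None, num_classes=num_classes
    )


def collate_fn(batch):
    """Keep images and targets as tuples, since the box count differs per image."""
    return tuple(zip(*batch))


def train_one_epoch(model, optimizer, data_loader, device):
    """Run one pass over the data and return the mean training loss."""
    model.train()
    running_loss = 0.0
    for images, targets in data_loader:
        images = [image.to(device) for image in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
        # In training mode the detection model returns a dict of loss terms.
        loss_dict = model(images, targets)
        loss = sum(loss_dict.values())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    return running_loss / max(len(data_loader), 1)
```

With a `DataLoader` built using `collate_fn=collate_fn`, where each target is a dict holding `boxes` and `labels` tensors, calling `train_one_epoch` inside `for epoch in range(100)` would mirror the 100 epochs mentioned above.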

Results

After training, we apply our model to test images to verify the detections and compare them with the original annotations. The boxes in green are the original annotations and the boxes in red are the model’s predictions:

The model identified the solar panels correctly, but one false detection occurred in this example image. Collecting more images, applying data augmentation, or training for more epochs might reduce this kind of error.
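The comparison between predictions and annotations can also be done numerically with Intersection over Union (IoU). The sketch below counts a prediction as correct when it overlaps a not-yet-matched annotation with IoU of at least 0.5, and as a false detection otherwise. The 0.5 threshold is a common convention that we assume here, and the boxes are made-up examples rather than results from our model.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two corner-format boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def split_detections(
    predictions: List[Box], annotations: List[Box], threshold: float = 0.5
) -> Tuple[List[Box], List[Box]]:
    """Return (correct detections, false detections) by greedy IoU matching."""
    matched = set()
    correct, false = [], []
    for pred in predictions:
        best_idx, best_iou = None, 0.0
        for idx, truth in enumerate(annotations):
            overlap = iou(pred, truth)
            if idx not in matched and overlap > best_iou:
                best_idx, best_iou = idx, overlap
        if best_idx is not None and best_iou >= threshold:
            matched.add(best_idx)  # each annotation can be matched only once
            correct.append(pred)
        else:
            false.append(pred)
    return correct, false


# One annotated panel, one good prediction, one prediction on empty ground.
annotations = [(10, 10, 50, 50)]
predictions = [(12, 12, 52, 52), (200, 200, 240, 240)]
correct, false = split_detections(predictions, annotations)
```

Counting the false detections over the whole test set, rather than inspecting single images, would show whether more data, augmentation, or longer training actually helps.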

For more case studies, follow me here or on LinkedIn.

Thanks!!