Harmonic NDVI Time Series Clustering with Python and GEE
Hello, in this post we‘ll use the NDVI temporal variation of different types of use and coverage of our study area to create harmonic time series. After that, we ‘ll use an algorithm to cluster the samples into three classes. For this task, let’s create the script on google Colab, using the following packages: Google Earth Engine Python API for obtaining spectral information, ipygee for converting information into a pandas dataframe, folium for visualization and tslearn for implementing the cluster algorithm.
Installing the packages
Let’s install the necessary packages:
!pip install rasterio
!pip install ipygee
!pip install tslearn
!pip install earthengine-api
!pip install geopandas
Now, let’s import and authenticate the Google Earth Engine python API:
import ee
ee.Authenticate()
ee.Initialize()
Then we can import the rest of the packages:
import folium
from folium import plugins
from IPython.display import Image
import geopandas as gpd
import json
print(folium.__version__)
from ipygee import*
import math
import pandas as pd
from tslearn.clustering import TimeSeriesKMeans
from tslearn.utils import to_time_series_dataset
Region of Study
First, let’s define our study area. Our AOI is located in Brazil, in the city of Lucas do Rio Verde-MT, a region with large soyabeans crop fields. We’ll also use a GEE function to create random points in our AOI:
AOI = ee.Geometry.Polygon(
[[[-56.37397887990624, -12.737207954893526],
[-56.37397887990624, -13.300834158918619],
[-55.37113311574608, -13.300834158918619],
[-55.37113311574608, -12.737207954893526]]])points = ee.FeatureCollection.randomPoints(AOI,100)
After defining the study area, the start and end date of the time series was selected:
months = ee.List.sequence(1,12)
years = ee.List.sequence(2016, 2019)
NDVI MODIS Collection
The dataset used will be the collection of daily NDVI images from the MODIS satellite. The Normalized Difference Vegetation Index is generated from the Near-IR and Red bands of each scene as (NIR — Red) / (NIR + Red), and ranges in value from -1.0 to 1.0. This product is generated from the MODIS/006/MOD09GA surface reflectance composites.
More in:
MD_NDVI = ee.ImageCollection('MODIS/MOD09GA_006_NDVI')
.filterDate('2016-1-1','2019-12-31')
.filterBounds(AOI).select('NDVI')modis_ndvi = MD_NDVI.median().clip(AOI)
mean_ndvi = MD_NDVI.mean().clip(AOI)
Thus, we can use the folium to interactively visualize the points generated and an image of the median of the NDVI of AOI for the period of analysis:
vis_params = {'min': 0, 'max': 1, 'palette':
['red', 'yellow','green']}basemaps = {
'Google Maps': folium.TileLayer(
tiles = 'https://mt1.google.com/vt/lyrs=m&x={x}&y={y}&z={z}',
attr = 'Google',
name = 'Google Maps',
overlay = True,
control = True
),
'Google Satellite': folium.TileLayer(
tiles = 'https://mt1.google.com/vt/lyrs=s&x={x}&y={y}&z={z}',
attr = 'Google',
name = 'Google Satellite',
overlay = True,
control = True
)}def add_ee_layer(self, ee_object, vis_params, name):
try:
if isinstance(ee_object, ee.image.Image):
map_id_dict = ee.Image(ee_object).getMapId(vis_params)
folium.raster_layers.TileLayer(
tiles = map_id_dict['tile_fetcher'].url_format,
attr = 'Google Earth Engine',
name = name,
overlay = True,
control = True
).add_to(self)
elif isinstance(ee_object, ee.imagecollection.ImageCollection):
ee_object_new = ee_object.mosaic()
map_id_dict = ee.Image(ee_object_new).getMapId(vis_params)
folium.raster_layers.TileLayer(
tiles = map_id_dict['tile_fetcher'].url_format,
attr = 'Google Earth Engine',
name = name,
overlay = True,
control = True
).add_to(self)
elif isinstance(ee_object, ee.geometry.Geometry):
folium.GeoJson(
data = ee_object.getInfo(),
name = name,
overlay = True,
control = True
).add_to(self)
elif isinstance(ee_object,ee.featurecollection.FeatureCollection):
ee_object_new = ee.Image().paint(ee_object, 0, 2)
map_id_dict = ee.Image(ee_object_new).getMapId(vis_params)
folium.raster_layers.TileLayer(
tiles = map_id_dict['tile_fetcher'].url_format,
attr = 'Google Earth Engine',
name = name,
overlay = True,
control = True
).add_to(self)
except:
print("Could not display {}".format(name))folium.Map.add_ee_layer = add_ee_layer
my_map = folium.Map(location=[-13.0912068,-55.9881647],
zoom_start=10)
basemaps['Google Maps'].add_to(my_map)
my_map.add_ee_layer(modis_ndvi, vis_params, 'NDVI')
my_map.add_ee_layer(points.geometry(), {}, 'Points')
my_map.add_child(folium.LayerControl())
display(my_map)
Let’s define a function to return the NDVI monthly median, generating a time series between January 2016 and December 2019.
def monthly(collection):
img_coll = ee.ImageCollection([])
for y in years.getInfo():
for m in months.getInfo():
filtered = collection.filter(ee.Filter.calendarRange(y,y,'year'))
.filter(ee.Filter.calendarRange(m, m, 'month'))
filtered = filtered.median()
img_coll = img_coll.merge(filtered.set('year', y)
.set('month', m)
.set('system:time_start',
ee.Date.fromYMD(y, m, 1)
.getInfo()['value']))
return img_collMonthly_MD = monthly(MD_NDVI)
We can use ipygee to generate a chart of the NDVI variation in the analysis period for any point in the AOI:
Point_1 = ee.Geometry.Point([-56.1070717726807,-13.168187045136754])MD_ndvi = chart.Image.series(**{'imageCollection': Monthly_MD,
'region': Point_1,
'reducer': ee.Reducer.mean()
'scale': 250,
'xProperty': 'system:time_start'})MD_ndvi.renderWidget(width='50%')
Harmonic Model
The harmonic analysis of the NDVI time series aims to characterize seasonal changes in the spectral behavior of targets by decomposing the time series into harmonic terms. Similar to principal component analysis, the majority of the variance in a data set is contained in the first few terms (components). Over 90% of the variance in the original time series was captured in the additive and the first two harmonic terms. The additive term, and the phase and amplitude for the first two harmonic terms were extracted and used for further analysis.
“The amplitude corresponds to half the value at which the function is maximized, and the phase is the displacement between the origin and the peak of the wave in the range of 0 to 2π. Each harmonic term represents the number of cycles completed by a wave, in a given time interval, and is responsible for a percentage of the total variance of the original time series. Thus, the first harmonic has a period T equal to the total period, the second harmonic corresponds to half the period of the first harmonic T / 2, the third harmonic to T / 3 and so on.” Antunes et al.,(2016) .
Now we will implement this using some functions of the Google Earth Engine API. This implementation is available at:
https://docs.google.com/document/d/1mNIRB90jwLuASO1JYas1kuOXCLbOoy1Z4NlV1qIXM10/edit
First, we define the band to be used in the time series, the number of harmonic terms:
dependent = 'NDVI'
harmonics = 3
harmonicFrequencies = list(range(1, harmonics+1))def getNames (base, lst_freq) :
name_lst = []
for i in lst_freq:
name_lst.append(ee.String(base + str(i)))
return name_lstcosNames = getNames('cos_', harmonicFrequencies);
sinNames = getNames('sin_', harmonicFrequencies);
independents = ee.List(['constant','t'])
.cat(cosNames).cat(sinNames);
Let’s define the functions to add the harmonic and temporal components to the MODIS image collection. We’ll define the functions to add the harmonic and temporal components to the MODIS image collection. Then we can apply them and fit the terms using linear regression:
def addConstant (image) :
return image.addBands(ee.Image(1));def addTime (image) :
date = ee.Date(image.get('system:time_start'));
years = date.difference(ee.Date('1970-01-01'), 'year');
timeRadians = ee.Image(years.multiply(2 * math.pi));
return image.addBands(timeRadians.rename('t').float());def addHarmonics (image) :
frequencies = ee.Image.constant(harmonicFrequencies)
time = ee.Image(image).select('t')
cosines = time.multiply(frequencies).cos().rename(cosNames)
sines = time.multiply(frequencies).sin().rename(sinNames)
return image.addBands(cosines).addBands(sines)harmonicMODIS2 = Monthly_MD.map(addConstant)
.map(addTime)
.map(addHarmonics);harmonicTrend = harmonicMODIS2.select(independents.add(dependent))
.reduce(ee.Reducer.linearRegression(independents.length(), 1))harmonicTrendCoefficients = harmonicTrend.select('coefficients')
.arrayProject([0])
.arrayFlatten([independents]);fittedHarmonic = harmonicMODIS2.map(lambda image : image
.addBands(image.select(independents)
.multiply(harmonicTrendCoefficients)
.reduce('sum')
.rename('fitted')));
It is thus possible to generate a graph of the harmonic model for the same point previously used:
MD_ndvi = chart.Image.series(**{'imageCollection': fittedHarmonic, 'region': Point_1,
'reducer': ee.Reducer.mean(),
'bands' : 'fitted',
'scale': 250,
'xProperty': 'system:time_start'})MD_ndvi.renderWidget(width='50%')
Time Series Clustering
After obtaining the harmonic model, let’s convert the time series of each random point into a Pandas dataframe and use the TimeSeriesKMeans algorithm to cluster the samples.
df_points = pd.DataFrame([])
for n in range(1,points.size().getInfo() + 1):
feat = ee.Geometry.Point(points.getInfo()
["features"][n-1]['geometry']['coordinates'])
point_ndvi = chart.Image.series(**{
'imageCollection': fittedHarmonic,
'region': feat,
'reducer': ee.Reducer.mean(),
'bands' : 'fitted',
'scale': 250,
'xProperty': 'system:time_start'})
df = point_ndvi.dataframe
df = df.transpose().copy()
df_points = df_points.append(df, ignore_index=True)X = df_points.values
formatted_X = to_time_series_dataset(X)
print(formatted_X.shape)km = TimeSeriesKMeans(n_clusters=3, metric="softdtw")
labels = km.fit_predict(formatted_X)newdata = gpd.GeoDataFrame.from_features(points.getInfo()
["features"])
newdata['Class'] = labels
Finally, we can generate an RGB image with the amplitude, phase and mean NDVI of each pixel, converting these values from HSV space to RGB. Then the result is shown on the map with the classes defined by the cluster algorithm.
phase = harmonicTrendCoefficients.select('sin_1')
.atan2(harmonicTrendCoefficients.select('cos_1'))
.unitScale(-math.pi, math.pi)amplitude = harmonicTrendCoefficients.select('sin_1')
.hypot(harmonicTrendCoefficients.select('cos_1'))
.multiply(5)rgb = ee.Image.cat([phase, amplitude, mean_ndvi]).hsvToRgb()resultmap = folium.Map(location=[-13.0912068,-55.9881647],
zoom_start=10)basemaps['Google Satellite Hybrid'].add_to(resultmap)
latitudes = list(newdata.geometry.y.values)
longitudes = list(newdata.geometry.x.values)
labels = list(newdata['Class'].values)for lat, lng, label in zip(latitudes, longitudes, labels):
if label == 0:
folium.Marker(
location = [lat, lng],
popup = str(label),
icon = folium.Icon(color='red')
).add_to(resultmap) elif label == 1:
folium.Marker(
location = [lat, lng],
popup = str(label),
icon = folium.Icon(color='blue')
).add_to(resultmap) else:
folium.Marker(
location = [lat, lng],
popup = str(label),
icon = folium.Icon(color='green')
).add_to(resultmap)vis_params = {'min': 0, 'max': 1}resultmap.add_ee_layer(rgb, {}, 'phase (hue), amplitude (sat), ndvi (value)')resultmap.add_child(folium.LayerControl())
display(resultmap)
Conclusion
Harmonic models are very useful to smooth the spectral curves of time series, enabling differentiation between targets, especially agricultural crops, with phenological cycles that generate curves with different phases and amplitudes. In addition to NDVI, it is also possible to use other spectral indices in order to group the selected samples. Because it uses an unsupervised algorithm and high resolution images, this example may contain some prediction errors, but the intent is to present the methodology and implementation for future analysis and learning.
Thanks!
Follow me on LinkedIn :
https://www.linkedin.com/in/jo%C3%A3o-otavio-firigato-4876b3aa/
References:
https://www.researchgate.net/publication/222411408
https://www.sciencedirect.com/science/article/pii/S0924271618300066
https://docs.google.com/document/d/1mNIRB90jwLuASO1JYas1kuOXCLbOoy1Z4NlV1qIXM10/edit