Endmember Extraction from Hyperspectral Imagery Using N-FINDR

Hyperspectral imagery often contains mixed pixels—pixels that represent areas with more than one class of interest. Given the complexity of the scenes, and the rigid structure of images, it is unrealistic to assume every pixel to correspond to a single class. These classes could include features like water, trees, shrubs, grass, soil, or roads when performing land cover classification using satellite or airborne imagery. In earth science, the class of interest often is minerals that forms rocks. In any way, trying to find the purest pixels , or endmembers, of each class in our image is something that often useful for further analysis.

In this post, I’ll provide a brief overview of one endmember extraction algorithm reffered to as N-FINDR. But, before diving into the details of the algorithm, it is important to acknowledge the general limitations of endmember extraction algorithms:

Purity is not guaranteed: These algorithms don’t guarantee that the extracted spectra represent a pure spectra of specific class of interest. Instead, they identify the purest spectra present in the image.
Known number of endmembers: The number of endmembers in the image must be known beforehand. This can be estimated using domain expertise, auxiliary measurements, other algorithms, or trial and error.
Linear mixture model assumption: Most endmember extraction algorithms, including N-FINDR, assume a linear mixture model.

How N-FINDR Works

The core idea of N-FINDR is simple yet clever: it identifies the vertices of the simplex formed by the data distribution in reduced dimensions. These vertices are the purest spectra present in the dataset. The general steps of the algorithm is as follows:

Reprojection: Reproject all pixels into a lower-dimensional space using an orthogonal transformation like Principle Component Analysis (PCA) or Minimum Noise Fraction (MNF). MNF is often preferred because it minimizes the influence of noise.
Dimensionality reduction: Reduce the dimensionality to the first \((n−1)\) components, where \(n\) is the number of endmembers. At this stage, the data points should lie inside an \((n-1)\) dimensional simplex.
Iterative vertex selection:
- Randomly select \(n\) points and calculate the simplex’s volume (or area in 2D).
- Iteratively replace one vertex with another pixel. If the new simplex’s volume is larger, accept the replacement. Otherwise, reject it and try again.
Termination: After a fixed number of iterations without improvement, the algorithm terminates, and the final set of pixels is taken as the endmembers.
Reprojection back: Reproject the selected pixels into the original dimensional space to obtain the endmember spectra.

Synthetic Data Example

To illustrate, let’s consider an idealized case of 1,000 synthetic mixed pixels generated from three endmembers. Since we’re dealing with three endmembers, we reproject the data into its first two principal components. In this reduced space, the data points lie perfectly on a 2D simplex—a triangle as shown in the image below:

THis image shows principal component analysis (PCA) of a synthetic mixture of three endmembers, where the data points form a 2D simplex, represented as a triangle.

Image 1. First two principal components of the synthetic mixture of three endmembers

If each data point is a linear mixture of three pure endmembers, the endmembers will be located at the vertices of a triangular region, commonly referred to as a simplex. In this 2D example, identifying the simplex is relatively straightforward through visual inspection. However, as the number of endmembers increases, the dimensionality of the principal component representation also grows, making it increasingly difficult to determine the simplex visually. To overcome this challenge, we employ an iterative simplex-finding algorithm, which can be generalized to handle higher-dimensional data. The animation below illustrates the iterative process of refining the simplex:

Animation illustrating the iterative process of the N-FINDR algorithm, refining the simplex to identify the purest endmember spectra in the data.

Image 2. Ilustration of NFINDR algorithm in action

Real Data Example

In real-world scenarios, hyperspectral data is rarely as ideal as synthetic examples. Complex noise, non-linearities, and other factors often result in an imperfectly formed simplex. Nevertheless, algorithms like N-FINDR can still provide valuable output.

For this example, we examine laboratory-acquired hyperspectral imagery of a sandstone sample. Below is a false-color composite image of the sample:

This image shows false-color composite hyperspectral image of a sandstone rock sample, illustrating the varied mineral composition of the sample.

Image 3. False color composite hyperspectral iamge of a sandstone sample

A quick inspection of the rock suggests that it is predominantly composed of three minerals. Thus, we assign \(n=3\) for this analysis. The results of the MNF transformation and the selected simplex vertices by N-FINDR are shown below:

Image of the first two principal components (MNF) of the sandstone rock sample, showing the identified simplex vertices (purest spectra) marked in red by the N-FINDR algorithm.

Image 4. First two principal components of the sandstone rock sample. The red do represent the selected purest spectra by N-FINDR algorithm.

Notice that the simplex is not perfectly formed, and the triangular structure appears slightly distorted. This suggests minor non-linear behavior in the actual mixtures. Additionally, some outliers lie outside the simplex, possibly influenced by accessory minerals. Such minerals, often present in small amounts, are common in natural rock samples and can affect the spectra of certain pixels. These influences may not be fully captured in a 2D representation.

We can examine the spectra of the selected purest pixels by transforming them back to their original dimensions. Below are the resulting spectra for the three identified endmembers:

Image of spectra of the three identified endmembers from the N-FINDR algorithm, showing distinct absorption features related to minerals such as white mica/clay, carbonate, and quartz.

Image 5. Resulting spectra of the three endmember from N-FINDR algorithm

If you know your mineralogy and infrared spectroscopy you can immedietly recognize these minerals. The red spectra have a strong absorption feature in ~2200 nm region which related the \(Al-OH\) bonds of white mica/clay mienrals. The blue spectra have an absroption feature in ~2330 nm region related to \(CO_3\) bonds of carbonate minerals. The green and bright spectra represent quartz mineral. we still can see a very weak feature in ~2200 nm sugest that it is not 100% pure spectra and there is still a small influence of clay minerals.

While these spectra are not perfectly pure, they are sufficiently representative to be useful for further analysis. For instance, as shown in our previous post, we used these endmember spectra for spectral unmixing and achieved excellent results, with an RMSE of only 0.0186.

To understand the spatial context, we can reshape the extracted data back into its original image structure to visualize the locations of the purest pixels:

Image of false-color composite hyperspectral image of a sandstone sample, with the locations of the purest pixels identified by the N-FINDR algorithm marked in the image.

Image 6. False color composite hyperspectral image of a sandstone sample along with its purest pixel location.

This example shows that N-FINDR is a powerful and intuitive algorithm for endmember extraction. Despite its limitations, such as the need for prior knowledge of the number of endmembers and the assumption of a linear mixture model, it often still provides a usefull results.

How N-FINDR Works

Synthetic Data Example

Real Data Example

Enjoy Reading This Article?