NDVI is a robust and versatile metric for assessing vegetation health on a global scale, but it is not optimal for all ground conditions. When vegetation is sparse, NDVI can fluctuate even if the state of vegetation does not change. This is a consequence of how soil in the area changes brightness depending on how wet or dry it is.

To understand this phenomenon, recall that NDVI follows from the idea that the reflection differential between red light and near infrared light (NIR - RED) will vary proportionally with the amount of vegetation present in the area being observed (See: NDVI from First Principles).

We can create an intuition for how NDVI works and where it breaks down with a stylized example, starting with an empty field. An empty field will reflect near-infrared and red light at nearly equal levels leading to a low NDVI:

Now suppose we cover 20% of this field with vegetation. NDVI goes up of course:

If we cover the entire field with vegetation, NDVI will approach 1.0:

Now let’s return to our field that is 20% vegetated. Suppose it rains. The 80% of the field covered in soil is now wet and, as a result, darker. The optical implication is that both near-infrared and red reflection levels will go down, by about the same amount. Note this causes NDVI to increase:

This of course is a problem. The NDVI for this area was 0.21. It rained, and NDVI went up to 0.25. We want a vegetation index to vary only as vegetation changes and not because it rains! The opposite happens when soil gets lighter as it dries out:
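The stylized field can be sketched in a few lines of Python. This is a minimal illustration, assuming the field's reflectance is an area-weighted mix of vegetation and soil; the reflectance values and the helper names (`field_ndvi`, `darken`) are hypothetical and are not meant to reproduce the exact numbers quoted above.

```python
# Hypothetical reflectances (fractions of incident light), for illustration only.
VEG = {"nir": 0.50, "red": 0.05}       # healthy vegetation: bright NIR, dark red
DRY_SOIL = {"nir": 0.30, "red": 0.25}  # bare soil: NIR and red nearly equal


def ndvi(nir, red):
    """Normalized difference of NIR and red reflectance."""
    return (nir - red) / (nir + red)


def field_ndvi(veg_fraction, soil):
    """NDVI of a field modeled as an area-weighted mix of vegetation and soil."""
    nir = veg_fraction * VEG["nir"] + (1 - veg_fraction) * soil["nir"]
    red = veg_fraction * VEG["red"] + (1 - veg_fraction) * soil["red"]
    return ndvi(nir, red)


def darken(soil, amount):
    """Model wet soil by lowering NIR and red reflectance by the same amount."""
    return {"nir": soil["nir"] - amount, "red": soil["red"] - amount}


wet_soil = darken(DRY_SOIL, 0.08)
for label, soil in [("dry", DRY_SOIL), ("wet", wet_soil)]:
    print(f"20% cover, {label} soil: NDVI = {field_ndvi(0.2, soil):.3f}")
```

With these assumed values the vegetation is identical in both cases, yet the wet-soil NDVI comes out higher than the dry-soil NDVI.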

Let’s look at this phenomenon mathematically, starting with the definition of NDVI:
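For reference, the standard definition, with *NIR* and *RED* denoting reflectance in the near-infrared and red bands:

```latex
\[
\mathrm{NDVI} = \frac{NIR - RED}{NIR + RED}
\]
```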

The change in reflectance from varied soil color causes *NIR* and *RED* to increase or decrease by similar amounts. The changes are similar enough that we can assume they are actually the same without affecting the analysis that follows. We’ll denote the change as epsilon. The NDVI result for changed soil conditions will thus be:
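Written out, this is a reconstruction from the description above: epsilon is added to both bands, and NDVI′ is only an illustrative label for the index under changed soil conditions.

```latex
\[
\mathrm{NDVI}' = \frac{(NIR + \epsilon) - (RED + \epsilon)}{(NIR + \epsilon) + (RED + \epsilon)}
             = \frac{NIR - RED}{NIR + RED + 2\epsilon}
\]
```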

Epsilon will be determined by how much of the field is covered by soil and how dark or light that soil is. If the field is mostly covered in vegetation, there will not be much visible soil at all, meaning epsilon will be small relative to (*NIR+RED*); i.e., epsilon ≪ *NIR+RED*. Thus:

The math here simply affirms that if there is not much visible soil, NDVI will not be sensitive to soil color. However, when substantial soil is visible epsilon will be larger and will be significant relative to (*NIR+RED*):
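The two regimes can be put side by side. This is a sketch under the same assumption as above: a common shift epsilon in both bands, with NDVI′ as an illustrative label for the shifted index.

```latex
\[
\mathrm{NDVI}' = \frac{NIR - RED}{NIR + RED + 2\epsilon}
             = \mathrm{NDVI} \cdot \frac{NIR + RED}{NIR + RED + 2\epsilon}
\]
\[
|\epsilon| \ll NIR + RED \;\Rightarrow\; \mathrm{NDVI}' \approx \mathrm{NDVI},
\qquad
|\epsilon| \not\ll NIR + RED \;\Rightarrow\; \mathrm{NDVI}' \neq \mathrm{NDVI}
\]
```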

The NDVI value will be materially impacted. What can be done? To start, let’s go back to this equation:

Notice that epsilon does not affect the numerator of NDVI because it cancels. So, if you simply used *NIR-RED* as your index, you would not have any trouble from varying soil colors! You would have a bigger problem though. Your index would vary based on just how sunny it is: on a day where total light intensity was 50% of normal because of moderate clouds, *NIR-RED* would also be 50% lower. Hence the need for normalizing by dividing by total intensity (*NIR+RED*).
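A quick numerical check of this point, as a sketch: the band values are hypothetical, and cloud cover is modeled (an assumption) as scaling both measured intensities by the same factor.

```python
def ndvi(nir, red):
    """Normalized difference of NIR and red."""
    return (nir - red) / (nir + red)


nir, red = 0.34, 0.21  # hypothetical measured intensities on a clear day
scale = 0.5            # assumed: moderate cloud halves the light in both bands

diff_clear = nir - red
diff_cloudy = scale * nir - scale * red
ndvi_clear = ndvi(nir, red)
ndvi_cloudy = ndvi(scale * nir, scale * red)

print(f"NIR-RED: clear {diff_clear:.3f}, cloudy {diff_cloudy:.3f}")
print(f"NDVI:    clear {ndvi_clear:.3f}, cloudy {ndvi_cloudy:.3f}")
```

The raw difference halves with the light, while the scale factor cancels in the NDVI quotient.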

There is actually a simple mathematical adjustment you can make to the NDVI formula to reduce its sensitivity to epsilon: add a constant value to the denominator:
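In symbols, using the hypothetical label $I_L$ for the adjusted index and *L* for the added constant:

```latex
\[
I_L = \frac{NIR - RED}{NIR + RED + L}
\]
```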

When soil effects manifest, we will get:
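Reconstructed from the definitions above, with both bands shifted by epsilon and $I_L'$ as an illustrative label for the adjusted index under changed soil conditions:

```latex
\[
I_L' = \frac{(NIR + \epsilon) - (RED + \epsilon)}{(NIR + \epsilon) + (RED + \epsilon) + L}
     = \frac{NIR - RED}{NIR + RED + 2\epsilon + L}
\]
```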

To see how this works, suppose L were very large relative to NIR and RED:
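As a sketch of that limit, with $I_L'$ again an illustrative label for the adjusted index under a soil shift of epsilon in both bands:

```latex
\[
L \gg NIR + RED + 2\epsilon
\;\Rightarrow\;
I_L' = \frac{NIR - RED}{NIR + RED + 2\epsilon + L} \approx \frac{NIR - RED}{L}
\]
```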

So for large L, we have no sensitivity to epsilon (which was our objective), but we have reduced the formula to a scaled version of *NIR-RED*, which we know is unsatisfactory. So all we did was trade one problem for another.

At the other extreme, at L=0, we just end up back at the NDVI formula, which is too sensitive to epsilon. There is a compromise: select an L that is large enough to desensitize the denominator to epsilon but not so large as to completely undo the normalization effect of dividing by (NIR+RED).
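The trade-off can be explored numerically. The sketch below uses hypothetical band values, an assumed soil shift `eps`, and an assumed halving of light. For several values of L it reports how much the adjusted index moves under each disturbance. The function name `adjusted_index` is illustrative only.

```python
def adjusted_index(nir, red, L):
    """(NIR - RED) / (NIR + RED + L); L = 0 reduces to NDVI."""
    return (nir - red) / (nir + red + L)


def relative_change(new, old):
    """Fractional change from old to new."""
    return (new - old) / old


nir, red = 0.34, 0.21  # hypothetical band values for a sparsely vegetated field
eps = -0.06            # assumed soil darkening: same shift in both bands
scale = 0.5            # assumed: half the light reaches the sensor

effects = {}
for L in (0.0, 0.5, 100.0):
    base = adjusted_index(nir, red, L)
    soil_effect = relative_change(adjusted_index(nir + eps, red + eps, L), base)
    light_effect = relative_change(adjusted_index(scale * nir, scale * red, L), base)
    effects[L] = (soil_effect, light_effect)
    print(f"L={L:>5}: soil shift {soil_effect:+.1%}, half light {light_effect:+.1%}")
```

Under these assumptions, L=0 is immune to the light change but most exposed to the soil shift, a very large L shows the reverse, and an intermediate L sits between the two extremes on both measures.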

For a given region, a practitioner can infer an effective value for L by calibrating against ground truth. In general however, researchers have concluded that L=0.5 creates a vegetation index substantially less sensitive to soil color than NDVI and still sufficiently normalized so that overall changes in light intensity do not change the quotient materially:

There is one final step to make this new measure convenient: we would like it to always fall between -1 and 1. By adding L to the denominator, we have made it such that the max value of the index (occurring, theoretically, at NIR=1, RED=0) is 1/(1+L). If we want the max value to be 1.0, we need to multiply by (1+L):
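Written out, the rescaled index (the one the article goes on to call SAVI) and a check of its value at the theoretical extreme NIR=1, RED=0:

```latex
\[
\mathrm{SAVI} = (1 + L)\,\frac{NIR - RED}{NIR + RED + L}
\]
\[
NIR = 1,\; RED = 0 \;\Rightarrow\; (1 + L)\,\frac{1}{1 + L} = 1
\]
```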

The above is an index that ranges from -1 to 1, is much less sensitive to soil color than NDVI, remains largely insensitive to total light intensity, and increases with vegetation health and density. It is thus better than NDVI for less densely vegetated regions where soil color can change. This index is called the Soil-Adjusted Vegetation Index, or SAVI. It was conceived by Alfredo Huete in 1988.

Let’s compare NDVI and SAVI:

The figure above illustrates the difference between NDVI and SAVI at different vegetation densities. Notice that:

- At high vegetation density, NDVI is stable because minimal soil is visible
- At sparse and moderate vegetation levels, NDVI has substantial variance due to soil color sensitivity. SAVI has only modest variance.
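A comparison of this kind can be approximated with a small simulation. This is a sketch, not a reproduction of the figure: it assumes a linear mix of vegetation and soil reflectance, and the reflectance values, the three soil brightness levels, and the helper names (`mix`, `spread`) are all hypothetical.

```python
VEG = (0.50, 0.05)  # hypothetical (NIR, RED) reflectance of vegetation
SOILS = {           # hypothetical (NIR, RED) for wet, medium and dry soil
    "dark": (0.22, 0.17),
    "medium": (0.30, 0.25),
    "bright": (0.38, 0.33),
}


def ndvi(nir, red):
    return (nir - red) / (nir + red)


def savi(nir, red, L=0.5):
    return (1 + L) * (nir - red) / (nir + red + L)


def mix(veg_fraction, soil):
    """Area-weighted (NIR, RED) of a field with the given vegetation fraction."""
    nir = veg_fraction * VEG[0] + (1 - veg_fraction) * soil[0]
    red = veg_fraction * VEG[1] + (1 - veg_fraction) * soil[1]
    return nir, red


def spread(index, veg_fraction):
    """Range of an index across the soil brightness levels, vegetation fixed."""
    values = [index(*mix(veg_fraction, soil)) for soil in SOILS.values()]
    return max(values) - min(values)


for f in (0.2, 0.5, 0.9):
    print(f"cover {f:.0%}: NDVI spread {spread(ndvi, f):.3f}, "
          f"SAVI spread {spread(savi, f):.3f}")
```

With these assumed inputs, the spread of both indices shrinks at high cover, and SAVI's spread is smaller than NDVI's at every cover level tried.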

A practitioner will prefer SAVI to NDVI in any situation where significant soil is visible and brightness changes are possible. SAVI is not unequivocally superior however; in mitigating soil brightness effects we have compromised the index’s previous complete insensitivity to total light intensity.

There is a second mathematical property that SAVI gains from the additional L term: the SAVI saturation threshold is higher than NDVI's. To get an intuition for why, recall that NDVI saturates because red reflectance (*RED*) gets close to zero well before vegetation density reaches a maximum. (See NDVI Saturation.)

Let’s review saturation mathematically:

As *RED* gets close to zero, NDVI will still increase as *NIR* does, but only asymptotically. Ever larger increases in NIR have an ever diminishing effect on NDVI; i.e. saturation. But now consider the effect of adding L to the denominator:

SAVI will still reach saturation, but it will happen at a larger *NIR*. (For intuition on this, set *RED* = 0 and make *L* very large relative to *NIR*. As such, *L* dominates the denominator. The function is now effectively just *NIR*/*L*. It increases linearly with *NIR*; i.e. there is no curvature and thus no saturation! Generalizing: as *L* increases from 0, the saturation point increases with it.)
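The diminishing returns can be seen by measuring how much each index gains from a fixed step in *NIR*. This is a minimal sketch: the low red reflectance, the *NIR* levels and the step size are hypothetical, and L=0.5 is used for SAVI.

```python
def ndvi(nir, red):
    return (nir - red) / (nir + red)


def savi(nir, red, L=0.5):
    return (1 + L) * (nir - red) / (nir + red + L)


red = 0.03  # hypothetical low red reflectance under a dense canopy
step = 0.1  # fixed increase in NIR reflectance

ndvi_gains, savi_gains = [], []
for nir in (0.3, 0.4, 0.5, 0.6):
    ndvi_gains.append(ndvi(nir + step, red) - ndvi(nir, red))
    savi_gains.append(savi(nir + step, red) - savi(nir, red))
    print(f"NIR {nir:.1f} -> {nir + step:.1f}: "
          f"NDVI +{ndvi_gains[-1]:.3f}, SAVI +{savi_gains[-1]:.3f}")
```

Both sets of gains shrink as *NIR* grows, but under these assumptions the SAVI gain stays larger than the NDVI gain at every step. That is the sense in which SAVI saturates later.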

The plot below illustrates the two functions, with L at a realistic value:

Thus, the additional term in the denominator allows SAVI to register increasing vegetation density after NDVI has already saturated.

Any time there is soil exposure, and especially if its moisture content can change, SAVI will generally be preferable to NDVI. The better saturation characteristics of SAVI are also highly desirable in any dense vegetation scenario.