U.S. Geological Survey
20200731
Remote Sensing Shrub/Grass National Land Cover Database (NLCD) Base Products for the Western U.S. - Annual Herbaceous Fractional Component
Raster Digital Data Set
Data Release
3rd release, version 071520
Earth Resources Observation and Science Center, Sioux Falls, SD
U.S. Geological Survey
https://doi.org/10.5066/P9MJVQSQ
https://www.mrlc.gov/data-services-page
https://www.mrlc.gov/data
Matthew Rigge
Collin Homer
Lauren Cleeves
Debra K Meyer
Brett Bunde
Hua Shi
George Xian
Matthew Bobo
Spencer Schell
20200128
Quantifying Western U.S. Rangelands as Fractional Components with Multi-Resolution Remote Sensing and In Situ Data
Publication (Journal)
Remote Sensing
2020, 12, 412
Miller, R.F., Bates, J.D., Svejcar, T.J., Pierson, F.B., Eddleman, L.E., 2005. Biology, ecology, and management of western juniper. Technical Bulletin 152. Oregon State University Agricultural Experiment Station. Available at: http://juniper.oregonstate.edu/bibliography/documents/phpQ65pOk_tb152.pdf
Quantifying Western U.S. shrublands as a series of fractional components with remote sensing provides a new way to understand these changing ecosystems. The USGS NLCD team in collaboration with the BLM has produced the most comprehensive remote sensing-based quantification of Western U.S. shrublands to date. Nine shrubland ecosystem components, including percent shrub, sagebrush (Artemisia spp), big sagebrush, herbaceous, annual herbaceous, litter, and bare ground cover, along with sagebrush and shrub heights, were quantified at 30-m resolution by mapping region. Each region required extensive ground measurement for model training and validation, two scales of remote sensing data from commercial high-resolution satellites and Landsat 8, and regression tree modeling to create component predictions. In the mapped portion (1,946,100 km²) of the total study area (2,557,556 km²), bare ground averaged 46.8%, shrub 14.4%, sagebrush 4.4%, big sagebrush 3.1%, herbaceous 22.8%, annual herbaceous 4.3% and litter 15.6%. Shrub height averaged 39.8 cm and sagebrush height 10.5 cm. Component accuracies using independent validation averaged R² values of 0.46, RMSE of 10.37 and nRMSE of 0.12, and cross validation averaged R² values of 0.72, RMSE of 5.09 and nRMSE of 0.062. Component composition strongly diverges by level III ecoregions, where 13 of 22 ecoregions are bare ground dominant, 8 are herbaceous dominant, and one is shrub dominant. Sagebrush physically covers 86,219 km², or 4.4%, of our study area, but it is present in 835,507 km², or 42.9%, of the non-masked area of our study area, underscoring its widespread distribution. Fractional component maps will be integrated into the 2016 National Land Cover Database (NLCD). Component products can be downloaded from www.mrlc.gov.
The goal of this project is to provide a rigorous large-area shrubland-habitat classification and inventory with statistically validated products and estimates of precision across the Western U.S. This is the third release of the Shrubland Base products.
Although this Federal Geographic Data Committee-compliant metadata file is intended to document the data set in nonproprietary form, as well as in Esri format, this metadata file may include some Esri-specific terminology.
20130515
20190925
ground condition
As needed
-130.2380
-99.6688
52.7905
26.2039
ISO 19115 Topic Category
biota
environment
geoscientificInformation
imageryBaseMapsEarthCover
USGS Thesaurus
shrubland ecosystems
terrestrial ecosystems
Alexandria Digital Library Feature Type Thesaurus
shrublands
None
shrub
sagebrush
big sagebrush
herbaceous
annual herbaceous
litter
grass
vegetation
bare ground
rangeland
shrubland
Common geographic areas
United States
Washington
Oregon
Montana
North Dakota
South Dakota
Wyoming
Idaho
Nebraska
Nevada
Utah
California
Colorado
Arizona
Texas
New Mexico
None
NM
TX
AZ
CO
CA
UT
NV
NE
ID
WY
SD
ND
MT
OR
WA
Great Basin
Arizona Plateau
Black Hills
Blue Mountains
Chihuahuan Desert
Colorado Plateau
Columbia Plateau
Grand Canyon
Middle Rockies
Rocky Mountains
Gunnison
Sonoran Desert
Southwest Tablelands
Three Forks
Wasatch
Western US
Yellowstone
Northern Mountainous
Any downloading and use of these data signifies the user's agreement to comprehend and comply with the USGS Standard Disclaimer. Ensure all portions of the metadata are read and clearly understood before using these data, in order to protect both user and USGS interests.
These data are all provisional products. There is no guarantee or warranty concerning the accuracy of the data. Users should be aware that these data were developed from models, which can contain some local error. Also, temporal changes may have occurred since the data were collected, resulting in discrepancies between the data and actual surface conditions. Users should not use these data for critical applications without a full awareness of their limitations. Acknowledgement of the originating agencies would be appreciated in products derived from these data. Any user who modifies these data is obligated to describe the types of modifications they perform. The user specifically agrees not to misrepresent the data, nor to imply that changes made were approved or endorsed by the U.S. Geological Survey. Please refer to https://www.usgs.gov/privacy.html for the USGS disclaimer.
U.S. Geological Survey
Customer Services Representative
mailing and physical
47914 252nd Street
Sioux Falls
SD
57198-0001
U.S.
605-594-6151
605-594-6589
custserv@usgs.gov
U.S. Geological Survey (USGS)
Bureau of Land Management (BLM)
Environment as of Metadata Creation: Microsoft Windows 7 Version 6.1 (Build 7601) Service Pack 1; Esri ArcGIS 10.5.1 (Build 7333) Service Pack N/A (Build N/A)
A field-based independent accuracy assessment was completed for the shrubland products. A total of 1860 points from across the Western U.S. were randomly chosen within 227 sites of 5 km radius for field collection of validation data. The relationship between each completed product and the validation dataset was analyzed using statistical formulas in Microsoft Excel.
Independent Validation --
Bare Ground RMSE 14.8, R² 0.70;
Herbaceous RMSE 13.1, R² 0.67;
Litter RMSE 8.9, R² 0.35;
Shrub RMSE 10.6, R² 0.37;
Sagebrush RMSE 7.5, R² 0.40;
Big Sagebrush RMSE 7.8, R² 0.16;
Annual Herbaceous RMSE 9.8, R² 0.58;
Sagebrush Height RMSE 25.6, R² 0.24;
Shrub Height RMSE 39.5, R² 0.19
A cross-validation accuracy assessment was completed for the shrubland products. A total of 840,000 points were area-weighted to account for differences in mapping zone size.
Cross Validation --
Bare Ground RMSE 8.0, R² 0.85;
Herbaceous RMSE 6.3, R² 0.79;
Litter RMSE 3.8, R² 0.75;
Shrub RMSE 6.0, R² 0.73;
Sagebrush RMSE 3.4, R² 0.63;
Big Sagebrush RMSE 4.1, R² 0.63;
Annual Herbaceous RMSE 4.1, R² 0.66;
Sagebrush Height RMSE 7.8, R² 0.59;
Shrub Height RMSE 17.8, R² 0.62
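As a rough check on the figures listed above, the following Python sketch shows one way RMSE and R² could be computed from paired field and predicted values, and averages the per-component values copied from the two lists. The function names are illustrative, and R² is taken here as the squared Pearson correlation, which is an assumption: the record says only that statistical formulas in Microsoft Excel were used. Averaging the seven cover components (heights excluded) gives values close to the averages quoted in the abstract.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between paired observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Squared Pearson correlation (assumed definition of R2 for this sketch)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)

# (RMSE, R2) for the seven percent-cover components, copied from the lists above.
INDEPENDENT = {
    "bare ground": (14.8, 0.70), "herbaceous": (13.1, 0.67),
    "litter": (8.9, 0.35), "shrub": (10.6, 0.37),
    "sagebrush": (7.5, 0.40), "big sagebrush": (7.8, 0.16),
    "annual herbaceous": (9.8, 0.58),
}
CROSS_VALIDATION = {
    "bare ground": (8.0, 0.85), "herbaceous": (6.3, 0.79),
    "litter": (3.8, 0.75), "shrub": (6.0, 0.73),
    "sagebrush": (3.4, 0.63), "big sagebrush": (4.1, 0.63),
    "annual herbaceous": (4.1, 0.66),
}

def mean_metrics(table):
    """Return (mean RMSE, mean R2) over the components in the table."""
    rmses = [pair[0] for pair in table.values()]
    r2s = [pair[1] for pair in table.values()]
    return sum(rmses) / len(rmses), sum(r2s) / len(r2s)

if __name__ == "__main__":
    print("independent:", mean_metrics(INDEPENDENT))
    print("cross validation:", mean_metrics(CROSS_VALIDATION))
```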
The methods employed to map fractional vegetation components in shrubland ecosystems in the Western U.S. include: (1) modelling shrublands as a series of independent continuous field components that can be combined and customized by any user at multiple spatial scales; (2) collecting ground-measured plot data on 2-meter WV2 or WV3 (WorldView) or Pleiades high-resolution imagery in the same season the satellite imagery is acquired; (3) effective modelling of ground-measured data on 2-meter imagery to maximize subsequent extrapolation; (4) acquiring multiple seasons (spring, summer, and fall) of 30-meter Landsat imagery for large-area modelling; (5) using regression tree classification technology that optimizes data mining of multiple image dates, indices, and bands with ancillary data to extrapolate 2-meter scale products to Landsat-scale imagery; and (6) employing rigorous accuracy assessment of model predictions to enable users to understand the inherent uncertainties. Users are cautioned that multiple-scale fractional estimations are subject to the Modifiable Areal Unit Problem (MAUP), where aggregation can cause different variance patterns in the data. In this case, the modeled range of a given variable can become compressed as the spatial size of the pixel increases. Hence, models developed with fractional estimates at the 2-meter scale may not directly apply at the 30-m scale.
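The range compression described in the MAUP caution can be illustrated with a small, purely synthetic example. The sketch below averages a grid of 2-m cells into 30-m cells (15 by 15 blocks) and compares the value range before and after. The checkerboard input and the function names are hypothetical and chosen only to make the effect obvious; they are not taken from the project data.

```python
def checkerboard(size):
    """Synthetic 2-m cover grid alternating between 0 and 100 percent."""
    return [[100 if (r + c) % 2 == 0 else 0 for c in range(size)]
            for r in range(size)]

def block_mean(grid, block):
    """Aggregate a 2-D grid by averaging non-overlapping block x block windows."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r0 in range(0, rows - block + 1, block):
        out_row = []
        for c0 in range(0, cols - block + 1, block):
            cells = [grid[r][c]
                     for r in range(r0, r0 + block)
                     for c in range(c0, c0 + block)]
            out_row.append(sum(cells) / len(cells))
        out.append(out_row)
    return out

def value_range(grid):
    """Return (minimum, maximum) over all cells of a 2-D grid."""
    flat = [v for row in grid for v in row]
    return min(flat), max(flat)

if __name__ == "__main__":
    fine = checkerboard(30)          # 30 x 30 cells of 2 m = 60 m x 60 m
    coarse = block_mean(fine, 15)    # 15 cells of 2 m = one 30-m cell
    print("2-m range:", value_range(fine))
    print("30-m range:", value_range(coarse))
```

In this extreme case the 2-m values span the full 0 to 100 range, while every 30-m average falls near 50, which is why a model fitted at one scale may not transfer directly to the other.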
This provisional fractional estimation of nine shrubland habitat variables in the Western U.S. is the version dated 20200715. Data set is considered complete for the information presented, as described in the abstract. Users are advised to read the rest of the metadata record carefully for additional details.
A formal accuracy assessment of the horizontal positional information in the data set has not been conducted.
A formal accuracy assessment of the vertical positional information in the data set has either not been conducted or is not applicable.
The method employed to map fractional vegetation components in the Western U.S. shrublands consists of three key steps: collecting plot data in the field for use in classifying fractional vegetation components on high spatial resolution images; calibrating density prediction models using reference data and Landsat 8 spectral bands; and extrapolating the developed models spatially to map per-pixel fractional components. Products consist of four primary variables (percent bare ground, percent herbaceous, percent shrub, and percent litter) and five secondary variables (percent sagebrush, percent big sagebrush, percent annual herbaceous, and shrub and sagebrush heights in cm). Secondary variables are 'nested' within the primary variables; they occur only within the spatial extent of the primary variable shrub or, for annual herbaceous, the primary variable herbaceous.
20132017
Field Collection --
Field-measured vegetation data for model training were collected within 331 pre-determined high-resolution collect sites across the Western U.S. Collect sites were ~15 by 15 km in size and were sited to best represent the range of biophysical, ecological, and climatic conditions within the mapping area. Plot data within each collect site were recorded on tablets equipped with high-accuracy GPS units and GIS software.
Field plots were sited based on a combination of high-resolution imagery and actual ground conditions. Individual pixel(s) were evaluated for sampling by examining the local imagery displayed on the tablet and the ground conditions simultaneously. Pixels that represented ground conditions well and were not 'mixed' pixels were selected for sampling. Plots were collected ad hoc to represent 1) the range of each component, 2) the range of elevation, aspect, and slope, 3) the range of management practices, vegetation condition, and disturbance, and 4) the range of color and brightness of bare ground within each collect site. At each plot, cover was estimated from an overhead perspective (satellite), and the total cover of all the primary components could not sum beyond 100 percent. Height of all shrubs and sagebrush was measured (in cm) using meter sticks. Height measurements were intended to capture the average canopy height within each plot, not extreme heights. Each high-resolution collect site contained approximately 95 field plots.
High-resolution data were supplemented with 30 m Landsat-scale training data, mainly located between high-resolution collect sites in disturbed, unique, and large homogeneous areas. Additional 30 m pixels located in recently burned areas were collected, and BLM (Bureau of Land Management) AIM (Assessment, Inventory, and Monitoring) data were also used to supplement the 30 m Landsat plots.
Validation Data --
Approximately 227 sites of 5 km radius, located outside of the high-resolution collect sites, were randomly selected for independent validation data collection. In each validation site, an average of 10 randomly placed points were sampled, with a total of 1860 points. The sampling protocol at each validation site consisted of seven 1-m² quadrats placed every 5 m along a 30 m transect. The randomly selected points served as the starting point of each transect, with the transect direction also randomly chosen. The seven observations were averaged and converted to a 30 m resolution pixel to be used for comparison to Landsat-scale model results.
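The transect protocol above can be sketched in a few lines of Python. The example assumes that samples sit at both ends of the transect (0 m and 30 m), which is the layout that yields seven samples at 5 m spacing; the function names and the cover values in the usage example are hypothetical.

```python
def sample_offsets(transect_length_m=30, spacing_m=5):
    """Distances (m) from the transect start at which samples are placed."""
    return list(range(0, transect_length_m + 1, spacing_m))

def transect_mean(sample_cover):
    """Average the seven percent-cover observations into one 30-m pixel value."""
    if len(sample_cover) != 7:
        raise ValueError("expected seven observations per transect")
    return sum(sample_cover) / len(sample_cover)

if __name__ == "__main__":
    print(sample_offsets())
    print(transect_mean([10, 20, 30, 40, 50, 60, 70]))
```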
20132019
High-resolution Classification --
The proportion of each of the nine components occurring within high-resolution footprints was classified independently on a per-pixel basis using commercial regression tree (RT) software called Cubist (Quinlan, 1993), which identifies empirical relations between each component and the high-resolution data. Each RT model was run using a committee of 10 members, a maximum of 500 (unbiased) rules, and a 10% extrapolation allowance. All eight WorldView spectral bands were used as independent inputs, while the rasterized field plot values were used as the dependent variable. ERDAS 2011 was used to spatially apply the rules generated in Cubist across the high-resolution image. Masks were created for the non-shrubland components, forest, and cloud/shadow areas. The spatial product was reviewed for accuracy and consistency with field observations.
20132018
Landsat Classification --
Component predictions (described in the preceding step) from the high-resolution imagery footprints were aggregated from 2-m cells to 30-m cells for use as training data for Landsat imagery. The training data were then refined using a filtering model to eliminate outlier pixels: only those pixels whose predictions for the four primary components (percent bare ground, shrub, herbaceous, and litter) summed to between 90 percent and 110 percent were retained for training. In addition, the training data for each of the nine component response variables were divided into three approximately equal bins of low, mid, and high training values.
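The filtering and binning rules described above can be sketched as follows. This is a minimal illustration with hypothetical function names, not the project's filtering model. It assumes that "approximately equal bins" means bins holding roughly equal numbers of samples; bins of equal value width would be another reasonable reading.

```python
def keep_for_training(bare, shrub, herbaceous, litter, low=90.0, high=110.0):
    """Retain a 30-m training pixel only if its four primary components
    sum to between 90 and 110 percent."""
    return low <= bare + shrub + herbaceous + litter <= high

def tercile_bins(values):
    """Label each value 'low', 'mid', or 'high' so that the three bins
    contain approximately equal numbers of samples (assumed interpretation)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    names = ("low", "mid", "high")
    labels = [None] * n
    for rank, index in enumerate(order):
        labels[index] = names[(3 * rank) // n]
    return labels

if __name__ == "__main__":
    print(keep_for_training(40, 20, 25, 15))   # components sum to 100
    print(keep_for_training(50, 30, 30, 15))   # components sum to 125
    print(tercile_bins([5, 1, 9, 3, 7, 2]))
```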
Shrubland components were mapped individually using different regression models produced by Cubist version 2.08. Landsat 8 at-satellite reflectance scenes for three seasons were used to map shrubland components. To improve mapping efficiency, seasonal Landsat 8 imagery mosaics were produced for each of the 21 independent mapping regions. A normalization approach was developed to normalize the individual scenes in the same season. This method was used on Landsat 8 spectral bands 2-7 and the thermal band. Each season was then mosaicked with fewer seam lines. The regression modelling used to map a shrubland component requires several ancillary datasets related to topographic features as independent variables. Elevation data are also needed for the model to extrapolate the spatial distribution of shrubland components. However, Landsat 8 thermal band (band 10) data were used in place of DEM (elevation) data in the regression model because thermal data are representative of elevation variation but do not introduce DEM-related artifacts.
20132018
Post modelling --
Cloudy areas in the Landsat mosaics were limited in extent and were masked out as needed. Component predictions were modelled in these areas using the two seasons that were cloud free. The two season predictions were buffered and used as fill for the cloudy areas.
Burned areas were identified using 2016 GeoMac (Geospatial Multi-Agency Coordination) fire perimeters. The burn scars were filled, like the cloudy areas, using a seasonal prediction (often fall) within the defined burn perimeter. For past fires, from 2010 on, the Monitoring Trends in Burn Severity (MTBS) severity layer was used to re-code shrub/sagebrush and nested components. Within the burn areas, where the severity was high or moderate, these components were re-coded to 0% coverage.
In some mapping regions an extent mask was applied to a component prediction (usually sagebrush or annual herbaceous) to define those areas that contain the specific component vegetation. The extent is determined by a combination of topographic inputs, with elevation data being an important driver. Pixels where the component is outside the extent were re-coded to 0% coverage.
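The two re-coding rules above (burn severity and component extent) both set selected pixels to 0% cover. The sketch below shows one way this could look on flat lists of pixel values. The severity codes 3 and 4 for moderate and high are an assumption about the MTBS thematic layer, the function names are hypothetical, and leaving values above 100 untouched (the masked and no-data codes given in the attribute description of this record) is a design choice of this sketch rather than a documented step.

```python
# Assumed codes for moderate and high burn severity; check the layer's own legend.
MODERATE_OR_HIGH = {3, 4}

def recode_burned(cover, severity):
    """Set shrub/sagebrush cover to 0 where burn severity is moderate or high.
    Values above 100 (masked / no data) are passed through unchanged."""
    out = []
    for value, sev in zip(cover, severity):
        if value <= 100 and sev in MODERATE_OR_HIGH:
            out.append(0)
        else:
            out.append(value)
    return out

def apply_extent(cover, inside_extent):
    """Set component cover to 0 for pixels outside the component's extent mask.
    Values above 100 (masked / no data) are passed through unchanged."""
    return [value if (inside or value > 100) else 0
            for value, inside in zip(cover, inside_extent)]

if __name__ == "__main__":
    print(recode_burned([35, 12, 101, 8], [4, 1, 3, 3]))
    print(apply_extent([20, 5, 102], [True, False, False]))
```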
If sagebrush is greater than 80% of the shrub cover, then the higher of the shrub or sagebrush height estimate is inserted for both the shrub and sagebrush height.
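The height rule above can be written as a small function. This is a sketch with a hypothetical function name; it reads "greater than 80% of the shrub cover" as sagebrush cover divided by shrub cover exceeding 0.8.

```python
def reconcile_heights(shrub_cover, sage_cover, shrub_height, sage_height,
                      ratio=0.8):
    """Return (shrub_height, sagebrush_height) after applying the rule:
    where sagebrush makes up more than 80% of shrub cover, both heights
    take the higher of the two estimates."""
    if shrub_cover > 0 and sage_cover > ratio * shrub_cover:
        higher = max(shrub_height, sage_height)
        return higher, higher
    return shrub_height, sage_height

if __name__ == "__main__":
    print(reconcile_heights(20, 18, 40, 55))  # sagebrush is 90% of shrub cover
    print(reconcile_heights(20, 10, 40, 55))  # sagebrush is 50% of shrub cover
```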
20132019
Masking --
Shrubland in the Western U.S. is often mixed with other land cover types such as forest lands that have tree cover of varying densities. To map distributions of shrubland ecosystem components, it is therefore necessary to develop a mask to characterize the extent of shrubland and non-shrubland vegetation across the entire region. This mask was generated using satellite imagery and ERDAS models. Four datasets, including three seasons of Landsat normalized difference vegetation index (NDVI) and/or modified soil-adjusted vegetation index (MSAVI), land cover type (NLCD2011), and tree canopy layers from NLCD2011, were used to build mask generation models. Mask development occurred in two phases: forest masking and other (non-shrubland) land cover type masking.
Forest Masking --
The four datasets mentioned above were used to generate the forest mask. The forest masking used a combination of forest canopy cover greater than 40%, along with certain NDVI or MSAVI thresholds, and NLCD2011 forest cover.
Non-forest Masking --
All developed urban areas in NLCD2011 were masked from the shrubland predictions. Cultivated crops and pastures/hayfields were masked using a combination of certain NDVI values and NLCD2011 classes and updated with the latest (2013) NASS CDL (Cropland Data Layer) dataset. All areas that were classified as agricultural (non-rangeland) in NLCD and CDL and that met the NDVI or MSAVI thresholds were masked. Water, snow, and ice-covered areas in NLCD2011 were masked from the shrubland predictions. A Normalized Water Index (NWI) was also calculated to update the water body masking. After the mask modelling was completed, the mask was compared to the Landsat mosaics used in the component processing, and hand edits were applied where needed.
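For reference, the two vegetation indices named in the masking steps can be computed from red and near-infrared reflectance as sketched below. The NDVI formula is standard. For MSAVI, the sketch uses the commonly cited closed-form version (often called MSAVI2); the record does not say which formulation was used, so that choice is an assumption, as are the function names. The thresholds applied in the masking models are not given in this record and are not reproduced here.

```python
import math

def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero on empty pixels
    return (nir - red) / total

def msavi(nir, red):
    """Modified soil-adjusted vegetation index, closed-form (MSAVI2) version.
    Assumes non-negative reflectance inputs."""
    a = 2.0 * nir + 1.0
    return (a - math.sqrt(a * a - 8.0 * (nir - red))) / 2.0

if __name__ == "__main__":
    print(ndvi(0.4, 0.1))
    print(msavi(0.4, 0.1))
```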
20132019
Mosaicking --
The final step was mosaicking all 21 mapping region components together. Within the overlap between study areas a cut-line was used in masked areas or along terrain features to "stitch" the final predictions together. This process resulted in a wall-to-wall prediction from the mid U.S. to the western edge of the U.S. encompassing the largest area of shrublands.
List of Mapping Areas --
Arizona Plateau
Black Hills
Blue Mountains
California - NE
California - Central
Central Great Basin
Chihuahuan Desert
Colorado Plateau
Columbia Plateau
Grand Canyon
Gunnison
Middle Rockies
Montana
North Central Great Basin
Northwest
Sonoran Desert
Southwest
Southwest Tablelands
Three Forks
Wasatch
Wyoming
20190401
Landsat Scenes --
The smallest mapping region required 12 Landsat scenes and the largest mapping region required 60 Landsat scenes for mapping. A total of 975 Landsat scenes were used for the 21 mapping regions covering the Western U.S.
20132017
Landsat processing --
Additional modelling was completed to fill data voids in the first release. The two large void areas were in the Yellowstone region and the northern mountainous regions of Idaho and Montana. Composite Landsat 8 images for three seasons were used to map shrubland components. The Landsat 8 bands were downloaded using the Google Earth Engine (GEE) Code Editor. Spring imagery was selected from May-June, summer imagery from July-August, and fall imagery from September-October. All scenes had less than 30% cloud cover and were dated between 2016 and 2018. Each pixel is the median value from all Landsat images that met the threshold. The downloaded GEE scenes were converted to a projected raster (.img) with a cell size of 30 m.
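The per-pixel median compositing described above can be illustrated with a plain-Python stand-in. This is not the Google Earth Engine script used for the release; the data layout (a list of scenes, each with a scene-level cloud cover percentage and a flat list of pixel values) and the function name are assumptions made for the example.

```python
from statistics import median

def median_composite(scenes, max_cloud_pct=30.0):
    """Build a composite where each pixel is the median of that pixel across
    all scenes whose scene-level cloud cover is below the threshold.

    scenes: list of (cloud_cover_pct, pixel_values) tuples, where every
            pixel_values list has the same length and pixel order.
    """
    kept = [pixels for cloud, pixels in scenes if cloud < max_cloud_pct]
    if not kept:
        raise ValueError("no scenes below the cloud cover threshold")
    # zip(*kept) groups the values of one pixel position across all scenes.
    return [median(values) for values in zip(*kept)]

if __name__ == "__main__":
    season = [
        (10.0, [0.1, 0.2]),
        (20.0, [0.3, 0.4]),
        (50.0, [9.0, 9.0]),   # too cloudy, excluded from the composite
        (5.0, [0.2, 0.9]),
    ]
    print(median_composite(season))
```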
Training --
Yellowstone: 761 30-m training plots were collected.
Northern ID/MT: 1224 30-m training plots were collected.
No validation was performed.
The same modelling process as previously used was applied.
20191015
20200715 version addresses Pinyon-Juniper woodland masking --
For this third release we addressed component predictions within the Pinyon-Juniper woodland regions. Pinyon-Juniper woodlands and forests occur in areas with climates transitional to forest and have been expanding in modern times due to fire suppression. Because tree cover is often spectrally confused with shrub cover, our shrub cover estimates tend to be positively biased in pixels with pinyon-juniper cover. In this new version of the base products, we address this issue with two strategies. First, we excluded additional pixels from our mapping region: pixels with greater than 25% tree canopy cover were added to our non-rangeland mask. We chose this value because it represents the threshold between “phase II” of woodland succession, where trees are co-dominant, and phase III, where trees dominate (Miller et al. 2005). Second, we re-coded the shrub cover pixels that fall in the Pinyon-Juniper woodland regions, reducing the shrub cover prediction in pixels with 8-25% tree canopy cover. The secondary components of sagebrush and big sagebrush were reconciled to the new shrub values by retaining the original ratios between the primary and secondary components.
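Two parts of the Pinyon-Juniper adjustment lend themselves to a short sketch: adding pixels above 25% tree canopy cover to the mask, and rescaling a secondary component so that its ratio to shrub cover is preserved. The function names are hypothetical, and 101 is the masked-area code given in the attribute description of this record. The record does not state how much the shrub prediction was reduced in the 8-25% canopy range, so that step is left out rather than guessed.

```python
MASKED = 101  # masked-area code for percent-cover layers in this record

def mask_dense_canopy(cover, tree_canopy_pct, threshold=25.0):
    """Return the masked code where tree canopy cover exceeds the threshold,
    otherwise return the cover value unchanged."""
    return MASKED if tree_canopy_pct > threshold else cover

def reconcile_secondary(secondary_old, shrub_old, shrub_new):
    """Rescale a secondary component (sagebrush or big sagebrush) so that its
    ratio to shrub cover is the same after the shrub value is reduced."""
    if shrub_old == 0:
        return 0.0
    return secondary_old * shrub_new / shrub_old

if __name__ == "__main__":
    print(mask_dense_canopy(30, 40))        # canopy above 25%: masked
    print(mask_dense_canopy(30, 25))        # canopy at 25%: kept
    print(reconcile_secondary(10, 20, 15))  # ratio 0.5 is preserved
```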
20200715
Raster
Grid Cell
87666
71204
1
Albers Conical Equal Area
29.5
45.5
-96.0
23.0
0.0
0.0
row and column
30.0
30.0
meters
WGS_1984
WGS 84
6378137.0
298.257223563
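The spatial reference values above can be assembled into a PROJ-style definition string, as in the sketch below. The mapping of the unlabeled numbers to parameters (standard parallels 29.5 and 45.5, central meridian -96.0, latitude of origin 23.0, false easting and northing 0.0, 87666 rows by 71204 columns, 30 m cells) follows the usual FGDC element order and is an assumption, as are the function names. Verify against the raster's own georeferencing before relying on it.

```python
def albers_proj_string(lat_1=29.5, lat_2=45.5, lon_0=-96.0, lat_0=23.0,
                       x_0=0.0, y_0=0.0):
    """Build a PROJ string for Albers Conical Equal Area on the WGS 84 datum."""
    return (f"+proj=aea +lat_1={lat_1} +lat_2={lat_2} "
            f"+lat_0={lat_0} +lon_0={lon_0} "
            f"+x_0={x_0} +y_0={y_0} +datum=WGS84 +units=m +no_defs")

def grid_size_km(rows=87666, cols=71204, cell_m=30.0):
    """Return (height_km, width_km) of the raster grid."""
    return rows * cell_m / 1000.0, cols * cell_m / 1000.0

if __name__ == "__main__":
    print(albers_proj_string())
    print(grid_size_km())
```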
Attribute Table
Table containing attribute information associated with the data set.
Producer defined
Value
Column for the per-pixel value. For bare ground, shrub, herbaceous, litter, sagebrush, big sagebrush, and annual herbaceous, the value is percent cover ranging from 0 to 100, with a value of 101 for masked areas and 102 for no data areas. For sagebrush height and shrub height, the value is height in centimeters, with a value of 998 for masked areas and 999 for no data areas. The raster attribute table for the Error Map contains a column for a value indicating per-pixel absolute error, in percent for bare ground, shrub, herbaceous, litter, sagebrush, big sagebrush, and annual herbaceous, and in centimeters for the height components.
Producer defined
0
102 or 999
Percent or Centimeter
Count
All raster attribute tables include a column for count describing a nominal integer value that designates the number of pixels that have each value in the file; histogram column in ERDAS Imagine raster attributes table.
Producer defined
0
3391040795
Integer
Red
Red color code for RGB. The value is arbitrarily assigned by the display software package, unless defined by user.
Producer defined
0
1
Percent
Green
Green color code for RGB. The value is arbitrarily assigned by the display software package, unless defined by user.
Producer defined
0
1
Percent
Blue
Blue color code for RGB. The value is arbitrarily assigned by the display software package, unless defined by user.
Producer defined
0
1
Percent
Opacity
A measure of how opaque, or solid, a color is displayed in a layer.
Producer defined
0
1
Percent
The entity and attribute information provided here describes the tabular data associated with the data set. Please review the detailed descriptions that are provided (the individual attribute descriptions) for information on the values that appear as fields/table entries of the data set.
The entity and attribute information was generated by the individual and/or agency identified as the originator of the data set. Please review the rest of the metadata record for additional details and information.
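As an illustration of how the Value and Count columns described above might be used together, the sketch below computes a mean percent cover and an area from a value-to-count histogram, skipping the masked (101) and no-data (102) codes. The function names and the sample histogram are hypothetical. The 30 m cell size comes from this record, and the sketch applies to the percent-cover layers only, since the height layers use 998 and 999 instead.

```python
def mean_cover(histogram):
    """Count-weighted mean percent cover over valid values (0 to 100).

    histogram: dict mapping raster Value -> pixel Count.
    Returns None if there are no valid pixels.
    """
    valid = {v: c for v, c in histogram.items() if 0 <= v <= 100}
    total = sum(valid.values())
    if total == 0:
        return None
    return sum(v * c for v, c in valid.items()) / total

def area_km2(pixel_count, cell_m=30.0):
    """Area covered by a number of pixels, in square kilometers."""
    return pixel_count * cell_m * cell_m / 1e6

if __name__ == "__main__":
    sample = {0: 50, 10: 50, 101: 1000, 102: 5}  # made-up counts
    print(mean_cover(sample))
    print(area_km2(1_000_000))
```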
U.S. Geological Survey
GS ScienceBase
mailing address
Denver Federal Center, Building 810, Mail Stop 302
Denver
CO
80225
United States
1-888-275-8747
sciencebase@usgs.gov
Although these data have been processed successfully on a computer system at the USGS, no warranty expressed or implied is made by the USGS regarding the use of the data on any other system, nor does the act of distribution constitute any such warranty. Data may have been compiled from various outside sources. Spatial information may not meet National Map Accuracy Standards. This information may be updated without notification. The USGS shall not be liable for any activity involving these data, installation, fitness of the data for a particular purpose, its use, or analyses results.
Digital Data
https://doi.org/10.5066/P9MJVQSQ
None
20200731
U.S. Geological Survey
Customer Services Representative
mailing and physical
47914 252nd Street
Sioux Falls
SD
57198-0001
U.S.
605-594-6151
605-594-6589
custserv@usgs.gov
FGDC Content Standard for Digital Geospatial Metadata
FGDC-STD-001-1998