Key Features

Collect Earth is a user-friendly, Java-based tool that draws upon a selection of other software to facilitate data collection. The following training materials include guidance on the use of Collect Earth and most of its supporting software.  Documentation on the more technical components of the Collect Earth system (including SQLite and PostgreSQL) is available on the Open Foris Support Forum. Collect Earth runs on Windows, Mac and Linux operating systems.

Collect Earth Interface

Collect Earth uses a Google Earth interface in conjunction with an HTML-based data entry form. Forms can be customized to suit country-specific classification schemes in a manner consistent with guidelines of the Intergovernmental Panel on Climate Change (IPCC). The default Collect Earth form contains IPCC-consistent land use categories and sub-categories with land use sub-divisions. For guidance on creating new customizations of the Collect Earth data entry form, visit the Open Foris Support Forum.

Collect Earth was first developed in 2013 and first published in 2016, and it has been continuously improved by integrating the latest developments and updates in Google Earth Engine. These improvements include incorporating new high-resolution data, such as the European Space Agency’s Copernicus Sentinel 2 and the high-resolution Planet data made available through the agreement between Planet and the Government of Norway’s International Climate and Forests Initiative. It continues to be a free, open access tool that users who are not remote sensing or GIS experts can quickly learn to use to undertake land-use and land-cover assessments. Users can learn the tool through freely accessible online tutorials and a ‘user forum’ where questions can be posted and FAO experts provide assistance.

Users can assess several elements of land use and land cover (e.g., number of trees) and their associated changes over a pre-defined time horizon (e.g., 2010–2020) in a sample plot, using an HTML data collection form that appears for each plot in the sampling design through the Google Earth interface. The parameters of this form are set at the beginning of an assessment by the user, depending on the type of information the user would like to collect. For example, for land-use and change assessments in a country, the data collection form will include the six IPCC land-use categories, the national sub-categories (defined by the country and consistent with the reporting in the greenhouse-gas inventory), as well as the possible land-use changes from one category to another. The augmented visual interpretation of the plots is supported through the various indices loaded in the Google Earth Engine interface and by using the images in Bing Maps for corroboration.
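
The land-use change part of such a form can be pictured as a transition count between the six IPCC categories. The following is a minimal sketch, not Collect Earth's own code: the function name, the plot list and the 2010/2020 dates are illustrative assumptions.

```python
from collections import Counter

# The six IPCC land-use categories.
IPCC_CATEGORIES = [
    "Forest land", "Cropland", "Grassland",
    "Wetlands", "Settlements", "Other land",
]

def change_matrix(plots):
    """Count plots for each (initial, final) land-use pair."""
    counts = Counter()
    for initial, final in plots:
        if initial not in IPCC_CATEGORIES or final not in IPCC_CATEGORIES:
            raise ValueError(f"not an IPCC category: {initial!r} / {final!r}")
        counts[(initial, final)] += 1
    return counts

# Hypothetical interpretations: (land use in 2010, land use in 2020) per plot.
plots = [
    ("Forest land", "Forest land"),
    ("Forest land", "Cropland"),
    ("Grassland", "Cropland"),
    ("Forest land", "Cropland"),
]

matrix = change_matrix(plots)
# Plots whose category differs between the two dates.
changed = sum(n for (initial, final), n in matrix.items() if initial != final)
```

National sub-categories could be handled the same way by counting pairs of sub-category labels instead of the top-level categories.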

High spatial resolution imagery and Google Earth Engine

Collect Earth facilitates low-cost augmented visual interpretation thanks to high spatial resolution imagery (e.g., Google Earth Pro, Bing Maps, Yandex, Baidu, and others) and high temporal resolution imagery that can be accessed through cloud computing (e.g., Google Earth Engine, Google Earth Engine Code Editor). Google Earth’s virtual globe is composed largely of 15 m resolution Landsat imagery, 2.5 m SPOT imagery and high-resolution imagery from several other providers, especially Maxar. Microsoft’s Bing Maps presents imagery provided by DigitalGlobe, ranging from 3 m to 30 cm resolution. Yandex feeds its Yandex.Maps portal by purchasing satellite imagery from Scanex. The One Atlas data are integrated into Yandex.Maps, ensuring access to recent 1.5 m resolution SPOT satellite images on a global scale and to the 0.5 m resolution Pléiades satellite product over cities.

Planet Labs, Inc. is an American private Earth imaging company based in San Francisco, California. Through Norway’s International Climate & Forests Initiative, anyone can now access Planet’s high-resolution, analysis-ready mosaics of the world’s tropics in order to help reduce and reverse the loss of tropical forests, combat climate change, conserve biodiversity, and facilitate sustainable development. Real-color and false-color mosaics of the tropics at <5 m/px are available with a monthly cadence from August 2020 onwards, with an archive of bi-annual mosaics from December 2015 to August 2020. The false-color mosaics offer a better understanding of the vegetation because they use the Near Infrared (NIR) band.

Google Earth Engine’s web-based platform facilitates access to 30 m resolution Landsat imagery from the United States Geological Survey, to 250 m resolution imagery from the Moderate Resolution Imaging Spectroradiometer (MODIS) instruments on board NASA’s Terra and Aqua satellites, and to 20 m resolution Sentinel imagery from the European Space Agency (ESA). Collect Earth synchronizes the view of each sampling point across all three platforms.

Which satellite images and graphics can you find in the GEE app?

Satellite | Spatial resolution | Temporal resolution | Imagery available since
MODIS | Low (250 m) | High (daily revisit time; graph shows the less cloudy image during 16 days) | 2000
Landsat 7/8 | High (30 m) | Low (16-day revisit time) | 2000
Sentinel 2 | High (20 m) | High (5-day revisit time) | 2015

The imagery used within Google Earth, Bing Maps, Yandex, Planet Imagery and Google Earth Engine differs not only in spatial resolution, but also in temporal resolution. Collect Earth enables users to enter data regarding current land use and historical land use changes. Users can determine the reference period most appropriate for their land use monitoring objectives. The IPCC recommends a reference period of at least 20 years, based on the amount of time needed for dead organic matter and soil carbon stocks to reach equilibrium following land-use conversion. Most of the imagery available in Bing Maps and Google Earth has been acquired at very irregular intervals over the past 10 years. In contrast, Earth Engine contains over 40 years of imagery that has been acquired every 16 days.

Sampling design with the Grid Generator/QGIS

Collect Earth is a sample-based tool. After identifying the scale of data collection (global/national/district/local), the sampling method for the grid has to be planned. The grid can be created using QGIS, Google Earth Engine (via the Grid Generator application), ArcGIS, SEPAL or similar geospatial tools.

The Grid Generator app, developed with GEE, takes the form of a website and offers different options to create the grid. This tool allows users to design and generate systematic or random grids for a Collect Earth project in a given area of interest, which may be (i) a country, province, region or other administrative boundary, (ii) a shapefile/polygon uploaded by the user as an asset in GEE, or (iii) a drawn polygon or rectangle. The user may create a systematic grid by setting the distance between the plots in meters (the distance is always the same) or a random grid by setting the number of plots to be generated within the area of interest. The user may also add ancillary data to the plots of the grid, such as GAUL (Global Administrative Unit Layers) country/province/district information, DEM data from SRTM-30m (elevation/slope/aspect) or IPCC ancillary data specific to GHG inventories. With these options the user can augment the information contained in the grid, which will be used during the analysis phase of the data collected with Collect Earth.
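
The difference between the two grid types can be sketched in a few lines. This is an illustration only, not the Grid Generator's implementation: it assumes a rectangular area of interest given in projected coordinates (meters), and the function names and the seed parameter are hypothetical.

```python
import random

def systematic_grid(xmin, ymin, xmax, ymax, spacing_m):
    """Plot centers at a fixed spacing (in meters) inside a rectangle."""
    nx = int((xmax - xmin) // spacing_m) + 1  # number of columns
    ny = int((ymax - ymin) // spacing_m) + 1  # number of rows
    return [
        (xmin + i * spacing_m, ymin + j * spacing_m)
        for j in range(ny)
        for i in range(nx)
    ]

def random_grid(xmin, ymin, xmax, ymax, n_plots, seed=0):
    """A fixed number of plot centers placed uniformly at random."""
    rng = random.Random(seed)  # seeded so the design is reproducible
    return [
        (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        for _ in range(n_plots)
    ]

# A 4 km x 2 km rectangle with plots every 1000 m: 5 columns x 3 rows.
grid = systematic_grid(0, 0, 4000, 2000, 1000)
# Ten random plots in the same rectangle.
scattered = random_grid(0, 0, 4000, 2000, 10)
```

For an irregular area of interest such as a country boundary, the same points would additionally need to be filtered with a point-in-polygon test, which is omitted here.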

 

Survey design with Collect

For the desktop version of Collect Earth, the survey needs to be created and configured within a separate survey design tool called Collect, which is downloadable from the Open Foris suite of tools. Surveys are organized into separate panels of questions called “cards,” which help structure information by theme and better drive the logic of the survey questions. The cards are navigated via a series of tabs at the top. In Collect, a survey is in either modified or published mode. The survey design can be edited in both modes, but the data can be managed and cleansed only when the survey is in published mode. The easiest way to create a new survey is to clone an existing one and customize the copy.

Database options: SQLite and PostgreSQL

The data entered in Collect Earth is automatically saved to a database. Collect Earth can be configured for a single-user environment with an SQLite database. This arrangement is best for either individual users or geographically dispersed teams. A PostgreSQL database is recommended for multi-user environments, particularly where users will work from a shared network. The PostgreSQL configuration of Collect Earth facilitates collaborative work by allowing users to see in real time when new data has been entered. It also makes it easier for an administrator to review the work of others for quality control purposes.

By default, Collect Earth uses an SQLite database that is stored locally on each user’s computer (i.e., allowing users to work individually). However, a Postgres database can also be configured to enable users on a single network to automatically pool data into the same database (i.e., allowing multiple users to work on the same assessment simultaneously through a shared network). Whether using an SQLite or Postgres database, Collect Earth also generates data tables that can be shared and backed up.
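
As a rough illustration of how a local SQLite database can be turned into a shareable data table, the sketch below uses Python's standard sqlite3 and csv modules. The table name "plot" and its columns are hypothetical and do not reflect the actual Collect Earth schema; an in-memory database stands in for the file stored on the user's computer.

```python
import csv
import io
import sqlite3

def export_table_csv(conn, table):
    """Return the contents of one table as CSV text, header row first."""
    # The table name is supplied by trusted code here, not by user input.
    cur = conn.execute(f'SELECT * FROM "{table}" ORDER BY 1')
    header = [col[0] for col in cur.description]
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(cur.fetchall())
    return buf.getvalue()

# Hypothetical stand-in for a local database file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE plot (id TEXT PRIMARY KEY, land_use TEXT, tree_count INTEGER)"
)
conn.executemany(
    "INSERT INTO plot VALUES (?, ?, ?)",
    [("p1", "Cropland", 4), ("p2", "Forest land", 37)],
)
conn.commit()

csv_text = export_table_csv(conn, "plot")
```

With a PostgreSQL configuration the query logic would be similar, but the connection would go to the shared server rather than to a local file.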

Data & image analysis and quality control

Generate statistics and graphs: Saiku and Microsoft Excel pivot tables are the most common data analysis software packages for generating statistics and graphs using Collect Earth data. Summarizing the data by administrative jurisdiction and providing averages or totals for the entire area of interest are the most common ways to present results. When using the results to recommend interventions, it is important to consider the goals of the mapathon. For example, if the goal was to assess the status of tree cover outside of forests, it is often useful to report average tree cover per land use type, such as tree cover in settlement areas or on croplands.
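
The kind of summary described above, average tree cover per land use type, amounts to a grouped mean. A minimal sketch follows, assuming plot records with hypothetical field names "land_use" and "tree_cover_pct"; it mirrors what a pivot table would compute rather than any Saiku or Excel feature.

```python
from collections import defaultdict

def mean_by_group(records, group_key, value_key):
    """Average of value_key for each distinct value of group_key."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [sum, count]
    for record in records:
        entry = totals[record[group_key]]
        entry[0] += record[value_key]
        entry[1] += 1
    return {group: total / count for group, (total, count) in totals.items()}

# Hypothetical plot records exported from an assessment.
records = [
    {"land_use": "Cropland", "tree_cover_pct": 10.0},
    {"land_use": "Cropland", "tree_cover_pct": 20.0},
    {"land_use": "Settlements", "tree_cover_pct": 5.0},
]

averages = mean_by_group(records, "land_use", "tree_cover_pct")
```

Grouping by an administrative unit field instead of land use would give the per-jurisdiction summaries mentioned above.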

Both types of databases, SQLite and PostgreSQL, automatically populate Saiku Server, an open-source web-based software produced by Meteorite Consulting. A version of this open-source software has been customized for visualizing and analyzing Collect Earth data. Countries using Collect Earth for a national land use assessment may generate data in Collect Earth for tens of thousands of points. Saiku organizes this wealth of information and enables users to run queries on the data and immediately view the results in tabular format or as graphs.

Generate maps using Collect Earth data: Maps are a powerful way to present the information generated from a Collect Earth mapathon. There are multiple ways of presenting Collect Earth data on a map:

(i) Simple mapping: Display the sample plots themselves in open-source software such as QGIS, classified according to one of the variables that were collected. These types of maps are most useful when a systematic sampling approach is used, as the maps evenly distribute the data across the full area of interest.

(ii) Voronoi mapping: Produce land cover maps covering, for example, an entire national territory by applying the Voronoi method in QGIS. Voronoi maps are constructed from a series of polygons, each formed around a sample point (in our case the plot).

(iii) Wall-to-wall: Another approach to mapping is developing wall-to-wall land-cover and tree-cover maps using SEPAL or Google Earth Engine. In the Google Earth Engine approach, the Collect Earth data are used to train a classification algorithm of the user’s choice, which then assigns every pixel to one of the land use/land cover types, even if that pixel was not part of the Collect Earth sampled plots. For example, imagery classified as cropland using Collect Earth has a certain spectral signature. The trained algorithm uses what it learned from the training data to classify new pixels with similar spectral signatures as “cropland,” even if the pixels are outside of the sample area. These maps are called “wall-to-wall” because they are made for the full study area.
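
The ideas behind approaches (ii) and (iii) can be sketched side by side. In the sketch below, a location belongs to the Voronoi polygon of its nearest plot, and a pixel is labelled by the class whose average spectral signature is closest. This is an illustration under stated assumptions: the nearest-centroid classifier is a deliberately simple stand-in, not the algorithm used by QGIS, SEPAL or Google Earth Engine, and the plot coordinates and the two-band (red, NIR) reflectance values are hypothetical.

```python
def voronoi_class(x, y, plots):
    """Class of the plot nearest to (x, y), i.e. of its Voronoi polygon."""
    nearest = min(plots, key=lambda p: (p[0] - x) ** 2 + (p[1] - y) ** 2)
    return nearest[2]

def train_centroids(samples):
    """Mean feature vector (spectral signature) for each class label."""
    sums, counts = {}, {}
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def classify(pixel, centroids):
    """Label of the centroid closest to the pixel in feature space."""
    return min(
        centroids,
        key=lambda label: sum((p - c) ** 2 for p, c in zip(pixel, centroids[label])),
    )

# Hypothetical plots: (x, y, interpreted class).
plots = [(0.0, 0.0, "Forest land"), (10.0, 0.0, "Cropland"), (0.0, 10.0, "Grassland")]

# Hypothetical training data: ([red, NIR] reflectance, interpreted class).
training = [
    ([0.05, 0.45], "Forest land"),
    ([0.04, 0.50], "Forest land"),
    ([0.12, 0.30], "Cropland"),
    ([0.15, 0.28], "Cropland"),
]
centroids = train_centroids(training)
label = classify([0.13, 0.31], centroids)  # a pixel outside the sampled plots
```

The Voronoi function uses only location, so its map depends entirely on where the plots fall, while the classifier uses only the imagery values, which is what allows it to label unsampled pixels.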

Control of the data collected:

– Data cleaning: Review the entire database in Saiku in order to identify possible errors.

– Review 5% of the surveyed plots for errors and make corrections in Collect Earth or using data analysis software.

– Data quality control is essential to produce results that are as accurate as possible.

– Depending on the time and resources available, as well as the objectives of the survey, the approach to addressing errors may be to re-evaluate plots with errors or to “discard” erroneous data.

– Although “discarding” the erroneous data is the quickest solution, this approach may cause problems of statistical bias if there are systematic errors or of statistical significance if the sample size is too small to produce representative results. However, sometimes poor image quality or other problems make a re-evaluation of the plot impossible.
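
Selecting the 5% of plots to review can be done with a reproducible random draw. The sketch below is one possible way to do it, not a Collect Earth feature; the function name, the plot identifiers and the seed are hypothetical.

```python
import random

def qc_sample(plot_ids, percent=5, seed=42):
    """Randomly pick about `percent`% of the plots (at least one) for review."""
    if not plot_ids:
        return []
    # Integer ceiling division avoids floating-point rounding surprises.
    n = max(1, -(-len(plot_ids) * percent // 100))
    rng = random.Random(seed)  # fixed seed so the selection can be repeated
    return sorted(rng.sample(plot_ids, n))

# Hypothetical survey of 200 plots.
plot_ids = [f"plot_{i:04d}" for i in range(1, 201)]
to_review = qc_sample(plot_ids)
```

Drawing the review set at random, rather than choosing plots by hand, helps keep the quality check itself free of selection bias.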

Field verification of data:

Field verification means that a subset of the data collected through the mapathon is corroborated with what can be seen in the field during a visit.