The Quest for Fit-for-Purpose Earth Observation and Remote Sensing Data

By Maryam Rabiee

Photo by SpaceX on Unsplash.

The Sustainable Development Goals’ (SDGs) framework was set to transform our world by 2030; however, progress on the Global Goals is lagging. According to the World Bank’s Statistical Performance Indicators, most statistical systems still struggle to provide data on the SDGs. Even so, new data sources and innovative approaches using big data techniques, spatial analysis, predictive modeling, and other technologies are creating a range of new datasets that can help measure and monitor SDG-related targets and indicators. And while data availability is vital to tackling any global agenda, more data does not always guarantee good decision-making. To take meaningful action, whether in response to a global crisis like the COVID-19 pandemic or to a disaster like the recent earthquake in Haiti, policymakers and other SDG stakeholders must understand which data are fit for purpose.

Earth observation (EO) and remote sensing products have emerged as an important source of data, presenting an opportunity to monitor environmental, agricultural, and other SDG-related indicators at fine temporal and spatial resolutions. With the growth of near real-time information, how do users identify the best-suited data source for their application of interest? Do all SDG applications benefit from near real-time analysis? And does a 30-m spatial resolution provide meaningful insights across all urban and rural settings?

Such questions necessitate the formation of a knowledge base that supports more informed data selection processes. Advancements in the integration of innovative methods and technologies will not serve the people and the planet if we cannot align data production efforts with actual social, economic, and environmental needs.

What Kind of Data Do We Need, and For What Purpose?

Asking the right questions is challenging, but necessary. With the rise of big data and new methods of measuring and monitoring the SDGs, the UN and other SDG stakeholders are exploring ways to utilize these new data sources for policymaking. For example, a recent study on big data for the SDGs by Allen et al. highlights that for the most part, the production of new datasets is demand-driven. However, National Statistical Offices (NSOs) still need to determine their relevance for specific data and SDG monitoring needs. 

Organizations and initiatives such as the Group on Earth Observations (GEO), the POPGRID Data Collaborative (POPGRID), and the UN-GGIM have helped to expand and promote the use and production of EO and remote sensing data for the SDGs by using a fit-for-purpose approach. Many of the SDG indicators require timely and reliable population estimates, and data providers from the EO and remote sensing communities are developing products that combine information from censuses with satellite-derived geospatial features to produce a range of gridded population datasets. And while these gridded population products often produce more reliable and up-to-date data than traditional data sources alone, different products may yield different outputs even when used for the same application.

Without asking the right questions about the data needed and for what purpose, a user could end up underestimating the population impacted by a flood or overestimating infectious disease rates in a vulnerable region. For instance, when an earthquake struck Haiti this past August, policymakers needed timely and reliable population data to save lives, provide access to critical supplies, and ensure a rapid emergency response. Selecting the right dataset was critical to giving them the most accurate estimates of the affected population and of the supplies needed for an efficient response effort.

Over the last few years, POPGRID has brought together the leading data producers, users, and sponsors of georeferenced data on population, human settlements, and infrastructure to address these pertinent issues by improving the accessibility and consistency of data, supporting users by mitigating duplication and confusion, and encouraging innovation and cross-disciplinary use. A fit-for-purpose approach is at the core of the group’s activities. For example, POPGRID offers users seven global gridded population datasets to choose from, but no one dataset is best suited for every application, nor can we assume that an average of the population estimates will provide a more accurate output. Moreover, each dataset utilizes different modeling approaches to disaggregate population data using spatial data and satellite imagery. The group works to foster a better understanding of which dataset is a better fit for each application by developing resources and tools, such as the POPGRID Viewer, which enables users to compare gridded population data and information from six POPGRID data sources and access a population estimate comparison chart.

Sample results from the POPGRID Viewer.
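
To make the comparison idea concrete, here is a minimal sketch of how a user might summarize the disagreement between several gridded population products for one area of interest. It is not how the POPGRID Viewer works; the function name, the product names, and the numbers are all hypothetical and chosen only for illustration. The point is that the spread between products is worth reporting alongside any single figure, and that the mean is shown for reference only, not as a more accurate estimate.

```python
from statistics import mean


def summarize_estimates(estimates):
    """Summarize how much population estimates for one area disagree.

    `estimates` maps a product name to its population estimate for the
    same area of interest. All names and values here are hypothetical.
    """
    if not estimates:
        raise ValueError("at least one estimate is required")
    values = list(estimates.values())
    low, high = min(values), max(values)
    avg = mean(values)
    return {
        "min": low,
        "max": high,
        # Reported for reference only: an average of products is not
        # assumed to be closer to the truth than any single product.
        "mean": avg,
        "spread": high - low,
        # Spread relative to the mean, as a rough disagreement measure.
        "relative_spread": (high - low) / avg if avg else float("nan"),
    }


# Made-up estimates for the same hypothetical area from three products.
example = {"product_a": 90_000, "product_b": 100_000, "product_c": 110_000}
summary = summarize_estimates(example)
for key, value in summary.items():
    print(f"{key}: {value}")
```

A large relative spread is a signal to go back to the questions about purpose before choosing one product, rather than a reason to average the estimates away.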

Applying a Fit-for-Purpose Approach

As timely new data sources, EO-derived products can provide inputs needed for a range of SDG indicators. Yet, the use of each product can produce very different assessments of the world’s performance on the SDGs. Leyk et al. provide the knowledge base needed to make informed decisions about the appropriateness of the gridded population data products related to the application of interest. TReNDS’ Leaving No One Off the Map report also outlines several questions for consideration when selecting a data product, including:

Time Scale: Is the product based on nighttime population (where people live), daytime population (where people work), or ambient population (where people are likely to be throughout the day)? For example, LandScan USA is one of the products whose estimates are an average of daytime and nighttime population counts.

Modeling Approach: What is the resolution of the dataset? How does the dataset define a built-up area, and how does it allocate populations to built-up areas? What other inputs were included?

Demographics: Does the user need demographic information for their application of interest? Does the data product provide disaggregated information on age and sex? For example, some products, including GPWv4 and WorldPop, provide demographic data (age and sex) at different resolutions.

Cross-Country Analysis: Is the user conducting analysis across countries? Datasets adjusted to a globally consistent set of population estimates, such as the UN World Population Prospects’ estimates, might be a better fit for cross-country analysis.

Comparison Across Time: Does the user need to compare population estimates across different time periods? Some data providers, such as WPE, use different methods every year, and users should be aware of this if they are looking for a product that could be used in a consistent time series.

Urban and Rural Estimates: Is the application of interest in an urban or rural area? Users may want to consider the resolution of the dataset and the relevant variables included in the product.

Environmental Factors: What are some of the environmental factors that the dataset takes into consideration? Does it assume that no one lives on a water body or protected area?

Covariates: Is the user trying to understand the relationship between population density and another factor? If so, selecting a product that does not include that factor as a covariate may be beneficial, because a dataset that does include it already embeds an assumption about how that factor affects population distribution.
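
The questions above can also be treated as a simple screening checklist. The sketch below is one hypothetical way to encode a few of them in Python. The class names, field names, and the two catalog entries are invented for illustration and do not describe any real product's metadata, so a user would need to fill in the profiles from each product's own documentation.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class ProductProfile:
    """Hypothetical metadata a user might record for one data product."""
    name: str
    resolution_m: float
    has_age_sex: bool
    adjusted_to_global_totals: bool
    consistent_time_series: bool
    covariates: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Requirements:
    """What the application of interest needs (all fields illustrative)."""
    max_resolution_m: float
    needs_age_sex: bool = False
    cross_country: bool = False
    needs_time_series: bool = False
    study_factor: Optional[str] = None  # factor being related to population


def unmet_criteria(product: ProductProfile, req: Requirements) -> List[str]:
    """Return the checklist items this product fails for this application."""
    issues = []
    if product.resolution_m > req.max_resolution_m:
        issues.append("resolution too coarse")
    if req.needs_age_sex and not product.has_age_sex:
        issues.append("no age/sex breakdown")
    if req.cross_country and not product.adjusted_to_global_totals:
        issues.append("not adjusted to globally consistent totals")
    if req.needs_time_series and not product.consistent_time_series:
        issues.append("methods change between releases")
    if req.study_factor and req.study_factor in product.covariates:
        issues.append(f"uses '{req.study_factor}' as a covariate")
    return issues


# Two made-up product profiles and one made-up set of requirements.
product_a = ProductProfile("product_a", 100, True, False, True,
                           frozenset({"nighttime_lights", "roads"}))
product_b = ProductProfile("product_b", 1000, True, True, True)
req = Requirements(max_resolution_m=1000, needs_age_sex=True,
                   cross_country=True, study_factor="roads")

for product in (product_a, product_b):
    issues = unmet_criteria(product, req)
    print(product.name, "->", issues if issues else "no unmet criteria")
```

A checklist like this only narrows the options. It does not replace reading each product's documentation or the judgment calls behind questions such as the time scale and environmental assumptions.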

Read our POPGRID StoryMap to learn more about the methods used for gridded population datasets.

The Way Forward 

A number of initiatives have begun to share real-world applications of EO and remote sensing data for the SDGs, which is helping to encourage fit-for-purpose approaches. EO4SDGs recently launched the EO Toolkit for Sustainable Cities and Human Settlements, which highlights use cases on how EO-derived data are selected and used to address specific urban challenges across different geographies and timeframes. Additionally, our upcoming AGU 2021 session will address fitness-for-use guidelines concerning georeferenced population and infrastructure data for humanitarian response and disaster risk management.

How we approach the framework of fit-for-purpose data and monitoring systems will determine the outcome of the Global Goals. As new products and technologies offer us more information about the state of sustainable development, we must ensure that the necessary tools and resources are also made available, so that policymakers, researchers, and citizens are empowered to ask the right questions and produce more sustainable and equitable solutions.