SDG Data Catalog
An open, evolving global database of SDG-relevant data sets.
The SDG Data Catalog is an open, extensible, global database of data sets, metadata, and research networks built automatically by mining millions of published open access academic works.
The SDG Data Catalog leverages advancements in Artificial Intelligence (AI) and Natural Language Processing (NLP) technologies to extract and organize knowledge from public datasets that is otherwise hidden in plain sight in the continuous stream of research generated by the scientific community.
The goal, ultimately, is to connect researchers and students with SDG-relevant datasets so that their work can make meaningful progress towards social good.
Hidden in Plain Sight: Building a Global Sustainable Development Data Catalogue
By James Hodson & Andy Spezzati
Modern scientific research for Sustainable Development depends on the availability of large amounts of relevant real-world data. However, there are currently no extensive global databases that associate existing data sets with the research domains they cover.
We present the SDG Data Catalogue – an open, extensible, global database of data sets, metadata, and research networks built automatically by mining millions of published open access academic works. Our system leverages advances in AI and NLP technologies to extract and organise deep knowledge of available data sets that is otherwise hidden in plain sight in the continuous stream of research generated by the scientific community.
Explore SDG Datasets
AI Induced Empathy
Datasets and projects designed to increase empathy for often impoverished victims of far-away disasters.
People Living in Extreme Poverty
The World Poverty Clock developed by the World Data Lab provides real-time poverty estimates through 2030 for nearly all countries.
Poverty and Equity Database
The latest poverty and inequality indicators compiled from officially recognized sources with national, regional and global estimates.
Social Protection Responses to COVID-19
Annual social protection data are compiled by the International Labour Organization (ILO) through its Social Security Inquiry, sourced from national administrative data. The indicators are disseminated through the ILO World Social Protection Data Dashboards.
UNICEF Research & Reports
A list of equitable data sets, research and reports from UNICEF Office of Innovation to support programmes, campaigns, and initiatives.
The Group on Earth Observations Global Agricultural Monitoring (GEOGLAM) Crop Monitor (https://cropmonitor.org/) is an international initiative that was developed under the framework of the 2011 G20 Action Plan on Food Price Volatility in Agriculture.
Created by the UN Food and Agriculture Organization (FAO), an agency dedicated to international efforts to end hunger, this dataset tracks desert locust observations, as well as whether the observed locusts are adults or nymphs (known as hoppers) and whether the locusts form a group.
Global Hunger Index 2013
The Global Hunger Index (GHI) is a tool designed to comprehensively measure and track hunger globally, by region and country.
Global Hunger Index 2017
The Global Hunger Index (GHI) is a tool designed to comprehensively measure and track hunger globally, by region and country.
Insufficient Food Intake
The World Food Programme (WFP) has developed the HungerMapLIVE, a global hunger monitoring system that tracks and predicts hunger in near-real time.
Good Health & Wellbeing
1000 Genome Project
The International Genome Sample Resource contains the most extensive catalogue of genetic variation in humans including SNPs, structural variants and haplotype context.
Access to Health Services for women of child bearing age
World Pop is an applied research group focussed on mapping demographics in low- and middle-income countries. One of its activities is measuring the availability and geographical accessibility of healthcare services at the national and sub-national levels across Sub-Saharan Africa.
Created by the Johns Hopkins University Center for Systems Science and Engineering, this dataset reports COVID-19 cases at the provincial level in China, at the county level in the U.S., and at the state and national levels for other countries.
COVID-19 Vaccine Procurement
This dataset provides the most recent data on vaccine purchases and negotiations by individual countries and unilateral partnerships from 16 companies.
Global Health Observatory
The GHO data repository contains data collected by the World Health Organization on various health-related statistics including mortality and disease burden rates in 194 countries.
Global Data Set on Education Quality
Panel database on education quality featuring data from 163 countries from 1965 to 2015.
Learners impacted by COVID-19
The United Nations Educational, Scientific and Cultural Organization (UNESCO) is supporting countries in their efforts to mitigate the immediate negative impact of school closures and to facilitate the continuity of education through remote learning.
World Inequality Database on Education
The World Inequality Database on Education (WIDE) highlights the powerful influence of circumstances, such as wealth, gender, ethnicity and location.
Digital Gender Gap
The University of Oxford and Qatar Computing Research Institute (QCRI), with support from Data2X, are collaborating to measure digital gender gaps in real time.
Family Planning by Harvard Dataverse
Database providing data from family planning surveys conducted in various countries.
Female World Leaders
The United Nations Protocol and Liaison Service maintains a list of Heads of State, Heads of Government, and Ministers for Foreign Affairs of all Member States based on the information provided by the Permanent Missions.
Percentage of Women in Parliament
The Inter-Parliamentary Union (IPU) tracks monthly rankings of the percentage of women in parliament from January 2019 onwards through Parline, a free resource with over 600 data points provided directly by national parliaments on their structure, composition, working methods, and activities.
UNECE Big Datasets on Fertility, Family and Households
Provides datasets on households across the globe including marriages, fertility rates, adolescent fertility, etc.
Clean Water & Sanitation
Datasets by World Resources Institute
Provides datasets on various issues including flood hazard maps, water risk indicators and water stress projections across the globe.
Global & Regional Stats by World Data Atlas
Datasets providing world and regional statistics, data and maps.
International Water Footprint by WaterStat
The most comprehensive source of international water footprint data including scarcity and pollution issues.
Trophic State (Water Quality)
The UN Environment Programme (UNEP) works with partners to support the global monitoring of freshwater ecosystems, as reported through the Freshwater Ecosystems Explorer, which provides up-to-date geospatial data on changes to their extent and water quality.
The ISciences Water Security Indicator Model v2 (WSIMv2) describes places where water availability during the most recent 12-month period is more or less than would be expected based on a 1950-2009 baseline period.
The Falkenmark Water Stress Index is a widely used metric to characterize water stress based on annual renewable water supply per capita.
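As a rough illustration of how a per capita index of this kind is read, the sketch below divides annual renewable water supply by population and maps the result to a category. The thresholds (1,700, 1,000 and 500 cubic metres per person per year) are the commonly cited Falkenmark cut-offs rather than values taken from this catalogue entry, and the function names and sample country figures are hypothetical.

```python
# Commonly cited Falkenmark thresholds, in cubic metres of renewable
# water per person per year (an assumption, not stated in the catalogue entry).
THRESHOLDS = [
    (500.0, "absolute scarcity"),
    (1000.0, "scarcity"),
    (1700.0, "stress"),
]


def water_per_capita(renewable_supply_m3: float, population: int) -> float:
    """Annual renewable water supply per person, in cubic metres."""
    if population <= 0:
        raise ValueError("population must be positive")
    return renewable_supply_m3 / population


def falkenmark_category(m3_per_capita: float) -> str:
    """Return the first category whose upper bound exceeds the value."""
    for upper_bound, label in THRESHOLDS:
        if m3_per_capita < upper_bound:
            return label
    return "no stress"


if __name__ == "__main__":
    # Hypothetical country: 60 cubic kilometres of renewable water per year
    # (60e9 cubic metres) shared by 50 million people.
    per_capita = water_per_capita(60e9, 50_000_000)
    print(per_capita, falkenmark_category(per_capita))
```

Because the index depends only on supply and population, it says nothing about water quality, seasonal variation, or how water is actually used, so it is best treated as a coarse first screen.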
Affordable & Clean Energy
Data on energy consumption and per capita energy consumption of a few countries.
Irena Renewable Energy Statistics
Detailed statistics on renewable energy capacity, power generation and renewable energy balances.
Our World in Data
Data on global energy consumption by source, energy production and trade, energy transitions and renewable energy investments.
Population Without Electricity
Developed by Fondazione Eni Enrico Mattei (FEEM), a sustainable development think-tank, this dataset measures electricity access in Sub-Saharan Africa.
The Shift Project
Datasets on primary energy production and consumption, CO2 from fossil fuels, greenhouse gas emissions, renewable energy and electricity.
US Energy Information Administration
Global data on energy generation and consumption, energy intensity, CO2 emissions as well as import and export statistics.
Decent Work & Economic Growth
Research by the Oxford Martin School on child labor internationally.
COVID-19 Fiscal Response
The International Monetary Fund (IMF) compiles a database on fiscal measures announced by 141 different governments in response to the COVID-19 pandemic.
COVID-19 impact on employment
The International Labour Organization (ILO) is tracking the severe impact of COVID-19 on the world of work.
GDP Growth Rates
The OECD’s quarterly national accounts (QNA) dataset presents GDP growth data collected from all the OECD member countries and some other major economies on the basis of a standardised questionnaire.
International Monetary Fund Data
The IMF publishes a range of time series data on IMF lending, exchange rates and other economic and financial indicators.
Industry, Innovation & Infrastructure
Access to Internet
The International Telecommunication Union measures internet access across the globe twice a year using survey data.
Globalization and Industrialization in Developing Countries
This study investigates the effect of the latest wave of economic globalization on manufacturing employment in developing countries.
UNIDO Industrialization Intensity Index
Gives graphs as well as country highlights relevant to survey results.
Global Database of Shared Prosperity
The World Bank’s Global Database of Shared Prosperity covers 83 countries, with 75 percent of the world’s people, with most recent estimates available for 2013.
International Migrant Stock
Reported by the UN Department of Economic and Social Affairs (UN DESA), international migrant stocks are estimates of the total number of international migrants present in a given country at a particular time.
ITU AI Repository
A global Artificial Intelligence (AI) repository to identify AI related projects, research initiatives, think-tanks and organizations that can accelerate progress towards the 17 UN Sustainable Development Goals.
The Standardized World Income Inequality Database
The goal of the SWIID is to meet the needs of those engaged in broadly cross-national research by maximizing the comparability of income inequality data.
Sustainable Cities & Communities
Air Quality (PM2.5)
OpenAQ, a non-profit organization, collects daily air quality information from stations around the world and provides it as free and open data to help better monitor and manage the air we breathe.
COVID19 Community Mobility Reports
Google’s Community Mobility Reports chart the geographic movement trends associated with COVID-19 over time and provide the data, aggregated and anonymized, to the public.
European Data Portal
The European Data Portal harvests the metadata of Public Sector Information available on public data portals across European countries. Information regarding the provision of data and the benefits of re-using data is also included.
India Smart Cities: Open Data Platform
A compilation of smart cities around the world that have shared open data in an aggregated data portal.
Major Smart Cities with Open Data 2019
A compilation of smart cities around the world that have shared open data in an aggregated open data portal.
The database constitutes a comprehensive set of settlement polygons. It is in geodatabase format and consists of three feature classes for built up areas (BUA), small settlement areas (SSA), and hamlets (hamlets).
Sustainable Cities and Society Mendeley Datasets
Mendeley Data Repository is free-to-use and open access. It enables you to deposit any research data (including raw and processed data, video, code, software, algorithms, protocols, and methods) associated with your research manuscript.
The Settlement Profiling Tool
The Settlement Profiling Tool guides field personnel in creating cross-sectoral settlement profiles intended to help inform future urban development plans and policies in displacement affected contexts.
Responsible Consumption & Production
Global SDG Indicators Database
This platform provides access to data compiled through the UN System in preparation for the Secretary-General’s annual report on “Progress towards the Sustainable Development Goals.”
Installed Renewable Energy Capacity
The International Renewable Energy Agency (IRENA), an intergovernmental organization that supports countries in their transition to a sustainable energy future, compiled this dataset by measuring the maximum net generating capacity of renewable and non-renewable energy sources by country.
Moldova | SDG Integration
A collaborative data platform that integrates different types of data to allow the Moldovan Government access to exhaustive information on land coverage, population density and mobility behaviour.
SDG Production Tracker
SDG Tracker is a free, open-access publication that tracks global progress towards the SDGs and allows people around the world to hold their governments accountable to achieving the agreed goals.
Arctic Sea Ice
Areas of the ocean that have frozen are considered “sea ice,” and can vary from slushy, barely solid areas to sheets of ice that are meters thick.
Carbon Dioxide Emissions
The Carbon Monitor dataset, led by researchers Zhu Liu, Philippe Ciais and Steven Davis, was created as the first estimate of daily CO2 emissions for six different sectors, including power, ground transportation, industrial production, residential consumption, and maritime and aircraft transportation.
Drought and Precipitation
The Climate Hazards Group InfraRed Precipitation with Station Data (CHIRPS) is a joint project between the U.S. Geological Survey and UC Santa Barbara.
Global Temperature Change
The National Oceanic and Atmospheric Administration (NOAA), the National Aeronautics and Space Administration (NASA), and the UK Meteorological Office (UK Met) have used detailed station data going back to the 1800s to analyze temperature changes and have all confirmed the warming of our planet.
National Centers for Environmental Information
NCEI provides the world’s largest collection of weather and climate data, including information that’s “land-based, marine, model, radar, weather balloon, satellite, and paleoclimatic” alongside other datasets.
NOAA – Climate.gov
Provides science and information, focusing on news, data, and climate teaching materials, and the data products and services to track global climate data.
Our World In Data
Our World in Data provides a complete guide to CO2 and greenhouse gas emission profiles for individual countries, charting how emissions are changing in each country, reduction progress and statistics.
Life below Water
Bleaching of Coral Reef Areas
Coral reefs are one of the most diverse and ecologically important areas in the world, but many are threatened by rising ocean temperatures.
Figshare – Plastic Pollution in the World’s Oceans 2007-2013
A global dataset of 1571 locations where surface manta tows were conducted.
Figshare – River Plastic Emissions to the World’s Oceans
Sources of ocean plastic organized by river.
Global Critical Habitat of Biodiversity – Ocean Data Viewer
The global spatial distribution of likely or potential Critical Habitat, as defined by the International Finance Corporation’s Performance Standard 6 (IFC PS6) criteria, comprises 20 underlying datasets.
Global Fishing Activity
Global Fishing Watch (GFW) is advancing ocean governance through increased transparency of human activity at sea.
NOAA – Marine Debris
Datasets on marine debris and garbage patches in the oceans.
Ocean Tracking Network
The Ocean Tracking Network is a global aquatic animal tracking, data management, and partnership platform.
Plastic Pollution Coalition
Includes the lifecycle of plastic in the oceans, plastic hotspots, and other measures.
Life on Land
CON-VIVA: Towards Convivial Conservation
The project is grounded in the premise that conservation is critical to transformations to sustainability but that its practices need to change radically.
The World Database on Protected Areas (WDPA) was established in 1981 after the UN Economic and Social Council called for a list of natural reserves, citing its economic, scientific, and conservation value.
Global Forest Watch (GFW) provides data and tools for monitoring forests and provides access to near real-time information about where and how forests are changing around the world.
Global Forest Watch
Provides data about forests including land cover, land use, biodiversity metrics and forest change allowing for the monitoring and management of forests.
Provides data on forest ecosystems including tree cover loss and gain rates, restoration opportunities, forest fires and biodiversity hotspots.
Sustainable Nutrition for All
Aimed to improve nutrition through the adoption of agro-biodiversity and improved dietary diversity at the household level in Uganda & Zambia.
The Forest Atlases
Allows users to visualize and analyse data on country specific forest characteristics.
Norway’s International Climate and Forests Initiative (NICFI) makes high-resolution (<5m per pixel) optical satellite imagery of the tropics freely available to all in the pursuit of helping stop deforestation and combat climate change.
The Active Fires product, managed by the National Oceanic and Atmospheric Administration (NOAA), is based on the detection and analysis of active wildfires as received by a sensor.
Features environmental conservation and restoration frameworks for policymakers and private-sector initiatives including infographics, datasets, visualization tools, and more.
Peace, Justice & Strong Institutions
The Armed Conflict Location & Event Data Project (ACLED), a disaggregated data collection, analysis, and crisis mapping project, maintains a database of all forms of human conflict from over 50 developing countries.
Datasets Archives – UNICEF DATA
Find data sets on topics including early childhood development, infant mortality, and intimate partner violence.
Internal Displacement Monitoring Centre (IDMC)
Provides data and analysis, and supports partners to identify and implement solutions to internal displacement.
Our World in Data: Human Rights
Tracking human rights abuses over time.
National Geospatial-Intelligence Agency, an agency within the United States Department of Defense, records instances of hostile attacks against ships and mariners via its Anti-Shipping Activity Messages (ASAM) database.
SDG16 Data Initiative
Pulls together data sets in an open format to track SDG16 and provide a snapshot of the current situation, and eventually progress over time.
UCDP – Uppsala Conflict Data Program
Tracks global conflict and violence.
Voluntary National Reviews
The Voluntary National Reviews (VNRs) aim to facilitate the sharing of experiences, including successes, challenges, and lessons learned, with the goal of accelerating the implementation of the 2030 Agenda.
WJP Rule of Law Index
Covering major topics on law and order by country.
Partnerships for the Goals
Methodology | SDG – Human Rights Data Explorer
The Danish Institute developed and trained an algorithm to link human rights recommendations to the corresponding SDG(s).
Official Development Assistance
Official development assistance (ODA) is defined by the OECD Development Assistance Committee as government aid that promotes and targets the economic development and welfare of developing countries.
Compiled by the World Bank, this dataset measures officially-recorded remittance inflows (remittances received) per country in 2020.
Sustainable Development Goals Today (SDG)
Real-time data on ongoing SDG progress.
THE Impact Rankings 2021 – SDG 17 Methodology
The project looks at the broader ways in which universities can collaborate in support of the SDGs and lists partnerships in a ranking system.
Further Research and Resources
Interview with Achim Rettinger
AI for Good Board Member and Full Professor at Trier University, Achim Rettinger discusses with the AI for Good Foundation Team his work in natural language processing, and how that can impact progress toward the SDGs. According to Professor Rettinger, AI and machine learning can be utilized to understand communication better by analyzing huge quantities of data. The data can help the international community uncover insights on the collective progress toward the 2030 deadline.
The SDG Data Catalogue is structured so that research and data sets can be submitted and shared. Free flow of knowledge and open source data is at the core of our vision.
Contact us to submit your research and to advise on the build-out of the search tool.
Association for Computing Machinery
ACM, the world's largest educational and scientific computing society, delivers resources that advance computing as a science and a profession.
The Institute for Operations Research and the Management Sciences is an international society for practitioners in the fields of operations research, management science, and analytics.
University of California, Berkeley
The University of California, Berkeley is a public research university in Berkeley, California.