Methodology | Climate Displacement Lab

Methodology

What this dataset does

The Climate Displacement Event Database documents climate-induced displacement at a granular event level. Each entry captures a specific instance of displacement linked to a climate-related hazard, with structured information on its timing, location, drivers, scale, and impacts.

The dataset is designed to move beyond aggregated estimates and instead provide event-based evidence that reflects how displacement unfolds in real contexts. It enables analysis of patterns across geographies, hazard types, and movement pathways.

The focus is on capturing the dynamics of displacement rather than only reporting numbers. This includes where people move from, where they move to, and under what conditions.

Origin → Temporary Shelter → Urban Settlement → Return / Secondary Move

What is a displacement event

A displacement event refers to a situation in which individuals or households are forced to leave their place of residence due to climate-related hazards or environmental stress.

This includes sudden-onset events such as floods, cyclones, and landslides, as well as slow-onset processes such as drought, salinization, erosion, and long-term environmental degradation.

An event is defined based on a specific time period and geographic location. Where available, the dataset captures multiple phases of movement, including initial displacement, secondary movement, and return.

What data is collected

Each displacement event is structured across a consistent set of variables to enable comparability and analysis.

Geographic information, including origin and destination locations
Temporal information such as date and duration
Hazard type and contributing environmental drivers
Scale of displacement, where reported
Movement patterns, including direction and type of movement
Conditions at destination locations, including housing, water, sanitation, and food access
Indicators of vulnerability across affected populations
Reported loss and damage, including impacts on housing, livelihoods, and assets

The dataset is designed to accommodate partial information, as complete data is often not available in real-world reporting contexts.

Field	Example
Event ID	CDL-IND-2024-001
Location (Origin)	Coastal Odisha
Destination	Bhubaneswar
Hazard	Cyclone
Date	May 2024
Estimated Displacement	12,000 people
Movement	Rural → Urban
Housing	Temporary shelters
Water Access	Limited
Source	ReliefWeb
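
As a rough illustration, the example record above could be held in a structure where every field except the identifier is optional, reflecting the dataset's tolerance for partial information. The class name, field names, and the second (partial) record are assumptions made for this sketch, not the dataset's published schema.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DisplacementEvent:
    """Hypothetical record layout; field names are illustrative only."""
    event_id: str
    origin: Optional[str] = None
    destination: Optional[str] = None
    hazard: Optional[str] = None
    date: Optional[str] = None                    # kept as reported, e.g. "May 2024"
    estimated_displacement: Optional[int] = None  # people, only when a source reports it
    movement: Optional[str] = None
    housing: Optional[str] = None
    water_access: Optional[str] = None
    source: Optional[str] = None

    def missing_fields(self) -> List[str]:
        """Names of fields left unfilled because nothing was reported."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# The example row from the table above.
example = DisplacementEvent(
    event_id="CDL-IND-2024-001",
    origin="Coastal Odisha",
    destination="Bhubaneswar",
    hazard="Cyclone",
    date="May 2024",
    estimated_displacement=12000,
    movement="Rural → Urban",
    housing="Temporary shelters",
    water_access="Limited",
    source="ReliefWeb",
)

# A made-up partial record: unreported fields stay None rather than being guessed.
partial = DisplacementEvent(event_id="EXAMPLE-PARTIAL", hazard="Flood")

print(example.missing_fields())
print(partial.missing_fields())
```

Keeping unreported values as `None` lets a user tell "not reported" apart from a real value such as zero.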

How data is collected

Data is collected through a multi-source approach that combines desk-based research with remote engagement.

Verified news reports and media coverage
Government publications and official statements
Humanitarian situation reports produced by NGOs and international agencies
Academic and field-based research, where available
Direct conversations with affected communities conducted remotely

Given the absence of continuous field presence, the methodology relies on systematic extraction and structuring of publicly available information.

Remote interactions with affected communities are used, where feasible, to validate and contextualize reported information.

Priority is given to sources that provide specific, time-bound, and location-referenced data.

Data quality and transparency

All data points are linked to their original sources to ensure traceability. The dataset does not introduce estimates or inferred values where data is not available.

Where information is missing, fields are left unfilled rather than approximated. This approach prioritizes transparency over completeness.

In cases where multiple sources report differing figures or details, the most consistent or clearly attributable information is retained, and discrepancies are noted where necessary.

The dataset is structured to allow users to assess the reliability and origin of each data point.
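
One way to picture the handling of conflicting figures is sketched below. It is only an assumption about how "most consistent" might be operationalised: the figure reported by the largest number of sources is retained, ties go to the report listed first, and every other figure is recorded in a note. The `Report` type, the `reconcile` function, the placeholder source names, and the figures in the usage example are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Report:
    source: str               # where the figure was published
    displaced: Optional[int]  # None when the source gives no figure


def reconcile(reports: List[Report]) -> dict:
    """Keep the most frequently reported figure and note the others."""
    reported = [r for r in reports if r.displaced is not None]
    if not reported:
        # Nothing reported: the field stays unfilled, no value is inferred.
        return {"value": None, "sources": [], "note": None}

    counts = Counter(r.displaced for r in reported)
    value, _ = counts.most_common(1)[0]

    # Every retained value stays linked to the sources that reported it.
    sources = [r.source for r in reported if r.displaced == value]
    others = sorted({r.displaced for r in reported} - {value})
    note = f"Other reported figures: {others}" if others else None
    return {"value": value, "sources": sources, "note": note}


# Illustrative input only; source names and numbers are placeholders.
result = reconcile([
    Report("Source A", 12000),
    Report("Source B", 12000),
    Report("Source C", 15000),
    Report("Source D", None),
])
print(result)
```

A frequency count is a crude stand-in for editorial judgement; in practice, attribution quality may matter more than how many outlets repeat a number.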

Limitations

The dataset is subject to several limitations.

Displacement events are often underreported, particularly in rural or remote areas. Media and institutional coverage tends to focus on large-scale or sudden-onset events, leaving gaps for smaller events and slow-onset processes.

Data availability varies significantly across regions and hazard types. As a result, some events may have detailed information while others remain partial.

The reliance on secondary sources and remote data collection limits the ability to verify all aspects of an event on the ground.

Efforts are made to expand direct data collection through field engagement. However, this is dependent on access and available resources.

Confidence and verification levels

Each displacement event is assigned a confidence level based on the quality and consistency of available information.

This is not a score of impact, but a measure of how reliably the event is documented.

High confidence: Multiple independent sources report consistent information, with clear references to time, location, and scale.
Medium confidence: Information is available from at least one credible source, but may lack completeness or cross-verification.
Low confidence: Limited or fragmented reporting, where key details such as scale, exact location, or timing are unclear.

Confidence levels are intended to help users interpret the dataset critically, especially when comparing across regions or event types.
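
The three levels can be read as a simple decision rule. The sketch below is one possible interpretation, not the Lab's actual procedure: the function name, its parameters, the threshold of two independent sources for "High", and the requirement that "Medium" events have both a time and a location reference are assumptions chosen for illustration.

```python
def confidence_level(
    independent_sources: int,
    consistent: bool,
    has_time: bool,
    has_location: bool,
    has_scale: bool,
) -> str:
    """Map documentation quality to High / Medium / Low (illustrative rule)."""
    complete = has_time and has_location and has_scale

    # High: several independent sources agree and all key details are present.
    if independent_sources >= 2 and consistent and complete:
        return "High"

    # Medium: at least one credible source, with time and location known,
    # but scale or cross-verification may be missing.
    if independent_sources >= 1 and has_time and has_location:
        return "Medium"

    # Low: fragmented reporting where key details are unclear.
    return "Low"


print(confidence_level(3, True, True, True, True))
print(confidence_level(1, True, True, True, False))
print(confidence_level(1, True, False, True, False))
```

The level describes how well an event is documented, so a large, well-known event can still be rated "Low" if its details are poorly sourced.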

Event verification approach

Each event is verified through the following checks.

Identifying whether the displacement is directly or indirectly linked to a climate-related hazard
Checking for consistency across available sources
Confirming minimum required attributes such as location and time reference
Distinguishing between reported displacement and projected or anticipated displacement

Events based purely on forecasts, projections, or policy discussions are not included unless displacement has actually occurred.

Where possible, remote conversations with affected communities are used to validate or enrich reported information.
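
The inclusion checks described above could be expressed as a small screening function. This is a minimal sketch under stated assumptions: the `Candidate` type, its field names, and the wording of the returned messages are hypothetical. Source consistency is left out because the text does not say that an inconsistency alone excludes an event.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    location: Optional[str]
    time_reference: Optional[str]
    hazard_link: Optional[str]   # "direct", "indirect", or None if no link was found
    displacement_occurred: bool  # False for forecasts, projections, policy discussions


def verification_issues(candidate: Candidate) -> List[str]:
    """Return the reasons a candidate event fails screening (empty if it passes)."""
    issues = []
    if candidate.hazard_link not in ("direct", "indirect"):
        issues.append("no link to a climate-related hazard identified")
    if not candidate.location:
        issues.append("missing location")
    if not candidate.time_reference:
        issues.append("missing time reference")
    if not candidate.displacement_occurred:
        issues.append("displacement is projected, not reported")
    return issues


reported = Candidate("Coastal Odisha", "May 2024", "direct", True)
forecast = Candidate("Coastal Odisha", None, "direct", False)

print(verification_issues(reported))
print(verification_issues(forecast))
```

Returning a list of reasons rather than a single yes/no keeps the screening decision traceable.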

Data structure and schema

The dataset is structured at the level of individual displacement events.

Event identification
Geographic attributes
Temporal attributes
Hazard classification
Displacement characteristics
Conditions at destination
Vulnerability indicators
Loss and damage
Source metadata

The schema is designed to be extensible and scalable.
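
To make the grouping concrete, the sketch below lays out the nine attribute groups as a mapping from group name to fields and shows one way new fields could be added without disturbing existing ones. The group keys follow the list above, but every field name and the `extend_schema` helper are assumptions introduced for illustration.

```python
from typing import Dict, List

# Hypothetical field names, grouped by the attribute groups listed above.
SCHEMA: Dict[str, List[str]] = {
    "event_identification": ["event_id"],
    "geographic_attributes": ["origin", "destination"],
    "temporal_attributes": ["date", "duration"],
    "hazard_classification": ["hazard", "environmental_drivers"],
    "displacement_characteristics": ["estimated_displacement", "movement"],
    "conditions_at_destination": ["housing", "water_access", "sanitation", "food_access"],
    "vulnerability_indicators": ["vulnerability_notes"],
    "loss_and_damage": ["housing_loss", "livelihood_loss", "asset_loss"],
    "source_metadata": ["source", "confidence"],
}


def extend_schema(schema: Dict[str, List[str]], group: str, new_fields: List[str]) -> Dict[str, List[str]]:
    """Return a copy of the schema with extra fields appended to a group.

    Existing groups and fields are never removed or reordered, so records
    written against the earlier schema remain valid.
    """
    extended = {name: list(cols) for name, cols in schema.items()}
    current = extended.setdefault(group, [])
    for field_name in new_fields:
        if field_name not in current:
            current.append(field_name)
    return extended


# Adding a made-up field to an existing group leaves the original untouched.
v2 = extend_schema(SCHEMA, "temporal_attributes", ["return_date"])
print(v2["temporal_attributes"])
print(SCHEMA["temporal_attributes"])
```

Append-only extension is one common way to keep a schema backward compatible; other approaches, such as versioned schemas, would also fit the stated goal.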