Data · Client Project

HealthGrid

Every healthcare facility in the country — scraped, cleaned, and map-ready

2025 · Data · Python, Scrapy, Pandas, GeoPandas, PostgreSQL, Google Maps API

36 states covered · Multiple source registries unified · Full geocoding pipeline delivered

The Problem

The client needed a single, reliable dataset of every healthcare facility in Nigeria — public and private, primary through tertiary. No such dataset existed in usable form. Official registries were fragmented across 36 state health agencies: some in PDF, some on outdated web portals, none geocoded, and none talking to each other.

What We Built

We built a scraping and ingestion pipeline that pulled from every available official source: federal and state health ministry portals, the NHIA registry, and supplementary public data sources. Records were cleaned, deduplicated by fuzzy matching on facility name and address, and geocoded via the Google Maps Geocoding API. They were then structured into a standardised schema with facility type, ownership tier, service level, and coordinates, ready for direct map layer integration.
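As a rough illustration of the deduplication step, the sketch below compares normalised facility names and addresses and keeps the first record of each near-identical pair. It is not the production code. The `Facility` field names, the choice of the standard library's `difflib.SequenceMatcher` as the similarity measure, the rule that duplicates must share a state, and both thresholds are assumptions made for this example. Geocoding is left out; the coordinate fields simply stay empty until that step runs.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass
class Facility:
    # Hypothetical field names; they mirror the schema contents listed above.
    name: str
    address: str
    state: str
    facility_type: str
    ownership_tier: str
    service_level: str
    latitude: Optional[float] = None   # filled in by the geocoding step
    longitude: Optional[float] = None


def normalise(text: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two normalised strings."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def is_duplicate(
    a: Facility,
    b: Facility,
    name_threshold: float = 0.85,     # assumed value, tune on real data
    address_threshold: float = 0.75,  # assumed value, tune on real data
) -> bool:
    """Treat two records as one facility if both name and address are close."""
    if a.state != b.state:
        return False
    return (
        similarity(a.name, b.name) >= name_threshold
        and similarity(a.address, b.address) >= address_threshold
    )


def deduplicate(records: List[Facility]) -> List[Facility]:
    """Keep the first record of each group of near-duplicates (O(n^2) scan)."""
    kept: List[Facility] = []
    for record in records:
        if not any(is_duplicate(record, existing) for existing in kept):
            kept.append(record)
    return kept
```

The pairwise scan is quadratic, so on a national dataset it would make sense to group records by state (or another blocking key) before comparing. Requiring both name and address to match is a deliberately conservative choice: many facilities share generic names such as "General Hospital", so a name match alone would merge distinct sites.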

The Outcome

We delivered a clean, geocoded, map-ready dataset covering healthcare facilities across all 36 states, ready for integration into a national health access platform.

What We Learned

Government data in Nigeria lives in unexpected places, and often in formats that were never meant to be processed programmatically. We spent as much time on source discovery as on the pipeline itself. A well-documented source map turned out to be as valuable as the data it pointed to.