Calculating the distance between a set of postcodes and ranking them in order.

Overview

There are sometimes instances where you have a set of locations and you need to rank them in order. A perfect example of this is allowing customers to find out the closest branch of a shop. The example here involves a set of ice cream shops and showing which one is closest to the postcode we specify.

strawberry_ice_cream-t2

You can see the code here in action at: charitiesneed.co.uk. As always any patches to the code are very welcome

Getting started with the example

The first step is to download a copy of the source code and sql script which you can do using git:

git clone https://github.com/vixre/postcodes_ordered_by_distance

or downloading it all in a zip file

Create a new database and import the table with the UK postcodes and their corresponding lat and lon (UK postcode database was provided by freemaptools.com).

mysql> create database example_db;

then exit back into your shell, cd to the source code directory and run:

mysql> mysql -u username -p -h localhost example_db < database.sql

That’s the database all set up so now take a look at the example (search.php).

<?php
	require_once('distance.php');
	date_default_timezone_set('Europe/London');
	// connect to database
	$db_server = 'localhost';
	$db_user = 'root';
	$db_pass = 'password';
	$db_name = 'example_db';
	$conn = new mysqli($db_server, $db_user, $db_pass, $db_name);</code>
	
	if ($conn->connect_error) {
	    trigger_error('Database connection failed: ' . $conn->connect_error, E_USER_ERROR);
	}
	
	// The origin post code
	$_REQUEST['post_code'] = 'YO31 8UB';
	
	// Remove all spaces in post code
	$post_code_search = str_replace(' ', '', $_REQUEST['post_code']);
	
	$page_number = (!empty($_REQUEST['page_number'])) ? $_REQUEST['page_number'] : $page_number = 1;
	
	$distance = new distance;
	
	$distance->entities_per_page = 10;
	$distance->offset = ($page_number - 1) * $distance->entities_per_page;
	
	// Initialize map object and set location origin
	$distance->origin = $post_code_search;
	
	$distance->populate_coordinates();
	$returned_array = $distance->build_result();
	
	// Encoded in json
	echo json_encode($returned_array);
?>

Essentially what this does is takes a postcode (YO31 8UB), looks up the corresponding lat and lon coordinates and runs a SQL query using the haversine formula and returns the ice cream shops in order of distance. So essentially the customer knows where the nearest place to get an ice cream is. Once you have entered your database credentials to search.php you should be able to run:

php search.php

which should return (distance in miles):

  Array
    (
     [id] => 1
     [name] => Archers Ice Cream
     [distance] => 29.00979938205737
    )
    Array
    (
     [id] => 7
     [name] => Brymor Ice Cream
     [distance] => 157.5704259332447
    )
    Array
    (
     [id] => 6
     [name] => Wensleydale Ice Cream
     [distance] => 204.28282924774888
    )
    Array
    (
     [id] => 2
     [name] => Walls factory
     [distance] => 214.04476274067244
    )

There you have it! We can now see that out of all the ice cream shops in the database (as the crow flies) ‘Archers Ice Cream’ is the nearest. Hopefully this example will help you gain an idea of how you can implement this functionality yourself into the projects you are working on.

Need help or not exactly what you are looking for?

Vixre is a consulting and software development company based in Northallerton, England. We’re happy to help you with implementing the distance calculation, data preparation, processing, server setup and such. Just let us know what you would like to achieve and we will see if we can be of service.

Possible improvements

Investigating the possibility of integration with the postcode.io API. This would likely involve porting the existing code (worth chatting with the postcode.io team before doing this).

Currently the class only provides an ‘as the crow flies‘ distance calculation. Using a path routing tool (e.g. OSRM) would be very useful and should allow for distance estimates that take into account roads and terrain.

9 thoughts on “Calculating the distance between a set of postcodes and ranking them in order.”

  1. HI,

    Thanks for this excellent guide. For a complete newbie to SQL, can you point me to an installation process? I will be building a Ubuntu machine for this.

    Thanks!

    Chris

  2. As I understand it, because the “ice_cream_shops” table only holds Postcode information for each shop, not their latitude & longitude, every fresh search needs to determine the latitude & longitude of the shops again by trawling the massive UK Postcodes table, even though these will never change unless the shop moves address.

    Adding latitude & longitude fields to the “ice_cream_shops” table, and populating these fields from a one-off query to the UK Postcodes table will greatly reduce the overall query time, as only the latitude & longitude for the Postcode of the person wanting ice cream needs to reference the UK Postcodes table – all the other coordinates are already recorded.

    1. Thanks for the feedback. Yes, you’re absolutely right, adding lattitude and longitude columns to the ice_cream_shops, checking the two columns, populating the two fields if they are empty and using these instead should be a big improvement performance wise and likely trivial to implement. I’m more than happy to accept patches if you fancy writing the enhancement! Thanks again.

  3. I can see an improvement to the approach. What i can figure out from the implementation is that even if we are looking for only 5 nearest locations, the distance has to be calculated for all locations. Using GIS extensions we can do spatial search for location only within a certain distances. When i implemented this I used an square envelope around the postcode.
    The only problem is knowing reasonable value of this distance from search point so that we can get atleast X nearest locations. This value of distance depends on the density of locations available around the postcode.

    Looking for more ways to optimize the solution.

Leave a Reply to John Williams Cancel reply

Your email address will not be published. Required fields are marked *