On 2020/06/06, I finally made my first Python module. The reason for the creation of this module is that I needed to perform certain calculations regarding distances. I wanted to write an article that would cover the most well-known ways of transmitting data. Namely, via streaming protocols. I ended up logically talking about RabbitMQ and Kafka. Although it was clear to me what I wanted to write about, I wasn't sure what should I base my article on. An idea was to calculate distances in Km on different planets.
Of course, for this, we need pretty good knowledge of how coordinates work. I ended up hastily implementing Python code to give me those calculations. At some point, I did think I was going too far in my article and ended up settling on planet 🌎 Earth 🌍 and making something with trains and trucks.
The old Geo Library can be found there on path bl-demo-server/bl-core-service.
What I want to demonstrate with this blog post is a few concepts about Python, how I implemented the Haversine Formula, what is this formula and where it comes from, and finally, how I made the package currently available on PyPI (Python Package Index)
In the book, Heavenly Mathematics: The Forgotten Art of Spherical Trigonometry, Oxford: Princeton University Press, 2012, we can find very interesting references about the Haversine Theory. It also contains a complete description of it. For us, what's important is to understand its origins and what it is about.
The beginnings of this theory take us back to 1801.
At this time, Josef de Mendoza y Ríos used it for the first time. Josef de Mendonza y Rios, also known as José de Mendonza y Ríos, was a Spanish astronomer and mathematician of the 18th century. He made several publications about navigation and his work was focused on providing mathematical tools to calculate the position of a ship at sea using natural resources and observation instruments. This position was to be measured in degrees and consisted of two dimensions: latitude and longitude.
José gained notoriety by being able to calculate these coordinates by using two altitudes of the Sun for the latitude and the distance of the moon to a celestial body for the longitude. Using tables, Sailors and other members of crews at sea could easily and quickly find approximately where they were. His work earned him an honorable place at the Royal Swedish Academy of Sciences.
However, the first specific publication regarding the Haversine formula only occurred only 4 years later in 1805. At that time, James Andrew published the first known Haversine tables in his ASTRONOMICAL and NAUTICAL tables with PRECEPTS for finding the LONGITUDE and LATITUDE of PLACES By Lunar Distances, Double Altitudes, &c, AND FOR SOLVING OTHER THE MOST USEFUL PROBLEMS IN PRACTICAL ASTRONOMY.
Finally, the actual term haversine
was invented by James Inman(1776–1859).
He was an English mathematician and professor at the Royal Naval College in Portsmouth UK. James Inman was very busy in 1835 with the development of a new tale of haversines
.This is where the term comes from. The new table of haversines
, by James Inman, introduced a very easy way to calculate the distance between two points by using simple spherical trigonometry.
This college has been closed, but it was active between 1733 and 1837.
As we can see through this very very quick introduction to its history, the haversine formula has existed for a long time. It is a centuries-old technology, but it's important to realize that although old, it is the basis of what we are about to see. Not only that, but it is part of the theory that makes today's GPS (Global Positioning System) possible and with that, we can nowadays know how long the train will take to get to us.
We now know when the buses are coming. We know how long it will take from A to B. We even now can have an estimate so precise, that we don't even think about it and take it most of the time for granted. Even when we have to go from A to B through C to D and all the way to Z. Almost everything is calculated with extreme precision.
We are going to see that although we will consider Earth to be a big globe, all the calculations will still match. Earth can also be described as a big potato that is almost round. That small difference for all our distance calculations is negligible. Unless, of course, we need even more extremely precise data for scientific studies. But that is another issue.
Let's first have a look at the meaning of the word Haversine
. I didn't mention this in the introduction just because this term involves quite a lot of mathematical explanation. This is why it doesn't seem to belong to a historical background section. However, it is precisely the end of that section where we will start this one. From the studies of James Inman, we can finally understand what a Haversine
is.
in Heavenly Mathematics: The Forgotten Art of Spherical Trigonometry
From this image, we can see that vers
is basically a versed sine
. We can now make the correct formula:
vers 𝜭 = 1 - cos 𝜭 = 2 sin ² (𝜭/2)
Half of that is
hav 𝜭 = (1 - cos 𝜭)/2 = sin ² (𝜭/2)
And this is what a haversine
formula is about.haversine
just means Half versed sine
.
That small distance become something very attractive for people at sea. The main reason was that it would always result in a positive number. Plus, a haversine
is very easily invertible and works well with small numbers. Rounding numbers isn't really a problem using haversines
.
For our implementation, we will use this base to develop different formulas.
In line with a good understanding of this problem, it is important that we become very much aware of what latitude and longitude actually mean.
One line parallel to the equator has the same latitude on every of its points and it is called a parallel.
Latitude is a measurement in degrees of the angle formed from the equator line of the earth to the location we are determining. The equator line seems like an obvious point to set the latitude at 0°. This is because it cuts Earth in two halves, and it was already responsible for the separation between the Northern and Southern hemispheres.
One line drawn from pole to pole has the same longitude and it is called a meridian.
Longitude is a measurement in degrees from the Prime Meridian to the location we are determining. The Prime Meridian has 0° longitude and goes through very near the Royal Observatory, in Greenwich, England. This has been determined by convention. The symbol for Longitude is λ
We are now going to get our hands on Python. Before we continue we need to be aware of a few things about Python and the libraries we are going to use:
public
protected
. We won't be using these for now.private
. We will use these.python
's math
use radians as arguments. This is why we will be using the radians
functions a lot.sqrt
), and we will need the pi
constant a lot
For version 1.0.0, we will be using some functions within a Coord
class. This makes functions easier to use and makes our code more elegant:
class Coord:
def __init__(self, lat, lon):
self.lat = lat
self.lon = lon
def delta_rads(self, lat, lon):
self.lat += lat
self.lon += lon
def distance_to_in_meters(self, coord, planet_radius_km=6371000):
meters = self.__distance_to(coord, planet_radius_km)
meters = round(meters, 3)
return meters
def distance_to_in_kilometers(self, coord, planet_radius_km=6371000):
meters = self.__distance_to(coord, planet_radius_km)
km = round(meters / 1000, 3)
return km
def __distance_to(self, coord, planet_radius_km):
lon1 = self.lon
lat1 = self.lat
lon2 = coord.lon
lat2 = coord.lat
phi_1 = radians(lat1)
phi_2 = radians(lat2)
delta_phi = radians(lat2 - lat1)
delta_lambda = radians(lon2 - lon1)
a = sin(delta_phi / 2.0) ** 2 + cos(phi_1) * cos(phi_2) * sin(delta_lambda / 2.0) ** 2
c = 2 * atan2(sqrt(a), sqrt(1 - a))
meters = planet_radius_km * c
return meters
def __repr__(self):
return "lat: " + str(self.lat) + " lon: " + str(self.lon)
Let's now break this class down into its subcomponents:
First, a look at the constructor:
def __init__(self, lat, lon):
self.lat = lat
self.lon = lon
This is basically the way a coordinate is defined. We set up the latitude and the longitude. One of the things that may be important to move along from one coordinate to another place is a Δ (delta). For this, I've created a function for that:
def delta_rads(self, lat, lon):
self.lat += lat
self.lon += lon
So now we can easily move step by step if, and only if, we know how many degrees we want to move. Of course, normally, we don't think about the Prime Meridian and we normally talk about kilometers or miles.
This is why I've also created this inner function:
def add_delta_in_km_to_coord(self, d_lat_km, d_lon_km, planet_radius_km=6371):
lat = self.lat
d_lat = (180 / pi) * (d_lat_km / float(planet_radius_km))
d_lon = (180 / pi) * (d_lon_km / float(planet_radius_km)) / cos(radians(lat))
self.delta_rads(d_lat, d_lon)
It's important to notice that we allow, in parameter planet_radius_km
, the radius of a planet. Remember that this module is supposed to be usable on different planets. We first convert our deltas to degrees. The latitude means that if we get a positive delta, we go North. A negative one and we go South. With that in mind, we also want to respect the curvature of the earth.
This is why we have a ratio of 180 degrees / pi
times ΔKlm / radius
. In terms of longitude, we will be going from west to east for a delta-positive number. However, for longitude, the curvature varies. Our parallel
line, depending on our position in the meridian
line, where we are, has a different radius. We compensate for this in our formula by dividing by cos(radians(lat))
. This means that we convert our latitude to radians
so that we can factor our result by its cos
(cosine). Finally, we just call our previous function which receives in degrees.
We now need to look at two parent functions:
def distance_to_in_meters(self, coord, planet_radius_km=6371000):
meters = self.__distance_to(coord, planet_radius_km)
meters = round(meters, 3)
return meters
def distance_to_in_kilometers(self, coord, planet_radius_km=6371000):
meters = self.__distance_to(coord, planet_radius_km)
km = round(meters / 1000, 3)
return km
We will make sure to give only three decimals in this first version for our calculations. It is implicit in both functions that we are trying to get the distance between two coordinates. This is our instance and one injected via params. Also, notice that we are also allowing for a planet_radius
. The only difference is that this function accepts this parameter only in meters
.
We finally reached the subject of our article.The Haversine formula
. According to the literature, the Haversine formula is effectively the source of the formula to calculate distances between two points. These points can be anywhere on a round planet. Of course, planets are more like giant "potatoes" than actual spheres, but again we need to remember that those humps are negligible.
2 r arcsin (√( sin ² ( (ɸ2 - ɸ2) / 2) + cos ɸ1 * cos ɸ2 sin ² ( (λ2 - λ1) / 2 )))
This formula is derived from the formula we saw in our introduction. In code terms, this translates to:
def __distance_to(self, coord, planet_radius_km):
lon1 = self.lon
lat1 = self.lat
lon2 = coord.lon
lat2 = coord.lat
phi_1 = radians(lat1)
phi_2 = radians(lat2)
delta_phi = radians(lat2 - lat1)
delta_lambda = radians(lon2 - lon1)
h = sin(delta_phi / 2.0) ** 2 + cos(phi_1) * cos(phi_2) * sin(delta_lambda / 2.0) ** 2
arc_sin = atan2(sqrt(h), sqrt(1 - h))
d = 2 * arc_sin * planet_radius_km
return d
In our code, we still find these utility methods.
def create_west_random_point(center, radius):
degree = random.randint(-90, 90)
d_lat_km = cos(radians(degree)) * radius
d_lon_km = sin(radians(degree)) * radius
origin = Coord(center.lat, center.lon)
origin = add_delta_in_km_to_coord(origin, -d_lat_km, d_lon_km)
return origin
def create_east_random_point(center, radius):
degree = random.randint(-90, 90)
d_lat_km = cos(radians(degree)) * radius
d_lon_km = sin(radians(degree)) * radius
origin = Coord(center.lat, center.lon)
origin = add_delta_in_km_to_coord(origin, +d_lat_km, d_lon_km)
return origin
What these do is create a random point in a limited square delimiting the circumference of a determined radius
and a certain ceter
Coord
.
In order to publish a library, I followed the rules determined by the example given in the Python Package Index Sample Project.
The important file for this is the setup.py
file:
# -*- coding: utf-8 -*-
from distutils.core import setup
with open("README.txt", "r") as fh:
long_description = fh.read()
setup(
long_description=long_description,
long_description_content_type="text/markdown",
name='geo_calculator',
package_dir={'': 'src'},
py_modules=["geo_calculator"],
version='1.0.0-SNAPSHOT',
description='Multi function Geo Location calculator',
author='João Esperancinha',
author_email='jofisaes@gmail.com',
url='http://joaofilipesabinoesperancinha.nl/main',
download_url='https://github.com/user/reponame/archive/v_01.tar.gz',
keywords=['geo', 'location', 'latitude', 'longitude'],
install_requires=[
# 'math',
# 'random',
],
classifiers=[
'Development Status :: 3 - Alpha',
'Intended Audience :: Developers',
'Topic :: Software Development :: Build Tools',
'License :: OSI Approved :: Apache Software License',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.4',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
],
)
The first line # -*- coding: utf-8 -*-
is important if we are using uncommon characters. In my own name, I do have one of these uncommon characters and that is the Ãẫ
of my name João
.
To configure descriptions, we need:
long_description=long_description,
long_description_content_type="text/markdown",
Note, that although markdown
descriptions are supported, I couldn't get them to work in a way that I like to see. This is why I just ended up reading a simple Readme.txt file instead:
with open("README.txt", "r") as fh:
long_description = fh.read()
We then need to give some self-explanatory parameters:
name='geo_calculator',
package_dir={'': 'src'},
py_modules=["geo_calculator"],
version='1.0.0-SNAPSHOT',
description='Multi function Geo Location calculator',
author='João Esperancinha',
For the rest of the parameters for our module, it is important to consult the publishing documentation.
One parameter that it's important to give better attention to is this:
classifiers=[
'Development Status :: 3 - Alpha',
'Intended Audience :: Developers',
'Topic :: Software Development :: Build Tools',
'License :: OSI Approved :: Apache Software License',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.4',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
],
Here we are specifying a sort of datasheet of our module. At the moment, my module is still an alpha version. Although it’s coming up well, I will still create better versions, faster and with more functionalities.
To get our module published, we need to install twine
:
sudo pip3 install twine
Now that we have twine
, we can finally create our distribution:
python setup.py sdist
Finally, we push this to our repository:
twine upload dist/*
If you are pushing your own module, be sure to have your test and production accounts created.
Note that we are going to perform tests locally without having to use any of these remote repos.
Before sending a package to any of these PyPI repos, we should do a unit test first and make sure that everything works. To do that, we can install our package locally. To do that, we first need to create it.
As seen before, we can create it using the sdist
switch:
python setup.py sdist
Now, let's install it with the command pip3
:
sudo pip3 install dist/geo_calculator-0.1.2.tar.gz
Finally, we can run our unit tests with python3
:
python3 -m unittest geo_calculator_test.py
Now, let's just have a quick look at how the unit tests are constructed. We first create the unit test without any functions:
import unittest
from geo_calculator import Coord
class CoordTest(unittest.TestCase):
def setUp(self):
pass
We are setting up the unit test and then we continue. Let's pick one example where we want to move 100Km Northeast:
def test_add_delta_in_km_to_coord_100_km(self):
coord = Coord(53.32055555555556, -1.7297222222222221)
coord.add_delta_in_km_to_coord(100, 100)
self.assertEqual(54.21987716147429, coord.lat)
self.assertEqual(-0.22417190360281203, coord.lon)
Now, let's test the calculation of a distance from one coordinate to another:
def test_distance_to_in_meters_one_place(self):
coord1 = Coord(52.0673599, 5.1102121)
coord2 = Coord(52.08608282419939, 5.109284354540611)
self.assertEqual(2082.859, coord1.distance_to_in_meters(coord2))
self.assertEqual(2.083, coord1.distance_to_in_kilometers(coord2))
Have a look at the other tests. See if you can see what the coordinates refer to and what they mean! It will be fun and positive 😉!
With this post in my blog, I hope to have been able to help you get a Python package on PyPI.It isn't really a complicated thing to do as you have seen. We also see that the Haversine
formula is really something to be remembered. It is everywhere you need to navigate; it is so widely used. Everything we know today in terms of navigation has a connection to these centuries-old theories, which are important to remember forever. The way to move forward to the future is always to remind ourselves of the lessons from the past.
Not only on Earth do we see this. In the future, we may be looking at other planets and establishing our GPS systems. No matter how small or how big those planets are, the haversines
and the works of these three remarkable mathematicians will always be there to remind us that no matter how far we go into the future, everything is important.
Our past is our connection to our present which makes us look into the future. One day we will need formulas that help us navigate through space with our mobile phones, I guess. In space, between planets, there is no curvature, and we can measure distances directly. We also measure distances in 4D. At one point in time,e Earth is closer than ever to Mars, and at one other point in time, the exact opposite happens.
This article has been written with a lot of positivity just to provide you with some guidelines about how to create a public Python module and to give you an introduction to the Haversine
formula.
If you want to install my module please run:
sudo pip3 install geo_calculator
And if you want to remove it:
pip3 uninstall geo_calculator
The module is available at PyPI: geo-calculator:1.0.0
If you like algorithms, you'll probably find it interesting to know a bit more about tail recursivity, how that led to tail call optimization, and the current state of things. I tell all about that in this video over here: