MiGreedy

A fast, parallel implementations of the iGreedy algorithm for large-scale anycast census

180 150W  120W  90W   60W   30W  000   30E   60E   90E   120E  150E 180
|    |     |     |     |     |    |     |     |     |     |     |     |
+90N-+-----+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+
|          . _..::__:  ,-"-"._       |7       ,     _,.__             |
|  _.___ _ _<_>`!(._`.`-.    /        _._     `_ ,_/  '  '-._.---.-.__|
|.{     " " `-==,',._\{  \  / {)     / _ ">_,-' `                mt-2_|
+ \_.:--.       `._ )`^-. "'      , [_/( G        e      o     __,/-' +
|'"'     \         "    _L       0o_,--'                )     /. (|   |
|         | A  n     y,'          >_.\\._<> 6              _,' /  '   |
|         `. c   s   /          [~/_'` `"(   l     o      <'}  )      |
+30N       \\  a .-.t)          /   `-'"..' `:._        c  _)  '      +
|   `        \  (  `(          /         `:\  > \  ,-^.  /' '         |
|             `._,   ""        |           \`'   \|   ?_)  {\         |
|                `=.---.       `._._ i     ,'     "`  |' ,- '.        |
+000               |a    `-._       |     /          `:`<_|h--._      +
|                  (      l >       .     | ,          `=.__.`-'\     |
|                   `.     /        |     |{|              ,-.,\     .|
|                    |   ,'          \ z / `'            ," a   \     |
+30S                 |  /             |_'                |  __ t/     +
|                    |o|                                 '-'  `-'  i\.|
|                    |/                                        "  n / |
|                    \.          _                              _     |
+60S                            / \   _ __  _   _  ___ __ _ ___| |_   +
|                     ,/       / _ \ | '_ \| | | |/ __/ _` / __| __|  |
|    ,-----"-..?----_/ )      / ___ \| | | | |_| | (_| (_| \__ \ |_ _ |
|.._(                  `----'/_/   \_\_| |_|\__, |\___\__,_|___/\__| -|
+90S-+-----+-----+-----+-----+-----+-----+--___/ /--+-----+-----+-----+
     Based on 1998 Map by Matthew Thomas   |____/ Hacked on 2015 by 8^/

This repository contains a multiprocessing implementation of the iGreedy anycast geolocation algorithm, originally developed by Cicalsese et al. that was published in the paper Latency-Based Anycast Geolocation: Algorithms, Software, and Data Sets.

The goal of this implementation is to reduce processing time for LACeS (an Open, Fast, Responsible and Efficient Longitudinal Anycast Census System). This code is used to produce daily anycast censuses, publicly available. It is designed to run using a single input file (containing latencies from multiple vantage points to targets), outputting a single file with geolocation results.


Running with musl binary

We provide pre-compiled static binaries for Linux (x86_64) using musl.

1. Download binary

curl -LO https://github.com/rhendriks/MiGreedy/releases/latest/download/migreedy-latest-x86_64-unknown-linux-musl.tar.gz

2. Decompress the File

tar -xzvf migreedy-latest-x86_64-unknown-linux-musl.tar.gz

This will extract a single executable file named migreedy.

3. Make it Executable

chmod +x migreedy

4. Run the Program

./migreedy --input path/to/measurements.csv --output path/to/results.csv

Running with Docker

The code can be ran using Docker.

Step 1: Pull the Docker Image

Pull the latest pre-built image from the GitHub Container Registry:

docker pull ghcr.io/rhendriks/migreedy:main

Step 2: Prepare Your Data Directory

You need a local directory containing your input CSV file (e.g., measurements.csv). This directory will be mounted into the Docker container.

# Example: Create a directory and move your data into it
mkdir igreedy_data
mv measurements.csv igreedy_data/

Step 3: Run the Container

Execute the docker run command, which mounts your data directory and passes the necessary arguments to the MiGreedy script.

Linux/MacOS

docker run --rm \
  -v "$(pwd)"/igreedy_data:/app/data \
  ghcr.io/rhendriks/migreedy:main \
  --input /app/data/measurements.csv \
  --output /app/data/results.csv

Windows (PowerShell)

docker run --rm `
  -v "${PWD}\igreedy_data:/app/data" `
  ghcr.io/rhendriks/migreedy:main `
  --input /app/data/measurements.csv `
  --output /app/data/results.csv

After the command finishes, the output file results.csv will appear in your local igreedy_data directory.


Installation (Rust)

We include a Rust implementation (significantly faster than the Python version).

  1. Clone this repository:
    git clone [https://github.com/rhendriks/MiGreedy]
    cd [MiGreedy]
    
  2. Install Rust
     curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
     source $HOME/.cargo/env
     rustup update
    
  3. Build the project using Cargo:
     cd rust_impl
     cargo build --release
    
  4. Run the compiled binary with the required arguments:
     ./target/release/migreedy --input path/to/measurements.csv --output path/to/results.csv --airports ../datasets/airports.csv
    

Installation (python)

  1. Clone this repository:
    git clone [https://github.com/rhendriks/MiGreedy]
    cd [MiGreedy]
    
  2. Install the required Python packages using pip:
    pip install -r requirements.txt
    
  3. Run the script with the required arguments (see below).

     python igreedy.py -i path/to/measurements.csv -o path/to/results.csv -a 1.0 -t 100
    

Command-Line Arguments

Argument Default Description
-i, --input (Required) Path to the input CSV file containing RTT measurements.
-o, --output (Required) Path for the output CSV file where results will be saved.
-a, --alpha 1.0 A float (0.0 to 1.0) to tune the geolocation scoring. A higher alpha prioritizes population density over distance from the disc center.
-t, --threshold None Discards measurements with an RTT greater than this value (in ms) to bound the maximum radius and potential error.

Data Format

Input File Format

The input CSV file must not have a header and should contain the following columns in this specific order:

Column Data Type Description
target string The anycast IP address being measured.
hostname string The hostname or ID of the prober (VP).
lat float The latitude of the prober.
lon float The longitude of the prober.
rtt float The round-trip time (in ms) to the target.

Output File Format

The output CSV file will have a header and contain the following columns:

Column Description
target The anycast IP address.
vp The hostname of the vantage point that defined the disc.
vp_lat The latitude of the vantage point.
vp_lon The longitude of the vantage point.
radius The radius of the disc in kilometers.
pop_iata The IATA code of the geolocated airport. “NoCity” if none found.
pop_lat The latitude of the geolocated airport.
pop_lon The longitude of the geolocated airport.
pop_city The city of the geolocated airport.
pop_cc The country code of the geolocated airport.

Author


Contributing

Issues and pull requests are welcome!

Citation

This code was designed for our paper LACeS. Please use the following citation when using this code.

@misc{hendriks2025lacesopenfastresponsible,
      title={LACeS: an Open, Fast, Responsible and Efficient Longitudinal Anycast Census System}, 
      author={Remi Hendriks and Matthew Luckie and Mattijs Jonker and Raffaele Sommese and Roland van Rijswijk-Deij},
      year={2025},
      eprint={2503.20554},
      archivePrefix={arXiv},
      primaryClass={cs.NI},
      url={https://arxiv.org/abs/2503.20554}, 
}