manycast/
main.rs

1//! # MAnycastR
2//!
3//! MAnycastR (Measuring Anycast Reloaded) is a tool designed to measure anycast infrastructure.
4//!
5//! This includes:
6//!
7//! i) Measuring anycast infrastructure itself
8//! * [Verfploeter](https://ant.isi.edu/~johnh/PAPERS/Vries17b.pdf) (mapping anycast catchments)
9//! * [Site flipping](https://arxiv.org/pdf/2503.14351) (detecting network regions experiencing anycast site flipping)
10//! * Anycast latency (measuring RTT between ping-responsive targets and the anycast infrastructure)
11//! * Optimal deployment (measuring 'best' deployment using unicast latencies from all sites)
12//! * Multi-deployment probing (measure multiple anycast prefixes simultaneously)
13//!
14//! ii) Measuring external anycast infrastructure
15//! * [MAnycast2](https://www.sysnet.ucsd.edu/sysnet/miscpapers/manycast2-imc20.pdf) (measuring anycast using anycast)
16//! * [iGreedy](https://anycast.telecom-paristech.fr/assets/papers/JSAC-16.pdf) (measuring anycast using Great-Circle-Distance latency measurements)
17//!
18//! Both IPv4 and IPv6 measurements are supported, with underlying protocols ICMP, UDP (DNS), and TCP.
19//!
20//! # The components
21//!
22//! Deployment of MAnycastR consists of three components:
23//!
24//! * [Orchestrator](orchestrator) - a central controller orchestrating measurements
25//! * [CLI](cli) - Command-line interface scheduling measurements at the orchestrator and collecting results
26//! * [Worker](worker) - worker deployed on anycast sites, performing measurements
27//!
28//! # Measurement process
29//!
30//! A measurement is started by running the CLI, which can be executed e.g., locally or on a VM.
31//! The CLI sends a measurement definition based on the arguments provided when running the `start` command.
32//! Example commands will be provided in the Usage section.
33//!
34//! Upon receiving a measurement definition, the orchestrator instructs the workers to start the measurement.
35//! Workers perform measurements by sending and receiving probes.
36//!
37//! Workers stream results to the orchestrator, which aggregates and forwards them to the CLI.
38//! The CLI writes results to a CSV file.
39//!
40//! # Measurement types
41//!
42//! Measurements can be;
43//! * `icmp` ICMP ECHO requests
44//! * `dns` UDP DNS A Record requests
45//! * `tcp` TCP SYN/ACK probes
46//! * `chaos` UDP DNS TXT CHAOS requests
47//!
48//! # Measurement parameters
49//!
50//! When creating a measurement you can specify:
51//!
52//! ## Variables
53//! * **Hitlist** - addresses to be probed (can be IP addresses or numbers) (.gz compressed files are supported)
54//! * **Type of measurement** - ICMP, DNS, TCP, or CHAOS
55//! * **Rate** - the rate (packets / second) at which each worker will send out probes (default: 1000)
56//! * **Selective** - specify which workers have to send out probes (all connected workers will listen for packets)
57//! * **Interval** - interval between separate worker's probes to the same target (default: 1s)
58//! * **Address** - source anycast address to use for the probes
59//! * **Source port** - source port to use for the probes (default: 62321)
60//! * **Destination port** - destination port to use for the probes (default: DNS: 53, TCP: 63853)
61//! * **Configuration** - path to a configuration file (allowing for complex configurations of source address, port values used by workers)
62//! * **Query** - specify DNS record to request (TXT (CHAOS) default: hostname.bind, A default: google.com)
63//! * **Responsive** - check if a target is responsive before probing from all workers (unimplemented)
64//! * **Out** - path to file or directory to store measurement results (default: ./)
65//! * **URL** - encode URL in probes (e.g., for providing opt-out information, explaining the measurement, etc.)
66//!
67//! ## Flags
68//! * **Stream** - stream results to the command-line interface (optional)
69//! * **Shuffle** - shuffle the hitlist
70//! * **Unicast** - perform measurement using the unicast address of each worker
71//! * **Divide** - divide-and-conquer Verfploeter catchment mapping
72//!
73//! # Usage
74//!
75//! First, run the central orchestrator.
76//! ```
77//! orchestrator -p [PORT NUMBER]
78//! ```
79//!
80//! Next, run one or more workers.
81//! ```
82//! worker -a [ORC ADDRESS]
83//! ```
84//! Orchestrator address has format IPv4:port (e.g., 187.0.0.0:50001)
85//!
86//! To confirm that the workers are connected, you can run the worker-list command on the CLI.
87//! ```
88//! cli -a [ORC ADDRESS] worker-list
89//! ```
90//!
91//! Finally, you can perform a measurement.
92//! ```
93//! cli -a [ORC ADDRESS] start [parameters]
94//! ```
95//!
96//! ## Examples
97//!
98//! ### Verfploeter catchment mapping using ICMPv4
99//!
100//! ```
101//! cli -a [::1]:50001 start hitlist.txt -t icmp -a 10.0.0.0 -o results.csv
102//! ```
103//!
104//! All workers probe the targets in hitlist.txt using ICMPv4, using source address 10.0.0.0, results are stored in results.csv
105//!
106//! With this measurement each target receives a probe from each worker.
107//! Filtering on sender == receiver allows for calculating anycast RTTs.
108//!
109//! ### Divide-and-conquer Verfploeter catchment mapping using TCPv4
110//!
111//! ```
112//! cli -a [::1]:50001 start hitlist.txt -t tcp -a 10.0.0.0 --divide
113//! ```
114//!
115//! hitlist.txt will be split in equal parts among workers (divide-and-conquer), results are stored in ./
116//!
117//! Enabling divide-and-conquer means each target receives a single probe, whereas before each worker would probe each target.
118//! Benefits are; lower probing burden on targets, less data to process, faster measurements (hitlist split among workers).
119//! Whilst this provides a quick catchment mapping, the downside is that you will not be able to calculate anycast RTTs.
120//!
121//! ### Unicast latency measurement using ICMPv6
122//!
123//! ```
124//! cli -a [::1]:50001 start hitlistv6.txt -t icmp --unicast
125//! ```
126//!
127//! Since the hitlist contains IPv6 addresses, the workers will probe the targets using their IPv6 unicast address.
128//!
129//! This feature gives the latency between all anycast sites and each target in the hitlist.
130//! Filtering on the lowest unicast RTTs indicates the best anycast site for each target.
131//!
132//! # Requirements
133//!
134//! * rustup
135//! * protobuf-compiler
136//! * musl-tools
137//! * gcc
138//!
139//! # Installation
140//!
141//! ## Cargo (static binary)
142//!
143//! ### Install rustup
144//! ```bash
145//! curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
146//! source $HOME/.cargo/env
147//! ```
148//!
149//! ### Install dependencies
150//! ```bash
151//! apt-get install -y protobuf-compiler gcc musl-tools
152//! ```
153//!
154//! ### Install musl target
155//! ```bash
156//! rustup target add x86_64-unknown-linux-musl
157//! ```
158//!
159//! ### Clone the repository
160//! ```bash
161//! git clone <repo>
162//! cd <repo_dir>
163//! ```
164//!
165//! ### Compile the code (16 MB binary)
166//! ```bash
167//! cargo build --release --target x86_64-unknown-linux-musl
168//! ```
169//!
170//! ### Optionally strip the binary (16 MB -> 7.7 MB)
171//! ```bash
172//! strip target/x86_64-unknown-linux-musl/release/manycast
173//! ```
174//!
175//! Next, distribute the binary to the workers.
176//!
177//! Workers need either sudo or the CAP_NET_RAW capability to send out packets.
178//! ```bash
179//! sudo setcap cap_net_raw,cap_net_admin=eip manycast
180//! ```
181//!
182//! ## Docker
183//!
184//! ### Build the Docker image
185//! ```bash
186//! docker build -t manycast .
187//! ```
188//!
189//! Advise is to run the container with network host mode.
190//! Additionally, the container needs the CAP_NET_RAW and CAP_NET_ADMIN capability to send out packets.
191//! ```bash
192//! docker run -it --network host --cap-add=NET_RAW --cap-add=NET_ADMIN manycast
193//! ```
194//!
195//! # Future
196//!
197//! * Responsiveness pre-check
198//! * Anycast traceroute
199//! * Allow feed of targets (instead of a pre-defined hitlist)
200//! * Support multiple packets per <worker, target> pair
201//! * Synchronous unicast and anycast measurements
202//! * Anycast latency using divide-and-conquer (probe 1; assess catching anycast site - probe 2; probe from catching site to obtain latency)
203
204use clap::{value_parser, Arg, ArgAction, ArgMatches, Command};
205
206mod cli;
207mod custom_module;
208mod net;
209mod orchestrator;
210mod worker;
211
212// Measurement type IDs
213pub const ICMP_ID: u8 = 1; // ICMP ECHO
214pub const A_ID: u8 = 2; // UDP DNS A Record
215pub const TCP_ID: u8 = 3; // TCP SYN/ACK
216pub const CHAOS_ID: u8 = 4; // UDP DNS TXT CHAOS
217pub const ALL_ID: u8 = 255; // All measurement types
218
219/// Parse command line input and start MAnycastR orchestrator (orchestrator), worker, or CLI
220///
221/// Sets up logging, parses the command-line arguments, runs the appropriate initialization function.
222fn main() {
223    // Parse the command-line arguments
224    let matches = parse_cmd();
225
226    if let Some(worker_matches) = matches.subcommand_matches("worker") {
227        println!("[Main] Executing worker version {}", env!("GIT_HASH"));
228
229        let rt = tokio::runtime::Builder::new_current_thread()
230            .enable_all()
231            .build()
232            .unwrap();
233
234        let _ = rt.block_on(async { worker::Worker::new(worker_matches).await.expect("Unable to create a worker (make sure the Server address is correct, and that the Server is running)") });
235    }
236    // If the cli subcommand was selected, execute the cli module (i.e. the cli::execute function)
237    else if let Some(cli_matches) = matches.subcommand_matches("cli") {
238        println!("[Main] Executing CLI version {}", env!("GIT_HASH"));
239
240        let _ = cli::execute(cli_matches);
241    } else if let Some(server_matches) = matches.subcommand_matches("orchestrator") {
242        println!("[Main] Executing orchestrator version {}", env!("GIT_HASH"));
243
244        let rt = tokio::runtime::Builder::new_current_thread()
245            .enable_all()
246            .build()
247            .unwrap();
248
249        rt.block_on(async { orchestrator::start(server_matches).await.unwrap() });
250    }
251}
252
253fn parse_cmd() -> ArgMatches {
254    Command::new("MAnycastR")
255        .version(env!("GIT_HASH"))
256        .author("Remi Hendriks <remi.hendriks@utwente.nl>")
257        .about("Performs synchronized Internet measurement from a distributed set of anycast sites")
258        .subcommand(
259            Command::new("orchestrator").about("Launches the MAnycastR orchestrator")
260                .arg(
261                    Arg::new("port")
262                        .long("port")
263                        .short('p')
264                        .value_parser(value_parser!(u16))
265                        .required(false)
266                        .default_value("50001")
267                        .help("Port to listen on")
268                )
269                .arg(
270                    Arg::new("tls")
271                        .long("tls")
272                        .action(ArgAction::SetTrue)
273                        .required(false)
274                        .help("Use TLS for communication with the orchestrator (requires orchestrator.crt and orchestrator.key in ./tls/)")
275                )
276                .arg(
277                    Arg::new("config")
278                        .long("config")
279                        .short('c')
280                        .value_parser(value_parser!(String))
281                        .required(false)
282                        .help("Worker list configuration (mapping hostname to ID)")
283                )
284        )
285        .subcommand(
286            Command::new("worker").about("Launches the MAnycastR worker")
287                .arg(
288                    Arg::new("orchestrator")
289                        .short('a')
290                        .value_parser(value_parser!(String))
291                        .required(true)
292                        .help("address:port of the orchestrator (e.g., 10.0.0.0:50001 or [::1]:50001)")
293                )
294                .arg(
295                    Arg::new("hostname")
296                        .long("hostname")
297                        .short('n')
298                        .value_parser(value_parser!(String))
299                        .required(false)
300                        .help("hostname for this worker (default: $HOSTNAME)")
301                )
302                .arg(
303                    Arg::new("tls")
304                        .long("tls")
305                        .value_parser(value_parser!(String))
306                        .required(false)
307                        .help("Use TLS for communication with the orchestrator (requires orchestrator.crt in ./tls/), takes a FQDN as argument")
308                )
309        )
310        .subcommand(
311            Command::new("cli").about("MAnycastR CLI")
312                .arg(
313                    Arg::new("orchestrator")
314                        .short('a')
315                        .value_parser(value_parser!(String))
316                        .required(true)
317                        .help("address:port of the orchestrator (e.g., 10.0.0.0:50001 or [::1]:50001)")
318                )
319                .arg(
320                    Arg::new("tls")
321                        .long("tls")
322                        .value_parser(value_parser!(String))
323                        .required(false)
324                        .help("Use TLS for communication with the orchestrator (requires orchestrator.crt in ./tls/), takes a FQDN as argument")
325                )
326                .subcommand(Command::new("worker-list").about("retrieves a list of currently connected workers from the orchestrator"))
327                .subcommand(Command::new("start").about("performs MAnycastR on the indicated worker")
328                    .arg(Arg::new("IP_FILE").help("A file that contains IP addresses to probe")
329                        .required(true)
330                        .index(1)
331                    )
332                    .arg(Arg::new("type")
333                        .long("type")
334                        .short('t')
335                        .value_parser(value_parser!(String))
336                        .required(false)
337                        .default_value("icmp")
338                        .help("The type of measurement (icmp, dns, tcp, chaos, all)")
339                    )
340                    .arg(Arg::new("rate")
341                        .long("rate")
342                        .short('r')
343                        .value_parser(value_parser!(u32))
344                        .required(false)
345                        .default_value("1000")
346                        .help("Probing rate at each worker (number of outgoing packets / second)")
347                    )
348                    .arg(Arg::new("selective")
349                        .long("selective")
350                        .short('x')
351                        .value_parser(value_parser!(String))
352                        .required(false)
353                        .help("Specify which workers have to send out probes (all connected workers will listen for packets) [worker_id1,worker_id2,...]")
354                    )
355                    .arg(Arg::new("stream")
356                        .long("stream")
357                        .action(ArgAction::SetTrue)
358                        .required(false)
359                        .help("Stream results to stdout")
360                    )
361                    .arg(Arg::new("shuffle")
362                        .long("shuffle")
363                        .action(ArgAction::SetTrue)
364                        .required(false)
365                        .help("Randomly shuffle the ip file")
366                    )
367                    .arg(Arg::new("unicast")
368                        .long("unicast")
369                        .action(ArgAction::SetTrue)
370                        .help("Probe the targets using the unicast address of each worker (GCD measurement)")
371                    )
372                    .arg(Arg::new("worker_interval")
373                        .long("worker-interval")
374                        .short('w')
375                        .value_parser(value_parser!(u32))
376                        .required(false)
377                        .default_value("1")
378                        .help("Interval between separate worker's probes to the same target")
379                    )
380                    .arg(Arg::new("probe_interval")
381                        .long("probe-interval")
382                        .short('i')
383                        .value_parser(value_parser!(u32))
384                        .required(false)
385                        .default_value("1")
386                        .help("Interval between separate probes to the same target")
387                    )
388                    .arg(Arg::new("number_of_probes")
389                        .long("nprobes")
390                        .short('c')
391                        .value_parser(value_parser!(u32))
392                        .required(false)
393                        .default_value("1")
394                        .help("Number of probes to send to each origin,target pair [NOTE: violates probing rate]")
395                    )
396                    .arg(Arg::new("divide")
397                        .long("divide")
398                        .action(ArgAction::SetTrue)
399                        .required(false)
400                        .help("Divide the hitlist into equal separate parts for each worker (divide-and-conquer)")
401                    )
402                    .arg(Arg::new("address")
403                        .long("addr")
404                        .short('a')
405                        .value_parser(value_parser!(String))
406                        .required(false)
407                        .help("Source address to use for the probes")
408                    )
409                    .arg(Arg::new("source port")
410                        .long("sport")
411                        .short('s')
412                        .value_parser(value_parser!(u16))
413                        .required(false)
414                        .default_value("62321")
415                        .help("Source port to use")
416                    )
417                    .arg(Arg::new("destination port")
418                        .long("dport")
419                        .short('d')
420                        .value_parser(value_parser!(u16))
421                        .required(false)
422                        .help("Destination port to use (default DNS: 53, TCP: 63853)")
423                    )
424                    .arg(Arg::new("configuration")
425                        .long("conf")
426                        .short('f')
427                        .value_parser(value_parser!(String))
428                        .required(false)
429                        .help("Path to the configuration file")
430                    )
431                    .arg(Arg::new("query")
432                        .long("query")
433                        .short('q')
434                        .value_parser(value_parser!(String))
435                        .required(false)
436                        .help("Specify DNS record to request (TXT (CHAOS) default: hostname.bind, A default: google.com)")
437                    )
438                    .arg(Arg::new("responsive")
439                        .long("responsive")
440                        .action(ArgAction::SetTrue)
441                        .required(false)
442                        .help("First check if the target is responsive from a single worker before sending probes from multiple workers/origins")
443                    )
444                    .arg(Arg::new("latency")
445                        .long("latency")
446                        .short('l')
447                        .action(ArgAction::SetTrue)
448                        .required(false)
449                        .help("Measure anycast latencies (first, measure catching PoP; second, measure latency from catching PoP to target) [NOTE: currently violates probing rate at workers with high catchments]")
450                    )
451                    .arg(Arg::new("out")
452                        .long("out")
453                        .short('o')
454                        .value_parser(value_parser!(String))
455                        .required(false)
456                        .help("Optional path and/or filename to store the results of the measurement (default ./)")
457                    )
458                    .arg(Arg::new("url")
459                        .long("url")
460                        .short('u')
461                        .value_parser(value_parser!(String))
462                        .required(false)
463                        .default_value("")
464                        .help("Encode URL in probes (e.g., for providing opt-out information, explaining the measurement, etc.)")
465                    )
466                    .arg(Arg::new("parquet")
467                        .long("parquet")
468                        .action(ArgAction::SetTrue)
469                        .required(false)
470                        .help("Write results as parquet instead of .csv.gz")
471                    )
472                )
473        )
474        .get_matches()
475}