manycast/main.rs
1//! # MAnycastR
2//!
3//! MAnycastR (Measuring Anycast Reloaded) is a tool designed to measure anycast infrastructure.
4//!
5//! This includes:
6//!
7//! i) Measuring anycast infrastructure itself
8//! * [Verfploeter](https://ant.isi.edu/~johnh/PAPERS/Vries17b.pdf) (mapping anycast catchments)
9//! * [Site flipping](https://arxiv.org/pdf/2503.14351) (detecting network regions experiencing anycast site flipping)
10//! * Anycast latency (measuring RTT between ping-responsive targets and the anycast infrastructure)
11//! * Optimal deployment (measuring 'best' deployment using unicast latencies from all sites)
12//! * Multi-deployment probing (measure multiple anycast prefixes simultaneously)
13//!
14//! ii) Measuring external anycast infrastructure
15//! * [MAnycast2](https://www.sysnet.ucsd.edu/sysnet/miscpapers/manycast2-imc20.pdf) (measuring anycast using anycast)
16//! * [iGreedy](https://anycast.telecom-paristech.fr/assets/papers/JSAC-16.pdf) (measuring anycast using Great-Circle-Distance latency measurements)
17//!
18//! Both IPv4 and IPv6 measurements are supported, with underlying protocols ICMP, UDP (DNS), and TCP.
19//!
20//! # The components
21//!
22//! Deployment of MAnycastR consists of three components:
23//!
24//! * [Orchestrator](orchestrator) - a central controller orchestrating measurements
25//! * [CLI](cli) - Command-line interface scheduling measurements at the orchestrator and collecting results
26//! * [Worker](worker) - worker deployed on anycast sites, performing measurements
27//!
28//! # Measurement process
29//!
30//! A measurement is started by running the CLI, which can be executed e.g., locally or on a VM.
31//! The CLI sends a measurement definition based on the arguments provided when running the `start` command.
32//! Example commands will be provided in the Usage section.
33//!
34//! Upon receiving a measurement definition, the orchestrator instructs the workers to start the measurement.
35//! Workers perform measurements by sending and receiving probes.
36//!
37//! Workers stream results to the orchestrator, which aggregates and forwards them to the CLI.
38//! The CLI writes results to a CSV file.
39//!
40//! # Measurement types
41//!
42//! Measurements can be;
43//! * `icmp` ICMP ECHO requests
44//! * `dns` UDP DNS A Record requests
45//! * `tcp` TCP SYN/ACK probes
46//! * `chaos` UDP DNS TXT CHAOS requests
47//!
48//! # Measurement parameters
49//!
50//! When creating a measurement you can specify:
51//!
52//! ## Variables
53//! * **Hitlist** - addresses to be probed (can be IP addresses or numbers) (.gz compressed files are supported)
54//! * **Type of measurement** - ICMP, DNS, TCP, or CHAOS
55//! * **Rate** - the rate (packets / second) at which each worker will send out probes (default: 1000)
56//! * **Selective** - specify which workers have to send out probes (all connected workers will listen for packets)
57//! * **Interval** - interval between separate worker's probes to the same target (default: 1s)
58//! * **Address** - source anycast address to use for the probes
59//! * **Source port** - source port to use for the probes (default: 62321)
60//! * **Destination port** - destination port to use for the probes (default: DNS: 53, TCP: 63853)
61//! * **Configuration** - path to a configuration file (allowing for complex configurations of source address, port values used by workers)
62//! * **Query** - specify DNS record to request (TXT (CHAOS) default: hostname.bind, A default: google.com)
63//! * **Responsive** - check if a target is responsive before probing from all workers (unimplemented)
64//! * **Out** - path to file or directory to store measurement results (default: ./)
65//! * **URL** - encode URL in probes (e.g., for providing opt-out information, explaining the measurement, etc.)
66//!
67//! ## Flags
68//! * **Stream** - stream results to the command-line interface (optional)
69//! * **Shuffle** - shuffle the hitlist
70//! * **Unicast** - perform measurement using the unicast address of each worker
71//! * **Divide** - divide-and-conquer Verfploeter catchment mapping
72//!
73//! # Usage
74//!
75//! First, run the central orchestrator.
76//! ```
77//! orchestrator -p [PORT NUMBER]
78//! ```
79//!
80//! Next, run one or more workers.
81//! ```
82//! worker -a [ORC ADDRESS]
83//! ```
84//! Orchestrator address has format IPv4:port (e.g., 187.0.0.0:50001)
85//!
86//! To confirm that the workers are connected, you can run the worker-list command on the CLI.
87//! ```
88//! cli -a [ORC ADDRESS] worker-list
89//! ```
90//!
91//! Finally, you can perform a measurement.
92//! ```
93//! cli -a [ORC ADDRESS] start [parameters]
94//! ```
95//!
96//! ## Examples
97//!
98//! ### Verfploeter catchment mapping using ICMPv4
99//!
100//! ```
101//! cli -a [::1]:50001 start hitlist.txt -t icmp -a 10.0.0.0 -o results.csv
102//! ```
103//!
104//! All workers probe the targets in hitlist.txt using ICMPv4, using source address 10.0.0.0, results are stored in results.csv
105//!
106//! With this measurement each target receives a probe from each worker.
107//! Filtering on sender == receiver allows for calculating anycast RTTs.
108//!
109//! ### Divide-and-conquer Verfploeter catchment mapping using TCPv4
110//!
111//! ```
112//! cli -a [::1]:50001 start hitlist.txt -t tcp -a 10.0.0.0 --divide
113//! ```
114//!
115//! hitlist.txt will be split in equal parts among workers (divide-and-conquer), results are stored in ./
116//!
117//! Enabling divide-and-conquer means each target receives a single probe, whereas before each worker would probe each target.
118//! Benefits are; lower probing burden on targets, less data to process, faster measurements (hitlist split among workers).
119//! Whilst this provides a quick catchment mapping, the downside is that you will not be able to calculate anycast RTTs.
120//!
121//! ### Unicast latency measurement using ICMPv6
122//!
123//! ```
124//! cli -a [::1]:50001 start hitlistv6.txt -t icmp --unicast
125//! ```
126//!
127//! Since the hitlist contains IPv6 addresses, the workers will probe the targets using their IPv6 unicast address.
128//!
129//! This feature gives the latency between all anycast sites and each target in the hitlist.
130//! Filtering on the lowest unicast RTTs indicates the best anycast site for each target.
131//!
132//! # Requirements
133//!
134//! * rustup
135//! * protobuf-compiler
136//! * musl-tools
137//! * gcc
138//!
139//! # Installation
140//!
141//! ## Cargo (static binary)
142//!
143//! ### Install rustup
144//! ```bash
145//! curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
146//! source $HOME/.cargo/env
147//! ```
148//!
149//! ### Install dependencies
150//! ```bash
151//! apt-get install -y protobuf-compiler gcc musl-tools
152//! ```
153//!
154//! ### Install musl target
155//! ```bash
156//! rustup target add x86_64-unknown-linux-musl
157//! ```
158//!
159//! ### Clone the repository
160//! ```bash
161//! git clone <repo>
162//! cd <repo_dir>
163//! ```
164//!
165//! ### Compile the code (16 MB binary)
166//! ```bash
167//! cargo build --release --target x86_64-unknown-linux-musl
168//! ```
169//!
170//! ### Optionally strip the binary (16 MB -> 7.7 MB)
171//! ```bash
172//! strip target/x86_64-unknown-linux-musl/release/manycast
173//! ```
174//!
175//! Next, distribute the binary to the workers.
176//!
177//! Workers need either sudo or the CAP_NET_RAW capability to send out packets.
178//! ```bash
179//! sudo setcap cap_net_raw,cap_net_admin=eip manycast
180//! ```
181//!
182//! ## Docker
183//!
184//! ### Build the Docker image
185//! ```bash
186//! docker build -t manycast .
187//! ```
188//!
189//! Advise is to run the container with network host mode.
190//! Additionally, the container needs the CAP_NET_RAW and CAP_NET_ADMIN capability to send out packets.
191//! ```bash
192//! docker run -it --network host --cap-add=NET_RAW --cap-add=NET_ADMIN manycast
193//! ```
194//!
195//! # Future
196//!
197//! * Responsiveness pre-check
198//! * Anycast traceroute
199//! * Allow feed of targets (instead of a pre-defined hitlist)
200//! * Support multiple packets per <worker, target> pair
201//! * Synchronous unicast and anycast measurements
202//! * Anycast latency using divide-and-conquer (probe 1; assess catching anycast site - probe 2; probe from catching site to obtain latency)
203
204use clap::{value_parser, Arg, ArgAction, ArgMatches, Command};
205
206mod cli;
207mod custom_module;
208mod net;
209mod orchestrator;
210mod worker;
211
212// Measurement type IDs
213pub const ICMP_ID: u8 = 1; // ICMP ECHO
214pub const A_ID: u8 = 2; // UDP DNS A Record
215pub const TCP_ID: u8 = 3; // TCP SYN/ACK
216pub const CHAOS_ID: u8 = 4; // UDP DNS TXT CHAOS
217pub const ALL_ID: u8 = 255; // All measurement types
218
219/// Parse command line input and start MAnycastR orchestrator (orchestrator), worker, or CLI
220///
221/// Sets up logging, parses the command-line arguments, runs the appropriate initialization function.
222fn main() {
223 // Parse the command-line arguments
224 let matches = parse_cmd();
225
226 if let Some(worker_matches) = matches.subcommand_matches("worker") {
227 println!("[Main] Executing worker version {}", env!("GIT_HASH"));
228
229 let rt = tokio::runtime::Builder::new_current_thread()
230 .enable_all()
231 .build()
232 .unwrap();
233
234 let _ = rt.block_on(async { worker::Worker::new(worker_matches).await.expect("Unable to create a worker (make sure the Server address is correct, and that the Server is running)") });
235 }
236 // If the cli subcommand was selected, execute the cli module (i.e. the cli::execute function)
237 else if let Some(cli_matches) = matches.subcommand_matches("cli") {
238 println!("[Main] Executing CLI version {}", env!("GIT_HASH"));
239
240 let _ = cli::execute(cli_matches);
241 } else if let Some(server_matches) = matches.subcommand_matches("orchestrator") {
242 println!("[Main] Executing orchestrator version {}", env!("GIT_HASH"));
243
244 let rt = tokio::runtime::Builder::new_current_thread()
245 .enable_all()
246 .build()
247 .unwrap();
248
249 rt.block_on(async { orchestrator::start(server_matches).await.unwrap() });
250 }
251}
252
253fn parse_cmd() -> ArgMatches {
254 Command::new("MAnycastR")
255 .version(env!("GIT_HASH"))
256 .author("Remi Hendriks <remi.hendriks@utwente.nl>")
257 .about("Performs synchronized Internet measurement from a distributed set of anycast sites")
258 .subcommand(
259 Command::new("orchestrator").about("Launches the MAnycastR orchestrator")
260 .arg(
261 Arg::new("port")
262 .long("port")
263 .short('p')
264 .value_parser(value_parser!(u16))
265 .required(false)
266 .default_value("50001")
267 .help("Port to listen on")
268 )
269 .arg(
270 Arg::new("tls")
271 .long("tls")
272 .action(ArgAction::SetTrue)
273 .required(false)
274 .help("Use TLS for communication with the orchestrator (requires orchestrator.crt and orchestrator.key in ./tls/)")
275 )
276 .arg(
277 Arg::new("config")
278 .long("config")
279 .short('c')
280 .value_parser(value_parser!(String))
281 .required(false)
282 .help("Worker list configuration (mapping hostname to ID)")
283 )
284 )
285 .subcommand(
286 Command::new("worker").about("Launches the MAnycastR worker")
287 .arg(
288 Arg::new("orchestrator")
289 .short('a')
290 .value_parser(value_parser!(String))
291 .required(true)
292 .help("address:port of the orchestrator (e.g., 10.0.0.0:50001 or [::1]:50001)")
293 )
294 .arg(
295 Arg::new("hostname")
296 .long("hostname")
297 .short('n')
298 .value_parser(value_parser!(String))
299 .required(false)
300 .help("hostname for this worker (default: $HOSTNAME)")
301 )
302 .arg(
303 Arg::new("tls")
304 .long("tls")
305 .value_parser(value_parser!(String))
306 .required(false)
307 .help("Use TLS for communication with the orchestrator (requires orchestrator.crt in ./tls/), takes a FQDN as argument")
308 )
309 )
310 .subcommand(
311 Command::new("cli").about("MAnycastR CLI")
312 .arg(
313 Arg::new("orchestrator")
314 .short('a')
315 .value_parser(value_parser!(String))
316 .required(true)
317 .help("address:port of the orchestrator (e.g., 10.0.0.0:50001 or [::1]:50001)")
318 )
319 .arg(
320 Arg::new("tls")
321 .long("tls")
322 .value_parser(value_parser!(String))
323 .required(false)
324 .help("Use TLS for communication with the orchestrator (requires orchestrator.crt in ./tls/), takes a FQDN as argument")
325 )
326 .subcommand(Command::new("worker-list").about("retrieves a list of currently connected workers from the orchestrator"))
327 .subcommand(Command::new("start").about("performs MAnycastR on the indicated worker")
328 .arg(Arg::new("IP_FILE").help("A file that contains IP addresses to probe")
329 .required(true)
330 .index(1)
331 )
332 .arg(Arg::new("type")
333 .long("type")
334 .short('t')
335 .value_parser(value_parser!(String))
336 .required(false)
337 .default_value("icmp")
338 .help("The type of measurement (icmp, dns, tcp, chaos, all)")
339 )
340 .arg(Arg::new("rate")
341 .long("rate")
342 .short('r')
343 .value_parser(value_parser!(u32))
344 .required(false)
345 .default_value("1000")
346 .help("Probing rate at each worker (number of outgoing packets / second)")
347 )
348 .arg(Arg::new("selective")
349 .long("selective")
350 .short('x')
351 .value_parser(value_parser!(String))
352 .required(false)
353 .help("Specify which workers have to send out probes (all connected workers will listen for packets) [worker_id1,worker_id2,...]")
354 )
355 .arg(Arg::new("stream")
356 .long("stream")
357 .action(ArgAction::SetTrue)
358 .required(false)
359 .help("Stream results to stdout")
360 )
361 .arg(Arg::new("shuffle")
362 .long("shuffle")
363 .action(ArgAction::SetTrue)
364 .required(false)
365 .help("Randomly shuffle the ip file")
366 )
367 .arg(Arg::new("unicast")
368 .long("unicast")
369 .action(ArgAction::SetTrue)
370 .help("Probe the targets using the unicast address of each worker (GCD measurement)")
371 )
372 .arg(Arg::new("worker_interval")
373 .long("worker-interval")
374 .short('w')
375 .value_parser(value_parser!(u32))
376 .required(false)
377 .default_value("1")
378 .help("Interval between separate worker's probes to the same target")
379 )
380 .arg(Arg::new("probe_interval")
381 .long("probe-interval")
382 .short('i')
383 .value_parser(value_parser!(u32))
384 .required(false)
385 .default_value("1")
386 .help("Interval between separate probes to the same target")
387 )
388 .arg(Arg::new("number_of_probes")
389 .long("nprobes")
390 .short('c')
391 .value_parser(value_parser!(u32))
392 .required(false)
393 .default_value("1")
394 .help("Number of probes to send to each origin,target pair [NOTE: violates probing rate]")
395 )
396 .arg(Arg::new("divide")
397 .long("divide")
398 .action(ArgAction::SetTrue)
399 .required(false)
400 .help("Divide the hitlist into equal separate parts for each worker (divide-and-conquer)")
401 )
402 .arg(Arg::new("address")
403 .long("addr")
404 .short('a')
405 .value_parser(value_parser!(String))
406 .required(false)
407 .help("Source address to use for the probes")
408 )
409 .arg(Arg::new("source port")
410 .long("sport")
411 .short('s')
412 .value_parser(value_parser!(u16))
413 .required(false)
414 .default_value("62321")
415 .help("Source port to use")
416 )
417 .arg(Arg::new("destination port")
418 .long("dport")
419 .short('d')
420 .value_parser(value_parser!(u16))
421 .required(false)
422 .help("Destination port to use (default DNS: 53, TCP: 63853)")
423 )
424 .arg(Arg::new("configuration")
425 .long("conf")
426 .short('f')
427 .value_parser(value_parser!(String))
428 .required(false)
429 .help("Path to the configuration file")
430 )
431 .arg(Arg::new("query")
432 .long("query")
433 .short('q')
434 .value_parser(value_parser!(String))
435 .required(false)
436 .help("Specify DNS record to request (TXT (CHAOS) default: hostname.bind, A default: google.com)")
437 )
438 .arg(Arg::new("responsive")
439 .long("responsive")
440 .action(ArgAction::SetTrue)
441 .required(false)
442 .help("First check if the target is responsive from a single worker before sending probes from multiple workers/origins")
443 )
444 .arg(Arg::new("latency")
445 .long("latency")
446 .short('l')
447 .action(ArgAction::SetTrue)
448 .required(false)
449 .help("Measure anycast latencies (first, measure catching PoP; second, measure latency from catching PoP to target) [NOTE: currently violates probing rate at workers with high catchments]")
450 )
451 .arg(Arg::new("out")
452 .long("out")
453 .short('o')
454 .value_parser(value_parser!(String))
455 .required(false)
456 .help("Optional path and/or filename to store the results of the measurement (default ./)")
457 )
458 .arg(Arg::new("url")
459 .long("url")
460 .short('u')
461 .value_parser(value_parser!(String))
462 .required(false)
463 .default_value("")
464 .help("Encode URL in probes (e.g., for providing opt-out information, explaining the measurement, etc.)")
465 )
466 .arg(Arg::new("parquet")
467 .long("parquet")
468 .action(ArgAction::SetTrue)
469 .required(false)
470 .help("Write results as parquet instead of .csv.gz")
471 )
472 )
473 )
474 .get_matches()
475}