Rust-Warc

A high-performance, easy-to-use reader for Web Archive (WARC) files.

use rust_warc::WarcReader;

use std::io;

fn main() {
    // we're taking input from stdin here, but any BufRead will do
    let stdin = io::stdin();
    let handle = stdin.lock();

    let warc = WarcReader::new(handle);

    let mut response_counter = 0;
    let mut response_size = 0;

    for item in warc {
        let record = item.unwrap(); // panics on an I/O error or a malformed record

        // header names are case insensitive
        if record.header.get(&"WARC-Type".into()) == Some(&"response".into()) {
            response_counter += 1;
            response_size += record.content.len();
        }
    }

    println!("response records: {}", response_counter);
    println!("response size: {} MiB", response_size >> 20);
}
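
The comment in the example notes that any `BufRead` will do, not just stdin. As a minimal sketch using only the standard library, the snippet below shows two other sources that satisfy that bound: an in-memory buffer wrapped in `Cursor`, and a plain `Read` wrapped in `BufReader`. The `count_lines` helper is hypothetical and only stands in for the place where a reader would be handed to `WarcReader::new`; it is not part of this crate.

```rust
use std::io::{BufRead, BufReader, Cursor};

// Hypothetical helper (not part of rust_warc): accepts anything that
// implements BufRead and counts the lines it can read.
fn count_lines<R: BufRead>(reader: R) -> usize {
    reader.lines().map_while(Result::ok).count()
}

fn main() {
    // An in-memory byte slice wrapped in Cursor implements BufRead.
    let data: &[u8] = b"WARC/1.0\r\nWARC-Type: response\r\n\r\n";
    let in_memory = Cursor::new(data);
    println!("lines from memory: {}", count_lines(in_memory));

    // Any Read (a File, a socket, a byte slice) becomes a BufRead
    // once wrapped in BufReader.
    let raw: &[u8] = b"WARC/1.0\r\n";
    let buffered = BufReader::new(raw);
    println!("lines via BufReader: {}", count_lines(buffered));
}
```

For a file on disk, the same pattern would presumably apply with `BufReader::new(File::open(path)?)` in place of the byte slice.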