@bitextor

Bitextor Team

Translation memories generator

Pinned

  1. bitextor Public

    Bitextor generates translation memories from multilingual websites

    Python 294 42

  2. bicleaner Public

    Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.

    Python 158 22

  3. bifixer Public

    Tool to fix bitexts and tag near-duplicates for removal

    Python 30 3

  4. biroamer Public

    Utility that helps you ROAM (Random, Omit, Anonymize and Mix) your parallel corpus.

    Python 10 2

  5. pdf-extract Public

    PDF parser and converter to HTML

    Java 86 13

  6. warc2text Public

    Extracts plain text, language identification and more metadata from WARC records

    C++ 23 6

Repositories

Showing 10 of 29 repositories

