ropensci/charlatan

charlatan

Project Status: Active – The project has reached a stable, usable state and is being actively developed.

charlatan makes fake data. It is inspired by, and borrows some code from, Python's faker (https://github.com/joke2k/faker).

Make fake data for:

  • person names
  • jobs
  • phone numbers
  • colors: names, hex, rgb
  • credit cards
  • DOIs
  • numbers in range and from distributions
  • gene sequences
  • geographic coordinates
  • emails
  • URIs, URLs, and their parts
  • IP addresses
  • more coming ...

Possible use cases for charlatan:

  • Students in a classroom setting learning any task that needs a dataset.
  • People doing simulations/modeling who need some fake data
  • Generate a fake dataset of users for a database before actual users exist
  • Complete missing spots in a dataset
  • Generate fake data to replace sensitive real data before public release
  • Create a random set of colors for visualization
  • Generate random coordinates for a map
  • Get a set of randomly generated DOIs (Digital Object Identifiers) to assign to fake scholarly artifacts
  • Generate fake taxonomic names for a biological dataset
  • Get a set of fake sequences to use to test code/software that uses sequence data
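Several of the use cases above map onto one-line calls. The following is a minimal sketch, assuming the helpers ch_doi(), ch_color_name(), ch_lon(), ch_lat(), and ch_gene_sequence() are exported by your installed version of charlatan (they are not demonstrated elsewhere in this README, so check the package index if a call fails). The values are random, so no output is shown.

```r
library("charlatan")

# Random DOIs to assign to fake scholarly artifacts
dois <- ch_doi(n = 3)

# A random set of color names for a visualization
cols <- ch_color_name(n = 5)

# Random coordinates for a map (longitude and latitude are drawn independently)
coords <- data.frame(lon = ch_lon(n = 5), lat = ch_lat(n = 5))

# Fake sequences for testing code that consumes sequence data
seqs <- ch_gene_sequence(n = 2)
```

Each call returns a plain vector, so the results can be combined into a data.frame or passed straight to plotting and testing code.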

Reasons to use charlatan:

  • Lightweight, with few dependencies
  • Relatively comprehensive types of data, with more being added
  • Comprehensive set of supported languages, with more being added
  • Useful R features such as creating entire fake data.frames

Installation

CRAN version

install.packages("charlatan")

dev version

remotes::install_github("ropensci/charlatan")
library("charlatan")
set.seed(12345)
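
The set.seed(12345) call above fixes the state of R's random number generator before the examples run. The sketch below checks whether resetting the seed repeats a draw. It assumes the providers sample from R's global random number generator, which the use of set.seed() here suggests but this README does not state explicitly.

```r
library("charlatan")

set.seed(12345)
first <- ch_job(n = 3)

set.seed(12345)            # reset the generator to the same state
second <- ch_job(n = 3)

# Compare the two draws; they match if the assumption above holds
identical(first, second)
```

Seeding this way is useful when fake data feeds a test suite or a tutorial where readers should see the same values.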

high level function

... for all fake data operations

x <- fraudster()
x$job()
#> [1] "Corporate investment banker"
x$name()
#> [1] "Dr. Garey Hamill"
x$color_name()
#> [1] "Ivory"

locale support

ch_job(locale = "fr_FR", n = 3)
#> [1] "Tailleur de pierre" "Soigneur"           "Ingénieur"
ch_job(locale = "hr_HR", n = 3)
#> [1] "Stalni sudski vještak" "Viši muzejski pedagog" "Kozmetičar"
ch_job(locale = "uk_UA", n = 3)
#> [1] "Льотчик"  "Дипломат" "Педагог"
ch_job(locale = "zh_TW", n = 3)
#> [1] "行政人員"     "珠心算老師"   "飯店工作人員"

generate a dataset

ch_generate()
#> # A tibble: 10 × 3
#>    name                    job                                      phone_number
#>    <chr>                   <chr>                                    <chr>       
#>  1 Deana Mraz DDS          Printmaker                               +25(2)69696…
#>  2 Delina Kilback          Administrator, charities/voluntary orga… 04380296996 
#>  3 Mandi Bailey PhD        Systems analyst                          09381790856 
#>  4 Ms. Trista Jacobson DVM Pharmacist, hospital                     214-956-893…
#>  5 King Bartoletti         Teacher, music                           1-312-788-3…
#>  6 Dr. Ike Gerhold         Audiological scientist                   743.877.3448
#>  7 Dr. Tatyanna Blanda DVM Manufacturing systems engineer           09691101846 
#>  8 Antione Grant           Regulatory affairs officer               (406)994-27…
#>  9 Michal Gutmann          Chartered management accountant          (576)667-99…
#> 10 Ross Cartwright PhD     Video editor                             07913227887
ch_generate("job", "phone_number", n = 30)
#> # A tibble: 30 × 2
#>    job                               phone_number        
#>    <chr>                             <chr>               
#>  1 Scientist, research (medical)     +63(0)0054265468    
#>  2 Contracting civil engineer        +97(1)8445952277    
#>  3 Geneticist, molecular             167-865-4109x84457  
#>  4 Equities trader                   737.695.1498x1212   
#>  5 Interior and spatial designer     +49(7)9909862225    
#>  6 Geophysical data processor        1-884-863-2289x58137
#>  7 Ophthalmologist                   060-919-7672x6069   
#>  8 Engineer, agricultural            180-370-0811x1948   
#>  9 Dealer                            1-838-787-0534      
#> 10 Environmental health practitioner 884.224.4881        
#> # ℹ 20 more rows

job

ch_job()
#> [1] "Set designer"
ch_job(10)
#>  [1] "Actuary"                                    
#>  [2] "Public house manager"                       
#>  [3] "Orthoptist"                                 
#>  [4] "Broadcast engineer"                         
#>  [5] "Scientist, research (physical sciences)"    
#>  [6] "Nature conservation officer"                
#>  [7] "Camera operator"                            
#>  [8] "Psychologist, prison and probation services"
#>  [9] "Engineer, communications"                   
#> [10] "IT sales professional"

credit cards

ch_credit_card_provider()
#> [1] "JCB 15 digit"
ch_credit_card_provider(n = 4)
#> [1] "VISA 16 digit"               "Voyager"                    
#> [3] "JCB 15 digit"                "Diners Club / Carte Blanche"
ch_credit_card_number(n = 10)
#>  [1] "3009338214996378"    "4713530558707"       "3158362208111956356"
#>  [4] "53355347405525029"   "3720351812179086"    "3044619385256147"   
#>  [7] "3789072424345968"    "4208219491023"       "3096893682997724534"
#> [10] "4419344554874021"
ch_credit_card_security_code()
#> [1] "866"
ch_credit_card_security_code(10)
#>  [1] "351"  "462"  "439"  "1922" "497"  "879"  "998"  "368"  "280"  "337"

Documentation

All providers have documentation available through the help functions. Providers for the same locale are linked together, and every language has a generic page, for example ?`dutch-language`.

There are three vignettes: one about contributing to this project, one about what {charlatan} does, and a more in-depth one about creating realistic data.
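
The help pages and vignettes mentioned above can be reached from an R session with standard utilities. This is a short sketch: the topic name dutch-language is taken from the text above, and the vignette listing assumes the installed package was built with its vignettes.

```r
library("charlatan")

# Package-level help index listing all documented providers
help(package = "charlatan")

# A generic language page; equivalent to ?`dutch-language`
help("dutch-language", package = "charlatan")

# List the vignettes shipped with the installed package
vigs <- vignette(package = "charlatan")
print(vigs)
```

Individual vignettes can then be opened by name with vignette("name", package = "charlatan"), using a name from the printed listing.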

Usage in the wild

Contributors

If you would like to contribute, see CONTRIBUTING (on GitHub).

similar art

Meta

  • Please report any issues or bugs.
  • License: MIT
  • Get citation information for charlatan in R by running citation(package = 'charlatan')
  • Please note that this package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.