
Transform to CSV · Issue #83 · USPTO/PatentPublicData · GitHub
Transform to CSV #83

Open
bgfeldm opened this issue Mar 5, 2019 · 0 comments

bgfeldm commented Mar 5, 2019

Transform into CSV in two variations:

  1. Exploded CSV to load into relational database tables
  • Handling multiple values
    • Multiple files are created, named by their type
    • Each individual entity becomes a record row to load
  2. Flat CSV to load into a Solr index
  • Handling multiple values
    • Individual entities are grouped together as multiple values delimited by a pipe "|"
  • Field names, by default, will use Solr's default dynamic field endings
    • This allows creating a Solr core/collection and indexing with less setup time, eliminating the initial need to create a Solr schema
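
To make the two variations concrete, here is a minimal sketch in Python using only the standard library. It is not the project's implementation: the record, its field names, the output file names, and the helper functions `exploded_csv` and `flat_csv` are all hypothetical. The `_s` (single string) and `_ss` (multi-valued string) suffixes are assumed to match the dynamic fields of the target Solr configset.

```python
import csv
import io

# Hypothetical patent record; field names are illustrative only.
record = {
    "id": "US1234567A1",
    "title": "Example widget",
    "inventors": ["Ada Smith", "Bo Chen"],
    "classifications": ["G06F", "H04L"],
}


def exploded_csv(rec):
    """Return {file name: CSV text}: one file per entity type, one row per entity."""
    files = {}
    # Single-valued fields go into the main table.
    main = io.StringIO()
    writer = csv.writer(main, lineterminator="\n")
    writer.writerow(["id", "title"])
    writer.writerow([rec["id"], rec["title"]])
    files["patent.csv"] = main.getvalue()
    # Each multi-valued field becomes its own file, keyed by the record id.
    for field in ("inventors", "classifications"):
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(["patent_id", field])
        for value in rec[field]:
            writer.writerow([rec["id"], value])
        files[f"{field}.csv"] = buf.getvalue()
    return files


def flat_csv(rec):
    """Return CSV text with one row per record; multiple values joined by '|'."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    # Assumed suffixes: _s = single string, _ss = multi-valued strings.
    writer.writerow(["id", "title_s", "inventors_ss", "classifications_ss"])
    writer.writerow([
        rec["id"],
        rec["title"],
        "|".join(rec["inventors"]),
        "|".join(rec["classifications"]),
    ])
    return buf.getvalue()
```

In this sketch a value that itself contains "|" would be split incorrectly on the Solr side, so a real transform would need to escape or reject such values.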

Both formats will be useful for big data processing, and almost all databases have native support for loading CSV.
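
As an illustration of the relational loading path, the sketch below reads one exploded CSV file into a table. It uses Python's standard `sqlite3` module as a stand-in for any relational database; the CSV contents, the table name, and the `load_csv` helper are hypothetical, and a production load would more likely use the database's own bulk CSV import.

```python
import csv
import io
import sqlite3

# Hypothetical contents of an exploded "inventors" file.
inventors_csv = (
    "patent_id,inventor\n"
    "US1234567A1,Ada Smith\n"
    "US1234567A1,Bo Chen\n"
)


def load_csv(conn, table, csv_text):
    """Create `table` from the CSV header and insert every row; return the row count.

    The table and column names are interpolated into SQL, so they must come
    from a trusted source.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    rows = list(reader)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
count = load_csv(conn, "inventors", inventors_csv)
names = [row[0] for row in conn.execute(
    "SELECT inventor FROM inventors ORDER BY inventor")]
```

Because each entity is already its own row, the file maps directly onto a table with no splitting logic at load time.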

bgfeldm self-assigned this Mar 5, 2019