tech.v3.dataset.io.univocity
Bindings to univocity. Transforms CSV and TSV input into sequences
of string arrays that are then passed to tech.v3.dataset.io.string-row-parser
methods.
create-csv-parser
(create-csv-parser {:keys [header-row? num-rows column-whitelist column-blacklist column-allowlist column-blocklist separator n-initial-skip-rows], :or {header-row? true}, :as options})
Create an implementation of the univocity CSV parser.
csv->dataset
(csv->dataset input options)
(csv->dataset input)
Non-lazily and serially parse the columns. Returns a vector of maps of
{
 :name column-name
 :missing long-reader of in-order missing indexes
 :data typed reader/writer of data
 :metadata - optional map with unparsed-indexes and unparsed-values
}
Supports a subset of tech.v3.dataset/->dataset options:
:column-allowlist in preference to :column-whitelist
:column-blocklist in preference to :column-blacklist
:n-initial-skip-rows
:num-rows
:header-row?
:separator
:parser-fn
:parser-scan-len
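A minimal sketch of calling csv->dataset with a few of the options listed above. The file name "data.csv" is hypothetical, and passing a string path as input is an assumption carried over from how tech.v3.dataset/->dataset is commonly used; substitute whatever input type your version accepts.

```clojure
(require '[tech.v3.dataset.io.univocity :as univocity])

;; "data.csv" is a hypothetical path used only for illustration.
;; The option keys below are taken from the list of supported options.
(def parsed
  (univocity/csv->dataset "data.csv"
                          {:header-row? true          ; first row holds column names
                           :n-initial-skip-rows 0     ; do not skip any leading rows
                           :num-rows 100              ; read at most 100 rows
                           :column-allowlist ["id" "price"]})) ; hypothetical column names
```

Because parsing is non-lazy, :num-rows is a simple way to bound the work done when exploring a large file.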
csv->rows
(csv->rows input options)
(csv->rows input)
Given a csv, produces a sequence of rows. The csv options from ->dataset apply here.
options:
:column-allowlist - either a sequence of string column names or a sequence of column indices of columns to include. Used in preference to :column-whitelist.
:column-blocklist - either a sequence of string column names or a sequence of column indices of columns to exclude. Used in preference to :column-blacklist.
:num-rows - Number of rows to read.
:separator - Add a character separator to the list of separators to auto-detect.
:max-chars-per-column - Defaults to 4096. Columns with more characters than this will result in an exception.
:max-num-columns - Defaults to 8192. CSV and TSV files with more columns than this will fail to parse. For more information on this option, please visit: https://github.com/uniVocity/univocity-parsers/issues/301
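The options above might be combined as in the following sketch. The path "wide.csv" is hypothetical, and accepting a string path as input is an assumption. Each row is treated as a Java string array, as the namespace description states, so it is converted to a vector for easier inspection.

```clojure
(require '[tech.v3.dataset.io.univocity :as univocity])

;; "wide.csv" is a hypothetical file with long cell values and a ';' separator.
(def rows
  (univocity/csv->rows "wide.csv"
                       {:separator \;                 ; added to the auto-detect list
                        :num-rows 10                  ; limit how many rows are read
                        :max-chars-per-column 16384   ; raise the 4096 default
                        :max-num-columns 20000}))     ; raise the 8192 default

;; Convert each String[] into a Clojure vector for printing or inspection.
(def row-vectors (mapv vec rows))
```

Raising :max-chars-per-column or :max-num-columns is only needed when the defaults cause the parse to fail.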
PApplyWriteOptions
protocol
members
apply-write-options!
(apply-write-options! settings options)
raw-row-iterable
(raw-row-iterable input parser)
(raw-row-iterable input)
Returns an iterable that produces string arrays (String[]).
rows->csv!
(rows->csv! output header-string-array row-string-array-seq)
(rows->csv! output header-string-array row-string-array-seq {:keys [separator], :or {separator \tab}, :as options})
Given something convertible to an output stream, an optional header as a string array, and a sequence of string arrays, write a CSV or a TSV file.
Options:
:separator - Defaults to \tab.
:quoted-columns - For csv, specify which columns should always be quoted regardless of their data.
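A minimal sketch of writing rows with rows->csv!. The output path "out.csv" is hypothetical, and treating a string path as "convertible to an output stream" is an assumption. Since the separator defaults to \tab, a comma is passed explicitly to produce a CSV rather than a TSV.

```clojure
(require '[tech.v3.dataset.io.univocity :as univocity])

;; The header and each row are Java String arrays.
(def header (into-array String ["name" "value"]))

(def rows
  [(into-array String ["a" "1"])
   (into-array String ["b" "2"])])

;; "out.csv" is a hypothetical output path.
;; Without :separator the file would be tab-separated.
(univocity/rows->csv! "out.csv" header rows {:separator \,})
```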