tech.v3.dataset.column-filters
Queries to select column subsets that have various properties, such as all numeric columns, all feature columns, or columns that have a specific datatype.
In addition, a few set operations (union, intersection, difference) are provided to further manipulate subsets of columns.
All functions are transformations from dataset to dataset.
The functions in this namespace use the metadata on the columns of the dataset, which can be inspected via clojure.core/meta.
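As a minimal sketch of the dataset-to-dataset pattern described above, the following builds a small dataset and inspects one column's metadata. The dataset `people` and its column names are hypothetical, and the exact set of metadata keys may vary between library versions.

```clojure
(require '[tech.v3.dataset :as ds]
         '[tech.v3.dataset.column-filters :as cf])

;; Hypothetical example data; the column names are illustrative only.
(def people
  (ds/->dataset {:name ["ann" "bob" "cat"]
                 :age  [31 45 27]}))

;; Calling a dataset with a column name returns that column.
;; Its metadata is a plain map that includes keys such as :name and :datatype.
(def age-meta (meta (people :age)))

;; Every filter takes a dataset and returns a dataset, so calls compose
;; with the rest of the tech.v3.dataset API.
(ds/column-names (cf/categorical people))
```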
categorical
(categorical dataset)
Return a dataset containing only the categorical columns.
column-filter
(column-filter dataset filter-fn)
Return a dataset with only the columns for which the filter function returns a truthy value.
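A short sketch of column-filter, assuming the filter function is called with each column itself (rather than its metadata). The dataset and the "more than one distinct value" predicate are hypothetical choices made for illustration.

```clojure
(require '[tech.v3.dataset :as ds]
         '[tech.v3.dataset.column-filters :as cf])

;; Hypothetical example data.
(def readings
  (ds/->dataset {:constant [1 1 1]
                 :varying  [1 2 3]}))

;; Keep only the columns that contain more than one distinct value.
(def varying-only
  (cf/column-filter readings
                    (fn [col] (> (count (distinct col)) 1))))

(ds/column-names varying-only)
```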
difference
(difference lhs-ds rhs-ds)
(difference lhs-ds)
Return the columns in lhs which do not have an equivalently named column in rhs.
feature
(feature dataset)
Return a dataset containing only the columns which have not been marked as inference columns.
intersection
(intersection lhs-ds rhs-ds)
Return only the columns of rhs for which an equivalently named column exists in lhs.
metadata-filter
(metadata-filter dataset filter-fn)
Return a dataset with only the columns for which, given the column metadata, the filter function returns a truthy value.
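A minimal sketch of metadata-filter. Here the filter function receives the same map that clojure.core/meta returns for a column, so any function of a map works as a predicate. The dataset below is hypothetical.

```clojure
(require '[tech.v3.dataset :as ds]
         '[tech.v3.dataset.column-filters :as cf])

;; Hypothetical example data.
(def orders
  (ds/->dataset {:customer ["ann" "bob"]
                 :quantity [2 5]}))

;; Keep the columns whose metadata reports the :string datatype.
(def string-cols
  (cf/metadata-filter orders (fn [col-meta] (= :string (:datatype col-meta)))))

(ds/column-names string-cols)
```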
no-missing
(no-missing dataset)
Return a dataset with only columns that have no missing values.
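For illustration, a small sketch of no-missing on a hypothetical dataset in which a nil entry is parsed as a missing value.

```clojure
(require '[tech.v3.dataset :as ds]
         '[tech.v3.dataset.column-filters :as cf])

;; Hypothetical example data; the nil in :a becomes a missing value.
(def sparse
  (ds/->dataset {:a [1 nil 3]
                 :b [4 5 6]}))

;; Only :b has no missing values, so only :b is retained.
(def complete (cf/no-missing sparse))

(ds/column-names complete)
```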
of-datatype
(of-datatype dataset datatype)
Return a dataset containing only the columns of a specific datatype.
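A brief sketch of of-datatype. The dataset is hypothetical, and the example assumes that Clojure doubles are stored as :float64 and strings as :string, which is the usual result of ->dataset.

```clojure
(require '[tech.v3.dataset :as ds]
         '[tech.v3.dataset.column-filters :as cf])

;; Hypothetical example data.
(def samples
  (ds/->dataset {:label  ["x" "y"]
                 :weight [1.5 2.5]}))

;; Select by the datatype keyword recorded in the column metadata.
(def floats  (cf/of-datatype samples :float64))
(def strings (cf/of-datatype samples :string))

(ds/column-names floats)
```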
probability-distribution
(probability-distribution dataset)
Return the columns of the dataset that comprise the probability distribution after classification.
target
(target dataset)
Return a dataset containing only the columns that have been marked as inference targets.
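A sketch of how target and feature split a dataset, assuming the inference target is marked with set-inference-target from the tech.v3.dataset.modelling namespace. The dataset and its column names are hypothetical.

```clojure
(require '[tech.v3.dataset :as ds]
         '[tech.v3.dataset.column-filters :as cf]
         '[tech.v3.dataset.modelling :as ds-mod])

;; Hypothetical example data with :species marked as the inference target.
(def flowers
  (-> (ds/->dataset {:petal-length [1.4 4.7 5.1]
                     :petal-width  [0.2 1.4 1.9]
                     :species      ["a" "b" "c"]})
      (ds-mod/set-inference-target :species)))

;; target keeps the marked columns; feature keeps everything else.
(def target-cols  (cf/target flowers))
(def feature-cols (cf/feature flowers))

(ds/column-names target-cols)
```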
union
(union lhs-ds rhs-ds)
Return all columns of lhs along with any columns in rhs which have names that do not exist in lhs.
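The set operations can be combined with the other filters in this namespace. The following is a minimal sketch on a hypothetical dataset; column order in the results is not asserted here.

```clojure
(require '[tech.v3.dataset :as ds]
         '[tech.v3.dataset.column-filters :as cf])

;; Hypothetical example data; the nil in :city is a missing value.
(def visits
  (ds/->dataset {:id    [1 2 3]
                 :city  ["oslo" nil "rome"]
                 :score [0.5 1.5 2.5]}))

(def string-cols   (cf/of-datatype visits :string)) ; :city
(def complete-cols (cf/no-missing visits))          ; :id and :score

;; Columns of lhs plus any rhs columns whose names are not in lhs.
(def all-cols (cf/union complete-cols string-cols))

;; Columns of lhs with no equivalently named column in rhs.
(def non-string (cf/difference visits string-cols))

;; Columns whose names appear in both datasets.
(def shared (cf/intersection visits complete-cols))
```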