
Bindings that make working with Tribuo more straightforward when using datasets.

;; Classification

tech.v3.dataset.tribuo-test> (def ds (classification-example-ds 10000))
tech.v3.dataset.tribuo-test> (def model (tribuo/train-classification (org.tribuo.classification.xgboost.XGBoostClassificationTrainer. 6) ds :label))
tech.v3.dataset.tribuo-test> (ds/head (tribuo/predict-classification model (ds/remove-columns ds [:label])))
_unnamed [5 3]:

| :prediction |        red |      green |
|-------------|-----------:|-----------:|
|         red | 0.92524981 | 0.07475022 |
|       green | 0.07464883 | 0.92535114 |
|       green | 0.07464883 | 0.92535114 |
|         red | 0.92525917 | 0.07474083 |
|       green | 0.07464883 | 0.92535114 |

;; Regression

tech.v3.dataset.tribuo-test> (def ds (ds/->dataset "test/data/winequality-red.csv" {:separator \;}))
tech.v3.dataset.tribuo-test> (def model (tribuo/train-regression (org.tribuo.regression.xgboost.XGBoostRegressionTrainer. 50) ds "quality"))
tech.v3.dataset.tribuo-test> (ds/head (tribuo/predict-regression model (ds/remove-columns ds ["quality"])))
_unnamed [5 1]:

| :prediction |
|------------:|
|  5.01974726 |
|  5.02164841 |
|  5.22696543 |
|  5.79519272 |
|  5.01974726 |


(classification-predictions->dataset predictions)

Given the list of predictions from a classification model, return a dataset that includes probability distributions when possible. The actual prediction is in the :prediction column.


(evaluate-regression model ds inf-col-name)

Evaluate a regression model against this dataset. Returns a map with the keys :rmse, :mae, and :r2.
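
A minimal sketch of evaluating on held-out rows, assuming the wine-quality CSV used in the namespace example above and an XGBoost regression trainer on the classpath. The 80/20 head/tail split and the names `wine`, `train-ds`, `test-ds` and `metrics` are illustrative choices, not part of the library.

```clojure
(require '[tech.v3.dataset :as ds]
         '[tech.v3.dataset.tribuo :as tribuo])
(import '[org.tribuo.regression.xgboost XGBoostRegressionTrainer])

;; Same file and separator as the namespace example above.
(def wine (ds/->dataset "test/data/winequality-red.csv" {:separator \;}))

;; Illustrative, unshuffled 80/20 split: first rows train, remaining rows test.
(def n-train (long (* 0.8 (ds/row-count wine))))
(def train-ds (ds/head wine n-train))
(def test-ds (ds/tail wine (- (ds/row-count wine) n-train)))

;; Train on the first part only, then score the held-out part.
(def model (tribuo/train-regression (XGBoostRegressionTrainer. 50) train-ds "quality"))
(def metrics (tribuo/evaluate-regression model test-ds "quality"))

;; Pull out the documented keys.
(select-keys metrics [:rmse :mae :r2])
```

Evaluating on rows the model has not seen gives a more honest error estimate than scoring the training data itself.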


(make-classification-datasource ds)
(make-classification-datasource ds inf-col-name)

Make a single-label classification datasource from a dataset.


(make-regression-datasource ds)
(make-regression-datasource ds inf-col-name)

Make a regression datasource from a dataset.


(predict-classification model ds)

Use this model to predict every row of this dataset, returning a new dataset containing at least a :prediction column. If the classifier can predict probability distributions, they are returned as separate per-label columns.
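
As a rough illustration, the :prediction column can be compared against known labels to estimate accuracy. The toy dataset, the `label-name` helper and the `accuracy` var below are hypothetical and exist only for this sketch. It assumes the XGBoost classification trainer is on the classpath and that string labels are accepted as-is.

```clojure
(require '[tech.v3.dataset :as ds]
         '[tech.v3.dataset.tribuo :as tribuo])
(import '[org.tribuo.classification.xgboost XGBoostClassificationTrainer])

;; Hypothetical toy data: one numeric feature, label decided by a threshold.
(def xs (vec (repeatedly 500 rand)))
(def toy-ds (ds/->dataset {:x xs
                           :label (mapv #(if (> % 0.5) "red" "green") xs)}))

(def model (tribuo/train-classification (XGBoostClassificationTrainer. 6) toy-ds :label))
(def preds (tribuo/predict-classification model (ds/remove-columns toy-ds [:label])))

;; Hypothetical helper: normalise keyword or string labels to plain strings,
;; so the comparison does not depend on how labels are represented.
(defn label-name [x] (if (keyword? x) (name x) (str x)))

;; Fraction of rows whose predicted label matches the known label.
(def accuracy
  (/ (count (filter true? (map #(= (label-name %1) (label-name %2))
                               (ds/column preds :prediction)
                               (ds/column toy-ds :label))))
     (double (ds/row-count toy-ds))))
```

This scores the training rows themselves, so the number is optimistic. A held-out split would be needed for a real estimate.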


(predict-regression model ds)

Use a regression model to predict each row of the dataset, returning a dataset with at least one column named :prediction.


(train-classification trainer ds & [inf-col-name])

Train a single-label classification model. Returns the model.


(train-regression trainer ds & [inf-col-name])

Train a regression model on a dataset, returning the model.


(trainer config-components trainer-name)

Creates a Tribuo trainer from a list of config components following the OLCUT convention. One of the components should be a trainer, which is then looked up by trainer-name and returned.
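
A hedged sketch of what a call might look like. The shape of a config component is not specified above, so the map layout here (:name, :type, and string-valued :properties) is an assumption modelled on OLCUT's component/property structure. The CART trainer class additionally assumes Tribuo's decision tree module is on the classpath.

```clojure
(require '[tech.v3.dataset.tribuo :as tribuo])

;; ASSUMPTION: each component is a map with :name, :type and optional
;; :properties whose values are strings, mirroring an OLCUT <component> entry.
(def components
  [{:name "cart"
    :type "org.tribuo.classification.dtree.CARTClassificationTrainer"
    :properties {:maxDepth "8"}}])

;; Look up the component named "cart" and return it as a trainer instance.
(def cart-trainer (tribuo/trainer components "cart"))
```

If the assumed shape holds, the returned object can be passed to train-classification in place of a trainer constructed directly through Java interop.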