tvm-clj.schedule

After describing the algorithm, the user creates a ‘schedule’ for the algorithm which involve transformations to the algorithm that are guaranteed not to change the results such as the tiling a computation across a tensor.

->stage

(->stage stage-or-schedule operation)

view source

create-schedule

(create-schedule op-seq)

view source

inline-op

(inline-op schedule src-op dst-op rel-axis)

Inline an operation on the axis given. If axis is a number, then positive numbers increment left-to-right while negative numbers increment right-to-left in python semantics of the destination axis.

rel-axis defaults to -1, or the most-rapidly-changing index.

view source

parallelize-axis

(parallelize-axis schedule op rel-axis)

view source

schedule-cache-read

(schedule-cache-read schedule tensor cache-type readers)

view source

schedule-cache-write

(schedule-cache-write schedule tensor cache-type)

Returns a new tensor

view source

stage-bind

(stage-bind stage iter-var thread-ivar)

Bind an iter-var to a stage variable

view source

stage-bind-gpu

(stage-bind-gpu stage block-axis-seq thread-axis-seq)

Bind the gpu-defined axis to the tvm axis. GPU (cuda, opencl) define a roughly level stage breakdown of axis: block and thread. Threads run on the same block and can share a special kind of memory (called shared memory). There can be up to 3 tvm axis per block or thread and these are labeled (outer iterator to inner iterator): [z y x]

view source