
Does anybody know any Clojure machine learning framework?

Introduction

Yes, there are Clojure machine-learning options, but the ecosystem looks different from Python’s. Instead of one dominant package, Clojure users typically combine the Scicloj data-science stack with JVM machine-learning libraries such as Smile, Tribuo, Weka, or DJL.

The Practical Answer in 2026

If you are starting today, the most active Clojure data-science community is around Scicloj and Noj. Older libraries such as clj-ml still exist and can be useful for Weka-based workflows, but the center of gravity has moved toward:

  • noj for the current batteries-included data-science stack
  • tech.ml.dataset and tablecloth for data handling
  • metamorph and scicloj.ml style pipelines for model workflows
  • direct interop with JVM ML libraries when needed

That matters because the right answer is less “find the one Clojure ML framework” and more “use Clojure as a clean host language for a JVM ML stack.”

A Current Pipeline-Style Example

The older scicloj.ml quickstart is still a good way to understand the style, and Noj builds on the same ecosystem. Here is a small pipeline example using the documented API shape:

```clojure
;; deps.edn
{:deps
 {scicloj/scicloj.ml {:mvn/version "0.3"}}}
```

```clojure
(ns demo.core
  (:require [scicloj.ml.core :as ml]
            [scicloj.ml.metamorph :as mm]
            [scicloj.ml.dataset :as ds]))

(def titanic-train
  (ds/dataset
   "https://github.com/scicloj/metamorph-examples/raw/main/data/titanic/train.csv"
   {:key-fn keyword :parser-fn :string}))

(def pipe-fn
  (ml/pipeline
   (mm/select-columns [:Survived :Pclass])
   (mm/add-column :Survived
                  (fn [dataset]
                    (map #(case % "1" "yes" "0" "no" "")
                         (:Survived dataset))))
   (mm/categorical->number [:Survived :Pclass])
   (mm/set-inference-target :Survived)
   {:metamorph/id :model}
   (mm/model {:model-type :smile.classification/logistic-regression})))

(def trained
  (pipe-fn {:metamorph/data titanic-train
            :metamorph/mode :fit}))
```

This is a good illustration of how Clojure ML often feels: data preparation and pipeline composition are written idiomatically in Clojure, while the actual model implementation may come from Smile or another JVM library underneath.
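To make the fit step above less opaque, here is a minimal sketch of the underlying idea in plain Clojure with no library dependencies. It is not the metamorph implementation: `pipeline` and `center-column` are hypothetical helpers written for illustration, and the step id is passed explicitly here as a simplification. What it shows is the pattern itself: each step is a function from context map to context map, it learns state when `:metamorph/mode` is `:fit`, and it reuses that state in `:transform`.

```clojure
;; Hypothetical helper: compose steps left to right into one
;; context -> context function.
(defn pipeline [& steps]
  (fn [ctx] (reduce (fn [c step] (step c)) ctx steps)))

;; Hypothetical step: subtract a column mean. The mean is computed in
;; :fit mode and stored in the context under `id`; in any other mode
;; the stored value is reused.
(defn center-column [id col]
  (fn [{:metamorph/keys [mode data] :as ctx}]
    (let [mean (if (= mode :fit)
                 (/ (reduce + (map col data)) (double (count data)))
                 (get ctx id))]
      (-> ctx
          (assoc id mean)
          (assoc :metamorph/data (mapv #(update % col - mean) data))))))

(def sketch-pipe (pipeline (center-column :center-x :x)))

;; Fit on training rows: learns the mean of :x.
(def fitted
  (sketch-pipe {:metamorph/mode :fit
                :metamorph/data [{:x 1.0} {:x 3.0}]}))

;; Reuse the fitted context on new rows in :transform mode.
(def transformed
  (sketch-pipe (assoc fitted
                      :metamorph/mode :transform
                      :metamorph/data [{:x 5.0}])))

(:center-x fitted)            ;; => 2.0
(:metamorph/data transformed) ;; => [{:x 3.0}]
```

Passing the fitted context back into the same function is what lets one pipeline definition serve both training and prediction.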

Older but Still Relevant Options

clj-ml

clj-ml wraps Weka and gives you a more direct “Clojure wrapper over Java ML” experience. It is older, but it is still a reasonable answer for classical algorithms such as decision trees, Naive Bayes, clustering, and evaluation.

Conceptually, the flow looks like this:

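```clojure
(use 'clj-ml.classifiers)

;; Keyword spelling is kept as given here; check the docs for your
;; clj-ml version, since the spelling may differ between versions.
(def classifier (make-classifier :decission-tree :c45))
;; load or create a dataset, set the class column, then train
;; the classifier on that dataset
```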

If your need is tabular classification with mature JVM tooling rather than modern notebook-style data science, that style can still be productive.

Neanderthal

Neanderthal is not a full ML framework by itself. It is a high-performance numerical computing library. It becomes useful when you are building custom linear algebra or optimization code and want native-level performance from Clojure.

DJL, Tribuo, Smile, and Other JVM Libraries

Clojure’s JVM interop is strong, so using Java libraries directly is often the right move. That is especially true for deep learning, natural language processing, or specialized production inference where the Clojure wrapper ecosystem may be thinner than the Java ecosystem below it.
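As a rough illustration of what direct interop looks like, the sketch below uses only JDK classes as stand-ins, so it runs without any ML library on the classpath. The same call forms (constructor, instance method, static method, primitive arrays) would apply to classes from Smile, Tribuo, or DJL, but the actual class and method names would have to come from that library's documentation. `sum-doubles` is a hypothetical helper, not part of any library.

```clojure
(import 'java.util.DoubleSummaryStatistics)

;; Constructor call: (ClassName. args)
(def stats (DoubleSummaryStatistics.))

;; Instance method call: (.method object args)
(doseq [x [1.0 2.0 3.0 6.0]]
  (.accept stats (double x)))

(def mean (.getAverage stats))  ;; => 3.0
(def maximum (.getMax stats))   ;; => 6.0

;; Static method call: (ClassName/method args)
(def root (Math/sqrt 16.0))     ;; => 4.0

;; Many JVM numeric APIs take primitive arrays; double-array converts a
;; Clojure collection, and the ^doubles hint avoids reflection.
(defn sum-doubles [^doubles xs]
  (areduce xs i acc 0.0 (+ acc (aget xs i))))

(def total (sum-doubles (double-array [0.5 1.5 2.5]))) ;; => 4.5
```

In practice, most of the glue code for a Java ML library is this kind of conversion between Clojure collections and the arrays or builder objects the library expects.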

How to Choose

Use this rule of thumb:

  • Want a modern Clojure-first data workflow: start with Noj and the Scicloj stack.
  • Want classical ML with an existing wrapper: look at clj-ml.
  • Want maximum model coverage on the JVM: call Smile, Tribuo, DJL, or another Java library directly.
  • Want custom numeric kernels: use Neanderthal as a building block.

In other words, Clojure is a strong orchestration language for data and ML, even when the final estimator comes from outside a purely Clojure package.

Common Pitfalls

The first mistake is expecting a one-to-one clone of the Python ecosystem. Clojure’s strength is composition and interop, not cloning scikit-learn package names.

Another mistake is choosing a library based only on age or novelty. Some older wrappers are stable and perfectly usable, while some current projects are better for data wrangling than for model breadth.

Developers also often underestimate how useful direct Java interop is. On the JVM, “not written in Clojure” does not mean “hard to use from Clojure.”

Finally, distinguish between data-science tooling and model libraries. Not every library that helps with datasets, notebooks, or visualization is itself an ML framework.

Summary

  • Yes, Clojure has machine-learning options, but they form an ecosystem rather than one dominant framework.
  • The current community momentum is around Scicloj and Noj.
  • clj-ml remains a viable older wrapper for Weka-style classical ML.
  • Clojure works especially well as a host language for JVM ML libraries such as Smile, Tribuo, and DJL.
  • Pick the stack based on your workflow: data-science pipelines, classical ML, deep learning, or custom numerical code.
