A Java library for machine learning

  • Provides a framework for clustering (both online and batch, both supervised and unsupervised). Currently implements K-means, UPGMA, and Kohonen Self-Organizing Maps.
  • Implements various Monte Carlo methods, including Metropolis-coupled MCMC.
  • Implements various statistical models of strings, e.g. Markov models, variable-memory Markov models, and the closely related Probabilistic Suffix Automata (PSAs) and Probabilistic Suffix Trees (PSTs).
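
To give a flavor of the simplest kind of string model listed above, here is a minimal sketch of a first-order Markov model over characters in plain Java. It does not use the ml API: the class name MarkovSketch and its methods are hypothetical, chosen only for this illustration, and the estimates are plain maximum-likelihood counts with no smoothing.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a first-order Markov model over characters.
// The class and method names are hypothetical and are NOT part of the ml API.
public class MarkovSketch {
    // counts.get(prev).get(next) = number of times `next` followed `prev`
    private final Map<Character, Map<Character, Integer>> counts = new HashMap<>();
    // totals.get(prev) = number of transitions observed out of `prev`
    private final Map<Character, Integer> totals = new HashMap<>();

    // Count every adjacent character pair in the training string.
    public void train(String s) {
        for (int i = 0; i + 1 < s.length(); i++) {
            char prev = s.charAt(i);
            char next = s.charAt(i + 1);
            counts.computeIfAbsent(prev, k -> new HashMap<>()).merge(next, 1, Integer::sum);
            totals.merge(prev, 1, Integer::sum);
        }
    }

    // Maximum-likelihood estimate of P(next | prev); 0 if `prev` was never seen.
    public double probability(char prev, char next) {
        Integer total = totals.get(prev);
        if (total == null) {
            return 0.0;
        }
        int c = counts.get(prev).getOrDefault(next, 0);
        return (double) c / total;
    }

    // Sum of log transition probabilities along s.
    // The distribution of the first symbol is ignored in this sketch.
    public double logLikelihood(String s) {
        double sum = 0.0;
        for (int i = 0; i + 1 < s.length(); i++) {
            sum += Math.log(probability(s.charAt(i), s.charAt(i + 1)));
        }
        return sum;
    }

    public static void main(String[] args) {
        MarkovSketch model = new MarkovSketch();
        model.train("abracadabra");
        System.out.println(model.probability('a', 'b'));
        System.out.println(model.logLikelihood("abra"));
    }
}
```

Because nothing is smoothed, any transition that never occurred in training has probability 0, so logLikelihood returns negative infinity for strings containing it. Variable-memory models such as PSTs extend this idea by conditioning on contexts longer than one symbol where the data support it.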

This project has some goals in common with Weka and RapidMiner (aka YALE), but is far less developed.

(This has nothing to do with the ML programming language.)

Documentation

Download

Maven is by far the easiest way to make use of ml. Just add these to your pom.xml:

<repositories>
	<repository>
		<id>dev.davidsoergel.com releases</id>
		<url>http://dev.davidsoergel.com/artifactory/repo</url>
		<snapshots>
			<enabled>false</enabled>
		</snapshots>
	</repository>
	<repository>
		<id>dev.davidsoergel.com snapshots</id>
		<url>http://dev.davidsoergel.com/artifactory/repo</url>
		<releases>
			<enabled>false</enabled>
		</releases>
	</repository>
</repositories>

<dependencies>
	<dependency>
		<groupId>edu.berkeley.compbio</groupId>
		<artifactId>ml</artifactId>
		<version>0.9</version>
	</dependency>
</dependencies>

If you really want just the jar, you can get it here: ml-0.9.jar (x KB), May 9, 2008

Or get the latest stable build from the continuous integration server.

You can also browse the source, or get the source with svn:

svn co http://svn.davidsoergel.com/repos/ml/trunk ml

Support