This package forms a complete gradient descent machine learning library. Modules include support vector machines for classification and regression, ensemble models such as bagging or AdaBoost, and non-parametric models such as K-nearest neighbors, Parzen regression, and Parzen density estimation. Includes speech recognition tools. Written in C++. [BSD]
A repository of databases, domain theories and data generators, maintained at the University of California at Irvine, that the machine learning community uses for the empirical analysis of machine learning algorithms. [Free]
A library of C code useful for writing statistical text analysis, language modeling, and information retrieval programs. The current distribution includes the library, as well as front-ends for document classification (rainbow), document retrieval (arrow) and document clustering (crossbow). [LGPL]
A collection of tools that implement decision trees and tables, rule learners, Naive Bayes, support vector machines, voted perceptrons, and multi-layer perceptrons. Meta schemes include bagging, stacking, and boosting. Written in Java. [GPL]
A system of research planning and learning utilizing explanation-based learning, partial evaluation, experimentation, graphical knowledge acquisition, automatic abstraction, mixed-initiative planning, and case-based reasoning. [Free]
A general approach to the problem of inducing natural language parsers. It uses an annotated corpus, and produces a parser by using ILP for inducing the rules that control the actions of a shift-reduce parser. [Free]
Hidden Markov Models software library from the Center of Applied Informatics, Cologne. Includes algorithms such as Viterbi, Baum-Welch, and Forward-Backward. [GPL]
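For readers unfamiliar with the algorithms named above, the following is a minimal Python sketch of Viterbi decoding for a discrete HMM. It is not taken from the library itself; the function name `viterbi`, the dictionary-based model layout, and the toy Healthy/Fever model are all assumptions made for illustration.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely hidden state sequence."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the predecessor that maximizes the path probability into s.
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return prob, path


# Hypothetical toy model, used only to exercise the function.
states = ["Healthy", "Fever"]
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
}
emit_p = {
    "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}
prob, path = viterbi(["normal", "cold", "dizzy"], states, start_p, trans_p, emit_p)
```

A production implementation would normally work with log probabilities to avoid numerical underflow on long sequences; the sketch multiplies raw probabilities for readability.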
Computational model of human language acquisition written in Java; currently acquires a protolanguage of nouns and verbs based on visual perception. [BSD]
Software for counting and analyzing word n-grams in text. This package provides standard tests of association for identifying word n-grams in large corpora and allows users to implement other tests with minimal scripting knowledge. Written in Perl. [GPL]
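As an illustration of what counting n-grams and scoring them with a test of association involves, here is a short Python sketch that counts bigrams and scores each with pointwise mutual information (PMI). The package described above is written in Perl and offers its own set of tests; the function name `bigram_pmi`, the choice of PMI, and the sample sentence are assumptions made for this example only.

```python
from collections import Counter
import math


def bigram_pmi(tokens):
    """Count bigrams and score each with pointwise mutual information (log base 2)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_xy = count / n_bi          # observed bigram probability
        p_x = unigrams[w1] / n_uni   # probability of the first word
        p_y = unigrams[w2] / n_uni   # probability of the second word
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return bigrams, scores


# Hypothetical sample text.
tokens = "new york is a city and new york is big".split()
bigrams, scores = bigram_pmi(tokens)
```

A positive score means the two words co-occur more often than they would if they were independent. PMI is known to favor rare pairs, so on real corpora it is usually combined with a minimum frequency cutoff.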
A program implementing several memory-based learning techniques. These learners store a representation of the training set explicitly, and classify new cases by extrapolation from the most similar stored cases. [AFL]
A Windows-based program that classifies text based on training material. Designed for automated essay scoring, BETSY can be applied to any text classification task. [GPL]
A software package to discover motifs (highly conserved regions) in groups of related DNA or protein sequences and to search sequence databases using motifs. [Commercial]
A library of tools for constructing maximum entropy (maxent) models in either Python or C++. Features include L-BFGS and GIS parameter estimation, and Gaussian prior smoothing. [GPL]
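The idea of memory-based learning can be sketched in a few lines of Python: keep the training cases as they are, and label a new case by majority vote among its k nearest stored neighbours. This is a generic k-nearest-neighbour illustration, not the program's own algorithm or interface; the function name `knn_classify`, the Euclidean distance metric, and the toy data are assumptions.

```python
from collections import Counter
import math


def knn_classify(training_set, query, k=3):
    """Classify `query` by majority vote among the k most similar stored cases."""
    # Each stored case is a (features, label) pair; rank by Euclidean distance.
    neighbours = sorted(training_set, key=lambda case: math.dist(case[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical training cases: two well-separated clusters.
training_set = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.8), "a"),
    ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"),
    ((5.2, 4.9), "b"),
    ((4.8, 5.1), "b"),
]
label_near_a = knn_classify(training_set, (1.1, 1.0))
label_near_b = knn_classify(training_set, (5.1, 5.0), k=1)
```

No model is fitted in advance: all the work happens at classification time, which is why such learners are also described as lazy.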
The RapidMiner toolset is an environment for machine learning through use of nested operators. Multiple experiments can be arbitrarily nested together through a graphical, XML-based user interface. (Formerly YALE) [GPL]
The aim of this project is to develop a Computational Environment for integrating the design and use of knowledge extraction models from data using evolutionary algorithms. Genetic learning may also be applied to the model. [GPL]
This library allows probabilistic sequence models to be constructed using hidden Markov models (HMMs) and hierarchical hidden Markov models (HHMMs) in the OCaml programming language. [GPL]
An object-oriented environment for machine learning in Matlab. Algorithms can be plugged together and compared using, e.g., model selection, statistical tests and visual plots. Algorithms may be downloaded separately. [GPL]
A suite of Java libraries for the linguistic analysis of human language which can link entity mentions to database entries, uncover relations, cluster documents, and discover significant trends. [GPL]
An integrated collection of Java code useful for statistical natural language processing, document classification, clustering, information extraction, and other machine learning applications to text. [GPL]
Software which allows one to navigate (fly) through the data tree, zoom in on interesting nodes, click on bars to get counts, and mark interesting places in the tree. Includes datasets for automobiles, voting, produce, and medical research. Uses LEDA ([AFL] licensed only). [GPL]
Algorithms that address the searching and matching of strings and of more complicated patterns such as trees, regular expressions, graphs, point sets, and arrays. [GPL]
Programs to cluster similar contexts together using unsupervised knowledge-lean methods for word sense discrimination, email categorization, and name discrimination. Written in Perl. [GNU]
Various software packages from the staff and alumni of The University of Texas at Austin which include inductions, a partial lazy evaluation, and a derivative of the Theo system. [GPL]
Programmatically isolate similarities between scattered classes of genes. Expression driven. Utilizes a voting method along with a k-Nearest-Neighbors classification. Very rich graphical interface. Samples of an unknown class are possible given enough data. Fully functional demo. [Commercial]
This tool implements hidden Markov models and applies them to part-of-speech tagging. Also available: multivariate hypothesis testing software for Gaussian data, and a ground-truth/metadata editing and visualization toolkit for OCR. [GPL]
A program which discovers interesting and repetitive subgraphs in labeled graph representations using the minimum description length principle. Includes applications to molecular biology. [Free]
JProGraM is a machine learning library which supports learning and inference algorithms for Bayesian networks, Markov random fields, hybrid random fields, probabilistic decision trees, dependency networks, and Parzen windows. [GPL]
A machine learning library for classification, regression, ranking and reinforcement learning. It implements several well-known algorithms and is specially designed for large-scale applications. [GPL]
A Markov Logic Interpreter that focuses on efficient MAP inference and Online Learning featuring MAP inference using Cutting Planes combined with Max-Walk-Sat programming, parametrized weights, a shell interpreter, and cardinality constraints. [GPL]
An algorithm engine which will calculate everything from symmetry, torsion angles, polar fraction through protein analysis and bond angles. Online version only. [Free]
A community-maintained listing of reproducible research via open source software, open access to data and results, and open standards for interchange. [Free]
A select list of a few very good machine learning systems from the Carnegie Mellon University School of Computer Science. Four FTP repository links are also provided for exploring. [GPL]
Provides GMDH-based machine learning technology for classification, continuous value prediction and time series forecasting. The software uses multi-core processors and HPC Linux clusters. [Commercial]
A generic SVM object interface with many implementations including OCAS, Liblinear, LibSVM, SVMLight, SVMLin, and GPDT. Each SVM provides implementations of the most common kernels, such as the linear, polynomial, Gaussian, and sigmoid kernels. Implemented in C++ with interfaces to Matlab, R, Octave and Python. [GPL]
Extreme learning machines (ELMs) for generalized single-hidden-layer feedforward networks (SLFNs) and how to build them. Since these networks work as universal approximators with adjustable hidden parameters, all parameters of ELMs can be analytically determined instead of being tuned. Written in Matlab. [GPL]
A compiler with built-in machine learning that searches for the most efficient compilation for the processor the program runs on. A collective optimization database, a predictor web service, and frameworks are available to make suggestions to the user and further improve efficiency. This is the open source compiler that IBM has issued press releases about. [GPL]
A high performance Python package for predictive modeling. Fast N-dimensional array manipulation is performed via NumPy using C code. New features include: OLS, Ridge Regression, Kernel Ridge Regression, LASSO, LARS, Gradient descent for Regression, and K-Means. [GPL]
Fuzzy machine learning framework is a library of Ada packages and a GUI front-end based on graph-schemes, intuitionistic fuzzy sets and the possibility theory. Sources can be used on any platform where an Ada 2005 compiler is available. [GPL]
Takes a database of cases described by a combination of real and discrete valued attributes, and automatically finds the natural classes in that data. It can be seen as a Naive Bayes classifier where the class node is hidden. [Free]
SMILE (Structural Modeling, Inference, and Learning Engine) is a fully portable library of C++ classes implementing graphical decision-theoretic methods, such as Bayesian networks and influence diagrams, directly amenable to inclusion in intelligent systems. Its Windows user interface, GeNIe, is a versatile and user-friendly development environment for graphical decision-theoretic models. Both modules were developed at the Decision Systems Laboratory, University of Pittsburgh. Registration is required for download. [GPL]
Software toolkit for building and using motif-based hidden Markov models of DNA and proteins. There is an online interactive version. Source written in C. [GPL]
Small portable online handwriting recognition system based on Support Vector Machines. Provides a relatively small machine model running at a recognition speed of 50-100 characters per second. [BSD]
Several algorithms with papers on fast kernel density estimation, the Improved Fast Gauss Transform, and fast ranking. Some unpublished papers are also included. [GPL]
Supports several inference algorithms and learning algorithms. Allows simulation of static and dynamic networks, including HMMs, IOHMMs, and Kalman filters. [GPL]
A generic framework for the evolutionary search algorithm mPOEMS. It was designed to solve optimisation problems with an unrestricted number of objectives. This site provides all sources and some exemplary implementations, e.g. of the NP-hard knapsack problem. [AFL]
An open source Python library that implements a range of machine learning, preprocessing, cross-validation and visualization algorithms using a unified interface. Provides documentation and source code.