-
WEKA
WEKA is an open-source data mining package containing a full collection of machine learning algorithms for solving various
data mining problems. It is written in Java and runs on almost any
platform. The algorithms can either be applied directly to a dataset or
called from your own Java code. It includes implementations of schemes
for classification, association rule discovery, clustering, and prediction,
among others. The full distribution of WEKA, along with additional information and supporting material, can be found at the official
WEKA Web site. The site also provides
additional data sets (from the UCI data repository) already converted into ARFF,
the file format used by WEKA.
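For readers who have not seen ARFF before, the sketch below shows the general shape of such a file: a relation name, a list of attribute declarations, and comma-separated data rows. It is a minimal illustration in Python, not part of WEKA itself. The function name `to_arff` and the tiny weather-style data set are assumptions made for this example, and the sketch does not handle quoting of names or values that contain spaces or commas.

```python
def to_arff(relation, attributes, rows):
    """Render a small data set as ARFF text.

    attributes: list of (name, kind) pairs, where kind is either the string
    "numeric" or a list of allowed nominal values.
    rows: list of tuples, one value per attribute.
    Hypothetical helper for illustration only; no quoting or escaping is done.
    """
    lines = ["@relation " + relation, ""]
    for name, kind in attributes:
        if isinstance(kind, (list, tuple)):
            # Nominal attribute: the allowed values are listed in braces.
            lines.append("@attribute " + name + " {" + ",".join(kind) + "}")
        else:
            # Numeric (or other simple) attribute type.
            lines.append("@attribute " + name + " " + kind)
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Illustrative data, not taken from the WEKA distribution.
example = to_arff(
    "weather",
    [
        ("outlook", ["sunny", "overcast", "rainy"]),
        ("temperature", "numeric"),
        ("play", ["yes", "no"]),
    ],
    [("sunny", 85, "no"), ("overcast", 83, "yes")],
)
print(example)
```

A file written this way can be opened in WEKA like any other ARFF data set, provided the attribute names and values need no quoting.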
-
Clustering and Profile Generation Tools
This is a set of programs developed here for clustering and for generating
profiles based on the results of clustering. The set also includes some programs
to assist in characterizing the generated clusters. The documentation for each
program and some example data sets are included in the distribution. All of these
programs and the documentation are packaged in a single Zip archive.
-
Magnum Opus
Magnum Opus is a tool for finding association rules from data. It uses
a highly efficient search algorithm for fast association rule
discovery and does not rely on sparse data for efficient processing.
More information on Magnum Opus, as well as an evaluation version for
download, is available from
G.I. Webb & Associates.
-
See5/C5.0
See5 is the commercial version of the C4.5 decision tree algorithm
developed by Ross Quinlan. See5/C5.0 classifiers are expressed as
decision trees or sets of if-then rules. RuleQuest provides C source
code so that classifiers constructed by See5/C5.0 can be embedded in
your own systems. More information on See5/C5.0, as well as an evaluation version
for download, is available at the RuleQuest site.
-
Cubist
Cubist is another program from RuleQuest. It builds rule-based predictive
models that output numeric values, complementing See5/C5.0, which predicts
categories. For instance, See5/C5.0 might classify the yield from some
process as "high", "medium", or "low", whereas Cubist would output a
number such as 73%. Information on
Cubist and an evaluation version for download are available at the RuleQuest site.
-
CBA
CBA is a data mining tool for the discovery of association rules and
for classification. The classification technique used in CBA is based
on a subset of the discovered association rules. CBA implements two
versions of the Apriori algorithm (one using a single minimum support
parameter, and another using multiple minimum supports at different
levels). It also includes features for visualizing association rules
using a tree structure.
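To make the idea of Apriori with a single minimum support parameter concrete, here is a small textbook-style sketch in Python. It is not CBA's implementation; the function name `apriori` and the toy shopping transactions are assumptions made for this example. It finds frequent itemsets level by level: itemsets of size k are built only from frequent itemsets of size k-1, and a candidate is discarded if any of its subsets is not frequent.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset whose support is at
    least min_support (a fraction between 0 and 1).

    Generic sketch of the level-wise Apriori search, not CBA's code.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    if n == 0:
        return {}

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1 candidates: every individual item.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that meet the minimum support.
        level = {}
        for candidate in current:
            s = support(candidate)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)

        k += 1
        # Join step: merge frequent itemsets into candidates of size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in joined
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent


# Toy transactions, invented for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
for itemset, s in sorted(apriori(baskets, 0.5).items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

Association rules are then derived from the frequent itemsets in a second pass. The multiple-minimum-support variant mentioned above differs in that different items can be given different thresholds, which this sketch does not attempt.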