Example usage for org.apache.mahout.clustering.fuzzykmeans.FuzzyKMeansDriver.run

Introduction

This page collects usage examples for org.apache.mahout.clustering.fuzzykmeans.FuzzyKMeansDriver.run.

Prototype

public static void run(Path input, Path clustersIn, Path output, double convergenceDelta, int maxIterations,
        float m, boolean runClustering, boolean emitMostLikely, double threshold, boolean runSequential)
        throws IOException, ClassNotFoundException, InterruptedException 

Document

Iterate over the input vectors to produce clusters and, if requested, use the results of the final iteration to cluster the input vectors.
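To make the roles of convergenceDelta, maxIterations and m more concrete, here is a minimal plain-Java sketch of the textbook fuzzy k-means loop on one-dimensional points. It is not Mahout code: the class FuzzyKMeansSketch and its methods are hypothetical names, and the stopping rule (largest center shift at or below convergenceDelta) is an assumption about how the driver interprets that parameter.

```java
import java.util.Arrays;

/** Illustrative 1-D fuzzy k-means. Hypothetical helper, not part of Mahout. */
public class FuzzyKMeansSketch {

    /** Membership of one point in each cluster: u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1)). */
    public static double[] memberships(double point, double[] centers, double m) {
        double exponent = 2.0 / (m - 1.0);
        double[] u = new double[centers.length];
        for (int i = 0; i < centers.length; i++) {
            double di = Math.abs(point - centers[i]);
            if (di == 0.0) {
                // The point lies exactly on a center: give it full membership there.
                Arrays.fill(u, 0.0);
                u[i] = 1.0;
                return u;
            }
            double sum = 0.0;
            for (double c : centers) {
                sum += Math.pow(di / Math.abs(point - c), exponent);
            }
            u[i] = 1.0 / sum;
        }
        return u;
    }

    /** Repeats the center update until no center moves more than convergenceDelta, or maxIterations is hit. */
    public static double[] cluster(double[] points, double[] initialCenters, double m,
            double convergenceDelta, int maxIterations) {
        double[] centers = initialCenters.clone();
        for (int iter = 0; iter < maxIterations; iter++) {
            double[] weightedSum = new double[centers.length];
            double[] weightTotal = new double[centers.length];
            for (double p : points) {
                double[] u = memberships(p, centers, m);
                for (int i = 0; i < centers.length; i++) {
                    double w = Math.pow(u[i], m); // memberships are raised to the fuzziness m
                    weightedSum[i] += w * p;
                    weightTotal[i] += w;
                }
            }
            double maxShift = 0.0;
            for (int i = 0; i < centers.length; i++) {
                double updated = weightTotal[i] > 0 ? weightedSum[i] / weightTotal[i] : centers[i];
                maxShift = Math.max(maxShift, Math.abs(updated - centers[i]));
                centers[i] = updated;
            }
            if (maxShift <= convergenceDelta) {
                break; // assumed convergence test
            }
        }
        return centers;
    }

    public static void main(String[] args) {
        double[] points = {1.0, 1.2, 0.8, 9.0, 9.2, 8.8};
        double[] centers = cluster(points, new double[] {0.0, 10.0}, 2.0, 0.001, 50);
        System.out.println(Arrays.toString(centers));
    }
}
```

The fuzziness m must be greater than 1. Values close to 1 make the memberships nearly hard assignments, while larger values spread each point's weight more evenly across clusters.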

Usage

From source file:DisplayFuzzyKMeans.java

License:Apache License

private static void runSequentialFuzzyKClusterer(Configuration conf, Path samples, Path output,
        DistanceMeasure measure, int maxIterations, float m, double threshold)
        throws IOException, ClassNotFoundException, InterruptedException {
    Path clustersIn = new Path(output, "random-seeds");
    RandomSeedGenerator.buildRandom(conf, samples, clustersIn, 3, measure);
    FuzzyKMeansDriver.run(samples, clustersIn, output, threshold, maxIterations, m, true, true, threshold,
            true);

    loadClustersWritable(output);
}

From source file:org.conan.mymahout.clustering.syntheticcontrol.fuzzykmeans.Job.java

License:Apache License

/**
 * Run the fuzzy k-means clustering job on an input dataset using the given distance measure, t1, t2 and iteration
 * parameters. All output data will be written to the output directory, which will be initially deleted if it exists.
 * The clustered points will reside in the path <output>/clusteredPoints. By default, the job expects a file
 * containing synthetic_control.data, as obtained from
 * http://archive.ics.uci.edu/ml/datasets/Synthetic+Control+Chart+Time+Series, to reside in a directory named
 * "testdata", and writes output to a directory named "output".
 *
 * @param conf
 *          the Hadoop Configuration to use
 * @param input
 *          the Path of the input directory
 * @param output
 *          the Path of the output directory
 * @param measure
 *          the DistanceMeasure used by the canopy step
 * @param t1
 *          the canopy T1 threshold
 * @param t2
 *          the canopy T2 threshold
 * @param maxIterations
 *          the int maximum number of iterations
 * @param fuzziness
 *          the float "m" fuzziness coefficient
 * @param convergenceDelta
 *          the double convergence criteria for iterations
 */
public static void run(Configuration conf, Path input, Path output, DistanceMeasure measure, double t1,
        double t2, int maxIterations, float fuzziness, double convergenceDelta) throws Exception {
    Path directoryContainingConvertedInput = new Path(output, DIRECTORY_CONTAINING_CONVERTED_INPUT);
    log.info("Preparing Input");
    InputDriver.runJob(input, directoryContainingConvertedInput,
            "org.apache.mahout.math.RandomAccessSparseVector");
    log.info("Running Canopy to get initial clusters");
    Path canopyOutput = new Path(output, "canopies");
    // Pass the caller's configuration through instead of creating a fresh one.
    CanopyDriver.run(conf, directoryContainingConvertedInput, canopyOutput, measure, t1, t2,
            false, 0.0, false);
    log.info("Running FuzzyKMeans");
    FuzzyKMeansDriver.run(directoryContainingConvertedInput, new Path(canopyOutput, "clusters-0-final"), output,
            convergenceDelta, maxIterations, fuzziness, true, true, 0.0, false);
    // run ClusterDumper
    ClusterDumper clusterDumper = new ClusterDumper(new Path(output, "clusters-*-final"),
            new Path(output, "clusteredPoints"));
    clusterDumper.printClusters(null);
}
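
Both calls above pass true for emitMostLikely, so the threshold argument should have no effect on which cluster each point is written to. The following plain-Java sketch illustrates that rule as I understand it from the Mahout Javadoc: with emitMostLikely set, only the highest-membership cluster is emitted, otherwise every cluster whose membership exceeds threshold is. The class EmitSketch and the method clustersToEmit are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative emit rule for fuzzy cluster memberships. Hypothetical helper, not part of Mahout. */
public class EmitSketch {

    /** Returns the indices of the clusters a point would be emitted to. */
    public static List<Integer> clustersToEmit(double[] memberships, boolean emitMostLikely,
            double threshold) {
        List<Integer> emitted = new ArrayList<>();
        if (emitMostLikely) {
            // Keep only the single cluster with the highest membership.
            int best = 0;
            for (int i = 1; i < memberships.length; i++) {
                if (memberships[i] > memberships[best]) {
                    best = i;
                }
            }
            emitted.add(best);
        } else {
            // Keep every cluster whose membership is strictly above the threshold.
            for (int i = 0; i < memberships.length; i++) {
                if (memberships[i] > threshold) {
                    emitted.add(i);
                }
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        double[] memberships = {0.7, 0.25, 0.05};
        System.out.println(clustersToEmit(memberships, true, 0.2));
        System.out.println(clustersToEmit(memberships, false, 0.2));
    }
}
```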