Usage examples for the org.apache.mahout.math.random.Sampler interface
From source file com.mapr.synth.distributions.ChineseRestaurant.java
/**
* Generates samples from a generalized Chinese restaurant process (or Pitman-Yor process).
* <p/>
* The fraction of distinct values drawn exactly once will asymptotically be equal to the discount parameter
* as the total number of draws T increases without bound. The number of unique values sampled will
* increase as O(alpha * log T) if discount = 0 or O(alpha * T^discount) for discount > 0.
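The seating rule behind a generalized Chinese restaurant process can be sketched in a few lines. The block below is a minimal illustration written for this listing, not the code from ChineseRestaurant.java: the class name `PitmanYorSketch`, its accessors, and the seed argument are hypothetical. It assumes the standard Pitman-Yor rule, where an existing value i with count c_i is drawn with weight (c_i - discount) and a new value is drawn with weight (alpha + discount * k), k being the number of distinct values seen so far.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of a Pitman-Yor (generalized Chinese restaurant) sampler.
final class PitmanYorSketch {
    private final double alpha;     // strength parameter, must exceed -discount
    private final double discount;  // discount parameter in [0, 1)
    private final List<Integer> counts = new ArrayList<>(); // draws per distinct value
    private int total = 0;          // total number of draws so far
    private final Random rand;

    PitmanYorSketch(double alpha, double discount, long seed) {
        if (discount < 0 || discount >= 1 || alpha <= -discount) {
            throw new IllegalArgumentException("need 0 <= discount < 1 and alpha > -discount");
        }
        this.alpha = alpha;
        this.discount = discount;
        this.rand = new Random(seed);
    }

    // Returns the index of the sampled value; indices are assigned in order of first appearance.
    int sample() {
        // Total weight is (total - discount * k) + (alpha + discount * k) = alpha + total.
        double u = rand.nextDouble() * (alpha + total);
        for (int i = 0; i < counts.size(); i++) {
            u -= counts.get(i) - discount;   // weight of existing value i
            if (u < 0) {
                counts.set(i, counts.get(i) + 1);
                total++;
                return i;
            }
        }
        // Remaining mass (alpha + discount * k) opens a new value.
        counts.add(1);
        total++;
        return counts.size() - 1;
    }

    int uniqueCount() { return counts.size(); }

    int totalDraws() { return total; }

    public static void main(String[] args) {
        PitmanYorSketch crp = new PitmanYorSketch(1.0, 0.5, 42L);
        for (int i = 0; i < 10_000; i++) {
            crp.sample();
        }
        System.out.println("distinct values after 10000 draws: " + crp.uniqueCount());
    }
}
```

Raising the discount in `main` and rerunning gives a rough feel for how the number of distinct values grows faster than logarithmically once discount > 0.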
From source file com.mapr.synth.distributions.LongTail.java
/**
 * Samples from a set of things based on a long-tailed distribution. This converts
 * the ChineseRestaurant distribution from a distribution over integers into a distribution
 * over more plausible looking things like words.
 */
public abstract class LongTail<T> implements Sampler<T> {
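One way to turn a distribution over integers into a distribution over arbitrary things is to create a thing lazily the first time each index appears and reuse it afterwards. The sketch below illustrates that idea only; it is not the body of LongTail.java. The class name `LongTailSketch`, the `createThing()` hook, and the use of `java.util.function.IntSupplier` as a stand-in for the integer sampler are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

// Hypothetical sketch: maps integer samples onto lazily created things.
abstract class LongTailSketch<T> {
    private final IntSupplier base;                  // stand-in for an integer sampler
    private final List<T> things = new ArrayList<>(); // thing for each index seen so far

    LongTailSketch(IntSupplier base) {
        this.base = base;
    }

    // Subclasses decide what a "thing" is (a word, an address, ...).
    protected abstract T createThing();

    public T sample() {
        int n = base.getAsInt();
        // Grow the table until index n has a thing; earlier indices keep theirs.
        while (n >= things.size()) {
            things.add(createThing());
        }
        return things.get(n);
    }

    public int size() {
        return things.size();
    }
}
```

Because the same index always maps to the same thing, frequent indices from a long-tailed integer sampler become frequent words or addresses, and rare indices become rare ones.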
From source file com.mapr.synth.distributions.TermGenerator.java
/**
* Generate words at random from a specialized vocabulary. Every term generator's
* frequency distribution has a common basis, but each will diverge after initialization.
*
* Thread safe for sampling
*/
From source file com.mapr.synth.LogGenerator.java
/**
 * Generates kind of realistic log lines consisting of a user id (a cookie), an IP address and a query.
 */
public class LogGenerator implements Sampler<LogLine> {
    private PriorityQueue<LogLine> eventBuffer = Queues.newPriorityQueue();
    private PriorityQueue<User> users = Queues.newPriorityQueue();
From source file com.mapr.synth.samplers.FieldSampler.java
@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, include = JsonTypeInfo.As.PROPERTY, property = "class")
@JsonSubTypes({
        @JsonSubTypes.Type(value = AddressSampler.class, name = "address"),
        @JsonSubTypes.Type(value = DateSampler.class, name = "date"),
        @JsonSubTypes.Type(value = ArrivalSampler.class, name = "event"),
        @JsonSubTypes.Type(value = ForeignKeySampler.class, name = "foreign-key"),
        @JsonSubTypes.Type(value = IdSampler.class, name = "id"),
From source file com.mapr.synth.samplers.SchemaSampler.java
/**
 * Samples from a specified schema to generate reasonably interesting data.
 */
public class SchemaSampler implements Sampler<JsonNode> {
    private final JsonNodeFactory nodeFactory = JsonNodeFactory.withExactBigDecimals(false);
From source file org.apache.drill.synth.ChineseRestaurant.java
/**
*
* Generates samples from a generalized Chinese restaurant process (or Pitman-Yor process).
*
* The fraction of distinct values drawn exactly once will asymptotically be equal to the discount parameter
* as the total number of draws T increases without bound. The number of unique values sampled will
From source file org.apache.drill.synth.LogGenerator.java
/**
 * Generates kind of realistic log lines consisting of a user id (a cookie), an IP address and a query.
 */
public class LogGenerator implements Sampler<LogLine> {
    private LongTail<InetAddress> ipGenerator = new LongTail<InetAddress>(1, 0.5) {
        Random gen = new Random();
From source file org.apache.drill.synth.LongTail.java
/**
 * Samples from a set of things based on a long-tailed distribution. This converts
 * the ChineseRestaurant distribution from a distribution over integers into a distribution
 * over more plausible looking things like words.
 */
public abstract class LongTail<T> implements Sampler<T> {
From source file org.apache.drill.synth.sampler.ChineseRestaurant.java
/**
* Generates samples from a generalized Chinese restaurant process (or Pitman-Yor process).
* <p/>
* The fraction of distinct values drawn exactly once will asymptotically be equal to the discount parameter
* as the total number of draws T increases without bound. The number of unique values sampled will
* increase as O(alpha * log T) if discount = 0 or O(alpha * T^discount) for discount > 0.