case class TDigest(delta: Double, maxDiscrete: Int, nclusters: Int, clusters: TDigestMap) extends Serializable with Product

A t-digest sketch of sampled numeric data, as described in: Computing Extremely Accurate Quantiles Using t-Digests, Ted Dunning and Otmar Ertl, https://github.com/tdunning/t-digest/blob/master/docs/t-digest-paper/histo.pdf

import org.isarnproject.sketches.TDigest
val data = Vector.fill(10000) { scala.util.Random.nextGaussian() }
// sketch of some Gaussian data
val sketch = TDigest.sketch(data)
// the cumulative distribution function of the sketch; cdf(x) at x = 0
val cdf = sketch.cdf(0.0)
// inverse of the CDF, evaluated at q = 0.5
val cdfi = sketch.cdfInverse(0.5)
Linear Supertypes
Product, Equals, Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new TDigest(delta: Double, maxDiscrete: Int, nclusters: Int, clusters: TDigestMap)

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. def +[N1, N2](xw: (N1, N2))(implicit num1: Numeric[N1], num2: Numeric[N2]): TDigest

    Returns a new t-digest with new pair (x, w) included in its sketch.

    xw

    A pair (x, w) where x is the numeric value and w is its weight

    returns

    the updated sketch

    Note

    This implements 'algorithm 1' from: Computing Extremely Accurate Quantiles Using t-Digests, Ted Dunning and Otmar Ertl, https://github.com/tdunning/t-digest/blob/master/docs/t-digest-paper/histo.pdf

  4. def +[N](x: N)(implicit num: Numeric[N]): TDigest

    Returns a new t-digest with value x included in its sketch; td + x is equivalent to td + (x, 1).

    x

    The numeric data value to include in the sketch

    returns

    the updated sketch
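
The two `+` overloads above can be chained or folded to build up a sketch incrementally, since each call returns a new t-digest. The following is a minimal sketch, assuming the isarn-sketches library is on the classpath and a script or REPL context; it uses only `TDigest.sketch`, `+`, and `cdfInverse` as documented on this page, and the data values are made up for illustration.

```scala
import org.isarnproject.sketches.TDigest

// Start from a sketch of some initial data (illustrative values).
val base = TDigest.sketch(Vector(1.0, 2.0, 3.0, 4.0, 5.0))

// Add a single value; per the docs this is equivalent to weight 1.
val withOne = base + 6.0

// Add a weighted pair (x, w). Binding the tuple to a val first makes it
// explicit that the (N1, N2) overload is the one being called.
val weighted = (7.0, 2.5)
val withPair = withOne + weighted

// Each + returns a new t-digest, so a fold threads the updated sketch
// through every element of a collection.
val updated = Vector(8.0, 9.0, 10.0).foldLeft(withPair)((td, x) => td + x)

// Estimated median of everything added so far.
val median = updated.cdfInverse(0.5)
```

Binding the pair to a name is a stylistic choice here, not a requirement of the API; it simply keeps the call site unambiguous for readers.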

  5. def ++(that: TDigest): TDigest

    Combines this t-digest with another and returns the merged sketch.

    that

    The right-hand t-digest operand

    returns

    the result of combining left and right digests
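
One way `++` might be used is to sketch separate partitions of a data set independently and combine the results afterwards. The example below is a hedged illustration under that assumption: the two partitions, the seed, and the shift of 10.0 are invented for the demonstration, and only `TDigest.sketch`, `++`, and `cdf` from this page are called.

```scala
import org.isarnproject.sketches.TDigest

// Two illustrative partitions: Gaussian data centred near 0 and near 10.
val rng = new scala.util.Random(42)
val partA = Vector.fill(5000)(rng.nextGaussian())
val partB = Vector.fill(5000)(rng.nextGaussian() + 10.0)

// Sketch each partition on its own.
val sketchA = TDigest.sketch(partA)
val sketchB = TDigest.sketch(partB)

// Combine the two sketches into one.
val merged = sketchA ++ sketchB

// Both partitions have equal size, so about half of the combined mass
// is expected to lie below the midpoint between the two centres.
val below = merged.cdf(5.0)
```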

  6. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  7. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  8. def cdf[N](x: N)(implicit num: Numeric[N]): Double

    Compute a cumulative probability (CDF) for a numeric value, from the estimated probability distribution represented by this t-digest sketch.

    x

    a numeric value

    returns

    the cumulative probability that a random sample from the distribution is <= x

  9. def cdfDiscrete[N](x: N)(implicit num: Numeric[N]): Double

    Compute a cumulative probability (CDF) for a numeric value, from the estimated probability distribution represented by this t-digest sketch, assuming the sketch is "discrete" (e.g. if the number of clusters <= maxDiscrete setting).

    x

    a numeric value

    returns

    the cumulative probability that a random sample from the distribution is <= x

  10. def cdfDiscreteInverse[N](q: N)(implicit num: Numeric[N]): Double

    Compute the inverse cumulative probability (inverse-CDF) for a quantile value, from the estimated probability distribution represented by this t-digest sketch, assuming the sketch is "discrete" (e.g. if the number of clusters <= maxDiscrete setting).

    q

    a quantile value, expected to lie in the interval [0, 1]

    returns

    the smallest value x such that q <= cdf(x)
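
To make the definitions of the discrete CDF and its inverse concrete, the following plain-Scala sketch evaluates both over a small table of weighted values. It does not use the library and is not its implementation: `weights`, `stepCdf`, and `stepCdfInverse` are hypothetical names introduced only to illustrate "the smallest value x such that q <= cdf(x)".

```scala
// Hypothetical stand-in for a discrete sketch: distinct values with
// their weights, sorted by value in ascending order.
val weights: Vector[(Double, Double)] =
  Vector(1.0 -> 2.0, 2.0 -> 1.0, 5.0 -> 1.0)
val total: Double = weights.map(_._2).sum

// Discrete CDF: fraction of the total weight at values <= x.
def stepCdf(x: Double): Double =
  weights.collect { case (v, w) if v <= x => w }.sum / total

// Discrete inverse CDF: the smallest value v such that q <= stepCdf(v).
def stepCdfInverse(q: Double): Double = {
  require(q >= 0.0 && q <= 1.0, "q must lie in [0, 1]")
  weights.map(_._1).find(v => q <= stepCdf(v)).getOrElse(weights.last._1)
}

val atOne = stepCdf(1.0)          // 2 of 4 units of weight are <= 1.0
val q60 = stepCdfInverse(0.6)     // first value whose CDF reaches 0.6
```

Because the CDF is a step function, many quantiles map to the same value; here every q in (0.5, 0.75] maps to 2.0.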

  11. def cdfInverse[N](q: N)(implicit num: Numeric[N]): Double

    Compute the inverse cumulative probability (inverse-CDF) for a quantile value, from the estimated probability distribution represented by this t-digest sketch.

    q

    a quantile value, expected to lie in the interval [0, 1]

    returns

    the value x such that cdf(x) = q
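
As an illustration of how `cdfInverse` relates to `cdf`, the example below estimates several quantiles of Gaussian data and then maps each estimate back through `cdf`. It assumes the isarn-sketches library is on the classpath and that the sketch is queried in its default (non-discrete) mode; the seed and the list of quantiles are arbitrary choices for the demonstration.

```scala
import org.isarnproject.sketches.TDigest

// Sketch of some Gaussian data, seeded so the run is repeatable.
val rng = new scala.util.Random(7)
val data = Vector.fill(10000)(rng.nextGaussian())
val sketch = TDigest.sketch(data)

// Estimate a handful of quantiles from the sketch.
val quantiles = Vector(0.05, 0.25, 0.5, 0.75, 0.95)
val estimates = quantiles.map(q => sketch.cdfInverse(q))

// Mapping each estimate back through cdf should land close to the
// quantile it was computed from.
val roundTrip = estimates.map(x => sketch.cdf(x))
```

The round trip is approximate rather than exact, since both functions work from the sketch rather than from the raw data.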

  12. def clone(): AnyRef
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @native() @throws( ... )
  13. val clusters: TDigestMap
  14. val delta: Double
  15. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  16. def finalize(): Unit
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  17. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  18. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  19. val maxDiscrete: Int
  20. val nclusters: Int
  21. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  22. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  23. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  24. def sample: Double

    Perform a random sampling from the distribution as sketched by this t-digest, using "discrete" (PMF) mode if the number of clusters <= maxDiscrete setting, and "density" (PDF) mode otherwise.

    returns

    A random number sampled from the sketched distribution

    Note

    uses the inverse transform sampling method

  25. def samplePDF: Double

    Perform a random sampling from the distribution as sketched by this t-digest, in "probability density" mode.

    returns

    A random number sampled from the sketched distribution

    Note

    uses the inverse transform sampling method

  26. def samplePMF: Double

    Perform a random sampling from the distribution as sketched by this t-digest, in "probability mass" (i.e. discrete) mode.

    returns

    A random number sampled from the sketched distribution

    Note

    uses the inverse transform sampling method
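
The inverse transform sampling method mentioned in the notes above can be sketched in a few lines of plain Scala: draw a uniform number u from [0, 1) and return the inverse CDF evaluated at u. This is a generic illustration, not the library's implementation; `inverseTransformSample`, `expCdfInverse`, and `rate` are hypothetical names, and an exponential distribution stands in for the sketched one because its inverse CDF has a closed form.

```scala
import scala.util.Random

// Inverse transform sampling: draw u uniformly from [0, 1) and map it
// through an inverse CDF to obtain a sample from that distribution.
def inverseTransformSample(cdfInverse: Double => Double, rng: Random): Double =
  cdfInverse(rng.nextDouble())

// Illustrative inverse CDF: exponential distribution with the given rate.
val rate = 2.0
def expCdfInverse(q: Double): Double = -math.log(1.0 - q) / rate

// Draw many samples with a fixed seed and check their mean.
val rng = new Random(123)
val samples = Vector.fill(20000)(inverseTransformSample(expCdfInverse, rng))
val mean = samples.sum / samples.size // expected to be near 1 / rate
```

Any function from a quantile in [0, 1] to a value can take the place of `expCdfInverse`; for a t-digest, `q => sketch.cdfInverse(q)` has the required shape.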

  27. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  28. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  29. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  30. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @throws( ... )
