TDigest

Companion object TDigest

case class TDigest(delta: Double, maxDiscrete: Int, nclusters: Int, clusters: TDigestMap) extends Serializable with Product

A t-digest sketch of sampled numeric data, as described in: Computing Extremely Accurate Quantiles Using t-Digests, Ted Dunning and Otmar Ertl, https://github.com/tdunning/t-digest/blob/master/docs/t-digest-paper/histo.pdf

import org.isarnproject.sketches.TDigest
val data = Vector.fill(10000) { scala.util.Random.nextGaussian() }
// sketch of some Gaussian data
val sketch = TDigest.sketch(data)
// the cumulative distribution function of the sketch; cdf(x) at x = 0
val cdf = sketch.cdf(0.0)
// inverse of the CDF, evaluated at q = 0.5
val cdfi = sketch.cdfInverse(0.5)

Linear Supertypes

Product, Equals, Serializable, Serializable, AnyRef, Any

Instance Constructors

new TDigest(delta: Double, maxDiscrete: Int, nclusters: Int, clusters: TDigestMap)

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
def +[N1, N2](xw: (N1, N2))(implicit num1: Numeric[N1], num2: Numeric[N2]): TDigest
Returns a new t-digest with the pair (x, w) included in its sketch.
xw
A pair (x, w) where x is the numeric value and w is its weight
returns
the updated sketch

Note
This implements 'algorithm 1' from: Computing Extremely Accurate Quantiles Using t-Digests, Ted Dunning and Otmar Ertl, https://github.com/tdunning/t-digest/blob/master/docs/t-digest-paper/histo.pdf
def +[N](x: N)(implicit num: Numeric[N]): TDigest
Returns a new t-digest with value x included in its sketch; td + x is equivalent to td + (x, 1).
x
The numeric data value to include in the sketch
returns
the updated sketch
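
A minimal sketch of the two `+` overloads described above. It assumes the isarn-sketches library is on the classpath and starts from `TDigest.sketch`, as in the example at the top of this page; the data values are illustrative. The weighted pair is wrapped in an extra set of parentheses so that the compiler sees a single tuple argument rather than two separate arguments.

```scala
import org.isarnproject.sketches.TDigest

// Start from a small sketch (illustrative data).
val base = TDigest.sketch(Vector(1.0, 2.0, 3.0))

// Add a single value; per the docs this is equivalent to adding the pair (4.0, 1).
val withValue = base + 4.0

// Add a value with weight 2.0; the explicit tuple parentheses avoid ambiguity.
val withWeighted = withValue + ((5.0, 2.0))

// Each `+` returns a new TDigest, so `base` and `withValue` remain usable.
val low = withWeighted.cdf(0.0)
val high = withWeighted.cdf(10.0)
```

Because every update returns a new sketch, the usual pattern for a stream of values is a fold that threads the sketch through each step.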
def ++(that: TDigest): TDigest
Combines this digest with another digest.
that
The right-hand t-digest operand
returns
the result of combining left and right digests
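
One plausible use of `++` is to sketch partitions of a data set separately and then combine the results. The following is a minimal sketch under that assumption; the two Gaussian samples and the seed are illustrative, and only calls documented on this page are used.

```scala
import org.isarnproject.sketches.TDigest

val rng = new scala.util.Random(42)

// Two partitions sketched independently (illustrative data).
val left = TDigest.sketch(Vector.fill(5000) { rng.nextGaussian() })
val right = TDigest.sketch(Vector.fill(5000) { rng.nextGaussian() + 5.0 })

// Combine the two sketches into one that summarizes both partitions.
val merged = left ++ right

// Query the combined sketch like any other TDigest.
val cdfAtMidpoint = merged.cdf(2.5)
val medianEstimate = merged.cdfInverse(0.5)
```

Since the two samples have equal size and are centered at 0 and 5, the estimated CDF at 2.5 should land near 0.5, although the exact value depends on the sketch.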
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def cdf[N](x: N)(implicit num: Numeric[N]): Double
Compute a cumulative probability (CDF) for a numeric value, from the estimated probability distribution represented by this t-digest sketch.
x
a numeric value
returns
the cumulative probability that a random sample from the distribution is <= x
def cdfDiscrete[N](x: N)(implicit num: Numeric[N]): Double
Compute a cumulative probability (CDF) for a numeric value, from the estimated probability distribution represented by this t-digest sketch, assuming the sketch is "discrete" (e.g. if the number of clusters <= the maxDiscrete setting).
x
a numeric value
returns
the cumulative probability that a random sample from the distribution is <= x
def cdfDiscreteInverse[N](q: N)(implicit num: Numeric[N]): Double
Compute the inverse cumulative probability (inverse-CDF) for a quantile value, from the estimated probability distribution represented by this t-digest sketch, assuming the sketch is "discrete" (e.g. if the number of clusters <= the maxDiscrete setting).
q
a quantile value. The value of q is expected to be on interval [0, 1]
returns
the smallest value x such that q <= cdf(x)
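
To make the "smallest value x such that q <= cdf(x)" contract concrete, here is a self-contained illustration over a handful of weighted points. It does not call the library and is not its implementation; `points`, `discreteCdf`, and `discreteCdfInverse` are hypothetical names introduced only for this example.

```scala
// (value, weight) pairs sorted by value; hypothetical example data.
val points = Vector((1.0, 2.0), (2.0, 1.0), (4.0, 1.0))
val total = points.map(_._2).sum

// Cumulative probability at each point: 0.5, 0.75, 1.0 for the data above.
val cumulative = points.map(_._2).scanLeft(0.0)(_ + _).tail.map(_ / total)

// Discrete CDF: total probability mass at values <= x.
def discreteCdf(x: Double): Double =
  points.zip(cumulative).takeWhile(_._1._1 <= x).lastOption.map(_._2).getOrElse(0.0)

// Discrete inverse CDF: smallest point x such that q <= cdf(x).
def discreteCdfInverse(q: Double): Double = {
  require(q >= 0.0 && q <= 1.0, "q is expected to be on the interval [0, 1]")
  points.zip(cumulative)
    .collectFirst { case ((x, _), c) if q <= c => x }
    .getOrElse(points.last._1)
}
```

With this data, `discreteCdfInverse(0.5)` is 1.0 because the first point already carries half of the total weight, while `discreteCdfInverse(0.6)` moves on to 2.0.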
def cdfInverse[N](q: N)(implicit num: Numeric[N]): Double
Compute the inverse cumulative probability (inverse-CDF) for a quantile value, from the estimated probability distribution represented by this t-digest sketch.
q
a quantile value. The value of q is expected to be on interval [0, 1]
returns
the value x such that cdf(x) = q
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@native() @throws( ... )
val clusters: TDigestMap
val delta: Double
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
Annotations
@native()
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
val maxDiscrete: Int
val nclusters: Int
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
Annotations
@native()
final def notifyAll(): Unit

Definition Classes
AnyRef
Annotations
@native()
def sample: Double
Perform a random sampling from the distribution as sketched by this t-digest, using "discrete" (PMF) mode if the number of clusters <= maxDiscrete setting, and "density" (PDF) mode otherwise.
returns
A random number sampled from the sketched distribution

Note
uses the inverse transform sampling method
def samplePDF: Double
Perform a random sampling from the distribution as sketched by this t-digest, in "probability density" mode.
returns
A random number sampled from the sketched distribution

Note
uses the inverse transform sampling method
def samplePMF: Double
Perform a random sampling from the distribution as sketched by this t-digest, in "probability mass" (i.e. discrete) mode.
returns
A random number sampled from the sketched distribution

Note
uses the inverse transform sampling method
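
The notes above mention inverse transform sampling: draw u uniformly from [0, 1) and map it through the inverse CDF. The snippet below is a hedged illustration of that idea using `cdfInverse` by hand, alongside the documented `samplePDF` member. It assumes the isarn-sketches library is on the classpath; the helper `sampleByInverseTransform` is a hypothetical name, and this page does not state that the library's samplers are implemented exactly this way.

```scala
import org.isarnproject.sketches.TDigest

val rng = new scala.util.Random(7)

// Sketch of some Gaussian data (illustrative).
val sketch = TDigest.sketch(Vector.fill(10000) { rng.nextGaussian() })

// Inverse transform sampling by hand: u ~ Uniform[0, 1), then x = cdfInverse(u).
def sampleByInverseTransform(): Double = sketch.cdfInverse(rng.nextDouble())

val manualDraws = Vector.fill(5000) { sampleByInverseTransform() }
val manualMean = manualDraws.sum / manualDraws.size

// The documented sampler in "probability density" mode.
val libraryDraws = Vector.fill(5000) { sketch.samplePDF }
val libraryMean = libraryDraws.sum / libraryDraws.size
```

Both sample means should be close to 0, the mean of the sketched distribution, up to sampling noise.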
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@native() @throws( ... )
