probstructs

ProbStructs collection of probabilistic data structures.

C++: https://probstructs.readthedocs.io/en/stable/

Classes

CountMinSketch Count–min sketch (CM sketch) is a probabilistic data structure that serves as a frequency table of events in a stream of data.
ExponentialCountMinSketch Exponential count-min sketch (ECM-Sketch) combines CM-Sketch with EH to count number of different elements in the last N elements in the stream.
ExponentialHistorgram Exponential histogram (EH) is a probabilistic data structure that serves as a frequency counter for specific elements in the last N elements from stream.
Hash Hashing function - MurMurHash3
class probstructs.CountMinSketch

Count–min sketch (CM sketch) is a probabilistic data structure that serves as a frequency table of events in a stream of data.

It uses hash functions to map events to frequencies, but unlike a hash table uses only sub-linear space, at the expense of overcounting some events due to collisions.

C++: https://probstructs.readthedocs.io/en/stable/classes.html#clscountminsketch

__init__(self: probstructs.CountMinSketch, width: int, depth: int) → None

Create CM sketch with width {width} and depth {depth}.

get(self: probstructs.CountMinSketch, key: str) → int

Get count for {key}.

inc(self: probstructs.CountMinSketch, key: str, delta: int) → None

Increase counter for {key} by {delta}.

class probstructs.ExponentialCountMinSketch

Exponential count-min sketch (ECM-Sketch) combines CM-Sketch with EH to count number of different elements in the last N elements in the stream.

C++: https://probstructs.readthedocs.io/en/stable/classes.html#exponentialcountminsketch

__init__(self: probstructs.ExponentialCountMinSketch, width: int, depth: int, window: int) → None

Create ECM-Sketch with width {width}, depth {depth} to count elmenets in the last {window} elements.

get(self: probstructs.ExponentialCountMinSketch, key: str, window: int, tick: int) → int

Get counter for {key}for last {window} elements when on the position {tick} in the stream.

inc(self: probstructs.ExponentialCountMinSketch, key: str, tick: int, delta: int) → None

Increase counter for {key} by {delta} when on the position {tick} in the stream.

class probstructs.ExponentialHistorgram

Exponential histogram (EH) is a probabilistic data structure that serves as a frequency counter for specific elements in the last N elements from stream.

C++: https://probstructs.readthedocs.io/en/stable/classes.html#exponentialhistorgram

__init__(self: probstructs.ExponentialHistorgram, window: int) → None

Create exponential histogram for last {window} elements.

get(self: probstructs.ExponentialHistorgram, window: int, tick: int) → int

Get the counter for last {window} elements when on the position {tick} in the stream.

inc(self: probstructs.ExponentialHistorgram, tick: int, delta: int) → None

Increase counter by {delta} when on the position {tick} in the stream.

class probstructs.Hash

Hashing function - MurMurHash3

C++: https://probstructs.readthedocs.io/en/stable/classes.html#hash

__init__(self: probstructs.Hash, seed: int) → None

Create hashing function with {seed}.

hash(self: probstructs.Hash, key: str) → int

Hash {key}.