lightonml.encoding

lightonml.encoding.base

Encoders

This module contains implementations of Encoders that transform data into the binary uint8 format required by the OPU.

class BinaryThresholdEncoder(threshold_enc=25, greater_is_one=True)[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Implements binary encoding using a threshold function.

Parameters
  • threshold_enc (int) – Threshold for the binary encoder. Must be in the interval [0, 255]

  • greater_is_one (bool) – If True, above threshold is 1 and below 0. Vice versa if False.

threshold_enc

Threshold for the binary encoder. Must be in the interval [0, 255]

Type

int

greater_is_one

If True, above threshold is 1 and below 0. Vice versa if False.

Type

bool

fit(X, y=None)[source]

No-op. This method doesn’t do anything. It exists purely for compatibility with the scikit-learn transformer API.

Parameters
  • X (np.ndarray,) – the input data to encode.

  • y (np.ndarray,) – the targets data.

Returns

self

Return type

BinaryThresholdEncoder

transform(X, y=None)[source]

Transforms a uint8 array with values in [0, 255] into a binary uint8 array with values in {0, 1}.

Parameters

X (np.ndarray of uint8,) – the input data to encode.

Returns

X_enc – the encoded data.

Return type

np.ndarray of uint8,
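As a rough illustration of the thresholding described above, here is a minimal NumPy sketch. It does not call lightonml; the helper name binary_threshold is hypothetical, and the treatment of values exactly equal to the threshold (mapped to 0 in both modes) is an assumption, since the docstring does not specify it.

```python
import numpy as np


def binary_threshold(X, threshold_enc=25, greater_is_one=True):
    """Hypothetical stand-in for a threshold encoder (not the library code)."""
    X = np.asarray(X, dtype=np.uint8)
    if greater_is_one:
        # Values strictly above the threshold become 1, everything else 0.
        return (X > threshold_enc).astype(np.uint8)
    # Values strictly below the threshold become 1, everything else 0.
    return (X < threshold_enc).astype(np.uint8)


X = np.array([[0, 25, 26, 255]], dtype=np.uint8)
X_enc = binary_threshold(X)                              # [[0, 0, 1, 1]]
X_enc_inv = binary_threshold(X, greater_is_one=False)    # [[1, 0, 0, 0]]
```

The output keeps the input shape and the uint8 dtype, which is the format the module description says the OPU expects.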

class ConcatenatingBitPlanDecoder(n_bits=8, decoding_decay=0.5)[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Implements a decoding that works by concatenating bitplanes.

n_bits MUST be the same value used in SeparatedBitPlanEncoder. Read more in the Examples section.

Parameters
  • n_bits (int, defaults to 8,) – number of bits used during the encoding.

  • decoding_decay (float, defaults to 0.5,) – decay to apply to the bits during the decoding.

n_bits

number of bits used during the encoding.

Type

int,

decoding_decay

decay to apply to the bits during the decoding.

Type

float, defaults to 0.5,

fit(X, y=None)[source]

No-op. This method doesn’t do anything. It exists purely for compatibility with the scikit-learn transformer API.

Parameters
  • X (np.ndarray) –

  • y (np.ndarray, optional, defaults to None.) –

Returns

self

Return type

ConcatenatingBitPlanDecoder

transform(X, y=None)[source]

Performs the decoding.

Parameters

X (2D np.ndarray of uint8 or uint16,) – input data to decode.

Returns

X_dec – decoded data.

Return type

2D np.ndarray of floats
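The docstring does not spell out the output layout, so the following is only one plausible reading of "concatenating bitplanes", written in plain NumPy without lightonml. It assumes that the n_bits planes of each sample are stored as consecutive rows with the least significant plane first, that plane k is scaled by decoding_decay raised to (n_bits - 1 - k), and that the scaled planes are laid side by side along the feature axis. The helper name is hypothetical.

```python
import numpy as np


def concatenate_bitplanes(X_enc, n_bits=8, decoding_decay=0.5):
    """Hypothetical sketch: scale each bit plane, then concatenate them."""
    n_rows, n_features = X_enc.shape
    if n_rows % n_bits != 0:
        raise ValueError("n_bits must match the value used by the encoder")
    n_samples = n_rows // n_bits
    planes = X_enc.reshape(n_samples, n_bits, n_features).astype(np.float32)
    # Assumed weighting: the last (most significant) plane gets weight 1.
    weights = decoding_decay ** np.arange(n_bits - 1, -1, -1)
    weighted = planes * weights[None, :, None]
    # Planes end up side by side: one row per sample.
    return weighted.reshape(n_samples, n_bits * n_features)


X_enc = np.array([[1, 1], [0, 1], [1, 0]], dtype=np.uint8)  # 3 planes, 2 features
X_dec = concatenate_bitplanes(X_enc, n_bits=3)
```

Under these assumptions the decoded array has n_bits times as many columns as the encoded one, and one row per original sample.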

class Float32Encoder(sign_bit=True, exp_bits=8, mantissa_bits=23)[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Implements an encoding that works by separating bitplanes and selecting how many bits to keep for the sign, exponent and mantissa of the float32.

Parameters
  • sign_bit (bool, defaults to True,) – if True keeps the bit for the sign.

  • exp_bits (int, defaults to 8,) – number of bits of the exponent to keep.

  • mantissa_bits (int, defaults to 23,) – number of bits of the mantissa to keep.

sign_bit

if True keeps the bit for the sign.

Type

bool, defaults to True,

exp_bits

number of bits of the exponent to keep.

Type

int, defaults to 8,

mantissa_bits

number of bits of the mantissa to keep.

Type

int, defaults to 23,

n_bits

total number of bits to keep.

Type

int,

indices

list of the indices of the bits to keep.

Type

list,

fit(X, y=None)[source]

No-op. This method doesn’t do anything. It exists purely for compatibility with the scikit-learn transformer API.

Parameters
  • X (2D np.ndarray) –

  • y (1D np.ndarray) –

Returns

self

Return type

Float32Encoder

transform(X)[source]

Performs the encoding.

Parameters

X (2D np.ndarray of uint8, 16, 32 or 64 [n_samples, n_features],) – input data to encode.

Returns

X_enc – encoded input data.

Return type

2D np.ndarray of uint8 [n_samples*n_bits, n_features],
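To make the sign/exponent/mantissa selection concrete, the sketch below reinterprets float32 values as 32-bit integers and keeps a subset of their bits. It relies only on the IEEE 754 binary32 layout (1 sign bit, 8 exponent bits, 23 mantissa bits). The helper name, the choice to keep the most significant bits of each field, and the row ordering of the output are assumptions, not the library's documented behaviour.

```python
import numpy as np


def float32_bitplanes(X, sign_bit=True, exp_bits=8, mantissa_bits=23):
    """Hypothetical sketch of a float32 bit-plane encoding."""
    # MSB-first bit positions: 0 = sign, 1..8 = exponent, 9..31 = mantissa.
    indices = [0] if sign_bit else []
    indices += list(range(1, 1 + exp_bits))
    indices += list(range(9, 9 + mantissa_bits))
    # Reinterpret the float32 bytes as uint32 without changing them.
    raw = np.ascontiguousarray(X, dtype=np.float32).view(np.uint32)
    shifts = np.arange(31, -1, -1, dtype=np.uint32)
    bits = ((raw[..., None] >> shifts) & 1).astype(np.uint8)  # (n, f, 32)
    kept = bits[..., indices]                                 # (n, f, n_bits)
    n_samples, n_features, n_bits = kept.shape
    # Match the documented output shape [n_samples*n_bits, n_features].
    return kept.transpose(0, 2, 1).reshape(n_samples * n_bits, n_features)


X = np.array([[1.0, -2.0]], dtype=np.float32)
X_enc = float32_bitplanes(X, sign_bit=True, exp_bits=8, mantissa_bits=0)
```

Here 1.0 has sign 0 and exponent bits 01111111, while -2.0 has sign 1 and exponent bits 10000000, so the two columns of X_enc hold those nine bits from top to bottom.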

class MixingBitPlanDecoder(n_bits=8, decoding_decay=0.5)[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Implements a decoding that works by mixing bitplanes.

n_bits MUST be the same value used in SeparatedBitPlanEncoder. Read more in the Examples section.

Parameters
  • n_bits (int, defaults to 8,) – number of bits used during the encoding.

  • decoding_decay (float, defaults to 0.5,) – decay to apply to the bits during the decoding.

n_bits

number of bits used during the encoding.

Type

int,

decoding_decay

decay to apply to the bits during the decoding.

Type

float, defaults to 0.5,

fit(X, y=None)[source]

No-op. This method doesn’t do anything. It exists purely for compatibility with the scikit-learn transformer API.

Parameters
  • X (np.ndarray) –

  • y (np.ndarray, optional, defaults to None.) –

Returns

self

Return type

MixingBitPlanDecoder

transform(X, y=None)[source]

Performs the decoding.

Parameters

X (2D np.ndarray of uint8 or uint16,) – input data to decode.

Returns

X_dec – decoded data.

Return type

2D np.ndarray of floats
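A minimal sketch of what "mixing" bitplanes with a decay could look like is given below, in plain NumPy and without lightonml. It assumes that the n_bits planes of each sample are stored as consecutive rows with the least significant plane first, and that plane k is weighted by decoding_decay raised to (n_bits - 1 - k) before the planes are summed. Both the ordering and the weighting are assumptions, and the helper name is hypothetical.

```python
import numpy as np


def mix_bitplanes(X_enc, n_bits=8, decoding_decay=0.5):
    """Hypothetical sketch: weighted sum of the bit planes of each sample."""
    n_rows, n_features = X_enc.shape
    if n_rows % n_bits != 0:
        raise ValueError("n_bits must match the value used by the encoder")
    planes = X_enc.reshape(n_rows // n_bits, n_bits, n_features)
    planes = planes.astype(np.float32)
    # Assumed weighting: the last (most significant) plane gets weight 1,
    # each less significant plane is multiplied by one more decay factor.
    weights = decoding_decay ** np.arange(n_bits - 1, -1, -1)
    return (planes * weights[None, :, None]).sum(axis=1)


# Bit planes of the values 5 (binary 101) and 3 (binary 011), LSB first.
X_enc = np.array([[1, 1], [0, 1], [1, 0]], dtype=np.uint8)
X_dec = mix_bitplanes(X_enc, n_bits=3, decoding_decay=0.5)
```

With a decay of 0.5 this weighting recovers the original integers up to a constant factor: 5 and 3 come back as 1.25 and 0.75, that is, divided by 2**(n_bits - 1).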

class NoDecoding[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Implements a No-Op Decoding class for API consistency.

class NoEncoding[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Implements a No-Op Encoding class for API consistency.

class SeparatedBitPlanEncoder(n_bits=8, starting_bit=0)[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Implements an encoding that works by separating bitplanes.

n_bits + starting_bit must not exceed the bit width of the data fed to the encoder. E.g. if X.dtype is uint8, then n_bits + starting_bit must be at most 8. If instead X.dtype is uint32, then n_bits + starting_bit must be at most 32.

Read more in the Examples section.

Parameters
  • n_bits (int, defaults to 8,) – number of bits to keep during the encoding. Must be positive.

  • starting_bit (int, defaults to 0,) – bit used to start the encoding, previous bits will be thrown away. Must be non-negative.

n_bits

number of bits to keep during the encoding.

Type

int,

starting_bit

bit used to start the encoding, previous bits will be thrown away.

Type

int,

fit(X, y=None)[source]

No-op. This method doesn’t do anything. It exists purely for compatibility with the scikit-learn transformer API.

Parameters
  • X (2D np.ndarray) –

  • y (1D np.ndarray) –

Returns

self

Return type

SeparatedBitPlanEncoder

transform(X)[source]

Performs the encoding.

Parameters

X (2D np.ndarray of uint8, 16, 32 or 64 [n_samples, n_features],) – input data to encode.

Returns

X_enc – encoded input data.

Return type

2D np.ndarray of uint8 [n_samples*n_bits, n_features]
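The bit-plane separation can be sketched in a few lines of NumPy. This is not the library implementation: the helper name is hypothetical, and the row ordering (the n_bits planes of a sample stored as consecutive rows, least significant kept bit first) is an assumption chosen to match the documented output shape.

```python
import numpy as np


def separate_bitplanes(X, n_bits=8, starting_bit=0):
    """Hypothetical sketch of a bit-plane encoder for unsigned integers."""
    X = np.asarray(X)
    n_samples, n_features = X.shape
    # Bit positions to keep, counted from the least significant bit.
    shifts = np.arange(starting_bit, starting_bit + n_bits).astype(X.dtype)
    # planes[i, k, j] is bit (starting_bit + k) of X[i, j].
    planes = (X[:, None, :] >> shifts[None, :, None]) & 1
    return planes.reshape(n_samples * n_bits, n_features).astype(np.uint8)


X = np.array([[5, 3]], dtype=np.uint8)   # binary 101 and 011
X_enc = separate_bitplanes(X, n_bits=3)
# X_enc is [[1, 1], [0, 1], [1, 0]]: one row per bit plane.
```

Each input row becomes n_bits binary rows, which is why a decoder has to be configured with the same n_bits to regroup them.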

class SequentialBaseTwoEncoder(n_gray_levels=16)[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Implements a base 2 encoding.

E.g. \(5\) is written \(101\) in base 2: \(5 = 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0\). Each bit is repeated as many times as its weight (4, 2 and 1), so the encoder will give 1111001.

Parameters

n_gray_levels (int,) – number of values that can be encoded. Must be a power of 2.

n_gray_levels

number of values that can be encoded. Must be a power of 2.

Type

int,

n_bits

number of bits needed to encode n_gray_levels values.

Type

int,

offset

value to subtract to get the minimum to 0.

Type

float,

scale

scaling factor to normalize the data.

Type

float,

fit(X, y=None)[source]

Computes parameters for the normalization.

Must be run only on the training set to avoid leaking information to the dev/test set.

Parameters
  • X (np.ndarray of uint [n_samples, n_features],) – the input data to encode.

  • y (np.ndarray,) – the targets data.

Returns

self

Return type

SequentialBaseTwoEncoder.

normalize(X)[source]

Normalize the data in the right range before the integer casting.

Parameters

X (np.ndarray of uint [n_samples, n_features],) – the input data to normalize.

Returns

X_norm – normalized data.

Return type

np.ndarray of uint8 [n_samples, n_features],
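The exact formulas behind offset and scale are not given here, so the sketch below assumes a plain min-max normalization onto the integer levels 0 to n_gray_levels - 1. The function names are hypothetical, and clipping values outside the training range is an additional assumption.

```python
import numpy as np


def fit_min_max(X_train, n_gray_levels=16):
    """Assumed fit step: offset is the minimum, scale maps the range to levels."""
    offset = float(X_train.min())
    scale = (float(X_train.max()) - offset) / (n_gray_levels - 1)
    # Guard against constant data, where the range is zero.
    return offset, scale or 1.0


def normalize_levels(X, offset, scale, n_gray_levels=16):
    """Assumed normalize step: shift, rescale, round and cast to uint8."""
    levels = np.round((np.asarray(X, dtype=np.float64) - offset) / scale)
    return np.clip(levels, 0, n_gray_levels - 1).astype(np.uint8)


X_train = np.array([[10, 40], [25, 55]], dtype=np.uint8)
offset, scale = fit_min_max(X_train)               # offset 10.0, scale 3.0
X_norm = normalize_levels(X_train, offset, scale)  # [[0, 10], [5, 15]]
X_test_norm = normalize_levels(np.array([[70]]), offset, scale)  # clipped to 15
```

Computing offset and scale on the training set only, then reusing them for other splits, is what keeps information from leaking into the dev/test data.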

transform(X, y=None)[source]

Performs the encoding.

Parameters

X (2D np.ndarray of uint [n_samples, n_features],) – input data to encode.

Returns

X_enc – encoded input data.

Return type

2D np.ndarray of uint8 [n_samples, n_features*(n_gray_levels-1)]
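The class example (5 becoming 1111001) and the output width of n_gray_levels - 1 columns per feature are both consistent with repeating each bit as many times as its place value. The sketch below assumes that reading and that the input has already been normalized to integers in [0, n_gray_levels - 1]; the helper name is hypothetical and the code does not call lightonml.

```python
import numpy as np


def sequential_base_two(X, n_gray_levels=16):
    """Hypothetical sketch: repeat each base-2 bit by its place value."""
    n_bits = n_gray_levels.bit_length() - 1      # valid for powers of 2
    X = np.asarray(X).astype(np.int64)
    shifts = np.arange(n_bits - 1, -1, -1)       # most significant bit first
    bits = (X[..., None] >> shifts) & 1          # (n_samples, n_features, n_bits)
    weights = 2 ** shifts                        # e.g. [4, 2, 1] for 3 bits
    # Bit k is repeated weights[k] times along the last axis.
    repeated = np.repeat(bits, weights, axis=-1)
    return repeated.reshape(X.shape[0], -1).astype(np.uint8)


X_enc = sequential_base_two(np.array([[5]], dtype=np.uint8), n_gray_levels=8)
# X_enc is [[1, 1, 1, 1, 0, 0, 1]], i.e. 1111001
```

The weights of all bits sum to n_gray_levels - 1, which accounts for the documented output width.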

lightonml.encoding.utils

This module contains all the functions that are used to implement the data formatting. Coordinates are (y, x).

compute_indices_1d_macro_pixels(n_features, rectangle_shape, feature_shape='square')[source]

Computes indices of the macro pixels in a linear way: the whole width of the rectangle is used, even if the quotient (rectangle_width / macro_pixel_width) is not an integer.

Parameters
  • n_features (int,) – number of features.

  • rectangle_shape (tuple of 2 integers,) – shape of the rectangle area.

  • feature_shape (string,) – shape of the macropixels (‘rectangle’ or ‘square’).

Returns

  • indices (np.ndarray of int64/int32,) – indices in the rectangle.

  • factor (int,) – size of the macropixels.

compute_indices_2d_macro_pixels(n_features, rectangle_shape)[source]

Computes indices of the macro pixels as a classical zoom-in of the features arranged in a square.

Parameters
  • n_features (int,) – number of features.

  • rectangle_shape (tuple of 2 integers,) – shape of the rectangle area.

Returns

  • indices (np.ndarray of int64/int32,) – indices in the rectangle.

  • factor (int,) – size of the macropixels.
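One way to picture the 2D macro pixel layout is sketched below in plain NumPy. Everything specific in it is an assumption rather than the library's behaviour: the helper name, the requirement that n_features be a perfect square, the choice of the largest factor that fits, the row-major flat indices, and the (n_features, factor*factor) layout of the returned array. Coordinates are (y, x), as in the rest of the module.

```python
import math

import numpy as np


def macro_pixel_indices_2d(n_features, rectangle_shape):
    """Hypothetical sketch: each feature covers a factor x factor block."""
    height, width = rectangle_shape
    side = math.isqrt(n_features)
    if side * side != n_features:
        raise ValueError("this sketch assumes a square number of features")
    # Largest integer zoom that still fits in the rectangle.
    factor = min(height // side, width // side)
    if factor == 0:
        raise ValueError("rectangle too small for this many features")
    rows, cols = np.divmod(np.arange(n_features), side)      # feature grid
    dy, dx = np.divmod(np.arange(factor * factor), factor)   # offsets in a block
    y = rows[:, None] * factor + dy[None, :]
    x = cols[:, None] * factor + dx[None, :]
    # Row-major flat index inside the rectangle.
    return y * width + x, factor


indices, factor = macro_pixel_indices_2d(4, (4, 6))
# factor is 2; feature 0 covers flat indices [0, 1, 6, 7]
```

Writing one feature value to all indices in its row of the returned array draws that feature as a square macro pixel.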

compute_indices_centered(n_features, rectangle_shape)[source]

Computes indices in order to have the feature values in a square centered in the area.

Parameters
  • n_features (int,) – number of features.

  • rectangle_shape (tuple of 2 integers,) – shape of the rectangle area.

Returns

  • indices (np.ndarray of int64/int32,) – indices in the rectangle.

  • factor (int,) – size of the macropixels.

compute_indices_lined(n_features, rectangle_shape)[source]

Computes indices in order to have the feature values positioned in line.

Parameters
  • n_features (int,) – number of features.

  • rectangle_shape (tuple of 2 integers,) – shape of the rectangle area.

Returns

  • indices (np.ndarray of int64/int32,) – indices in the rectangle.

  • factor (int,) – size of the macropixels.

compute_new_indices_greater_rectangle(indices, old_rectangle_shape, old_rectangle_position, new_rectangle_shape)[source]

Computes new indices when moving from an old rectangle to a new, larger one. This is useful when the region of interest is smaller than the DMD area.

Parameters
  • indices (np.ndarray of uint64/uint32,) – indices in the old rectangle.

  • old_rectangle_shape (tuple of 2 integers,) – shape of the old rectangle.

  • old_rectangle_position (tuple of 2 integers,) – position of the origin of the old rectangle inside the new one (distance from the top margin, distance from the left margin).

  • new_rectangle_shape (tuple of 2 integers,) – shape of the new rectangle.

Returns

new_indices – indices in the new rectangle.

Return type

np.ndarray of uint64/uint32,
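The index translation described above amounts to converting each flat index to (y, x) coordinates in the old rectangle, shifting by the old rectangle's position, and flattening again with the new width. The sketch below assumes row-major flat indices; the helper name is hypothetical and the code does not call lightonml.

```python
import numpy as np


def shift_indices_to_larger_rectangle(indices, old_shape, old_position, new_shape):
    """Hypothetical sketch of re-indexing from a small rectangle to a larger one."""
    old_h, old_w = old_shape
    top, left = old_position      # (distance from top, distance from left)
    new_h, new_w = new_shape
    if top + old_h > new_h or left + old_w > new_w:
        raise ValueError("old rectangle does not fit inside the new one")
    # Flat index -> (y, x) in the old rectangle.
    y, x = np.divmod(np.asarray(indices), old_w)
    # Shift by the origin and flatten with the new width.
    return (y + top) * new_w + (x + left)


old_indices = np.arange(6)   # every pixel of a 2 x 3 rectangle
new_indices = shift_indices_to_larger_rectangle(
    old_indices, old_shape=(2, 3), old_position=(1, 2), new_shape=(4, 6)
)
# new_indices is [8, 9, 10, 14, 15, 16]
```

In the example, the 2 x 3 region of interest sits one row down and two columns in from the corner of a 4 x 6 area, so its first row starts at flat index 1 * 6 + 2 = 8.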

get_formatting_function(n_features, position='2d_macro_pixels', roi_shape=(1140, 912), roi_position=(0, 0), dmd_shape=(1140, 912))[source]

Returns a formatting function that takes feature vectors and returns the DMD-formatted vectors.

Parameters
  • n_features (int,) – number of features.

  • position (str,) – type of formatting.

  • roi_shape (tuple of ints,) – shape of the ROI on the DMD.

  • roi_position (tuple of ints,) – position of the ROI on the DMD.

  • dmd_shape (tuple of ints,) – shape of the DMD.

Returns

  • formatting_func (callable,) – callable to perform the formatting of the input arrays.

  • factor (int,) – size of the macropixels.