Get started with the OPU

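There are three ways to process data with an OPU, each covered in its own section below: the low-level OPU class, which accepts numpy arrays or torch tensors; the scikit-learn compatible OPUMap, for numpy arrays; and the torch OPUMap, for torch tensors.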

Each of these can also be run in simulation, without access to a real OPU. See the last section of this notebook for details.

[1]:
import numpy as np
import torch
[4]:
numpy_data = np.random.randint(0, 2, size=(3000, 10000), dtype=np.uint8)
torch_data = torch.randint(0, 2, size=(3000, 10000), dtype=torch.uint8)

Lightonopu with numpy arrays or torch tensors

The OPU class is low-level, and it is used internally in lightonml classes. This class does not offer fancy features for compatibility with third-party frameworks, but it is more versatile and can accept both numpy.ndarray and torch.Tensor.

[ ]:
from lightonml import OPU
[8]:
opu = OPU(n_components=10000)

With OPU, you must call .fit1d (if the input data is a collection of vectors) or .fit2d (if it is a collection of matrices) at least once before calling .transform. Otherwise, .transform raises an AssertionError.

[12]:
try:
    opu.transform(numpy_data)
except AssertionError as e:
    print(e)
Call fit1d or fit2d before transform
[13]:
opu.fit1d(numpy_data)
[15]:
y_np = opu.transform(numpy_data)
y_np
[15]:
ContextArray([[ 7, 13, 24, ...,  9, 10, 10],
              [ 1,  5,  2, ...,  9,  2,  6],
              [ 0,  1,  2, ..., 10, 12, 22],
              ...,
              [ 1,  2,  1, ..., 10, 12,  9],
              [ 6,  5,  2, ..., 13,  7, 17],
              [19,  8,  0, ..., 15, 36, 46]], dtype=uint8)

When OPU processes numpy arrays, it returns a ContextArray, a simple subclass of np.ndarray, with a context attribute displaying the parameters chosen by fit. It can be turned into a numpy.ndarray by calling np.array on it.

[16]:
y_np.context.as_dict()
[16]:
{'exposure_us': 400,
 'frametime_us': 500,
 'output_roi': ((0, 512), (2040, 64)),
 'start': datetime.datetime(2020, 10, 9, 10, 27, 21, 223001),
 'gain_dB': 0.0,
 'end': datetime.datetime(2020, 10, 9, 10, 27, 22, 906459),
 'input_roi': ((0, 0), (912, 1140)),
 'n_ones': 514731,
 'fmt_type': 'lined',
 'fmt_factor': 103}
[22]:
np.array(y_np)
[22]:
array([[ 1,  3,  7, ..., 21,  7,  2],
       [13, 24, 24, ..., 12, 19, 23],
       [ 5,  2,  2, ..., 13,  8, 13],
       ...,
       [ 4,  4,  6, ..., 15, 18, 19],
       [10,  7, 14, ..., 12,  5,  3],
       [14, 14, 11, ...,  6,  1, 14]], dtype=uint8)
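
The behaviour shown above (an array that carries metadata, and np.array stripping it back to a plain array) is standard numpy subclassing. Below is a minimal sketch of that pattern. TaggedArray is a hypothetical stand-in written for illustration, not lightonml's ContextArray, and the dictionary stored in context is an arbitrary example value.

```python
import numpy as np


class TaggedArray(np.ndarray):
    """Hypothetical stand-in for ContextArray: an ndarray carrying metadata."""

    def __new__(cls, values, context=None):
        # View the data as this subclass, then attach the metadata.
        obj = np.asarray(values).view(cls)
        obj.context = context
        return obj

    def __array_finalize__(self, obj):
        # Also called for views and slices: inherit the metadata if present.
        if obj is None:
            return
        self.context = getattr(obj, "context", None)


y = TaggedArray(np.zeros((3, 4), dtype=np.uint8), context={"exposure_us": 400})
plain = np.array(y)  # copies the data into a base-class ndarray

print(type(y).__name__, y.context)
print(type(plain).__name__, hasattr(plain, "context"))
```

Because np.array defaults to subok=False, the result is a base-class ndarray and the context attribute is gone, which is why the conversion is a convenient way to hand the output to code that expects plain arrays.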

The two calls can be combined with the methods fit_transform1d or fit_transform2d. There is no difference between the API for numpy arrays and torch Tensors, but transform will return a tensor and not ContextArray for the latter.

[17]:
y_torch = opu.fit_transform1d(torch_data)
y_torch
[17]:
tensor([[ 4,  5,  4,  ...,  5,  5, 21],
        [ 8,  5,  3,  ...,  3,  5,  7],
        [ 7, 11, 15,  ..., 16, 28, 33],
        ...,
        [ 4,  3,  1,  ..., 10,  5, 12],
        [ 2, 11,  2,  ..., 12,  6,  1],
        [ 0,  3,  2,  ..., 16, 33, 54]], dtype=torch.uint8)
[18]:
data_2d = np.random.randint(0, 2, size=(3000, 900, 900), dtype=np.uint8)
opu.fit2d(data_2d)
y = opu.transform(data_2d)
y
[18]:
ContextArray([[ 1,  3,  7, ..., 21,  7,  2],
              [13, 24, 24, ..., 12, 19, 23],
              [ 5,  2,  2, ..., 13,  8, 13],
              ...,
              [ 4,  4,  6, ..., 15, 18, 19],
              [10,  7, 14, ..., 12,  5,  3],
              [14, 14, 11, ...,  6,  1, 14]], dtype=uint8)
[19]:
y.context.as_dict()
[19]:
{'exposure_us': 400,
 'frametime_us': 500,
 'output_roi': ((0, 512), (2040, 64)),
 'start': datetime.datetime(2020, 10, 9, 10, 28, 27, 450896),
 'gain_dB': 0.0,
 'end': datetime.datetime(2020, 10, 9, 10, 28, 30, 819065),
 'input_roi': ((6, 120), (900, 900)),
 'n_ones': 634685,
 'fmt_type': 'macro_2d',
 'fmt_factor': 1}

Remember to release the resources when you are done with them (you can also use a context manager).

[25]:
opu.close()
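
The context-manager form mentioned above is not shown in this notebook. As a minimal sketch of the pattern, assuming only that the object exposes a close() method (as OPU does), the standard library's contextlib.closing calls close() when the block exits, even if an exception is raised inside it. StandInDevice is a hypothetical placeholder so that the snippet runs without lightonml or an OPU.

```python
from contextlib import closing


class StandInDevice:
    """Hypothetical placeholder for an object that holds a resource and exposes close()."""

    def __init__(self):
        self.is_open = True

    def transform(self, x):
        assert self.is_open, "device is closed"
        return x  # placeholder: no real processing happens here

    def close(self):
        self.is_open = False


device = StandInDevice()
with closing(device) as dev:
    result = dev.transform([0, 1, 1, 0])

print(device.is_open)  # False: close() ran when the block exited
```

The benefit over a bare close() call is that the resource is released even when the code inside the block fails, so the device is not left locked for the next object that needs it.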

Lightonml with numpy arrays

There is an OPUMap class in lightonml.projections.sklearn that can process numpy.ndarrays and is built to be scikit-learn compatible: it can be embedded in pipelines, cross-validated, etc.

In OPUMap classes, .fit automatically dispatches to .fit1d or .fit2d. OPUMap also provides the standard fit_transform method of the scikit-learn API.
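
To make the dispatch concrete, here is a minimal sketch that assumes the choice depends only on the number of dimensions of the batch: 2D for a collection of vectors, 3D for a collection of matrices. The function choose_fit_method is hypothetical and written for illustration; it is not the library's actual dispatch logic, which this notebook does not show.

```python
def choose_fit_method(shape):
    """Pick a fit method name from the shape of a batch (illustrative assumption)."""
    n_dims = len(shape)
    if n_dims == 2:  # (n_samples, n_features): a collection of vectors
        return "fit1d"
    if n_dims == 3:  # (n_samples, height, width): a collection of matrices
        return "fit2d"
    raise ValueError(f"expected a 2D or 3D batch, got shape {shape}")


print(choose_fit_method((3000, 10000)))     # fit1d
print(choose_fit_method((3000, 900, 900)))  # fit2d
```

The two shapes are the ones used for numpy_data and data_2d earlier in this notebook.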

[26]:
from lightonml.projections.sklearn import OPUMap
[27]:
opumap_np = OPUMap(n_components=10000)
[28]:
output = opumap_np.fit_transform(numpy_data)
output
[28]:
array([[ 7, 12, 22, ...,  8, 10, 11],
       [ 2,  6,  2, ...,  9,  1,  6],
       [ 1,  1,  3, ..., 10, 13, 23],
       ...,
       [ 1,  2,  1, ...,  9, 11,  9],
       [ 6,  5,  2, ..., 12,  7, 16],
       [19,  8,  1, ..., 15, 39, 47]], dtype=uint8)

Since we are going to use a different object to “talk” with the OPU, we have to release the resource.

[29]:
opumap_np.close()

Lightonml with torch tensors

A second OPUMap interface is available in lightonml.projections.torch. In this case OPUMap behaves as a torch.nn.Module: the object can be called on data.

Note that the optical processing is not differentiable, so this operation will break the computational graph: gradients are not propagated through the optical transform. The fit method can be called explicitly, or it will be run on the first batch of data automatically.

[30]:
from lightonml.projections.torch import OPUMap
[31]:
opumap_torch = OPUMap(n_components=10000)
OPU output is detached from the computational graph.
[32]:
output = opumap_torch(torch_data)
output
OPUMap was not fit to data. Performing fit on the first batch with default parameters...
[32]:
tensor([[ 3,  5,  5,  ...,  5,  5, 21],
        [ 8,  7,  4,  ...,  3,  5,  7],
        [ 8, 14, 18,  ..., 18, 29, 35],
        ...,
        [ 3,  4,  1,  ..., 12,  6, 13],
        [ 2, 11,  4,  ..., 13,  7,  1],
        [ 1,  3,  3,  ..., 16, 40, 56]], dtype=torch.uint8)
[33]:
opumap_torch.close()

Simulating an OPU

If you don’t have access to an OPU, you can simulate one on any machine. Keep the dimensions low, though: for example, n_components=1000 and max_n_features=1000 will already use 1 GB of RAM.

A real OPU doesn’t have these limitations: because of the analog nature of the transform matrix, it takes no compute memory at all.

For both the OPU and OPUMap classes, enable simulation when instantiating the object, as in the following code:

[ ]:
opu = OPU(n_components=1000, max_n_features=1000, simulated=True)