Transfer learning on the OPU

Computing the convolutional features

import warnings
import time
from IPython.display import display, HTML
display(HTML("<style>.container { width:80% !important; }</style>"))

import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import torch
from torch.autograd import Variable
from torch.utils.data import DataLoader
import torchvision.datasets as datasets
import torchvision.models as models
import torchvision.transforms as transforms

Dataset: STL10

The STL10 dataset is already downloaded on our servers. It can be loaded with any of your favorite frameworks from <ml_data_path>/STL10. It is an image-classification dataset consisting of 5,000 training and 8,000 test examples, plus 100,000 unlabelled ones. The images are 96x96 RGB and there are 10 classes.

# Data preparation
from lightonml.utils import get_ml_data_dir_path
normalize = transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
transform = transforms.Compose([transforms.Resize(224),
                                transforms.ToTensor(),
                                normalize])
data_path = str(get_ml_data_dir_path() / "STL10")
train_data = datasets.STL10(root=data_path, split='train', transform=transform, download=True)
test_data = datasets.STL10(root=data_path, split='test', transform=transform, download=True)
train_loader = DataLoader(train_data, batch_size=64, shuffle=False, num_workers=12, drop_last=False)
test_loader = DataLoader(test_data, batch_size=64, shuffle=False, num_workers=12, drop_last=False)
Files already downloaded and verified
Files already downloaded and verified
f, axarr = plt.subplots(2, 2)
axarr[0, 0].imshow(np.transpose(train_data.data[9], (1, 2, 0)))
axarr[0, 0].set_title(train_data.classes[train_data.labels[9]])
axarr[0, 1].imshow(np.transpose(train_data.data[15], (1, 2, 0)))
axarr[0, 1].set_title(train_data.classes[train_data.labels[15]])
axarr[1, 0].imshow(np.transpose(train_data.data[5], (1, 2, 0)))
axarr[1, 0].set_title(train_data.classes[train_data.labels[5]])
axarr[1, 1].imshow(np.transpose(train_data.data[10], (1, 2, 0)))
axarr[1, 1].set_title(train_data.classes[train_data.labels[10]])

Model: VGG16

Details about this architecture are in the following paper:

Very Deep Convolutional Networks for Large-Scale Image Recognition
K. Simonyan, A. Zisserman

We extract the features from the last convolutional layer.

Example of a VGG architecture
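Because the images are resized to 224x224 and VGG16 contains five 2x2 max-pooling stages, each halving the spatial resolution, the last convolutional layer outputs 512 feature maps of size 7x7, i.e. 25088 features per image (the array width that appears in the projection logs later). A quick sanity check of that arithmetic:

```python
# VGG16: five 2x2 max-pool stages, each halving the spatial resolution
input_size = 224
n_pools = 5
feat_size = input_size // (2 ** n_pools)    # 224 / 32 = 7
n_channels = 512                            # channels of the last conv block
n_features = n_channels * feat_size * feat_size
print(feat_size, n_features)                # 7 25088
```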

use_cuda = torch.cuda.is_available()

def compute_features(loader, model):
    conv_features = []
    labels = []
    for i, (images, targets) in enumerate(loader):
        if use_cuda:
            images = Variable(images.cuda())
        else:
            images = Variable(images)
        outputs = model.features(images)
        # flatten the feature maps to one vector per image
        conv_features.append(outputs.data.view(outputs.size(0), -1).cpu().numpy())
        labels.append(targets.numpy())
    return np.concatenate(conv_features), np.concatenate(labels)
# Load the model
vgg16 = models.vgg16(pretrained=True)
# Move the model to GPU if available
if use_cuda:
    vgg16 = vgg16.cuda()
train_conv_features, train_labels = compute_features(train_loader, vgg16)
test_conv_features, test_labels = compute_features(test_loader, vgg16)

np.savez_compressed('conv_features.npz', train=train_conv_features, test=test_conv_features)
np.savez_compressed('labels.npz', train=train_labels, test=test_labels)
CPU times: user 28.5 s, sys: 28.4 s, total: 56.9 s
Wall time: 1min

Learning a new classifier

from sklearn.linear_model import RidgeClassifier

from lightonml.encoding.base import Float32Encoder, MixingBitPlanDecoder
from lightonml.random_projections.opu import OPURandomMapping
from lightonopu.opu import OPU

Load data

conv_features = np.load('conv_features.npz')
labels = np.load('labels.npz')

n_components = 315000

train_conv_features = conv_features['train']
test_conv_features = conv_features['test']
train_labels = labels['train']
test_labels = labels['test']

OPU Pipeline

We encode the data in a binary format using the Float32Encoder, keeping only the first 2 bits of the exponent of each value. We then project these binary features to a much higher-dimensional space (\(315000\) components) on the OPU, decode the result with the MixingBitPlanDecoder, and finally learn a linear classifier (RidgeClassifier) on the random features.
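The OPU physically computes \(y = |Rx|^2\) for a fixed complex random matrix \(R\) and a binary input \(x\). As a small-scale illustration of that operation (a numpy simulation, not the lightonml implementation, with toy dimensions in place of 25088 and 315000):

```python
import numpy as np

rng = np.random.RandomState(0)

n_features, n_components = 64, 256          # toy sizes; the real run uses 25088 -> 315000
# Fixed complex Gaussian random matrix, playing the role of the OPU's scattering medium
R = rng.randn(n_components, n_features) + 1j * rng.randn(n_components, n_features)

x = rng.randint(0, 2, size=(8, n_features))  # 8 binary input vectors
y = np.abs(x @ R.T) ** 2                     # |Rx|^2: the OPU output is a nonnegative intensity
print(y.shape)                               # (8, 256)
```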

encoder = Float32Encoder(sign_bit=False, exp_bits=2, mantissa_bits=0)
since = time.time()
encoded_conv_train_features = encoder.fit_transform(train_conv_features)
encoded_conv_test_features = encoder.fit_transform(test_conv_features)
encoding_time = time.time() - since
print('Time for encoding: {:.4f}'.format(encoding_time))
Time for encoding: 4.5280
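To see what "keeping the first 2 bits of the exponent" means, recall that a float32 is laid out as sign (1 bit) | exponent (8 bits) | mantissa (23 bits). A hedged sketch of extracting the two most significant exponent bits as binary features, illustrating the idea rather than lightonml's exact code:

```python
import numpy as np

def top_exponent_bits(x, n_bits=2):
    """Illustrative sketch (not lightonml's implementation): keep the n_bits
    most significant bits of each float32's biased exponent as binary features."""
    bits = x.astype(np.float32).view(np.uint32)   # reinterpret the raw float32 bits
    exponent = (bits >> 23) & 0xFF                # 8-bit biased exponent
    planes = [(exponent >> (7 - k)) & 1 for k in range(n_bits)]
    return np.stack(planes, axis=-1).astype(np.uint8)

x = np.array([0.5, 3.0, 1e-3], dtype=np.float32)
print(top_exponent_bits(x))   # [[0 1] [1 0] [0 1]]
```

Each value thus contributes n_bits binary entries, which is why the encoded arrays fed to the OPU have twice as many rows as there are samples.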
opu = OPU(500, 200)
opu.cam_ROI = ([270, 480], [540, 960])
since = time.time()
opu_mapping = OPURandomMapping(opu, n_components=n_components, position='1d_square_macro_pixels',
                               roi_shape=(650, 650), roi_position=(245, 131), disable_pbar=True)
train_random_features = opu_mapping.fit_transform(encoded_conv_train_features)
test_random_features = opu_mapping.transform(encoded_conv_test_features)
projection_time = time.time() - since
print('Time taken by RP on OPU: {:.4f}'.format(projection_time))
OPU: random projections of an array of size (10000,25088)
OPU: random projections of an array of size (16000,25088)
Time taken by RP on OPU: 40.0988
decoder = MixingBitPlanDecoder(n_bits=2)
since = time.time()
train_random_features = decoder.fit_transform(train_random_features)
test_random_features = decoder.fit_transform(test_random_features)
decoding_time = time.time() - since
print('Time for decoding: {:.4f}'.format(decoding_time))
Time for decoding: 26.1764
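Since the encoder produced 2 binary rows per sample, the projected arrays also hold 2 rows per sample (hence the 10000 and 16000 rows above for 5000 and 8000 samples), and the decoder recombines consecutive bit-plane rows into one row per sample. A hedged sketch of such a weighted recombination, simplifying MixingBitPlanDecoder's behaviour:

```python
import numpy as np

def mix_bit_planes(y, n_bits=2):
    """Illustrative (not lightonml's exact code): recombine n_bits consecutive
    bit-plane rows per sample with weights 2**k."""
    n_samples = y.shape[0] // n_bits
    y = y.reshape(n_samples, n_bits, -1)
    weights = 2.0 ** np.arange(n_bits)            # [1, 2] for n_bits=2
    return (y * weights[None, :, None]).sum(axis=1)

y = np.arange(12, dtype=float).reshape(6, 2)      # 3 samples x 2 bit planes, 2 features
print(mix_bit_planes(y).shape)                    # (3, 2): one row per sample again
```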
clf = RidgeClassifier(alpha=10)
since = time.time()
clf.fit(train_random_features, train_labels)
fit_time = time.time() - since
print('Time for fit: {:.4f}'.format(fit_time))
Time for fit: 32.4227

Model score

since = time.time()
train_accuracy = clf.score(train_random_features, train_labels)
test_accuracy = clf.score(test_random_features, test_labels)
score_time = time.time() - since
print('Time for score (s): {:.4f}'.format(score_time))
print('OPU Train accuracy (%): {:.4f}'.format(train_accuracy*100))
print('OPU Test accuracy (%): {:.4f}'.format(test_accuracy*100))
print('Total time: {:.4f}'.format(encoding_time + projection_time + decoding_time + fit_time + score_time))
Time for score (s): 6.0189
OPU Train accuracy (%): 100.0000
OPU Test accuracy (%): 88.2000
Total time: 109.2449