Tutorial 3 - How to Create Your Own Transform

In this tutorial, we will show you how DASF organizes its APIs to generate code targeted to each architecture.

We will also show you how you can create your own object and generate code dynamically for each platform.

For this, let’s use the same code we used in Tutorial 2. Check there how to create data.npy before continuing.
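
If you do not have the file at hand, the sketch below shows one way to produce a compatible data.npy with plain NumPy. The shape (40, 40, 40) and the random content are assumptions made for illustration; the only requirement taken from this tutorial is that the array is three-dimensional, since it is indexed as [:2, :2, 0] later on.

```python
import numpy as np

# Assumption: a small 3D volume of random floats. The shape (40, 40, 40)
# is illustrative and may not match what Tutorial 2 uses.
rng = np.random.default_rng(seed=0)
data = rng.random((40, 40, 40))

# Save into the current directory so that root="data.npy" can find it.
np.save("data.npy", data)
```

With a different random seed (or a different generator), the values printed in the cells below will of course differ.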

Then, we need to define our dataset.

[1]:
from dasf.datasets import DatasetArray

dataset = DatasetArray(name="My Saved NPY", root="data.npy")

Here, we want to create a transform that multiplies the data by itself.

First, let’s inspect what the data looks like. We are using a GPU, so the data has to be fetched from the GPU to the CPU with .get(). If you are using a CPU, you just need to print the data.

[2]:
dataset.load()

dataset[:2, :2, 0].get()
[2]:
array([[0.22139306, 0.18095083],
       [0.78598473, 0.28964964]])

Now, let’s create our own transform called Multiply. To generate code targeted to the running platform, we need to import and set the respective decorator, so that the transform function is generated for us dynamically. To make this even clearer, we can include a print call in each function.

[3]:
from dasf.transforms import Transform


class Multiply(Transform):
    def _lazy_transform_cpu(self, X):
        print("Lazy CPU")
        return X * X

    def _lazy_transform_gpu(self, X):
        print("Lazy GPU")
        return X * X

    def _transform_cpu(self, X):
        print("CPU")
        return X * X

    def _transform_gpu(self, X):
        print("GPU")
        return X * X

multiply = Multiply()
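
To build some intuition for how a single transform call can end up in one of the four methods above, here is a minimal toy sketch of name-based dispatch. It is not DASF's implementation: ToyTransform, ToyMultiply, and the use_gpu and lazy flags are hypothetical names introduced only for this illustration, and the target is chosen by hand here rather than detected from the running platform.

```python
import numpy as np


class ToyTransform:
    """Hypothetical stand-in for a dispatching base class (not DASF code)."""

    def __init__(self, use_gpu=False, lazy=False):
        # Assumption: the target is selected by hand through these flags.
        self.use_gpu = use_gpu
        self.lazy = lazy
        self.last_called = None

    def transform(self, X):
        # Build the method name from the flags, e.g. "_lazy_transform_gpu".
        prefix = "_lazy_transform" if self.lazy else "_transform"
        suffix = "gpu" if self.use_gpu else "cpu"
        name = f"{prefix}_{suffix}"
        self.last_called = name
        return getattr(self, name)(X)


class ToyMultiply(ToyTransform):
    def _lazy_transform_cpu(self, X):
        return X * X

    def _lazy_transform_gpu(self, X):
        return X * X

    def _transform_cpu(self, X):
        return X * X

    def _transform_gpu(self, X):
        return X * X


toy = ToyMultiply(use_gpu=False, lazy=False)
out = toy.transform(np.array([1.0, 2.0, 3.0]))
print(toy.last_called, out)
```

The point of the sketch is only the lookup in transform: the subclass supplies one method per target and never has to decide which one runs.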

Now, we can transform our dataset and see what happens.

[4]:
result = multiply.transform(dataset)
GPU

See that it triggered the local GPU function. Now, let’s look at the content of the result variable and compare it with the original data.

[5]:
result[:2, :2, 0].get()
[5]:
array([[0.04901489, 0.0327432 ],
       [0.61777199, 0.08389691]])

See that the result is exactly the dataset multiplied by itself; the values confirm that. Now, what if we would like to run the CPU code instead of the GPU code? In that case, we need to call each protected method directly.

[6]:
dataset._load_cpu()

result = multiply._transform_cpu(dataset)

result[:2, :2, 0]
CPU
[6]:
array([[0.04901489, 0.0327432 ],
       [0.61777199, 0.08389691]])

See that the code now triggered the CPU function.

Actually, if you pay attention, the implementations of all four functions are identical. So, this class can be reduced to:

[7]:
class Multiply2(Transform):
    def transform(self, X):
        return X * X

That is, without the decorator and all the other functions. The reason we have all these differentiations is that, in most cases, we know the data manipulation will differ from one target to another.
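
As an illustration of one such case, the sketch below contrasts an eager method with a lazy one. It is a plain class rather than a DASF Transform, the name ClipSketch and its parameters are hypothetical, and it assumes the lazy input is a Dask array, whose map_blocks method applies a function chunk by chunk.

```python
import numpy as np


class ClipSketch:
    """Plain class, not a DASF Transform; it only mirrors the method naming."""

    def __init__(self, low=0.0, high=1.0):
        self.low = low
        self.high = high

    def _transform_cpu(self, X):
        # Eager NumPy array: apply the function to the whole array at once.
        return np.clip(X, self.low, self.high)

    def _lazy_transform_cpu(self, X):
        # Assumption: X is a Dask array, so the work is scheduled per chunk
        # and nothing is computed until .compute() is called.
        return X.map_blocks(np.clip, self.low, self.high, dtype=X.dtype)


clip = ClipSketch(low=0.2, high=0.8)
clipped = clip._transform_cpu(np.array([0.0, 0.5, 1.0]))
print(clipped)
```

Only the eager path is exercised here, so the snippet runs with NumPy alone. The lazy method is shown to make the contrast visible: the operation is the same, but how it is applied to the data changes with the target.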