Static Kernels#

A common approach when using signature kernels is to first lift the underlying ambient space to a new feature space by means of a feature map, and then consider the signature kernel in this feature space. Practically, this can be achieved by modifying the signature kernel PDE to use a static kernel on the ambient space. Recall that the (standard) signature kernel \(k_{x,y}\) is the solution to the Goursat PDE

\[\frac{\partial^2 k_{x,y}}{\partial s \partial t} = \langle \dot{x}_s, \dot{y}_t \rangle k_{x,y}, \quad k_{x,y}(u, \cdot) = k_{x,y}(\cdot, v) = 1,\]

for which a first-order finite difference approximation yields

\[\frac{\partial^2 k_{x,y}}{\partial s \partial t} = \left( \langle x_s, y_t \rangle - \langle x_{s-1}, y_t \rangle - \langle x_s, y_{t-1} \rangle + \langle x_{s-1}, y_{t-1} \rangle \right) k_{x,y}, \quad k_{x,y}(u, \cdot) = k_{x,y}(\cdot, v) = 1.\]
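
The four-term expression follows from replacing each derivative by an increment between consecutive samples (unit grid spacing is assumed here purely for illustration) and expanding the inner product by bilinearity:

\[\langle \dot{x}_s, \dot{y}_t \rangle \approx \langle x_s - x_{s-1}, y_t - y_{t-1} \rangle = \langle x_s, y_t \rangle - \langle x_{s-1}, y_t \rangle - \langle x_s, y_{t-1} \rangle + \langle x_{s-1}, y_{t-1} \rangle.\]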

If instead one considers a static kernel \(\kappa\) on the ambient space, the equation becomes

\[\frac{\partial^2 k_{x,y}}{\partial s \partial t} = \left( \kappa(x_s, y_t) - \kappa(x_{s-1}, y_t) - \kappa(x_s, y_{t-1}) + \kappa(x_{s-1}, y_{t-1}) \right) k_{x,y}, \quad k_{x,y}(u, \cdot) = k_{x,y}(\cdot, v) = 1.\]
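
As a rough illustration of the equation above, the sketch below builds the matrix of static kernel increments from a pointwise Gram matrix and feeds it to a plain explicit first-order scheme. The helper names kernel_increments and solve_goursat are hypothetical and not part of pysiglib, and the scheme is chosen for readability; it is not claimed to be the scheme pysiglib uses internally.

```python
import torch


def kernel_increments(gram: torch.Tensor) -> torch.Tensor:
    """Second-order differences of a pointwise Gram matrix of shape (batch, L1, L2).

    Entry (s, t) of the result is
    kappa(x_{s+1}, y_{t+1}) - kappa(x_s, y_{t+1}) - kappa(x_{s+1}, y_t) + kappa(x_s, y_t),
    so the output has shape (batch, L1 - 1, L2 - 1).
    """
    return gram[:, 1:, 1:] - gram[:, :-1, 1:] - gram[:, 1:, :-1] + gram[:, :-1, :-1]


def solve_goursat(delta: torch.Tensor) -> torch.Tensor:
    """Explicit first-order scheme on the sample grid (illustrative only).

    Returns the kernel value at the final grid point for each batch element.
    """
    batch, n, m = delta.shape
    # Boundary conditions: the kernel equals 1 on the first row and first column.
    k = torch.ones((batch, n + 1, m + 1), dtype=delta.dtype, device=delta.device)
    for s in range(n):
        for t in range(m):
            k[:, s + 1, t + 1] = (
                k[:, s + 1, t] + k[:, s, t + 1] + (delta[:, s, t] - 1.0) * k[:, s, t]
            )
    return k[:, -1, -1]


torch.manual_seed(0)
x = torch.rand((4, 10, 3), dtype=torch.float64)
y = torch.rand((4, 12, 3), dtype=torch.float64)

# Linear static kernel: the increments reduce to inner products of path increments.
linear_delta = kernel_increments(torch.bmm(x, y.transpose(1, 2)))
expected = torch.bmm(torch.diff(x, dim=1), torch.diff(y, dim=1).transpose(1, 2))

# A non-linear static kernel, here exp(-||x - y||^2 / sigma) with sigma = 0.5.
rbf_gram = torch.exp(-torch.cdist(x, y) ** 2 / 0.5)
rbf_values = solve_goursat(kernel_increments(rbf_gram))
```

For the linear kernel, linear_delta agrees with expected, which recovers the standard finite difference equation; swapping in any other pointwise Gram matrix changes only the input to solve_goursat.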

pysiglib functions which utilise signature kernels accept \(\kappa\) as an optional parameter (static_kernel). By default, the standard linear kernel will be used. pysiglib provides implementations of the linear kernel, scaled linear kernel, RBF kernel, polynomial kernel, Matern-1/2 kernel, Matern-3/2 kernel, Matern-5/2 kernel and rational quadratic kernel, which are documented below. In addition, one may define custom kernels.

import torch
import pysiglib

X = torch.rand((32, 100, 5))
Y = torch.rand((32, 100, 5))

# Default behaviour - linear kernel
ker = pysiglib.sig_kernel(X, Y, dyadic_order=1)

# Explicitly passed linear kernel - same as default behaviour
static_kernel = pysiglib.LinearKernel()
ker = pysiglib.sig_kernel(X, Y, dyadic_order=1, static_kernel=static_kernel)

# RBF kernel
static_kernel = pysiglib.RBFKernel(0.5)
ker = pysiglib.sig_kernel(X, Y, dyadic_order=1, static_kernel=static_kernel)

Standard Kernels#

class LinearKernel[source]#

The linear kernel, defined by \(\kappa(x, y) = \langle x, y \rangle\).

class ScaledLinearKernel(scale=1.0)[source]#

The scaled linear kernel, defined by \(\kappa(x, y) = \langle \alpha x, \alpha y \rangle = \alpha^2 \langle x, y \rangle\), where \(\alpha\) is given by the parameter scale. A choice of scale=1.0 corresponds to the standard linear kernel.

class RBFKernel(sigma)[source]#

The RBF kernel, defined by \(\kappa(x, y) = \exp\left( -\frac{\lVert x - y \rVert^2}{\sigma} \right)\).
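
Note that in this formula \(\sigma\) divides the squared distance directly. The following sketch, which uses a hypothetical helper rbf_gram rather than any pysiglib function, evaluates the formula pointwise and compares it with the length-scale parametrisation \(\exp(-r^2 / (2\ell^2))\) that is common elsewhere, under the assumption \(\sigma = 2\ell^2\).

```python
import torch


def rbf_gram(x: torch.Tensor, y: torch.Tensor, sigma: float) -> torch.Tensor:
    """Pointwise Gram matrix exp(-||x_s - y_t||^2 / sigma), shape (batch, L1, L2)."""
    sq_dist = torch.cdist(x, y) ** 2
    return torch.exp(-sq_dist / sigma)


torch.manual_seed(0)
x = torch.rand((2, 6, 3), dtype=torch.float64)
y = torch.rand((2, 8, 3), dtype=torch.float64)

sigma = 0.5
gram = rbf_gram(x, y, sigma)

# Same values under exp(-r^2 / (2 * ell^2)), with ell chosen so that 2 * ell^2 == sigma.
ell = (sigma / 2.0) ** 0.5
gram_ell = torch.exp(-torch.cdist(x, y) ** 2 / (2.0 * ell**2))
```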

class PolynomialKernel(degree=3.0, gamma=1.0, scale=1.0)[source]#

The polynomial kernel, defined by \(\kappa(x, y) = \text{scale} \cdot \left( \langle x, y \rangle + \gamma \right)^d\), where \(d\) is the degree parameter.
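
A minimal pointwise sketch of this formula is given below. The function polynomial_gram is a hypothetical helper written for illustration, with argument names chosen to mirror the parameters above; it is not the library's implementation.

```python
import torch


def polynomial_gram(x, y, degree=3.0, gamma=1.0, scale=1.0):
    """Pointwise Gram matrix scale * (<x_s, y_t> + gamma)^degree, shape (batch, L1, L2).

    For fractional degrees the base <x_s, y_t> + gamma must be non-negative.
    """
    inner = torch.bmm(x, y.transpose(1, 2))
    return scale * (inner + gamma) ** degree


torch.manual_seed(0)
x = torch.rand((2, 6, 3), dtype=torch.float64)
y = torch.rand((2, 8, 3), dtype=torch.float64)

gram = polynomial_gram(x, y)

# With degree=1 and gamma=0 the expression collapses to scale * <x, y>.
linear_like = polynomial_gram(x, y, degree=1.0, gamma=0.0, scale=4.0)
```

Going by the formulas on this page, the last line takes the same values as the scaled linear kernel with \(\alpha = 2\), since \(\alpha^2 = 4\).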

class Matern12Kernel(sigma)[source]#

The Matern-1/2 kernel (exponential kernel), defined by \(\kappa(x, y) = \exp\left( -\frac{\lVert x - y \rVert}{\sigma} \right)\).

class Matern32Kernel(sigma)[source]#

The Matern-3/2 kernel, defined by \(\kappa(x, y) = \left(1 + \frac{\sqrt{3} \lVert x - y \rVert}{\sigma}\right) \exp\left( -\frac{\sqrt{3} \lVert x - y \rVert}{\sigma} \right)\).

class Matern52Kernel(sigma)[source]#

The Matern-5/2 kernel, defined by \(\kappa(x, y) = \left(1 + \frac{\sqrt{5} \lVert x - y \rVert}{\sigma} + \frac{5 \lVert x - y \rVert^2}{3\sigma^2}\right) \exp\left( -\frac{\sqrt{5} \lVert x - y \rVert}{\sigma} \right)\).
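
The three Matern formulas above share the same structure: a polynomial in the scaled distance multiplied by a decaying exponential. The sketch below evaluates them pointwise with torch.cdist; matern_gram and its nu argument are hypothetical names used only for this illustration.

```python
import math

import torch


def matern_gram(x, y, sigma, nu):
    """Pointwise Matern Gram matrix for nu in {0.5, 1.5, 2.5}, shape (batch, L1, L2)."""
    r = torch.cdist(x, y)  # Euclidean distances ||x_s - y_t||
    if nu == 0.5:
        return torch.exp(-r / sigma)
    if nu == 1.5:
        z = math.sqrt(3.0) * r / sigma
        return (1.0 + z) * torch.exp(-z)
    if nu == 2.5:
        z = math.sqrt(5.0) * r / sigma
        # z**2 / 3 equals 5 * r**2 / (3 * sigma**2)
        return (1.0 + z + z**2 / 3.0) * torch.exp(-z)
    raise ValueError("nu must be one of 0.5, 1.5, 2.5")


torch.manual_seed(0)
x = torch.rand((2, 6, 3), dtype=torch.float64)
y = torch.rand((2, 8, 3), dtype=torch.float64)

grams = {nu: matern_gram(x, y, sigma=0.5, nu=nu) for nu in (0.5, 1.5, 2.5)}
```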

class RationalQuadraticKernel(sigma, alpha=1.0)[source]#

The rational quadratic kernel, defined by \(\kappa(x, y) = \left(1 + \frac{\lVert x - y \rVert^2}{2 \alpha \sigma^2}\right)^{-\alpha}\).
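
To see how \(\alpha\) shapes the kernel, it can help to take the limit of large \(\alpha\), using \((1 + z/\alpha)^{-\alpha} \to e^{-z}\):

\[\lim_{\alpha \to \infty} \left(1 + \frac{\lVert x - y \rVert^2}{2 \alpha \sigma^2}\right)^{-\alpha} = \exp\left( -\frac{\lVert x - y \rVert^2}{2 \sigma^2} \right),\]

which has the form of the RBF kernel given above with the RBF parameter equal to \(2\sigma^2\).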

Custom Kernels#

In addition to the provided kernels, one can use a custom kernel by subclassing the abstract base class pysiglib.StaticKernel and using the methods of pysiglib.Context to save objects for re-use in backpropagation. For example, an implementation of pysiglib.LinearKernel is given below. When writing custom kernels, efficiency matters: evaluating the static kernel accounts for a significant proportion of the overall computational cost of signature kernels.

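import torch
from pysiglib import StaticKernel

class LinearKernel(StaticKernel):

    def __call__(self, ctx, x, y):
        # Increments of each path along the time axis
        dx = torch.diff(x, dim=1)
        dy = torch.diff(y, dim=1)
        # Keep the increments for re-use in the backward pass
        ctx.save_for_backward(dx, dy)
        # Inner products of increments, shape (batch_size, length_1 - 1, length_2 - 1)
        return torch.bmm(dx, dy.permute(0, 2, 1))

    def grad_x(self, ctx, derivs):
        dx, dy = ctx.saved_tensors
        # Row s of out is derivs[:, s - 1, :] - derivs[:, s, :], with out-of-range rows taken as zero
        out = torch.empty((dx.shape[0], dx.shape[1] + 1, dy.shape[1]), dtype=torch.float64, device=derivs.device)
        out[:, 0, :] = 0
        out[:, 1:, :] = derivs
        out[:, :-1, :] -= derivs
        return torch.bmm(out, dy)

    def grad_y(self, ctx, derivs):
        dx, dy = ctx.saved_tensors
        # Column t of out is derivs[:, :, t - 1] - derivs[:, :, t], with out-of-range columns taken as zero
        out = torch.empty((dx.shape[0], dx.shape[1], dy.shape[1] + 1), dtype=torch.float64, device=derivs.device)
        out[:, :, 0] = 0
        out[:, :, 1:] = derivs
        out[:, :, :-1] -= derivs
        return torch.bmm(out.permute(0, 2, 1), dx)

class Context[source]#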

Provides context for backpropagation through static kernels. It is not generally necessary to create instances of this class manually; documentation for this class is provided purely for reference when constructing custom-made static kernels.

save_for_backward(*args)[source]#

Save objects from the forward pass to be re-used on the backward pass.

save_for_grad_y(*args)[source]#

Save objects from the computation of the gradient with respect to x to be re-used for that of the gradient with respect to y.

class StaticKernel[source]#

abstractmethod __call__(ctx, x, y)[source]#

Returns the Gram matrix of static kernel increments:

\[\{ \kappa(x_s, y_t) - \kappa(x_{s-1}, y_t) - \kappa(x_s, y_{t-1}) + \kappa(x_{s-1}, y_{t-1}) \}_{1 \leq s \leq L_1, 1 \leq t \leq L_2}\]

as a tensor of shape (batch_size, length_1 - 1, length_2 - 1), where length_1 = \(L_1 + 1\) is the length of \(x\) and length_2 = \(L_2 + 1\) is the length of \(y\).

Parameters:
  • ctx (pysiglib.Context) – pysiglib.Context object for backpropagation

  • x (torch.Tensor) – Path \(x\) of shape (batch_size, length_1, dimension).

  • y (torch.Tensor) – Path \(y\) of shape (batch_size, length_2, dimension).

Returns:

Batch of gram matrices of shape (batch_size, length_1 - 1, length_2 - 1).

Return type:

torch.Tensor

abstractmethod grad_x(ctx, derivs)[source]#

Backpropagates derivs through the static kernel computation and returns the derivatives with respect to the path \(x\).

Parameters:
  • ctx (pysiglib.Context) – pysiglib.Context object for backpropagation

  • derivs (torch.Tensor) – Derivatives with respect to the Gram matrices returned by __call__, of shape (batch_size, length_1 - 1, length_2 - 1).

Returns:

Derivatives with respect to the path \(x\) of shape (batch_size, length_1, dimension).

Return type:

torch.Tensor

abstractmethod grad_y(ctx, derivs)[source]#

Backpropagates derivs through the static kernel computation and returns the derivatives with respect to the path \(y\).

Parameters:
  • ctx (pysiglib.Context) – pysiglib.Context object for backpropagation

  • derivs (torch.Tensor) – Derivatives with respect to the Gram matrices returned by __call__, of shape (batch_size, length_1 - 1, length_2 - 1).

Returns:

Derivatives with respect to the path \(y\) of shape (batch_size, length_2, dimension).

Return type:

torch.Tensor
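
When implementing grad_x and grad_y by hand, it may be worth comparing them against automatic differentiation before relying on them. The sketch below does this for the linear kernel without importing pysiglib: DummyContext is a hypothetical stand-in that only mimics save_for_backward and saved_tensors, and LinearIncrementKernel is a plain class mirroring the interface described above. Both are assumptions made for illustration and are not library classes.

```python
import torch


class DummyContext:
    """Hypothetical stand-in for pysiglib.Context, for illustration only."""

    def save_for_backward(self, *args):
        self.saved_tensors = args


class LinearIncrementKernel:
    """Plain-Python mirror of the StaticKernel interface for the linear kernel."""

    def __call__(self, ctx, x, y):
        dx = torch.diff(x, dim=1)
        dy = torch.diff(y, dim=1)
        ctx.save_for_backward(dx, dy)
        return torch.bmm(dx, dy.transpose(1, 2))

    def grad_x(self, ctx, derivs):
        dx, dy = ctx.saved_tensors
        # Row s of out is derivs[s - 1] - derivs[s], with out-of-range rows taken as zero.
        out = torch.zeros((dx.shape[0], dx.shape[1] + 1, dy.shape[1]),
                          dtype=derivs.dtype, device=derivs.device)
        out[:, 1:, :] += derivs
        out[:, :-1, :] -= derivs
        return torch.bmm(out, dy)

    def grad_y(self, ctx, derivs):
        dx, dy = ctx.saved_tensors
        # Column t of out is derivs[:, :, t - 1] - derivs[:, :, t], zero outside the range.
        out = torch.zeros((dx.shape[0], dx.shape[1], dy.shape[1] + 1),
                          dtype=derivs.dtype, device=derivs.device)
        out[:, :, 1:] += derivs
        out[:, :, :-1] -= derivs
        return torch.bmm(out.transpose(1, 2), dx)


torch.manual_seed(0)
x = torch.rand((2, 7, 3), dtype=torch.float64, requires_grad=True)
y = torch.rand((2, 9, 3), dtype=torch.float64, requires_grad=True)
derivs = torch.rand((2, 6, 8), dtype=torch.float64)

kernel = LinearIncrementKernel()
ctx = DummyContext()

# Manual backward pass through the hand-written gradient methods.
with torch.no_grad():
    kernel(ctx, x, y)
    manual_gx = kernel.grad_x(ctx, derivs)
    manual_gy = kernel.grad_y(ctx, derivs)

# Reference gradients from autograd on the same forward computation.
gram = kernel(DummyContext(), x, y)
auto_gx, auto_gy = torch.autograd.grad(gram, (x, y), grad_outputs=derivs)
```

The same comparison can be adapted to other custom kernels by replacing the forward computation and the two gradient methods.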


Citation#

If you found this library useful in your research, please consider citing the paper:

@article{shmelev2025pysiglib,
  title={pySigLib-Fast Signature-Based Computations on CPU and GPU},
  author={Shmelev, Daniil and Salvi, Cristopher},
  journal={arXiv preprint arXiv:2509.10613},
  year={2025}
}