minerva.utils.position_embedding

Functions

get_1d_sincos_pos_embed_from_grid(embed_dim, pos)

embed_dim: output dimension for each position

get_2d_sincos_pos_embed(embed_dim, grid_size[, cls_token])

grid_size: int, the height and width of the (square) grid

get_2d_sincos_pos_embed_from_grid(embed_dim, grid)

interpolate_pos_embed(model, checkpoint_model[, ...])

Module Contents

minerva.utils.position_embedding.get_1d_sincos_pos_embed_from_grid(embed_dim, pos)

embed_dim: output dimension for each position
pos: a list of positions to be encoded, size (M,)
out: array of shape (M, D), where D = embed_dim
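The shapes above can be illustrated with a small dependency-free sketch. It assumes the common sine-cosine convention (frequencies 1/10000^(i/(D/2)), with sines in the first half of each vector and cosines in the second); the helper name `sincos_1d` is hypothetical and this is not necessarily how the library function is implemented.

```python
import math

def sincos_1d(embed_dim, positions):
    """Hypothetical helper: map M positions to an M x embed_dim table."""
    if embed_dim % 2 != 0:
        raise ValueError("embed_dim must be even")
    half = embed_dim // 2
    # Assumed frequency schedule: omega_i = 1 / 10000^(i / half)
    omega = [1.0 / (10000 ** (i / half)) for i in range(half)]
    table = []
    for p in positions:
        angles = [p * w for w in omega]
        # Assumed layout: sines first, then cosines
        table.append([math.sin(a) for a in angles] + [math.cos(a) for a in angles])
    return table

# Three positions (M = 3) encoded with D = 4 give a 3 x 4 table.
emb = sincos_1d(4, [0, 1, 2])
print(len(emb), len(emb[0]))
```

Position 0 maps to zeros in the sine half and ones in the cosine half, which is a quick sanity check when comparing against another implementation.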

minerva.utils.position_embedding.get_2d_sincos_pos_embed(embed_dim, grid_size, cls_token=False)

grid_size: int, the height and width of the (square) grid
return: pos_embed of shape [grid_size*grid_size, embed_dim] without cls_token, or [1+grid_size*grid_size, embed_dim] with cls_token

Parameters:
  • embed_dim (int)

  • grid_size (int)
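To make the two return shapes concrete, here is a minimal sketch of a 2D sine-cosine table in plain Python. Several details are assumptions rather than facts about the library: half of each vector encodes the column index and the other half the row index, the grid is flattened row by row, and the class token is a row of zeros placed first. The names `sincos_1d` and `sincos_2d` are hypothetical.

```python
import math

def sincos_1d(embed_dim, positions):
    """Hypothetical helper: M positions -> M x embed_dim (sines, then cosines)."""
    half = embed_dim // 2
    omega = [1.0 / (10000 ** (i / half)) for i in range(half)]
    table = []
    for p in positions:
        angles = [p * w for w in omega]
        table.append([math.sin(a) for a in angles] + [math.cos(a) for a in angles])
    return table

def sincos_2d(embed_dim, grid_size, cls_token=False):
    """Hypothetical helper: square grid -> (grid_size**2 [+1]) x embed_dim."""
    if embed_dim % 4 != 0:
        raise ValueError("embed_dim must be divisible by 4 in this sketch")
    # Row and column index of every cell, flattened row by row (assumed order).
    rows = [r for r in range(grid_size) for _ in range(grid_size)]
    cols = [c for _ in range(grid_size) for c in range(grid_size)]
    emb_cols = sincos_1d(embed_dim // 2, cols)
    emb_rows = sincos_1d(embed_dim // 2, rows)
    # Concatenate the two halves for each cell (assumed: columns first).
    table = [c + r for c, r in zip(emb_cols, emb_rows)]
    if cls_token:
        # Assumed: the class token gets an all-zero embedding in the first row.
        table = [[0.0] * embed_dim] + table
    return table

pos_embed = sincos_2d(8, 3)                  # 9 rows of length 8
with_cls = sincos_2d(8, 3, cls_token=True)   # 10 rows of length 8
print(len(pos_embed), len(with_cls), len(with_cls[0]))
```

With `cls_token=True` the table gains exactly one extra row, matching the `1+grid_size*grid_size` shape described above.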

minerva.utils.position_embedding.get_2d_sincos_pos_embed_from_grid(embed_dim, grid)

minerva.utils.position_embedding.interpolate_pos_embed(model, checkpoint_model, newsize1=None, newsize2=None)