minerva.utils.position_embedding
Functions
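- get_1d_sincos_pos_embed_from_grid(embed_dim, pos): embed_dim: output dimension for each position
- get_2d_sincos_pos_embed(embed_dim, grid_size, cls_token=False): grid_size: int or tuple/list of (grid_h, grid_w)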
Module Contents
- minerva.utils.position_embedding.get_1d_sincos_pos_embed_from_grid(embed_dim, pos)
- embed_dim: output dimension for each position
- pos: a list of positions to be encoded, size (M,)
- return:
out: (M, D), where D is embed_dim
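The shapes above can be illustrated with a minimal sketch of a 1D sine-cosine encoding. This is not taken from the minerva source: the function name `sincos_1d`, the base of 10000, and the layout of sines followed by cosines are assumptions borrowed from the common Transformer-style scheme, and plain Python lists stand in for arrays to keep the example dependency-free.

```python
import math


def sincos_1d(embed_dim, positions):
    """Hypothetical sketch: map M positions to M rows of length embed_dim."""
    if embed_dim % 2 != 0:
        raise ValueError("embed_dim must be even")
    half = embed_dim // 2
    # Assumed frequency schedule: geometric decay from 1 towards 1/10000.
    omega = [1.0 / (10000 ** (i / half)) for i in range(half)]
    rows = []
    for p in positions:
        angles = [p * w for w in omega]
        # Assumed layout: first half sines, second half cosines.
        rows.append([math.sin(a) for a in angles] + [math.cos(a) for a in angles])
    return rows


emb = sincos_1d(4, [0.0, 1.0, 2.0])
print(len(emb), len(emb[0]))  # M = 3 rows, D = 4 columns
```

Under these assumptions, position 0 always maps to zeros in the sine half and ones in the cosine half, which is a quick sanity check for any implementation of this scheme.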
- minerva.utils.position_embedding.get_2d_sincos_pos_embed(embed_dim, grid_size, cls_token=False)
- grid_size: int or tuple/list of (grid_h, grid_w)
If int, the grid is square
If tuple/list, grid_h and grid_w define a rectangular grid
- return:
pos_embed: [grid_h*grid_w, embed_dim], or [1+grid_h*grid_w, embed_dim] when cls_token is True
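The return shapes described above can be checked with a small sketch of a 2D sine-cosine table. Everything here is an assumption rather than minerva's actual code: the names `sincos_1d` and `sincos_2d` are hypothetical, half of the channels are assumed to encode the row index and half the column index, cells are assumed to be visited in row-major order, and the extra cls_token row is assumed to be an all-zero row placed first.

```python
import math


def sincos_1d(dim, positions):
    """Hypothetical helper: sines then cosines, one row per position."""
    half = dim // 2
    omega = [1.0 / (10000 ** (i / half)) for i in range(half)]
    return [
        [math.sin(p * w) for w in omega] + [math.cos(p * w) for w in omega]
        for p in positions
    ]


def sincos_2d(embed_dim, grid_size, cls_token=False):
    """Hypothetical sketch of a 2D table with grid_h * grid_w rows."""
    if embed_dim % 4 != 0:
        raise ValueError("embed_dim must be divisible by 4 in this sketch")
    # An int means a square grid; a tuple/list gives (grid_h, grid_w).
    if isinstance(grid_size, int):
        grid_h = grid_w = grid_size
    else:
        grid_h, grid_w = grid_size
    # Row-major coordinates for every cell (an assumed ordering).
    ys = [float(y) for y in range(grid_h) for _ in range(grid_w)]
    xs = [float(x) for _ in range(grid_h) for x in range(grid_w)]
    emb_y = sincos_1d(embed_dim // 2, ys)
    emb_x = sincos_1d(embed_dim // 2, xs)
    table = [ey + ex for ey, ex in zip(emb_y, emb_x)]
    if cls_token:
        # Assumed convention: one zero row prepended for the class token.
        table = [[0.0] * embed_dim] + table
    return table


square = sincos_2d(8, 3)
rect = sincos_2d(8, (2, 3), cls_token=True)
print(len(square), len(rect))  # 3*3 rows, and 1 + 2*3 rows
```

The rectangular case shows why the row count is written as grid_h*grid_w rather than a squared size: only the number of cells, plus the optional extra row, determines the first dimension.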