pypto.experimental.gather_in_l1#

产品支持情况#

产品	是否支持
Ascend 950PR/Ascend 950DT	√
Atlas A3 训练系列产品/Atlas A3 推理系列产品	√
Atlas A2 训练系列产品/Atlas A2 推理系列产品	√

功能说明#

该接口为定制接口，约束较多。不保证稳定性。

从GM上的Tensor离散搬运指定行的数据，同时每行搬运前size个数据至L1。

函数原型#

gather_in_l1(src: Tensor, indices: Tensor, block_table: Tensor, block_size: int,
                 size: int, is_b_matrix: bool, is_trans: bool) -> Tensor

参数说明

参数名	输入/输出	说明
src	输入	源操作数。支持的数据类型为：DT_FP32, DT_FP16, DT_BF16, DT_INT8。不支持空Tensor，支持两维。
indices	输入	源操作数的行偏移。支持的数据类型为：DT_INT32, DT_INT64。不支持空Tensor，支持两维。 Shape形状为[1,n]。
block_table	输入	源操作数。支持的数据类型为DT_INT32。不支持空Tensor，支持两维。在实际使用中表示为 Page Attention 中的页表，形状为[1,block_table_size]，其中block_table_size表示页表的长度。
block_size	输入	源操作数。 int 类型。表示 Page Attention 中一个块可以放多少个token。
size	输入	每行搬运的数据数。数据数要小于源操作数的列数。
is_b_matrix	输入	搬运后的结果，即输出Tensor是否作为matmul的B矩阵。
is_trans	输入	搬运后的结果，即输出Tensor是否转置。

返回值说明#

返回输出Tensor

调用示例#

src = pypto.tensor([16, 32], pypto.DT_FP32, "tensor_src")
offset = pypto.tensor([1, 32], pypto.DT_INT32, "tensor_offset")
out = pypto.experimental.gather_in_l1(src , offset, 20, false, false)