Relative Positional Encoding RQ Presentation

2024. 10. 5. 20:04 · AI/LLM

Relative positional encoding (RPE) uses a word's relative position in a sentence rather than its absolute position.

In the first picture, an RPE bias based on the relative distance between tokens i and j is added to the Key and Value.

Clipping is applied beyond a certain distance k for computational efficiency.
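
As a minimal sketch of this clipping step (the function name and the value of k below are illustrative assumptions, not taken from the slides), the pairwise relative distance can be clipped to [-k, k] so that only 2k + 1 relative embeddings need to be stored:

import numpy as np

def clipped_relative_positions(seq_len, k):
    pos = np.arange(seq_len)
    rel = pos[None, :] - pos[:, None]  # relative distance j - i for every pair (i, j)
    return np.clip(rel, -k, k)         # distances beyond +-k all share one embedding

print(clipped_relative_positions(6, 2))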

The second picture shows an example of an RPE bias table of length 512.

Each token's bias with respect to itself is 0, and the bias increases or decreases with distance.
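
A small sketch of what such a distance-dependent bias can look like (the linear slope is an assumption made for illustration; the actual values of the 512-entry table in the slide may differ):

import numpy as np

def distance_bias(seq_len, slope=0.1):
    pos = np.arange(seq_len)
    rel = pos[None, :] - pos[:, None]   # relative distance j - i
    return -slope * np.abs(rel)         # 0 on the diagonal, magnitude grows with distance

print(distance_bias(6))  # a 6 x 6 slice instead of the full 512 x 512 table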

 

There have been several approaches to implementing RPE.

The first line is the result of expanding the absolute positional encoding in the attention score.

x is the content vector and p is the positional encoding bias.
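
Written out with the usual query and key projections W_Q and W_K (the projection notation is an assumption about the slide's formula, not shown in the text), the expansion is:

(x_i + p_i) W_Q ((x_j + p_j) W_K)^T
  = x_i W_Q W_K^T x_j^T + x_i W_Q W_K^T p_j^T + p_i W_Q W_K^T x_j^T + p_i W_Q W_K^T p_j^T

The first term is pure content-to-content attention, while the remaining terms involve the positional bias p; these are the terms that the relative variants rewrite.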

Some approaches change the absolute bias to a relative one, and some make the bias itself trainable.
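
As a rough sketch of the trainable-bias variant (a PyTorch example with assumed names and clipping distance k; the slides do not specify an implementation), a learnable scalar per head and per clipped relative distance can be added to the attention logits:

import torch
import torch.nn as nn

class TrainableRelativeBias(nn.Module):
    def __init__(self, k=64, num_heads=8):
        super().__init__()
        self.k = k
        # one learnable scalar per head for each clipped relative distance in [-k, k]
        self.bias = nn.Parameter(torch.zeros(num_heads, 2 * k + 1))
    def forward(self, scores):
        # scores: attention logits of shape (batch, num_heads, seq_len, seq_len)
        seq_len = scores.size(-1)
        pos = torch.arange(seq_len, device=scores.device)
        rel = (pos[None, :] - pos[:, None]).clamp(-self.k, self.k) + self.k
        return scores + self.bias[:, rel]  # broadcasts over the batch dimension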



RPE can preserve more contextual information because it uses the relative positional information between tokens.

It is also more robust to variable-length inputs.

However, RPE may require more computing resources than APE, for example when the bias is trainable.

 

 

Link to the full slides on positional encoding: https://docs.google.com/presentation/d/e/2PACX-1vSfM_sKpoMFZxixOwsDgd68HCUtjGPuhiaVnsh8xkk-527gTjin2E-sHBEglRw-qVOPoMxdFsRlGEMQ/pub?start=false&loop=false&delayms=3000