[Learning notes] reading "Attention is all you need" paper
**Encoder:** the part we studied with RNNs that reads the input sequence and "digests" it; it is responsible for creating and updating the hidden state. In a way it's like a person reading something and keeping the "gist" of it in mind.

**Decoder:** the part responsible for taking that "gist" (the hidden state, a mathematical vector) and using it to produce an output.

**Attention mechanism:** plays the role the hidden state plays in the RNN case, I guess.

**Dispensing with:** getting rid of.

**Pros of transformers:**
- Better results
- Parallelization
- Less time to train

**"Speedometer" reading for AI translation ability — BLEU (Bilingual Evaluation Understudy):** a math formula used as a metric to grade a machine's translation by comparing it with a translation written by a professional (a human, of course).
- 0.0 would mean there was no matching and the model produced a horribly wrong result.
- 100.0 (or 1.0) would mean a perfect match, but of course this can never really be the case, since we can say the exact same thing using lots of different words.
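To make the attention note above concrete, here is a minimal pure-Python sketch of scaled dot-product attention, the core operation the paper builds on (softmax of query-key scores, scaled by the square root of the key dimension, used to weight the values). This is an illustrative toy, not the paper's full multi-head implementation; the function names are my own.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """For each query vector: score it against every key (dot product,
    scaled by sqrt of key dimension), turn the scores into weights with
    softmax, and return the weighted sum of the value vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Weighted sum of value vectors, component by component.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs
```

When a query lines up strongly with one key, the softmax weights concentrate on that key and the output is close to the corresponding value vector — which is the "look back at the relevant part of the input" intuition.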
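The BLEU description above can be sketched too. This is a simplified single-sentence, single-reference version (clipped n-gram precisions combined by geometric mean, times a brevity penalty); real BLEU is normally computed over a whole corpus, often with multiple references and smoothing.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # All contiguous n-grams of a token list, as tuples.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU in [0, 1]:
    geometric mean of clipped n-gram precisions (n = 1..max_n)
    multiplied by a brevity penalty for short candidates."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(ngrams(candidate, n))
        ref = Counter(ngrams(reference, n))
        total = sum(cand.values())
        if total == 0:
            return 0.0
        # Clip each n-gram's count by how often it appears in the reference.
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty: punish candidates shorter than the reference.
    bp = 1.0 if len(candidate) >= len(reference) else \
        math.exp(1 - len(reference) / len(candidate))
    return bp * geo_mean
```

An identical candidate and reference score 1.0; a translation sharing no words scores 0.0 — and a perfectly good translation that happens to use different words also scores low, which is exactly the limitation noted above.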