Lecture Attention Neural Networks
Xavier Bresson
https://twitter.com/xbresson
Outline
Language Models
Memory Networks
Transformers
Sequence-To-Sequence Transformers
Transfer Learning
Conclusion
Language models
Input: Sequence of words
Output: Probability distribution over the dictionary/vocabulary
Data structure
Input is an ordered sequence.
Input length and output length can be variable.
RNNs are designed for sequences.
They learn a representation of the sequence that is independent of its length.
The recurrence formula summarizes the sequence with a vector h:
$h \leftarrow f_W(h, x)$
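As a minimal sketch of the recurrence above, the following plain-Python snippet summarizes a sequence of arbitrary length with one vector h and maps it to a probability distribution over a toy vocabulary. The form f_W(h, x) = tanh(W_h h + W_x x), the softmax output layer, the sizes and the random weights are all assumptions made for illustration; the slides only state that h is updated by f_W(h, x).

```python
import math
import random

random.seed(0)
d, vocab_size = 4, 5          # hypothetical hidden size and vocabulary size

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(M, v):
    # Matrix-vector product on plain lists.
    return [sum(m * x for m, x in zip(row, v)) for row in M]

W_h, W_x = rand_matrix(d, d), rand_matrix(d, d)   # recurrence weights (assumed form)
W_out = rand_matrix(vocab_size, d)                # output projection (assumed)

def f_W(h, x):
    # Assumed recurrence: h <- tanh(W_h h + W_x x)
    return [math.tanh(a + b) for a, b in zip(matvec(W_h, h), matvec(W_x, x))]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def language_model(xs):
    h = [0.0] * d
    for x in xs:              # the same update is applied whatever the length
        h = f_W(h, x)
    return softmax(matvec(W_out, h))

short_seq = [[random.gauss(0, 1) for _ in range(d)] for _ in range(3)]
long_seq = [[random.gauss(0, 1) for _ in range(d)] for _ in range(30)]
p_short, p_long = language_model(short_seq), language_model(long_seq)
print(len(p_short), round(sum(p_short), 6))
print(len(p_long), round(sum(p_long), 6))
```

Both sequences, of length 3 and 30, are reduced to a vector h of the same size, so the output distribution always has one entry per vocabulary word.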
Performance
Significant progress in NLP, but not a breakthrough.
Dominant in NLP for Machine Translation (MT), Q&A and summarization up to 2018.
Limitations
RNNs cannot learn long-term dependencies (no more than 50 steps).
Hard to train because they are non-linear dynamical systems:
any small perturbation can be amplified or vanish.
Slow to train because of their sequential nature (due to the recurrence mechanism).
This is an important limitation when training on large-scale datasets.
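The sensitivity to small perturbations can be illustrated on the simplest possible recurrence. The sketch below assumes a scalar linear recurrence h <- w * h, which is a deliberate simplification and not the RNN of the slides: after T steps, a perturbation eps of the initial state is scaled by w^T, so it vanishes when |w| < 1 and is amplified when |w| > 1.

```python
def propagate(h0, w, steps):
    """Apply the (assumed) scalar linear recurrence h <- w * h for `steps` steps."""
    h = h0
    for _ in range(steps):
        h = w * h
    return h

eps, steps = 1e-3, 50          # small perturbation of the initial state, 50 time steps

gaps = {}
for w in (0.9, 1.1):
    # Distance between the perturbed and unperturbed trajectories after `steps` steps.
    gaps[w] = abs(propagate(1.0 + eps, w, steps) - propagate(1.0, w, steps))
    print(f"w = {w}: perturbation after {steps} steps = {gaps[w]:.3e}")
```

With w = 0.9 the gap ends up far below eps, and with w = 1.1 it ends up far above it, which is the vanish/amplify behaviour named above.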
Memory networks
Memory networks (Weston-Chopra-Bordes, Meta 2015) are designed to respond to
questions with multi-step word matching:
Joe went to the kitchen
Joe picked up milk
Joe went to the bathroom
Joe put down the milk
Joe went to the bedroom
Question: Where is the milk?
Answer: bathroom
[Figure: the sentences are ordered in time; arrows (1) to (4) mark the successive matching steps.]
This matching process is also called multi-hop attention.
Multi-hop attention is a mechanism that performs multi-step reasoning.
[Figure: the same story and the query "Where is the milk?" are shown at Layer 0 to Layer 4; the matching steps (1) to (4) connect consecutive layers.]
Implementation
[Figure: attention mechanism with learnable parameters, drawn as a loop process (multi-hop attention layers): read memory, attention weights, write memory. The query q changes at each memory write. After multiple layers/hops, an MLP produces the task response.]
Hidden state controlling/learning the attention mechanism: $q \in \mathbb{R}^{1 \times d}$
Read memory (memory/query dot product): $\{ q f_K(x_0)^T, \dots, q f_K(x_{n-1})^T \}$, with $q f_K(x_i)^T \in \mathbb{R}$ and $f_K(x_i) \in \mathbb{R}^{1 \times d}$
Attention weights with Softmax: $\{ a_0, \dots, a_{n-1} \}$, $a_i \in \mathbb{R}$
Write memory (weighted sum of the value vectors): $q = \sum_i a_i \, f_V(x_i)$, with $f_V(x_i) \in \mathbb{R}^{1 \times d}$
Formalization
Hidden features: $x_i = f_E(w_i) \in \mathbb{R}^d$
Key: $K = \begin{bmatrix} f_K(x_0) \\ \vdots \\ f_K(x_{n-1}) \end{bmatrix} \in \mathbb{R}^{n \times d}$
Query: $q \in \mathbb{R}^{1 \times d}$
Value: $V = \begin{bmatrix} f_V(x_0) \\ \vdots \\ f_V(x_{n-1}) \end{bmatrix} \in \mathbb{R}^{n \times d}$
Memory update: $q \leftarrow aV = \mathrm{Softmax}(qK^T)\,V \in \mathbb{R}^{1 \times d}$, with $a = \mathrm{Softmax}(qK^T)$
Output: $s = \mathrm{MLP}(q) \in \mathbb{R}^{V}$
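The read/write loop formalized above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: f_K and f_V are taken to be linear maps, a single linear layer stands in for the MLP, and the sizes, the number of hops and the random weights are hypothetical.

```python
import math
import random

random.seed(0)
n, d, vocab_size, hops = 5, 4, 6, 3   # hypothetical sizes and number of hops

def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

X = rand_matrix(n, d)                 # memory: one feature row x_i per sentence
W_K, W_V = rand_matrix(d, d), rand_matrix(d, d)   # f_K and f_V assumed linear
W_out = rand_matrix(d, vocab_size)    # one linear layer stands in for the MLP

K = matmul(X, W_K)                    # keys,   n x d
V = matmul(X, W_V)                    # values, n x d
q = rand_matrix(1, d)                 # query (encoded question), 1 x d

for hop in range(hops):
    scores = matmul(q, transpose(K))[0]   # read memory: q K^T gives n scores
    a = softmax(scores)                   # attention weights over the memory
    q = matmul([a], V)                    # write memory: q <- a V, still 1 x d
    print(f"hop {hop + 1}: attention = {[round(w, 3) for w in a]}")

s = matmul(q, W_out)[0]               # task response: one score per vocabulary word
```

Each hop reuses the same memory (K, V) but a different query, which is what lets later hops attend to different sentences than earlier ones.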
Properties
This model can be seen as a differentiable memory computer: the memory operations read and write
are differentiable and can thus be used with backpropagation.
This network can update its memory by stacking multi-hop attention layers to perform
multi-step reasoning.
This model is based on the principle that intelligence requires an adaptive long-term memory,
unlike RNNs, which are limited to short-term memory.
This technique is a precursor of Transformers.
Self-attention mechanism
$H = X \in \mathbb{R}^{n \times d}$
Repeat K layers:
$H \leftarrow \mathrm{Softmax}(QK^T)\,V \in \mathbb{R}^{n \times d}$
[Figure: self-attention layer (attention/Transformer layer) mapping H to Softmax(QK^T)V, a context-to-word representation.]
The new data representation is a sum of all input data weighted by the pairwise matching (or attention) scores.
The subset of data with non-zero attention scores forms the context.
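A minimal plain-Python sketch of the stacked self-attention update follows. The slides do not define Q, K and V on this slide, so the snippet assumes the usual linear projections Q = H W_Q, K = H W_K, V = H W_V with fresh random weights per layer; these projections, the sizes and the name `num_layers` (used instead of K to avoid a clash with the key matrix) are assumptions for illustration.

```python
import math
import random

random.seed(0)
n, d, num_layers = 4, 3, 2            # hypothetical sizes

def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def softmax_rows(S):
    out = []
    for row in S:
        m = max(row)
        e = [math.exp(v - m) for v in row]
        total = sum(e)
        out.append([v / total for v in e])
    return out

def self_attention(H, W_Q, W_K, W_V):
    # Assumed projections of the current representation H.
    Q, K, V = matmul(H, W_Q), matmul(H, W_K), matmul(H, W_V)
    A = softmax_rows(matmul(Q, transpose(K)))   # n x n pairwise attention scores
    return matmul(A, V), A                      # each new row is a weighted sum of values

H = rand_matrix(n, d)                 # H = X, one row per word
for layer in range(num_layers):
    W_Q, W_K, W_V = (rand_matrix(d, d) for _ in range(3))
    H, A = self_attention(H, W_Q, W_K, W_V)
    print(f"layer {layer + 1}: attention row 0 = {[round(w, 3) for w in A[0]]}")
```

Row i of A holds the attention scores of word i over all words, so the words with non-negligible entries in that row play the role of its context.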
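Computational cost
[Figure: comparison on the sentence "he drives the car". Panel labels: RNNs (RNN cell, memory vector); CNNs (kernel size k; the pattern centered at word "drives" is the same pattern centered at word "the"); pairwise matching scores. Each panel is annotated with its computational cost.]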
[Figure: each word of the sequence of context words "Yesterday I went to … saw a" is mapped by U and combined with a positional encoding to give $g_1, g_2, g_3, g_4, \dots, g_9, g_{10}$; the sequence of context words is used to query the next word.]
$\mathrm{MHA}(q, K, V) = \left( \Vert_{h=1}^{H} \mathrm{HA}_h(q, K, V) \right) W^O$, $W^O \in \mathbb{R}^{d \times d}$, where $\Vert$ is the concatenation operation,
with $q = g_t \in \mathbb{R}^d$, $K = V = \{ g_t, g_{t-1}, \dots, g_{t-(L-1)} \} \in \mathbb{R}^{L \times d}$
$\mathrm{HA}_h(q, K, V) = \mathrm{Softmax}\left( \frac{q_h K_h^T}{\sqrt{d/H}} \right) V_h \in \mathbb{R}^{d/H}$
with $q_h = q W_h^q \in \mathbb{R}^{d/H}$, $W_h^q \in \mathbb{R}^{d \times d/H}$
$K_h = K W_h^K \in \mathbb{R}^{n \times d/H}$, $W_h^K \in \mathbb{R}^{d \times d/H}$
$V_h = V W_h^V \in \mathbb{R}^{n \times d/H}$, $W_h^V \in \mathbb{R}^{d \times d/H}$
(here n = L, the number of context vectors)
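The multi-head formulas above can be sketched as follows in plain Python. The sizes, the number of heads and the random weights are hypothetical, and the snippet handles a single query vector only; it is meant as a shape-level illustration rather than a reference implementation.

```python
import math
import random

random.seed(0)
L, d, num_heads = 6, 8, 2             # hypothetical context length, width, heads (H)
d_head = d // num_heads               # d / H features per head

def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

G = rand_matrix(L, d)                 # rows g_t, g_{t-1}, ..., g_{t-(L-1)}
q = [G[0]]                            # query q = g_t, stored as a 1 x d matrix
K = V = G                             # keys and values are the context vectors

# One (W_h^q, W_h^K, W_h^V) triple per head, each of shape d x d/H.
heads = [tuple(rand_matrix(d, d_head) for _ in range(3)) for _ in range(num_heads)]
W_O = rand_matrix(d, d)

def head_attention(q, K, V, W_q, W_K, W_V):
    q_h, K_h, V_h = matmul(q, W_q), matmul(K, W_K), matmul(V, W_V)
    scores = [s / math.sqrt(d_head) for s in matmul(q_h, transpose(K_h))[0]]
    a = softmax(scores)               # attention over the L context vectors
    return matmul([a], V_h)[0]        # vector of size d/H

def multi_head_attention(q, K, V):
    concat = []
    for W_q, W_K, W_V in heads:       # concatenate the H head outputs
        concat.extend(head_attention(q, K, V, W_q, W_K, W_V))
    return matmul([concat], W_O)[0]   # final projection by W^O, size d

out = multi_head_attention(q, K, V)
print(len(out))
```

Splitting d into H slices of size d/H keeps the total cost close to that of one full-width head while letting each head learn its own attention pattern.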
[Figure: attention to the context words and prediction of the next word. The context is $g_{t-(L-1)}, g_{t-(L-2)}, \dots, g_{t-1}$, the query is $q = g_t$, and the sequence length is L.]
Attention block
[Figure: attention block taking inputs with positional encoding and producing the output probability.]
Attention block layer (2017):
$\bar{h} = \mathrm{LN}\left( q + \mathrm{MHA}(q, K, V) \right) \in \mathbb{R}^d$
$h = \mathrm{LN}\left( \bar{h} + \mathrm{MLP}(\bar{h}) \right) \in \mathbb{R}^d$
$\mathrm{LN}(q) = a \odot \left( \frac{q - \mu}{\sigma} \right) + b \in \mathbb{R}^d$
with $\mu = \mathrm{Mean}(q) \in \mathbb{R}$, $\sigma = \mathrm{Std}(q) \in \mathbb{R}$, $a, b \in \mathbb{R}^d$
In 2019, LayerNorm (LN) was applied before the non-linear operations.
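The two placements of LayerNorm can be compared with a small plain-Python sketch. Several parts are assumptions for illustration: a single-head attention stands in for MHA, the MLP is a hypothetical linear-ReLU-linear map, a small constant is added to the standard deviation for numerical safety, and the pre-LN variant follows the commonly used form (normalize the inputs of each sub-layer, leave the residual path unnormalized), which may differ in detail from the 2019 formula of the lecture.

```python
import math
import random

random.seed(0)
L, d = 5, 4                           # hypothetical context length and width

def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def vecmat(v, M):
    # Row vector times matrix.
    return [sum(x * m for x, m in zip(v, col)) for col in zip(*M)]

def add(u, v):
    return [x + y for x, y in zip(u, v)]

a, b = [1.0] * d, [0.0] * d           # LN gain and bias (learnable in practice)

def layer_norm(q, eps=1e-5):
    mu = sum(q) / len(q)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in q) / len(q))
    return [ai * (x - mu) / (sigma + eps) + bi for ai, x, bi in zip(a, q, b)]

def attention(q, K, V):
    # Single-head stand-in for MHA(q, K, V).
    scores = [sum(x * y for x, y in zip(q, k)) / math.sqrt(d) for k in K]
    m = max(scores)
    e = [math.exp(s - m) for s in scores]
    w = [v / sum(e) for v in e]
    return [sum(wi * row[j] for wi, row in zip(w, V)) for j in range(d)]

W1, W2 = rand_matrix(d, 2 * d), rand_matrix(2 * d, d)

def mlp(h):
    return vecmat([max(0.0, x) for x in vecmat(h, W1)], W2)

def block_post_ln(q, K, V):
    # 2017 form: LN is applied after each residual sum.
    h_bar = layer_norm(add(q, attention(q, K, V)))
    return layer_norm(add(h_bar, mlp(h_bar)))

def block_pre_ln(q, K, V):
    # Assumed pre-LN form: LN is applied before each sub-layer.
    K_n, V_n = [layer_norm(k) for k in K], [layer_norm(v) for v in V]
    h_bar = add(q, attention(layer_norm(q), K_n, V_n))
    return add(h_bar, mlp(layer_norm(h_bar)))

G = rand_matrix(L, d)
q, K, V = G[0], G, G
h_post, h_pre = block_post_ln(q, K, V), block_pre_ln(q, K, V)
print([round(x, 3) for x in h_post])
print([round(x, 3) for x in h_pre])
```

In the post-LN block the output itself is normalized, whereas in the pre-LN sketch the residual path carries q through unchanged, which is the practical difference between the two designs.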
Residual/skip connection:
[Figure: residual block in which the input $x$ is added to the output of $f_W(\cdot)$.]
Forward pass: $y = x + f_W(x)$
@L @L @y
<latexit sha1_base64="PkznhpaZaBRFnH4DEtJiO1Hs0lE=">AAACBnicbZDLSsNAFIZPvNZ6i7oUYbAIrkpSRF0W3LhwUcFeoAllMp20QycXZiZCCFm58VXcuFDErc/gzrdx0gbU1h8GPv5zzsyc34s5k8qyvoyl5ZXVtfXKRnVza3tn19zb78goEYS2ScQj0fOwpJyFtK2Y4rQXC4oDj9OuN7kq6t17KiSLwjuVxtQN8ChkPiNYaWtgHjm+wCRzYiwUwxzd5D+c5gOzZtWtqdAi2CXUoFRrYH46w4gkAQ0V4VjKvm3Fys2KCwmnedVJJI0xmeAR7WsMcUClm03XyNGJdobIj4Q+oUJT9/dEhgMp08DTnQFWYzlfK8z/av1E+ZduxsI4UTQks4f8hCMVoSITNGSCEsVTDZgIpv+KyBjrXJROrqpDsOdXXoROo26f1+3bs1qzUcZRgUM4hlOw4QKacA0taAOBB3iCF3g1Ho1n4814n7UuGeXMAfyR8fENwiKZQg==</latexit>
· + fW (·)
<latexit sha1_base64="USvj3oi+twil8YeX/h1Fw+zl6ko=">AAAB+3icbZDLSsNAFIZPvNZ6i3XpZrAIFaEkRdRlwY3LCvYCbQiTyaQdOpmEmYlYSl/FjQtF3Poi7nwbp2kW2vrDwMd/zuGc+YOUM6Ud59taW9/Y3Nou7ZR39/YPDu2jSkclmSS0TRKeyF6AFeVM0LZmmtNeKimOA067wfh2Xu8+UqlYIh70JKVejIeCRYxgbSzfrgxImGh0gSK/W8v53LerTt3JhVbBLaAKhVq+/TUIE5LFVGjCsVJ910m1N8VSM8LprDzIFE0xGeMh7RsUOKbKm+a3z9CZcUIUJdI8oVHu/p6Y4lipSRyYzhjrkVquzc3/av1MRzfelIk001SQxaIo40gnaB4ECpmkRPOJAUwkM7ciMsISE23iKpsQ3OUvr0KnUXev6u79ZbXZKOIowQmcQg1cuIYm3EEL2kDgCZ7hFd6smfVivVsfi9Y1q5g5hj+yPn8AOneTOQ==</latexit>
@L
= Positional Positional
@x @y @x @y Encoding
Backward pass Encoding
@L
= Id + rx fw
@y
@L @L
= + rx f w Sequence of Query the
@y @y context words next word
No vanishing gradient
for residual connection
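The effect of the identity term in the backward pass can be illustrated with a small scalar experiment. This is only a sketch under stated assumptions: the layer function $f_w(x) = w \tanh(x)$, the depth of 50 and the name `grad_through_stack` are chosen for illustration. In the scalar case Id is 1 and $\nabla_x f_W$ is $f_w'(x)$.

```python
import math

def grad_through_stack(x, w, depth, residual):
    """Return d(output)/d(input) for `depth` stacked scalar layers.

    Each layer uses the toy function f_w(x) = w * tanh(x) (an assumption).
    With residual=True the layer is y = x + f_w(x), so dy/dx = 1 + f_w'(x);
    with residual=False it is y = f_w(x), so dy/dx = f_w'(x).
    """
    grad = 1.0
    for _ in range(depth):
        fx = w * math.tanh(x)
        dfx = w * (1.0 - math.tanh(x) ** 2)   # f_w'(x)
        if residual:
            x, local = x + fx, 1.0 + dfx
        else:
            x, local = fx, dfx
        grad *= local                          # chain rule across layers
    return grad

plain = grad_through_stack(0.5, 0.1, 50, residual=False)
resid = grad_through_stack(0.5, 0.1, 50, residual=True)
```

With these toy values the gradient of the plain stack shrinks by roughly a factor of ten per layer, while every factor of the residual stack is at least 1, so the product cannot vanish.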
Positional encoding
Transformers are designed to process sets of vectors but
items in a set are not ordered.
This is an issue for NLP tasks.
An additional ordering feature is required to inject
causal ordering in the attention mechanism.
Two classes of Positional Encoding (PE) :
Learnable vs non-learnable PE
Learnable PE :
Embedding of the discrete ordering index 0, 1, 2, 3, …, L-1,
where L is the sequence length.
Two issues :
Positional encoding
Non-learnable PE :
Continuous ordering with sin and cos functions.
Advantages :
No training necessary
No need to know the maximum length in the train set.
Test sequences may have lengths not present in the train set.
$\mathrm{PE}_t \in \mathbb{R}^d$ is defined as
$\mathrm{PE}_{t,i} = \begin{cases} \sin(2\pi f_i t) & \text{if } i \text{ is even,} \\ \cos(2\pi f_i t) & \text{if } i \text{ is odd,} \end{cases} \quad \text{with } f_i = \frac{10{,}000^{-2\lfloor i/2 \rfloor / d}}{2\pi}.$
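A minimal Python sketch of the non-learnable encoding follows. It assumes the standard sinusoidal frequencies of the original Transformer, i.e. $2\pi f_i = 10{,}000^{-2\lfloor i/2 \rfloor / d}$, so that each sin/cos pair shares one frequency. The function name and the `base` parameter are illustrative.

```python
import math

def positional_encoding(t, d, base=10000.0):
    """Return PE_t in R^d: sin on even indices, cos on odd indices."""
    pe = []
    for i in range(d):
        omega = base ** (-2 * (i // 2) / d)   # angular frequency 2*pi*f_i (assumed form)
        pe.append(math.sin(omega * t) if i % 2 == 0 else math.cos(omega * t))
    return pe

pe0 = positional_encoding(0, 4)   # position 0: sin(0) = 0, cos(0) = 1
pe1 = positional_encoding(1, 4)   # position 1
```

Because `t` enters only through sin and cos, the function can be evaluated for any position, including positions longer than any sequence seen in the train set.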
Classification layer
The last layer is a standard linear layer to compute the scores $s$ of the next word in the vocabulary, followed by a softmax that gives the output probability $p$ for the next word :
$s = \mathrm{MLP}(h) \in \mathbb{R}^V$
$p = \mathrm{Softmax}(s) \in \mathbb{R}^V$
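The classification layer can be sketched in plain Python as follows. This is an illustration under assumptions: the score function is taken to be a single linear map $Wh + b$ (the slide writes MLP(h)), and the names `linear`, `softmax`, `W` and `b` are hypothetical. Subtracting the maximum score before exponentiating is a standard numerical-stability step and does not change the result.

```python
import math

def linear(h, W, b):
    """Scores s = W h + b, with W holding one row per vocabulary entry."""
    return [sum(w * x for w, x in zip(row, h)) + b_v for row, b_v in zip(W, b)]

def softmax(s):
    """Turn scores into a probability distribution over the vocabulary."""
    m = max(s)                                # shift for numerical stability
    exps = [math.exp(x - m) for x in s]
    z = sum(exps)
    return [e / z for e in exps]

# Toy example: hidden size d = 2, vocabulary size V = 3.
h = [1.0, 2.0]
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.0, 0.0]
s = linear(h, W, b)
p = softmax(s)
```

The entries of `p` are positive and sum to 1, and the word with the largest score receives the largest probability.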
Efficient training
Attention matrix
Step 1 : Compute the attention matrix $A$, where $A_{ij}$ is the dot product between the word vector $q_i$
and the word vector $k_j$.
Two similar vectors receive a high value and, conversely, two dissimilar ones receive a low value.
[Figure: attention matrix $A$ for the sequence "Yesterday I … saw a", with rows indexed by $i$, columns indexed by $j$, and entries $q_i^\top k_j$.]
$A = QK^\top \in \mathbb{R}^{L \times L}$
$A_{ij} = q_i^\top k_j \in \mathbb{R}$
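Step 1 can be sketched in a few lines of plain Python. The function name `attention_matrix` and the toy vectors are illustrative assumptions. `Q` and `K` are lists of L row vectors, so the nested comprehension computes $A = QK^\top$ entry by entry.

```python
def attention_matrix(Q, K):
    """A[i][j] = dot(Q[i], K[j]), i.e. A = Q K^T for lists of row vectors."""
    return [[sum(qi * kj for qi, kj in zip(q, k)) for k in K] for q in Q]

# Toy example with L = 3 words and vectors of dimension 2.
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
A = attention_matrix(Q, K)
```

In this toy example aligned vectors such as `Q[0]` and `K[0]` give a high value, while the opposed pair `Q[2]` and `K[2]` gives the lowest entry of the matrix.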
Masked attention
[Figure: the attention matrix $A$ (rows labelled Yesterday, I, …, saw) combined with a mask matrix by pointwise multiplication.]
qk
<latexit sha1_base64="sx8aqgD1cIeIGNAHGJRIIe7slBA=">AAAB6nicbVBNS8NAEJ3Urxq/qh69LBbBg5REpHosePFY0X5AG8pmu2mXbnbD7kYooT/BiwdFvPqLvPlv3LQ5aOuDgcd7M8zMCxPOtPG8b6e0tr6xuVXednd29/YPKodHbS1TRWiLSC5VN8SaciZoyzDDaTdRFMchp51wcpv7nSeqNJPi0UwTGsR4JFjECDZWevBdd1CpejVvDrRK/IJUoUBzUPnqDyVJYyoM4Vjrnu8lJsiwMoxwOnP7qaYJJhM8oj1LBY6pDrL5qTN0ZpUhiqSyJQyaq78nMhxrPY1D2xljM9bLXi7+5/VSE90EGRNJaqggi0VRypGRKP8bDZmixPCpJZgoZm9FZIwVJsamk4fgL7+8StqXNb9eq99fVRsXRRxlOIFTOAcfrqEBd9CEFhAYwTO8wpvDnRfn3flYtJacYuYY/sD5/AHgF4zS</latexit>
<latexit sha1_base64="sx8aqgD1cIeIGNAHGJRIIe7slBA=">AAAB6nicbVBNS8NAEJ3Urxq/qh69LBbBg5REpHosePFY0X5AG8pmu2mXbnbD7kYooT/BiwdFvPqLvPlv3LQ5aOuDgcd7M8zMCxPOtPG8b6e0tr6xuVXednd29/YPKodHbS1TRWiLSC5VN8SaciZoyzDDaTdRFMchp51wcpv7nSeqNJPi0UwTGsR4JFjECDZWevBdd1CpejVvDrRK/IJUoUBzUPnqDyVJYyoM4Vjrnu8lJsiwMoxwOnP7qaYJJhM8oj1LBY6pDrL5qTN0ZpUhiqSyJQyaq78nMhxrPY1D2xljM9bLXi7+5/VSE90EGRNJaqggi0VRypGRKP8bDZmixPCpJZgoZm9FZIwVJsamk4fgL7+8StqXNb9eq99fVRsXRRxlOIFTOAcfrqEBd9CEFhAYwTO8wpvDnRfn3flYtJacYuYY/sD5/AHgF4zS</latexit>
<latexit sha1_base64="p2wCpVEKraZO55cbiWrKB/TDMbg=">AAAB6XicbVBNS8NAEJ3Ur1q/qh69LBbBU0lEqseCF49V7Ae0oWy2k3bpZhN3N0IJ/QdePCji1X/kzX/jts1BWx8MPN6bYWZekAiujet+O4W19Y3NreJ2aWd3b/+gfHjU0nGqGDZZLGLVCahGwSU2DTcCO4lCGgUC28H4Zua3n1BpHssHM0nQj+hQ8pAzaqx0/zjulytu1Z2DrBIvJxXI0eiXv3qDmKURSsME1brruYnxM6oMZwKnpV6qMaFsTIfYtVTSCLWfzS+dkjOrDEgYK1vSkLn6eyKjkdaTKLCdETUjvezNxP+8bmrCaz/jMkkNSrZYFKaCmJjM3iYDrpAZMbGEMsXtrYSNqKLM2HBKNgRv+eVV0rqoerVq7e6yUq/lcRThBE7hHDy4gjrcQgOawCCEZ3iFN2fsvDjvzseiteDkM8fwB87nD6MjjWk=</latexit>
qk qk
<latexit sha1_base64="sx8aqgD1cIeIGNAHGJRIIe7slBA=">AAAB6nicbVBNS8NAEJ3Urxq/qh69LBbBg5REpHosePFY0X5AG8pmu2mXbnbD7kYooT/BiwdFvPqLvPlv3LQ5aOuDgcd7M8zMCxPOtPG8b6e0tr6xuVXednd29/YPKodHbS1TRWiLSC5VN8SaciZoyzDDaTdRFMchp51wcpv7nSeqNJPi0UwTGsR4JFjECDZWevBdd1CpejVvDrRK/IJUoUBzUPnqDyVJYyoM4Vjrnu8lJsiwMoxwOnP7qaYJJhM8oj1LBY6pDrL5qTN0ZpUhiqSyJQyaq78nMhxrPY1D2xljM9bLXi7+5/VSE90EGRNJaqggi0VRypGRKP8bDZmixPCpJZgoZm9FZIwVJsamk4fgL7+8StqXNb9eq99fVRsXRRxlOIFTOAcfrqEBd9CEFhAYwTO8wpvDnRfn3flYtJacYuYY/sD5/AHgF4zS</latexit>
<latexit sha1_base64="JR5w8iGbxoaP1JzolmcCEj+dNyg=">AAAB8XicbVBNS8NAEJ34WeNX1aOXxSJ40JKIVI8FLx4r2A9sQ9lsN+3SzSbsToQS+i+8eFDEq//Gm//GpM1BWx8MPN6bYWaeH0th0HG+rZXVtfWNzdKWvb2zu7dfPjhsmSjRjDdZJCPd8anhUijeRIGSd2LNaehL3vbHt7nffuLaiEg94CTmXkiHSgSCUcykxwvSEyrAiW33yxWn6sxAlolbkAoUaPTLX71BxJKQK2SSGtN1nRi9lGoUTPKp3UsMjykb0yHvZlTRkBsvnV08JaeZMiBBpLNSSGbq74mUhsZMQj/rDCmOzKKXi/953QSDGy8VKk6QKzZfFCSSYETy98lAaM5QTjJCmRbZrYSNqKYMs5DyENzFl5dJ67Lq1qq1+6tK/byIowTHcAJn4MI11OEOGtAEBgqe4RXeLGO9WO/Wx7x1xSpmjuAPrM8f7aWPug==</latexit>
a qk qk a 1 1 1
<latexit sha1_base64="sx8aqgD1cIeIGNAHGJRIIe7slBA=">AAAB6nicbVBNS8NAEJ3Urxq/qh69LBbBg5REpHosePFY0X5AG8pmu2mXbnbD7kYooT/BiwdFvPqLvPlv3LQ5aOuDgcd7M8zMCxPOtPG8b6e0tr6xuVXednd29/YPKodHbS1TRWiLSC5VN8SaciZoyzDDaTdRFMchp51wcpv7nSeqNJPi0UwTGsR4JFjECDZWevBdd1CpejVvDrRK/IJUoUBzUPnqDyVJYyoM4Vjrnu8lJsiwMoxwOnP7qaYJJhM8oj1LBY6pDrL5qTN0ZpUhiqSyJQyaq78nMhxrPY1D2xljM9bLXi7+5/VSE90EGRNJaqggi0VRypGRKP8bDZmixPCpJZgoZm9FZIwVJsamk4fgL7+8StqXNb9eq99fVRsXRRxlOIFTOAcfrqEBd9CEFhAYwTO8wpvDnRfn3flYtJacYuYY/sD5/AHgF4zS</latexit>
qk 1 1
<latexit sha1_base64="p2wCpVEKraZO55cbiWrKB/TDMbg=">AAAB6XicbVBNS8NAEJ3Ur1q/qh69LBbBU0lEqseCF49V7Ae0oWy2k3bpZhN3N0IJ/QdePCji1X/kzX/jts1BWx8MPN6bYWZekAiujet+O4W19Y3NreJ2aWd3b/+gfHjU0nGqGDZZLGLVCahGwSU2DTcCO4lCGgUC28H4Zua3n1BpHssHM0nQj+hQ8pAzaqx0/zjulytu1Z2DrBIvJxXI0eiXv3qDmKURSsME1brruYnxM6oMZwKnpV6qMaFsTIfYtVTSCLWfzS+dkjOrDEgYK1vSkLn6eyKjkdaTKLCdETUjvezNxP+8bmrCaz/jMkkNSrZYFKaCmJjM3iYDrpAZMbGEMsXtrYSNqKLM2HBKNgRv+eVV0rqoerVq7e6yUq/lcRThBE7hHDy4gjrcQgOawCCEZ3iFN2fsvDjvzseiteDkM8fwB87nD6MjjWk=</latexit> <latexit sha1_base64="p2wCpVEKraZO55cbiWrKB/TDMbg=">AAAB6XicbVBNS8NAEJ3Ur1q/qh69LBbBU0lEqseCF49V7Ae0oWy2k3bpZhN3N0IJ/QdePCji1X/kzX/jts1BWx8MPN6bYWZekAiujet+O4W19Y3NreJ2aWd3b/+gfHjU0nGqGDZZLGLVCahGwSU2DTcCO4lCGgUC28H4Zua3n1BpHssHM0nQj+hQ8pAzaqx0/zjulytu1Z2DrBIvJxXI0eiXv3qDmKURSsME1brruYnxM6oMZwKnpV6qMaFsTIfYtVTSCLWfzS+dkjOrDEgYK1vSkLn6eyKjkdaTKLCdETUjvezNxP+8bmrCaz/jMkkNSrZYFKaCmJjM3iYDrpAZMbGEMsXtrYSNqKLM2HBKNgRv+eVV0rqoerVq7e6yUq/lcRThBE7hHDy4gjrcQgOawCCEZ3iFN2fsvDjvzseiteDkM8fwB87nD6MjjWk=</latexit>
<latexit sha1_base64="p2wCpVEKraZO55cbiWrKB/TDMbg=">AAAB6XicbVBNS8NAEJ3Ur1q/qh69LBbBU0lEqseCF49V7Ae0oWy2k3bpZhN3N0IJ/QdePCji1X/kzX/jts1BWx8MPN6bYWZekAiujet+O4W19Y3NreJ2aWd3b/+gfHjU0nGqGDZZLGLVCahGwSU2DTcCO4lCGgUC28H4Zua3n1BpHssHM0nQj+hQ8pAzaqx0/zjulytu1Z2DrBIvJxXI0eiXv3qDmKURSsME1brruYnxM6oMZwKnpV6qMaFsTIfYtVTSCLWfzS+dkjOrDEgYK1vSkLn6eyKjkdaTKLCdETUjvezNxP+8bmrCaz/jMkkNSrZYFKaCmJjM3iYDrpAZMbGEMsXtrYSNqKLM2HBKNgRv+eVV0rqoerVq7e6yUq/lcRThBE7hHDy4gjrcQgOawCCEZ3iFN2fsvDjvzseiteDkM8fwB87nD6MjjWk=</latexit>
qk qk
<latexit sha1_base64="p2wCpVEKraZO55cbiWrKB/TDMbg=">AAAB6XicbVBNS8NAEJ3Ur1q/qh69LBbBU0lEqseCF49V7Ae0oWy2k3bpZhN3N0IJ/QdePCji1X/kzX/jts1BWx8MPN6bYWZekAiujet+O4W19Y3NreJ2aWd3b/+gfHjU0nGqGDZZLGLVCahGwSU2DTcCO4lCGgUC28H4Zua3n1BpHssHM0nQj+hQ8pAzaqx0/zjulytu1Z2DrBIvJxXI0eiXv3qDmKURSsME1brruYnxM6oMZwKnpV6qMaFsTIfYtVTSCLWfzS+dkjOrDEgYK1vSkLn6eyKjkdaTKLCdETUjvezNxP+8bmrCaz/jMkkNSrZYFKaCmJjM3iYDrpAZMbGEMsXtrYSNqKLM2HBKNgRv+eVV0rqoerVq7e6yUq/lcRThBE7hHDy4gjrcQgOawCCEZ3iFN2fsvDjvzseiteDkM8fwB87nD6MjjWk=</latexit> <latexit sha1_base64="p2wCpVEKraZO55cbiWrKB/TDMbg=">AAAB6XicbVBNS8NAEJ3Ur1q/qh69LBbBU0lEqseCF49V7Ae0oWy2k3bpZhN3N0IJ/QdePCji1X/kzX/jts1BWx8MPN6bYWZekAiujet+O4W19Y3NreJ2aWd3b/+gfHjU0nGqGDZZLGLVCahGwSU2DTcCO4lCGgUC28H4Zua3n1BpHssHM0nQj+hQ8pAzaqx0/zjulytu1Z2DrBIvJxXI0eiXv3qDmKURSsME1brruYnxM6oMZwKnpV6qMaFsTIfYtVTSCLWfzS+dkjOrDEgYK1vSkLn6eyKjkdaTKLCdETUjvezNxP+8bmrCaz/jMkkNSrZYFKaCmJjM3iYDrpAZMbGEMsXtrYSNqKLM2HBKNgRv+eVV0rqoerVq7e6yUq/lcRThBE7hHDy4gjrcQgOawCCEZ3iFN2fsvDjvzseiteDkM8fwB87nD6MjjWk=</latexit>
<latexit sha1_base64="sx8aqgD1cIeIGNAHGJRIIe7slBA=">AAAB6nicbVBNS8NAEJ3Urxq/qh69LBbBg5REpHosePFY0X5AG8pmu2mXbnbD7kYooT/BiwdFvPqLvPlv3LQ5aOuDgcd7M8zMCxPOtPG8b6e0tr6xuVXednd29/YPKodHbS1TRWiLSC5VN8SaciZoyzDDaTdRFMchp51wcpv7nSeqNJPi0UwTGsR4JFjECDZWevBdd1CpejVvDrRK/IJUoUBzUPnqDyVJYyoM4Vjrnu8lJsiwMoxwOnP7qaYJJhM8oj1LBY6pDrL5qTN0ZpUhiqSyJQyaq78nMhxrPY1D2xljM9bLXi7+5/VSE90EGRNJaqggi0VRypGRKP8bDZmixPCpJZgoZm9FZIwVJsamk4fgL7+8StqXNb9eq99fVRsXRRxlOIFTOAcfrqEBd9CEFhAYwTO8wpvDnRfn3flYtJacYuYY/sD5/AHgF4zS</latexit>
<latexit sha1_base64="sx8aqgD1cIeIGNAHGJRIIe7slBA=">AAAB6nicbVBNS8NAEJ3Urxq/qh69LBbBg5REpHosePFY0X5AG8pmu2mXbnbD7kYooT/BiwdFvPqLvPlv3LQ5aOuDgcd7M8zMCxPOtPG8b6e0tr6xuVXednd29/YPKodHbS1TRWiLSC5VN8SaciZoyzDDaTdRFMchp51wcpv7nSeqNJPi0UwTGsR4JFjECDZWevBdd1CpejVvDrRK/IJUoUBzUPnqDyVJYyoM4Vjrnu8lJsiwMoxwOnP7qaYJJhM8oj1LBY6pDrL5qTN0ZpUhiqSyJQyaq78nMhxrPY1D2xljM9bLXi7+5/VSE90EGRNJaqggi0VRypGRKP8bDZmixPCpJZgoZm9FZIwVJsamk4fgL7+8StqXNb9eq99fVRsXRRxlOIFTOAcfrqEBd9CEFhAYwTO8wpvDnRfn3flYtJacYuYY/sD5/AHgF4zS</latexit>
i i
<latexit sha1_base64="/567GNfqs6hI1+r9uu/P1P0myJ4=">AAAB6HicbVBNS8NAEJ34WetX1aOXxSJ4KkmR6rHgxWML9gPaUDbbSbt2swm7G6GE/gIvHhTx6k/y5r9x2+agrQ8GHu/NMDMvSATXxnW/nY3Nre2d3cJecf/g8Oi4dHLa1nGqGLZYLGLVDahGwSW2DDcCu4lCGgUCO8Hkbu53nlBpHssHM03Qj+hI8pAzaqzU5INS2a24C5B14uWkDDkag9JXfxizNEJpmKBa9zw3MX5GleFM4KzYTzUmlE3oCHuWShqh9rPFoTNyaZUhCWNlSxqyUH9PZDTSehoFtjOiZqxXvbn4n9dLTXjrZ1wmqUHJlovCVBATk/nXZMgVMiOmllCmuL2VsDFVlBmbTdGG4K2+vE7a1YpXq9Sa1+V6NY+jAOdwAVfgwQ3U4R4a0AIGCM/wCm/Oo/PivDsfy9YNJ585gz9wPn8Azj+M6A==</latexit>
<latexit sha1_base64="/567GNfqs6hI1+r9uu/P1P0myJ4=">AAAB6HicbVBNS8NAEJ34WetX1aOXxSJ4KkmR6rHgxWML9gPaUDbbSbt2swm7G6GE/gIvHhTx6k/y5r9x2+agrQ8GHu/NMDMvSATXxnW/nY3Nre2d3cJecf/g8Oi4dHLa1nGqGLZYLGLVDahGwSW2DDcCu4lCGgUCO8Hkbu53nlBpHssHM03Qj+hI8pAzaqzU5INS2a24C5B14uWkDDkag9JXfxizNEJpmKBa9zw3MX5GleFM4KzYTzUmlE3oCHuWShqh9rPFoTNyaZUhCWNlSxqyUH9PZDTSehoFtjOiZqxXvbn4n9dLTXjrZ1wmqUHJlovCVBATk/nXZMgVMiOmllCmuL2VsDFVlBmbTdGG4K2+vE7a1YpXq9Sa1+V6NY+jAOdwAVfgwQ3U4R4a0AIGCM/wCm/Oo/PivDsfy9YNJ585gz9wPn8Azj+M6A==</latexit>
<latexit sha1_base64="hq8ds1GzNxndWdr5mfVf8jEt2wI=">AAACh3icbVFNb9QwEHVCgdZ8LXDkYrELQkhsk1ItvSAVceGCVCS2rbRerRxnsuuu40T2pCWK8lf4Udz4NzjbIJUtI1l6ejPvjWcmKbVyGEW/g/DOzt1793f36IOHjx4/GTx9duqKykqYykIX9jwRDrQyMEWFGs5LCyJPNJwl689d/uwSrFOF+Y51CfNcLI3KlBToqcXgJ0f4gTZvvgq3bheNumg/Mso1ZMgbyhNYKtMIa0XdNlrrlsbsNfurYSpjAhFM58USwCsAw0ZqxIRJ2ehixFrO6TvGlcmw3hKa4oa2pRxM2jei3KrlCseU0sVgGI2jTbDbIO7BkPRxshj84mkhq9z7Si2cm8VRiXNvjEpq8NaVg1LItVjCzEMjcnDzZrPHlr3yTMqywvpnkG3Ym4pG5M7VeeIrc4Ert53ryP/lZhVmR/NGmbLyA8vrRlmlGRasOwpLlQWJuvZASKv8X5lcCSsk+tN1S4i3R74NTg/G8WQ8+XY4PD7o17FLXpCX5A2JyQdyTL6QEzIlMtgJ3gbvg8NwL9wPJ+HRdWkY9Jrn5J8IP/0Bs9DBow==</latexit>
⇢
<latexit sha1_base64="pm30OV/CIeIa8nSQbMezpf4a7MY=">AAACDHicbVDLSsNAFJ34rPVVdelmsAiuSlKkuhEqbgS7aKUvaNIymU7aoZNJmJkIJeQD3Pgrblwo4tYPcOffOGmz0NYDFw7n3Mu997gho1KZ5rexsrq2vrGZ28pv7+zu7RcODtsyiAQmLRywQHRdJAmjnLQUVYx0Q0GQ7zLScSc3qd95IELSgDfVNCSOj0acehQjpaVBoXgNr2Djrt+ENuXQ9pEau258n/Tjmq2oTySsJbrLLJkzwGViZaQIMtQHhS97GODIJ1xhhqTsWWaonBgJRTEjSd6OJAkRnqAR6WnKkd7jxLNnEniqlSH0AqGLKzhTf0/EyJdy6ru6M71WLnqp+J/Xi5R36cSUh5EiHM8XeRGDKoBpMnBIBcGKTTVBWFB9K8RjJBBWOr+8DsFafHmZtMslq1KqNM6L1XIWRw4cgxNwBixwAargFtRBC2DwCJ7BK3gznowX4934mLeuGNnMEfgD4/MHsPSaIw==</latexit>
Masked attention
(
(
Yesterday qk 1 1 1 1 1 0 0 0 0
<latexit sha1_base64="JZSNh8HMqbgLWiOy9tGv37FCpyg=">AAAB/HicbVBNS8NAEN3Ur1q/oj16CRbBU0lEqseCF48V7Ac0oWw2m3bpZpPuToQQ4l/x4kERr/4Qb/4bt20O2vpg4PHeDDPz/IQzBbb9bVQ2Nre2d6q7tb39g8Mj8/ikp+JUEtolMY/lwMeKciZoFxhwOkgkxZHPad+f3s79/iOVisXiAbKEehEeCxYygkFLI7PuhhKTfDYtclfNJORBUYzMht20F7DWiVOSBirRGZlfbhCTNKICCMdKDR07AS/HEhjhtKi5qaIJJlM8pkNNBY6o8vLF8YV1rpXACmOpS4C1UH9P5DhSKot83RlhmKhVby7+5w1TCG+8nIkkBSrIclGYcgtia56EFTBJCfBME0wk07daZIJ1GqDzqukQnNWX10nvsum0mq37q0a7VcZRRafoDF0gB12jNrpDHdRFBGXoGb2iN+PJeDHejY9la8UoZ+roD4zPHxT9la4=</latexit>
Yesterday
<latexit sha1_base64="JR5w8iGbxoaP1JzolmcCEj+dNyg=">AAAB8XicbVBNS8NAEJ34WeNX1aOXxSJ40JKIVI8FLx4r2A9sQ9lsN+3SzSbsToQS+i+8eFDEq//Gm//GpM1BWx8MPN6bYWaeH0th0HG+rZXVtfWNzdKWvb2zu7dfPjhsmSjRjDdZJCPd8anhUijeRIGSd2LNaehL3vbHt7nffuLaiEg94CTmXkiHSgSCUcykxwvSEyrAiW33yxWn6sxAlolbkAoUaPTLX71BxJKQK2SSGtN1nRi9lGoUTPKp3UsMjykb0yHvZlTRkBsvnV08JaeZMiBBpLNSSGbq74mUhsZMQj/rDCmOzKKXi/953QSDGy8VKk6QKzZfFCSSYETy98lAaM5QTjJCmRbZrYSNqKYMs5DyENzFl5dJ67Lq1qq1+6tK/byIowTHcAJn4MI11OEOGtAEBgqe4RXeLGO9WO/Wx7x1xSpmjuAPrM8f7aWPug==</latexit>
<latexit sha1_base64="JR5w8iGbxoaP1JzolmcCEj+dNyg=">AAAB8XicbVBNS8NAEJ34WeNX1aOXxSJ40JKIVI8FLx4r2A9sQ9lsN+3SzSbsToQS+i+8eFDEq//Gm//GpM1BWx8MPN6bYWaeH0th0HG+rZXVtfWNzdKWvb2zu7dfPjhsmSjRjDdZJCPd8anhUijeRIGSd2LNaehL3vbHt7nffuLaiEg94CTmXkiHSgSCUcykxwvSEyrAiW33yxWn6sxAlolbkAoUaPTLX71BxJKQK2SSGtN1nRi9lGoUTPKp3UsMjykb0yHvZlTRkBsvnV08JaeZMiBBpLNSSGbq74mUhsZMQj/rDCmOzKKXi/953QSDGy8VKk6QKzZfFCSSYETy98lAaM5QTjJCmRbZrYSNqKYMs5DyENzFl5dJ67Lq1qq1+6tK/byIowTHcAJn4MI11OEOGtAEBgqe4RXeLGO9WO/Wx7x1xSpmjuAPrM8f7aWPug==</latexit>
<latexit sha1_base64="JR5w8iGbxoaP1JzolmcCEj+dNyg=">AAAB8XicbVBNS8NAEJ34WeNX1aOXxSJ40JKIVI8FLx4r2A9sQ9lsN+3SzSbsToQS+i+8eFDEq//Gm//GpM1BWx8MPN6bYWaeH0th0HG+rZXVtfWNzdKWvb2zu7dfPjhsmSjRjDdZJCPd8anhUijeRIGSd2LNaehL3vbHt7nffuLaiEg94CTmXkiHSgSCUcykxwvSEyrAiW33yxWn6sxAlolbkAoUaPTLX71BxJKQK2SSGtN1nRi9lGoUTPKp3UsMjykb0yHvZlTRkBsvnV08JaeZMiBBpLNSSGbq74mUhsZMQj/rDCmOzKKXi/953QSDGy8VKk6QKzZfFCSSYETy98lAaM5QTjJCmRbZrYSNqKYMs5DyENzFl5dJ67Lq1qq1+6tK/byIowTHcAJn4MI11OEOGtAEBgqe4RXeLGO9WO/Wx7x1xSpmjuAPrM8f7aWPug==</latexit>
<latexit sha1_base64="JR5w8iGbxoaP1JzolmcCEj+dNyg=">AAAB8XicbVBNS8NAEJ34WeNX1aOXxSJ40JKIVI8FLx4r2A9sQ9lsN+3SzSbsToQS+i+8eFDEq//Gm//GpM1BWx8MPN6bYWaeH0th0HG+rZXVtfWNzdKWvb2zu7dfPjhsmSjRjDdZJCPd8anhUijeRIGSd2LNaehL3vbHt7nffuLaiEg94CTmXkiHSgSCUcykxwvSEyrAiW33yxWn6sxAlolbkAoUaPTLX71BxJKQK2SSGtN1nRi9lGoUTPKp3UsMjykb0yHvZlTRkBsvnV08JaeZMiBBpLNSSGbq74mUhsZMQj/rDCmOzKKXi/953QSDGy8VKk6QKzZfFCSSYETy98lAaM5QTjJCmRbZrYSNqKYMs5DyENzFl5dJ67Lq1qq1+6tK/byIowTHcAJn4MI11OEOGtAEBgqe4RXeLGO9WO/Wx7x1xSpmjuAPrM8f7aWPug==</latexit>
p
d
qk qk
<latexit sha1_base64="JZSNh8HMqbgLWiOy9tGv37FCpyg=">AAAB/HicbVBNS8NAEN3Ur1q/oj16CRbBU0lEqseCF48V7Ac0oWw2m3bpZpPuToQQ4l/x4kERr/4Qb/4bt20O2vpg4PHeDDPz/IQzBbb9bVQ2Nre2d6q7tb39g8Mj8/ikp+JUEtolMY/lwMeKciZoFxhwOkgkxZHPad+f3s79/iOVisXiAbKEehEeCxYygkFLI7PuhhKTfDYtclfNJORBUYzMht20F7DWiVOSBirRGZlfbhCTNKICCMdKDR07AS/HEhjhtKi5qaIJJlM8pkNNBY6o8vLF8YV1rpXACmOpS4C1UH9P5DhSKot83RlhmKhVby7+5w1TCG+8nIkkBSrIclGYcgtia56EFTBJCfBME0wk07daZIJ1GqDzqukQnNWX10nvsum0mq37q0a7VcZRRafoDF0gB12jNrpDHdRFBGXoGb2iN+PJeDHejY9la8UoZ+roD4zPHxT9la4=</latexit>
<latexit sha1_base64="JZSNh8HMqbgLWiOy9tGv37FCpyg=">AAAB/HicbVBNS8NAEN3Ur1q/oj16CRbBU0lEqseCF48V7Ac0oWw2m3bpZpPuToQQ4l/x4kERr/4Qb/4bt20O2vpg4PHeDDPz/IQzBbb9bVQ2Nre2d6q7tb39g8Mj8/ikp+JUEtolMY/lwMeKciZoFxhwOkgkxZHPad+f3s79/iOVisXiAbKEehEeCxYygkFLI7PuhhKTfDYtclfNJORBUYzMht20F7DWiVOSBirRGZlfbhCTNKICCMdKDR07AS/HEhjhtKi5qaIJJlM8pkNNBY6o8vLF8YV1rpXACmOpS4C1UH9P5DhSKot83RlhmKhVby7+5w1TCG+8nIkkBSrIclGYcgtia56EFTBJCfBME0wk07daZIJ1GqDzqukQnNWX10nvsum0mq37q0a7VcZRRafoDF0gB12jNrpDHdRFBGXoGb2iN+PJeDHejY9la8UoZ+roD4zPHxT9la4=</latexit>
I p p
<latexit sha1_base64="JR5w8iGbxoaP1JzolmcCEj+dNyg=">AAAB8XicbVBNS8NAEJ34WeNX1aOXxSJ40JKIVI8FLx4r2A9sQ9lsN+3SzSbsToQS+i+8eFDEq//Gm//GpM1BWx8MPN6bYWaeH0th0HG+rZXVtfWNzdKWvb2zu7dfPjhsmSjRjDdZJCPd8anhUijeRIGSd2LNaehL3vbHt7nffuLaiEg94CTmXkiHSgSCUcykxwvSEyrAiW33yxWn6sxAlolbkAoUaPTLX71BxJKQK2SSGtN1nRi9lGoUTPKp3UsMjykb0yHvZlTRkBsvnV08JaeZMiBBpLNSSGbq74mUhsZMQj/rDCmOzKKXi/953QSDGy8VKk6QKzZfFCSSYETy98lAaM5QTjJCmRbZrYSNqKYMs5DyENzFl5dJ67Lq1qq1+6tK/byIowTHcAJn4MI11OEOGtAEBgqe4RXeLGO9WO/Wx7x1xSpmjuAPrM8f7aWPug==</latexit>
1 <latexit sha1_base64="JR5w8iGbxoaP1JzolmcCEj+dNyg=">AAAB8XicbVBNS8NAEJ34WeNX1aOXxSJ40JKIVI8FLx4r2A9sQ9lsN+3SzSbsToQS+i+8eFDEq//Gm//GpM1BWx8MPN6bYWaeH0th0HG+rZXVtfWNzdKWvb2zu7dfPjhsmSjRjDdZJCPd8anhUijeRIGSd2LNaehL3vbHt7nffuLaiEg94CTmXkiHSgSCUcykxwvSEyrAiW33yxWn6sxAlolbkAoUaPTLX71BxJKQK2SSGtN1nRi9lGoUTPKp3UsMjykb0yHvZlTRkBsvnV08JaeZMiBBpLNSSGbq74mUhsZMQj/rDCmOzKKXi/953QSDGy8VKk6QKzZfFCSSYETy98lAaM5QTjJCmRbZrYSNqKYMs5DyENzFl5dJ67Lq1qq1+6tK/byIowTHcAJn4MI11OEOGtAEBgqe4RXeLGO9WO/Wx7x1xSpmjuAPrM8f7aWPug==</latexit>
1 <latexit sha1_base64="JR5w8iGbxoaP1JzolmcCEj+dNyg=">AAAB8XicbVBNS8NAEJ34WeNX1aOXxSJ40JKIVI8FLx4r2A9sQ9lsN+3SzSbsToQS+i+8eFDEq//Gm//GpM1BWx8MPN6bYWaeH0th0HG+rZXVtfWNzdKWvb2zu7dfPjhsmSjRjDdZJCPd8anhUijeRIGSd2LNaehL3vbHt7nffuLaiEg94CTmXkiHSgSCUcykxwvSEyrAiW33yxWn6sxAlolbkAoUaPTLX71BxJKQK2SSGtN1nRi9lGoUTPKp3UsMjykb0yHvZlTRkBsvnV08JaeZMiBBpLNSSGbq74mUhsZMQj/rDCmOzKKXi/953QSDGy8VKk6QKzZfFCSSYETy98lAaM5QTjJCmRbZrYSNqKYMs5DyENzFl5dJ67Lq1qq1+6tK/byIowTHcAJn4MI11OEOGtAEBgqe4RXeLGO9WO/Wx7x1xSpmjuAPrM8f7aWPug==</latexit>
1 0.3 0.7 0 0 0
d d I
Softmax qk
=
<latexit sha1_base64="JZSNh8HMqbgLWiOy9tGv37FCpyg=">AAAB/HicbVBNS8NAEN3Ur1q/oj16CRbBU0lEqseCF48V7Ac0oWw2m3bpZpPuToQQ4l/x4kERr/4Qb/4bt20O2vpg4PHeDDPz/IQzBbb9bVQ2Nre2d6q7tb39g8Mj8/ikp+JUEtolMY/lwMeKciZoFxhwOkgkxZHPad+f3s79/iOVisXiAbKEehEeCxYygkFLI7PuhhKTfDYtclfNJORBUYzMht20F7DWiVOSBirRGZlfbhCTNKICCMdKDR07AS/HEhjhtKi5qaIJJlM8pkNNBY6o8vLF8YV1rpXACmOpS4C1UH9P5DhSKot83RlhmKhVby7+5w1TCG+8nIkkBSrIclGYcgtia56EFTBJCfBME0wk07daZIJ1GqDzqukQnNWX10nvsum0mq37q0a7VcZRRafoDF0gB12jNrpDHdRFBGXoGb2iN+PJeDHejY9la8UoZ+roD4zPHxT9la4=</latexit>
qk qk
<latexit sha1_base64="JZSNh8HMqbgLWiOy9tGv37FCpyg=">AAAB/HicbVBNS8NAEN3Ur1q/oj16CRbBU0lEqseCF48V7Ac0oWw2m3bpZpPuToQQ4l/x4kERr/4Qb/4bt20O2vpg4PHeDDPz/IQzBbb9bVQ2Nre2d6q7tb39g8Mj8/ikp+JUEtolMY/lwMeKciZoFxhwOkgkxZHPad+f3s79/iOVisXiAbKEehEeCxYygkFLI7PuhhKTfDYtclfNJORBUYzMht20F7DWiVOSBirRGZlfbhCTNKICCMdKDR07AS/HEhjhtKi5qaIJJlM8pkNNBY6o8vLF8YV1rpXACmOpS4C1UH9P5DhSKot83RlhmKhVby7+5w1TCG+8nIkkBSrIclGYcgtia56EFTBJCfBME0wk07daZIJ1GqDzqukQnNWX10nvsum0mq37q0a7VcZRRafoDF0gB12jNrpDHdRFBGXoGb2iN+PJeDHejY9la8UoZ+roD4zPHxT9la4=</latexit>
<latexit sha1_base64="JZSNh8HMqbgLWiOy9tGv37FCpyg=">AAAB/HicbVBNS8NAEN3Ur1q/oj16CRbBU0lEqseCF48V7Ac0oWw2m3bpZpPuToQQ4l/x4kERr/4Qb/4bt20O2vpg4PHeDDPz/IQzBbb9bVQ2Nre2d6q7tb39g8Mj8/ikp+JUEtolMY/lwMeKciZoFxhwOkgkxZHPad+f3s79/iOVisXiAbKEehEeCxYygkFLI7PuhhKTfDYtclfNJORBUYzMht20F7DWiVOSBirRGZlfbhCTNKICCMdKDR07AS/HEhjhtKi5qaIJJlM8pkNNBY6o8vLF8YV1rpXACmOpS4C1UH9P5DhSKot83RlhmKhVby7+5w1TCG+8nIkkBSrIclGYcgtia56EFTBJCfBME0wk07daZIJ1GqDzqukQnNWX10nvsum0mq37q0a7VcZRRafoDF0gB12jNrpDHdRFBGXoGb2iN+PJeDHejY9la8UoZ+roD4zPHxT9la4=</latexit>
<latexit sha1_base64="JR5w8iGbxoaP1JzolmcCEj+dNyg=">AAAB8XicbVBNS8NAEJ34WeNX1aOXxSJ40JKIVI8FLx4r2A9sQ9lsN+3SzSbsToQS+i+8eFDEq//Gm//GpM1BWx8MPN6bYWaeH0th0HG+rZXVtfWNzdKWvb2zu7dfPjhsmSjRjDdZJCPd8anhUijeRIGSd2LNaehL3vbHt7nffuLaiEg94CTmXkiHSgSCUcykxwvSEyrAiW33yxWn6sxAlolbkAoUaPTLX71BxJKQK2SSGtN1nRi9lGoUTPKp3UsMjykb0yHvZlTRkBsvnV08JaeZMiBBpLNSSGbq74mUhsZMQj/rDCmOzKKXi/953QSDGy8VKk6QKzZfFCSSYETy98lAaM5QTjJCmRbZrYSNqKYMs5DyENzFl5dJ67Lq1qq1+6tK/byIowTHcAJn4MI11OEOGtAEBgqe4RXeLGO9WO/Wx7x1xSpmjuAPrM8f7aWPug==</latexit>
d d d
qk
<latexit sha1_base64="JZSNh8HMqbgLWiOy9tGv37FCpyg=">AAAB/HicbVBNS8NAEN3Ur1q/oj16CRbBU0lEqseCF48V7Ac0oWw2m3bpZpPuToQQ4l/x4kERr/4Qb/4bt20O2vpg4PHeDDPz/IQzBbb9bVQ2Nre2d6q7tb39g8Mj8/ikp+JUEtolMY/lwMeKciZoFxhwOkgkxZHPad+f3s79/iOVisXiAbKEehEeCxYygkFLI7PuhhKTfDYtclfNJORBUYzMht20F7DWiVOSBirRGZlfbhCTNKICCMdKDR07AS/HEhjhtKi5qaIJJlM8pkNNBY6o8vLF8YV1rpXACmOpS4C1UH9P5DhSKot83RlhmKhVby7+5w1TCG+8nIkkBSrIclGYcgtia56EFTBJCfBME0wk07daZIJ1GqDzqukQnNWX10nvsum0mq37q0a7VcZRRafoDF0gB12jNrpDHdRFBGXoGb2iN+PJeDHejY9la8UoZ+roD4zPHxT9la4=</latexit>
qk qk
<latexit sha1_base64="JZSNh8HMqbgLWiOy9tGv37FCpyg=">AAAB/HicbVBNS8NAEN3Ur1q/oj16CRbBU0lEqseCF48V7Ac0oWw2m3bpZpPuToQQ4l/x4kERr/4Qb/4bt20O2vpg4PHeDDPz/IQzBbb9bVQ2Nre2d6q7tb39g8Mj8/ikp+JUEtolMY/lwMeKciZoFxhwOkgkxZHPad+f3s79/iOVisXiAbKEehEeCxYygkFLI7PuhhKTfDYtclfNJORBUYzMht20F7DWiVOSBirRGZlfbhCTNKICCMdKDR07AS/HEhjhtKi5qaIJJlM8pkNNBY6o8vLF8YV1rpXACmOpS4C1UH9P5DhSKot83RlhmKhVby7+5w1TCG+8nIkkBSrIclGYcgtia56EFTBJCfBME0wk07daZIJ1GqDzqukQnNWX10nvsum0mq37q0a7VcZRRafoDF0gB12jNrpDHdRFBGXoGb2iN+PJeDHejY9la8UoZ+roD4zPHxT9la4=</latexit> <latexit sha1_base64="JZSNh8HMqbgLWiOy9tGv37FCpyg=">AAAB/HicbVBNS8NAEN3Ur1q/oj16CRbBU0lEqseCF48V7Ac0oWw2m3bpZpPuToQQ4l/x4kERr/4Qb/4bt20O2vpg4PHeDDPz/IQzBbb9bVQ2Nre2d6q7tb39g8Mj8/ikp+JUEtolMY/lwMeKciZoFxhwOkgkxZHPad+f3s79/iOVisXiAbKEehEeCxYygkFLI7PuhhKTfDYtclfNJORBUYzMht20F7DWiVOSBirRGZlfbhCTNKICCMdKDR07AS/HEhjhtKi5qaIJJlM8pkNNBY6o8vLF8YV1rpXACmOpS4C1UH9P5DhSKot83RlhmKhVby7+5w1TCG+8nIkkBSrIclGYcgtia56EFTBJCfBME0wk07daZIJ1GqDzqukQnNWX10nvsum0mq37q0a7VcZRRafoDF0gB12jNrpDHdRFBGXoGb2iN+PJeDHejY9la8UoZ+roD4zPHxT9la4=</latexit>
qk
<latexit sha1_base64="JZSNh8HMqbgLWiOy9tGv37FCpyg=">AAAB/HicbVBNS8NAEN3Ur1q/oj16CRbBU0lEqseCF48V7Ac0oWw2m3bpZpPuToQQ4l/x4kERr/4Qb/4bt20O2vpg4PHeDDPz/IQzBbb9bVQ2Nre2d6q7tb39g8Mj8/ikp+JUEtolMY/lwMeKciZoFxhwOkgkxZHPad+f3s79/iOVisXiAbKEehEeCxYygkFLI7PuhhKTfDYtclfNJORBUYzMht20F7DWiVOSBirRGZlfbhCTNKICCMdKDR07AS/HEhjhtKi5qaIJJlM8pkNNBY6o8vLF8YV1rpXACmOpS4C1UH9P5DhSKot83RlhmKhVby7+5w1TCG+8nIkkBSrIclGYcgtia56EFTBJCfBME0wk07daZIJ1GqDzqukQnNWX10nvsum0mq37q0a7VcZRRafoDF0gB12jNrpDHdRFBGXoGb2iN+PJeDHejY9la8UoZ+roD4zPHxT9la4=</latexit>
d d d d
qk qk qk
<latexit sha1_base64="JZSNh8HMqbgLWiOy9tGv37FCpyg=">AAAB/HicbVBNS8NAEN3Ur1q/oj16CRbBU0lEqseCF48V7Ac0oWw2m3bpZpPuToQQ4l/x4kERr/4Qb/4bt20O2vpg4PHeDDPz/IQzBbb9bVQ2Nre2d6q7tb39g8Mj8/ikp+JUEtolMY/lwMeKciZoFxhwOkgkxZHPad+f3s79/iOVisXiAbKEehEeCxYygkFLI7PuhhKTfDYtclfNJORBUYzMht20F7DWiVOSBirRGZlfbhCTNKICCMdKDR07AS/HEhjhtKi5qaIJJlM8pkNNBY6o8vLF8YV1rpXACmOpS4C1UH9P5DhSKot83RlhmKhVby7+5w1TCG+8nIkkBSrIclGYcgtia56EFTBJCfBME0wk07daZIJ1GqDzqukQnNWX10nvsum0mq37q0a7VcZRRafoDF0gB12jNrpDHdRFBGXoGb2iN+PJeDHejY9la8UoZ+roD4zPHxT9la4=</latexit>
qk qk
<latexit sha1_base64="JZSNh8HMqbgLWiOy9tGv37FCpyg=">AAAB/HicbVBNS8NAEN3Ur1q/oj16CRbBU0lEqseCF48V7Ac0oWw2m3bpZpPuToQQ4l/x4kERr/4Qb/4bt20O2vpg4PHeDDPz/IQzBbb9bVQ2Nre2d6q7tb39g8Mj8/ikp+JUEtolMY/lwMeKciZoFxhwOkgkxZHPad+f3s79/iOVisXiAbKEehEeCxYygkFLI7PuhhKTfDYtclfNJORBUYzMht20F7DWiVOSBirRGZlfbhCTNKICCMdKDR07AS/HEhjhtKi5qaIJJlM8pkNNBY6o8vLF8YV1rpXACmOpS4C1UH9P5DhSKot83RlhmKhVby7+5w1TCG+8nIkkBSrIclGYcgtia56EFTBJCfBME0wk07daZIJ1GqDzqukQnNWX10nvsum0mq37q0a7VcZRRafoDF0gB12jNrpDHdRFBGXoGb2iN+PJeDHejY9la8UoZ+roD4zPHxT9la4=</latexit>
<latexit sha1_base64="JZSNh8HMqbgLWiOy9tGv37FCpyg=">AAAB/HicbVBNS8NAEN3Ur1q/oj16CRbBU0lEqseCF48V7Ac0oWw2m3bpZpPuToQQ4l/x4kERr/4Qb/4bt20O2vpg4PHeDDPz/IQzBbb9bVQ2Nre2d6q7tb39g8Mj8/ikp+JUEtolMY/lwMeKciZoFxhwOkgkxZHPad+f3s79/iOVisXiAbKEehEeCxYygkFLI7PuhhKTfDYtclfNJORBUYzMht20F7DWiVOSBirRGZlfbhCTNKICCMdKDR07AS/HEhjhtKi5qaIJJlM8pkNNBY6o8vLF8YV1rpXACmOpS4C1UH9P5DhSKot83RlhmKhVby7+5w1TCG+8nIkkBSrIclGYcgtia56EFTBJCfBME0wk07daZIJ1GqDzqukQnNWX10nvsum0mq37q0a7VcZRRafoDF0gB12jNrpDHdRFBGXoGb2iN+PJeDHejY9la8UoZ+roD4zPHxT9la4=</latexit>
a p
d
p
d
p
d
p
d
p
d a 0.3 0.1 0.1 0.2 0.3
i
<latexit sha1_base64="/567GNfqs6hI1+r9uu/P1P0myJ4=">AAAB6HicbVBNS8NAEJ34WetX1aOXxSJ4KkmR6rHgxWML9gPaUDbbSbt2swm7G6GE/gIvHhTx6k/y5r9x2+agrQ8GHu/NMDMvSATXxnW/nY3Nre2d3cJecf/g8Oi4dHLa1nGqGLZYLGLVDahGwSW2DDcCu4lCGgUCO8Hkbu53nlBpHssHM03Qj+hI8pAzaqzU5INS2a24C5B14uWkDDkag9JXfxizNEJpmKBa9zw3MX5GleFM4KzYTzUmlE3oCHuWShqh9rPFoTNyaZUhCWNlSxqyUH9PZDTSehoFtjOiZqxXvbn4n9dLTXjrZ1wmqUHJlovCVBATk/nXZMgVMiOmllCmuL2VsDFVlBmbTdGG4K2+vE7a1YpXq9Sa1+V6NY+jAOdwAVfgwQ3U4R4a0AIGCM/wCm/Oo/PivDsfy9YNJ585gz9wPn8Azj+M6A==</latexit>
<latexit sha1_base64="a/l1BFMoxqpLl0Ek1dbfORNKOpY=">AAAB6nicbVBNS8NAEJ34WeNX1aOXxSL0ICURqR4LXjxWtB/QhrLZbtqlm92wuxFK6E/w4kERr/4ib/4bN20O2vpg4PHeDDPzwoQzbTzv21lb39jc2i7tuLt7+weH5aPjtpapIrRFJJeqG2JNORO0ZZjhtJsoiuOQ0044uc39zhNVmknxaKYJDWI8EixiBBsrPVRdd1CueDVvDrRK/IJUoEBzUP7qDyVJYyoM4Vjrnu8lJsiwMoxwOnP7qaYJJhM8oj1LBY6pDrL5qTN0bpUhiqSyJQyaq78nMhxrPY1D2xljM9bLXi7+5/VSE90EGRNJaqggi0VRypGRKP8bDZmixPCpJZgoZm9FZIwVJsamk4fgL7+8StqXNb9eq99fVRoXRRwlOIUzqIIP19CAO2hCCwiM4Ble4c3hzovz7nwsWtecYuYE/sD5/AHSYYzJ</latexit>
i
<latexit sha1_base64="/567GNfqs6hI1+r9uu/P1P0myJ4=">AAAB6HicbVBNS8NAEJ34WetX1aOXxSJ4KkmR6rHgxWML9gPaUDbbSbt2swm7G6GE/gIvHhTx6k/y5r9x2+agrQ8GHu/NMDMvSATXxnW/nY3Nre2d3cJecf/g8Oi4dHLa1nGqGLZYLGLVDahGwSW2DDcCu4lCGgUCO8Hkbu53nlBpHssHM03Qj+hI8pAzaqzU5INS2a24C5B14uWkDDkag9JXfxizNEJpmKBa9zw3MX5GleFM4KzYTzUmlE3oCHuWShqh9rPFoTNyaZUhCWNlSxqyUH9PZDTSehoFtjOiZqxXvbn4n9dLTXjrZ1wmqUHJlovCVBATk/nXZMgVMiOmllCmuL2VsDFVlBmbTdGG4K2+vE7a1YpXq9Sa1+V6NY+jAOdwAVfgwQ3U4R4a0AIGCM/wCm/Oo/PivDsfy9YNJ585gz9wPn8Azj+M6A==</latexit>
Attention probability
between “saw” and
⇣ QK T
<latexit sha1_base64="OpEc/enLiBPMDCig0mcf91/67mQ=">AAACXnicbVFdaxQxFM1Mtdaptdv6IvgSXIT6sswUqX0s9UWwQqvdtrDZLplMZhs2H2Nyp3YJ+ZO+iS/+FDPbLWjrhcDhnHO5956UjRQO8vxnkq48erz6ZO1ptv5s4/lmb2v7zJnWMj5kRhp7UVLHpdB8CAIkv2gsp6qU/Lycfej082tunTD6FOYNHys61aIWjEKkJr2WAL8Bq/xXU4OiN2FyR1jzPZBDMd3BpLaU+RP86fI0eOK+WfBVCJiYygC+s3+mbhYw7jreYiI0URSuytJ/CZf+iIBQ3OGjzkCyLJv0+vkgXxR+CIol6KNlHU96P0hlWKu4Biapc6Mib2DsqQXBJA8ZaR1vKJvRKR9FqGkcN/aLeAJ+E5kK18bGpwEv2L87PFXOzVUZnd3S7r7Wkf/TRi3U+2MvdNMC1+x2UN1KDAZ3WeNKWM5AziOgzIq4K2ZXNIYJ8Ue6EIr7Jz8EZ7uDYm+wd/Kuf7C7jGMNvUKv0Q4q0Ht0gD6iYzREDP1KkiRL1pPf6Wq6kW7eWtNk2fMC/VPpyz+Z8rYM</latexit>
⌘ “yesterday”, “I”,…”saw”.
Softmaxrow p Mask 2 RL⇥L
d
Xavier Bresson 33
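The masked softmax can be sketched in plain Python as follows. The slide writes the masking as a pointwise multiplication with a 0/1 mask; this sketch assumes the common realization in which hidden scores are replaced by -inf before the softmax, so that they receive probability exactly 0. The function names and the score values are hypothetical and chosen only for illustration.

```python
import math

def causal_mask(L):
    """L x L mask: 1 where column j <= row i (visible), 0 for future positions."""
    return [[1 if j <= i else 0 for j in range(L)] for i in range(L)]

def masked_row_softmax(scores, mask):
    """Row-wise softmax that assigns probability 0 to masked entries."""
    probs = []
    for score_row, mask_row in zip(scores, mask):
        # Assumed realization of the mask: hidden scores become -inf, so exp(-inf) = 0.
        masked = [s if m == 1 else -math.inf for s, m in zip(score_row, mask_row)]
        top = max(masked)  # subtract the row maximum for numerical stability
        exps = [math.exp(s - top) for s in masked]
        total = sum(exps)
        probs.append([e / total for e in exps])
    return probs

# Hypothetical scaled scores q_i . k_j / sqrt(d) for a 3-word sequence.
scores = [[0.5, 2.0, -1.0],
          [0.1, 0.9, 3.0],
          [1.0, 1.0, 1.0]]
attention = masked_row_softmax(scores, causal_mask(3))
for row in attention:
    print([round(p, 3) for p in row])
```

Each row sums to 1 and has zeros above the diagonal, which matches the triangular pattern of the attention matrix on the slide.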
34
Linear combination
$\mathrm{Softmax}_{\mathrm{row}}\Big(\frac{QK^T}{\sqrt{d}} \odot \mathrm{Mask}\Big) V \in \mathbb{R}^{L\times d}$

            Yesterday  I    …    saw  a     New representation
Yesterday   1          0    0    0    0     Yesterday
I           0.3        0.7  0    0    0     0.3*Yesterday + 0.7*I
…           0.4        0.2  0.4  0    0     0.4*Yesterday + 0.2*I + 0.4*went
saw         0.2        0.1  0.1  0.5  0     0.2*Yesterday + 0.1*I + 0.1*went + …
a           0.3        0.1  0.1  0.2  0.3   0.3*Yesterday + 0.1*I + 0.1*went + … + 0.3*a

The new vector representation of “saw” is given by a weighted linear combination of the vector representations of “yesterday”, “I”, …, “saw”.
The weights are the attention probabilities between the pairs of words.
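As a small worked example of this weighted combination, the sketch below multiplies the first three rows of the attention matrix (1; 0.3, 0.7; 0.4, 0.2, 0.4) by a value matrix V. The 2-dimensional value vectors and the function name `attend` are hypothetical, chosen so that the arithmetic can be checked by hand.

```python
def attend(probs, values):
    """Return probs @ values: each output row is a weighted sum of the value rows."""
    d = len(values[0])
    return [[sum(p * v[k] for p, v in zip(row, values)) for k in range(d)]
            for row in probs]

# First three rows of the attention matrix, restricted to the first three words.
probs = [[1.0, 0.0, 0.0],   # Yesterday
         [0.3, 0.7, 0.0],   # I
         [0.4, 0.2, 0.4]]   # third word ("went" in the slide's formulas)

# Hypothetical value vectors with d = 2, one per word (not from the slides).
values = [[1.0, 0.0],       # Yesterday
          [0.0, 1.0],       # I
          [2.0, 2.0]]       # went

new_repr = attend(probs, values)
for row in new_repr:
    print([round(x, 3) for x in row])
```

The second output row is 0.3*Yesterday + 0.7*I, so with these values it equals (0.3, 0.7) up to floating-point rounding.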
35
Attention blocks
[Figure: transformer language model, with the labels “Positional Encoding”, “Attention blocks” and “Output Probability”.]
36
Sequence of L words
Cross-entropy criterion
Labels: 174, 564, 13, …, 876
L: number of words to predict
L = 0.01 B
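The cross-entropy criterion over the L words to predict can be sketched as below. This is a minimal illustration, assuming that the model outputs one vector of unnormalized scores (logits) per position and that the loss is averaged over the L positions; the averaging, the function name and the toy numbers are assumptions, not taken from the slides.

```python
import math

def cross_entropy(logits, labels):
    """Mean negative log-probability of the correct next word over L positions."""
    total = 0.0
    for row, label in zip(logits, labels):
        top = max(row)
        # log-sum-exp: log of the softmax normalizer, computed stably
        log_norm = top + math.log(sum(math.exp(x - top) for x in row))
        total += log_norm - row[label]   # equals -log softmax(row)[label]
    return total / len(labels)

# Hypothetical example: L = 3 positions, vocabulary of V = 4 words.
logits = [[2.0, 0.0, 0.0, 0.0],
          [0.0, 0.0, 3.0, 0.0],
          [0.0, 0.0, 0.0, 0.0]]
labels = [0, 2, 1]                        # index of the correct next word at each position
loss = cross_entropy(logits, labels)
print(round(loss, 4))
```

A position with uniform scores contributes log V to the sum, which is the loss of a model that has learned nothing about that position.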
37
Generation
The transformer network is trained in parallel (using the mask to hide the words to be predicted).
After training, the mask is no longer required (there are no future words to hide).
The sequence is then generated auto-regressively, i.e. one word at a time.
Output probability for next word: $w_{t+1} \sim p_t \in \mathbb{R}^V$
Output of Transformer model: $q^{\ell=L}$
Att: $\bar{q}^{\ell+1} = q^{\ell} + \mathrm{MHA}(\mathrm{LN}(q^{\ell}), \mathrm{LN}(K^{\ell}), \mathrm{LN}(V^{\ell})) \in \mathbb{R}^d$
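The auto-regressive loop can be sketched as follows: at each step the model produces scores over the vocabulary, the next word is sampled from the resulting distribution p_t, and the sample is appended to the input. The function `toy_model` is a hypothetical stand-in for a trained transformer and is not part of the lecture; it only exists so that the loop can run.

```python
import math
import random

def softmax(logits):
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def generate(model, prompt, num_new_words, rng):
    """Append num_new_words word indices to the prompt, one word at a time."""
    words = list(prompt)
    for _ in range(num_new_words):
        p_t = softmax(model(words))            # distribution over the V words
        # Sample w_{t+1} ~ p_t
        next_word = rng.choices(range(len(p_t)), weights=p_t, k=1)[0]
        words.append(next_word)                # the sample becomes part of the next input
    return words

def toy_model(words):
    """Hypothetical model: puts almost all probability on (last word + 1) mod V."""
    V = 5
    logits = [0.0] * V
    logits[(words[-1] + 1) % V] = 50.0
    return logits

print(generate(toy_model, [0], 4, random.Random(0)))
```

Because each step depends on the previously sampled word, generation is sequential, unlike training, where all positions are processed in parallel under the mask.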
38
Lab 01
PyTorch implementation of Language Model Transformers
[Figure: transformer language model, with the labels “Positional Encoding” and “Output Probability”.]
39
Lab 01
Numerical results on PTB:
Vanilla LM Transformer:
40
Attention mechanism
It is a breakthrough idea in NLP!
It is as revolutionary as CNNs in Computer Vision.
[Figure: Transformers vs. RNNs; RNNs degrade quickly after 30 words (Bahdanau-Cho-Bengio 2014).]
42
Attention interpretation
The attention score matrix provides the matching between words in a sequence:
$\mathrm{Softmax}_{\mathrm{row}}\Big(\frac{QK^T}{\sqrt{d}}\Big) \in \mathbb{R}^{L\times L}$
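One way to read such a matrix is to look, for each word (row), at the word (column) with the highest attention probability. The sketch below does this on a hypothetical 4 x 4 attention matrix; the sentence, the numbers and the function name are made up for illustration and do not come from a trained model.

```python
def strongest_matches(words, attention):
    """For each word (row), return the best-matching word (column) and its probability."""
    matches = {}
    for word, row in zip(words, attention):
        best = max(range(len(row)), key=lambda j: row[j])
        matches[word] = (words[best], row[best])
    return matches

words = ["he", "drives", "the", "car"]
# Hypothetical attention matrix: row i gives the probabilities of word i over all words.
attention = [[0.60, 0.30, 0.05, 0.05],
             [0.30, 0.20, 0.05, 0.45],
             [0.05, 0.05, 0.20, 0.70],
             [0.10, 0.50, 0.10, 0.30]]

matches = strongest_matches(words, attention)
for word, (match, prob) in matches.items():
    print(word, "->", match, prob)
```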
43
Outline
Language Models
Memory Networks
Transformers
Sequence-To-Sequence Transformers
Transfer Learning
Conclusion
44
Seq2Seq Transformers
[Figure: sequence-to-sequence transformer; the encoder stack is labelled $L_{\mathrm{Enc}}$.]
45
Self-attention encoder
$$\mathrm{MHA}(H) = \Big( \big\Vert_{h=1}^{H} \mathrm{HA}_h(H) \Big) W^O, \quad W^O \in \mathbb{R}^{d \times d}$$
with $H = \{g^{\mathrm{in}}_1, ..., g^{\mathrm{in}}_{L_{\mathrm{in}}}\} \in \mathbb{R}^{L_{\mathrm{in}} \times d}$
$$\mathrm{HA}_h(H) = \mathrm{Softmax}\left(\frac{Q_h K_h^T}{\sqrt{d/H}}\right) V_h \in \mathbb{R}^{L_{\mathrm{in}} \times d/H}$$
with
$$Q_h = H W_h^Q \in \mathbb{R}^{L_{\mathrm{in}} \times d/H}, \quad W_h^Q \in \mathbb{R}^{d \times d/H}$$
$$K_h = H W_h^K \in \mathbb{R}^{L_{\mathrm{in}} \times d/H}, \quad W_h^K \in \mathbb{R}^{d \times d/H}$$
$$V_h = H W_h^V \in \mathbb{R}^{L_{\mathrm{in}} \times d/H}, \quad W_h^V \in \mathbb{R}^{d \times d/H}$$
Note that $H$ denotes the input matrix and, in $\Vert_{h=1}^{H}$ and $d/H$, the number of heads.
Self-attention produces a representation of each word that depends on the context of the surrounding words.
[Figure: self-attention over the sentence "he drives the car"; label: $L^{\mathrm{Enc}}$.]
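The multi-head formulas can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the weights are random rather than learned, the sizes are toy values, and `n_heads` is a name chosen here for the number of heads to avoid the clash with the input matrix `H`.

```python
import numpy as np

def softmax_rows(x):
    """Row-wise softmax."""
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_self_attention(H, WQ, WK, WV, WO):
    """H: (L_in, d); WQ, WK, WV: (n_heads, d, d/n_heads); WO: (d, d)."""
    n_heads, d, d_head = WQ.shape
    heads = []
    for h in range(n_heads):
        Q, K, V = H @ WQ[h], H @ WK[h], H @ WV[h]    # each (L_in, d/n_heads)
        A = softmax_rows(Q @ K.T / np.sqrt(d_head))  # (L_in, L_in) attention scores
        heads.append(A @ V)                          # HA_h(H): (L_in, d/n_heads)
    return np.concatenate(heads, axis=-1) @ WO       # concatenate heads, project: (L_in, d)

rng = np.random.default_rng(0)                       # arbitrary seed
L_in, d, n_heads = 5, 8, 2                           # toy sizes (assumptions)
H = rng.normal(size=(L_in, d))
WQ, WK, WV = (rng.normal(size=(n_heads, d, d // n_heads)) for _ in range(3))
WO = rng.normal(size=(d, d))
out = multi_head_self_attention(H, WQ, WK, WV, WO)   # (L_in, d)
```

Without positional encoding this operation is permutation-equivariant: reordering the rows of `H` reorders the rows of the output in the same way.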
Encoder layer
$$\bar{H}^{\ell+1} = H^{\ell} + \mathrm{MHA}(\mathrm{LN}(H^{\ell}), \mathrm{LN}(H^{\ell}), \mathrm{LN}(H^{\ell})) \in \mathbb{R}^{L_{\mathrm{in}} \times d}$$
$$H^{\ell+1} = \bar{H}^{\ell+1} + \mathrm{MLP}(\mathrm{LN}(\bar{H}^{\ell+1})) \in \mathbb{R}^{L_{\mathrm{in}} \times d}$$
with initialization $H^{\ell=0} = \mathrm{LL}(\{g^{\mathrm{in}}_1, ..., g^{\mathrm{in}}_{L_{\mathrm{in}}}\} + \mathrm{PE}) \in \mathbb{R}^{L_{\mathrm{in}} \times d}$
[Figure: Seq2Seq Transformer diagram; labels: $H^{\mathrm{Enc}}$, $L^{\mathrm{Dec}}$.]
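The two residual updates of the encoder layer can be sketched as below. This is a hedged illustration, not the lab implementation: a single-head `attention` stands in for MHA, `layer_norm` has no learnable gain or bias, the MLP is a two-layer ReLU network with random weights, and the same weights are reused across layers only to keep the example short.

```python
import numpy as np

def layer_norm(X, eps=1e-5):
    """Normalize each row to zero mean and unit variance (no learnable parameters)."""
    mu = X.mean(axis=-1, keepdims=True)
    var = X.var(axis=-1, keepdims=True)
    return (X - mu) / np.sqrt(var + eps)

def softmax_rows(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    """Single-head stand-in for MHA(Q, K, V)."""
    return softmax_rows(Q @ K.T / np.sqrt(Q.shape[-1])) @ V

def mlp(X, W1, W2):
    """Two-layer MLP with ReLU."""
    return np.maximum(X @ W1, 0.0) @ W2

def encoder_layer(H, W1, W2):
    N = layer_norm(H)
    H_bar = H + attention(N, N, N)                  # H_bar^{l+1} = H^l + MHA(LN, LN, LN)
    return H_bar + mlp(layer_norm(H_bar), W1, W2)   # H^{l+1} = H_bar^{l+1} + MLP(LN(H_bar^{l+1}))

rng = np.random.default_rng(0)                      # arbitrary seed
L_in, d = 5, 8                                      # toy sizes (assumptions)
H0 = rng.normal(size=(L_in, d))                     # stand-in for H^{l=0}
W1 = 0.1 * rng.normal(size=(d, 4 * d))
W2 = 0.1 * rng.normal(size=(4 * d, d))
H = H0
for _ in range(3):                                  # three stacked layers (toy L^Enc)
    H = encoder_layer(H, W1, W2)
```

The shape of `H` stays `(L_in, d)` through every layer, which is what allows the layers to be stacked.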
For the output sequence, $H = \mathrm{LL}(\{g^{\mathrm{out}}_1, ..., g^{\mathrm{out}}_{L_{\mathrm{out}}}\} + \mathrm{PE}) \in \mathbb{R}^{L_{\mathrm{out}} \times d}$
Cross-attention layer
$$\bar{h} = \mathrm{MHA}(q, K, V) \in \mathbb{R}^{d}, \quad q \in \mathbb{R}^{d}, \quad K, V \in \mathbb{R}^{L_{\mathrm{in}} \times d}$$
with $q = h^{\mathrm{Dec}}_t \in \mathbb{R}^{d}$ (query word at index $t$), $K = V = H^{\mathrm{Enc}} \in \mathbb{R}^{L_{\mathrm{in}} \times d}$
[Figure: cross-attention between the query word $q = h^{\mathrm{Dec}}_t$ and the encoder output $H^{\mathrm{Enc}}$; labels: $L^{\mathrm{Dec}}$, $H^{\mathrm{Dec}}$.]
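Cross-attention for one decoder query can be sketched as follows. This is a simplified NumPy illustration under stated assumptions: a single head, no learned projections, and random stand-ins for the decoder query `q` and the encoder output `H_enc`.

```python
import numpy as np

def cross_attention(q, H_enc):
    """q: (d,) decoder query; H_enc: (L_in, d) encoder output used as both K and V.

    Returns the attended vector h_bar of shape (d,) and the weights of shape (L_in,).
    """
    d = q.shape[0]
    scores = H_enc @ q / np.sqrt(d)      # one score per input word
    scores = scores - scores.max()       # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum()    # softmax over the input words
    return weights @ H_enc, weights      # weighted sum of the values (K = V = H_enc)

rng = np.random.default_rng(0)           # arbitrary seed
L_in, d = 6, 8                           # toy sizes (assumptions)
H_enc = rng.normal(size=(L_in, d))       # stand-in for the encoder output
q = rng.normal(size=d)                   # stand-in for the decoder word at index t
h_bar, weights = cross_attention(q, H_enc)
```

The weights say which input words the output word at index `t` looks at, which is the quantity usually visualized for machine translation.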
Decoder layer
Self-attention:
$$\bar{H}^{\ell+1} = H^{\ell} + \text{Mask-MHA}(\mathrm{LN}(H^{\ell}), \mathrm{LN}(H^{\ell}), \mathrm{LN}(H^{\ell})) \in \mathbb{R}^{L_{\mathrm{out}} \times d}$$
Cross-attention:
$$\hat{H}^{\ell+1} = \bar{H}^{\ell+1} + \mathrm{MHA}(\mathrm{LN}(\bar{H}^{\ell+1}), \mathrm{LN}(H^{\mathrm{Enc}}), \mathrm{LN}(H^{\mathrm{Enc}})) \in \mathbb{R}^{L_{\mathrm{out}} \times d}$$
$$H^{\ell+1} = \hat{H}^{\ell+1} + \mathrm{MLP}(\mathrm{LN}(\hat{H}^{\ell+1})) \in \mathbb{R}^{L_{\mathrm{out}} \times d}$$
Decoder output: $H^{\ell=L^{\mathrm{Dec}}} \in \mathbb{R}^{L_{\mathrm{out}} \times d}$
Probability output: $P = \mathrm{Softmax}(\mathrm{MLP}(H^{\ell=L^{\mathrm{Dec}}})) \in \mathbb{R}^{L_{\mathrm{out}} \times V}$
[Figure: decoder layer, repeated $L^{\mathrm{Dec}}$ times, mapping $H^{\ell}$ to $H^{\ell+1}$ through self-attention ($\bar{H}^{\ell+1}$) and cross-attention with $H^{\mathrm{Enc}}$ ($\hat{H}^{\ell+1}$); the encoder is repeated $L^{\mathrm{Enc}}$ times; decoder input $H^{\ell=0}$.]
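The masking step of the decoder self-attention can be sketched as below, assuming that Mask-MHA refers to the usual causal mask, where the word at position t may not attend to later positions. The single-head form, the toy sizes, and the function name `masked_attention` are assumptions for the example.

```python
import numpy as np

def masked_attention(Q, K, V):
    """Causal single-head attention: row t only attends to positions 0..t."""
    L = Q.shape[0]
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    future = np.triu(np.ones((L, L), dtype=bool), k=1)    # True strictly above the diagonal
    scores = np.where(future, -np.inf, scores)            # forbid attention to future words
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)                              # exp(-inf) = 0 for masked entries
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)    # arbitrary seed
L_out, d = 5, 8                   # toy sizes (assumptions)
H = rng.normal(size=(L_out, d))   # stand-in for LN(H^l) of the output sequence
out, weights = masked_attention(H, H, H)
```

Because of the mask, changing the last word leaves the representations of all earlier words unchanged, which is what allows the decoder to be trained on all positions in parallel.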
Generation
At inference, the input sequence is first encoded in parallel, which provides $H^{\mathrm{Enc}}$.
Then, the output sequence is generated auto-regressively, i.e. one word at a time.
Output probability: $p_t = \mathrm{Softmax}(\mathrm{MLP}(q^{\ell=L^{\mathrm{Dec}}})) \in \mathbb{R}^{V}$
Sample the next word from this probability: $w_{t+1} \sim p_t$.
[Figure: the decoder takes $\{g^{\mathrm{out}}_1, ..., g^{\mathrm{out}}_t\}$ as input, uses the output of the $L^{\mathrm{Enc}}$ encoder layers, and samples $w_{t+1} \sim p_t \in \mathbb{R}^{V}$.]
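The auto-regressive loop can be sketched as follows. This is a hedged illustration: `next_logits` is a hypothetical stand-in for the full decoder (which would use the whole prefix and $H^{\mathrm{Enc}}$), and the start token, end token, vocabulary size, and maximum length are assumptions made for the example.

```python
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def generate(next_logits, start_token, end_token, max_len, rng):
    """Sample w_{t+1} ~ p_t one word at a time until end_token or max_len."""
    words = [start_token]
    while len(words) < max_len:
        p_t = softmax(next_logits(words))           # p_t in R^V
        w_next = int(rng.choice(len(p_t), p=p_t))   # w_{t+1} ~ p_t
        words.append(w_next)
        if w_next == end_token:
            break
    return words

V = 5                                               # toy vocabulary size (assumption)
rng = np.random.default_rng(0)                      # arbitrary seed
E = rng.normal(size=(V, V))                         # random table standing in for the decoder

def toy_next_logits(words):
    """Toy decoder: logits depend only on the last word."""
    return E[words[-1]]

seq = generate(toy_next_logits, start_token=0, end_token=V - 1, max_len=10, rng=rng)
```

Each step reruns the decoder on the words generated so far, which is why generation is sequential even though encoding is parallel.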
Understanding self-attention
Illustration of self-attention:
Language model during training
[Figure: word embedding layer, self-attention layer, classification layer; input "I live in Paris. My citizenship is MASK. I eat cheese."; predicted word: "French".]
During training, the network learns to give attention to the words (the context) that make sense to predict the masked word.
Language model at inference
[Figure: inputs "Sandy broke the world record." and "Sandy broke the law."]
These two representations of the word “broke” are different.
At inference, the network computes the word representation depending on the context.
Understanding cross-attention
Illustration of cross-attention:
Machine translation
[Figure: cross-attention illustrated on a machine translation example.]
Receptive field
Layer 2
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
Layer 1
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
<latexit sha1_base64="wnZRiOy9uRjGNvcrK/B4kdblvVc=">AAACIHicbVBNS8NAEN34bfyqevQSLYKnkohYj4IXjxVsKyShbDaTdnGzCbsToYT+FC/+FS8eFNGb/hq3bQ7aOssOj/dmZndelAuu0XW/rIXFpeWV1bV1e2Nza3untrvX0VmhGLRZJjJ1F1ENgktoI0cBd7kCmkYCutH91VjvPoDSPJO3OMwhTGlf8oQziobq1ZqBgAR9O4igz2VJlaLDUSnEyA4OzQnG2Q5AxpVkB4r3Bxj2anW34U7CmQdeBeqkilav9hnEGStSkMgE1dr33BxDMxU5E2DmFhpyyu5pH3wDJU1Bh+VkwZFzbJjYSTJlrkRnwv7uKGmq9TCNTGVKcaBntTH5n+YXmFyEJZd5gSDZ9KGkEA5mztgtJ+YKGIqhAZQpbv7qsAFVlKHx1DYmeLMrz4POacM7b3g3Z/XLk8qONXJAjsgJ8UiTXJJr0iJtwsgjeSav5M16sl6sd+tjWrpgVT375E9Y3z/p3aK8</latexit>
Layer 0
Hierarchical representation
Multiple layers capture a hierarchical representation.
A simple illustration
Given the distribution of data (Layer 0).
Suppose that the attention context is defined by the closest data points.
At each layer, the self-attention
[Figure: Layer 1, Layer 2]
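One way to read this illustration is sketched below. This is a minimal sketch under stated assumptions, not the lecture's implementation: the data are 1-D points, the attention context of a point is its k closest points (itself included), the attention score is the negative squared distance, and each layer replaces a point by the attention-weighted mean of its context. The names `local_attention_layer`, `spread` and `k` are hypothetical.

```python
import math

def local_attention_layer(points, k=3):
    """One self-attention layer whose context is the k closest points.

    Assumption (not from the slides): the score between two points is the
    negative squared distance, and the output is the softmax-weighted mean.
    """
    out = []
    for x in points:
        # Context = k nearest points to x (x itself is at distance 0).
        context = sorted(points, key=lambda y: abs(y - x))[:k]
        scores = [-(y - x) ** 2 for y in context]
        m = max(scores)                       # subtract the max for stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(w * y for w, y in zip(weights, context)) / total)
    return out

def spread(xs):
    """Width of the interval covered by a list of points."""
    return max(xs) - min(xs)

# Layer 0: raw data forming two groups.
layer0 = [0.0, 0.2, 0.4, 5.0, 5.2, 5.4]
layer1 = local_attention_layer(layer0)
layer2 = local_attention_layer(layer1)

# Within a group, the points contract from one layer to the next.
print(spread(layer0[:3]), spread(layer1[:3]), spread(layer2[:3]))
```

In this toy setting each layer pulls a point toward its neighbours, so the two groups become tighter while staying apart. With real Transformers the scores come from learned query and key projections rather than from raw distances.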
Lab 02
PyTorch implementation of Seq2Seq Transformers
[Figure labels: $L_{\mathrm{Dec}}$, $L_{\mathrm{Enc}}$]
Transformers in 2017
Machine Translation
WMT-2014 dataset
BLEU score
Transformer is 3x faster to train than LSTM and CNN.
CNN: Facebook Research’s "Convolutional Sequence to Sequence Learning".
Transformer has 24 layers vs LSTM w/ 3 layers and CNN w/ 40 layers.
Outline
Language Models
Memory Networks
Transformers
Sequence-To-Sequence Transformers
Transfer Learning
Conclusion
Language modeling
BERT
Bidirectional Encoder Representations from Transformers (Devlin et al., Google AI Language, 2019)
Uses positional encoding, class index and sentence index.
Trained with two levels of hierarchical context:
Local dependencies with masked-word prediction.
Global dependencies with sequence (next-sentence) prediction.
x0          x1          x2          x3
Encoder
Sum         Sum         Sum         Sum
w0 p0 s0    w1 p1 s0    w2 p2 s0    w3 p3 s0      (discrete embeddings)
CLS0        the         cat         sleeps        (class index)
0           1           2           3             (positional feature)
SEN0        SEN0        SEN0        SEN0          (sentence index)
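The "Sum" step of this diagram can be sketched as follows. This is a minimal sketch, assuming random embedding tables in place of learned ones and a tiny hidden size; the names `make_table`, `embed` and `encoder_input` are hypothetical and not part of any BERT library.

```python
import random

random.seed(0)
d = 4  # hidden size, kept tiny for readability (hypothetical value)

def make_table(keys):
    """Random embedding table: one d-dimensional vector per discrete index."""
    return {key: [random.gauss(0.0, 1.0) for _ in range(d)] for key in keys}

tokens = ["CLS0", "the", "cat", "sleeps"]
word_emb = make_table(tokens)                # w_i: class (word) index
pos_emb = make_table(range(len(tokens)))     # p_i: positional feature
sent_emb = make_table(["SEN0"])              # s_0: sentence index

def embed(tokens, sentence="SEN0"):
    """Sum the three embeddings of every token: w_i + p_i + s_0."""
    rows = []
    for i, tok in enumerate(tokens):
        w, p, s = word_emb[tok], pos_emb[i], sent_emb[sentence]
        rows.append([a + b + c for a, b, c in zip(w, p, s)])
    return rows

encoder_input = embed(tokens)    # one d-dimensional vector per token
print(len(encoder_input), len(encoder_input[0]))   # 4 4
```

Summing, rather than concatenating, keeps the hidden size fixed regardless of how many kinds of index are attached to a token.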
x0          x1          x2          x3
w0 p0 s0    w1 p1 s0    w2 p2 s0    w3 p3 s0      (discrete embeddings)
CLS0        the         MASK        sleeps        (class index)
0           1           2           3             (positional feature)
SEN0        SEN0        SEN0        SEN0          (sentence index)
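The masking shown in this figure can be sketched as below. This is a minimal sketch under assumptions: the default masking rate of 15% comes from the BERT paper rather than from these slides, the paper's further rule of sometimes keeping or randomly replacing a selected token is left out, and the function names are hypothetical.

```python
import random

MASK = "MASK"

def sample_positions(tokens, mask_prob=0.15, rng=random, protected=("CLS0",)):
    """Pick the positions to hide; mask_prob=0.15 follows the BERT paper."""
    return [i for i, tok in enumerate(tokens)
            if tok not in protected and rng.random() < mask_prob]

def mask_tokens(tokens, positions):
    """Replace the chosen tokens by MASK and keep the originals as targets."""
    masked = list(tokens)
    targets = {}
    for i in positions:
        targets[i] = masked[i]   # word the model has to predict
        masked[i] = MASK
    return masked, targets

sentence = ["CLS0", "the", "cat", "sleeps"]
masked, targets = mask_tokens(sentence, positions=[2])   # the example above
print(masked)    # ['CLS0', 'the', 'MASK', 'sleeps']
print(targets)   # {2: 'cat'}
```

During training, the loss would be computed only at the positions stored in `targets`, so the model has to recover each hidden word from its left and right context.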
Training
BERT base
12 Transformer layers
768 hidden features
12 attention heads
110M parameters
BERT large
340M parameters
WordPiece tokenization with a vocabulary of only 30K tokens.
Dataset of 3B words.
Training took 256 TPU days (Oct 2018).
Fine-tune on sentence classification, named-entity recognition (word classification), Q&A, etc.
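The 110M figure can be checked with a short count. This is a sketch under assumptions that are not on the slide: the vocabulary size (30522), maximum sequence length (512), two sentence indices and a feed-forward width of 4 times the hidden size are taken from the public BERT configuration, and the BERT large sizes (24 layers, 1024 hidden features) are likewise assumed. The function name is hypothetical.

```python
def bert_parameter_count(layers, hidden, vocab=30522, max_positions=512,
                         segments=2, ffn_mult=4):
    """Count weights and biases of a BERT-style encoder (no task head)."""
    layer_norm = 2 * hidden                         # scale and shift
    embeddings = (vocab + max_positions + segments) * hidden + layer_norm
    attention = 4 * (hidden * hidden + hidden)      # Q, K, V and output maps
    ffn_hidden = ffn_mult * hidden
    feed_forward = (hidden * ffn_hidden + ffn_hidden) \
        + (ffn_hidden * hidden + hidden)
    per_layer = attention + layer_norm + feed_forward + layer_norm
    pooler = hidden * hidden + hidden               # dense layer on the class token
    return embeddings + layers * per_layer + pooler

base = bert_parameter_count(layers=12, hidden=768)
large = bert_parameter_count(layers=24, hidden=1024)   # assumed sizes
print(f"BERT base:  {base / 1e6:.1f}M parameters")
print(f"BERT large: {large / 1e6:.1f}M parameters")
```

Under these assumptions the count for BERT base lands at roughly 110M, and the count for BERT large at roughly 335M, close to the quoted 340M. The number of attention heads does not appear in the count because the heads split the hidden features instead of adding to them.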
GPT-2
GPT-3
GPT-4
Outline
Language Models
Memory Networks
Transformers
Sequence-To-Sequence Transformers
Transfer Learning
Conclusion
Conclusion
The human attention mechanism focuses biological resources on a small set of important
things (visual, sound, cognitive signals) to make decisions.
Attention neural networks (ANNs) are a generic/universal architecture for processing any unstructured dataset, a.k.a. sets.
Attention is “eating” deep learning.
Transformers for Computer Vision with Vision Transformers (Dosovitskiy et al., Google Brain, 2021).
Transformers for Graphs with Graph Transformers.
Issue with long sequences because complexity is $O(L^2 d)$.
Reducing complexity
Long sequence issue with $O(L^2 d)$, $L$ being the sequence length and $d$ the hidden dimension.
Sparse transformers such as BigBird (Zaheer et al., Google Research, 2020).
[Figure: Original Transformers vs. Structured Transformers]
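The gap between full and sparse attention can be made concrete by counting operations. This is a minimal sketch under assumptions: the sparse pattern combines a sliding window with a few global tokens, which is a simplification in the spirit of sparse Transformers and leaves out the random connections used by BigBird. The function names and the default values of `window` and `n_global` are hypothetical.

```python
def full_attention_cost(L, d):
    """Multiply-adds to form all L x L scores, each a d-dimensional dot product."""
    return L * L * d

def sparse_attention_pairs(L, window=3, n_global=2):
    """(query, key) pairs kept by a sliding window plus a few global tokens."""
    pairs = set()
    for i in range(L):
        for j in range(max(0, i - window), min(L, i + window + 1)):
            pairs.add((i, j))            # local window around position i
        for g in range(min(n_global, L)):
            pairs.add((i, g))            # every token attends to the global tokens
            pairs.add((g, i))            # global tokens attend to every token
    return pairs

def sparse_attention_cost(L, d, window=3, n_global=2):
    """Multiply-adds when only the kept pairs are scored."""
    return len(sparse_attention_pairs(L, window, n_global)) * d

d = 64
for L in (128, 256, 512):
    print(L, full_attention_cost(L, d), sparse_attention_cost(L, d))
```

Doubling the sequence length multiplies the full cost by four, while the sparse cost only about doubles, because the number of kept pairs per token stays bounded.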
Interpretability
What Does BERT Look At? An Analysis of BERT’s Attention (Clark et al., 2019).
$$\mathrm{Softmax}_{\mathrm{row}}\left(\frac{QK^T}{\sqrt{d}}\right) \in \mathbb{R}^{L \times L}$$
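The attention matrix that such analyses inspect can be computed directly from its definition. This is a minimal sketch in plain Python, assuming a single head and toy values for Q and K; the function name `attention_matrix` is hypothetical.

```python
import math

def attention_matrix(Q, K):
    """Row-wise softmax of Q K^T / sqrt(d) for lists of d-dimensional rows."""
    d = len(Q[0])
    A = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        m = max(scores)                          # for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        A.append([e / total for e in exps])
    return A

# Hypothetical toy input: L = 3 tokens, d = 2 features.
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
A = attention_matrix(Q, K)

for row in A:
    print([round(a, 3) for a in row], "sum =", round(sum(row), 3))
```

Each row of the L x L matrix is a probability distribution over the tokens, so reading row i shows how much token i attends to every other token.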
Open-source
Questions?