Talk:Gated recurrent unit

From Wikipedia, the free encyclopedia

Fully gated unit picture[edit]

Unless I am mistaken, the picture given for the fully gated recurrent unit does not match the article's equation for the hidden state. The 1- node should connect to the product with the output of tanh, not the product with the previous hidden state. In other words, instead of the 1- node being on the arrow above z[t], it should be on the arrow to the right.

--ZaneDurante (talk) 18:21, 2 June 2020 (UTC)[reply]

Yes, you are right! I also noticed this already in 2016 when I prepared lecture slides based on the formulas and this picture. They do not match. 193.174.205.82 (talk) 14:56, 18 January 2023 (UTC)[reply]

Article requires clarification[edit]

The article does not make clear how the cell connects to other cells, to its own layer, or to anything else.

Remove CARU section?[edit]

Lots of publicity for a paper by Macao authors from a Macao IP address, with limited relevance to the GRU article. 194.57.247.3 (talk) 11:45, 28 October 2022 (UTC)

Please describe what y_hat(t) in the figure represents (it does not appear in the equations). — Preceding unsigned comment added by Geofo (talk • contribs) 11:15, 29 August 2023 (UTC)[reply]

$z$ or $1-z$?[edit]

Why does this article have $h_t = (1-z_t) \odot h_{t-1} + z_t \odot \hat{h}_t$? The original paper (reference [1]) has $h_t = z_t \odot h_{t-1} + (1-z_t) \odot \hat{h}_t$, which is also the convention used by PyTorch (see this page) and TensorFlow (not documented in the obvious place, but clear if you write some code to test it). — Preceding unsigned comment added by Neil Strickland (talk • contribs) 23:19, 28 January 2024 (UTC)[reply]
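To illustrate the point above: the two conventions are not a substantive difference in the model, only a relabeling of the update gate. A minimal NumPy sketch (not PyTorch's or TensorFlow's actual implementation; function names here are made up for illustration) showing that the article's form and the paper's form agree once $z$ is replaced by $1-z$:

```python
import numpy as np

def update_article(z, h_prev, h_hat):
    # Convention as written in the article: h_t = (1 - z) * h_prev + z * h_hat
    return (1 - z) * h_prev + z * h_hat

def update_paper(z, h_prev, h_hat):
    # Convention in the original paper: h_t = z * h_prev + (1 - z) * h_hat
    return z * h_prev + (1 - z) * h_hat

rng = np.random.default_rng(0)
z, h_prev, h_hat = rng.random(4), rng.random(4), rng.random(4)

# The two forms coincide under the relabeling z -> 1 - z:
print(np.allclose(update_article(z, h_prev, h_hat),
                  update_paper(1 - z, h_prev, h_hat)))  # True
```

Since the gate $z_t$ is itself a learned sigmoid output, a trained network can realize either convention equally well; the discrepancy only matters when comparing the article's formulas against a specific diagram or library implementation.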