
State–action–reward–state–action: Revision history




19 July 2024

1 July 2024

1 June 2024

21 May 2024

13 December 2023

5 December 2023

27 November 2023

9 August 2023

17 May 2023

30 December 2022

1 November 2022

31 October 2022

3 July 2022

2 July 2022

4 May 2022

5 December 2021

29 September 2021

12 July 2021

6 May 2021

4 March 2021

6 January 2021

5 December 2020

21 September 2020

7 July 2020

3 July 2020

3 May 2020

6 February 2020

3 December 2019

26 November 2019

17 October 2019

10 July 2019

5 July 2019

26 February 2019

15 February 2019

9 November 2018

10 July 2018

11 March 2018

9 March 2018

28 February 2018

6 February 2018

3 February 2018

1 February 2018

29 October 2017

  • 10:09, 29 October 2017, 2.242.24.134 (4,767 bytes, +4): Corrected the formula according to http://incompleteideas.net/sutton/book/ebook/node64.html. Whether the reward is written r_t or r_{t+1} depends on whether the environment reacts instantaneously or one time step later; it appears to make more sense to assume a temporal delay.

20 September 2017

4 July 2017
