Normal Topic A statistical analysis of amateur endgame position (Read 3186 times)
God Member

"Football is like Chess,
only without the dice."

Posts: 1456
Location: Reading
Joined: 09/22/05
Gender: Male
Re: A statistical analysis of amateur endgame position
Reply #1 - 11/10/17 at 02:50:22
Interesting. I ran a similar but simplistic exercise on my own games that reached 0-1 pieces for each side + pawns, and found RvR in about 55% of them.

Not a huge sample, as only about 10% of my games got that far, but sufficient to point to Rook Endings being worthy of study more than all the others combined!

Those who want to go by my perverse footsteps play such pawn structure with fuzzy atypical still strategic orientations

Clowns to the left of me, jokers to the right, stuck in the middlegame with you
YaBB Newbies

I Love ChessPublishing!

Posts: 1
Joined: 10/04/17
A statistical analysis of amateur endgame position
11/10/17 at 01:51:39
Hi all,

First-time poster here. I thought I'd share a large-scale statistical exercise I just ran. (Props to BigGreenShrek for giving me the idea.)

Any thoughts on the protocol or on the results would be much appreciated!


Some endgame books (e.g., de la Villa) recommend focusing on rook endgames, because they show up often in top-level games. This is interesting, but could be of limited use to amateur players, whose games often play out very differently from GM games. Other endgame books (e.g., Silman) present the Philidor and Lucena positions, but label most other rook endgame material as "expert-level content". Unfortunately, Silman doesn't really offer a justification for that choice, and I'm not sure the endgames he presents first are really the most "useful"/frequent ones.


Which endgames do amateur players encounter most frequently? Which endgames should they study?

Data: publishes all the games played on their server as PGN files. For instance, the September 2017 file includes 12,564,109 games. My computer is still running, because it takes forever to process that many games. So far, I've identified and analyzed nearly 150,000 "relevant" games that include "proper" endgame positions. I think this is close to the point of diminishing returns, since the results don't seem to change much as I add new games.
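
For anyone who wants to try something similar: with a file that size, it helps to read the PGN as a stream instead of loading it whole. Below is a minimal sketch in plain Python, not the script actually used for the numbers in this post. The names iter_games and count_plies are made up for illustration, and it assumes every game has some movetext (at least a result token) and no variations in parentheses.

```python
import re

# A PGN tag pair looks like: [Event "Some event"]
TAG = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')
RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}

def iter_games(lines):
    """Yield (headers, movetext) per game from any iterable of PGN lines.

    Works on an open file object, so the whole file never sits in memory.
    Assumption: every game has some movetext, so a tag pair seen after
    movetext marks the start of the next game.
    """
    headers, moves = {}, []
    for line in lines:
        line = line.strip()
        match = TAG.match(line)
        if match:
            if moves:
                yield headers, " ".join(moves)
                headers, moves = {}, []
            headers[match.group(1)] = match.group(2)
        elif line:
            moves.append(line)
    if headers or moves:
        yield headers, " ".join(moves)

def count_plies(movetext):
    """Count half-moves, ignoring comments, move numbers, NAGs and the result."""
    text = re.sub(r"\{[^}]*\}", " ", movetext)  # drop { ... } comments
    text = re.sub(r"\d+\.+", " ", text)         # drop "12." and "12..."
    return sum(1 for tok in text.split()
               if tok not in RESULTS and not tok.startswith("$"))
```

With this, "at least 40 moves by each player" would be count_plies(movetext) >= 80, and the loop can run directly over open(path) line by line.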


Here are the criteria I used to identify and extract endgame positions (do you have any suggestions to improve this?):

* Classical games only
* The game needs to include at least 40 moves by each player
* The position needs to have stayed on the board for at least 4 half-moves
* Maximum amount of material per player: 13 (Q=10,R=5,B=3,N=3,P=1)
* Maximum of 3 pawns per side
* No player has more than 2 pieces on the board (excluding king and pawns)
* No overwhelming material advantage (max difference: 4)
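
To make the material criteria above concrete, here is a rough sketch of how they could be checked on a single position, given its FEN. It is an illustration under my own assumptions, not the exact code behind the results: the function names are hypothetical, and it covers only the four material rules (the classical-only, 40-move and 4-half-move rules depend on the whole game, not on one position).

```python
from collections import Counter

# Piece values as listed in the criteria (note Q=10, not the usual 9).
VALUES = {"q": 10, "r": 5, "b": 3, "n": 3, "p": 1}

def side_counts(fen):
    """Return (white, black) Counters of non-king pieces from a FEN string."""
    placement = fen.split()[0]  # accept a full FEN or just the board field
    white, black = Counter(), Counter()
    for ch in placement:
        if ch.isalpha() and ch.lower() != "k":
            (white if ch.isupper() else black)[ch.lower()] += 1
    return white, black

def material(counts):
    """Total material value of one side."""
    return sum(VALUES[piece] * n for piece, n in counts.items())

def is_endgame_position(fen, max_material=13, max_pawns=3,
                        max_pieces=2, max_diff=4):
    """Apply the four material criteria to one position."""
    white, black = side_counts(fen)
    for side in (white, black):
        if material(side) > max_material:
            return False
        if side["p"] > max_pawns:
            return False
        # Pieces here means everything except king and pawns.
        if sum(n for piece, n in side.items() if piece != "p") > max_pieces:
            return False
    return abs(material(white) - material(black)) <= max_diff
```

For example, is_endgame_position("8/5kp1/4r3/8/8/2R5/4KPP1/8 w - - 0 1") is True (rook and two pawns against rook and pawn), while a lone rook against a bare king is rejected because the difference is 5.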


In amateur games, Rook endgames are absolutely dominant! If my statistical analysis is correct, amateurs should spend most of their endgame study "budget" looking at rooks, and it's not even close.

67% of games include an endgame position with rook(s)
38% of games include an endgame position with bishop(s) (with or without rook(s))
18% of games include an endgame position with bishop(s) (without rooks)
31% of games include an endgame position with knight(s) (with or without rook(s))
15% of games include an endgame position with knight(s) (without rooks)

Only 37% of all endgame positions in my database do not include a rook.

Here are the 50 most frequent endgame positions, with the % of games in which each one is found (p+ means 2 or more pawns).

pieces                games_share
   rp+ vs. rp+        14.8
    rp vs. rp+        14.2
      p vs. p+        11.3
     p+ vs. p+        10.9
     r vs. rp+         7.2
      r vs. rp         6.2
  brp+ vs. rp+         5.3
    bp+ vs. p+         5.1
       p vs. p         4.9
     rp vs. rp         4.8
    p+ vs. rp+         4.4
  nrp+ vs. rp+         4.3
    np+ vs. p+         4.0
   bp+ vs. rp+         3.6
    p+ vs. qp+         3.2
   np+ vs. rp+         3.0
     p vs. rp+         3.0
   bp+ vs. bp+         2.8
   bp+ vs. np+         2.7
     p+ vs. rp         2.5
       r vs. r         2.4
      p+ vs. r         2.3
   qp+ vs. qp+         2.3
     p vs. qp+         2.2
     bp vs. p+         2.1
 brp+ vs. nrp+         2.1
    bp vs. bp+         2.0
      bp vs. p         2.0
      p vs. qp         2.0
 brp+ vs. brp+         1.9
      p vs. rp         1.9
   brp vs. rp+         1.8
     np vs. p+         1.8
 2rp+ vs. 2rp+         1.7
     bp+ vs. p         1.7
       p vs. q         1.7
    qp vs. qp+         1.7
  2rp+ vs. rp+         1.6
      np vs. p         1.6
   np+ vs. np+         1.6
       p vs. r         1.6
     p+ vs. qp         1.6
 2rp+ vs. brp+         1.5
   brp+ vs. rp         1.5
   nrp vs. rp+         1.5
    bp vs. rp+         1.4
    bp+ vs. rp         1.4
  qrp+ vs. rp+         1.4
    np vs. np+         1.3
   p+ vs. qrp+         1.3
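
For reference, the labels in the table can be produced along these lines. This is a sketch based on how the labels appear to be built (pieces in alphabetical order, a count prefix for doubled pieces, "p" or "p+" for pawns, and the two sides sorted so that colour doesn't matter); the function names are hypothetical. Since a game can contain several different endgame positions, each label is counted once per game, which is why the shares add up to more than 100%.

```python
from collections import Counter

def side_label(pieces):
    """Label one side from its non-king pieces, e.g. 'RRPP' -> '2rp+'."""
    counts = Counter(pieces.lower())
    label = ""
    for piece in "bnqr":  # alphabetical order, as in the table
        n = counts[piece]
        if n:
            label += (str(n) if n > 1 else "") + piece
    if counts["p"] == 1:
        label += "p"
    elif counts["p"] >= 2:
        label += "p+"  # p+ means 2 or more pawns
    return label

def position_label(white, black):
    """Colour-agnostic label: the two side labels in sorted order."""
    return " vs. ".join(sorted([side_label(white), side_label(black)]))

def games_share(games):
    """games: one collection of position labels per game.

    Returns {label: % of games containing it}, each label counted
    at most once per game.
    """
    games = [set(labels) for labels in games]
    hits = Counter(label for labels in games for label in labels)
    return {label: round(100 * n / len(games), 1) for label, n in hits.items()}
```

For example, position_label("RPP", "rpp") gives "rp+ vs. rp+" and position_label("RRPP", "brpp") gives "2rp+ vs. brp+".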
