Poll Question: What is the evaluation of the starting position?



« Created by: Marc Benford on: 06/29/14 at 17:45:05 »
Very Hot Topic (More than 25 Replies) What is the evaluation of the starting position? (Read 15548 times)
986
Full Member
***
Offline


I love ChessPublishing.com!

Posts: 105
Location: Germany
Joined: 06/11/05
Re: What is the evaluation of the starting position?
Reply #29 - 07/30/14 at 14:49:50
Quote:
I just have two questions:
What about Fritz's and Rybka's evaluations compared to other engines' evaluations? Are they higher or lower and by how much precisely?


I don't know of any studies. From experience I can say that Rybka 4's evaluations are similar to Houdini's, only a little higher, while Fritz 13's evaluations are much higher than Houdini's.
Quote:
Where did Stockfish's, Fritz's, Rybka's and Komodo's evaluations come from? Are they calibrated on the win expectancy too?

I think the Fritz manual says the evaluation is measured in pawns. I don't know whether the other engines are also calibrated on win expectancy, but I don't think so. It seems to be a special feature of Houdini, as its author advertises the engine with it.
  
 
Marc Benford
Full Member
***
Offline


I Love ChessPublishing!

Posts: 101
Joined: 07/17/13
Re: What is the evaluation of the starting position?
Reply #28 - 07/29/14 at 19:34:51
986 wrote on 07/23/14 at 20:48:44:
Houdini's evaluations are tied to the win expectancy of the position: "A +1.00 pawn advantage gives a 80% chance of winning the game against an equal opponent at blitz time control. At +2.00 the engine will win 95% of the time, and at +3.00 about 99% of the time." (Houdini website)

There was also a test comparing the engines' evaluations:
"Adam Hair did a linear regression comparing the 44,000+ evaluations and concluded that Stockfish DD is about 1.88x higher than Houdini 4 while Komodo TCECr is 1.18x higher." (TCEC website)

So for Houdini, the "correct" evaluation of the starting position is the one that corresponds to a winning percentage of ~55%.
But there are some problems. According to databases, White's advantage is higher in blitz than in rapid or tournament games, and in human games White's advantage increases when only the best players compete.

Regards Tom

Thank you! Very interesting info.

I just have two questions:

What about Fritz's and Rybka's evaluations compared to other engines' evaluations? Are they higher or lower and by how much precisely?

Where did Stockfish's, Fritz's, Rybka's and Komodo's evaluations come from? Are they calibrated on the win expectancy too?
  
 
986
Full Member
***
Offline


I love ChessPublishing.com!

Posts: 105
Location: Germany
Joined: 06/11/05
Re: What is the evaluation of the starting position?
Reply #27 - 07/23/14 at 20:48:44
Houdini's evaluations are tied to the win expectancy of the position: "A +1.00 pawn advantage gives a 80% chance of winning the game against an equal opponent at blitz time control. At +2.00 the engine will win 95% of the time, and at +3.00 about 99% of the time." (Houdini website)

There was also a test comparing the engines' evaluations:
"Adam Hair did a linear regression comparing the 44,000+ evaluations and concluded that Stockfish DD is about 1.88x higher than Houdini 4 while Komodo TCECr is 1.18x higher." (TCEC website)

So for Houdini, the "correct" evaluation of the starting position is the one that corresponds to a winning percentage of ~55%.
But there are some problems. According to databases, White's advantage is higher in blitz than in rapid or tournament games, and in human games White's advantage increases when only the best players compete.

Regards Tom
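
As a rough illustration of what those regression factors imply, here is a minimal Python sketch that rescales an evaluation onto Houdini 4's scale. It assumes the relationship is a pure multiplier with no offset, which the quoted summary suggests but does not state. The dictionary and function names are made up for this example and do not come from any engine.

```python
# Regression slopes relative to Houdini 4, as quoted from the TCEC website.
# Assumption: a pure multiplier with no intercept term.
SCALE_VS_HOUDINI = {
    "Houdini 4": 1.00,
    "Komodo TCECr": 1.18,
    "Stockfish DD": 1.88,
}


def to_houdini_scale(engine: str, evaluation: float) -> float:
    """Divide an engine's evaluation by its slope against Houdini 4."""
    return evaluation / SCALE_VS_HOUDINI[engine]


if __name__ == "__main__":
    # Show what a +1.00 from each engine would roughly mean in Houdini's units.
    for engine in SCALE_VS_HOUDINI:
        rescaled = to_houdini_scale(engine, 1.00)
        print(f"{engine}: +1.00 is roughly {rescaled:+.2f} in Houdini 4 units")
```

Under these assumptions, a +1.00 from Stockfish DD corresponds to only a little over half a pawn in Houdini 4's units, which is why raw numbers from different engines should not be compared directly.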

  
 
ReneDescartes
God Member
*****
Offline


Qu'est-ce donc que je
suis? Une chose qui pense.

Posts: 1236
Joined: 05/17/10
Gender: Male
Re: What is the evaluation of the starting position?
Reply #26 - 07/06/14 at 00:45:20
GeneM wrote on 07/05/14 at 21:05:19:
dfan wrote on 07/04/14 at 13:47:48:
Marc Benford wrote on 07/04/14 at 05:37:41:
the true evaluation of the starting position

My point is that there is no such thing. But I won't belabor it any longer, since the debate is hopeless.

Therefore there are probably many more legal chess positions for which there is no such thing as an "evaluation".

This raises the question, dfan:
What makes some chess positions subject to meaningful evaluation, whereas other chess positions are not subject to meaningful evaluation?

.


I'd like to explain dfan's point, the thoughts behind which were also in the background of my earlier, satirical post. The existence of computer programs that spout centipawn evaluations and can beat you, and which furthermore are used to find tactical errors or opening weapons, makes some people believe in an objective or absolute true centipawn evaluation. In fact, such a thing is a contradiction in terms.

To see this, consider perfect play in the form of a tablebase, for example the seven-man tablebases just compiled and available by subscription at ChessOK. A computer (without a tablebase) may evaluate a given tablebase position as +.15, or whatever, when the tablebase has backsolved everything to either a theoretical win for White, a theoretical draw, or a theoretical win for Black. These are not numeric outcomes. Even a theoretical draw is not expressible numerically: 0.00 only has a numerical meaning where other numbers are possible. Furthermore, in that context it merely expresses the expectation that White will gather half the tournament points, but says nothing of whether this will occur by drawing.

For the same reason, the king's being on the board has no point value. Other material values (for a knight or queen or pawn, etc.) are used for estimating the relative effectiveness of various material imbalances for the purpose of eventually mating, or preventing the mating, of the king. One cannot use the value of a lost king for this purpose, because it is illegal to have a lost king. Not even an infinite value will do, first because infinity is not a number and second because the object of the game is not to accumulate material advantage. And if your computer uses +1000 as a symbol for mate, that does not make mate exactly 1000 times as useful as a pawn. Mate and a pawn are things of entirely different types that are not comparable. Do not throw out your brain when looking at a computer! Game outcomes are not numeric.

So what is the computer saying when it says White is up by .15? Isn't it saying that White has a positional advantage equivalent to 0.15 of a pawn, that is, 15 hundredths of a pawn? No: the computer is expressing something that only has meaning relative to imperfect play.

A person or a computer that does not play perfectly may play a given pawn-up position 100 times and win 79% of the points (if it's Karpov playing other grandmasters) or 54% of the points (if it's a child playing other children); on the other hand, a tablebase playing another tablebase in that same position will either win all the time, draw all the time, or lose all the time. For Karpov or the child, there is no such thing as a position with a 100% probability of winning, or of drawing or of losing; there are no certainties. For a group of tablebases or theoretically perfect players, there are nothing but certainties. Every position will produce 100% draws, 100% wins, or 100% losses.

Now one might construct a table linking a given numerical advantage to a given average number of tournament or match points garnered, but such a table would only be valid for a given class of imperfect player. Its contents would have to be revised for another who plays differently. This alone is enough to tell you that numeric evaluations are not absolute truths. It is also why different engines' evaluations are indeed in different units--each engine is expressing the estimated outcome of its own treatment of the position.

Furthermore, the very idea of material value is only meaningful for imperfect players, and these meanings are equally dependent on which imperfect players are concerned. One can observe that Black has a queen for two knights, but whether this advantage is less or greater (in expected percentage of tournament points gathered) than that of three pawns depends on the players concerned and on how they handle pieces and pawns. For a tablebase, or for "theoretical" purposes, on the other hand, material advantages do not exist--only the position exists. All one can say objectively is that one side has a queen while the other has two knights and that the position is, for example, 100% drawn.

When Marc Benford jokes that the theoretical maximum ply would be sufficient, yet still expects a numeric evaluation there, and says that Fritz's units are closer to "true units" than Houdini's, he is contradicting himself. If an evaluation has units, it is not true, and if it is true, it does not have units.

By the way, it's impossible to remove material without creating positional ripples that are not material--and computers' numeric valuations of positional factors vary heavily, with Stockfish, for example, often giving positional factors higher material equivalents than Rybka. So much for the experiment of deleting the c-pawn.

--Finally, I would suggest that some posters would get a friendlier response if they did not give orders, or presume to give permissions, to the rest of us.

« Last Edit: 07/06/14 at 18:43:22 by ReneDescartes »  
 
dfan
God Member
*****
Offline


"When you see a bad move,
look for a better one"

Posts: 766
Location: Boston
Joined: 10/04/05
Re: What is the evaluation of the starting position?
Reply #25 - 07/06/14 at 00:33:22
GeneM wrote on 07/05/14 at 21:05:19:
dfan wrote on 07/04/14 at 13:47:48:
Marc Benford wrote on 07/04/14 at 05:37:41:
the true evaluation of the starting position

My point is that there is no such thing. But I won't belabor it any longer, since the debate is hopeless.

Therefore there are probably many more legal chess positions for which there is no such thing as an "evaluation".

This raises the question, dfan:
What makes some chess positions subject to meaningful evaluation, whereas other chess positions are not subject to meaningful evaluation?

.

No chess positions are subject to a "true evaluation" that is accurate to within 0.02 "pawns" (whatever that means), except ones that are forced wins or draws.
  
 
MartinC
God Member
*****
Offline


I Love ChessPublishing!

Posts: 2073
Joined: 07/24/06
Re: What is the evaluation of the starting position?
Reply #24 - 07/05/14 at 21:35:27
Every position has a categorical 'perfect play' evaluation of won/lost/drawn. Beyond that, there is some sort of subjective evaluation based on what the annotator thinks is likely to happen in a real game.

What doesn't exist (and was asked for) is a categorical evaluation in terms of computer evaluations, which aren't even on the same scale as each other :)
  
 
GeneM
Senior Member
****
Offline


Tournament winner gets
two fun filled knights!

Posts: 303
Location: near Seattle WA USA
Joined: 01/12/08
Re: What is the evaluation of the starting position?
Reply #23 - 07/05/14 at 21:05:19
dfan wrote on 07/04/14 at 13:47:48:
Marc Benford wrote on 07/04/14 at 05:37:41:
the true evaluation of the starting position

My point is that there is no such thing. But I won't belabor it any longer, since the debate is hopeless.

Therefore there are probably many more legal chess positions for which there is no such thing as an "evaluation".

This raises the question, dfan:
What makes some chess positions subject to meaningful evaluation, whereas other chess positions are not subject to meaningful evaluation?

.
  

GeneM , CastleLong.com , FRC-chess960
 
Pale Horse, Pale Rider
Senior Member
****
Offline


I Love ChessPublishing!

Posts: 287
Joined: 12/26/12
Re: What is the evaluation of the starting position?
Reply #22 - 07/05/14 at 09:14:40
Smyslov_Fan wrote on 07/05/14 at 06:05:29:
Yup, statistically, a 55-45% split equates to ~35 Elo points. It's pretty straightforward math.


Correct. Kind of a silly question. Thanks for the link
  
 
Smyslov_Fan
YaBB Moderator
Correspondence fan
*****
Offline


Progress depends on the
unreasonable man. ~GBS

Posts: 6902
Joined: 06/15/05
Re: What is the evaluation of the starting position?
Reply #21 - 07/05/14 at 06:05:29
Yup, statistically, a 55-45% split equates to ~35 Elo points. It's pretty straightforward math.

Here's a link to an Elo table so you can look it up yourself:

http://www.pradu.us/old/Nov27_2008/Buzz/elotable.html
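
For anyone who would rather compute the figure than look it up, here is a small Python sketch using the standard logistic Elo formula. That the linked table uses exactly this curve is an assumption, so small rounding differences are possible. The function names are illustrative.

```python
import math


def elo_difference(score: float) -> float:
    """Rating gap implied by an expected score (0 < score < 1), logistic Elo model."""
    return 400 * math.log10(score / (1 - score))


def expected_score(elo_diff: float) -> float:
    """Inverse of elo_difference: expected score for a given rating gap."""
    return 1 / (1 + 10 ** (-elo_diff / 400))


if __name__ == "__main__":
    # White scoring 55% against equal opposition.
    print(round(elo_difference(0.55)))  # 35
```

The second function is the inverse mapping, handy for checking a claimed rating gap against an observed score.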


  
 
kylemeister
God Member
*****
Offline



Posts: 4906
Location: USA
Joined: 10/24/05
Re: What is the evaluation of the starting position?
Reply #20 - 07/04/14 at 22:05:58
I recall Larry Kaufman mentioning 35 points (implied by White's score of about 55%).
  
 
Pale Horse, Pale Rider
Senior Member
****
Offline


I Love ChessPublishing!

Posts: 287
Joined: 12/26/12
Re: What is the evaluation of the starting position?
Reply #19 - 07/04/14 at 21:44:09
Smyslov_Fan wrote on 07/04/14 at 21:41:18:
From a practical perspective, the first move is worth about 30-35 rating points.


Is that number from a big database or something similar? I just looked this up for my OTB games in 2013 and 2014... more than 100 rating points' difference for me :o
  
 
Smyslov_Fan
YaBB Moderator
Correspondence fan
*****
Offline


Progress depends on the
unreasonable man. ~GBS

Posts: 6902
Joined: 06/15/05
Re: What is the evaluation of the starting position?
Reply #18 - 07/04/14 at 21:41:18
From a practical perspective, the first move is worth about 30-35 rating points.
  
 
dfan
God Member
*****
Offline


"When you see a bad move,
look for a better one"

Posts: 766
Location: Boston
Joined: 10/04/05
Re: What is the evaluation of the starting position?
Reply #17 - 07/04/14 at 13:47:48
Marc Benford wrote on 07/04/14 at 05:37:41:
the true evaluation of the starting position

My point is that there is no such thing. But I won't belabor it any longer, since the debate is hopeless.
  
 
MartinC
God Member
*****
Offline


I Love ChessPublishing!

Posts: 2073
Joined: 07/24/06
Re: What is the evaluation of the starting position?
Reply #16 - 07/04/14 at 10:23:15
They're purely relative to other measurements by the same computer, which uses them when judging the best move. Nothing else :)

We know fairly precisely what the evaluation of the initial position is from endless actual experience - a ~55-45 edge to white.
  
 
Marc Benford
Full Member
***
Offline


I Love ChessPublishing!

Posts: 101
Joined: 07/17/13
Re: What is the evaluation of the starting position?
Reply #15 - 07/04/14 at 05:37:41
Quote:
They are not in the same units.
What are these units more precisely? And do Houdini's units correspond better to "1 Pawn = 1.00" than Fritz 12's units? Or is it the other way around?


Humm... Let me try this. When I delete Black's c-Pawn (and White to move), Houdini 3 Pro x64 gives +1.07, while Fritz 12 gives +1.34.
If we just subtract from these evaluations the evaluation that each engine gives for the first-move advantage, we should get each engine's evaluation of being up a Pawn with no other advantage for either side.
For Houdini 3 Pro x64 we have: 1 Pawn = +0.89
While for Fritz 12 we have: 1 Pawn = +0.96
Conclusion: Fritz's units are closer to the "true" units than Houdini's units are. And therefore Fritz's evaluation of the starting position (+0.38) is probably closer to the true evaluation of the starting position.
Note: By "true" units I mean the units so that with no other advantage being up a Pawn is equivalent to +1.00
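
The subtraction above can be written out as a short Python sketch. Fritz 12's starting-position value (+0.38) is the one given in this post. Houdini's (+0.18) is not stated here; it is inferred from 1.07 - 0.89 and should be treated as an assumption. The class and field names are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class EngineReadings:
    name: str
    start_eval: float    # evaluation of the normal starting position
    pawn_up_eval: float  # evaluation with Black's c-Pawn deleted, White to move

    def pawn_value(self) -> float:
        """Pawn-up evaluation minus the first-move advantage, in the engine's own units."""
        return round(self.pawn_up_eval - self.start_eval, 2)


readings = [
    # start_eval for Houdini is inferred (1.07 - 0.89), not quoted in the post.
    EngineReadings("Houdini 3 Pro x64", start_eval=0.18, pawn_up_eval=1.07),
    EngineReadings("Fritz 12", start_eval=0.38, pawn_up_eval=1.34),
]

if __name__ == "__main__":
    for r in readings:
        print(f"{r.name}: 1 Pawn = {r.pawn_value():+.2f}")
```

This only reproduces the arithmetic; it assumes that deleting the c-Pawn changes nothing but material.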


Quote:
What do you mean by "accurate"?
More accurate = closer to the true evaluation of the starting position by using "true" units with 1 Pawn = +1.00 (of course we could always argue that a very accurate evaluation would be 0.00 because chess is a draw at depth infinity like someone else said, but nobody can see up to depth infinity so it would make little sense)


Quote:
You can look at the databases at chessbase.com
I saw that at the bottom right there's something called "Let's Check" with 3 evaluations performed by 3 engines. That's cool, but is there a way to see more evaluations performed by more engines?


Quote:
This one is good, but by what engine were these evaluations done? At what depth? And after how much time?

  