g-dog wrote on 09/20/10 at 19:42:18:
The methods used by chess.com staff to identify engine-using cheaters aren't known. They may well have a Top 3 matching kind of analysis going on, and then something else too. They also have an IM who leads their detection efforts; he may eyeball games for non-human-seeming moves.
The Dembo analysis results you see come from members of chess.com, not staff.
I was one of those who ran her games using Batch Analyzer (BA), the same software RHP game mods use for their cheat detection. The results of her 20-game batch exceeded the human thresholds + 5% that I have derived after analyzing, in total, >13,000 moves from WC OTB matches, a few modern super-GM tournament performances, and, importantly, the top finishers in the 5th-11th ICCF WC Finals (>7,700 moves).
For all benchmark games and all games of any suspected player, the same hardware, software, and BA settings are used.
All benchmark games have their database moves screened out from analysis using MegaBase 2010 and MegaCorr 4, with the DBs being rolled back to the time of the benchmark games.
My human thresholds based on the benchmark games are 61/78/85% for Top 1/Top 2/Top 3 matching. Add 5 percentage points to account for error, and anything above 66/83/90% = blatant engine use.
Her results:
{ YelenaDembo (Games: 20) }
{ Top 1 Match: 493/694 ( 71.0% )
{ Top 2 Match: 609/694 ( 87.8% )
{ Top 3 Match: 644/694 ( 92.8% )
{ Top 4 Match: 660/694 ( 95.1% )
Database moves were determined using MegaBase 2010 and MegaCorr 4 and screened out from analysis.
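For readers who want to check the arithmetic, here is a minimal sketch that recomputes the match rates from the counts posted above and compares them with the 61/78/85% benchmark plus the 5-point margin. The function and variable names are made up for illustration and are not part of Batch Analyzer; the sketch only reproduces the comparison and says nothing about how the thresholds themselves were derived.

```python
# Counts copied from the posted Batch Analyzer output (694 non-database moves).
TOTAL_MOVES = 694
matches = {"Top 1": 493, "Top 2": 609, "Top 3": 644}

# Benchmark thresholds stated in the post (61/78/85%) and the 5-point margin.
benchmark = {"Top 1": 61.0, "Top 2": 78.0, "Top 3": 85.0}
MARGIN = 5.0


def exceeds_thresholds(matches, total, benchmark, margin):
    """Return {category: (rate_percent, cutoff_percent, exceeded)}.

    Hypothetical helper, not a Batch Analyzer function.
    """
    report = {}
    for name, count in matches.items():
        rate = 100.0 * count / total
        cutoff = benchmark[name] + margin
        report[name] = (round(rate, 1), cutoff, rate > cutoff)
    return report


report = exceeds_thresholds(matches, TOTAL_MOVES, benchmark, MARGIN)
for name, (rate, cutoff, exceeded) in report.items():
    verdict = "exceeds" if exceeded else "within"
    print(f"{name}: {rate}% vs cutoff {cutoff}% -> {verdict}")
```

On the posted counts all three categories come out above their cutoffs, which matches the claim in the post; whether the cutoffs are the right ones is a separate question.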
3 other BA users had the same results with the same games using varying engines and hardware. She exceeded the heritage human thresholds +5% of 65/80/90% in all cases.
The games were the 18 most recent vs. 2200+ opposition, plus 2 more most recent vs near 2200 to reach the 20 game batch size.
Insofar as engine use can be determined by a Top 3 analysis, she has been caught.
She apparently used engines freely during her games. She's a cheater and I wouldn't dare bet against it.
Is she 100% guilty? I wouldn't say that, but we are pretty close in my opinion.
There are, after all, other ways the data can be held up to the light.
Pretty close?
100% certain? This is a statistical problem, you understand? So what is the model, what are the statistics, and what is the degree of confidence with which your conclusion about Dembo was obtained? If you don't understand the relevance of this information you have no business drawing definite conclusions that could have consequences for IM Dembo outside chess.com.
E.g., I suggest you produce a random set of 2,000 20-game sets from games of persons rated in Dembo's class and who play on chess.com, not necessarily all games in a given 20-game set played by the same player, and see what percent of the 2,000 trigger your +5% criterion.
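To make the suggestion concrete, here is a rough sketch of that resampling check. Everything about the data is an assumption: `synthetic_pool` fabricates placeholder per-game counts purely so the code runs, and it would have to be replaced with real engine-match counts from games by comparably rated chess.com players. I am also assuming a set is "flagged" only when all three cutoffs (Top 1, Top 2, Top 3) are exceeded, which the original post does not spell out.

```python
import random


def synthetic_pool(n_games, p1, p2, p3, rng):
    """PLACEHOLDER DATA, not real games.

    Each game is (moves, top1, top2, top3); every move lands in the engine's
    top 1 / top 2 / top 3 with cumulative probabilities p1 <= p2 <= p3.
    """
    pool = []
    for _ in range(n_games):
        moves = rng.randint(25, 45)  # assumed count of non-database moves
        t1 = t2 = t3 = 0
        for _ in range(moves):
            u = rng.random()
            t1 += u < p1
            t2 += u < p2
            t3 += u < p3
        pool.append((moves, t1, t2, t3))
    return pool


def flagged_fraction(pool, cutoffs, n_sets=2000, set_size=20, seed=0):
    """Fraction of random game sets whose pooled Top 1/2/3 rates all exceed cutoffs.

    cutoffs are fractions, e.g. (0.66, 0.83, 0.90). Games in a set are drawn
    without replacement and need not come from the same player.
    """
    rng = random.Random(seed)
    flagged = 0
    for _ in range(n_sets):
        batch = rng.sample(pool, set_size)
        moves = sum(game[0] for game in batch)
        rates = [sum(game[i] for game in batch) / moves for i in (1, 2, 3)]
        if all(rate > cutoff for rate, cutoff in zip(rates, cutoffs)):
            flagged += 1
    return flagged / n_sets


# Illustrative run; the probabilities below are invented, not measured.
pool = synthetic_pool(500, 0.55, 0.72, 0.82, random.Random(1))
rate = flagged_fraction(pool, (0.66, 0.83, 0.90))
print(f"flagged {rate:.1%} of 2000 random 20-game sets")
```

With real data in place of the placeholder pool, the printed fraction is an estimate of how often the +5% criterion fires on players nobody suspects, which is the number needed before calling any single result conclusive.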
What exactly are your "thresholds"? Means, percentiles of given distributions, what?
Also there is something funny about your sample including ICCF games, since ICCF games are played with full-blown computer assistance with a fairly high degree of probability. You seem to be saying that Dembo is relying on an engine even more than people who rely upon engines?!
Would you care to state the exact engine and think time that you're using to establish a set of machine preferences? Any experience using other engines?