A nice feature of the internet is that it solves some of the problems it creates!
http://postimage.org/image/g0ykj3bsf/ (4.a3)
http://postimage.org/image/etttid0fj/ (4.e3)
The links above should give you access to two trees for the Queen's Indian (1.d4 Nf6 2.c4 e6 3.Nf3 b6): one for 4.a3 (Petrosjan) and one for 4.e3, both from Black's point of view and more or less following Greet's "Play the Queen's Indian". Sorry, the trees are in German, but that shouldn't cause much difficulty: "K" = King, "D" = Queen, "T" = Rook, "L" = Bishop, "S" = Knight. The trees were generated with yEd, which is freely available on the internet.
One can (hopefully) see that the 4.e3 system (that tree is not finished yet) is much messier than the Petrosjan, which has only a limited number of move-order issues (the variations [4.a3 Ba6] 5.e3, 5.Nbd2 and 5.Qc2 Bb7 6.Nbd2 can transpose into each other).
Over time I've developed a sort of coding. Some of its features: Squares mark positions within a variation where the tree branches, whereas circles are the (more or less) final positions of a variation. The colour of a circle codes the evaluation of the resulting position. A continuous line shows a line as I have it in the accompanying Chessbase file, a dashed line is an alternative move order. Dotted lines are moves that have been played, but quite rarely, so I don't follow them any further.
The lines (or edges of the tree) carry (obviously) the moves and, below them, some figures in parentheses. These indicate the relative frequency of the moves. Take e.g. the starting position of the Petrosjan Variation after 4.a3 Bb7: The "2,981" under the node is the number of times this particular position has been reached, according to Mega 2012 (+ some updates). Now I've given this whole variation a "value" of 1000 / 3 = 333 "points". This figure is chosen somewhat arbitrarily: The bigger that "value", the more detailed (and complex) the tree will be in the end. These 333 points are distributed among White's possible 5th moves. By far the most common 5th move in this position is 5.Qc2, played 2,210 times; another move is 5.e3, played 376 times. Distributing the value of 333 among the 6 moves shown in the tree in proportion to their frequencies yields a value of 247 for 5.Qc2 and 42 for 5.e3 (5.b3 has been played 32 times, which yields a value of 4, which I consider too low, so I stop there). The values add up to 334 (you can check that); the difference from the starting value of 333 is due to rounding.
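For those who'd rather let the computer do the arithmetic, here is a minimal Python sketch of the distribution step. The function name `distribute` is just something I made up for illustration, and I'm assuming each move's share is its game count divided by the 2,981 games of the parent position, rounded to whole points. Only the three move counts quoted above are included; the other three moves from the tree would be added in the same way.

```python
def distribute(parent_value, parent_count, move_counts):
    """Split parent_value among the moves in proportion to how often each was played.

    Assumption: share = games with the move / games reaching the parent position,
    rounded to the nearest whole point.
    """
    return {
        move: round(parent_value * games / parent_count)
        for move, games in move_counts.items()
    }


START_VALUE = 1000 // 3   # 333 points for the whole variation
PARENT_COUNT = 2981       # games reaching the starting position

# Only the counts quoted in the post; the tree shows six moves in total.
counts = {"5.Qc2": 2210, "5.e3": 376, "5.b3": 32}

values = distribute(START_VALUE, PARENT_COUNT, counts)
for move, value in values.items():
    print(move, value)
```

Because every share is rounded separately, the shares don't have to add up exactly to the starting value, which is where the small 333 vs. 334 discrepancy comes from.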
This way of computing the "values" of the positions is followed through the whole tree. As a general rule, I stop a variation when its value is down to approx. 10 (that is, when the likelihood of this particular position occurring is 10 / 333 = about 3%). (I still follow such lines if they lead to transpositions.) This way I control the size of my repertoire: Using a "starting value" of 333 and a "stop value" of 10 yields approx. 33 variations (it might be 30 or 37, as in some cases you want to stop earlier, when the variation is not forced at all and rather requires understanding, or you want to stop later, when there are some tactics you don't want to leave out). After having mastered this repertoire, you might think about going into more detail and revise the whole tree, this time with a higher starting value.
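The stop rule can be sketched in a few lines of Python as well. Again, the names (`keep_following`, `STOP_VALUE`) are my own inventions for illustration, and I'm assuming that a line with a value of exactly 10 is already dropped; in practice the cut-off is a judgment call, as described above.

```python
START_VALUE = 333   # points given to the whole variation
STOP_VALUE = 10     # below or at this value a line is normally dropped


def keep_following(value, transposes=False, stop_value=STOP_VALUE):
    """Return True if a line is still worth following.

    Assumption: lines that transpose into other lines are always kept,
    whatever their value.
    """
    return transposes or value > stop_value


# Likelihood of reaching a position worth exactly the stop value (about 3%).
likelihood = STOP_VALUE / START_VALUE

# Rough size of the repertoire: how many lines of stop value fit into the start value.
approx_variations = START_VALUE // STOP_VALUE

print(keep_following(247))                   # 5.Qc2: follow
print(keep_following(4))                     # 5.b3: stop
print(keep_following(4, transposes=True))    # rare, but transposes: follow
print(f"{likelihood:.0%}", approx_variations)
```

Raising `START_VALUE` while leaving `STOP_VALUE` alone is the "second pass" mentioned above: more lines survive the cut, so the tree gets more detailed.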
Of course, there are many ways to do these statistics: You might consider only the games of the last 10 years, or only those within a certain "strength range", or ... . You can also take the preferences of your opponents into account (as far as you know them).
So what are the advantages and disadvantages of this approach? Of course, it takes quite some time to generate the trees. On the other hand, it enables you to focus on the most likely variations. Going through several opening books this way, I found that authors often go into considerable detail in some positions, whereas the tree tells you that it's very unlikely you will ever reach them, so just skip them! It also enables you to keep track of move-order issues and be forewarned. This approach can also help you master your repertoire: Say there are two or more quite similar positions which require a different approach (= a different move from you), and you always confuse them. With the tree, it's easy to pin down those positions (for example, if you're doing opening training with CPT, as I do) and take a deeper look at them to finally understand what makes them different.
I hope this post gives you some ideas on how to work with trees (or at least helps you make the "fundamental decision": (i) "Hmm, that sounds interesting, I'll give it a try" or (ii) "Oh no!!!!").
Any questions, criticism and suggestions are welcome!
Best regards,
Zwischenzugzwang