My time has been limited, but I want to make some comments here, first to the e-mail that Noel copied here for us.
That said, all authorship problems are ultimately closed set problems. You cannot test for every single person in the entire history of the world. Instead, you have some candidates and wish to figure out which among them is the most likely culprit.
I want to address this point first. Matt would do very well to pick up and read some material on the nature of attributing authorship, material that lies outside the fairly narrow field of computer-assisted stylometry. A very good place to start would be Harold Love's Attributing Authorship: An Introduction, which can be purchased quite reasonably (less than $10.00) in the paperback edition. Of course you cannot test everyone. You could, though, quite feasibly test everyone for whom you had significant (and confident) samples of writing. Isn't that the point of computer-assisted studies? Contrary to a lot of the assertions made in the original paper, comparisons are much more likely to tell you who could not have been the author than who could have been (and this is good because, generally speaking, at most one of the proposed authors can be the real author of the disputed text, and it may be that none of them is). False positives in these kinds of cases are much more frequent than false negatives. Of course, the method that Criddle originally used would have had to be modified when adding more authors (the way the vocabulary was chosen was problematic: a single author who didn't use a particular word would eliminate that word from the entire study). As Bruce points out, though, testing for simply "some" will always create very misleading results.
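To illustrate the vocabulary problem, here is a minimal sketch in Python. It assumes only what I described above, namely that a word is kept when every author in the comparison set uses it. It is not Criddle's actual procedure, and the author names and sample texts are made up for the example.

```python
def shared_vocabulary(samples):
    """Return the set of words used by every author in `samples`.

    `samples` maps a hypothetical author name to a sample of that
    author's text. Tokenizing on whitespace is a simplification.
    """
    word_sets = [set(text.lower().split()) for text in samples.values()]
    if not word_sets:
        return set()
    # A word survives only if no author lacks it.
    return set.intersection(*word_sets)


# Toy samples, invented for illustration only.
samples = {
    "author_a": "the ship came upon the shore and it was great",
    "author_b": "and the people came upon the land",
}
two_authors = shared_vocabulary(samples)

# Adding one more author can only shrink the shared vocabulary.
samples["author_c"] = "the land was great and the people rejoiced"
three_authors = shared_vocabulary(samples)

print(sorted(two_authors))
print(sorted(three_authors))
```

Under this selection rule each added author can only remove words, so a study scaled up to many authors would be left with little beyond the most common function words unless the rule were changed.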
One of the interesting things that has occurred in the discussions over the Spalding theory in this and other forums has been a problem that stems from this kind of assumption. Dale, for example, has repeatedly said that his analysis shows similarities that are startlingly close. However, I found the same kind of similarities between the Book of Mormon and other authors. Dale had created a kind of index number, and he suggested that his index results for individual chapters were very high. I took another author for comparison, used Dale's same index process, and came back with even better results.
Likewise, I was presented with the startling list of so many consecutive parallels. Where could I find a similar list? In fact, Roger (I am pretty sure it was Roger) suggested that I find two consecutive chapters in the Book of Mormon and any two consecutive chapters in Jules Verne's Around the World in 80 Days and show a similar list. I did (although I admit, my list was longer).
When talking about parallels, for example, we run into this issue. As Love notes (in the book I referenced above): "Once we have encountered an unusual expression in the writings of three or four different authors it ceases to have any value for attribution". But if you only ever look at a small handful of works, every phrase becomes unusual. Presumably, had Criddle used a thousand authors instead of his carefully selected set, the authorship picture would have been completely different. And this would have told him something about the suspicions that he presents early in the paper.
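The effect of the comparison set on what counts as "unusual" can be sketched in a few lines of Python. This is only an illustration: the cutoff of three authors is my reading of Love's "three or four", and the names and texts below are invented, not drawn from any real study.

```python
def authors_using(phrase, corpus):
    """List the authors in `corpus` whose text contains `phrase`."""
    phrase = phrase.lower()
    return sorted(name for name, text in corpus.items()
                  if phrase in text.lower())


def still_distinctive(phrase, corpus, limit=3):
    """True while fewer than `limit` authors use the phrase.

    `limit=3` is an assumed cutoff, loosely following Love's remark.
    """
    return len(authors_using(phrase, corpus)) < limit


# A small, hand-picked comparison set (hypothetical texts).
small_set = {
    "candidate": "and it came to pass that they departed",
    "control_1": "they left the city at dawn",
}

# The same set with more authors added.
large_set = dict(small_set)
large_set.update({
    "control_2": "it came to pass in those days",
    "control_3": "now it came to pass after many years",
    "control_4": "the river ran high that spring",
})

phrase = "it came to pass"
print(still_distinctive(phrase, small_set))
print(still_distinctive(phrase, large_set))
```

With only two texts the phrase looks like a marker of the candidate. Once more authors are in the pool, the same phrase turns up often enough that it no longer points to anyone in particular.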
Ben McGuire