Thought Game: Duplicate Driven Programming
In order to produce a piece of software, you don't write unit tests, like in test-driven development, and you don't sit programmers together, like in pair programming. Instead you have two different teams who write the exact same software as each other, independently. They probably even write it in two different but similar languages, C# and VB.net being a suitable pair.
And the code is the same from an outsider's point of view: the same functions, the same classes. They somehow share all of that interface information, and agree on that bit. But the inner workings of their functions are completely independent.
Some kind of software tool then automatically develops test cases for the two pieces of code, and determines correct output based not on some user's specification, but based on equality between the two code bases.
If AddNumber(3,7) returns 11 in one code base but it returns 12 in the other code base, then a warning is raised. An email is sent to the developer of each version of the function, something like "Somebody's got this wrong". Maybe you don't get to see what the answer was for the other guy's code. You only get to verify your own function.
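To make the idea concrete, here is a minimal sketch in Python (not the C#/VB.net pair mentioned above). Everything in it is hypothetical: the two `add_number_team_*` functions stand in for the two teams' versions, one of them has a bug planted on purpose, and `find_disagreements` plays the part of the imagined tool. It only reports the inputs that caused a mismatch, so neither team sees the other's answer.

```python
import random

def add_number_team_a(x, y):
    # Hypothetical implementation from the first team.
    return x + y

def add_number_team_b(x, y):
    # Hypothetical implementation from the second team,
    # with a deliberately planted bug for x greater than 90.
    return x + y + 1 if x > 90 else x + y

def find_disagreements(impl_a, impl_b, trials=1000, seed=0):
    """Feed both implementations the same random inputs and collect mismatches."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(trials):
        args = (rng.randint(0, 100), rng.randint(0, 100))
        if impl_a(*args) != impl_b(*args):
            # Record only the inputs, not either team's answer.
            mismatches.append(args)
    return mismatches

disagreements = find_disagreements(add_number_team_a, add_number_team_b)
for args in disagreements[:3]:
    print(f"Somebody's got this wrong: AddNumber{args}")
```

Note that the tool can only say the two versions disagree. It cannot say which one is right, and it stays silent when both are wrong in the same way.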
Anyway, like I said, whacky idea.
While writing this down... I had a different and maybe not quite so whacky idea about verifying the output of your code.
I wrote it up last night and it'll appear in a moment.
'Matthew Martin' on Wed, 27 Sep 2006 00:19:05 GMT, sez: MS tested SQL Server that way (checking to see if Oracle and DB2 returned the same results as SQL for a given piece of SQL). Now that I think about it, that kind of testing works for XML parsers, CSS, HTML renderers, etc.
'lb' on Wed, 27 Sep 2006 00:23:16 GMT, sez: cheers Matthew. it is a pretty useful way of testing your product against other products for these kinds of problem.
automated html testing of this sort would be strange -- you'd have to test if the documents 'look' the same, i wonder how.
'Eber Irigoyen' on Wed, 27 Sep 2006 00:25:53 GMT, sez: we have been using that method for data entry purposes, is all good until you get to the discrepancies...
'pico' on Wed, 27 Sep 2006 00:27:20 GMT, sez: this is a bit like how you can test an automated translator.
You put in something in english, then convert it to spanish for example. convert it from spanish to english and look how wrong it looks.
it's a simple procedure where the only flaw is that you only know that either A or B failed, you don't know which.
'google do this' on Wed, 27 Sep 2006 00:30:50 GMT, sez: this is the same as how the google image game works. two strangers both say which words match a particular image. you don't get to see each other's guesses -- you only get points when the words match.
'Barry Kelly' on Wed, 27 Sep 2006 01:32:07 GMT, sez: The biggest problem with this is that the problem with software is in specification, not implementation. Indeed, that's at least 50% of the reason test-first, test-driven coding is good: it gets you to design from the perspective of a consumer, rather than an implementer.
'Pitarou' on Wed, 27 Sep 2006 08:39:49 GMT, sez: It's a good idea, and a variant of it is used in real-time safety-critical systems. Parallel teams build *three* implementations, which are deployed in a majority voting context.
Alas, this technique isn't quite as bullet-proof as it might appear. Bugs are not entirely random and, even when working in different languages, parallel teams sometimes make parallel mistakes.
'robC' on Wed, 27 Sep 2006 10:43:59 GMT, sez: I believe this technique is used live in aircraft software and as Pitarou says, the majority vote is used to actually control the aeroplane.
An interesting idea and one that can be translated to reality: I suppose the real world case could be where a tester writes functional tests parallel to the functionality being developed – usually different languages, teams etc. If an end user is involved in writing the tests then you get the additional benefit of checking the specification matches real world scenarios as experienced by the end user.
'aaron' on Wed, 27 Sep 2006 12:51:04 GMT, sez: I recall an old QA compatriot of mine sharing a real life story. In the consulting company that we lived, he once had the testing services of an SQL developer that was benched for a time.
This guy (the developer) did something very similar to what you described. Rather than merely testing the app, he wrote his own version of the data-mangling engine. He then simply fed in data, and watched for mismatches.
Fascinating thought. I wonder how its defect reduction rate would compare to something like inspections? http://www.sei.cmu.edu/str/descriptions/inspections_body.html
'Mike Gale' on Wed, 27 Sep 2006 18:53:42 GMT, sez: I've seen stuff like this done:
1) An app that could use Jet or SQL-Server at the flick of a config switch.
2) Properties of materials (physics, engineering etc.) where you give a tolerance to the match between different implementations. (Which might use different algorithms.)
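The tolerance idea in point 2 can be sketched in Python roughly like this. The example is an assumption chosen for brevity, not the materials code described above: `sqrt_newton` is a hypothetical second implementation that uses a different algorithm from the library's `math.sqrt`, and `agree` accepts the pair as matching when they are within a relative tolerance rather than bit-for-bit equal.

```python
import math

def sqrt_newton(x, iterations=30):
    """Hypothetical alternative implementation: Newton's method for sqrt(x), x >= 0."""
    if x == 0:
        return 0.0
    guess = x if x >= 1 else 1.0
    for _ in range(iterations):
        guess = 0.5 * (guess + x / guess)
    return guess

def agree(x, rel_tol=1e-9):
    """True when both implementations match within the relative tolerance."""
    return math.isclose(sqrt_newton(x), math.sqrt(x), rel_tol=rel_tol)

for x in (0.25, 2.0, 9.0, 1e6):
    print(x, agree(x))
```

The tolerance is a judgment call: too tight and harmless rounding differences raise alarms, too loose and real discrepancies slip through.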
'Wesley Shephard' on Wed, 27 Sep 2006 18:55:58 GMT, sez: "I believe this technique is used live in aircraft software and as Pitarou says, the majority vote is used to actually control the aeroplane."
This is true: usually three systems are designed with different hardware, operating system and software tools. On any issue, there is a "vote". If all agree, all is well. If one disagrees, it is outvoted. Any disagreements are logged and the software is reviewed to see where the flaw is.
Of course, that comes with a lot of cost...
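The voting scheme described above can be sketched in a few lines of Python. This is only an illustration, not how any real avionics system is built: the three `channel_*` functions are hypothetical stand-ins for independently written implementations (one with a fault planted on purpose), and `majority_vote` returns the winning value together with the indices of the outvoted channels so they can be logged for review.

```python
from collections import Counter

def majority_vote(results):
    """Return (winning value, indices of the outvoted implementations)."""
    winner, count = Counter(results).most_common(1)[0]
    if count <= len(results) // 2:
        raise RuntimeError(f"no majority among {results!r}")
    dissenters = [i for i, r in enumerate(results) if r != winner]
    return winner, dissenters

# Three hypothetical, independently written implementations of "double x".
def channel_a(x):
    return x * 2

def channel_b(x):
    return x + x

def channel_c(x):
    # Deliberately planted fault for one specific input.
    return 11 if x == 5 else x * 2

def vote(x):
    results = [channel(x) for channel in (channel_a, channel_b, channel_c)]
    value, dissenters = majority_vote(results)
    for i in dissenters:
        print(f"channel {i} outvoted for input {x}: got {results[i]}")
    return value

print(vote(4))
print(vote(5))
```

As Pitarou points out, this only helps when the faults are independent. If two channels make the same mistake, the wrong answer wins the vote.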
'Skup' on Tue, 03 Oct 2006 08:27:03 GMT, sez: It's always a good idea when doing some optimisation. Test your simple-straightforward-not-optimized version against your tricky-twisted-minded-optimized version... and see if they return the same result.
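A small Python sketch of that optimisation check, with an example picked purely for illustration: `primes_simple` is the slow, obviously-correct reference and `primes_sieve` is the optimised version. Both names are hypothetical. The loop at the end compares them over every small input and stops at the first mismatch.

```python
def primes_simple(n):
    """Straightforward reference: trial division for every k below n."""
    return [k for k in range(2, n) if all(k % d for d in range(2, k))]

def primes_sieve(n):
    """Optimised version: Sieve of Eratosthenes for primes below n."""
    if n < 2:
        return []
    is_prime = [True] * n
    is_prime[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            # Cross out every multiple of p, starting at p squared.
            for multiple in range(p * p, n, p):
                is_prime[multiple] = False
    return [k for k, flag in enumerate(is_prime) if flag]

# Compare the two versions on every input in a small range.
for n in range(200):
    assert primes_simple(n) == primes_sieve(n), f"mismatch at n={n}"
print("optimised version matches the reference for n < 200")
```

The reference is kept deliberately naive so that its correctness is easy to see by eye. It is there to be trusted, not to be fast.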
'engtech' on Mon, 09 Oct 2006 21:04:18 GMT, sez: Not that wacky of an idea. That's a common approach in hardware verification, you have a golden software model and the RTL hardware model and you compare that the same stimulus gives the same results.