The good news: programs are just finite-length strings of bits, so you can enumerate them if you want.
The bad news: proper testing is impossible (not just due to the halting problem), and heuristic testing will pass too many broken programs.
Say you're testing a sorting algorithm on [], [1,2,3,4,5] and [3,2,5,4,1]. Those might otherwise qualify as reasonable test cases. But which of the following two programs would you expect to be randomly generated first?
Code:
    for i = 1 to length(input)
        for j = 2 to length(input)
            if input[j] < input[j-1]
                t = input[j]
                input[j] = input[j-1]
                input[j-1] = t
    return input
Code:
    if (input == [])
        return []
    return [1,2,3,4,5]
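To make the point concrete, here is a rough Python rendering of the two programs above, run against the three test inputs. The function names (bubble_sort, hardcoded, passes_tests) are made up for this illustration, and the first function is my reading of what the first program is meant to do (repeated adjacent swaps), so treat it as a sketch rather than an exact translation.

```python
def bubble_sort(items):
    """Sketch of the first program: repeatedly swap adjacent out-of-order items."""
    items = list(items)  # work on a copy
    for _ in range(len(items)):
        for j in range(1, len(items)):
            if items[j] < items[j - 1]:
                items[j], items[j - 1] = items[j - 1], items[j]
    return items


def hardcoded(items):
    """Sketch of the second program: it never looks at the input's contents."""
    if items == []:
        return []
    return [1, 2, 3, 4, 5]


# The three test inputs from the post.
TEST_INPUTS = [[], [1, 2, 3, 4, 5], [3, 2, 5, 4, 1]]


def passes_tests(candidate):
    """True if the candidate returns the sorted list for every test input."""
    return all(candidate(list(t)) == sorted(t) for t in TEST_INPUTS)


print(passes_tests(bubble_sort))  # True
print(passes_tests(hardcoded))    # True
print(hardcoded([2, 1]))          # [1, 2, 3, 4, 5], which is not [1, 2]
```

Both candidates pass the test suite, but only one of them sorts. The shorter one is the one a random generator would stumble on first.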
Any course on theoretical computer science will teach you why brute-force programming is a bad idea. Even if you bypass the halting problem with a timeout, and even if you replace proper testing with a handful of test cases, your runtime is still exponential in code length, and any code of value is too long for a computer to randomly generate within any reasonable timespan.
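For a feel of what the enumeration looks like, here is a minimal sketch that lists every string over a given alphabet, shortest first. The name enumerate_programs and the two-symbol alphabet are just assumptions for the demo; a real brute-forcer would also have to run each candidate with a timeout and test it.

```python
from itertools import count, islice, product


def enumerate_programs(alphabet):
    """Yield every finite string over `alphabet`, shortest first."""
    for length in count(0):
        for chars in product(alphabet, repeat=length):
            yield "".join(chars)


def candidates_of_length(alphabet_size, length):
    """Number of distinct strings of exactly `length` symbols."""
    return alphabet_size ** length


print(list(islice(enumerate_programs("01"), 7)))
# ['', '0', '1', '00', '01', '10', '11']

print(candidates_of_length(2, 10))   # 1024
print(candidates_of_length(2, 20))   # 1048576
```

Every extra symbol multiplies the number of candidates by the alphabet size, which is where the exponential runtime comes from.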
Seriously, there are hard upper bounds on the computing capacity of the whole observable universe, and if you do the math, you run out of universe before you've exhaustively tested all programs of maybe 50 characters. In some languages, that's not even enough for "hello world".
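If you want to do that math yourself, here is a back-of-the-envelope sketch. Both constants are assumptions on my part: 95 is the number of printable ASCII characters, and 10^120 is an often-quoted rough estimate for the total number of elementary operations the observable universe could have performed. The exact figures barely matter, since shifting the budget by a few orders of magnitude only moves the answer by a character or two.

```python
ALPHABET_SIZE = 95         # assumption: programs are printable-ASCII strings
UNIVERSE_OPS = 10 ** 120   # assumption: rough total operation budget


def longest_exhaustible_length(ops_per_candidate=1):
    """Largest n such that trying every string of exactly n characters,
    at `ops_per_candidate` operations each, still fits in the budget."""
    n = 0
    while ALPHABET_SIZE ** (n + 1) * ops_per_candidate <= UNIVERSE_OPS:
        n += 1
    return n


print(longest_exhaustible_length())         # 60
print(longest_exhaustible_length(10 ** 6))  # 57
```

Either way, you land at a few dozen characters, in the same ballpark as the rough figure above.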