There are at least three different topics here:

1. Compute primes less than n. The fastest algorithm is a well-optimized segmented Sieve of Eratosthenes, and the fastest implementation is

primesieve as lorb pointed out. While missing lots of optimizations used in more production code, he has a nice example of how the

segmented sieve works.

A version of his demonstration segmented sieve using 1 bit per integer takes about 20 minutes to reach 10^12, with very little memory use. The fastest programs I know of at that size are yafu and primesieve which take under 5 minutes. Bernstein's primegen and Perl/ntheory take a bit under 10 minutes. Memory requirements for each are quite small. Perl/ntheory can generate

and print primes up to 10^12 in 24 minutes on my machine. Note how even with specialized print routines, the output dominates the time required. This is especially true for primesieve which has faster sieving and much slower printing.

While people who read only the first paragraph of the Wikipedia page like to point out the

Sieve of Atkin, in practice it is slower than a similarly optimized Sieve of Eratosthenes. Most of the reasons why are explained in detail in the page. In the example above, even primegen (a heavily optimized SoA implementation) is half the speed of the top two SoE implementations, and at 10^13 Perl/ntheory (using a segmented SoE) remains 2x slower but primegen now takes 4x longer. That is, it's slowing down relative to the SoEs.

2. Computing primes in a small window [a,b]. Lots of programs will do this, and significantly faster than computing them all to b. primesieve takes only a couple milliseconds to get primes +/- 10000 around 10^12. Again memory use is negligible at this size. Perl/ntheory also takes only a few milliseconds to sieve and print them. The idea is very similar to a segmented sieve.

At some threshold for a, you'll not want to do a full sieve any more, especially if the window b-a is small. This becomes obvious for examples like 10^300 to 10^300+10000. You certainly don't want to fully sieve that interval, but probably do a partial sieve followed by a primality test. Perl/ntheory takes 0.2 seconds to give the 16 primes in that range. Pari/GP takes 0.5 seconds with ispseudoprime or 28 seconds with isprime.

3. prevprime / nextprime. You certainly don't want to sieve all numbers to n just to find the previous prime of n. At some size of n you will want to use partial sieving, but that's somewhere in the 200-bit range. Until then, you use a simple wheel (e.g. mod-6 or mod-30) to skip obvious small multiples and use a fast primality test. That is the method used by Pari/GP and Perl/ntheory, among others.

You will never want to use the AKS test. It's horribly slow. Again, read the whole Wikipedia article and you find "While the algorithm is of immense theoretical importance, it is not used in practice. For 64-bit inputs, the Baillie–PSW primality test is deterministic and runs many orders of magnitude faster." Even

trial division is faster for 64-bit inputs, for goodness sake. At some point around 10^25 AKS does beat trial division, but it will effectively never beat APR-CL or ECPP. Caveat that this with the current state of the art -- new research could change things, albeit AKS proper probably won't be it: Lenstra/Pomerance (2011) "A fundamentally new idea would be required to obtain a deterministic primality testing algorithm with run time exponent smaller than 6."

Discussing optimal primality tests is another few paragraphs, and there are lots of interesting implementation details. In short for 64-bit you want some simple pretests (e.g. divisibility by a few tiny primes), then either deterministic Miller-Rabin (at most 7 tests required) or BPSW.

Optimized code for this takes about 1

microsecond at 10^12. At 10^18 or so, average of about 2 microseconds.