## Prime Factors and Proper Divisors in Python

A place to discuss the implementation and style of computer programs.

Moderators: phlip, Moderators General, Prelates

### Prime Factors and Proper Divisors in Python

For my own personal use, I wrote a Python script that, given an integer, will give me its prime factorization as well as a list of its proper divisors. I'm interested in two things here: 1. Can I be sure that the answer it gives is correct? 2. Are there any obvious optimizations I'm missing?

First I wrote the function to do the prime factorization:

Code: Select all
`def factorize(n):   if n < -1: return [[-1, 1]] + factorize(-n)   elif n == -1: return [[-1, 1]]   elif n == 0: return [[0, 1]]   elif n == 1: return [[1, 1]]   else:      factors = []      divisor = 2      while divisor <= n:         power = 0         while (n % divisor) == 0:            n //= divisor            power += 1         if power > 0:            factors.append([divisor, power])         divisor += 1      return factors`

The enemy of this function, of course, is a large prime factor. This morning I realized I could reduce that threat using sqrt(n):

Code: Select all
`import mathdef isqrt(n):   return int(math.floor(math.sqrt(float(n))))def factorize(n):   if n < -1: return [[-1, 1]] + factorize(-n)   elif n == -1: return [[-1, 1]]   elif n == 0: return [[0, 1]]   elif n == 1: return [[1, 1]]   else:      factors = []      divisor = 2      while divisor <= isqrt(n):         power = 0         while (n % divisor) == 0:            n //= divisor            power += 1         if power > 0:            factors.append([divisor, power])         divisor += 1      if n > 1:         factors.append([n, 1])      return factors`

This works very well, but is it guaranteed to be correct?

Next is getting the proper divisors. My first version used a rather naive method:

Code: Select all
`def divisors(n):   if n < -1: return divisors(-n)   elif n == -1: return [1]   elif n == 0: return []   elif n == 1: return [1]   else:      divs = []      rdivs = []      for divisor in range(1, isqrt(n)+1):         if (n % divisor) == 0:            divs.append(divisor)            rdivisor = n // divisor            if rdivisor != divisor:               rdivs.append(rdivisor)      rdivs.reverse()      return divs + rdivs`

The running time of that, of course, is O(sqrt(n)) in terms of the value, or O(2^(n/2)) in terms of the number of bits. Which is terrible. Then I realized I could use the results of the last function to dramatically improve on this, by iterating through all unique permutations of the prime factors (thank you Fundamental Theorem of Arithmetic):

Code: Select all
`def divisorsFromFactors(factors):   if not factors: return [1]   else:      base = factors[0][0]      if base == -1: return divisorsFromFactors(factors[1:])      elif base == 0: return []      elif base == 1: return divisorsFromFactors(factors[1:])      else:         divs = divisorsFromFactors(factors[1:])         alldivs = []         for power in range(0, factors[0][1]+1):            alldivs += map(lambda x: x * base ** power, divs)         alldivs.sort()         return alldivs`

The number of prime factors for a number n is, on average, on the order of log log n, so this function would have an average running time of something like O(2^(log log n)) or O(log n) in terms of the value or O(2^(log log(2^n))) = O(n) in terms of the number of bits, right? Much, much better.

Now, to use these functions. I want to see the prime factorization using only multiplication (2*2*2*3*3) as well as using both multiplication and exponentiation (2^3 * 3^2), so I do this:

Code: Select all
`import sysn = int(sys.argv[1])f = factorize(n)f2a = map(lambda x: str(x[0]) if x[1] == 1 else "^".join(map(str, x)), f)f2s = " * ".join(f2a)print str(n) + " = " + f2sf1a = map(lambda x: [x[0]] * x[1], f)f1b = reduce(lambda x,y: x + y, f1a, [])f1s = " * ".join(map(str, f1b))print str(n) + " = " + f1s`

Then I can get the proper divisors, as well as their sum to determine whether the number is prime, deficient, perfect, or abundant:

Code: Select all
`d = divisorsFromFactors(f)d1s = "N/A" if not d else ", ".join(map(str, d))print "Divisors: " + d1sd2a = reduce(lambda x,y: x + y, d, 0)d2b = d2a - abs(n)d2c = d2b - abs(n)d2s = "zero" if d2a == 0 else "unit" if d2b == 0 else "prime" if d2b == 1 else "deficient" if d2c < 0 else "perfect" if d2c == 0 else "abundant"print "sigma_0(n) = " + str(len(d))print "sigma_1(n) = " + str(d2a)print "sigma_1(n)-n = " + str(d2b)print "sigma_1(n)-2n = " + str(d2c)print str(abs(n)) + " is " + d2s + "."`

And for my last trick, I'll use the proper divisors to print out all the ways to multiply two numbers to get n (for making tables with exactly n cells, or images with exactly n pixels, or whatever):

Code: Select all
`d3a = map(lambda x: str(d[x]) + "*" + str(d[len(d)-x-1]), range(0, (1 + len(d)) // 2))d3s = "0*0" if not d3a else "\t".join(d3a)print "Pairs:"print d3s`

So what does everyone think? Is it all correct?

Here's the script in whole:

Code: Select all
`#!/usr/bin/env pythonimport sysimport mathdef isqrt(n):   return int(math.floor(math.sqrt(float(n))))def factorize(n):   if n < -1: return [[-1, 1]] + factorize(-n)   elif n == -1: return [[-1, 1]]   elif n == 0: return [[0, 1]]   elif n == 1: return [[1, 1]]   else:      factors = []      divisor = 2      while divisor <= isqrt(n):         power = 0         while (n % divisor) == 0:            n //= divisor            power += 1         if power > 0:            factors.append([divisor, power])         divisor += 1      if n > 1:         factors.append([n, 1])      return factorsdef divisorsFromFactors(factors):   if not factors: return [1]   else:      base = factors[0][0]      if base == -1: return divisorsFromFactors(factors[1:])      elif base == 0: return []      elif base == 1: return divisorsFromFactors(factors[1:])      else:         divs = divisorsFromFactors(factors[1:])         alldivs = []         for power in range(0, factors[0][1]+1):            alldivs += map(lambda x: x * base ** power, divs)         alldivs.sort()         return alldivsn = int(sys.argv[1])f = factorize(n)f2a = map(lambda x: str(x[0]) if x[1] == 1 else "^".join(map(str, x)), f)f2s = " * ".join(f2a)print str(n) + " = " + f2sf1a = map(lambda x: [x[0]] * x[1], f)f1b = reduce(lambda x,y: x + y, f1a, [])f1s = " * ".join(map(str, f1b))print str(n) + " = " + f1sprintd = divisorsFromFactors(f)d1s = "N/A" if not d else ", ".join(map(str, d))print "Divisors: " + d1sd2a = reduce(lambda x,y: x + y, d, 0)d2b = d2a - abs(n)d2c = d2b - abs(n)d2s = "zero" if d2a == 0 else "unit" if d2b == 0 else "prime" if d2b == 1 else "deficient" if d2c < 0 else "perfect" if d2c == 0 else "abundant"print "sigma_0(n) = " + str(len(d))print "sigma_1(n) = " + str(d2a)print "sigma_1(n)-n = " + str(d2b)print "sigma_1(n)-2n = " + str(d2c)print str(abs(n)) + " is " + d2s + "."printd3a = map(lambda x: str(d[x]) + "*" + str(d[len(d)-x-1]), range(0, (1 + len(d)) // 2))d3s = "0*0" if not d3a else "\t".join(d3a)print "Pairs:"print d3s`
Stephen Hawking: Great. The entire universe was destroyed.
Fry: Destroyed? Then where are we now?
Al Gore: I don't know. But I can darn well tell you where we're not—the universe!

RebeccaRGB

Posts: 336
Joined: Sat Mar 06, 2010 7:36 am UTC
Location: Lesbians Love Bluetooth

### Re: Prime Factors and Proper Divisors in Python

The obvious optimization is to test as factors only prime numbers. You can write a sieve of some sort and pull primes incrementally from it. Besides that, I've only see some nit-picky things: use xrange instead of range, etc. Also, in divisorsFromFactors, consider using an inner function and wrapping the whole thing in a sort, instead of sorting on each recursion.
Ben-oni

Posts: 276
Joined: Mon Sep 26, 2011 4:56 am UTC

### Re: Prime Factors and Proper Divisors in Python

I don't have time to go through your code now, but it seems correct.

One obvious optimization is to compare divisor squared to n, rather than divisor to root n (to avoid unnecessary computation of roots). Also, once you've tested 2, you can jump forward from 3 by 2s (3,5,7..), since all other primes are odd. There are other tweaks, but those are the most important, I think. Apart from that, there are various more efficient algorithms (based on more complicated maths): - you can find a summary in Wikipedia (under integer factorization). Elliptic curve factorization is supposed to be best (for large numbers at least, not sure about small numbers), but I've never implemented it myself.

For a simple algorithm, what I tend to do is, write a generator for prime numbers and keep already-generated primes cached somewhere, so you've got a list of primes and when you need to extend it, you just run the generator some more (you can build memoization - ie. caching - into the generator if you like). Then instead of iterating through all the integers looking for divisors, I just iterate through the primes.

Here's some code I dug up for illustration (pretty sure its correct, but there may be functions missing):
Code: Select all
`from itertools import countdef prime_iter():    yield 2    candidates = count(3,2)    for candidate in candidates:        if candidate%2 == 0: continue        n = 3        while n*n <= candidate:            if candidate%n == 0:                break            n += 2        else:            yield candidate# memoization decoratordef memoize(f):   cache = {}   def memoized(*x, **k):      if x in cache:         return cache[x]      else:         result = f(*x, **k)         cache[x] = result         return result   return memoized@memoizedef factorize_rec(n):   "Recursive factorization function, suitable for memoization"   if n==1:      return {}        for p in prime_iter():      if p*p > n:         return {n:1}      if n%p == 0:         return add_counts({p:1}, factorize_rec(n/p))def sigma_fn(n):     "sigma function: returns sum of divisors"    fzn = factorize_rec(n)    return product((sum((p**j for j in xrange(i+1))) for p,i in fzn.items()))def s(n):    "s function: returns aliquot sum (sum of proper divisors)"    return sigma_fn(n) - n`

In general, I'd recommend breaking your code up into small functions and choosing more meaningful names - you'll appreciate it if you ever have to read your code again after not looking at it for months.

edit: tidied up code a bit(added missing import, corrected function name, retrieved lost docstring mislaid in middle of function, etc...)
troyp

Posts: 483
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW

### Re: Prime Factors and Proper Divisors in Python

troyp wrote:One obvious optimization is to compare divisor squared to n, rather than divisor to root n (to avoid unnecessary computation of roots).

What? No!

Square roots are fast to calculate, and only need to be found once (well... make sure you're only doing it once!), whereas multiplying to find the square would have to be done on each iteration.
Ben-oni

Posts: 276
Joined: Mon Sep 26, 2011 4:56 am UTC

### Re: Prime Factors and Proper Divisors in Python

Project euler has a lot of problems along these lines. Maybe you could complete a few and have a look at the solutions posted in the forums?

Mat

Posts: 401
Joined: Fri Apr 21, 2006 8:19 pm UTC
Location: London

### Re: Prime Factors and Proper Divisors in Python

I've implemented troyp's two obvious optimizations: divisor * divisor <= n instead of divisor <= isqrt(n) (I am recalculating on each iteration, because n may have gotten smaller, so this is worth it), and testing only odd numbers (with 2 a separate case). This has gotten the time for factoring 88888888888888888 down from 2.4s to 0.6s.

(I've also implemented two of Ben-oni's suggestions: xrange instead of range, and sorting only at the end of divisors_from_factors instead of after each recursion. This doesn't give me a dramatic improvement, but while I'm improving the code I might as well.)

Here's what I have now:

Code: Select all
`#!/usr/bin/env pythonimport sysdef factorize(n):   if n < -1: return [[-1, 1]] + factorize(-n)   elif n == -1: return [[-1, 1]]   elif n == 0: return [[0, 1]]   elif n == 1: return [[1, 1]]   else:      def factor_out(n, divisor):         power = 0         while (n % divisor) == 0:            n //= divisor            power += 1         return (n, power)      factors = []      n, power = factor_out(n, 2)      if power > 0:         factors.append([2, power])      divisor = 3      while divisor * divisor <= n:         n, power = factor_out(n, divisor)         if power > 0:            factors.append([divisor, power])         divisor += 2      if n > 1:         factors.append([n, 1])      return factorsdef divisors_from_factors(factors):   def unsorted_divisors_from_factors(factors):      if not factors: return [1]      else:         base = factors[0][0]         if base == -1: return unsorted_divisors_from_factors(factors[1:])         elif base == 0: return []         elif base == 1: return unsorted_divisors_from_factors(factors[1:])         else:            divs = unsorted_divisors_from_factors(factors[1:])            alldivs = []            for power in xrange(0, factors[0][1]+1):               alldivs += map(lambda x: x * base ** power, divs)            return alldivs   alldivs = unsorted_divisors_from_factors(factors)   alldivs.sort()   return alldivsn = int(sys.argv[1])f = factorize(n)f2a = map(lambda x: str(x[0]) if x[1] == 1 else "^".join(map(str, x)), f)f2s = " * ".join(f2a)print str(n) + " = " + f2sf1a = map(lambda x: [x[0]] * x[1], f)f1b = reduce(lambda x,y: x + y, f1a, [])f1s = " * ".join(map(str, f1b))print str(n) + " = " + f1sprintd = divisors_from_factors(f)d1s = "N/A" if not d else ", ".join(map(str, d))print "Divisors: " + d1sd2a = reduce(lambda x,y: x + y, d, 0)d2b = d2a - abs(n)d2c = d2b - abs(n)d2s = "zero" if d2a == 0 else "unit" if d2b == 0 else "prime" if d2b == 1 else "deficient" if d2c < 0 else "perfect" if d2c == 0 else "abundant"print "sigma_0(n) = " + str(len(d))print "sigma_1(n) = " + str(d2a)print "sigma_1(n)-n = " + str(d2b)print "sigma_1(n)-2n = " + str(d2c)print str(abs(n)) + " is " + d2s + "."printd3a = map(lambda x: str(d[x]) + "*" + str(d[len(d)-x-1]), xrange(0, (1 + len(d)) // 2))d3s = "0*0" if not d3a else "\t".join(d3a)print "Pairs:"print d3s`

I'm guessing to improve it any further I'll need to implement a sieve and only iterate over prime numbers. And past that, go for elliptic curves.
Stephen Hawking: Great. The entire universe was destroyed.
Fry: Destroyed? Then where are we now?
Al Gore: I don't know. But I can darn well tell you where we're not—the universe!

RebeccaRGB

Posts: 336
Joined: Sat Mar 06, 2010 7:36 am UTC
Location: Lesbians Love Bluetooth

### Re: Prime Factors and Proper Divisors in Python

RebeccaRGB wrote:1. Can I be sure that the answer it gives is correct?

Yes! You can explicitly test it.

Suppose X is a divisor of Y. Then there must be a k such that X*k = Y.

Calculate k := Y/X. Then, either round down (ie, do integer division), or test if k is indeed an integer. Really, I'd round k to the nearest integer (to deal with the possibility that Y/X was done via floating point math and ended up a tad off or something).

Now, test X*k. Does it equal Y? If you want to be paranoid, find the sign of Y-X*k, then move k in that direction until you find two integers k_0 and k_1 such that X*k_0 < Y < X*k_1, with k_1 = k_0+1.

In short, you can ensure your answers are correct by checking them. Writing code to check them is a good idea, because many problems are far easier to check for a valid answer than they are to actually answer.

Btw, writing a prime sieve is surprisingly easy.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

Yakk
Poster with most posts but no title.

Posts: 10210
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

### Re: Prime Factors and Proper Divisors in Python

Ben-oni wrote:
troyp wrote:One obvious optimization is to compare divisor squared to n, rather than divisor to root n (to avoid unnecessary computation of roots).

What? No!

Square roots are fast to calculate, and only need to be found once (well... make sure you're only doing it once!), whereas multiplying to find the square would have to be done on each iteration.

As RebeccaRGB said, if you're factorizing, the number you're square rooting is going to change every time you find a factor. If you mean in the context of a prime generating algorithm...well, squaring is faster for small primes. It is slower for larger primes, but only if you're calculating the squares again for every prime. As long as you're willing to keep a sqrt(p)-sized list of squares cached, it's still significantly faster. Those were my results, anyway. (Besides, let's face it, it's just a bit ugly bringing these crass reals into a routine full of pristine integers.)

@RebeccaRGB: I know you're you're not using this code anymore, but I thought I'd mention this anyway:
Code: Select all
`def isqrt(n):   return int(math.floor(math.sqrt(float(n))))`

Well, first off, you don't need to convert to an integer: you can just compare the divisors to the floating point square root (so just "from math import sqrt" would do). But assuming you did want the integer part of a square root, all you need is "return int(math.sqrt(n))"

You reasoning seems to be that each function has a very specific "type signature" and can't take any other types: "sqrt" takes an int and returns a float, so you have to convert to float before you use it. "floor" takes a float and returns a float (ending with ".0"), so you have to convert to an *actual* int afterwards.

Python's not that fussy about types. The Python philosophy is more: "if this argument can actually do what the function tells it to do, then the function should accept it as an argument". So math functions will usually work with either floats or ints.
caution: in Python 2, the division operator "/" works on both floats and ints, but if both args are ints, it does integer division. So 5.0/2 or 5/2.0 gives 2.5, but 5/2 gives 1. (This has changed in Python3 where "/" always does floating point division, and "//" does integer division
troyp

Posts: 483
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW

### Re: Prime Factors and Proper Divisors in Python

RebeccaRGB wrote:I've implemented troyp's two obvious optimizations: divisor * divisor <= n instead of divisor <= isqrt(n) (I am recalculating on each iteration, because n may have gotten smaller, so this is worth it), and testing only odd numbers (with 2 a separate case). This has gotten the time for factoring 88888888888888888 down from 2.4s to 0.6s.

It would be better to recalculate the root each time n changes.
Ben-oni

Posts: 276
Joined: Mon Sep 26, 2011 4:56 am UTC

### Re: Prime Factors and Proper Divisors in Python

Mat wrote:Project euler has a lot of problems along these lines.

Yes, I have heard of this, but I haven't done it before. I feel I'll have to now.

Yakk wrote:
RebeccaRGB wrote:1. Can I be sure that the answer it gives is correct?

Yes! You can explicitly test it.

I've added this little bit, and now I can feel confident that the program will continue to work correctly even as I tweak it further:

Code: Select all
`from time import clockif argv[1] == "test":   start = clock()   n = 0   while True:      f = factorize(n)      fa = map(lambda x: x[0] ** x[1], f)      fb = reduce(lambda x,y: x * y, fa, 1)      if fb != n:         print "FACTORIZE FAILED AT " + str(n)      d = divisors_from_factors(f)      da = map(lambda x: d[x] * d[len(d)-x-1], xrange(0, len(d)));      for db in da:         if db != n:            print "DIVISORS FAILED AT " + str(n)      f = factorize(-n)      fa = map(lambda x: x[0] ** x[1], f)      fb = reduce(lambda x,y: x * y, fa, 1)      if fb != -n:         print "FACTORIZE FAILED AT " + str(-n)      d = divisors_from_factors(f)      da = map(lambda x: d[x] * d[len(d)-x-1], xrange(0, len(d)));      for db in da:         if db != n:            print "DIVISORS FAILED AT " + str(-n)      if (n % 10000) == 0:         print "up to " + str(n) + " at " + str(clock() - start) + "s"      n += 1`

troyp wrote:Well, first off, you don't need to convert to an integer: you can just compare the divisors to the floating point square root (so just "from math import sqrt" would do). But assuming you did want the integer part of a square root, all you need is "return int(math.sqrt(n))"

Aha. I had fallen under the impression that Python was stricter than it was because it wouldn't implicitly convert numbers to strings for me e.g. when doing string concatenation. But of course that's no reason to believe it would be so strict for numeric types.

Ben-oni wrote:It would be better to recalculate the root each time n changes.

You're actually right, now that I've done it properly. The key word here is changes; only recalculate the square root when a prime factor has been found, and thus n has changed. This brings the time to factor 12345678901234567 from 11.9s down to 11.7s, and the factorization of all whole numbers <= 200000 from 12.59s down to 12.54s. Not too significant a speedup, but a speedup nonetheless.
Stephen Hawking: Great. The entire universe was destroyed.
Fry: Destroyed? Then where are we now?
Al Gore: I don't know. But I can darn well tell you where we're not—the universe!

RebeccaRGB

Posts: 336
Joined: Sat Mar 06, 2010 7:36 am UTC
Location: Lesbians Love Bluetooth

### Re: Prime Factors and Proper Divisors in Python

RebeccaRGB wrote:(I've also implemented two of Ben-oni's suggestions: xrange instead of range, and sorting only at the end of divisors_from_factors instead of after each recursion. This doesn't give me a dramatic improvement, but while I'm improving the code I might as well.)

xrange might be slightly faster than range (or it might actually be slower for very small ranges), but it's main benefit is that it is more economical with memory, since it only generates the values as they are needed rather than creating a whole list of numbers.

I've only had a quick look at your code, and it mostly looks ok, but you might consider changing the prime factor list to a list of tuples rather than a list of lists. Always use tuples for immutable sequences - they are more efficient than lists, both the interpreter and anyone reading your code will instantly know that the contents won't change.

FWIW, I have some old Python code here that does prime factorization and divisor pair generation, but I won't post it (unless you want me to), since my Python coding style has improved considerably since I wrote it.

troyp mentioned the Lenstra Elliptic Curve Method for factorizing large numbers. There's a Python program on Sourceforge called pyecm that can handle numbers up to 50 digits. If you're using a Debian based Linux distro, it should be in your repositories. The code is fairly well documented, but it's over 1400 lines long and unless you're familiar with the ECM algorithm it won't necessarily be easy to read.

PM 2Ring

Posts: 2961
Joined: Mon Jan 26, 2009 3:19 pm UTC
Location: Mid north coast, NSW, Australia

### Re: Prime Factors and Proper Divisors in Python

Just nitpicking, but... The decomposition of 1 should be the empty list, not [[1, 1]]. Treating 1 as a prime number breaks the unicity of the decomposition, because 1 to the power of pretty much anything is still 1.

Grimgar

Posts: 12
Joined: Fri Jun 24, 2011 6:04 pm UTC

### Re: Prime Factors and Proper Divisors in Python

PM 2Ring wrote:I've only had a quick look at your code, and it mostly looks ok, but you might consider changing the prime factor list to a list of tuples rather than a list of lists. Always use tuples for immutable sequences - they are more efficient than lists, both the interpreter and anyone reading your code will instantly know that the contents won't change.

I've done that now and it gave a very slight speedup (shaved about a second off running the test up to 200000). I've done some refactoring as well and now my program looks like this:

Code: Select all
`#!/usr/bin/env pythonfrom math import sqrtfrom sys import argvfrom time import clockdef factorize(n):   if n < -1: return [(-1, 1)] + factorize(-n)   elif n == -1: return [(-1, 1)]   elif n == 0: return [(0, 1)]   elif n == 1: return [(1, 1)]   else:      def factor_out(n, divisor):         power = 0         while (n % divisor) == 0:            n //= divisor            power += 1         return (n, power)      factors = []      n, power = factor_out(n, 2)      if power > 0:         factors.append((2, power))      divisor = 3      sqrtn = sqrt(n)      while divisor <= sqrtn:         n, power = factor_out(n, divisor)         if power > 0:            factors.append((divisor, power))            sqrtn = sqrt(n)         divisor += 2      if n > 1:         factors.append((n, 1))      return factorsdef divisors_from_factors(factors):   def unsorted_divisors_from_factors(factors):      if not factors: return [1]      else:         base, max_power = factors[0]         if base == -1: return unsorted_divisors_from_factors(factors[1:])         elif base == 0: return []         elif base == 1: return unsorted_divisors_from_factors(factors[1:])         else:            divisors = unsorted_divisors_from_factors(factors[1:])            all_divisors = []            for power in xrange(0, max_power+1):               all_divisors += map(lambda x: x * base ** power, divisors)            return all_divisors   all_divisors = unsorted_divisors_from_factors(factors)   all_divisors.sort()   return all_divisorsdef test_factorize():   start = clock()   n = 0   while True:      f = factorize(n)      fa = map(lambda x: x[0] ** x[1], f)      fb = reduce(lambda x,y: x * y, fa, 1)      if fb != n:         print "FACTORIZE FAILED AT " + str(n)      d = divisors_from_factors(f)      da = map(lambda x: d[x] * d[len(d)-x-1], xrange(0, len(d)));      for db in da:         if db != n:            print "DIVISORS FAILED AT " + str(n)      f = factorize(-n)      fa = map(lambda x: x[0] ** x[1], f)      fb = reduce(lambda x,y: x * y, fa, 1)      if fb != -n:         print "FACTORIZE FAILED AT " + str(-n)      d = divisors_from_factors(f)      da = map(lambda x: d[x] * d[len(d)-x-1], xrange(0, len(d)));      for db in da:         if db != n:            print "DIVISORS FAILED AT " + str(-n)      if (n % 10000) == 0:         print "up to " + str(n) + " at " + str(clock() - start) + "s"      n += 1def run_factorize(n):   def str_from_factors_exp(factors):      def str_from_factor(factor):         if factor[1] == 1: return str(factor[0])         else: return "^".join(map(str, factor))      return " * ".join(map(str_from_factor, factors))   def str_from_factors_mul(factors):      factors = map(lambda x: [x[0]] * x[1], factors)      factors = reduce(lambda x,y: x + y, factors, [])      return " * ".join(map(str, factors))   def pairs_from_divisors(d):      return map(lambda x: (d[x], d[len(d)-x-1]), xrange(0, (1 + len(d)) // 2))   def str_from_pairs(pairs):      pairs = map(lambda x: "*".join(map(str, x)), pairs)      return "0*0" if not pairs else "\t".join(pairs)   f = factorize(n)   print str(n) + " = " + str_from_factors_exp(f)   print str(n) + " = " + str_from_factors_mul(f)   print   d = divisors_from_factors(f)   print "Divisors: " + ("N/A" if not d else ", ".join(map(str, d)))   s = reduce(lambda x,y: x + y, d, 0)   sn = s - abs(n)   s2n = sn - abs(n)   ss = "zero" if s == 0 else "unit" if sn == 0 else "prime" if sn == 1 else "deficient" if s2n < 0 else "perfect" if s2n == 0 else "abundant"   print "sigma_0(n) = " + str(len(d))   print "sigma_1(n) = " + str(s)   print "sigma_1(n)-n = " + str(sn)   print "sigma_1(n)-2n = " + str(s2n)   print str(abs(n)) + " is " + ss + "."   print   p = pairs_from_divisors(d)   print "Pairs:"   print str_from_pairs(p)for i in xrange(1, len(argv)):   if i > 1: print   if argv[i] == "test": test_factorize()   else: run_factorize(int(argv[i]))`

I've tried twice to use a sieve instead of counting off odd numbers, and while my sieves generate prime numbers just fine, they drive the time for factorization up at least fourfold. :/

This was my first sieve:

Code: Select all
`class Sieve:   def __init__(self):      self.primes = [2, 3, 5, 7, 11, 13, 17, 19]      self.index = 0   def next_prime(self, max):      num_primes = len(self.primes)      if self.index >= num_primes:         last_prime = self.primes[num_primes - 1]         pot_prime = last_prime + 2         max_pot_prime = last_prime * last_prime         while True:            for divisor in self.primes[:num_primes]:               if (pot_prime % divisor) == 0:                  break # goto next_pot_prime            else:               self.primes.append(pot_prime)               if pot_prime >= max or pot_prime >= max_pot_prime:                  break # goto return_next_prime            # next_pot_prime:            pot_prime += 2      # return_next_prime:      result = self.primes[self.index]      self.index += 1      return result`

Here is my second:

Code: Select all
`class Sieve:   def __init__(self):      self.primes = [2, 3, 5, 7, 11, 13, 17, 19]      self.index = 0      self.next_range_min = 20      self.next_range_max = 100   def next_prime(self):      while self.index >= len(self.primes):         new_primes = list(xrange(self.next_range_min, self.next_range_max))         for divisor in self.primes:            start = divisor * (self.next_range_min // divisor)            for multiple in xrange(start, self.next_range_max, divisor):               if multiple in new_primes:                  new_primes.remove(multiple)         self.primes += new_primes         self.next_range_min = self.next_range_max         self.next_range_max += 100      result = self.primes[self.index]      self.index += 1      return result`

Unless I'm doing something totally wrong here, I don't think this will be helpful in factoring a single number. For generating primes or factoring multiple numbers, sure, but that's not what I'm doing. I think what I have now is as good as I can get without switching methods, and good enough for my purposes (factoring mostly small numbers).
Stephen Hawking: Great. The entire universe was destroyed.
Fry: Destroyed? Then where are we now?
Al Gore: I don't know. But I can darn well tell you where we're not—the universe!

RebeccaRGB

Posts: 336
Joined: Sat Mar 06, 2010 7:36 am UTC
Location: Lesbians Love Bluetooth

### Re: Prime Factors and Proper Divisors in Python

If you don't want to go full-sieve, try a more limited form. Counting all the odd numbers just excludes those divisible by 2. This cuts out half, obviously. If you also exclude those divisible by 3, you can cut out fully another third of your testing. This is really easy - just increment by 2 and 4 alternating. (This gives you the number 6n+1 and 6n+5.)

You can go further if you're willing to complicate things a little more, and cut out those divisible by 5 as well. This gets complicated, because now the wheel of increments goes up to 30, but you'll only be testing 8 numbers out of every 30, instead of 10 out of every 30. Depending on how easy your implementation of *that* was, you can even go up to excluding 7, but that means testing 49 out of every 210 (instead of 56 out of every 210).
(defun fibs (n &optional (a 1) (b 1)) (take n (unfold '+ a b)))

Xanthir
My HERO!!!

Posts: 4142
Joined: Tue Feb 20, 2007 12:49 am UTC

### Re: Prime Factors and Proper Divisors in Python

RebeccaRGB wrote:Unless I'm doing something totally wrong here, I don't think this will be helpful in factoring a single number. For generating primes or factoring multiple numbers, sure, but that's not what I'm doing. I think what I have now is as good as I can get without switching methods, and good enough for my purposes (factoring mostly small numbers).

Yeah. A full sieve will only slow things down if you're only factoring single numbers, especially if they're relatively small. When doing that sort of thing in Python I just use a simple generator function to produce 2, 3, 6n-1, 6n+1, although it'd be easy to make a generator that implements Xanthir's suggestion of 2, 3, 5, 30n + [1, 7, 11, 13, 17, 19, 23, 29].

It's kinda neat that there are 8 numbers in that set, since there are 8 bits in a byte. So in languages that make bitwise operations convenient you can create a table of primes, with each byte in the table holding the prime/composite status of a block of 30 numbers. Such tables are fairly compact for storing smallish primes, although they start to get a bit sparse for larger primes.

PM 2Ring

Posts: 2961
Joined: Mon Jan 26, 2009 3:19 pm UTC
Location: Mid north coast, NSW, Australia

### Re: Prime Factors and Proper Divisors in Python

Awesome, I've implemented Xanthir's suggestion of [2, 3, 5, 30n + [7, 11, 13, 17, 19, 23, 29, 31]]. Factoring 88888888888888888 is down from 0.6s to 0.4s. Factoring 12345678901234567 is down from 11.7s to 7.1s. Running the test up to 200000 is down from 24s to 21s.

Here's what I have now:
Code: Select all
`#!/usr/bin/env pythonfrom math import sqrtfrom sys import argvfrom time import clockdef factorize(n):   if n < -1: return [(-1, 1)] + factorize(-n)   elif n == -1: return [(-1, 1)]   elif n == 0: return [(0, 1)]   elif n == 1: return [(1, 1)]   else:      def factor_out(n, divisor):         power = 0         while (n % divisor) == 0:            n //= divisor            power += 1         return (n, power)      factors = []      sqrtn = sqrt(n)      for divisor in (2, 3, 5):         if divisor > sqrtn:            break;         n, power = factor_out(n, divisor)         if power > 0:            factors.append((divisor, power))            sqrtn = sqrt(n)      divisor_group = 0      while divisor_group < sqrtn:         for divisor_index in (7, 11, 13, 17, 19, 23, 29, 31):            divisor = divisor_group + divisor_index            if divisor > sqrtn:               break;            n, power = factor_out(n, divisor)            if power > 0:               factors.append((divisor, power))               sqrtn = sqrt(n)         divisor_group += 30      if n > 1:         factors.append((n, 1))      return factorsdef divisors_from_factors(factors):   def unsorted_divisors_from_factors(factors):      if not factors: return [1]      else:         base, max_power = factors[0]         if base == -1: return unsorted_divisors_from_factors(factors[1:])         elif base == 0: return []         elif base == 1: return unsorted_divisors_from_factors(factors[1:])         else:            divisors = unsorted_divisors_from_factors(factors[1:])            all_divisors = []            for power in xrange(0, max_power+1):               all_divisors += map(lambda x: x * base ** power, divisors)            return all_divisors   all_divisors = unsorted_divisors_from_factors(factors)   all_divisors.sort()   return all_divisorsdef test_factorize():   start = clock()   n = 0   while True:      f = factorize(n)      fa = map(lambda x: x[0] ** x[1], f)      fb = reduce(lambda x,y: x * y, fa, 1)      if fb != n:         print "FACTORIZE FAILED AT " + str(n)      d = divisors_from_factors(f)      da = map(lambda x: d[x] * d[len(d)-x-1], xrange(0, len(d)));      for db in da:         if db != n:            print "DIVISORS FAILED AT " + str(n)      f = factorize(-n)      fa = map(lambda x: x[0] ** x[1], f)      fb = reduce(lambda x,y: x * y, fa, 1)      if fb != -n:         print "FACTORIZE FAILED AT " + str(-n)      d = divisors_from_factors(f)      da = map(lambda x: d[x] * d[len(d)-x-1], xrange(0, len(d)));      for db in da:         if db != n:            print "DIVISORS FAILED AT " + str(-n)      if (n % 10000) == 0:         print "up to " + str(n) + " at " + str(clock() - start) + "s"      n += 1def run_factorize(n):   def str_from_factors_exp(factors):      def str_from_factor(factor):         if factor[1] == 1: return str(factor[0])         else: return "^".join(map(str, factor))      return " * ".join(map(str_from_factor, factors))   def str_from_factors_mul(factors):      factors = map(lambda x: [x[0]] * x[1], factors)      factors = reduce(lambda x,y: x + y, factors, [])      return " * ".join(map(str, factors))   def pairs_from_divisors(d):      return map(lambda x: (d[x], d[len(d)-x-1]), xrange(0, (1 + len(d)) // 2))   def str_from_pairs(pairs):      pairs = map(lambda x: "*".join(map(str, x)), pairs)      return "0*0" if not pairs else "\t".join(pairs)   f = factorize(n)   print str(n) + " = " + str_from_factors_exp(f)   print str(n) + " = " + str_from_factors_mul(f)   print   d = divisors_from_factors(f)   print "Divisors: " + ("N/A" if not d else ", ".join(map(str, d)))   s = reduce(lambda x,y: x + y, d, 0)   sn = s - abs(n)   s2n = sn - abs(n)   ss = "zero" if s == 0 else "unit" if sn == 0 else "prime" if sn == 1 else "deficient" if s2n < 0 else "perfect" if s2n == 0 else "abundant"   print "sigma_0(n) = " + str(len(d))   print "sigma_1(n) = " + str(s)   print "sigma_1(n)-n = " + str(sn)   print "sigma_1(n)-2n = " + str(s2n)   print str(abs(n)) + " is " + ss + "."   print   p = pairs_from_divisors(d)   print "Pairs:"   print str_from_pairs(p)for i in xrange(1, len(argv)):   if i > 1: print   if argv[i] == "test": test_factorize()   else: run_factorize(int(argv[i]))`
Stephen Hawking: Great. The entire universe was destroyed.
Fry: Destroyed? Then where are we now?
Al Gore: I don't know. But I can darn well tell you where we're not—the universe!

RebeccaRGB

Posts: 336
Joined: Sat Mar 06, 2010 7:36 am UTC
Location: Lesbians Love Bluetooth

### Re: Prime Factors and Proper Divisors in Python

It's considered good style to avoid code duplication. For example, you have repeated
Code: Select all
`         if divisor > sqrtn:            break;         n, power = factor_out(n, divisor)         if power > 0:            factors.append((divisor, power))            sqrtn = sqrt(n)               sqrtn = sqrt(n)`

Doing that kind of thing is easy to do via copy & paste, but it's also easy for errors to creep in if you make a typo. And if you decide to modify your algorithm you have to make changes in two places instead of one. Also, anyone reading your code will wonder if the two sections are really supposed to be identical, or if there's supposed to be some subtle difference. Tiny bits of code duplication are tolerable (and maybe even desirable in some circumstances), but it's best to get into the habit of trying to avoid it when you can.

Generally, code duplication can be avoided by placing the offending code into a function; in Python it's often convenient to do this using a local function, i.e., a function defined inside the function that uses it. In this particular case, you have another option: you can use a generator function to produce the sequence of candidate factors for you.

Here's one way to write such a generator function:
Code: Select all
`#! /usr/bin/env pythonimport sysdef potential_primes():    ''' Make a generator for 2, 3, 5, & thence all numbers coprime to 30 '''    s = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)    for i in s:        yield i    s = (1,) + s[3:]     j = 30    while True:        for i in s:            yield j + i        j += 30def main():    m = len(sys.argv) > 1 and int(sys.argv[1]) or 20    f = potential_primes()    for i in xrange(m):         print i, f.next()if __name__ == '__main__':    main()`

PS.
Since you're currently playing with divisors and related matters, you maybe interested in the Möbius mu function, and the beautiful Möbius inversion formula.

PM 2Ring

Posts: 2961
Joined: Mon Jan 26, 2009 3:19 pm UTC
Location: Mid north coast, NSW, Australia

### Re: Prime Factors and Proper Divisors in Python

Oooh, generators are awesome!

(I am a software engineer, so I know the DRY principle, but I write Java all day, so I'm not used to all the awesome features Python has. )

Using a generator makes the code the slightest bit slower, but it allows me to remove the factor_out function with its returning of a tuple only to be untupled later, which gives me a really nice speedup. Running the test up to 200000 is now 19.3s. Factoring 88888888888888888 is down to 0.3s. Factoring 12345678901234567 is down to 5s. Woohoo!

And here's the code now.

Code: Select all
`#!/usr/bin/env pythonfrom math import sqrtfrom sys import argvfrom time import clockdef factorize(n):   if n < -1: return [(-1, 1)] + factorize(-n)   elif n == -1: return [(-1, 1)]   elif n == 0: return [(0, 1)]   elif n == 1: return [(1, 1)]   else:      def potential_primes():         base_primes = (2, 3, 5)         for base_prime in base_primes:            yield base_prime         base_primes = (7, 11, 13, 17, 19, 23, 29, 31)         prime_group = 0         while True:            for base_prime in base_primes:               yield prime_group + base_prime            prime_group += 30      factors = []      sqrtn = sqrt(n)      for divisor in potential_primes():         if divisor > sqrtn:            break         power = 0         while (n % divisor) == 0:            n //= divisor            power += 1         if power > 0:            factors.append((divisor, power))            sqrtn = sqrt(n)      if n > 1:         factors.append((n, 1))      return factorsdef divisors_from_factors(factors):   def unsorted_divisors_from_factors(factors):      if not factors: return [1]      else:         base, max_power = factors[0]         if base == -1: return unsorted_divisors_from_factors(factors[1:])         elif base == 0: return []         elif base == 1: return unsorted_divisors_from_factors(factors[1:])         else:            divisors = unsorted_divisors_from_factors(factors[1:])            all_divisors = []            for power in xrange(0, max_power+1):               all_divisors += map(lambda x: x * base ** power, divisors)            return all_divisors   all_divisors = unsorted_divisors_from_factors(factors)   all_divisors.sort()   return all_divisorsdef test_factorize():   start = clock()   n = 0   while True:      f = factorize(n)      fa = map(lambda x: x[0] ** x[1], f)      fb = reduce(lambda x,y: x * y, fa, 1)      if fb != n:         print "FACTORIZE FAILED AT " + str(n)      d = divisors_from_factors(f)      da = map(lambda x: d[x] * d[len(d)-x-1], xrange(0, len(d)))      for db in da:         if db != n:            print "DIVISORS FAILED AT " + str(n)      f = factorize(-n)      fa = map(lambda x: x[0] ** x[1], f)      fb = reduce(lambda x,y: x * y, fa, 1)      if fb != -n:         print "FACTORIZE FAILED AT " + str(-n)      d = divisors_from_factors(f)      da = map(lambda x: d[x] * d[len(d)-x-1], xrange(0, len(d)))      for db in da:         if db != n:            print "DIVISORS FAILED AT " + str(-n)      if (n % 10000) == 0:         print "up to " + str(n) + " at " + str(clock() - start) + "s"      n += 1def run_factorize(n):   def str_from_factors_exp(factors):      def str_from_factor(factor):         if factor[1] == 1: return str(factor[0])         else: return "^".join(map(str, factor))      return " * ".join(map(str_from_factor, factors))   def str_from_factors_mul(factors):      factors = map(lambda x: [x[0]] * x[1], factors)      factors = reduce(lambda x,y: x + y, factors, [])      return " * ".join(map(str, factors))   def pairs_from_divisors(d):      return map(lambda x: (d[x], d[len(d)-x-1]), xrange(0, (1 + len(d)) // 2))   def str_from_pairs(pairs):      pairs = map(lambda x: "*".join(map(str, x)), pairs)      return "0*0" if not pairs else "\t".join(pairs)   f = factorize(n)   print str(n) + " = " + str_from_factors_exp(f)   print str(n) + " = " + str_from_factors_mul(f)   print   d = divisors_from_factors(f)   print "Divisors: " + ("N/A" if not d else ", ".join(map(str, d)))   s = reduce(lambda x,y: x + y, d, 0)   sn = s - abs(n)   s2n = sn - abs(n)   ss = "zero" if s == 0 else "unit" if sn == 0 else "prime" if sn == 1 else "deficient" if s2n < 0 else "perfect" if s2n == 0 else "abundant"   print "sigma_0(n) = " + str(len(d))   print "sigma_1(n) = " + str(s)   print "sigma_1(n)-n = " + str(sn)   print "sigma_1(n)-2n = " + str(s2n)   print str(abs(n)) + " is " + ss + "."   print   p = pairs_from_divisors(d)   print "Pairs:"   print str_from_pairs(p)for i in xrange(1, len(argv)):   if i > 1: print   if argv[i] == "test": test_factorize()   else: run_factorize(int(argv[i]))`
Stephen Hawking: Great. The entire universe was destroyed.
Fry: Destroyed? Then where are we now?
Al Gore: I don't know. But I can darn well tell you where we're not—the universe!

RebeccaRGB

Posts: 336
Joined: Sat Mar 06, 2010 7:36 am UTC
Location: Lesbians Love Bluetooth

### Re: Prime Factors and Proper Divisors in Python

Probabilistic primality tests? (to figure out when to stop trying to factor)
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

Yakk
Poster with most posts but no title.

Posts: 10210
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

### Re: Prime Factors and Proper Divisors in Python

Doing a quick Fermat check can definitely save you a lot of time, as for non Carmichael numbers, most numbers have few to no Fermat liars. A quick check up to 200k turns up 115 composite numbers that have lairs, 82 of which have only 1 liar.

Spoiler:
We know all Carmichael numbers have at least three factors, so the least factor of a Carmichael number N must be at most N1/3, so we can detect all possible Carmichael numbers (the worst case of the FPT) by trying the FPT on bases up to and including N1/3 (since fast modular exponentiation allows an FPT test of one base to be log2(n), this is already faster than naive trial division for detecting primes). Thus, counting numbers with less than or equal to N1/3 liars will find all numbers with liars.

In other words, >99.9% of all composite numbers up to 200k can be identified with a single trial. Beyond a few rounds however, very little looks to be gained and some other test should be used. I suspect upon detection of even a single potential liar, it would be better to switch to trial division by primes.
All Shadow priest spells that deal Fire damage now appear green.
Big freaky cereal boxes of death.

WarDaft

Posts: 1553
Joined: Thu Jul 30, 2009 3:16 pm UTC

### Re: Prime Factors and Proper Divisors in Python

You could also try to find a prime factor by crowd sourcing prime-based cryptography (like RSA).

If your large number N differs by some even value that also looks to be prime (after dividing by 2)from some public value M = pq for some primes pq, then it's possible (wlog) that gcd(N,M) = p.

According to wiki answers, a reputable source, there are about 1 million banks in the world. If they all had online banking and used unique p's and q's, that's 2 million primes, and let's just suppose they're about 512 bit primes.

That's about 18.85 x 10^150 primes, and so you have a 10^-147 chance in factoring your N.

Yay.
mr-mitch

Posts: 473
Joined: Sun Jul 05, 2009 6:56 pm UTC

### Re: Prime Factors and Proper Divisors in Python

WarDaft wrote:Doing a quick Fermat check can definitely save you a lot of time, as for non Carmichael numbers, most numbers have few to no Fermat liars. A quick check up to 200k turns up 115 composite numbers that have lairs, 82 of which have only 1 liar.

Liars? Don't you think that's a bit unfair?

Those poor Carmichael Numbers. First we label them as "pseudoprimes"...now we ask them misleading questions and call them liars when they tell the truth!
troyp

Posts: 483
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW