A formula based approach to Arithmetic Coding

A place to discuss the science of computers and programs, from algorithms to computability.

Formal proofs preferred.

Moderators: phlip, Moderators General, Prelates

siara
Posts: 3
Joined: Mon Sep 07, 2015 11:08 am UTC

A formula based approach to Arithmetic Coding

Postby siara » Mon Sep 07, 2015 11:14 am UTC

I have been doing some research on Entropy coding and discovered some interesting relationships between Arithmetic Coding and Huffman Coding. I made an article and am presenting it here for review:

h t t p (colon) (slash) (slash) siara.cc (slash) arithmetic_coding_new_approach

Sorry about the above URL. It does not allow me to post if I include the regular URL. Please copy paste to browser. My article is too lengthy to paste here.

Please let me know your views.

Regards
Arun
Last edited by siara on Sat Sep 12, 2015 3:44 am UTC, edited 1 time in total.

siara
Posts: 3
Joined: Mon Sep 07, 2015 11:08 am UTC

Re: A formula based approach to Arithmetic Coding

Postby siara » Sat Sep 12, 2015 3:43 am UTC

The URL is :

h t t p (colon) (slash) (slash) siara.cc (slash) arithmetic_coding_new_approach

Arun

Tub
Posts: 325
Joined: Wed Jul 27, 2011 3:13 pm UTC

Re: A formula based approach to Arithmetic Coding

Postby Tub » Sat Sep 12, 2015 7:26 am UTC

http://siara.cc/arithmetic_coding_new_approach/

Arithmetic coding is easy if you're doing maths, but a nightmare if you're doing computer science and can't just assume arbitrary precision.

Your approach seems mathematically sound. You could independently calculate each symbol's contribution to the final value and then add everything up. But that requires you to do all calculations with at least as many bits of precision as the shannon entropy of your plaintext. Which is possible with "Hello world" (a 64-byte float should be enough for <32 bits of compressed data), but will get you into a lot of trouble once you try to apply the principle to any reasonably large amount of data. Requiring this precision means that your runtime complexity goes from O(n) to O(n^2) or worse, and no amount of parallelization can save you from that.

siara
Posts: 3
Joined: Mon Sep 07, 2015 11:08 am UTC

Re: A formula based approach to Arithmetic Coding

Postby siara » Sat Sep 12, 2015 8:56 am UTC

Tub wrote:Arithmetic coding is easy...


I agree with you on all points. You seem to have gone through it fully. Thanks. Particularly I liked the statement "no amount of parallelization can save you from that.".

I didn't intend it for practical implementation. The formula could be useful for further research, thats why I made the article. I am also hoping that someday I would be able to find a way to get over the precision problem.

I implemented it on C++. As you said it loses precision after 64 bits and can't decode any further. But thats a problem with any approach. Thats why they are talking of reduced precision, range encoding etc.

By the way, how were you able to post URL in the reply? I couldn't.

Thanks again. Nice meeting you.

User avatar
PM 2Ring
Posts: 3620
Joined: Mon Jan 26, 2009 3:19 pm UTC
Location: Mid north coast, NSW, Australia

Re: A formula based approach to Arithmetic Coding

Postby PM 2Ring » Sat Sep 12, 2015 9:25 am UTC

siara wrote:By the way, how were you able to post URL in the reply? I couldn't.

The answer to that question can be found in the xkcd Forum Rules, along with other useful information.


Return to “Computer Science”

Who is online

Users browsing this forum: No registered users and 6 guests