Pattern analysis

A place to discuss the science of computers and programs, from algorithms to computability.

Formal proofs preferred.

Moderators: phlip, Larson, Moderators General, Prelates

Pattern analysis

Postby elminster » Tue Jan 17, 2012 11:09 pm UTC

Preface: I'm kinda drunk, so bear with me.

I've always wondered about the complexity of arbitrary pattern analysis. The way I'd define the solution to "the problem" is finding the simplest logical pattern which can define any given pattern within a range of data.
For example (Very basic), given the data set of 2 4 8 16 32 64, you can clearly define it as a produce of 2^n, although technically you could define it using an infinite number of infinitely complex algorithms which produce the same numbers.
My question is, can we predict the optimal structure of program and/or architecture to solve such an arbitrary pattern analysis?

I assume it's more down to the given technology, e.g. Quantum computers would obviously allow more of a particular kind of calculation compared to regular calculations.
Also note, this question was probably better thought out when sober.

edit: I guess it's kind of like the p=np problem when taking it to it's logical extremity. Although I was thinking only a few levels of "meta" before it's logical end.
Image
elminster
 
Posts: 1383
Joined: Mon Feb 26, 2007 1:56 pm UTC
Location: London, UK, Dimensions 1 to 42.

Re: Pattern analysis

Postby Laguana » Wed Jan 18, 2012 12:07 am UTC

From a purely theoretical perspective, this reminds me of a universal AI which makes use of Kolmogorov complexity and occam's razor to try to learn how best to predict events.

Essentially, given a prefix of some input, say "a b a b a b", it guesses that the sequence was generated by the simplest Turing machine that could output it, and thus guesses how the sequence will evolve based on the behaviour of that Turing machine. In this case, the simplest turing machine which outputs "a b a b a b" might go on to output "a" next. Of course, the pattern might actually continue with a "z", but there is no reason to expect that at this point.

Re-reading your original question, it looks like you are more interested in the "magic happens" step here of determining this simplest Turing machine (or some other representation of an algorithm) that would output the sequence. One (again theoretical) way would be to enumerate all Turing machines in order of complexity, introducing a new machine and running all introduced machines one step each iteration, and stop when you find one giving the desired output. If there is a machine which generates your output, then you are guaranteed to find it in finite time!

In terms of a practical approach, it is a difficult problem. There's a "no free lunch" theorem which essentially says that you can't be great at everything. If you have a system which can detect / fit some patterns very well or quickly, then there must be some patterns where the system performs badly.
Laguana
 
Posts: 49
Joined: Sat Jan 19, 2008 10:13 pm UTC

Re: Pattern analysis

Postby WarDaft » Thu Jan 19, 2012 3:49 am UTC

In terms of a practical approach, it is a difficult problem. There's a "no free lunch" theorem which essentially says that you can't be great at everything. If you have a system which can detect / fit some patterns very well or quickly, then there must be some patterns where the system performs badly.
Enumerating Turing machines is actually worse than naive trial and error optimization for time complexity, so I don't think that applies here.

We can also consider Solomonoff induction (a similar scheme, except it effectively considers all machines which output the prefix, and weights them according to their complexity) but bounded in execution time by some function of input. We will be left with some uncertainty as to which the most probable next symbol is. We can reduce the uncertainty to arbitrarily close to a value dependent upon the Chaitin constant by running for longer, but never dependably eliminate it. Asymptotically, we cannot do better for an estimation.
All Shadow priest spells that deal Fire damage now appear green.
Big freaky cereal boxes of death.
User avatar
WarDaft
 
Posts: 1538
Joined: Thu Jul 30, 2009 3:16 pm UTC

Re: Pattern analysis

Postby Proginoskes » Thu Jan 19, 2012 6:35 am UTC

Laguana wrote:Re-reading your original question, it looks like you are more interested in the "magic happens" step here of determining this simplest Turing machine (or some other representation of an algorithm) that would output the sequence. One (again theoretical) way would be to enumerate all Turing machines in order of complexity, introducing a new machine and running all introduced machines one step each iteration, and stop when you find one giving the desired output. If there is a machine which generates your output, then you are guaranteed to find it in finite time!


You don't need to worry about the "If" part here ... You always have the TM which prints a, moves to the right, prints b, moves to the right, prints a, moves to the right, etc., up through the end of the finite sequence, and then prints an a and stops.
User avatar
Proginoskes
 
Posts: 309
Joined: Mon Nov 14, 2011 7:07 am UTC
Location: Sitting Down

Re: Pattern analysis

Postby elminster » Sun Jan 22, 2012 12:15 am UTC

This is partially rambling about things on my mind because drink (Wow, drunk for two posts in a row on the subforum... great start)

Considering it further I'd imagine it would heavily rely on precomputed knowledge. Consider that the term "cube" would represent a well defined mathematical concept of a 3D object containing only 90 degree angles with equal length sides. Although given that, you could abstract to a level to say that a pattern of X belongs to a concept of Y, which is further explained by Z. At some level you would need to determine if a particular level of abstraction is useful (e.g. "Shape", "Cube"); this is partially arbitrary beyond terminology which defines their logical boundaries. e.g. A "rectangle" contains the subset of "square", but it's logically impossible for every "rectangle" to be contained within the set of "squares", there's a logical line between them; however, "table"* is partially subjective as there's no logical boundaries to the definition.

*You could theoretically define a table to it's logical limits of being a parallel to it's base (as part of the definition) but humans would consider things a few degrees off as a table, furthermore even a 45degree surface above base level could be considered a table if objects were designed to adhere to it in some manner. When taking such considerations into account, essentially you consider any surface raised above it's base to be a table or only parallel. Point being: most things in language don't have a logical definition.

Really, I guess you'd have to consider human language to really define anything via computers. If a human understands it, then technically you can consider it defined to enough detail (Beyond logical restraints).

This partially leads onto another topic I been meaning to post:
In my mind it's inevitable that human language interpretation becomes a middleware industry. Essentially such a middleware would map human interactions with a defined set of goals. It's rather complicated when you considering asking the question of "is a particular merger going to raise net profits within X amount of time", for humans it's immediately obvious that when a huge conglomerate overtakes a fledgling company, that it's chances of success dramatically increase, but quantifying it would essentially require a computer to quantify everything. However, like how method of trimming the probability tree is the most important thing in solving any particular game using game theory... you could heuristically calculate what level of calculation in the probability tree can be obtained given a particular set of calculation resources.

Given that human interaction is relatively fixed in terms of scope, it's inevitable that computers would excel in computational ability. Although given that human interaction isn't logical (taken to it's logical extremes you'd have to consider quantum theories and essentially the question of can the future be perfectly predicted given as much knowledge as possible, but I digress), yet you can define it within "useful" logical constraints.

Anyyyywayyy... I wonder how long it will be be before companies start offering middleware to map arbitrary human interactions with defined goals.

tbh, I've thought way too long about it to condense into a single post, let alone while (now sobering due to time taken to post) partially drunk. I'm gonna post this anyway, despite the fact that this would usually be relegated to unposted things. I must hold some kind of record for time spent writing posts which just get deleted at the end *existential sigh*
Image
elminster
 
Posts: 1383
Joined: Mon Feb 26, 2007 1:56 pm UTC
Location: London, UK, Dimensions 1 to 42.

Re: Pattern analysis

Postby DrZiro » Sat Jan 28, 2012 12:01 am UTC

It looks like you have a pretty big and perhaps somewhat vague question here. It's not easy to say what the "simplest logical pattern" is. But looking at your first example, we could restrict the problem to:
"Given a list of n numbers no larger than m, what is the asymptotic time complexity for finding the optimal generator for the list?"
Then we have to define "optimal generator" for this purpose. We could think in terms of Turing machines, that's one quite natural way. But there are other ways. We can also think of it this way:
Define a set of unary arithmetic operators. A generator for L is a list of operators such that when applied (in order) to the integer n, they return the nth element in L. A generator is optimal if there is no other generator for that L with fewer operators.

In that case, the first thing we can say is that since each generator only gives one L, there has to be at least as many generators as L. So if we express each generator as a bitstring, there has to be some generator which has at least lb(n^m) bits (well, maybe minus one or something). It's impossible to print that in less than O(lb(n^m)) time, so there is no way your algorithm can run faster than that.

But I doubt you can come up with such an algorithm. In fact, for unlimited n, I'm not sure there is any set of operators that are sure to solve all L at all. (For limited n, we can have operators which say "if the number is a, write b", for a from 0 to n.)

That's actually a very interesting question. Damn, now I won't get any sleep tonight.
DrZiro
 
Posts: 83
Joined: Mon Feb 09, 2009 3:51 pm UTC


Return to Computer Science

Who is online

Users browsing this forum: No registered users and 2 guests