Computer Science

A place to discuss the science of computers and programs, from algorithms to computability.

Formal proofs preferred.

Moderators: phlip, Larson, Moderators General, Prelates

Computer Science

Hey guys. I'm in need of some guidance regarding beginner's research into computer science. Some background information: I am going into my sophomore year of high school. So far, I have gained quite a lot of ground in both Java and Perl, and will most likely start learning C during the school year (thank you, Robotics Club!). In general, though, I think I am at a point where learning additional languages should come easily and much more quickly, since I have a more or less complete understanding of the fundamentals that are present throughout programming (logic, loops, OOP, data structures, etc.). Recently, it has dawned on me that it would be interesting to understand the foundations of programming. So, I managed to acquire PDF versions of The Art of Computer Programming (Knuth) and Introduction to Algorithms (MIT Press). I had to step away from both after a couple of chapters because of my lack of math skills. So, any suggestions as to how to better approach this? Should I try again with those books and slip in a discrete math textbook as a reference? Or maybe I should choose different material?

Thanks.

<(bLYaUn)>

Posts: 21
Joined: Fri May 02, 2008 1:11 am UTC

Re: Computer Science

You mean Intro to Algorithms by C, L, R, and S? That book has lots of wonderful pseudocode in it, so you may enjoy implementing the algorithms. I used it to implement a Fibonacci heap to run Dijkstra's with.

As for the math, most of it isn't essential if you don't have a professor to explain it to you. You may find certain proofs approachable, however, like perhaps the proof of the master theorem if you can solve recurrence relations, and, IIRC, the proofs for the randomized-select algorithm. The explanation of asymptotic running times is elementary as well.
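If you want to play with the recurrence-relation angle, here's a small Python sketch (illustrative only; the function T and the sample sizes are made up) that evaluates the merge-sort-style recurrence T(n) = 2T(n/2) + n and compares it against the n*log2(n) growth the master theorem predicts:

```python
# Evaluate the recurrence T(n) = 2*T(n/2) + n, T(1) = 1, numerically
# and compare it against the master-theorem prediction Theta(n*log n).
from functools import lru_cache
import math

@lru_cache(maxsize=None)
def T(n):
    if n <= 1:
        return 1
    return 2 * T(n // 2) + n

for n in (2**10, 2**15, 2**20):
    ratio = T(n) / (n * math.log2(n))
    print(n, round(ratio, 3))  # the ratio settles toward a constant
```

For powers of two you can check by hand that T(2^k) = 2^k * (k + 1), so the printed ratio is (k+1)/k, heading to 1 as n grows.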

quintopia

Posts: 2790
Joined: Fri Nov 17, 2006 2:53 am UTC
Location: atlanta, ga

Re: Computer Science

"Introduction to Algorithms" is probably going to be a bit rough without several courses in discrete math under your belt. That book is typically used in upper-level CS theory courses. The word "Introduction" is used in the loosest sense possible (almost to the point that it's ironic). It's an "introduction" to algorithms provided you have sufficient experience and knowledge in the supporting topics of complexity theory, graph theory, probability, etc. No doubt you felt a bit out of your depth.

In general with math and science texts, it's hard to judge the intended audience of a textbook from the title alone. My old "A First Course in Probability" textbook is also a good example. It is a first course in probability theory, provided you have the requisite multivariable calculus background. Usually textbooks have a preface that outlines what the prerequisites are.

You probably want to start with a couple of first-level discrete math courses before tackling "Introduction to Algorithms".

To follow "Introduction to Algorithms", a short list of topics to focus on (if memory serves):
complexity theory
graph theory
probability
automata theory

Here's MIT's first-level discrete math course. It has lecture slides, notes, and problem sets included. Everything you need to get started.

Also, Wikipedia is great. Computer science is one of the topics that (at least for general concepts) actually tends to be covered very well by Wikipedia.

Posts: 687
Joined: Mon May 05, 2008 2:14 am UTC

Re: Computer Science

@quintopia: Yup, that's the one. I understood the pseudocode of the earliest algorithm presented (insertion sort). I even understood the notation used to label its run time. But then the book started writing linear functions that I couldn't decipher. I guess I'll just skip the equations and focus on the concepts for now.

@0xBADFEED: Are there any prerequisites to Discrete Math ? Besides those slides, do you have any suggestions for textbooks on this subject?

Thanks for your help so far guys.

<(bLYaUn)>

Posts: 21
Joined: Fri May 02, 2008 1:11 am UTC

Re: Computer Science

Discrete math is a huge branch of mathematics. It's really just a label that covers a whole lot of different fields. When people talk about taking a "discrete math" course, that usually means an introductory course that provides an overview of some of the more important fields. A more in-depth course would just be called what it is: a course in combinatorics, or graph theory, or numerical methods, etc.

I'll assume what you mean is "Are there any prerequisites for a typical first-year discrete math course?" The typical prereqs are just high-school-level algebra and a good head for logic. The link to the slides I posted in my last reply is not the meat of that course. Here is a link to the main material of that course. You should have a look around the OpenCourseWare offerings. This course is just the most relevant one, as it is a first-year introduction to several topics in discrete math.

I don't know what the de facto standard textbook for introductory discrete math is.

Posts: 687
Joined: Mon May 05, 2008 2:14 am UTC

Re: Computer Science

If you've had only basic math, the proof of insertion sort should still be well within your grasp if you spend some time with it. The loop invariant (a statement that is true at the beginning of each iteration) is easy to state: at the start of the jth iteration, the first j elements are sorted. It's just as easy to prove by induction that this invariant holds and leads to a sorted array. The base case is j = 1, and a list of one element is, of course, already sorted. For the inductive step, on the jth iteration the jth element is put in the correct position with respect to the already-sorted previous j-1 elements, so the first j elements end up in sorted order.

That's really all the insertion sort section is saying.
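If it helps, the invariant can be checked mechanically. Here's a minimal Python sketch (mine, not the book's pseudocode) that asserts the invariant at the top of each pass:

```python
# A sketch of the loop invariant above: before each outer iteration j,
# the first j elements of the array are already in sorted order.
def insertion_sort_checked(a):
    a = list(a)
    for j in range(1, len(a)):
        # loop invariant: a[0..j-1] is sorted
        assert a[:j] == sorted(a[:j])
        key = a[j]
        i = j
        while i > 0 and a[i - 1] > key:
            a[i] = a[i - 1]
            i -= 1
        a[i] = key
    return a

print(insertion_sort_checked([4, 6, 3, 7, 5, 9]))  # [3, 4, 5, 6, 7, 9]
```

The assert never fires, which is exactly what the induction argument promises.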

quintopia

Posts: 2790
Joined: Fri Nov 17, 2006 2:53 am UTC
Location: atlanta, ga

Re: Computer Science

Well I took your advice and sat down with Intro to Alg. again and read and re-read the pseudo-code of the insertion sort algorithm. Man, I was so stoked when I finally got it and it just clicked in my head!
Here is my Perl implementation of the insertion sort algorithm.

Code: Select all
#!/usr/bin/perl
use warnings;
use strict;

my @array = (4, 6, 3, 7, 5, 9);

# Insertion sort: shift the larger sorted elements right, then drop
# the key into place.
for my $x (1 .. $#array) {
    my $key = $array[$x];
    my $i   = $x;
    while ($i > 0 and $array[$i - 1] > $key) {
        $array[$i] = $array[$i - 1];
        $i--;
    }
    $array[$i] = $key;
}

print "@array\n";

Now it's time to tackle the part concerning the analysis of its run time. Wow! I never knew algorithms could be so much fun!

<(bLYaUn)>

Posts: 21
Joined: Fri May 02, 2008 1:11 am UTC

Re: Computer Science

With an algorithm like insertion sort, analyzing the running time is even easier than proving the correctness.

The outer loop runs n times for an array of length n.
The inner loop runs at most n times per outer iteration (because you can't do worse than moving an element all the way to the beginning).
Thus the inner loop runs at most n*n times in total, which means the algorithm is O(n²).

I'm not trying to spoonfeed you answers or anything (which I can't do really, because the answers are all right there in the book), just illustrate how I go about thinking about algorithms.
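To make that counting argument concrete, here's a rough Python sketch (an instrumented insertion sort I made up for illustration) that counts the inner-loop shifts; a reverse-sorted input is the worst case, since every element gets shifted all the way to the front:

```python
# Count the inner-loop shifts of insertion sort to see the n^2 bound.
# On a reverse-sorted input of length n, every pair is out of order,
# giving exactly n*(n-1)/2 shifts.
def insertion_sort_count(a):
    a = list(a)
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j
        while i > 0 and a[i - 1] > key:
            a[i] = a[i - 1]
            i -= 1
            shifts += 1
        a[i] = key
    return shifts

n = 100
print(insertion_sort_count(list(range(n, 0, -1))))  # n*(n-1)/2 = 4950
print(insertion_sort_count(list(range(n))))         # already sorted: 0
```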

quintopia

Posts: 2790
Joined: Fri Nov 17, 2006 2:53 am UTC
Location: atlanta, ga

Re: Computer Science

quintopia wrote:With an algorithm like insertion sort, analyzing the running time is even easier than proving the correctness.

The outer loop runs n times for an array of length n.
The inner loop runs at most n times per outer iteration (because you can't do worse than moving an element all the way to the beginning).
Thus the inner loop runs at most n*n times in total, which means the algorithm is O(n²).

I'm not trying to spoonfeed you answers or anything (which I can't do really, because the answers are all right there in the book), just illustrate how I go about thinking about algorithms.

So when analyzing an algorithm's runtime do you always assume worst-case scenario?

<(bLYaUn)>

Posts: 21
Joined: Fri May 02, 2008 1:11 am UTC

Re: Computer Science

Generally, yes. But not always. ^_^

Frex, quicksort is usually reported as O(n*log(n)), but really it's O(n²), just like any other generic sort. The difference is that quicksort is *almost always* roughly n*log(n), except when it hits certain specially-crafted inputs (well, not too special - the naive quicksort hits O(n²) if passed an already-sorted list).

So, even though quicksort is the same as any other sort in the worst case, the fact that it almost always does substantially better than that justifies calling it a lower run-time class.
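A quick empirical check of this (a naive first-element-pivot quicksort in Python, counting comparisons rather than wall time; illustrative only, not a real implementation):

```python
# Count comparisons made by a naive quicksort that always picks the
# first element as the pivot. Sorted input triggers the quadratic
# case; a shuffled input stays near n*log n.
import random

def quicksort_comparisons(a):
    comparisons = 0
    def qs(lst):
        nonlocal comparisons
        if len(lst) <= 1:
            return lst
        pivot, rest = lst[0], lst[1:]
        comparisons += len(rest)  # each element is compared to the pivot
        left = [x for x in rest if x < pivot]
        right = [x for x in rest if x >= pivot]
        return qs(left) + [pivot] + qs(right)
    qs(a)
    return comparisons

n = 500
sorted_input = list(range(n))
shuffled = sorted_input[:]
random.shuffle(shuffled)
print(quicksort_comparisons(sorted_input))  # n*(n-1)/2 = 124750
print(quicksort_comparisons(shuffled))      # on the order of n*log2(n)
```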

You've also got things like 'amortized complexity', where an event that will be repeated many times is *usually* very fast and easy, but *occasionally* very expensive and slow. If you can guarantee that the expensive operation will only happen at most a given fraction of the time, you can 'spread it out' over the rest of the cases.

Frex, say you're implementing a vector (arbitrary-length array) in C. The standard way to do this is to use an ordinary array (which has a fixed length), and then copy everything into a larger array when necessary (normally, you double the size of the array when it grows). So, in the normal case inserting an element into the array is O(1), just like an array. But occasionally your insert will end up being too much for the current array, and trigger a copy, which is O(n). If you double the array size each time, then this overflow only happens at most ln(n) times, where n is the number of inserts. When you average these together, you can say that the average (or amortized) complexity of an insert is O(ln(n)).

There are several other ways to measure complexity, each with their own purpose.
(defun fibs (n &optional (a 1) (b 1)) (take n (unfold '+ a b)))

Xanthir
My HERO!!!

Posts: 4021
Joined: Tue Feb 20, 2007 12:49 am UTC

Re: Computer Science

Xanthir wrote:Frex, quicksort is usually reported as O(n*log(n)), but really it's O(n²), just like any other generic sort. The difference is that quicksort is *almost always* roughly n*log(n), except when it hits certain specially-crafted inputs (well, not too special - the naive quicksort hits O(n²) if passed an already-sorted list).

So, even though quicksort is the same as any other sort in the worst case, the fact that it almost always does substantially better than that justifies calling it a lower run-time class.

Merge sort and heapsort have worst cases of O(n*log(n)).
qbg

Posts: 586
Joined: Tue Dec 18, 2007 3:37 pm UTC

Re: Computer Science

qbg wrote:Merge sort and heapsort have worst cases of O(n*log(n))

But implementation issues and stability concerns can make quicksort implementations faster and more desirable in practice. That said, merge sort is often better depending on the data structure to be sorted. For instance, sorting a linked list is more natural with merge sort than with quicksort.

In fact, it can be shown that n*lg(n) is a lower bound on the worst-case for any comparison-based sorting algorithm. All other sorting algorithms with lower worst-case complexities do not use direct comparisons, e.g. radix-sort or bucket-sort. However, these are not general sorting routines and place more restrictions on the types of elements that can be sorted. I think all of these are covered in "Intro to Algorithms".
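As a sketch of the non-comparison idea, here's a minimal counting sort in Python (it assumes integer keys in a known range 0..k-1, which is exactly the kind of restriction mentioned above):

```python
# Counting sort: O(n + k) for n integers in the range 0..k-1.
# No element is ever compared to another element - we just tally keys,
# so the n*lg(n) comparison-sort lower bound doesn't apply.
def counting_sort(a, k):
    counts = [0] * k
    for x in a:                          # O(n): tally each key
        counts[x] += 1
    out = []
    for value, c in enumerate(counts):   # O(k): emit keys in order
        out.extend([value] * c)
    return out

print(counting_sort([4, 1, 3, 1, 0, 4], k=5))  # [0, 1, 1, 3, 4, 4]
```

The price of skipping comparisons is generality: it only works when the keys are small integers (or can be mapped to them, as radix sort does digit by digit).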

Xanthir wrote: .... you can say that the average (or amortized) complexity of an insert is O(ln(n))

Actually, assuming resizing and copying the array is linear in the number of elements, you get n inserts for a total cost of C*n. So, amortizing the cost of expansion over the number of inserts yields C. The amortized complexity of inserting into the dynamically resized array is O(C), i.e. constant.

EDIT: When I say "insert" I mean insert into to the next empty space at the back of the array (i.e. a constant time operation)
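A small Python simulation of the doubling scheme (illustrative; it counts only the element copies triggered by expansion, ignoring the constant) shows the total copy cost staying below 2n, i.e. constant amortized cost per insert:

```python
# Simulate n back-inserts into a doubling array and count how many
# element copies the expansions cause. The total stays under 2n,
# so the amortized cost per insert is constant.
def total_copies(n_inserts):
    capacity, size, copies = 1, 0, 0
    for _ in range(n_inserts):
        if size == capacity:   # full: double and copy everything over
            copies += size
            capacity *= 2
        size += 1
    return copies

for n in (10, 1000, 10**6):
    print(n, total_copies(n), total_copies(n) / n)  # ratio stays below 2
```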
Last edited by 0xBADFEED on Thu Jul 31, 2008 12:33 pm UTC, edited 2 times in total.

Posts: 687
Joined: Mon May 05, 2008 2:14 am UTC

Re: Computer Science

qbg wrote:Merge sort and heapsort have worst cases of O(n*log(n))

But implementation issues and stability concerns can make quicksort implementations faster and more desirable in practice. That said, merge sort is often better depending on the data structure to be sorted. For instance, sorting a linked list is more natural with merge sort than with quicksort.
Also, it's easy to adapt quicksort so that the worst-case behavior moves: already-sorted arrays become n*log n, and much more special inputs are needed to trigger the worst case (look up median-of-three quicksort or something like that). Alternatively, you can make it truly run in n*log n worst-case time (though probably with a constant large enough that it isn't worth it).
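A minimal Python sketch of the median-of-three idea (a simplified list-based quicksort for illustration, not an in-place implementation):

```python
# Median-of-three pivot selection: pick the median of the first,
# middle, and last elements. An already-sorted input then splits
# roughly in half instead of hitting the quadratic case.
def median_of_three(lst):
    first, mid, last = lst[0], lst[len(lst) // 2], lst[-1]
    return sorted([first, mid, last])[1]

def quicksort(lst):
    if len(lst) <= 1:
        return lst
    pivot = median_of_three(lst)
    left = [x for x in lst if x < pivot]
    equal = [x for x in lst if x == pivot]
    right = [x for x in lst if x > pivot]
    return quicksort(left) + equal + quicksort(right)

print(quicksort(list(range(20))))  # sorted input is no longer pathological
```

On a sorted list the chosen pivot is the middle element, so the recursion depth stays logarithmic; adversarial inputs still exist, but they're much harder to hit.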
EvanED

Posts: 3781
Joined: Mon Aug 07, 2006 6:28 am UTC

Re: Computer Science

blyaun: this discussion of amortized complexity is probably a bit above what you should know just starting out, so, unless you're ambitious, don't bother.

As a more concise answer to your question, most people are interested in the worst-case or average time of an algorithm. The big-O can refer to either of these, which you can easily see by the definition of big-O. It may be of some use to you to convert my short argument above into a formal proof using the definition of big-O, which you will find in Intro to Algs.

You may also want to work out the average running time for insertion sort, which you do by essentially assuming the average number of steps for any operation that can take a variable number of steps. The inner loop of insertion sort is such an operation, so you would assume that the next unsorted value is always the median of the values already sorted. You should find that the average running time is also O(n²), but you should also find that the average number of steps differs from the worst-case number of steps by a constant factor.
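One way to see that constant factor is to count inner-loop shifts on random versus reverse-sorted input. A rough Python sketch (random trials, so the ratio is only approximate):

```python
# Compare insertion-sort shift counts: worst case (reverse-sorted)
# versus random input. Random permutations average n*(n-1)/4
# inversions, so the worst/average ratio sits near 2.
import random

def count_shifts(a):
    a = list(a)
    shifts = 0
    for j in range(1, len(a)):
        key, i = a[j], j
        while i > 0 and a[i - 1] > key:
            a[i] = a[i - 1]
            i -= 1
            shifts += 1
        a[i] = key
    return shifts

n = 400
worst = count_shifts(list(range(n, 0, -1)))   # n*(n-1)/2 shifts
avg = sum(count_shifts(random.sample(range(n), n))
          for _ in range(20)) / 20
print(worst, avg, worst / avg)  # ratio is close to 2
```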

quintopia

Posts: 2790
Joined: Fri Nov 17, 2006 2:53 am UTC
Location: atlanta, ga

Re: Computer Science

When you say "fundamentals of programming", do you mean algorithm design, or how the code actually functions/is implemented at a lower level? If the latter, I would recommend a book titled "From Bits and Gates to C and Beyond".
Albino

Posts: 5
Joined: Wed Jul 30, 2008 5:16 am UTC

Re: Computer Science

SICP and the video lectures (here or here) may be enlightening/useful.
qbg

Posts: 586
Joined: Tue Dec 18, 2007 3:37 pm UTC

Re: Computer Science

@Albino: What I meant is computer science, which I guess is the foundation that programming sits on. I know that computer science itself doesn't necessarily require programming, but from what I've read and heard, to be a good and competent programmer a solid knowledge of computer science is necessary. And other than that, it's a pretty interesting field by itself.

<(bLYaUn)>

Posts: 21
Joined: Fri May 02, 2008 1:11 am UTC