CM10228 / Programming II:   Lecture 4


Sorting Algorithms & Complexity


I. Big O Notation.

  1. Normally, certain aspects of the algorithm (for example, its complexity, or the duration of its longest operations) have the most impact on how long it takes.  These aspects are said to dominate.
  2. To make it clear that we are only talking about the dominant factor when we analyze an algorithm, we talk about the order of the algorithm. 
    1. The order has its own notation, called "big O" and written like this:  O(n) (read "Order n").
    2. Order notation throws out constants.  For example, if an algorithm takes 2n time (for example, reads through a whole list twice) it is still O(n), the same order as if you only went through the list once.
    3. Sometimes constants matter – for example, if it's going to take a year to go through the list – but usually they don't.
  3. Since "N is the length of the original list, the Number of items", a function of linear complexity is said to be O(n).
  4. Just as constants don't matter, neither do lower-order components of an algorithm.  So if an algorithm actually takes n^5 + n^3 + 7, we say its complexity is O(n^5).
  5. This is only for addition!  Obviously if the algorithm takes n^5 * n^3, its complexity is O(n^8)!
  6. Since ultimately the most important thing about algorithms isn't their exact rate of growth, but rather what the curves look like (remember the graph) we often don't even bother to specify the base for a logarithm or the exponent for a polynomial-time algorithm.  We just say "logarithmic" or "polynomial".
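As a small illustration of throwing out constants (a sketch of my own, not from the lecture): the method below reads the whole list twice --- 2n steps --- but it is still O(n).

    // Hedged example: two passes over the array is 2n work, but still O(n).
    static int countOfMax(int[] a) {
        int max = a[0];
        for (int x : a) if (x > max) max = x;    // first pass: find the largest value
        int count = 0;
        for (int x : a) if (x == max) count++;   // second pass: count how often it appears
        return count;                            // about 2n steps in total: O(n)
    }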

II. Types of Complexity & Their Dominance Relations

This table is shamelessly cribbed from Gerald Kruse's page on Algorithm Efficiency (which also tells you about big-O arithmetic!  Something I've never used, but then I do AI & psychology, not theory.)

Let  a, b, c, and d denote constants

In order of dominance (top is most dominant).  Realize that algorithms of higher dominance are bad!

Function    condition                   common name

N^n
N!                                      N factorial
a^n         dominates b^n if a > b      exponential
N^c         dominates N^d if c > d      polynomial (cubic, quadratic, etc.)
n log n                                 n log n
n                                       linear
log n                                   logarithmic (regardless of base)
1                                       constant



III. Criteria for Evaluating an Algorithm, Re-revisited 

  1. What do you need to know about algorithms and complexity?
    1. Worst case,
      1. If everything is organized in the worst possible way, how long will it take?
    2. Best case,
      1. If you are really lucky, what's the fastest it could run?
    3. Average or expected case.
      1. What is the most probable situation?
  2. For example, with our tree-searching algorithm, the best case is log2 n, and the worst case (if everything was already sorted when we started, so the tree is just one long list) is n.  But if we know that the original ordering is random, then we can be fairly certain that the real cost is pretty close to log2 n.
  3. In the old days, people were mostly obsessed with the worst case, but nowadays we often care more about the expected case.  This is because if we approach the problem as a system issue rather than strictly an algorithmic one, we can often recognize & terminate a worst-case scenario.  But this depends on how bad and how frequent worst cases are.
  4. Technically, Big O notation should be used to give a boundary for the worst case.  Follow that link (to Wikipedia) to see the formal definition of other sorts of notation for defining complexity bounds.  But computer scientists tend to be sloppy & use the same notation & just say in words "average case is..."
  5. Old notes from previous lecturer said: "Average case analysis is mathematically much harder to perform, and for many algorithms is not known."
  6. One practical way to determine the average case is with a stopwatch (well, anyway, to run statistics!) --- see the sketch after this list.
  7. Here are some folks who used a stopwatch on the sorting algorithms (their page has since gone away, so I had to get it from the Wayback Machine).
    1. Notice they get more precise results than we'll get below.
    2. But notice also that the difference between O(n^2) and O(n log n) matters a lot more!
  8. Things to remember about big O values:
    1. Since it is an upper bound on running time, performance could be much better.
    2. The input that causes worst case performance could be very rare --- and it might be recognizable or avoidable.
    3. You don't know what the constants are ---
      1. May be very large.
      2. Even a constant of 2 matters a lot if your operation is going to take 3 years (e.g. database conversions.)
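Here's roughly what that stopwatch looks like in code (a sketch of my own; the run count and array size are arbitrary, and I've used the library sort where you could plug in one of the sorts from Lecture 2):

    import java.util.Arrays;
    import java.util.Random;

    // Hedged sketch: estimate the average (expected) case by timing many random inputs.
    public class Stopwatch {
        public static void main(String[] args) {
            Random rng = new Random();
            int runs = 100, size = 10000;          // arbitrary choices for illustration
            long total = 0;

            for (int r = 0; r < runs; r++) {
                int[] data = new int[size];
                for (int i = 0; i < size; i++) data[i] = rng.nextInt();

                long start = System.nanoTime();    // start the stopwatch
                Arrays.sort(data);                 // swap in your own sort to compare
                total += System.nanoTime() - start;
            }
            System.out.println("average time per sort: " + (total / runs) + " ns");
        }
    }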

IV. How Bad is Bad?

Another table shamelessly cribbed from Gerald Kruse's page on Algorithm Efficiency.

Table of growth rates

linear   log      n log n   quadratic  cubic        exponential        exponential  factorial
N        log2 N   N*log2 N  N^2        N^3          2^N                3^N          N!

1        0        0         1          1            2                  3            1
2        1        2         4          8            4                  9            2
4        2        8         16         64           16                 81           24
8        3        24        64         512          256                6,561        40,320
16       4        64        256        4,096        65,536             43,046,721   2.09E+13
32       5        160       1,024      32,768       4,294,967,296      1.85E+15     2.63E+35
64       6        384       4,096      262,144      1.84E+19 (Note 1)  3.43E+30     1.27E+89
128      7        896       16,384     2,097,152    3.40E+38 (Note 2)  1.18E+61     3.86E+215
256      8        2,048     65,536     16,777,216   1.16E+77           1.39E+122    (find other calculator)

Note 1: The value here is approximately the number of machine instructions executed by a 1 gigaflop computer in 5000 years, or 5 years on a current supercomputer (teraflop computer)

Note 2: The value here is about 500 billion times the age of the universe in nanoseconds, assuming a universe age of 20 billion years.
  1. Some people think that it doesn't really matter how complex an algorithm is because computers are getting so much faster.
  2. They are wrong.  It matters.
  3. See the table above --- a computer isn't going to do each operation in much less than a nanosecond, so those big numbers really are waiting times!
  4. Let's say that for each complexity class there's a largest problem size N we can handle in the time we're willing to wait around for our program to finish.
    1. The time we are waiting is always the same, but the size is different because the complexity is different.
    2. Call these sizes N1-N5, one for each column of the chart below.
  5. How much will faster computers help us?  This is a weird chart I got off the previous lecturer for this course 8 years ago, but it gives the general idea...  Suppose you can handle a problem of size N in a certain amount of time on a current computer; if your computer gets faster, how much bigger a problem can you handle in the same time?  It depends on the algorithm:


                               log2 n        n             n log2 n      n^2           2^n
    Current Computers          N1            N2            N3            N4            N5
    Ten times faster           N1 x 30       N2 x 10       N3 x 3        N4 x 3        N5 + 3
    Thousand times faster      N1 x 9,000    N2 x 1,000    N3 x 111      N4 x 31       N5 + 10
    Million times faster       N1 x 19E+6    N2 x 1E+6     N3 x 5,360    N4 x 1,000    N5 + 20

  6. You can see that there are no big wins once you have an exponential algorithm!
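If you want to check the arithmetic behind the chart yourself, here's a tiny sketch (my own, not from the lecture; the starting size and speedup factor are made up) for the columns that have a neat closed form:

    // Hedged sketch: how much bigger a problem fits in the same time on a machine
    // that is k times faster?  (n log n has no neat closed form, so it's skipped.)
    public class Speedup {
        public static void main(String[] args) {
            double n = 1000;   // hypothetical current problem size
            double k = 1000;   // hypothetical speedup factor

            System.out.println("linear,      n   : " + (n * k));                         // k times bigger
            System.out.println("quadratic,   n^2 : " + (n * Math.sqrt(k)));              // sqrt(k) times bigger
            System.out.println("exponential, 2^n : " + (n + Math.log(k) / Math.log(2))); // only log2(k) more items
        }
    }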

V. The Complexity of Our Sort Algorithms

For the algorithms you've already seen, I've linked to the pseudo code from Lecture 2.  You may want to open the links in another browser window (use the right mouse button) so you can look at the analysis & the algorithm at the same time.

i) `Real' Selection Sort

  1. Most expensive / basic operation is the comparison, if (sorta[searchIx] < min).
  2. How often do we do it?
    1. Amounts to: 
      for n do
          for 1, 2, 3, 4...n-1 do  (average value (n-1)/2)
    2. This is equivalent to the arithmetic series (1+2+3+4+...+(n-1)) == n*(n-1)/2 
    3. n*(n-1)/2 = 1/2*n^2 - 1/2*n  which means it's O(n^2)  -- n^2 dominates
  3. Caveat: the equation of that arithmetic series might be wrong --- I'm no mathematician, and the notes & web pages I've found don't agree with each other!  But it's something quite like that...  Anyway, that's what O notation is for, so we know what's really important!
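For reference, an in-place selection sort along these lines might look something like the sketch below (my own reconstruction --- only sorta, searchIx and min echo the Lecture 2 pseudo code):

    // Hedged sketch of the `real' (in-place) selection sort.
    static void selectionSort(int[] sorta) {
        for (int sortedIx = 0; sortedIx < sorta.length - 1; sortedIx++) {
            int minIx = sortedIx;
            int min = sorta[sortedIx];
            // scan the unsorted tail for the smallest remaining item:
            // n-1, then n-2, ... then 1 comparisons
            for (int searchIx = sortedIx + 1; searchIx < sorta.length; searchIx++) {
                if (sorta[searchIx] < min) {       // the dominant operation
                    min = sorta[searchIx];
                    minIx = searchIx;
                }
            }
            // swap the smallest item to the front of the unsorted part
            sorta[minIx] = sorta[sortedIx];
            sorta[sortedIx] = min;
        }
    }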

ii) Easy Selection Sort

  1. Remember the one I did with two arrays?
  2. How often do we do if (messy[searchIx] < min)?
    1. Amounts to: 
      for n do
          for n do
    2. n^2 comparisons, which means it's O(n^2)
  3. So no difference at least in O notation!  (though see the charts on this page I mentioned earlier --- these things do matter a little.)
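For completeness, a sketch of the two-array version (my own reconstruction, not the Lecture 2 pseudo code; I cross items off by overwriting them with Integer.MAX_VALUE, which assumes that value isn't in the data):

    // Hedged sketch of the `easy' selection sort with two arrays.
    static int[] easySelectionSort(int[] messy) {
        int[] sorted = new int[messy.length];
        for (int sortedIx = 0; sortedIx < sorted.length; sortedIx++) {
            int minIx = 0;
            // always scan the whole messy array: n comparisons on every pass
            for (int searchIx = 0; searchIx < messy.length; searchIx++) {
                if (messy[searchIx] < messy[minIx]) {
                    minIx = searchIx;
                }
            }
            sorted[sortedIx] = messy[minIx];
            messy[minIx] = Integer.MAX_VALUE;  // cross it off so it isn't picked again
        }
        return sorted;
    }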

iii) Insertion Sort

  1. How often do we do the comparison? (sorta[searchingIx] < currentItem)
  2. In the worst case (the list started in reverse order), every item has to be moved past all of the already-sorted items (see the sketch after this list).
    1. for n do
          for 1, 2, 3, 4...n-1 do...
    2. Look familiar?  Also O(n^2)
  3. So again, no difference at least in O notation!  
  4. Insertion & Selection sort are said to be algorithms in the same class, because effectively they take the same time.
  5. Here is my favorite sort algorithm in O(n^2) (yes, I have favorite sort algorithms. Yes, I'm a nerd! Nerd Pride)
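(Before we get to that favorite, here's a sketch of insertion sort itself --- my own reconstruction, since the Lecture 2 pseudo code is only linked above; only sorta, searchingIx and currentItem echo the names used in the analysis.)

    // Hedged sketch of insertion sort.
    static void insertionSort(int[] sorta) {
        for (int nextIx = 1; nextIx < sorta.length; nextIx++) {
            int currentItem = sorta[nextIx];
            int searchingIx = nextIx - 1;
            // shift sorted items right until we find where currentItem belongs
            while (searchingIx >= 0 && sorta[searchingIx] > currentItem) {
                sorta[searchingIx + 1] = sorta[searchingIx];
                searchingIx--;
            }
            sorta[searchingIx + 1] = currentItem;
        }
    }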

iv) Bubble Sort

  1. What you've hopefully noticed is that all of these sorts have an outer for loop as long as the array, and an inner loop that's almost as long as the array, so I'll just call the indices "out" and "in".
  2. In this one, you just swap / "bubble" any large numbers up out of the way!
    int[] sorta;      // this comes from somewhere in a messy state
    int temp;         // for the swap
    int outIx, inIx;  // these are indices

    // outer loop over the array -- when everything's sorted we're done
    for (outIx = sorta.length; outIx > 1; outIx--) {
        // from the beginning of the list to the end of the unsorted bit
        for (inIx = 1; inIx < outIx; inIx++) {
            // If adjacent items are out of order, swap them!
            if (sorta[inIx-1] > sorta[inIx]) {
                temp = sorta[inIx-1];
                sorta[inIx-1] = sorta[inIx];
                sorta[inIx] = temp;
            }
        }
    }
  3. How simple is that?  But we may as well do it this way since they're all O(n^2) anyway.
  4. Here's a drawing of what happens (I did one in class on the board too.)  Basically on each pass, the largest number will get carried up to the top of the unsorted area.
  5. Don't forget (this page is linked from the class page) that John Morris has sorting movies.

v) Quick Sort

  1. Quick sort is different (hence its name.)
  2. Actually, its worst case (array is already sorted backwards) is the same as the others.
  3. But, in the best case (random distribution), the pivot point will be at the middle, so the array keeps getting sliced in half (see the sketch after this list).
    1. If this is true, you will slice the array up log2 n times,
    2. For each time you do the slicing, you do n comparisons in total.
    3. So best case is exactly O(n log n).
    4. Apparently, the average case is 38% worse than this.
  4. My favorite O(n log n) sort is called Merge Sort.  (Yes, I have a favorite O(n log n) sort...)
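Before that, here's a minimal quick sort sketch (my own --- not the lecture's code; it uses the last element as the pivot, which the lecture doesn't specify).  Call it as quickSort(data, 0, data.length - 1):

    // Hedged sketch of quick sort, using the last element as the pivot.
    static void quickSort(int[] a, int lo, int hi) {
        if (lo >= hi) return;                    // 0 or 1 items: already sorted
        int pivot = a[hi];
        int split = lo;                          // everything below split is < pivot
        for (int i = lo; i < hi; i++) {
            if (a[i] < pivot) {                  // n comparisons per level of slicing
                int temp = a[i]; a[i] = a[split]; a[split] = temp;
                split++;
            }
        }
        int temp = a[split]; a[split] = a[hi]; a[hi] = temp;   // pivot into place
        quickSort(a, lo, split - 1);             // recursively sort the two slices
        quickSort(a, split + 1, hi);
    }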

vi) Merge Sort

  1. Split the array in halves recursively until each half has only one element.
  2. You can now assume the sub lists are sorted (1 number can't be unsorted.)
  3. Merge the two halves together by interleaving them:
    1. Run two indices, one from the beginning of each array.
    2. Whichever index points at the smaller number, take that number as the next in your merged array.
  4. This is guaranteed to be O(n log n), since you are certain to split things evenly.
  5. But empirically it's usually slower than quick sort --- probably because it uses twice as much memory.
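A sketch of merge sort along these lines (my own reconstruction; it allocates new arrays at every split, which is where the extra memory mentioned above goes):

    import java.util.Arrays;

    // Hedged sketch of merge sort.
    static int[] mergeSort(int[] a) {
        if (a.length <= 1) return a;              // one item can't be unsorted
        int[] left  = mergeSort(Arrays.copyOfRange(a, 0, a.length / 2));
        int[] right = mergeSort(Arrays.copyOfRange(a, a.length / 2, a.length));

        // merge: repeatedly take the smaller of the two items the indices point at
        int[] merged = new int[a.length];
        int l = 0, r = 0;
        for (int m = 0; m < merged.length; m++) {
            if (r >= right.length || (l < left.length && left[l] <= right[r])) {
                merged[m] = left[l++];
            } else {
                merged[m] = right[r++];
            }
        }
        return merged;
    }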

VI. What have we learned? 

  1. Complexity matters (again.)
  2. The sorts you learned come in two different classes, O(n^2) and O(n log n).
  3. A couple of easy sorts in each of these classes.
  4. Hopefully getting an intuition about how to recognize a log n algorithm (or algorithm component.)
  5. This strategy of chopping problems in half recursively to get a log n algorithm is often called divide and conquer.
  6. It's a good strategy!

page author: Joanna Bryson
16 February 2012