I was going through this heavily discussed and highly voted question on SO and stumbled across one comment that got 5 upvotes.
So I assume it was a great comment, but it got my head spinning at the same time. Can anyone explain this comment? The comment says:
You can’t compare big-O values directly without thinking about constant factors. For small lists (and most lists are small), ArrayList’s O(N) is faster than LinkedList’s O(1).
All O(1) means is that the operation takes constant time, but not necessarily 1 unit of time. So for a concrete example, let’s say the LinkedList takes 3 seconds to access any element. That is constant time, but it is fairly slow. It is not hard to imagine a situation where we have an ArrayList whose access is O(n), but where n is small enough that no element takes more than 2 seconds to access. In such cases, the ArrayList would actually be faster than the LinkedList.
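To see the kind of case the quoted comment has in mind, here is a minimal, deliberately crude sketch (the class and method names are my own) that inserts a handful of elements at index 0 of both list types: ArrayList pays an O(N) shift per insert while LinkedList pays O(1), yet for small lists the ArrayList often comes out ahead anyway. Results depend heavily on the JVM, warm-up, and hardware, so treat it as an illustration, not a benchmark.

    import java.util.ArrayList;
    import java.util.LinkedList;
    import java.util.List;

    public class SmallListInsert {
        // Insert 'count' elements at index 0. For ArrayList this is O(N) per
        // insert (existing elements are shifted); for LinkedList it is O(1).
        static long timeFrontInserts(List<Integer> list, int count) {
            long start = System.nanoTime();
            for (int i = 0; i < count; i++) {
                list.add(0, i);
            }
            return System.nanoTime() - start;
        }

        public static void main(String[] args) {
            int count = 32; // "most lists are small"
            long arrayNs  = timeFrontInserts(new ArrayList<>(), count);
            long linkedNs = timeFrontInserts(new LinkedList<>(), count);
            System.out.println("ArrayList  front inserts: " + arrayNs + " ns");
            System.out.println("LinkedList front inserts: " + linkedNs + " ns");
        }
    }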
The definition of Big-O means that you’ve bounded the running time by some function of the input size, for all input sizes greater than some constant.
What is a constant factor? Basically, it’s the part of the running time that doesn’t depend on the input size. Let’s say your linked list was stored in China. Even though it’s O(1), every operation involves a flight to China. That’s a constant factor; it never changes. However, the scalability of the algorithm is still O(1), because the flight to China takes the same amount of time regardless of the number of items in the list.
So yes, this statement is true in practice: “You can’t compare big-O values directly without thinking about constant factors.”
And this statement is likely not true; it’s implementation-dependent: “For small lists (and most lists are small), ArrayList’s O(N) is faster than LinkedList’s O(1).”
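The “flight to China” can be simulated directly. This sketch (the names and the 5 ms delay are invented purely for illustration) compares an O(1) lookup that pays a fixed simulated round trip on every call against an O(n) scan whose per-step cost is tiny; for any realistically sized list, the O(n) scan wins.

    public class ConstantFactorDemo {
        // O(1) "remote" lookup: the flight to China, shrunk to a simulated 5 ms
        // round trip. The cost is the same for 10 items or 10 million.
        static int remoteGet(int[] data, int index) throws InterruptedException {
            Thread.sleep(5); // fixed cost, independent of data.length
            return data[index];
        }

        // O(n) "local" lookup: walks from the front, but each step costs nanoseconds.
        static int localScan(int[] data, int index) {
            int value = 0;
            for (int i = 0; i <= index; i++) {
                value = data[i];
            }
            return value;
        }

        public static void main(String[] args) throws InterruptedException {
            int[] data = new int[100];
            for (int i = 0; i < data.length; i++) {
                data[i] = i;
            }

            long t = System.nanoTime();
            remoteGet(data, data.length - 1);
            System.out.println("O(1) with a big constant:   " + (System.nanoTime() - t) + " ns");

            t = System.nanoTime();
            localScan(data, data.length - 1);
            System.out.println("O(n) with a small constant: " + (System.nanoTime() - t) + " ns");
        }
    }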
Ever heard of a Fibonacci heap? That’s an infamous example of a huge constant factor. See here: SO on Fibonacci heap
Every big-O notation hides a constant factor behind it. An O(N) function could be f(N) = N or g(N) = 100000000N; similarly, O(1) could be t(N) = 1 or k(N) = 100000000000.
In the above example, for f(N) and k(N), when N < 100000000000, the linear time f(N) is actually faster than the constant time k(N).
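Plugging a few values of N into that f(N) and k(N) makes the crossover concrete; the snippet below is just the arithmetic spelled out (the class name and chosen sizes are my own):

    public class HiddenConstants {
        public static void main(String[] args) {
            long constantCost = 100_000_000_000L;   // k(N) = 100000000000, an O(1) cost
            long[] sizes = {1_000, 1_000_000, 100_000_000_000L, 200_000_000_000L};

            for (long n : sizes) {
                long linearCost = n;                // f(N) = N, an O(N) cost
                String winner = (linearCost < constantCost)
                        ? "linear f(N) wins" : "constant k(N) wins or ties";
                System.out.printf("N = %,d: f(N) = %,d vs k(N) = %,d -> %s%n",
                        n, linearCost, constantCost, winner);
            }
        }
    }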
The definition of O(f(n)) means that the specified algorithm’s running time is less than or equal to c*f(n), for some constant c > 0 and for all n > n’ (i.e. for large values of n going towards infinity).
So O(1) implies that the running time is less than some constant, say j.
And O(n) implies that the running time is less than some constant, say k, times n. So k*n.
Since we’re assuming the lists are small, that means that n is a very small number. Now, depending on implementation, the values of j and k can vary as well. It is entirely possible for the O(1) constant time to be very long, say on the order of seconds. And it is also possible that the O(n) constant factor could be very short, say on the order of microseconds. In this case, with a small value of n, it is clear that the O(n) algorithm can outperform the O(1) algorithm (n microseconds vs. a few seconds).
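Under those made-up constants, the break-even point is simply j / k, as this small sketch (hypothetical numbers, not measurements) works out:

    public class BreakEven {
        public static void main(String[] args) {
            double jSeconds = 3.0;   // hypothetical O(1) cost: a flat 3 seconds per call
            double kSeconds = 1e-6;  // hypothetical O(n) cost: 1 microsecond per element

            // The O(n) algorithm beats the O(1) one whenever k * n < j, i.e. n < j / k.
            double breakEven = jSeconds / kSeconds;
            System.out.printf("O(n) is faster while n < %,.0f elements%n", breakEven);

            for (long n : new long[] {100, 10_000, 3_000_000, 10_000_000}) {
                double linear = kSeconds * n;
                System.out.printf("n = %,d: O(n) takes %.3f s, O(1) takes %.1f s -> %s%n",
                        n, linear, jSeconds,
                        (linear < jSeconds) ? "O(n) wins" : "O(1) wins or ties");
            }
        }
    }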
The original statement is definitely correct: you really can’t compare Big-O values easily without considering constants, but that mainly matters at small values of n. Once you get to larger values, the differences become clearer.