Background
Thinking about OOP, I feel that it binds data and behavior together. To take a real-world example, we already have the array data type, which is a collection of elements of a homogeneous type, and in Java a nice abstraction has been built over it. Hence we have methods like clone, which does its job perfectly: we have data and an operation associated with it, which keeps client code simple and encapsulates the logic of clone behind a nice API.
Hopefully I am correct so far.
Question
So, suppose we need to sort an array. It would be good to have a sort method exposed as part of the API, one that sorts the array elements with an ordering based on the element type. That seems perfect, but wait: I see no such method on the Array ADT, and here comes the confusion. Instead we have a page full of documentation here which lists a number of static methods. If I remember properly, static methods are one of the most hated patterns in the TDD community because they make testing hell. Even if we ignore that concern, I still see a violation of OOP concepts here: we already have the data, so why not have all these methods on the array itself?
Update
The Array example here doesn’t mean that I am confused about arrays in Java; my main concern is static methods in general. I often find myself in a situation where I think a static method would be best, but many times I have seen the community argue against it, so there should be solid reasoning to tell whether my design is flawed or a static method is the best alternative in that situation.
9
This is a Java-specific design. For example, in D, sort is an instance method on arrays, not a static method as in Java.
But in Java arrays are special. They are defined as objects, but they are not instances of classes. The only instance methods they have are the ones inherited directly from Object. There are no Array-specific instance methods since there is no Array class where they could be defined. (Note that the Arrays class you link to is just a utility class – arrays are not instances of this class.)
A more logical design would be to have arrays be instances of the class Array<T>, but since the initial version of Java did not have generics, this design was not possible.
On the generic lists (like List<E>) the sort method is an instance method. So we can conclude that the designers of Java agree sort should ideally be an instance method, but it was not possible for arrays due to limitations in the core language design.
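To make the contrast concrete, here is a minimal sketch of the two call styles. The class and method names (ArrayVersusListSort, sortedArray, sortedList) are made up for illustration; only Arrays.sort, List.sort, clone and Comparator.naturalOrder come from the standard library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ArrayVersusListSort {
    // Arrays expose no sort() of their own, so the static utility does the work.
    static int[] sortedArray(int[] input) {
        int[] copy = input.clone(); // clone() is available directly on the array
        Arrays.sort(copy);          // static helper in java.util.Arrays
        return copy;
    }

    // List is a real interface, so sort can be (and since Java 8 is) an instance method.
    static List<Integer> sortedList(List<Integer> input) {
        List<Integer> copy = new ArrayList<>(input);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortedArray(new int[] {3, 1, 2})));
        System.out.println(sortedList(Arrays.asList(3, 1, 2)));
    }
}
```

Both helpers copy their input first, which is a choice made for this sketch only; Arrays.sort and List.sort themselves sort in place.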
In C# a different design was chosen, and arrays are actually instances of an Array class, which allows array-specific instance methods (Sort is still defined as static though, for whatever reason).
In short: Arrays in Java are not “true” OO, so they cannot conform to OO design principles.
6
The notion that static methods are impossible to unit test is a myth that has proven difficult to kill.
What makes a method hard to test in isolation is stuff like hidden dependencies and accessing static state. There is no difference whatsoever between a static method and an instance method, except for the fact that invocations of instance methods get a hidden this reference parameter that is used to access private members.
If your method has an honest API, that is, it clearly states what dependencies it has, and if it is pure in every other sense, then it is trivially testable regardless of whether or not it’s static.
If your method updates global variables, opens database connections with new and writes to disk, then it’s untestable regardless of whether or not it’s an instance method.
Yes, (virtual) instance methods can be overridden, but that’s quite beside the point. Accessing global state and writing to disk AND marking the method virtual doesn’t make it testable. That just makes it stubbable, so you can decouple from it when testing OTHER things. And that’s not even mentioning the fact that an abundance of fakes and mocks and spies etc. is a testing anti-pattern in itself and leads to rigid and fragile test code.
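A small sketch of the difference, using a made-up Invoices class (the names and the overdue rule are assumptions for illustration, not taken from any real codebase). Both methods are static; only one of them is honest about what it depends on.

```java
import java.time.LocalDate;

class Invoices {
    // Hidden dependency: the result depends on the system clock, which the
    // signature does not mention. Static or not, this is awkward to test.
    static boolean isOverdueHidden(LocalDate dueDate) {
        return LocalDate.now().isAfter(dueDate);
    }

    // Honest API: everything the method needs is passed in explicitly,
    // so a test can simply supply whatever "today" it wants.
    static boolean isOverdue(LocalDate dueDate, LocalDate today) {
        return today.isAfter(dueDate);
    }
}
```

A test for isOverdue needs no mocks or stubs at all: it passes two dates and checks the boolean that comes back.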
The key is, and has always been, to make cohesive and decoupled modules and classes. As long as you’re honest in your API and don’t reach out and pull things out of the void, there is no problem.
Now, as for why one would make a method like sort a static helper method instead of an instance method, there could be several reasons. At the end of the day it’s just a design decision and you could go either way. One thing that speaks against putting too many (or indeed any) utility methods on entities is that their API can easily become bloated. The class will also grow and grow. Everyone requires different helpers, so how do you decide which ones are “important” enough to put on the type and which ones to put in an external class? And as a programmer, how do you know where to look?
There is a good case to be made for keeping pure data-carrying classes separate from computational/processing classes (of course while not breaking encapsulation). This can allow you to make a general sorting algorithm for any type implementing List<? extends Comparable> instead of having to write one for arrays, one for arraylists, one for linked lists and so on (without having to resort to using a common abstract base class).
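As a rough sketch of that idea, here is one static helper that works on any mutable List of comparable elements. The Sorting class and the choice of insertion sort are assumptions made for brevity; this is not how the JDK implements sorting.

```java
import java.util.List;

final class Sorting {
    private Sorting() {} // utility class, not meant to be instantiated

    // In-place insertion sort for any mutable List of comparable elements.
    // It only needs get, set and size, so it does not care which List it gets.
    static <T extends Comparable<? super T>> void insertionSort(List<T> list) {
        for (int i = 1; i < list.size(); i++) {
            T current = list.get(i);
            int j = i - 1;
            // Shift larger elements one slot to the right.
            while (j >= 0 && list.get(j).compareTo(current) > 0) {
                list.set(j + 1, list.get(j));
                j--;
            }
            list.set(j + 1, current);
        }
    }
}
```

The same method accepts an ArrayList, a LinkedList, or an array wrapped with Arrays.asList (which writes through to the underlying array), so no common abstract base class is needed. Index-based access is slow on a LinkedList, which is the usual price of this kind of generality.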
1
To make it simple:
- Static state is bad because it is effectively global state since everyone has access to it. Wrapping it in a singleton doesn’t change anything. In general it is good to avoid global state which is hard to test.
- Static functions that are pure functions or that only mutate their arguments are perfectly fine. If they don’t do any bookkeeping or keep any state there is nothing wrong with them.
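The two cases can be put side by side in a short sketch. The Pricing class and its discount rule are invented for illustration.

```java
class Pricing {
    // Static mutable state: effectively a global that every caller shares.
    static int discountPercent = 0;

    // Reads the global above, so identical calls can return different results
    // depending on what some other piece of code did earlier.
    static int priceWithGlobalDiscount(int cents) {
        return cents - cents * discountPercent / 100;
    }

    // Pure static function: the result depends only on the arguments.
    static int priceWithDiscount(int cents, int percent) {
        return cents - cents * percent / 100;
    }
}
```

A test of priceWithGlobalDiscount has to set and reset discountPercent around every case, and tests running in parallel can interfere with each other; priceWithDiscount has neither problem.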
As for arrays, they’re just quirky in Java as the other answers mention. That said, other collections are also sorted through Collections.sort rather than a sort method on them. The rationale is that a collection should not be aware of how it is supposed to sort itself. This is more debatable since different languages expose this on the collection directly (JavaScript’s someArray.sort() for example, or Swift’s someArray.sort() or C#’s .Sort and .OrderBy).
This decision is up to the designer; both approaches are fine and depend on your mental model. It is more clear-cut when we discuss adding something to a collection or removing something from it: I believe that the underlying collection should be in charge in this case so it can enforce invariants. Sorting is another story, since anything you can iterate you can sort, which is a layer of abstraction you can use. (Although not as efficiently, and polymorphic dispatch still makes sense in my opinion.)
2
I’m with you on the non-OOP nature of static classes part. There are several drawbacks.
It is a hidden dependency by the very nature of static-ness. You can’t inject a class; you can only inject an object. The most vivid example is, as @kai mentioned in comments, utility classes.
Classes tend to be big to huge. Since classes with static methods have nothing to do with objects, they don’t know who they are, what they should do and what they should not do. The boundaries are blurred, so we just write one instruction after another. It’s hard to stop until we’re done with our task. It is inevitably an imperative, non-OOP process.
Hence the class gets less and less cohesive. If the class has a lot of dependencies, chances are that it does more than it should.
Static methods can be called from anywhere. They can be called from a lot of contexts. They have a lot of clients. So if one class needs some little special behavior to be implemented in a static method, you need to make sure that none of the other clients gets broken. So such reuse simply doesn’t work. I can compare it with the noble (and failed) attempt to compose and reuse microservices. The resulting classes are too generic and completely unmaintainable. This results in the whole system being tightly coupled.
To illustrate my point, I’ll take your example of sorting. I want my code to be declarative, so I separate the creation of objects from the work they do. So instead of having a sort method, I’d have a Sorted class implementing an Array interface (not talking about any specific language now). So my code could look like:
    $array =
        new Sorted(
            new Filtered(
                new Mapped(
                    [1, 2, 3],
                    function (int $i) {
                        return $i * 3;
                    }
                ),
                function (int $i) {
                    return $i % 2 == 0;
                }
            ),
            function ($a, $b) {
                return $b > $a;
            }
        )
    ;
    print_r($array->value()); // here the objects actually work
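For readers more at home in Java, the same composition could be sketched roughly as follows. Everything here is hypothetical: Sequence stands in for the Array interface mentioned above, Literal wraps the raw values, and the descending order is only a guess at what the comparator in the snippet above intends.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical stand-in for the "Array" interface.
interface Sequence<T> {
    List<T> value(); // nothing is computed until this is called
}

final class Literal<T> implements Sequence<T> {
    private final List<T> items;
    @SafeVarargs
    Literal(T... items) { this.items = Arrays.asList(items); }
    @Override
    public List<T> value() { return new ArrayList<>(items); }
}

final class Mapped<T, R> implements Sequence<R> {
    private final Sequence<T> source;
    private final Function<T, R> fn;
    Mapped(Sequence<T> source, Function<T, R> fn) { this.source = source; this.fn = fn; }
    @Override
    public List<R> value() {
        List<R> out = new ArrayList<>();
        for (T item : source.value()) out.add(fn.apply(item));
        return out;
    }
}

final class Filtered<T> implements Sequence<T> {
    private final Sequence<T> source;
    private final Predicate<T> keep;
    Filtered(Sequence<T> source, Predicate<T> keep) { this.source = source; this.keep = keep; }
    @Override
    public List<T> value() {
        List<T> out = new ArrayList<>();
        for (T item : source.value()) if (keep.test(item)) out.add(item);
        return out;
    }
}

final class Sorted<T> implements Sequence<T> {
    private final Sequence<T> source;
    private final Comparator<T> order;
    Sorted(Sequence<T> source, Comparator<T> order) { this.source = source; this.order = order; }
    @Override
    public List<T> value() {
        List<T> out = new ArrayList<>(source.value()); // copy, then sort the copy
        out.sort(order);
        return out;
    }
}

final class PipelineDemo {
    // Building the pipeline only wires objects together; no work happens yet.
    static Sequence<Integer> build() {
        return new Sorted<Integer>(
            new Filtered<Integer>(
                new Mapped<Integer, Integer>(new Literal<Integer>(1, 2, 3), i -> i * 3),
                i -> i % 2 == 0),
            Comparator.reverseOrder()); // assumed: descending order
    }

    public static void main(String[] args) {
        System.out.println(build().value()); // prints [6]
    }
}
```

With the input 1, 2, 3 the mapping yields 3, 6, 9 and only 6 survives the even filter, so the sort order does not affect the printed result. The explicit type arguments are there only to keep type inference out of the picture.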