In this fourth installment of my series on best practices in Java, I will discuss two more cases from the list I introduced in the original article. These will be “Using native arrays instead of ArrayList” and “Not using Java 5 features where appropriate”. Both are perhaps somewhat milder bad practices than those discussed in the previous installments. Nevertheless, attention to detail never hurts, so here goes:
Using native arrays instead of ArrayList
An array was once a very common data structure in many programming languages. Who didn’t grow up using constructs like numbers[i] = 10; ? Lately, however, using arrays directly has become a somewhat deprecated practice. That is, not the concept of the array itself, but the bare implementation of it in languages like Java or C++. To contrast these ‘built-in’ arrays with some OO variant, the term ‘native array’ has been in use for quite some time. In Java the native array is not really native, of course. It’s an Object, although one with some special rules. Nevertheless, the objections that I will outline below still apply to it.
The main problem with the native array in modern (Java) business code is its lack of a user friendly interface (in C/C++ there’s the additional risk of -really- dangerous buffer overflows). Java’s most usable replacement for the native array is probably the ArrayList. Using this class you can easily add something to the array even if you try to add it beyond the current limits (the list grows automatically). Even if you stay within the limits of the array, the code to add something is simpler; you don’t have to maintain a separate index but just add something to the end. Now this may seem like a moot point, but when you have to maintain hundreds of thousands of lines of business code, all those small simplifications add up quickly. In code the differences would look like this:
// Native array: fixed size, and a separate index to maintain
NewsItem[] newsItems = new NewsItem[allNews.rowCount()];
int currentRow = 0;
newsItems[currentRow] = new NewsItem( allNews.getString("text") );

// ArrayList: just add to the end
List<NewsItem> newsItems = new ArrayList<NewsItem>(allNews.rowCount());
newsItems.add( new NewsItem( allNews.getString("text") ));
Using the first form is of course not the end of the world, far from it, but omitting the separate currentRow variable slightly reduces the complexity of this code without any loss of flexibility. Lower complexity statistically means fewer bugs in the long run.
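To make the automatic growth a bit more concrete, here is a minimal sketch. The GrowthDemo class and its method names are my own illustrative assumptions, and plain Strings stand in for the NewsItem objects so that the example is self-contained. It fills both structures with more items than the capacity that was initially requested.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative assumptions.
class GrowthDemo {

    // Native array: the size is fixed at creation and the caller tracks the index.
    static String[] fillArray(int capacity, int itemCount) {
        String[] items = new String[capacity];
        int currentRow = 0;
        for (int i = 0; i < itemCount; i++) {
            // Throws ArrayIndexOutOfBoundsException once currentRow reaches capacity.
            items[currentRow++] = "item " + i;
        }
        return items;
    }

    // ArrayList: the initial capacity is only a hint; the list grows when needed.
    static List<String> fillList(int initialCapacity, int itemCount) {
        List<String> items = new ArrayList<String>(initialCapacity);
        for (int i = 0; i < itemCount; i++) {
            items.add("item " + i);
        }
        return items;
    }

    public static void main(String[] args) {
        // The list accepts five items although only two were requested up front.
        System.out.println(fillList(2, 5).size());
        try {
            fillArray(2, 5);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("the array could not hold the extra items");
        }
    }
}
```

With the array, the caller has to know the final size in advance or handle the overflow itself; with the list, a wrong estimate costs at most an internal resize.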
A major disadvantage of using the native array is that you can’t code to an interface when using it. I will be talking about this issue specifically when discussing the “Coding to a class instead of to an interface” bad practice, so it’ll suffice for now to say that this fact greatly limits the usability and extensibility of methods that only accept native arrays.
Now some people may claim that the native array variant is faster and is therefore the preferable solution. The first part of that statement is indeed correct, native arrays -are- faster. The second part of the statement largely depends on the situation. For high performance scientific code the statement is certainly true. This kind of code depends on highly tweaked routines that implement algorithms that may access an array millions of times in very tight loops. (I got my M.Sc. in high performance computing, so even though I’m not claiming to be the ultimate expert in this field, I know a thing or two related to it 😉 ).
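As a small preview of that issue, consider the following sketch. The HeadlineCounter class and its two methods are hypothetical names I made up for illustration. One method only accepts a native array, the other accepts anything that implements the Collection interface.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedList;
import java.util.List;

// Hypothetical helper; the class and method names are illustrative assumptions.
class HeadlineCounter {

    // Tied to one concrete representation: callers must supply a String[].
    static int countInArray(String[] headlines) {
        int count = 0;
        for (String headline : headlines) {
            if (!headline.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    // Coded to an interface: any Collection implementation is accepted.
    static int countInCollection(Collection<String> headlines) {
        int count = 0;
        for (String headline : headlines) {
            if (!headline.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> linked = new LinkedList<String>(Arrays.asList("first", "", "third"));

        // Works directly with a LinkedList, an ArrayList, a Set, and so on.
        System.out.println(countInCollection(linked));

        // The array version forces the caller to copy the data first.
        System.out.println(countInArray(linked.toArray(new String[0])));
    }
}
```

A caller that happens to hold an array can still use the collection version through Arrays.asList, which wraps the array without copying it. The reverse direction always requires a copy.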
However, in Java business code where a managed bean returns one single array of items in processing a user’s request for display on a screen, this is a totally worthless performance optimization. For one single instantiation and one single iteration, the difference would be completely negligible. Even when serving several hundreds of simultaneous users the difference would still be nearly unmeasurable.
Wanting to use native arrays for this reason is actually another well known bad practice: premature optimization. As Donald Knuth once said: “premature optimization is the root of all evil”. In short it boils down to the fact that possible optimizations like using a native array should only be done when you can actually prove (e.g. with a profiler) that the ArrayList is indeed a performance bottleneck in your specific situation.
So, as a rule of thumb make use of ArrayList whenever you need an array structure and only resort to using native arrays when you’re absolutely sure that you need them.
Not using Java 5 features where appropriate
With Java 5, Sun introduced a number of new language features aimed at, among other things, simplifying code (for each loop, autoboxing) and making it more typesafe (generics, enums). As with any change, there are always people who protest strongly against it. Generics especially drew more than the average number of protests (this was mainly due to the somewhat clumsy interaction between legacy non-generic code and new code written to use generics).
However, it has been many years since (Java 5 was released in 2004), and those additions are definitely here to stay. Two major new Java versions have even been released since then (Java 6 in 2006, Java 7 in 2011), so companies with a “1 year behind” policy as well as those with a “1 version behind” policy are by now allowed to use Java 5. Next to that, most Java books and tutorials have been updated for the Java 5 syntax.
There is currently no excuse anymore for not making use of Java 5 language features. Therefore, things like raw types (unparameterized generic types), lists of final static ints instead of enums, using an index when simply iterating over every instance in a List, using new Long(0), etc. are now considered bad practice.
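For readers who have never seen the two styles side by side, here is a minimal sketch. The Java5Demo class and everything in it are illustrative assumptions of mine. The old-style method uses a raw List, an index loop, casts and manual boxing; the new-style method uses generics, the for each loop and autoboxing. I wrote Long.valueOf rather than new Long in the old-style method only because recent JDKs warn about that constructor; the point about manual boxing is the same.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; all names are illustrative assumptions.
class Java5Demo {

    // Old style: int constants. Any int compiles as a "priority", even 42.
    static final int PRIORITY_LOW = 0;
    static final int PRIORITY_HIGH = 1;

    static boolean isUrgentOldStyle(int priority) {
        return priority == PRIORITY_HIGH;
    }

    // Java 5 style: an enum. Only LOW and HIGH can be passed.
    enum Priority { LOW, HIGH }

    static boolean isUrgent(Priority priority) {
        return priority == Priority.HIGH;
    }

    // Old style: raw type, index loop, cast, manual boxing and unboxing.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static long sumOldStyle() {
        List numbers = new ArrayList();
        numbers.add(Long.valueOf(1L));
        numbers.add(Long.valueOf(2L));
        long total = 0;
        for (int i = 0; i < numbers.size(); i++) {
            Long value = (Long) numbers.get(i); // fails at runtime if a non-Long slipped in
            total += value.longValue();
        }
        return total;
    }

    // Java 5 style: generics, for each loop, autoboxing.
    static long sumJava5Style() {
        List<Long> numbers = new ArrayList<Long>();
        numbers.add(1L); // boxed automatically
        numbers.add(2L);
        long total = 0;
        for (long value : numbers) { // unboxed automatically
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOldStyle() == sumJava5Style());
        System.out.println(isUrgent(Priority.HIGH));
    }
}
```

Both sum methods compute the same result. The difference is that in the second one the compiler checks the element type, so a wrong element is a compile error instead of a ClassCastException at runtime.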
Generally speaking there are three groups of people who aren’t using Java 5 language features. In the first group you’ll find beginners who simply read the wrong (outdated) tutorials. Typically it’s enough to ‘enlighten’ these people by simply telling them about the existence of the additional syntax.
The second group is more problematic though. Here you’ll find seasoned Java programmers who have been using the old syntax forever and just refuse to adapt or learn new things. On a small scale these people behave in the same way the old C programmers did when they had to switch from procedural to object oriented programming, or the way current OO programmers behave when they have to change from sequential to parallel programming. So, in this case the more general bad practice that lies beneath the specific bad practice of not using Java 5 language features is the unwillingness of people to adapt to change, and their getting stuck in old habits. For a programmer, who has to move in the fast paced world of technology, this is a serious flaw.
Finally there’s the third group, who just want to stay compatible with all existing pre-1.5 VMs people might still have installed. In some cases this may be a valid reason, but after some time the desire to stay compatible should be resisted. It hampers innovation if new technology is systematically refused. If the people you want to stay compatible with refuse to upgrade their VM, why would they go ahead and upgrade to your latest application? Probably those people are just happy running your old version on their old VMs. The most you should do is offer some really critical security updates for these old versions and spend the rest of your time working with reasonably recent technology.
If we look a bit outside of the Java world we’ll see that not moving to a new technology has for instance seriously hurt the development of PHP. Many well-known PHP applications choose to remain based on PHP 4 (2000), instead of on the newer and much improved PHP 5 (2004). See for instance http://boren.nu/archives/2007/05/11/wordpress-and-php-5. Next to that, many programmers also choose to remain at the PHP 4 level. The result is that PHP as a whole is practically many generations behind solutions like Java EE, ASP.NET or RoR. A similar thing holds for MySQL 5 (one wonders, would it be the 5 that hinders adoption? 😉 ).
I won’t be elaborating on the exact advantages of the Tiger language additions; many articles and books have already been written about them and besides that, this entry is not really about those advantages. Instead, the moral here is simply that still not using them by now is a bad practice.
Well, that’s it for today again. Stay tuned for the next installment where we’ll be talking about “Coding to a class instead of to an interface”, “Selecting ResultSet columns by index, instead of by name” and “Needless use of instance variables”.