Java best practices 4 – Native Arrays and Not Using Java 5.

3 September 2007, by: Arjan Tijms

In the fourth edition of this series of blogs on best practices in Java I will discuss two more cases from the list I introduced in the original article. These will be “Using native arrays instead of ArrayList” and “Not using Java 5 features where appropriate”. Both are perhaps somewhat milder bad practices than those discussed in the previous installments. Nevertheless, attention to detail never hurts, so here goes:

Using native arrays instead of ArrayList

An array was once a very common data structure in many programming languages. Who didn’t grow up using constructs like numbers[i] = 10; ? Lately, however, arrays have become a somewhat deprecated language feature to use directly. That is, not the concept of the array itself, but the bare implementation of it in languages like Java or C++. To contrast these ‘built-in’ arrays with an OO variant, the term ‘native array’ has been in use for quite some time. In Java the native array is not really native of course. It’s an Object, although one with some special rules. Nevertheless, the objections that I will outline below still apply to it.

The main problem with the native array in modern (Java) business code is its lack of a user friendly interface (in C/C++ there’s the additional risk of -really- dangerous buffer overflows). Java’s most usable replacement for the native array is probably the ArrayList. With this class you can easily add an element even when that would go beyond the current capacity (the list grows automatically). Even if you stay within the limits of the array, the code to add something is simpler; you don’t have to maintain a separate index but just add to the end. This may seem like a moot point, but when you have to maintain hundreds of thousands of lines of business code all those small simplifications add up quickly. In code the differences would look like this:

Native arrays:

NewsItem[] newsItems = new NewsItem[allNews.rowCount()];
int currentRow = 0;
while(allNews.nextRow()) {
	newsItems[currentRow] = new NewsItem( allNews.getString("text") );
	currentRow++;
}

ArrayList:

List<NewsItem> newsItems = new ArrayList<NewsItem>(allNews.rowCount());		
while(allNews.nextRow()) {
	newsItems.add( new NewsItem( allNews.getString("text") ));		
}

Using the first form is of course not the end of the world, far from it, but omitting the separate currentRow variable slightly reduces the complexity of this code without any loss of flexibility. Lower complexity statistically means fewer bugs in the long run.

A really big disadvantage of the native array is that you can’t code to an interface when using it. I will be talking about this issue specifically when discussing the “Coding to a class instead of to an interface” bad practice, so it’ll suffice for now to say that this fact greatly limits the usability and extensibility of methods that only accept native arrays.
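
To make this a bit more concrete, here is a minimal sketch of the difference. The class and method names (InterfaceExample, totalLength, totalLengthOfArray) are made up for this illustration and do not come from any real code base. The array variant only accepts a String[], while the List variant accepts any implementation of the interface, including a read-only view:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

class InterfaceExample {

    // Hypothetical method that only accepts a native array:
    // callers holding any other structure have to copy it into an array first.
    static int totalLengthOfArray(String[] headlines) {
        int total = 0;
        for (String headline : headlines) {
            total += headline.length();
        }
        return total;
    }

    // Hypothetical method coded to the List interface:
    // it works with every List implementation.
    static int totalLength(List<String> headlines) {
        int total = 0;
        for (String headline : headlines) {
            total += headline.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> arrayList = new ArrayList<String>(Arrays.asList("Java", "Tiger"));
        List<String> linkedList = new LinkedList<String>(arrayList);
        List<String> readOnly = Collections.unmodifiableList(arrayList);

        // The same method serves all three implementations.
        System.out.println(totalLength(arrayList));
        System.out.println(totalLength(linkedList));
        System.out.println(totalLength(readOnly));

        // The array variant forces a conversion at the call site.
        System.out.println(totalLengthOfArray(arrayList.toArray(new String[0])));
    }
}
```

Nothing in totalLength needs to change when a caller switches to another List implementation, which is exactly what a String[] parameter cannot offer.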

Now some people may claim that the native array variant is faster and is therefore the preferable solution. The first part of that statement is indeed correct, native arrays -are- faster. The second part of the statement largely depends on the situation. For high performance scientific code the statement is certainly true. This kind of code depends on highly tweaked routines that implement algorithms that may access an array millions of times in very tight loops. (I got my M.Sc. in high performance computing, so even though I’m not claiming to be the ultimate expert in this field, I know a thing or two related to it 😉 ).

However, in Java business code where a managed bean returns one single array of items in processing a user’s request for display on a screen, this is a totally worthless performance optimization. For one single instantiation and one single iteration, the difference would be completely negligible. Even when serving several hundreds of simultaneous users the difference would still be nearly unmeasurable.

Wanting to use native arrays for this reason is actually another well known bad practice: premature optimization. As Donald Knuth once said: “premature optimization is the root of all evil.” In short, it boils down to the fact that possible optimizations like using a native array should only be done when you can actually prove (e.g. with a profiler) that the ArrayList is indeed a performance bottleneck in your specific situation.

So, as a rule of thumb make use of ArrayList whenever you need an array structure and only resort to using native arrays when you’re absolutely sure that you need them.

Not using Java 5 features where appropriate

With Java 5, Sun introduced a number of new language features aimed at, among other things, simplifying code (for-each loop, autoboxing) and making it more typesafe (generics, enums). As with any change, there are always people who strongly protest against it. Generics in particular drew more than the average number of protests (this was mainly due to the somewhat clumsy interaction of legacy non-generic code and new code written to use generics).

However it has been many years since (Java 5 was released in 2004), and those additions are definitely here to stay. Since then even two major new Java versions have been released (Java 6 in 2006, Java 7 in 2011), so even companies with a “1 year behind” policy as well as those with a “1 version behind” policy are by now allowed to use Java 5. Next to that, most Java books and tutorials have been updated for the Java 5 syntax.

There is no longer any excuse for not making use of Java 5 language features. Therefore, things like raw types (unparameterized generic types), lists of final static ints instead of enums, using an index when simply iterating over every element in a List, using new Long(0), etc. are now considered bad practice.
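
As a rough illustration of the items just listed, the sketch below puts the old style and the Java 5 style side by side. All names in it (Java5Features, Priority, sumOldStyle, sumJava5Style, describe) are invented for this example. Note that the old-style method calls Long.valueOf so that it compiles without deprecation problems on current JDKs; real pre-Java 5 code would typically have used new Long(1) there.

```java
import java.util.ArrayList;
import java.util.List;

class Java5Features {

    // Old style: int constants offer no type safety, any int is accepted.
    static final int PRIORITY_LOW = 0;
    static final int PRIORITY_HIGH = 1;

    // Java 5 style: an enum restricts the values to the ones declared here.
    enum Priority { LOW, HIGH }

    // Old style: raw types, explicit wrapper objects, index loop and casts.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static long sumOldStyle() {
        List numbers = new ArrayList();
        numbers.add(Long.valueOf(1L)); // pre-Java 5 code would write new Long(1)
        numbers.add(Long.valueOf(2L));
        long sum = 0;
        for (int i = 0; i < numbers.size(); i++) {
            sum += ((Long) numbers.get(i)).longValue();
        }
        return sum;
    }

    // Java 5 style: generics, autoboxing and the for-each loop.
    static long sumJava5Style() {
        List<Long> numbers = new ArrayList<Long>();
        numbers.add(1L); // autoboxed to a Long
        numbers.add(2L);
        long sum = 0;
        for (long number : numbers) { // unboxed on each iteration
            sum += number;
        }
        return sum;
    }

    // The compiler only accepts Priority values here, not arbitrary ints.
    static String describe(Priority priority) {
        switch (priority) {
            case HIGH:
                return "urgent";
            default:
                return "normal";
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOldStyle());
        System.out.println(sumJava5Style());
        System.out.println(describe(Priority.HIGH));
    }
}
```

Both sum methods compute the same result; the Java 5 version simply has fewer places where a cast or an index can go wrong.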

Generally speaking there are three groups of people who aren’t using Java 5 language features. In the first group you’ll find beginners who simply read the wrong (outdated) tutorials. Typically it’s enough to ‘enlighten’ these people by simply telling them about the existence of the additional syntax.

The second group is more problematic though. Here you’ll find seasoned Java programmers who have been using the old syntax forever and just refuse to adapt or learn new things. On a small scale these people behave the same way the old C programmers did when they had to switch from procedural to object oriented programming, or the way current OO programmers behave when they have to change from sequential to parallel programming. So, in this case the more general bad practice that underlies the specific bad practice of not using Java 5 language features is the unwillingness of people to adapt to change, and their getting stuck in old habits. For a programmer, who has to move in the fast paced world of technology, this is a serious flaw.

Finally there’s the third group, who just want to stay compatible with all existing pre-1.5 VMs people might still have installed. In some cases this may be a valid reason, but after some time the desire to stay compatible should be resisted. It hampers innovation if new technology is systematically refused. If the people you want to stay compatible with refuse to upgrade their VM, why would they go ahead and upgrade to your latest application? Probably those people are just happy running your old version on their old VMs. The most you should do is offer some really critical security updates for these old versions and spend the rest of your time working with reasonably recent technology.

If we look a bit outside of the Java world we’ll see that not moving to a new technology has, for instance, seriously hurt the development of PHP. Many well known PHP applications choose to remain based on PHP 4 (2000), instead of on the newer and much improved PHP 5 (2004). See for instance http://boren.nu/archives/2007/05/11/wordpress-and-php-5. Next to that, many programmers also choose to remain at the PHP 4 level. The result is that PHP as a whole is practically many generations behind solutions like Java EE, ASP.NET or RoR. A similar thing holds for MySQL 5 (one wonders, would it be the 5 that hinders adoption? 😉 ).

I won’t be elaborating on the exact advantages of the Tiger language additions; many articles and books have already been written about them and besides that, this entry is not really about those advantages. Instead, the moral here is simply that still not using them by now is a bad practice.

Well, that’s it for today again. Stay tuned for the next installment where we’ll be talking about “Coding to a class instead of to an interface”,  “Selecting ResultSet columns by index, instead of by name” and “Needless use of instance variables”.

Arjan Tijms 

7 comments to “Java best practices 4 – Native Arrays and Not Using Java 5.”

  1. Alex Miller says:

    On arrays, there are some other downsides to arrays over collections as well: 1) Cannot be made immutable (esp a problem if returning local state from a class and expecting it not to be modified by the caller). 2) Cannot be made thread-safe without adding external synchronization. 3) Can only contain reifiable types, which can lead to some weird situations when combined with generics (like the inability to create a new array based on a generic type). The "Java Generics and Collections" book goes so far as to recommend that native arrays should be treated as a "deprecated data type" in Java!

  2. arjan says:

    You are indeed right about the inability to make arrays immutable. This is just one of the many consequences of not being able to code to an interface, although probably one of the most important ones. I was planning to save this example for the talk about coding to an interface, so you spoiled the excitement a bit 😉

     

    The thread-safeness issue could also be attributed to not being able to code to an interface (so a specific thread safe implementation can be passed in). On the other hand, arrays have a small advantage in thread-safeness when iterating over them. Even the best internally synchronized collection requires external synchronization when iterating (even when only readers access it). E.g. Collections.synchronizedCollection’s Javadoc states: "It is imperative that the user manually synchronize on the returned collection when iterating over it"

     

    Since for arrays the index is by definition managed by the code using the array (and is thus not internal to the array), it’s thread-safe to iterate over a shared array when you can guarantee that only readers access it.
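
    A small sketch of both cases, with made-up names (IterationExample, sumSynchronizedList, sumReadOnlyArray). It only shows the locking pattern; it assumes that the array is never written to after the reading threads have obtained it:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class IterationExample {

    // Iterating over a synchronized wrapper: the caller has to hold the
    // wrapper's own lock for the whole loop, as its Javadoc demands.
    static int sumSynchronizedList(List<Integer> shared) {
        int sum = 0;
        synchronized (shared) {
            for (int value : shared) {
                sum += value;
            }
        }
        return sum;
    }

    // Iterating over a shared array: the index lives in the calling code,
    // so no lock is taken. This is only safe under the assumption that
    // no thread writes to the array while readers use it.
    static int sumReadOnlyArray(int[] shared) {
        int sum = 0;
        for (int i = 0; i < shared.length; i++) {
            sum += shared[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> list = Collections.synchronizedList(
                new ArrayList<Integer>(Arrays.asList(1, 2, 3)));
        int[] array = {1, 2, 3};

        System.out.println(sumSynchronizedList(list));
        System.out.println(sumReadOnlyArray(array));
    }
}
```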

     

    The reifiable types issue is a good point too. I typically blame Java for this because of the type erasure. I.e. you also can’t do a new T(); so it feels only logical you can’t do a new T[]; either, but it’s a very practical disadvantage for native arrays indeed.

     

    There is a small workaround for this though, involving java.lang.reflect.Array.newInstance. This workaround does require that code that binds generic parameters also passes a Class object in.

    E.g. instead of writing something like: MyStructure<Foo> ms = new MyStructure<Foo>();, you would have to write MyStructure<Foo> ms = new MyStructure<Foo>(Foo.class); Inside this structure you can then use Foo.class to create a new instance and cast it to a T[]. I know it’s not pretty 😉
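
    Spelled out, such a structure could look roughly like the sketch below. The capacity parameter and the set, get and componentType methods are assumptions added to make the example complete, and String stands in for Foo so that the code is self-contained:

```java
import java.lang.reflect.Array;

class MyStructure<T> {

    private final T[] items;

    // The Class object carries the element type that erasure removes from T.
    @SuppressWarnings("unchecked")
    MyStructure(Class<T> type, int capacity) {
        // new T[capacity] does not compile; Array.newInstance creates an
        // array of the given runtime type, which is then cast to T[].
        items = (T[]) Array.newInstance(type, capacity);
    }

    void set(int index, T item) {
        items[index] = item;
    }

    T get(int index) {
        return items[index];
    }

    // Exposes the runtime component type of the backing array.
    Class<?> componentType() {
        return items.getClass().getComponentType();
    }

    public static void main(String[] args) {
        MyStructure<String> ms = new MyStructure<String>(String.class, 2);
        ms.set(0, "Foo");
        System.out.println(ms.get(0));
        System.out.println(ms.componentType());
    }
}
```

    The unchecked cast is the not-so-pretty part: the compiler cannot verify it, so the warning has to be suppressed by hand.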

     

    I agree that native arrays should as much as possible be treated as a ‘deprecated data type’, although they shouldn’t be banned from the language. Data structures (like ArrayList) of course still have to use it for their internal workings.

  3. kirill says:

    You’ve forgotten one more argument against Java 5 – legacy applications and the EE platform. E.g. IBM WebSphere Application Server is still using Java 1.4

  4. arjan says:

    Kirill, I understand what you’re saying. A small percentage of highly critical application platforms are very slow in upgrading. It’s not uncommon for these kinds of businesses to still run on a 2.2 kernel with a Java 1.3 VM and an ancient appserver.

    Most companies however have a one year behind or one version behind policy, instead of a 5 years behind or a 3 versions behind policy.

    Also, WebSphere itself has been certified for running on JDK 5 for more than a year. See the info page about WebSphere 6.1 (2006): http://www-306.ibm.com/software/webservers/appserv/was/features/

     

  5. Mike Miller says:

    You forgot one group – those of us stuck on 1.4 because their app is still running on Weblogic 8.1.

  6. arjan says:

    >You forgot one group – those of us stuck on 1.4 because
    >their app is still running on Weblogic 8.1.

     

    Isn’t that the same as saying being stuck on 1.4 since your app is still running on Tomcat 4?

     

    The fact is that BEA Weblogic Server 10 has been released and it (of course) supports JDK 1.5. Heck, even Weblogic 9, which has been out for quite some time, supported JDK 1.5 and even took advantage of some JDK 1.5 specific features. E.g. annotations for webservices, using the XML parser from Java 5, etc. See http://edocs.bea.com/wls/docs90/notes/new.html

     

    Again, some ultra conservative businesses simply never upgrade any of their software (e.g. banks still running on mainframes dating back to the 70s). If you’re in such a business, this blog obviously doesn’t apply to you. But for companies that ‘merely’ have a 1 year behind or 1 version behind policy there is very little excuse not to use Java 5 (including upgrades for the layers on top of it).

  7. Java best practices 3 – Eating Exceptions and Mixing JSTL with JSF | J-Development says:

    […] Eating Exceptions and Mixing JSTL with JSF « Java best practices 2 – Explicit cases Java best practices 4 – Native Arrays and Not Using Java 5. […]
