Category: Java

Experiencing working overseas

16 October 2007

Working in a team with members located in different parts of the world is common today, but it’s probably not that common for a company to extend the freedom of where and when to work as far as M4N does.

I had been dreaming of a job with some flexibility in time. It turned out that M4N could provide more than I had dreamed of: located in a part of the world (Beijing) that is 6 hours ahead of the central office (Amsterdam), I can work at any time as long as I make 16 hours per week, and at any place with an Internet connection. M4N uses quite a few methods and tools that let me work as if in a virtual local office: project management and issue tracking are done through the Trac system, and an intranet website and an office calendar facilitate communication and coordination. I’m in the development team, and the project’s version control runs on SVN. I can use all these systems just as the local members do. You may point out that communication with the team members could be a problem. Well, it is true that we cannot have an actual face-to-face discussion or a real-life meeting, which is inconvenient, but we communicate through text, voice and video chat with MSN or Skype. In fact, more often than not, team members in the same office use text chat too, in order to keep the office quiet.

My working day can look like this: at one o’clock in the afternoon, I make some tea, start the computer, and then start MSN or Skype. I program on my local computer but, through the Internet connection, use the remote database in the M4N central office. Around 4 to 5 o’clock, my colleagues come online. I chat with the team members of my projects now and then. Around 11 o’clock, I commit my work for the day and sign off.

M4N has quite a few remote employees like me. The company has removed the place and time boundaries of traditional cooperation, and I feel lucky to enjoy the most freedom I have had from a job so far.

Hongqin Chen

JDK6 Web start Cache location

8 October 2007

For a Web Start deployment of an application it is sometimes
necessary to get the location of the downloaded and cached
jar files that come with the deployed application.

I needed to pass some of these files to the external javac class path
to compile dynamically generated Jasper reports on the client side.

There are a lot of forum topics about this, but no copy-paste solution.
So even with a "year behind" policy on the Java JVM I had to fix
this myself :P

JDK 6 broke this because the JNLPClassLoader was changed in JDK 6
to extend URLClassLoader, which is hooked into the Java cache API.
This is all nicely explained at:

http://java.sun.com/developer/community/askxprt/2006/jl0410.html

But now the URLs to the jar files returned by the JNLPClassLoader
point to the external HTTP location.
In JDK 5 they pointed to the cached version on the filesystem.

To get the cached version of the files in JDK 6 we use a JarURLConnection to get
the jar file from the URL. The jar file returned is the cache entry of the already
downloaded file, but its name is not preserved by the cache.
It is something like "~/.java/deployment/cache/6.0/55/65eff4b7-3af949f2".
So we copy it to a temp file that has a .jar suffix, and we are good to go
to include it in a class path.

I hope this will help other wondering developers who still
need a workaround. The full code is below:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.net.JarURLConnection;
import java.net.URL;
import java.nio.channels.FileChannel;
import java.util.Enumeration;
import java.util.jar.JarFile;

/**
 * Fix the Jasper compile class path.
 * This is needed for Web Start, because then the JVM class path doesn't contain the client libs.
 * Uses the "logger" field (a java.util.logging.Logger) of the enclosing class.
 * @throws Exception
 */
private void setJasperBuildPath() throws Exception {
    StringBuilder reportBuildPath = new StringBuilder();

    // All jars have a manifest
    Enumeration<URL> e2 = Thread.currentThread().getContextClassLoader().getResources("META-INF/MANIFEST.MF");
    while (e2.hasMoreElements()) {
        URL u = e2.nextElement();
        String urlString = u.toExternalForm();

        // only do needed libs
        if (!(urlString.contains("jasper") || urlString.contains("itext") || urlString.contains("showplanner-1"))) {
            continue;
        }

        // index of .jar because the resource is behind it; "foo.jar!/META-INF/MANIFEST.MF"
        int jarIndex = urlString.lastIndexOf(".jar");

        // skip non jar code
        if (jarIndex < 1) {
            continue;
        }

        String jarLocation = null;
        if (urlString.startsWith("jar:file:")) {
            // jdk5 webstart cache AND development:
            // strip the leading "jar:" and everything after ".jar"
            jarLocation = urlString.substring(4, jarIndex + 4);
        } else {
            // jdk6, uses the java caching api in URLClassLoader, which is extended by the JNLPClassLoader in jdk6
            // we should get the same file from the cache; we copy because the class path needs a .jar suffix
            JarFile cachedFile = ((JarURLConnection) u.openConnection()).getJarFile();
            File tempFile = File.createTempFile("cached-", ".jar");
            tempFile.deleteOnExit();
            copyFile(new File(cachedFile.getName()), tempFile);
            jarLocation = "file:" + tempFile.getAbsolutePath();
        }
        reportBuildPath.append(jarLocation).append(File.pathSeparator);
        logger.info("Adding to report compile path: " + jarLocation + " made from: " + urlString);
    }

    System.setProperty("jasper.reports.compile.class.path", reportBuildPath.toString());
    logger.finer("Set jasper compile class path: " + reportBuildPath);
}

private void copyFile(File in, File out) throws Exception {
    FileChannel sourceChannel = new FileInputStream(in).getChannel();
    try {
        FileChannel destinationChannel = new FileOutputStream(out).getChannel();
        try {
            sourceChannel.transferTo(0, sourceChannel.size(), destinationChannel);
        } finally {
            // close the channels even when the copy fails
            destinationChannel.close();
        }
    } finally {
        sourceChannel.close();
    }
}

 

 Willem Cazander

 M4N.NL

Java best practices 5 – Code to Interface, Access by name and Instance Data

7 October 2007

In this 5th and final installment of my series on best practices in Java I’ll wrap up with the last 3 items from the list I compiled in the first installment. Today we’ll be looking at generalizing parameters to counter the “Coding to a class instead of to an interface” bad practice, discuss why “Selecting ResultSet columns by index, instead of by name” is not always a good idea and finally we take a look at some of the usage patterns that are involved with the “needless use of instance variables”.  

Coding to a class instead of to an interface

One of the advantages of Java is its pervasive OO model. I’m aware that on the one hand not everyone thinks OO is the best thing since sliced bread and on the other hand some languages take object orientation to an even higher level. But let’s not digress; suppose you’re a Java programmer and your team has decided OO is in fact a Good Thing.

A classic mistake people often make when starting to program in an OO language is not actually making use of the explicit features OO offers. Of course, the paradigm entails more than just putting “public class { … }” around your existing code. For people just coming from a procedural language, it may be a little understandable if they did just that. I have however seen people who have been programming in Java for 6 or 7 years, even calling themselves ‘Sr Java programmer’, who in fact still do little more than put “public class { … }” around their otherwise procedural code.

Not using OO manifests itself at many levels in a software design. A particularly devious way, and the one I’ll be talking about here, is “Coding to a class instead of to an interface”.

One of the (well known) benefits in an OO design is that a subclass can be used wherever code expects a superclass of that type. To actually reap this benefit a programmer has to be careful when e.g. crafting the signature of her methods. The parameters being used should not just be -a- superclass, but in fact the most general one that still captures all functionality that the code -actually- uses. For flexibility concerns and because of the fact that Java only supports single inheritance, an interface (aka a ‘pure abstract superclass’ in some other languages) is often the most preferred one.

As an example, consider the following code:

public void doStuff( ArrayList<Integer> list ) {		
	list.add(1);
	// do stuff
	list.get(0);	
}

Using the ArrayList in the method’s signature severely limits the usage of the doStuff method, as only ArrayLists and their subclasses can be passed in. In this case it’s an unnecessary restriction. Nothing in the method uses anything that is specific to an ArrayList, making it a bad practice to require it.

A better practice is to use a superclass or an interface here, but which one? ArrayList has many, including superclasses AbstractList and AbstractCollection and interfaces List, Collection and Iterable. The choice between the superclass and the interface is easy here; the superclasses are merely default implementations of parts of the (complex) interfaces, so we’ll go with the interface. Looking at the method’s body we see that add() and get() are being used. This means Collection is actually too general (it doesn’t have get()) so we’ll go for List:

public void doStuff( List<Integer> list ) {		
	list.add(1);
	// do stuff
	list.get(0);	
}

We can generalize this one step further by generalizing the generic parameter:

public void doStuff( List<? super Integer> list ) {		
	list.add(1);
	// do stuff
	list.get(0);	
}

Some readers might ask why the more intuitive List<Number> can’t be used here. Indeed, we could try to define the method as taking a List<Number> or a List<? extends Number>, but the first definition would exclude the possibility of passing in an actual ArrayList<Integer>, while the second definition would disallow the add() method (as someone might otherwise pass in an ArrayList<Float> and would find an Integer among the Floats after the call to doStuff).

In a way one could informally say that ‘coding to an interface’ for generic arguments works the other way around from regular parameters in the OO model; instead of specifying the most general type and silently accepting subtypes (e.g. specify List, accept ArrayList), you specify the most specific type and ‘silently’ accept more general types (e.g. specify <? super Integer>, accept <Number>). Of course, the latter is not completely silent as the Java generic syntax forces you to explicitly state your willingness to accept super types.
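To make the effect of the lower-bounded wildcard concrete, here is a minimal sketch. The class name WildcardDemo and the println call are my own additions for illustration; the doStuff signature is the one from the text. Note that with <? super Integer> the compiler only knows that get() returns an Object.

```java
import java.util.ArrayList;
import java.util.List;

class WildcardDemo {

    // Accepts any list that can safely hold an Integer:
    // List<Integer>, List<Number>, List<Object>, ...
    static void doStuff(List<? super Integer> list) {
        list.add(1);
        // With a lower-bounded wildcard, get() is only known to return an Object.
        Object first = list.get(0);
        System.out.println("first element: " + first);
    }

    public static void main(String[] args) {
        List<Integer> integers = new ArrayList<Integer>();
        List<Number> numbers = new ArrayList<Number>();
        List<Object> objects = new ArrayList<Object>();

        doStuff(integers);
        doStuff(numbers);
        doStuff(objects);

        // List<Float> floats = new ArrayList<Float>();
        // doStuff(floats); // does not compile: Float is not a supertype of Integer
    }
}
```

All three calls compile, while the commented-out call with a List<Float> is rejected at compile time, which is exactly the protection described above.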

Selecting ResultSet columns by index, instead of by name

In Java a ResultSet represents a table of data that is typically obtained as the result of executing an SQL query. The two primary ways to retrieve data from such a ResultSet’s columns are by using either the column’s index number or the column’s name.

Although certainly not everyone would agree, I think retrieving data by index from a ResultSet in Java business code is often a bad practice; it leads to code that is less readable and less maintainable. E.g. consider the following code fragment:

formatItem( allNews.getString("title"), allNews.getString("summary") );

Accessing by index it would look like this:

formatItem( allNews.getString(1), allNews.getString(2) );

With the former statement it’s fairly clear what’s happening, while with the latter we have a lot more guessing to do, especially if we are not also the writer of this piece of code. In fact, writing code in such a way has many parallels with the bad practice of using cryptic variable names (e.g. names like aa, bz, x, etc). While writing, it may be somewhat clear which variable holds what, but when reading it later on the meaning becomes incredibly hard to grasp.

As mentioned, there’s also a maintenance problem with this code. Suppose the ResultSet is generated from this simple SQL query:

SELECT
	description,
	summary,
	text
FROM
	news_items

Now, later on someone may decide to add an extra column, say “author”:

SELECT
	author, 
	description,
	summary,
	text
FROM
	news_items

Especially if the query is shared it may not be obvious that somewhere there is some code with a dependency on the column order. If this new column is thus added before “description”, the Java statement given above will silently use the wrong columns.
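The shift can be simulated without a database. The sketch below is only a stand-in, not JDBC: it models one result row as a LinkedHashMap in SELECT order, and the helper names row, byIndex and byName as well as the sample values are hypothetical. It shows that position 1 silently changes meaning once “author” is added in front, while lookup by name keeps working.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

class ColumnOrderDemo {

    // Hypothetical stand-in for one ResultSet row: column name -> value, in SELECT order.
    static Map<String, String> row(String... namesAndValues) {
        Map<String, String> row = new LinkedHashMap<String, String>();
        for (int i = 0; i < namesAndValues.length; i += 2) {
            row.put(namesAndValues[i], namesAndValues[i + 1]);
        }
        return row;
    }

    // Mimics getString(int): 1-based position in the select list.
    static String byIndex(Map<String, String> row, int columnIndex) {
        return new ArrayList<String>(row.values()).get(columnIndex - 1);
    }

    // Mimics getString(String): lookup by column name.
    static String byName(Map<String, String> row, String columnName) {
        return row.get(columnName);
    }

    public static void main(String[] args) {
        Map<String, String> before = row(
            "description", "Launch", "summary", "We launched", "text", "Full story");
        Map<String, String> after = row(
            "author", "A. Author", "description", "Launch", "summary", "We launched", "text", "Full story");

        System.out.println(byIndex(before, 1));           // Launch
        System.out.println(byIndex(after, 1));            // A. Author
        System.out.println(byName(after, "description")); // Launch
    }
}
```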

SQL is not Java, but compare this with a Java class definition with some simple fields:

class NewsItem {
	String description;
	String summary;
	String text;	
}

When adding an additional field, nobody in his right mind would worry about code that might access a field by the order in which it was defined (although this is technically possible using reflection) and thus the new field would be added at a random location in the class definition.

Proponents sometimes argue that accessing ResultSets by index has several advantages, namely performance, the ability to select an unnamed column and the ability to distinguish between columns with equal names.

To start countering these arguments: for general business code the difference in performance between selecting by index and by name is most probably unmeasurable. It’s the same kind of premature optimization that we encountered when looking at native arrays vs a Collection based alternative; the typical business code that retrieves a handful of values from a ResultSet is not going to notice any difference here.

The ability to select an unnamed column or distinguish among equally named columns is more often than not a way around a bad practice in SQL. Instead of employing yet another bad practice to cover up an earlier one, it’s better to just fix the earlier one, e.g. by using the SQL “AS” clause in the original query to give the column an explicit, unique name.

Does this mean all methods for accessing results by column index should be deprecated? Of course not. For code that simply needs to iterate over all columns in a ResultSet, index based getters are most suitable. Other specialized code may have some specific use for them too.

However, in pure business oriented code these cases may be rare. In the code example above (a very common case in business software) accessing by index is surely a bad practice.

Needless use of instance variables

When discussing the ‘coding to a class instead of to an interface’ bad practice, we mentioned that this was one symptom of not using OO correctly. Another such symptom is the “Needless use of instance variables”, which we too often see in code being written by people with a procedural background.

If OO means little more to you than putting “public class { … }” around your code, then instance variables in Java are the equivalent to (file scoped) global variables in procedural code. Many people agree that using global variables, even when limited to just the file scope, is not very often a good idea.

During auditing sessions I encountered classes consisting of 2000~3000 lines of code (which is actually a bad practice by itself), with in excess of 100 instance variables. In such a situation it becomes totally unclear which variable is affected by what method; a mess that looks and feels quite similar to the ‘global variable hell’ in some other languages.

In unraveling this mess I often encounter a number of patterns, some good, some bad.

A good practice for using instance variables is when those variables actually represent state that needs to be stored inside the mentioned object. It just may be the case that if there are many of these variables, grouping them into separate classes is recommended. E.g. consider a subscription class with 15 instance variables for personal user data and some 15 other for company data. It may be a better idea to group the first 15 variables to a separate class called User and the second 15 to a class called Company. Other than that, there’s nothing wrong with this usage of instance variables.
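A shortened sketch of that grouping could look as follows. The field names and the describe() method are hypothetical and only a handful of the 15 + 15 variables are shown; the point is just that Subscription ends up with two instance variables that each represent a meaningful piece of state.

```java
// Personal user data, grouped into its own class.
class User {
    final String firstName;
    final String lastName;
    final String email;

    User(String firstName, String lastName, String email) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.email = email;
    }
}

// Company data, grouped into its own class.
class Company {
    final String name;
    final String vatNumber;

    Company(String name, String vatNumber) {
        this.name = name;
        this.vatNumber = vatNumber;
    }
}

// The subscription now holds two well-named pieces of state instead of 30 loose fields.
class Subscription {
    private final User user;
    private final Company company;

    Subscription(User user, Company company) {
        this.user = user;
        this.company = company;
    }

    String describe() {
        return user.firstName + " " + user.lastName + " (" + company.name + ")";
    }
}
```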

A truly bad practice for using instance variables is when these variables have no meaningful value outside of any method. In other words, they are actually just local variables that for some reason were declared at class scope. It’s a mistake sometimes made by beginners who seem to think that a declaration of a variable is somehow equal to actually instantiating an object and thus ‘optimize’ their code by declaring the variable as instance data. E.g.

class Foo {

	int temp1, temp2;
	
	public int multiplyAdd( int nr1, int nr2, int nr3, int nr4 ) {	
		temp1 = nr1 * nr2;
		temp2 = nr3 * nr4;
		
		return temp1 + temp2;		
	}
	
}

In the above example, temp1 and temp2 should of course be local to the multiplyAdd method.
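For completeness, a sketch of the corrected version; the names product1 and product2 are my own. Because the method no longer touches any instance state, it also becomes safe to call from several threads at once.

```java
class Foo {

    public int multiplyAdd(int nr1, int nr2, int nr3, int nr4) {
        // The intermediate results live only as long as this call.
        int product1 = nr1 * nr2;
        int product2 = nr3 * nr4;

        return product1 + product2;
    }

}
```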

Another bad practice for using instance variables is ‘inappropriate caching’. Of course it highly depends on the specifics of some piece of code what is actually inappropriate. It may be appropriate to instantiate some helper class that all methods are using as instance data, whenever that helper class is expensive to create. It’s however often not appropriate to use such caching when we’re dealing with stuff that is already cached at some other level or that represents some resource which needs to be closed after usage and is actually expensive to keep open. A Database connection in a Java EE environment is often a prime example of such inappropriate caching using instance data. E.g. :

class Bar {

	Connection connection = null;
	
	public void init() {
		connection = SomeFactory.getConnection();
	}

	public int doFooStuff() {
		Statement stmt = connection.createStatement();
		// execute some query and do foo stuff
	}
	
	public int doBarStuff() {
		Statement stmt = connection.createStatement();
		// execute some query and do bar stuff
	}
	
	public void close() {
		connection.close();
	}
}

In a typical Java EE application server, DB connections are already cached at the server level (through a connection pool). Caching a connection in instance data introduces far more problems than it solves. A typical usage pattern of a class such as Bar in the example above is to instantiate it, keep it around for some time, and call any of its do...Stuff methods when necessary. The problem with that is that it keeps the connection open for a much longer time than strictly required. Many application servers impose a limit on the number of simultaneously opened connections, so it’s often much better to close the connection as soon as we’re done with it. An additional problem with this pattern is the fact that the user of the Bar class needs to manually close the connection. This is not only an extra burden but also leads to the possibility of the user forgetting it and thus causing a connection leak.
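One possible alternative, sketched under a few assumptions: the class receives a javax.sql.DataSource (for example the server’s pooled one) instead of holding a Connection, and each method borrows a connection only for the duration of the call. The method name countNewsItems and the item_count alias are hypothetical, and try-with-resources requires Java 7 or later; on older versions the same can be written with try/finally.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import javax.sql.DataSource;

class Bar {

    // Assumed to be backed by the application server's connection pool.
    private final DataSource dataSource;

    Bar(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public int countNewsItems() throws SQLException {
        // Borrow a connection for this call only; close() hands it back to the pool.
        try (Connection connection = dataSource.getConnection();
             Statement stmt = connection.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) AS item_count FROM news_items")) {
            return rs.next() ? rs.getInt("item_count") : 0;
        }
    }
}
```

With this shape there is no init() or close() for the user of Bar to forget, and the connection is held only while a query is actually running.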

Using instance data for ‘pseudo state’ may be another bad practice. With ‘pseudo state’ I refer to data that enters an object through a method call, is being used by several other methods that execute during this original call and which loses its usefulness for the object after this original call returns. Due to the fact that multiple methods make use of such data, it may be tempting for a programmer to simply store this data in an instance variable and let all methods refer to that. Sometimes this may indeed result in code that is more readable, but when used carelessly we’ll find ourselves again in the situation depicted by the “global variable hell”. Consider the following code:

class Kaz {
	
	public int data;
	
	public int doComputation(int data) {	
		this.data = data;
		increase();
		multiply(4);
		
		return this.data;
	}
	
	public void increase() {
		data++;
	}
	
	public void multiply(int factor) {
		data*=factor;
	}	
}

In this example, an integer value is passed into doComputation() which assigns it to an instance variable upon which increase() and multiply() operate. After the call to doComputation(), the integer stored in data has no meaningful value for Kaz anymore. It actually was a ‘shared local’ variable, where local doesn’t refer to a single method, but a small group of methods. Some people argue that this usage is exactly what instance data is for and that the alternative (passing the data value along to the different methods making use of it) actually degrades the coding style to procedural. Yet other people argue that instance data should only be used for data that has a meaningful value for the object as a whole.

In the example given above we have the additional problem that invoking the increase() or multiply() methods by themselves requires knowledge of the internals of these methods. If the method is long and complicated, it may not be obvious at all what instance data functions as ‘input’ for the method. E.g. to invoke increase() by itself we would need to do something like this:

Kaz kaz = new Kaz();
kaz.data = 4;
kaz.increase();
int result = kaz.data;

Indeed, a usage pattern that has a strong resemblance to invoking functions that depend on global variables in procedural languages. Naturally, most people would instantly say that the above is madness. Nevertheless, more often than not these patterns appear in code as the result of the ‘pseudo state’ practice. A programmer initially starts out with a class like Kaz, making only doComputation() public. After a while however a need emerges to call increase() directly, and instead of refactoring the code, the insane usage pattern of needing to assign directly to kaz.data before invoking increase() appears in the code.
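One way such a refactoring could look, taking the side of those who prefer passing the value along, is sketched below. This is only one of the two views mentioned earlier, not a verdict; the parameter name value is my own.

```java
class Kaz {

    public int doComputation(int data) {
        // The value flows through the helpers instead of through an instance variable.
        return multiply(increase(data), 4);
    }

    public int increase(int value) {
        return value + 1;
    }

    public int multiply(int value, int factor) {
        return value * factor;
    }
}
```

Calling increase() by itself now needs no knowledge of Kaz’s internals: new Kaz().increase(4) simply returns 5, and doComputation(4) still returns (4 + 1) * 4 = 20.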

Well, with this we arrived at the end of this series on best practices in Java. I hope it has been useful to people. Undoubtedly I’ll write a follow up sometime in the future. If there’s one certainty in my job, then it’s the fact that people always find ways to code using exciting new bad practices ;) In the meantime, please take a look at some of the other blogs my team-mates and I have written.

 

Arjan Tijms

 

The NullPointerException

4 October 2007

The NullPointerException in Java (often abbreviated as NPE) is a frequently occurring exception in Java. As most of you probably know, it simply means that you (implicitly) try to dereference a pointer that has the special reserved null value.

As it’s a frequently occurring exception every Java programmer must have stumbled upon it at some point in time. It’s simply unavoidable.

So, why this rather trivial introduction? Well, as it seems, some 12 years after the introduction of the Java programming language and thus some 12 years after the introduction of the NullPointerException, it remains an urban myth among some groups of people that Java does not have pointers.

That’s right, some people honestly still think that Java does not have pointers. There have been many discussions about this subject throughout the years (just do a little searching on the internet). The outcome is always the same. Java (of course) does have pointers. Better yet, everything except primitives is handled by a pointer. It’s just not possible to do any arithmetic with them.
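A small sketch makes this visible. The class names PointerDemo and Counter are hypothetical; the code shows that assigning an object variable copies the pointer rather than the object, and that dereferencing null raises the exception this post is named after.

```java
class PointerDemo {

    static class Counter {
        int value;
    }

    // The pointer is passed by value, so caller and callee point at the same Counter.
    static void increment(Counter counter) {
        counter.value++;
    }

    // Dereferencing a null pointer throws a NullPointerException.
    static boolean dereferencesNull() {
        Counter nothing = null;
        try {
            nothing.value++;
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Counter a = new Counter();
        Counter b = a;    // copies the pointer, not the object
        increment(b);

        System.out.println(a.value);            // 1
        System.out.println(a == b);             // true
        System.out.println(dereferencesNull()); // true
    }
}
```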

So how come this misconception still exists? I can understand that there might have been some confusion when Java was just released, but nearly 12 years later? Don’t those people ever look at the java.lang.NullPointerException and wonder why it says ‘pointer’ ?

Arjan Tijms

The near future…

26 September 2007

What will the future be like, given the current rapid developments in the world of the internet? I shall try to sketch my ideas here. At all times there have been people who said: “the time we are in now is so important, so different from the past. Things are happening, things will change!” Not only in the sixties; you can read similar opinions dating back even to the ancient Greeks. And at the moment I think the same about our time: the last twenty years have been full of inventions that have indeed changed our lives completely: mobile phones, the internet. How will this continue?

In a decade from now, people will probably be online all the time, using portable devices like PDAs, probably with a video camera. With wireless access to the net you can retrieve all your documents everywhere. And the same goes for companies and the government. At this moment you can use an e-ticket. The same could apply to all other forms and formalities such as passports, credit cards, membership cards and so on. The wallet will disappear. Keys will disappear.

Supply and demand will be better taken care of: grocery lists can be automatically generated when chips or tags are attached to consumer goods. And if that is the case, you will never have to pay in a store, because the store registers what articles you take. Oh, did I say store? These will all be online, backed by huge warehouses shipping goods to the place where they are needed. And what goods are these? No longer CDs, books, computers, photo cameras…

 

Houses can become smaller because of the absence of books, people will be using less paper, traveling is no longer necessary when you can do videoconferences 24/7. Society will change from physical to wireless, floating… So we could become more environmentally friendly.

The technical possibilities for these things are there already. It is just a matter of scaling the techniques that have already been invented. But the question is whether we want to do without all the things that can become unnecessary. I like shopping in real life more than shopping online. I prefer reading a ‘real’ book. I like having CDs and records in my house, even though I know I could just as well copy them onto my computer. And do we want to be online 24/7? Who is responsible for the data in cyberspace? When my virtual money is stolen, will I blame the bank? The provider? The place that provides the storage? Myself? These developments make us more connected, which also makes us more fragile. I am really curious about what is going to happen….

I posted this message to myself (using FutureMe), and it will be delivered in 2017. If email still exists at that time I will be able to compare this story to the truth… If this blog still exists I will… repost this message

 

Dineke Tuinhof

 

Is it possible for a small company to switch their development language?

25 September 2007

Lately I’ve been looking at the Erlang programming language and found it very much to my liking.

Even though I’m not sure it’s the right language for my company it made me think about the possibility of finding the ‘perfect’ language some day.

If that would happen, would it help us in any way?

Would it be possible to rewrite the entire application, which might have taken years to develop, with a small team?

After giving it some thought I came to the conclusion that unless you have a workforce large enough to let a significant number of people work on rewriting the application while another group keeps updating and maintaining the ‘old’ version, it’s not possible to switch.

Another point is that all the employees are probably specialized in the programming language that is currently used.

So even if we assume they are all intelligent enough to learn the new language quickly, it will still take time before they know it as well as the one they were probably hired for and have been using for years.

I always wondered why some companies that are clearly using the wrong tool (for example PHP) for the job (for instance an application with millions of users spread over multiple servers) did not just switch. After thinking about it I understand it better: most companies simply do not have the resources to do it.



The moral of the story is:

Think hard and long about what language to use when you still can, because switching later might be impossible.

 

Daniel Versteegh

 

Java best practices 4 – Native Arrays and Not Using Java 5.

3 September 2007

In the fourth edition of this series of blogs on best practices in Java I will discuss two more cases from the list I introduced in the original article. These will be “Using native arrays instead of ArrayList” and “Not using Java 5 features where appropriate”. Both are perhaps somewhat milder bad practices than those discussed in the previous installments. Nevertheless, attention to detail never hurts, so here goes:

Using native arrays instead of ArrayList

An array was once a very common data structure in many programming languages. Who didn’t grow up using constructs like numbers[i] = 10; ? Lately however arrays have become a somewhat deprecated language feature to use directly. That is, not the concept of the array itself, but the bare implementation of it in languages like Java or C++. To contrast these ‘built-in’ arrays with some OO variant, the term ‘native array’ has been in use for quite some time. In Java the native array is not really native of course. It’s an Object, although one with some special rules. Nevertheless, the same objections that I will outline below still apply to it.

The main problem with the native array in modern (Java) business code is its lack of a user friendly interface (in C/C++ there’s the additional risk of -really- dangerous buffer overflows). Java’s most usable replacement of the native array is probably the ArrayList. Using this class you can easily add something to the array even if you would try to add that something beyond its current limits (it automatically grows). Even if you stay within the limits of the array, the code to add something is simpler; you don’t have to maintain a separate index but just add something to the end. Now this may seem like a moot point, but when you have to maintain hundreds of thousands of lines of business code all those small simplifications add up quickly. In code the differences would look like this:

Native arrays:

NewsItem[] newsItems = new NewsItem[allNews.rowCount()];
int currentRow = 0;
while(allNews.nextRow()) {
	newsItems[currentRow] = new NewsItem( allNews.getString("text") );
	currentRow++;
}

ArrayList:

List<NewsItem> newsItems = new ArrayList<NewsItem>(allNews.rowCount());		
while(allNews.nextRow()) {
	newsItems.add( new NewsItem( allNews.getString("text") ));		
}

Using the first form is of course not the end of the world, far from it, but the omission of the separate currentRow variable slightly reduces the complexity of this code without suffering any loss of flexibility. Lower complexity statistically means fewer bugs in the long run.

A really big disadvantage of using the native array is that you can’t code to an interface when using it. I will be talking about this issue specifically when discussing the “Coding to a class instead of to an interface” bad practice, so it’ll suffice for now to say that this fact greatly limits the usability and extensibility of methods that only accept native arrays.

Now some people may claim that the native array variant is faster and is therefore the preferable solution. The first part of that statement is indeed correct, native arrays -are- faster. The second part of the statement largely depends on the situation. For high performance scientific code the statement is certainly true. This kind of code depends on highly tweaked routines that implement algorithms that may access an array millions of times in very tight loops. (I got my M.Sc. in high performance computing, so even though I’m not claiming to be the ultimate expert in this field, I know a thing or two related to it ;) ).

However, in Java business code where a managed bean returns one single array of items in processing a user’s request for display on a screen, this is a totally worthless performance optimization. For one single instantiation and one single iteration, the difference would be completely negligible. Even when serving several hundreds of simultaneous users the difference would still be nearly unmeasurable.

Wanting to use native arrays for this reason is actually another well known bad practice: premature optimization. As Donald Knuth once said: “premature optimization is the root of all evil.” In short it boils down to the fact that possible optimizations like using a native array should only be done when you can actually prove (e.g. with a profiler) that the ArrayList is indeed a performance bottleneck in your specific situation.

So, as a rule of thumb make use of ArrayList whenever you need an array structure and only resort to using native arrays when you’re absolutely sure that you need them.

Not using Java 5 features where appropriate

With Java 5 Sun introduced a number of new language features aimed at, among other things, simplifying code (for each loop, autoboxing) and making it more typesafe (generics, enums). As with any change, there are always people who strongly protest against it. Generics especially had a more than average number of people protesting against its addition to Java (this was mainly due to the somewhat clumsy interaction of legacy non-generic code and new code written to use generics).

However it has been 3 years since (Java 5 was released in 2004), and those additions are definitely here to stay. Since then even a major new Java version has been released (Java 6 in 2006), so even companies with a “1 year behind” policy as well as those with a “1 version behind” policy are by now allowed to use Java 5. Next to that, most Java books and tutorials have been updated for the Java 5 syntax.

There is currently no excuse anymore for not making use of Java 5 language features. Therefore, things like raw types (unparameterized generic types), lists of final static ints instead of enums, using an index when simply iterating over every instance in a List, using new Long(0) etc. are now considered a bad practice.
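For those who haven’t seen the Java 5 alternatives side by side, the following is a small sketch of what they could look like. The names used here (TigerFeatures, Status, describe, zero) are purely illustrative and not taken from any real code base.

```java
import java.util.ArrayList;
import java.util.List;

class TigerFeatures {

    // An enum instead of a list of "final static int" constants.
    enum Status { NEW, PAID, SHIPPED }

    // Parameterized types instead of raw Lists, so no casts are needed.
    static List<String> describe(List<Status> statuses) {
        List<String> names = new ArrayList<String>();
        // For-each loop instead of iterating with an index.
        for (Status status : statuses) {
            names.add(status.name());
        }
        return names;
    }

    // Long.valueOf (or plain autoboxing) instead of new Long(0).
    static Long zero() {
        return Long.valueOf(0L);
    }
}
```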

Generally speaking there are three groups of people who aren’t using Java 5 language features. In the first group you’ll find beginners who simply read the wrong (outdated) tutorials. Typically it’s enough to ‘enlighten’ these people by simply telling them about the existence of the additional syntax.

The second group is more problematic though. Here you’ll find seasoned Java programmers who have been using the old syntax forever and just refuse to adapt or learn new things. On a small scale these people behave in the same way the old C programmers did when they had to switch from procedural to object oriented programming, or the way current OO programmers behave when they have to change from sequential to parallel programming. So, in this case the more general bad practice that lies under the specific bad practice of not using Java 5 language features is the unwillingness of people to adapt to change and their tendency to get stuck in old habits. For a programmer, who has to move in the fast paced world of technology, this is a serious flaw.

Finally there’s the third group who just wants to stay compatible with all existing pre-1.5 VMs people might still have installed. In some cases this may be a valid reason, but after some time the desire to stay compatible should be resisted. It hampers innovation if new technology is systematically refused. If the people you want to stay compatible with refuse to upgrade their VM, why would they go ahead and upgrade to your latest application? Probably those people are just happy running your old version on their old VMs. The most you should do is offer some really critical security updates for these old versions and spend the rest of your time on working with reasonably recent technology.

If we look a bit outside of the Java world we’ll see that not moving to a new technology has for instance seriously hurt the development of PHP. Many well known PHP applications choose to remain based on PHP 4 (2000), instead of on the newer and much improved PHP 5 (2004). See for instance http://boren.nu/archives/2007/05/11/wordpress-and-php-5. Next to that, many programmers also choose to remain at the PHP 4 level. The result is that PHP as a whole is practically many generations behind solutions like Java EE, ASP.NET or RoR. A similar thing holds for MySQL 5 (one wonders, would it be the 5 that hinders adoption? ;) ).

I won’t be elaborating on the exact advantages of the Tiger language additions; many articles and books have already been written about them and besides that, this entry is not really about those advantages. Instead, the moral here is simply that still not using them by now is a bad practice.

Well, that’s it for today again. Stay tuned for the next installment where we’ll be talking about “Coding to a class instead of to an interface”,  “Selecting ResultSet columns by index, instead of by name” and “Needless use of instance variables”.

Arjan Tijms 

Java best practices 3 – Eating Exceptions and Mixing JSTL with JSF

26 augustus 2007

Today we arrived at the third installment about best practices in Java. This time I will be talking about the well known, but often sinned against practice of eating up exceptions. Next to that we will be looking at some cases where mixing JSTL and JSF might not be the best way to go.  

Eating up exceptions; continuing with invalid data

Handling exceptions in Java (and to be fair, in most other languages) is a rather difficult topic. One can try to handle the exception directly (e.g. trying a fall back server after a TimeOutException is thrown when accessing the primary server), but this is certainly not always possible. Confronted with their inability to handle an exception, some programmers choose to just ignore the exception and carry on as if nothing has ever happened. This practice is also known as “eating the exception”. In code it looks like this:

try {
   fooObject.doSomething();
}
catch ( Exception e ) {
   // do nothing
}

The thing is that something bad -has- happened and pretending it hasn’t doesn’t magically make it go away. Compare this with ignoring that little red light which tells you you’re almost out of gas. You can drive on pretending all is fine, but inevitably you’ll find yourself stranded at some desolate place, wishing you hadn’t been so careless.

A major problem with eating up exceptions is that most likely your application is in an invalid state afterwards. Continuing on only drags your problems along until ultimately something crashes anyway.

Such a crash however may occur at a completely different location and at a much later moment, making it very difficult to find out the true cause.

Besides a downright crash, another risk you’ll run is that invalid data might be persisted somewhere or that operations will be carried out with missing data. Suppose you’re the programmer that ate the exception that occurred when a user registered, and your code allowed an order to take place without having anyone to bill. Chances are high more than a few people won’t be too happy with you.

The least, the very least you can do is log the exception when it happens so you’ll have at least some clue where to look if ‘mysterious’ things happen. Better yet, throw the exception upwards if you can’t handle it. Ultimately it might reach the user in some form (e.g. through an error page). It’s true that users don’t like error messages, but they like it even less when a system tells them something went well when in fact it didn’t.
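As a rough illustration of logging and throwing upwards, consider the sketch below. Everything in it (RegistrationService, UserStore, RegistrationException) is hypothetical and only serves the example; java.util.logging is used simply because it ships with the JDK. Whether you log at this level or only at the place where the exception is finally handled is a matter of taste, the point is that the exception isn’t silently swallowed.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

class RegistrationService {

    private static final Logger LOGGER =
            Logger.getLogger(RegistrationService.class.getName());

    // Hypothetical application level exception that keeps the original cause.
    static class RegistrationException extends Exception {
        RegistrationException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical collaborator that may fail while storing the user.
    interface UserStore {
        void save(String userName) throws IOException;
    }

    static void register(UserStore store, String userName) throws RegistrationException {
        try {
            store.save(userName);
        } catch (IOException e) {
            // Leave a trace for whoever has to investigate later ...
            LOGGER.log(Level.SEVERE, "Could not register user " + userName, e);
            // ... and pass the problem upwards instead of pretending all is fine.
            throw new RegistrationException("Registration failed for " + userName, e);
        }
    }
}
```

Code that calls register() now can’t accidentally continue with an order for a user that was never stored; it either handles the RegistrationException or passes it on in turn.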

Mixing JSTL and JSF for common cases

For starters, don’t get me wrong. Using both JSTL and JSF on the same page is not a bad practice by itself. It used to be unsupported in the separate JSF 1.1 release, but starting from the version that came with Java EE 5 (JSF 1.2) mixing JSTL and JSF is explicitly supported.

The fact that you can mix them however tends to lure some programmers into using JSTL in ways for which there are better or cleaner JSF alternatives. Two common cases where this happens are conditional rendering using the c:if tag and building a table from a collection using the c:forEach tag.

Use of the c:if tag in JSF is rarely required as most components make use of a rendered property. If you need to render multiple components conditionally, just wrap them in an h:panelGroup and set the rendered property on that one.

Likewise, use of the c:forEach tag just to build a table can often easily be replaced by the h:dataTable tag. Practically, the results are comparable. Both iterate over a List, array, etc. and in effect execute their body’s content. In code they look similar too:

JSTL/JSF:

<c:forEach items="#{newsItems.newsItems}" var="newsItem">
	<h:outputText value="#{newsItem.date}" styleClass="textItalic" />
	<h:outputText value="#{newsItem.item}" styleClass="paragraph" escape="false" />
</c:forEach>

JSF:

<h:dataTable value="#{newsItems.newsItems}" var="newsItem">				
	<h:column>				
		<h:outputText value="#{newsItem.date}" styleClass="textItalic" />
		<h:outputText value="#{newsItem.item}" styleClass="paragraph" escape="false" />
	</h:column>
</h:dataTable>


However, not only does an h:dataTable based component look more in place in a JSF page, there are also important technical differences. The JSTL tag handlers aren’t components. They serve to build the component tree and subsequently disappear. For some programmatic processing of the component tree this may pose a problem. Another technical difference is that an iterating JSF component doesn’t in fact create an individual component for each iteration. Instead, its children are typically created only once and are evaluated each iteration. This is in stark contrast with the JSTL c:forEach tag. Without being a ‘managing’ parent and without explicit knowledge about JSF, c:forEach can do nothing more than add a new component instance to the tree for each iteration. This may or may not be the end of the world, but you should be aware of this difference.

One advantage of the c:forEach approach may be the fact that it allows you to render markup without using the HTML table element. This is often considered an advantage when creating renderings that are in fact not tables. The limited set of standard components in JSF doesn’t provide support for this, but the 3rd party library Tomahawk contains a t:dataList component that can be used for exactly this.

A third variant, which by itself isn’t technically bad but just looks out of place, is the usage of JSTL for rendering a table in a page that otherwise consists of JSF components:

JSF/JSTL:

<f:view locale="#{locale.locale}">	
	<h:form >

	<%-- Other JSF components here --%>

		<c:forEach items="${newsItems.newsItems}" var="newsItem">
			<br /><i><c:out value="${newsItem.date}"/></i>
			<br /><c:out value="${newsItem.item}" escapeXml="false"/>
		</c:forEach>

	</h:form>
</f:view>

So the moral of the story is: yes, you can mix JSTL and JSF (1.2), but don’t do it if there are JSF specific solutions available. Only use JSTL if you’re absolutely sure that the rendering you wish to create really needs it and be aware of any technical consequences.

Stay tuned for the next installment.

Arjan Tijms 

 

Java best practices 2 – Explicit cases

15 augustus 2007

This is the second installment of my discussion about various bad practices in Java that I encountered during my work. As outlined in the first installment, this entry will be about “Not structuring different cases explicitly”.

After the first installment some readers wondered why the discussion is called “best practices”, while I actually talk about “bad practices”. The idea here is that recognizing these bad practices helps you in avoiding them and doing the opposite, which is a good practice ;-)  

Not structuring different cases explicitly

A particularly nasty bad practice is when programmers don’t structure different cases in their code simply as, well… different cases. Oftentimes this bad practice is introduced into a software system whenever an extension is made to existing code.

We all know the deal; we’ve created a nice and simple Servlet that only takes an ID of something and does some work with that in a clean and straightforward way. Inevitably however a boss or customer comes along, asking for an addition to be made. Now how do you handle this?

A beginning or perhaps less talented developer tends to just keep adding parameters to the URL calling the Servlet, sorting the now implicit cases out as the code progresses. At first this may seem reasonable, but it very soon becomes a total maintenance nightmare. Bug fixing becomes hard (which set of parameters belongs together?) and refactoring becomes near impossible if you need to unravel the tightly knitted fabric of 20 or more possible lines of execution, just to find out what cases the code actually handles. After some given threshold is reached even the original programmer is unable to make any changes at all to the code and development grinds to a halt.

If you ever come to work somewhere and a ‘senior’ developer tells you some piece of code can’t be touched since “it’s too dangerous to make changes”, it’s often because of exactly this bad practice.

To give you some idea of what this would look like in practice, take a look at the following code example. Let’s suppose a Servlet can be called using a URL with the following parameters:

“ownerID, customerID, productID, salesDescription, changeText and changeID”

Now suppose the code handling these would look something like this:

processor.customerID = customerID;
if (ownerID != null) {
   store.setOwnerID( ownerID );
   // lots of other code ...
   int foo = processor.getFoo(); // introduce an intermediate variable
   // again lots of other code ...
   if ( changeID != null && changeText == null && customerID > 1) {
       store.setCustomerRegularID( customerID );
       // lots of other code ...
       if ( productID != null ) {

       }
   }
   else if ( salesDescription != null && foo != changeID  ) {
        // again lots of code here
   }
   store.setFoo(foo);
   // More and more code
}
// Lots of other code again ...
if ( productID != null && changeID != null ) {
   // ...
}
// etc etc etc

This already looks pretty bad, but now suppose the “lots of other code” comment is actually replaced with lots of other code. It shouldn’t require too much imagination to understand that it becomes ‘rather difficult’ then to decipher what the code is doing.

The problem here is clear; every ‘command’ given to the code is implicitly expressed through a complex combination of overlapping parameters. Which combination of parameters relates to which command is extremely hard to grasp just by looking at the code. The above code may actually do relatively straightforward things such as “update product description” or “update customer description” but we just can’t see that when looking at the code.

The solution to this problem is equally straightforward; simply adopt the command pattern (described by the GoF in the most excellent book Design Patterns: Elements of Reusable Object-Oriented Software). Using this pattern, the above code would look more like this:

switch (command.cmd) {

   case updateProductDescription:
      handleUpdateProductDescription(command.params);
      break;

   case updateCustomerDescription:
      handleUpdateCustomerDescription(command.params);
      break;

   // other cases
}
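The switch above is the simplest way of making the cases explicit. Going one step further towards the pattern as the GoF describe it, each command can become its own small object. The following is only a sketch of that idea; the Command interface, the CommandDispatcher class and the strings it returns are assumptions made for this example and don’t come from any real code base.

```java
import java.util.EnumMap;
import java.util.Map;

class CommandDispatcher {

    // One constant per explicit case, named after the "cmd" values.
    enum Cmd { updateProductDescription, updateCustomerDescription }

    // Each case is handled by its own small command object.
    interface Command {
        String execute(Map<String, String> params);
    }

    private final Map<Cmd, Command> commands = new EnumMap<Cmd, Command>(Cmd.class);

    CommandDispatcher() {
        commands.put(Cmd.updateProductDescription, new Command() {
            public String execute(Map<String, String> params) {
                return "product " + params.get("productID") + " -> " + params.get("description");
            }
        });
        commands.put(Cmd.updateCustomerDescription, new Command() {
            public String execute(Map<String, String> params) {
                return "customer " + params.get("customerID") + " -> " + params.get("description");
            }
        });
    }

    String dispatch(String cmdParameter, Map<String, String> params) {
        // Enum.valueOf throws an IllegalArgumentException for an unknown name,
        // so an unsupported command fails right here instead of somewhere deep in the code.
        Cmd cmd = Cmd.valueOf(cmdParameter);
        return commands.get(cmd).execute(params);
    }
}
```

Adding a new case then means adding one enum constant and one command object, without touching the code of the existing cases.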

A similar approach can be used at the URL level. Simply introduce one extra parameter called “cmd” and clearly document the meaning of the rest of the parameters depending on the value of the “cmd” parameter. E.g. compare:

http://example.com/foo?productID=4&description=some_description

with

http://example.com/foo?cmd=updateProductDescription&productID=4&description=some_description

This example may look trivial, but imagine 10 URLs, each with a different combination of the parameters mentioned earlier. You’ll appreciate the cmd parameter pretty soon. Please note though that in object oriented frameworks like JSF we rarely need to construct URLs manually like this.

It may be hard to believe, but there’s actually an even more hideous form of the “Not structuring different cases explicitly” bad practice: variable name reuse. This can actually be a bad practice by itself, but it most often shows up in combination with the former. Variable name reuse is often introduced into a software system when the number of parameters and conditionals in the code has already reached a certain threshold due to the usage of the implicit cases as described above. At this point the developer in question thinks he’s being clever and ‘abstracts’ a number of (partly) common cases by reusing existing variables to hold wildly different things. Of course, this only creates an even bigger mess.

E.g. imagine the first code fragment above starting with this:

if ( customerID != null ) {
   ownerID = customerID;
}

if ( changeText != null ) {
   ownerID = productID;
}

if ( salesDescription != null && changeText != null ) {
   customerID = ownerID;
   ownerID = salesDescription;
}

// Rest of the code as given in the first fragment here

Seems totally insane? The code fragment above is in fact a ‘simplified’ version of live code that I actually encountered during code auditing.

Well, that’s it for today. In the next installment we’ll be talking about “eating up exceptions” and “mixing JSTL and JSF for common cases”.

Arjan Tijms

 

Java best practices

11 augustus 2007

Within Mbuyu, the company I work with, one of the things I’m responsible for is guarding the quality of our code base. This job mainly involves reading through source code and marking dubious constructs and practices. In the past I’ve been doing quite similar things at other locations.

Over time I came across a number of bad practices that seem to be repeated over and over. Many of those originate from people who are just beginning their Java career; new employees, interns etc. Surprisingly, even some more experienced Java developers sometimes sin against these seemingly straightforward rules.

Of course, there is a subjective factor involved here. People actually differ on what is a best practice and what is not. Anyway, without further ado, let’s start with a list of some common bad practices:

  • Not using types
  • Not validating user input
  • Mixing business logic and view code
  • Not structuring different cases explicitly
  • Eating up exceptions; continuing with invalid data
  • Mixing JSTL and JSF for common cases
  • Using native arrays instead of ArrayList
  • Not using Java 5 features where appropriate
  • Coding to a class instead of to an interface
  • Selecting ResultSet columns by index, instead of by name
  • Needless use of instance variables

Due to the size of the discussion, I shall discuss only the first 3 items of this list today and leave the rest to a follow-up posting.

Not using types

This may sound like a weird bad practice in Java. After all, Java is a strongly typed language, so how can we not be using types when the compiler enforces them? Actually, there are at least two ways around the type system; make everything a String or make everything an Object.

This first option is common in plain JSP programming; data enters the application from request parameters as Strings and developers simply don’t care to convert them to some data type. Instead, business logic methods are written to take Strings and layer upon layer only Strings are passed around. It seems insane to do this, but I’ve actually seen people doing stuff like this for years(!).
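The alternative is to convert the Strings to real types once, right where they enter the application, and to pass only typed values to the business logic. A minimal sketch of this could look as follows; the ProductRequest class and the parameter names are invented for this example.

```java
import java.util.Map;

class ProductRequest {

    final long productId;
    final int quantity;

    private ProductRequest(long productId, int quantity) {
        this.productId = productId;
        this.quantity = quantity;
    }

    // Convert the raw String parameters exactly once, at the edge of the application.
    // parseLong and parseInt throw a NumberFormatException for missing or non-numeric input.
    static ProductRequest fromParameters(Map<String, String> parameters) {
        long productId = Long.parseLong(parameters.get("productID"));
        int quantity = Integer.parseInt(parameters.get("quantity"));
        return new ProductRequest(productId, quantity);
    }

    // Business logic only ever sees typed values, never Strings.
    static long totalPriceInCents(ProductRequest request, long unitPriceInCents) {
        return request.quantity * unitPriceInCents;
    }
}
```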

The second option is nowadays less common, although it sometimes shows up in JDBC programming; programmers do not exactly know what Java types correspond to SQL types so they just call getObject() and pass the data along. Probably they’re hoping the next guy will somehow magically know to which type the Object needs to be cast.

Before the introduction of generics in Java 5, the second option was very pervasive in Java code though. At that time there simply was no way to store anything in a collection without resorting to using Object. It couldn’t however really be called a ‘bad practice’ by then, since there was really no sane way to circumvent the problem.

Not validating user input

Not validating user input is one of the most common bad practices I’ve encountered. It gives rise to a whole slew of problems, ranging from SQL injection to cross-site scripting and excessive exception throwing. For instance, many beginners don’t seem to realize that JavaScript validations don’t protect your server from malicious users who can (of course) just send data to your server directly, bypassing any JavaScript validation you may have in place.

A more subtle form of this bad practice is when a programmer doesn’t validate if data conforms to business rules right when it enters the system. Instead, such a programmer validates data at some other point in time, perhaps when the data is actually used. Of course, it’s often too late then to correct matters and afterwards the location which allowed for this invalid data is hard or impossible to find.
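To give an idea of what validating right at the entry point could look like, here is a small server side sketch. The business rules in it (a product code of 3 to 10 uppercase letters or digits, a quantity between 1 and 100) are assumptions made up for this example, as is the OrderInputValidator class itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class OrderInputValidator {

    // Assumed business rule: product codes are 3 to 10 uppercase letters or digits.
    private static final Pattern PRODUCT_CODE = Pattern.compile("[A-Z0-9]{3,10}");

    // Runs on the server, regardless of any JavaScript checks in the browser.
    // Returns a list of error messages; an empty list means the input is acceptable.
    static List<String> validate(String productCode, String quantityText) {
        List<String> errors = new ArrayList<String>();

        if (productCode == null || !PRODUCT_CODE.matcher(productCode).matches()) {
            errors.add("productCode must be 3 to 10 uppercase letters or digits");
        }

        try {
            int quantity = Integer.parseInt(quantityText);
            // Assumed business rule: between 1 and 100 items per order.
            if (quantity < 1 || quantity > 100) {
                errors.add("quantity must be between 1 and 100");
            }
        } catch (NumberFormatException e) {
            // Not eaten: the problem is reported back to the caller as a validation error.
            errors.add("quantity must be a whole number");
        }

        return errors;
    }
}
```

Only when the returned list is empty should the data be allowed further into the system. Otherwise the errors go straight back to the user, at the one location where the invalid data can still be corrected.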

Mixing business logic and view code

Not separating business and view code is another frequently encountered practice. It’s a major cause of creating spaghetti from code, which makes bug fixing and applying changes much harder than they should be. This bad practice is especially common for people with a PHP background, where the community more or less seems to encourage this practice (or at least doesn’t discourage it as much as in e.g. the Java or .NET communities).

One major problem with this bad practice is that beginning developers don’t always want to adopt a more sane MVC approach. It’s very much true that the MVC pattern may be overkill for small applications. However, many larger applications tend to be grown out of smaller ones. On top of that, for a new programmer the one or two pages he makes at first often seem to be the entire world, even when the application which is going to include these pages already has perhaps 500 other ones. Seeing the rest of the world is a skill often learned only over time.

For Java EE, an early effort by Sun to gently push the programmer into this MVC model was JSTL. In JSTL the programmer is presented with a number of tags and an expression language (EL) to define the rendering. JSTL contains conditionals, variables, and looping constructs. It should be very clear that these are solely meant to be used for rendering and nothing else. Or isn’t that so clear? A couple of years ago I asked one programmer to stop putting business logic in JSP pages using Java scriptlets and start using JSTL. After putting up some initial resistance, he finally agreed and went back to his work. When I looked through his next CVS commit, I was in for a surprise though. All the business logic was still exactly there in the JSP page, but this guy had simply rewritten the Java scriptlet code into JSTL tags! Needless to say that expressing business logic in the view layer through JSTL is an even worse practice.

Well, I’ll leave it at that today. Stay tuned for the next installment.

Arjan Tijms 
