Monday, 10 December 2007

From C# to Java - syntax, libraries, generics...

I was disappointed to discover that the standard Java regular expression library (java.util.regex) doesn't support named captures. I'm porting some C# code that uses this feature over to Java. The best alternative seems to be JRegex, but that forces me to depend on a non-standard library.
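
For what it's worth, the fallback inside java.util.regex itself is to address groups by number and give those numbers names by hand. A minimal sketch, assuming a made-up key=value pattern (the class, constants and pattern are hypothetical, not taken from the code I'm porting):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumberedGroups {
    // Hypothetical pattern; the C# original would read (?<key>\w+)=(?<value>\w+)
    static final Pattern PAIR = Pattern.compile("(\\w+)=(\\w+)");

    // Hand-maintained stand-ins for the capture names.
    static final int KEY = 1;
    static final int VALUE = 2;

    // Returns {key, value}, or null if the whole input does not match.
    static String[] parse(String input) {
        Matcher m = PAIR.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(KEY), m.group(VALUE) };
    }
}
```

The obvious weakness is that the constants have to be renumbered by hand whenever a group is added to the pattern, which is exactly what named captures avoid.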

A porting exercise like this really highlights the differences between Java and C#. I must confess that I prefer C#'s syntax and the slightly cleaner code resulting from the (mostly) very well designed and implemented set of framework libraries. The C# designers of course had two great advantages: (1) being second, and (2) not having to worry over-much about portability.

Advantage (1) is about having a 'clean slate': they could take Microsoft's Java (J++), fix (or improve) its awkward spots, and clean up the syntax without worrying about compatibility with existing code. Property syntax is a good example: when porting my C# code, it was a pain to convert every property into a getXXX() / setXXX(val) method pair. On the other hand, the JavaBeans convention has a modest advantage when using IntelliSense or similar in an editor: all the get and set methods appear together in a nice list, so you don't have to hunt for the properties.
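
To make the property conversion concrete, here is a minimal sketch of a single property after porting. The Label property and its backing field are hypothetical, chosen only for illustration:

```java
// C# original, for reference:
//   private string m_label;
//   public string Label
//   {
//       get { return m_label; }
//       set { m_label = value; }
//   }
class Segment {
    private String m_label;

    // One C# property becomes two Java methods.
    public String getLabel() {
        return m_label;
    }

    public void setLabel(String label) {
        m_label = label;
    }
}
```

Every call site changes as well: segment.Label = "x" in C# becomes segment.setLabel("x") in Java.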

Advantage (2) may be more debatable. Although it's true that the Java platform APIs are careful to avoid OS-dependent behaviour, it's sometimes necessary to be aware of differences in the platform beneath the VM. On a Unix-like system, for example, a path that exists may return false for both File.isDirectory() and File.isFile() (a block device, say). To be fair to .NET, the core libraries provide good support wherever they touch a common native resource such as the filesystem.
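
In practice this means not treating 'not a directory' as 'regular file'. A minimal sketch of a defensive check (the FileKind class and its labels are my own invention for illustration):

```java
import java.io.File;

class FileKind {
    // Classifies a path without assuming that "not a directory" implies "regular file".
    static String describe(File f) {
        if (!f.exists()) {
            return "missing";
        }
        if (f.isDirectory()) {
            return "directory";
        }
        if (f.isFile()) {
            return "file";
        }
        // Exists, but is neither: e.g. a device node or named pipe on a Unix-like system.
        return "other";
    }
}
```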

Two simple examples where the .NET framework libraries provide an out-of-the-box solution, but Java still depends on third-party support:

1) Intelligently combine path fragments to yield a single, usable pathname, using the correct separator character:
// C#
string filePath = Path.Combine(pathFrag1, pathFrag2); // fragments are strings

// Java
import org.apache.commons.io.FilenameUtils; // Requires Apache Commons IO library
...
String filePath = FilenameUtils.concat(pathFrag1, pathFrag2);

2) Read the contents of a file into a string (and ensure the file is closed):
// C#
string fileContents = File.ReadAllText(filePath);

// Java
import org.apache.commons.io.FileUtils; // Requires Apache Commons IO library
...
String fileContents = FileUtils.readFileToString(new File(filePath));
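
For comparison, here is roughly what the same two tasks look like using only java.io. This is a sketch under stated assumptions: PlainJavaFiles is a hypothetical helper class, the join does not reproduce everything Path.Combine or FilenameUtils.concat do, and the read uses the platform default encoding.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

class PlainJavaFiles {
    // Joins two fragments using the platform separator.
    // Unlike Path.Combine, an absolute second fragment is not returned on its own.
    static String combine(String parent, String child) {
        return new File(parent, child).getPath();
    }

    // Reads a whole file into a string and closes the reader even if reading fails.
    static String readAll(String filePath) throws IOException {
        BufferedReader reader = new BufferedReader(new FileReader(filePath));
        try {
            StringBuilder contents = new StringBuilder();
            char[] buffer = new char[4096];
            int count;
            while ((count = reader.read(buffer)) != -1) {
                contents.append(buffer, 0, count);
            }
            return contents.toString();
        } finally {
            reader.close();
        }
    }
}
```

The join is a one-liner, but the read is a dozen lines of boilerplate that everyone ends up writing (or borrowing) for themselves.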

Hardly a big deal in either case, but in C# you never have to hunt for the external library and ensure it's linked. So much of the .NET BCL becomes practically 'extended syntax'. If only Visual Studio implemented the equivalent of 'Fix Imports' in NetBeans! I do get tired of having to go to the top of the file and add the 'using ...' lines.

Generics are another matter (and much too large a subject to deal with properly here). It is a simple truth that .NET does generics properly and Java does not. If you don't like that statement, read both this article and this one, before giving me any grief about it.

Here is the same combination of generic collection and property getter, in both languages:
// C#
//
List<Segment> m_segments = new List<Segment>();
...
public Segment[] Segments
{
get { return m_segments.ToArray(); }
}

// Java
//
ArrayList<Segment> m_segments = new ArrayList<Segment>();
...
public Segment[] getSegments()
{
return m_segments.toArray(new Segment[m_segments.size()]);
}

Again, not a huge difference, but I know which I prefer to look at. The Java toArray call requires us to pass in a new Segment array because, at runtime, Java's ArrayList doesn't possess the generic type information needed to create a correctly typed array without the type hint. The copying is done using System.arraycopy and casts, because ArrayList's type parameter is erased and replaced with Object (plus compiler-inserted casts) at compile time.
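
The erasure is easy to observe directly. A minimal sketch (ErasureDemo and its method names are hypothetical, and String stands in for Segment to keep it self-contained):

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Both lists share one runtime class: the type argument does not survive compilation.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return strings.getClass() == numbers.getClass();
    }

    // Without a type hint, ArrayList can only hand back an Object[].
    static Class<?> plainToArrayType(List<String> list) {
        return list.toArray().getClass();
    }

    // The array argument supplies the runtime element type the list itself lacks.
    static Class<?> typedToArrayType(List<String> list) {
        return list.toArray(new String[list.size()]).getClass();
    }
}
```

With an ArrayList as the argument, sameRuntimeClass() returns true, plainToArrayType gives Object[].class and typedToArrayType gives String[].class.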

Despite all of that, I'm keen to continue with both languages. If only the Java community were prepared to go for a 'breaking change': a major release which fixed some of the fundamental problems (e.g. the generics implementation) at the expense of JVM backward compatibility, and cleaned up the syntax even further.
