Blog changes and a real update

If you actually read this blog (yes, there are a handful of you), you may have noticed some changes. I decided to do a pseudo “re-branding” since I am now going to include posts about my latest academic work, and not just posts related to programming contests.

As I mentioned in my last post, I’ve now been at Stanford for almost two weeks. One of the projects I am involved in is the next revision of the International Classification of Diseases (ICD-11). Researchers and developers at Stanford have been collaborating with the World Health Organization to develop software that will allow members of the medical research community to develop ICD-11 collaboratively, in a somewhat open, crowd-sourced manner.

Members of Stanford’s Protégé team have extended the web-based version of Protégé to help support this effort. For those unfamiliar with Protégé, it is a freely available ontology editor. If you don’t know what an ontology is, for the purposes of this post, you can think of it as a taxonomy or hierarchy of terms.

Now, one of the big issues with this project is managing this collaborative effort. There could potentially be all kinds of simultaneous editing, conflicts to resolve, and issues with simply understanding the evolution of the terminology. What I have started doing is analyzing the changes that have been made to the ontology so far.

I have a bunch of researchy-type questions I want to answer through this analysis and at first, I thought I would just write some scripts to pull out the data. However, since this is so exploratory, I thought it might be more interesting and potentially useful to develop a tool that would allow me to explore the data. I started writing a Protégé plugin that provides different analysis views based on the tracked changes of a given ontology (in this case, the ICD-11 ontology).

For one view in the plugin, I wanted to display changes made to terms over time. This way I could visually see how a term has evolved over a certain time period. I wanted this view to be similar to the NameVoyager applet. To create the area chart effect that I needed, I decided to use the JFreeChart library, which I have used a few times in past projects.

Creating the view was reasonably simple, but I did run into one annoying technical detail that took a while to resolve. With JFreeChart, charts use dataset objects to hold the data they represent. For my area charts, I used the DefaultCategoryDataset, which takes three parameters when adding a new value: the actual numeric data value, a row key, and a column key.

For example, in my case, I want to show term changes over time, so my x-axis is going to be months, y-axis the number of changes, and I want to have a different data series or category for each term. Thus, in my DefaultCategoryDataset, the row key is the term name, while the column key is the month and year.

The pesky technical issue was that I wanted my month and year to display like “July, 2009”, but to be sorted along the x-axis based on a date comparison. The row and column keys can be any Comparable class, so I created my own DateKey class that implemented the Comparable interface. For display, JFreeChart simply calls the toString method on the row and column key objects, so in my DateKey class, I format the date object the way I want in the toString method, while my compareTo method delegates to the Date class’s compareTo method.

Based on the JFreeChart API, I assumed this would work, but I kept getting multiple entries for “September, 2009” for each term displayed in the visualization. I quickly realized that the compareTo method was never being called. Eventually, after looking through the JFreeChart source code, I found that they store their row and column keys in a list. To determine whether a key already exists, they call the list’s indexOf method. This method does not use compareTo; it performs a linear O(n) search using the equals method.
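
The same behavior can be reproduced with nothing but the standard library. The following is a minimal sketch, not JFreeChart's code: MonthKey and IndexOfDemo are made-up names for illustration, and MonthKey deliberately implements Comparable without overriding equals or hashCode.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical key type (not from JFreeChart): Comparable, but it
// inherits the identity-based equals() from Object.
final class MonthKey implements Comparable<MonthKey> {
    final int year;
    final int month;

    MonthKey(int year, int month) {
        this.year = year;
        this.month = month;
    }

    @Override
    public int compareTo(MonthKey other) {
        int byYear = Integer.compare(year, other.year);
        return byYear != 0 ? byYear : Integer.compare(month, other.month);
    }
}

class IndexOfDemo {
    // compareTo() considers two keys for the same month equal (returns 0).
    static int compareDuplicate() {
        return new MonthKey(2009, 9).compareTo(new MonthKey(2009, 9));
    }

    // List.indexOf() relies on equals(), so a second instance for the
    // same month is reported as missing (-1).
    static int lookUpDuplicate() {
        List<MonthKey> keys = new ArrayList<>();
        keys.add(new MonthKey(2009, 9));
        return keys.indexOf(new MonthKey(2009, 9));
    }

    public static void main(String[] args) {
        System.out.println("compareTo: " + compareDuplicate()); // 0
        System.out.println("indexOf:   " + lookUpDuplicate());  // -1
    }
}
```

Any code that checks for an existing key with indexOf will treat the second MonthKey as new, which matches the duplicate “September, 2009” columns described above.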

So, in order to make my DateKey work, I had to implement the equals and hashCode methods. JFreeChart’s reliance on these methods rather than on compareTo seems like a bug in their library. If compareTo is never called, then why must the row and column keys implement Comparable?

Here’s a couple of screenshots of what the view turned out to look like.

Here’s the final code for my DateKey class.


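import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;

public class DateKey implements Comparable<DateKey> {
    // Formats as "July, 2009". Note that SimpleDateFormat is not thread-safe.
    private static final DateFormat formatter =
            new SimpleDateFormat("MMMM, yyyy");
    private final Date key;

    public DateKey(Date key) {
        this.key = key;
    }

    public Date getKey() {
        return key;
    }

    // JFreeChart calls toString() to produce the axis label.
    @Override
    public String toString() {
        return formatter.format(key);
    }

    // JFreeChart looks up existing keys with List.indexOf, which uses equals().
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof DateKey)) {
            return false; // also covers null
        }
        return key.equals(((DateKey) o).key);
    }

    // Kept consistent with equals().
    @Override
    public int hashCode() {
        return key.hashCode();
    }

    // Orders keys chronologically.
    @Override
    public int compareTo(DateKey o) {
        return key.compareTo(o.key);
    }
}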
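
One caveat about the class above: Date.equals compares timestamps down to the millisecond, so two DateKey objects only count as the same column if they wrap exactly the same instant. The post does not show how the dates are prepared, so the following is only a sketch under the assumption that each date is truncated to the start of its month before being wrapped. The helper name startOfMonth is made up for illustration.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

class MonthTruncation {
    // Hypothetical helper: returns midnight on the first day of the given
    // date's month, using the JVM's default time zone.
    static Date startOfMonth(Date date) {
        Calendar cal = Calendar.getInstance();
        cal.setTime(date);
        cal.set(Calendar.DAY_OF_MONTH, 1);
        cal.set(Calendar.HOUR_OF_DAY, 0);
        cal.set(Calendar.MINUTE, 0);
        cal.set(Calendar.SECOND, 0);
        cal.set(Calendar.MILLISECOND, 0);
        return cal.getTime();
    }

    public static void main(String[] args) {
        Date early = new GregorianCalendar(2009, Calendar.SEPTEMBER, 3, 9, 30).getTime();
        Date late = new GregorianCalendar(2009, Calendar.SEPTEMBER, 28, 17, 45).getTime();

        // Raw timestamps from the same month are not equal ...
        System.out.println(early.equals(late));
        // ... but their truncated versions are, so they would share one key.
        System.out.println(startOfMonth(early).equals(startOfMonth(late)));
    }
}
```

With this assumption, a key would be built as new DateKey(startOfMonth(changeDate)), where changeDate stands in for whatever timestamp the change record carries.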

About the author

Sean Falconer

I write about programming, developer relations, technology, startup life, occasionally Survivor, and really anything that interests me.