Tag Archives: Database

Class Cast Exception is a Proxy Problem – Revisited using the Visitor Pattern

In my previous post, ClassCastException is a Proxy Problem, I talked about how Hibernate optimizes the loading of objects by lazily fetching the parent class object and returning a proxy in place of the actual subclass instance, which causes a ClassCastException when the proxy is cast to the subclass. The proxy doesn’t contain the member variables or methods of the subclass, only the parent’s.

My proposed solution to the ClassCastException was to add an enum value for each subclass, store it in the parent class, and use it to determine the subclass of the object and retrieve the actual instance from the database. To keep the code clean, I created a generic method in either the parent class or the DAO. However, after implementing this solution I discovered a bug that had been introduced a long time ago: the wrong enum value had been stored for one subclass. The stored value was for SpanishCourse, so when I retrieved the object from the database I cast it to SpanishCourse when it was actually a MathCourse, causing a ClassCastException. While I could have just fixed the data in the database, I wasn’t satisfied with this pattern because it broke the OO principle of encapsulation: the parent class had to know the enum value of each of its subclasses.
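To make the failure mode concrete, here is a minimal sketch of how a stored enum value can disagree with the real class. It is not the code from the earlier post: CourseType, EnumLookupDemo, and the constructors are hypothetical names, and the database lookup is replaced by plain objects.

```java
// Hypothetical discriminator enum stored on the parent class.
enum CourseType { MATH, SPANISH }

abstract class Course {
    // The parent has to know a value for every subclass (the encapsulation problem).
    private final CourseType type;
    Course(CourseType type) { this.type = type; }
    CourseType getType() { return type; }
}

class MathCourse extends Course {
    MathCourse(CourseType storedType) { super(storedType); }
}

class SpanishCourse extends Course {
    SpanishCourse(CourseType storedType) { super(storedType); }
}

class EnumLookupDemo {
    // Picks the target class from the stored enum value alone.
    static Class<? extends Course> targetClass(Course c) {
        return c.getType() == CourseType.MATH ? MathCourse.class : SpanishCourse.class;
    }

    // Returns true when the cast chosen by the enum value does not match the real class.
    static boolean castFails(Course c) {
        try {
            targetClass(c).cast(c);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Course good = new MathCourse(CourseType.MATH);
        Course bad = new MathCourse(CourseType.SPANISH); // wrong value stored for this row
        System.out.println("matching value fails:   " + castFails(good));
        System.out.println("mismatched value fails: " + castFails(bad));
    }
}
```

The compiler cannot catch the mismatch because the link between the enum value and the class lives only in the data.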

I searched for a while and found this article: Hibernate Proxies and Polymorphism. I do think it is a good solution to the proxy problem, but it pertains to lazily initializing the member variables of a subclass and is overkill for what I needed. So I went back to thinking about my specific problem: how was I going to load the subclass instance without it being proxied?

Here are a few things to note about Hibernate:

  1. Hibernate creates the proxy inside a database transaction; the proxied object is lazily loaded and initially contains no data.  The data is fetched once a superclass method is called, as long as the database transaction is still open; otherwise you will get a LazyInitializationException.
  2. When calling a superclass’ abstract method, Hibernate will look up the implementation of the method corresponding to the appropriate subclass.
  3. Hibernate cannot resolve subclass methods that are not part of the abstract parent class.
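The three points above can be imitated without Hibernate. The sketch below uses a hand-written stand-in for a proxy (CourseProxy, describe, and getDifficulty are hypothetical names, and a Supplier plays the role of the database fetch); it only approximates what a real Hibernate proxy does.

```java
abstract class Course {
    abstract String describe();
}

class MathCourse extends Course {
    @Override
    String describe() { return "math"; }

    // Subclass-only method: unreachable through a Course-typed proxy (point 3).
    int getDifficulty() { return 3; }
}

// Stand-in for a Hibernate proxy: it extends the parent class, not MathCourse.
class CourseProxy extends Course {
    private final java.util.function.Supplier<Course> loader; // simulated fetch
    private Course target; // stays null until the first method call (point 1)

    CourseProxy(java.util.function.Supplier<Course> loader) { this.loader = loader; }

    boolean isLoaded() { return target != null; }

    @Override
    String describe() {
        if (target == null) {
            target = loader.get(); // lazy load on first use
        }
        return target.describe(); // dispatches to the subclass implementation (point 2)
    }
}

class ProxyCastDemo {
    // Returns true when casting to MathCourse throws ClassCastException.
    static boolean castToMathFails(Course c) {
        try {
            ((MathCourse) c).getDifficulty();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        CourseProxy proxy = new CourseProxy(MathCourse::new);
        System.out.println("loaded before call: " + proxy.isLoaded());
        System.out.println("describe():         " + proxy.describe());
        System.out.println("loaded after call:  " + proxy.isLoaded());
        System.out.println("cast fails:         " + castToMathFails(proxy));
    }
}
```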

Here is my proposed solution:

public abstract class Course {

    // Each subclass implements this to fetch the member variables it needs.
    public abstract void load();

    ….

}

Each subclass’ implementation of the load method will handle lazily fetching the member variables of the class that are needed.  This avoids the need for a cast altogether.  If you need to return the object you can continue to use generics.  If you need the actual instance of the subclass then you will need to implement the visitor pattern described in the Hibernate articles.  Here is a simple example:

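public interface CourseVisitor {

    // Setters are called by each Course subclass from its load method.
    public void setMathCourse(MathCourse mc);
    public void setSpanishCourse(SpanishCourse s);

    // Getters expose the real (un-proxied) instance to the caller.
    public MathCourse getMathCourse();
    public SpanishCourse getSpanishCourse();

}

public abstract class Course {

    public abstract void load(CourseVisitor visitor);

    ….

}

public class MathCourse extends Course {

    @Override
    public void load(CourseVisitor visitor) {
        // Hand the real instance to the visitor; load is void, so nothing is returned.
        visitor.setMathCourse(this);
    }

}

public class CourseVisitorImpl implements CourseVisitor {

    private MathCourse mc;
    private SpanishCourse s;

    public MathCourse getMathCourse() { return mc; }
    public SpanishCourse getSpanishCourse() { return s; }

    public void setMathCourse(MathCourse mc) { this.mc = mc; }
    public void setSpanishCourse(SpanishCourse s) { this.s = s; }

}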

The Course object will be a proxy. When the load method is called on it with a new instance of CourseVisitorImpl, the proxy is initialized and the call is delegated to the underlying subclass instance, which hands itself to the visitor. The actual instance can then be retrieved using the visitor’s getter method.
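Here is a self-contained sketch of that flow that runs without Hibernate. CourseProxy is a hypothetical hand-written stand-in for the Hibernate proxy and VisitorDemo is an illustrative driver; the other names follow the example above.

```java
interface CourseVisitor {
    void setMathCourse(MathCourse mc);
    void setSpanishCourse(SpanishCourse s);
    MathCourse getMathCourse();
    SpanishCourse getSpanishCourse();
}

abstract class Course {
    abstract void load(CourseVisitor visitor);
}

class MathCourse extends Course {
    @Override
    void load(CourseVisitor visitor) { visitor.setMathCourse(this); }
}

class SpanishCourse extends Course {
    @Override
    void load(CourseVisitor visitor) { visitor.setSpanishCourse(this); }
}

class CourseVisitorImpl implements CourseVisitor {
    private MathCourse mc;
    private SpanishCourse s;

    public void setMathCourse(MathCourse mc) { this.mc = mc; }
    public void setSpanishCourse(SpanishCourse s) { this.s = s; }
    public MathCourse getMathCourse() { return mc; }
    public SpanishCourse getSpanishCourse() { return s; }
}

// Stand-in for the Hibernate proxy: a Course that forwards to the real instance.
class CourseProxy extends Course {
    private final Course target;
    CourseProxy(Course target) { this.target = target; }

    @Override
    void load(CourseVisitor visitor) { target.load(visitor); }
}

class VisitorDemo {
    public static void main(String[] args) {
        Course course = new CourseProxy(new MathCourse()); // caller only sees a Course
        CourseVisitor visitor = new CourseVisitorImpl();
        course.load(visitor); // no cast anywhere

        MathCourse math = visitor.getMathCourse();
        System.out.println("got MathCourse:    " + (math != null));
        System.out.println("got SpanishCourse: " + (visitor.getSpanishCourse() != null));
    }
}
```

The caller never casts: the subclass decides which setter to call, so the type information comes from the language rather than from stored data.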

This solution introduces more code, and requires that the CourseVisitor be updated with new subclasses of Course.  However, the visitor pattern is the correct solution to use, because it makes use of language concepts instead of relying on data such as an enum value, which can be error prone.


Feather Weight Hibernate Objects

In my previous post, Cheap Tricks to Fullfill Your Need for Speed, I talked about how you could reduce the memory footprint of a database query by using a SQL query instead of a Hibernate query and retrieving only the columns that you need. However, there may be times when you actually want to use a POJO, and retrieving columns into an object array is insufficient. One solution is to create a feather weight Hibernate object. A feather weight Hibernate object contains a subset of the original Hibernate object’s properties (i.e. a limited set of the database columns), based on which properties you specify. You still map to the same database table, but the Hibernate mapping file lists fewer columns, so each query retrieves less data and runs faster. You can continue to use these Hibernate objects as POJOs by retrieving them through a criteria query or an HQL statement.

Another reason to use feather weight objects is when creating database tables to store data from third-party vendors. In many cases the vendors provide a lot of data fields, which you will want to store in the table because your data needs might change with your feature set. But you might not need to retrieve all the fields for the current feature set. Using feather weight Hibernate objects allows you to continue storing all the fields while retrieving only those that are most essential.
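As a rough illustration, a feather weight mapping might look like the fragment below. The class names, table name, and columns are all hypothetical; the point is only that two classes map to the same table and the lighter one lists fewer properties.

```xml
<hibernate-mapping>

  <!-- Full object: every column that is stored for the vendor feed. -->
  <class name="com.example.VendorAccount" table="VENDOR_ACCOUNT">
    <id name="id" column="ID"/>
    <property name="name" column="NAME"/>
    <property name="balance" column="BALANCE"/>
    <property name="rawVendorPayload" column="RAW_VENDOR_PAYLOAD"/>
    <property name="vendorNotes" column="VENDOR_NOTES"/>
  </class>

  <!-- Feather weight object: same table, only the columns the current feature reads. -->
  <class name="com.example.VendorAccountSummary" table="VENDOR_ACCOUNT" mutable="false">
    <id name="id" column="ID"/>
    <property name="name" column="NAME"/>
    <property name="balance" column="BALANCE"/>
  </class>

</hibernate-mapping>
```

Marking the lighter class mutable="false" is a design choice in this sketch: it treats the summary object as read-only, so writes keep going through the full mapping.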


Presentation for Code Camp ‘08

Part III. Rapid Development

I’ve covered three of the areas that are very important to becoming a web-service (latency, throughput, and quality), and I’m sure this seems daunting or overwhelming. But keep in mind I’m talking about how Mint’s code and service evolved; we didn’t do everything at once because we did not have the resources or the time. As Mint started maturing there were two areas that we stressed:

Manageability: keeping the code and database clean and extensible in case features are cut, added, or revised over time. It’s very important to start a project thinking about manageability, or how the feature will evolve within the application.

Code manageability: refactor, don’t introduce a lot of complexity, and focus on the tiered architecture to figure out where certain pieces logically fit (e.g. persistence, business logic).

Database manageability: consider how quickly a data set is going to grow when designing tables, foreign key associations, and data retrieval, as well as the frequency with which data is accessed.

Optimization: improving performance of code at runtime in order to satisfy latency and throughput requirements. While this is important, it is not something that one should focus on from the beginning.

  • Do not make architectural decisions that are too long term; do what you need for the next 6 months. Why? Because it’s a startup, and the product will continue to evolve in approximately 6 month cycles. Don’t waste time optimizing everything, or optimizing before you see demand for a feature. Remember it’s a startup; resources are scarce and time is critical.
  • e.g. Why didn’t we cache user data from the start? Initially we aggregated data nightly, because synchronizing data across nodes was difficult and we had no mechanism for centralized locking. Once that was in place, we switched to loading data on demand (during user login) and then aggregating and caching it (in the future we might show only the most recent data instead of all data).
  • Why didn’t we shard databases from the start? Sharding carries a huge amount of overhead, and the engineering resources needed to be allocated to more pressing issues.