Saturday, October 30, 2010

GrouponCheck.com

I've been playing around with App Engine for a while, and I've finally decided to publish something. My website pulls deals from Groupon for all cities and displays them at http://www.grouponcheck.com.

Obviously it's nothing much to look at right now, but hopefully I'll get some time to spruce it up. If you visit the site, it shows deals from just a few cities, not from all cities as promised above. It turns out Google imposes a 30-second limit on requests, and since I load the JSON deals from Groupon in a cron-triggered servlet request, the job hits that limit before it gets through every city. I might have to look at task queues and see if they offer a solution.
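One way the work could be split up, whatever ends up running it, is to cut the city list into small batches so that no single request has to fetch everything. The sketch below is only that partitioning step in plain Java. The class name, city names, and batch size are made up for illustration, and it does not call the App Engine task queue API at all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CityBatcher {

    // Splits the full city list into consecutive batches of at most batchSize entries.
    static List<List<String>> partition(List<String> cities, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < cities.size(); start += batchSize) {
            int end = Math.min(start + batchSize, cities.size());
            batches.add(new ArrayList<>(cities.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Hypothetical city identifiers, used only for this example.
        List<String> cities = Arrays.asList("chicago", "boston", "seattle", "austin", "denver");
        for (List<String> batch : partition(cities, 2)) {
            // In a real deployment each batch would be handed to its own short request or task;
            // here it is only printed.
            System.out.println("would process separately: " + batch);
        }
    }
}
```

Each batch then only needs to finish within the request limit on its own, rather than the whole list finishing in one go.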

Another, less obvious, stumbling block was setting up masked domain forwarding. I bought my domain through Google Apps (via eNom), and domain forwarding there isn't as straightforward as it was for my other domain (anoopkulkarni.com) on Yahoo Small Business domains. I finally happened upon a knol that contained the required instructions; the main part is:


  • Add a Host Record with @ as the Host Name, the url of the home page of your Site as the Address, and URL Frame as the Record Type
  • Change or add a CNAME of www so it points to the symbol @


If Google is serious about Google Apps and Google domains, they might look into making their services slightly more straightforward to use.

Saturday, October 16, 2010

Inter-vm RPC communication

Service Oriented Architecture

Borrowing the wikipedia definition: Service-oriented architecture (SOA) is a flexible set of design principles used during the phases of systems development and integration in computing. A system based on a SOA architecture will provide a loosely-coupled suite of services that can be used within multiple separate systems from several business domains.
SOA also generally provides a way for consumers of services, such as web-based applications, to be aware of available SOA-based services. For example, several disparate departments within a company may develop and deploy SOA services in different implementation languages; their respective clients will benefit from a well understood, well defined interface to access them.
Now, if we follow SOA principles, one of the major issues to solve is how best to handle inter-JVM data communication.
Response time grows with the amount of data on the wire, so one thing to consider is how compactly each communication protocol encodes the same data.

Caucho has a nice article on RMI, Hessian, Spring and CORBA communication (technically JSON can also be used as a SOA protocol, but they don't have a spec comparison for it).
More details on the study can be found in the protocol comparison.
As you can see, for sending back the same data, Hessian and RMI comprehensively beat SOAP and CORBA, whose larger payloads slow down the response (this may not matter much on a gigabit network, but the overhead compounds as the size of the data grows).
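To get a feel for how much the encoding alone can change the payload size, here is a small sketch that writes the same object twice: once with Java's built-in object serialization, and once as bare field values. It does not use Hessian, RMI, SOAP or CORBA, and the `Account` class and its fields are hypothetical, so take it as an illustration of the general effect rather than a reproduction of the Caucho numbers.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class PayloadSizeDemo {

    // Hypothetical value object used only for this comparison.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        final long id;
        final String owner;
        final double balance;

        Account(long id, String owner, double balance) {
            this.id = id;
            this.owner = owner;
            this.balance = balance;
        }
    }

    // Size in bytes when written with Java's built-in object serialization,
    // which also writes class and field metadata.
    static int javaSerializedSize(Account a) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(a);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Size in bytes when only the field values are written, with no metadata.
    static int compactSize(Account a) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(a.id);        // 8 bytes
            out.writeUTF(a.owner);      // 2-byte length prefix + encoded characters
            out.writeDouble(a.balance); // 8 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        Account a = new Account(42L, "alice", 100.5);
        System.out.println("java serialization: " + javaSerializedSize(a) + " bytes");
        System.out.println("compact encoding:   " + compactSize(a) + " bytes");
    }
}
```

The compact form is smaller because the reader is assumed to already know the field layout. That trade-off between self-describing and compact formats runs through the rest of this post.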

Spring remoting

If you don't want to go all the way down to socket programming, and would rather keep an abstraction layer over the actual communication protocol, Spring has a few remoting options.
Let's take the example of Hessian (smallest memory footprint). If we build services with Spring in a single JVM, the normal Spring bean definition would be
<bean id="accountService" class="example.AccountServiceImpl">
    <!-- any additional properties -->
</bean>
Now if we decide to remote this using Spring remoting, the new server-side definition will be
<bean id="accountService" class="example.AccountServiceImpl">
    <!-- any additional properties -->
</bean>
 
<bean name="/AccountService" class="org.springframework.remoting.caucho.HessianServiceExporter">
    <property name="service" ref="accountService"/>
    <property name="serviceInterface" value="example.AccountService"/>
</bean>
The client-side definition will then be
<bean id="accountService" class="org.springframework.remoting.caucho.HessianProxyFactoryBean">
    <property name="serviceUrl" value="http://remotehost:8080/remoting/AccountService"/>
    <property name="serviceInterface" value="example.AccountService"/>
</bean>
With minimal configuration change, Spring will support either intra-VM or inter-VM communication over HTTP. Of course, for additional speed, the remote host should ideally sit on the intranet, to reduce the number of IP hops from client to server.
Another obvious advantage of Hessian compared to RMI is that it can be called by non-Java clients, making it much more suitable for heterogeneous environments. There are some issues concerning serialization of lazily initialized Hibernate objects.
Hessian Supported Languages: C++,C#,D,Erlang,Flash,Java,Python,PHP and Ruby
Website : spring remoting
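The reason the client code does not change when the bean definition is swapped is that callers only ever see the service interface, and the remoting bean hands them a proxy that implements it. The sketch below shows that idea with the JDK's own `java.lang.reflect.Proxy`. It is not Spring or Hessian code: the `ownerOf` method is invented for the example, and the proxy here simply forwards to a local object, where a real remoting proxy would serialize the call and send it over HTTP.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class RemotingProxySketch {

    // Hypothetical service interface, standing in for example.AccountService.
    public interface AccountService {
        String ownerOf(long accountId);
    }

    // Local implementation, standing in for example.AccountServiceImpl.
    static class AccountServiceImpl implements AccountService {
        @Override
        public String ownerOf(long accountId) {
            return "owner-" + accountId;
        }
    }

    // Builds a proxy that implements AccountService, records each method name in log,
    // and forwards the call to target.
    static AccountService proxyFor(AccountService target, StringBuilder log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.append(method.getName()).append(';');
            return method.invoke(target, args);
        };
        return (AccountService) Proxy.newProxyInstance(
                AccountService.class.getClassLoader(),
                new Class<?>[] {AccountService.class},
                handler);
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        AccountService direct = new AccountServiceImpl();
        AccountService viaProxy = proxyFor(direct, log);

        // The calling code is identical in both cases; only the wiring differs.
        System.out.println(direct.ownerOf(7));
        System.out.println(viaProxy.ownerOf(7));
        System.out.println("calls seen by proxy: " + log);
    }
}
```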

TCP/UDP communication

If your app can write to TCP/UDP sockets, there are a few other open source protocols that provide faster serialization/deserialization as well as a smaller memory footprint.

Kryo

Kryo has the advantage of serializing objects at runtime: classes are registered in code, with no separate schema file to maintain.
Kryo kryo = new Kryo();
kryo.register(SomeClass.class);
// ...
SomeClass someObject = new SomeClass(...);
kryo.writeObject(buffer, someObject);
Once serialized, you might have to resort to socket-level programming to pass the serialized object over the network or, as an alternative, use KryoNet (built on top of Kryo) for client/server communication. Of course, since both client and server need to implement Kryo serialization/deserialization, you are tying yourself to a Java-based application environment.
Supported Languages: Java
Website : Kryo
Website : Kryonet
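If you do go the socket-level route, the main thing to get right is framing: the receiver has to know where one serialized object ends and the next begins. A common approach is to prefix each payload with its length. The sketch below shows that with plain JDK streams. It is not part of Kryo or KryoNet, the class and method names are made up, and the payload is just a string's bytes standing in for whatever the serializer produced.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class FramedMessages {

    // Writes a 4-byte big-endian length followed by the payload bytes.
    static void writeFrame(OutputStream out, byte[] payload) {
        try {
            DataOutputStream data = new DataOutputStream(out);
            data.writeInt(payload.length);
            data.write(payload);
            data.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads one frame written by writeFrame and returns its payload.
    static byte[] readFrame(InputStream in) {
        try {
            DataInputStream data = new DataInputStream(in);
            int length = data.readInt();
            if (length < 0) {
                throw new IOException("negative frame length: " + length);
            }
            byte[] payload = new byte[length];
            data.readFully(payload); // blocks until the whole payload has arrived
            return payload;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // In-memory streams stand in for socket.getOutputStream() / socket.getInputStream().
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeFrame(wire, "first object".getBytes(StandardCharsets.UTF_8));
        writeFrame(wire, "second object".getBytes(StandardCharsets.UTF_8));

        ByteArrayInputStream in = new ByteArrayInputStream(wire.toByteArray());
        System.out.println(new String(readFrame(in), StandardCharsets.UTF_8));
        System.out.println(new String(readFrame(in), StandardCharsets.UTF_8));
    }
}
```

With a real socket the same two methods would be called on the socket's streams, and the bytes passed to `writeFrame` would be the serializer's output.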

ProtoStuff runtime

Google uses ProtoBuf for most of its inter-VM communication, and it is one of the best Java serialization libraries. Unfortunately it requires a .proto file that describes the data structure, which gets difficult to maintain over time. Protostuff runtime (which builds on protobuf) allows your existing POJOs to be written to different formats (protobuf, JSON, XML etc.) at runtime.
// json serialize
boolean numeric = true;
byte[] json = JsonIOUtil.toByteArray(foo, schema, numeric);
 
// json deserialize
Foo f = new Foo();
JsonIOUtil.mergeFrom(json, f, schema, numeric);
ProtoBuf Supported Languages: Action Script,C/C++,C#/.NET/WCF/VB,Clojure,Common Lisp,D,Erlang,Go,Haskell,Java,Lua,Mercury,Objective C,Perl,PHP,Python,R,Ruby,Scala
ProtoStuff Runtime Supported Languages: Java
Website : protostuff

Apache Thrift

Thrift works similarly to Google's ProtoBuf in that it requires configuration ahead of runtime: a .thrift file specifying the Thrift interface. However, it also provides the client/server communication layer, with its own server and socket implementations.
// Server side
TServerSocket serverTransport = new TServerSocket(somePort);
TimeServer.Processor processor = new TimeServer.Processor(someImpl()); // Impl that the client will call
TBinaryProtocol.Factory protFactory = new TBinaryProtocol.Factory(true, true);
TServer server = new TThreadPoolServer(processor, serverTransport, protFactory);
server.serve();

// Client side
TTransport transport = new TSocket("localhost", 7911);
TProtocol protocol = new TBinaryProtocol(transport);
TimeServer.Client client = new TimeServer.Client(protocol);
transport.open();
// client can now call the impl function
Unlike Kryo, Hessian and Proto*, which serialize entire objects, Thrift defines its data in terms of a few common base types:
bool: A boolean value (true or false)
byte: An 8-bit signed integer
i16: A 16-bit signed integer
i32: A 32-bit signed integer
i64: A 64-bit signed integer
double: A 64-bit floating point number
string: A text string encoded using UTF-8 encoding
Supported Languages : C++, C#, Erlang, Haskell, Java, Objective C/Cocoa, OCaml, Perl, PHP, Python and Ruby
Website : Apache Thrift