Thursday, August 16, 2007

Optimal Architectures for Consumer and Enterprise Mashups

It’s so predictable. Whenever there’s a hugely popular technology concept, everyone jumps on the bandwagon and ends up confusing the industry. This year’s winner is “Mashups”. Popularized by Google Maps, it’s now all the rage. Even the Wall Street Journal has covered the topic. Unfortunately, most of the mashup talk seems focused on homogeneous data sources that are easily connected to each other. In short, nothing at all like real-world enterprise architectures. As JackBe's own VP of Engineering, Deepak Alur, put it in Mashing the Enterprise Service Cloud, 'once you step inside an enterprise, things can be a lot different for mashups'. He couldn't be more right.

Deepak worked through two issues that differentiate enterprise and non-enterprise (or 'consumer') mashups: the disparate nature of enterprise services and the common enterprise requirement for governance. Others have added to this discussion, like SOA expert David Linthicum, with a useful walkthrough of security strategies for mashups in the enterprise. All of these must be supported by any mashup software that claims to be 'enterprise-grade'. It feels like we're finally getting a good set of 'founding principles' for mashups in the enterprise. To this list I'd like to propose another principle: the proper architecture for an enterprise mashup.

Let's start with a common mashup: two data sources joined and subsequently visualized in a map widget. Popular map widgets, such as Google Maps and Yahoo Maps, are self-contained and embeddable Ajax components that can display any address, business, point of interest or driving directions. But say you want your sales prospects’ office locations (stored in a Salesforce.com database) displayed on a Google Map. To complicate matters, you also want to retrieve your current customers’ office locations (stored in an internally managed Siebel database) and show them alongside your prospects, color-coded by industry. You (or more accurately, a developer) must write a significant amount of JavaScript code to retrieve the prospects and display them on the map. In this particular example all this “mashing” happens in the browser, as shown below.

Mashing in the browser works well when information can be retrieved and directly displayed on the map. However, things get tougher when we need to “integrate” data from multiple sources before we display it on a map, because the more integration we have to do in the browser, the more custom code we need to write. That means you not only have to write JavaScript code to get the data, but you also have to write code to analyze it, integrate it and finally place it on the map. In essence you’ve turned the browser into a mini-integration engine. This is not necessarily a great idea for a wide variety of reasons, particularly the processing typically required to perform such integration tasks. There is a solution to this problem though; it’s called Enterprise Mashups, and it lives on the server, not in the browser.

Enterprise mashups integrate or “mash” mostly on the server, where the real processing power typically resides. Let me illustrate this by expanding on the prior example. Suppose we want to take all our customers (in Siebel) and our prospects (in Salesforce) and detect which ones are competitors with the help of Hoover’s web services. We could do just what we did in the consumer example, plus write more code to get a list of competitors from Hoover’s for each prospect and customer. In that case, there are large amounts of data traveling to the browser, and the data integration is supposed to happen in the browser as well. This is a very tall order for a tool that was originally intended to view static HTML. The efficient solution to this need is to “mash up” the data on the server, as shown below. This is the core of Enterprise Mashups; they do all the integration work on the server, where the processing power is.
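To make the server-side idea concrete, here is a minimal sketch in Python. It is not JackBe's implementation or any vendor's API: the Account type, the sample records and the lookup_competitors function are all hypothetical stand-ins for calls to Siebel, Salesforce and Hoover’s. The point is only that the server does the joining and sends the browser a small list of ready-to-plot markers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    name: str
    industry: str
    city: str
    kind: str  # "customer" or "prospect"


# Hypothetical in-memory stand-ins for the Siebel and Salesforce queries.
CUSTOMERS = [
    Account("Acme", "Manufacturing", "Chicago", "customer"),
    Account("Initech", "Software", "Austin", "customer"),
]
PROSPECTS = [
    Account("Globex", "Manufacturing", "Detroit", "prospect"),
]

# Hypothetical stand-in for a competitor lookup service.
KNOWN_COMPETITORS = {
    "Acme": {"Globex", "Umbrella"},
    "Globex": {"Acme"},
    "Initech": set(),
}


def lookup_competitors(name):
    """Return the set of competitor names for one company (assumed service)."""
    return KNOWN_COMPETITORS.get(name, set())


def build_map_markers(customers, prospects, competitors_of):
    """Join both account lists with competitor data on the server.

    Only competitors that are themselves one of our customers or prospects
    are kept, so the browser receives a compact, ready-to-plot payload.
    """
    accounts = list(customers) + list(prospects)
    our_names = {a.name for a in accounts}
    markers = []
    for account in accounts:
        rivals = competitors_of(account.name) & (our_names - {account.name})
        markers.append({
            "name": account.name,
            "city": account.city,
            "kind": account.kind,
            "industry": account.industry,
            "competes_with": sorted(rivals),
        })
    return markers
```

Calling build_map_markers(CUSTOMERS, PROSPECTS, lookup_competitors) yields one dictionary per account. The browser then only has to loop over that list and drop pins on the map; it never sees the raw Siebel, Salesforce or competitor data.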

Are server-driven mashups required for the enterprise? No. I expect some trivial mashing could be successfully completed in a browser-driven architecture. But even the simple example above becomes almost impossible to achieve once you've factored in the security and reliability requirements common to an enterprise, or once you allow for more complex data needs like a filter-intersect-join mashup with 10,000 data elements from half a dozen separate data sources.
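For readers who want a feel for what a filter-intersect-join involves, here is one possible reading of the term, sketched in Python. The function name, the account_id key and the sample records are assumptions made for illustration, not part of any product: each source is filtered, only keys present in every source are kept, and the matching records are merged into one row.

```python
def filter_intersect_join(sources, keep, key="account_id"):
    """Filter each source, keep keys common to all, then merge the records.

    sources: a list of record lists, each record a dict containing `key`.
    keep:    a predicate applied to every record before intersecting.
    """
    # Filter step: index the surviving records of each source by key.
    indexed = [{r[key]: r for r in records if keep(r)} for records in sources]
    if not indexed:
        return []

    # Intersect step: only keys that survive in every source remain.
    common = set(indexed[0])
    for index in indexed[1:]:
        common &= set(index)

    # Join step: merge the fields of matching records into a single row.
    joined = []
    for k in sorted(common):
        row = {}
        for index in indexed:
            row.update(index[k])
        joined.append(row)
    return joined


# Hypothetical sample data from two separate systems.
CRM = [
    {"account_id": 1, "name": "Acme", "region": "US"},
    {"account_id": 2, "name": "Globex", "region": "EU"},
    {"account_id": 3, "name": "Initech", "region": "US"},
]
BILLING = [
    {"account_id": 1, "region": "US", "balance": 120.0},
    {"account_id": 3, "region": "US", "balance": 0.0},
    {"account_id": 4, "region": "US", "balance": 75.5},
]

US_ACCOUNTS = filter_intersect_join(
    [CRM, BILLING], keep=lambda r: r["region"] == "US"
)
```

With two small lists this is trivial anywhere. With half a dozen sources and thousands of records, each needing its own credentials and error handling, it is exactly the kind of work better done on the server than in the browser.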

The road to successful enterprise mashup solutions is similar to that of most enterprise software: a proper, enterprise-capable architecture is a must.

3 comments:

Edwin said...

Hey John,

How is the server mashup component you are describing different from a more traditional data virtualization layer (BEA Liquid Data, Composite Software and others)?

Cheers,
Edwin Khodabakchian

John Crupi said...

Hey Edwin,

Traditional data virtualization focuses more on IT-driven integrations. Or more specifically, the big stuff. Mashups are directed toward the user. They provide a user-driven, ad hoc capability to integrate disparate data from Web-based sources.

The idea behind the blog is that consumer mashups are done in the browser and enterprise mashups are done mostly in the server.

jc

Bruce H said...

Hey, just found this blog - good stuff! Your comments about using a server-side engine to perform aggregation and correlation of content headed for the mashup are dead on. Take a look at one of our examples of this pattern in action here:

http://hardtack.osgcorp.com/osg-hardtack/

The idea was that the sales side of Real Estate is tough to manage and understand from a macro level because all of the data is hidden away in MLS databases that are locally controlled (hundreds of them) and they don't all roll up into a single source. So we built components to mine the MLSs and aggregate the information so we could do statistics on it.

The posts in this blog show you guys at JackBe are on a good path. Keep it up!

Bruce Henderson
OSG