Before we dive into the subject, I'd like to say that the good news is there's almost nothing to add: OSGi has it all already (provided we resurrect some deprecated headers!).
Bad news is, there's no such thing as a free lunch!
Let's try and state the problem briefly: what we're trying to do is
- to deploy "grids of JVMs", that is, bunches of identical nodes, from scratch (no host framework has been previously deployed for us),
- to take full advantage of the OBR to automatically resolve transitive dependencies expressed between bundles,
- to support multiple implementations of one service, either exclusive or complementary,
- to do so in an "industrial" manner: interactivity is not an option,
- to do so in a safe manner, avoiding "virus" bundles as far as we can.
What do you mean "from scratch"?
Our problem here is not to deploy applications inside an existing framework but to assemble a new framework, along with the application.
So the first of these requirements puts us in an awkward position already.
Traditionally, the OBR resolver assumes it runs in the host framework, and it only pulls from the repository the bundles it doesn't have already.
But then who took care of building that host framework? Which bundles does it contain? In which versions? How do I ensure that all my nodes are identical if I have two different processes, one for deploying the host and another one for pulling the application from the OBR? And why should my application even need an OBR resolver if resolving isn't its purpose?
Even worse, if the resolver is called independently from each host framework, and if I update the OBR before deploying a new host (say I need to scale up a few days later), I may get a different resolution for the new host. This breaks my requirement to ensure that all nodes run the exact same software.
So we initially had to modify the resolver a bit, to resolve independently of locally installed bundles, so that one JVM can use the resolver to build another JVM from scratch: that is, bring everything, starting with the OSGi framework bundles themselves (the feature is now available in the Felix resolver). The resolution can then be done only once and saved as a list of versioned bundles, which will be used as such for assembling each JVM.
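To make the "resolve once, assemble many times" idea a bit more concrete, here is a minimal Java sketch of such a saved resolution. The `ResolutionSnapshot` class, the `PinnedBundle` record and the one-line-per-bundle text format are all made up for illustration; none of this is part of the Felix resolver. The only point is that every node is assembled from the same pinned list rather than from a fresh resolution.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: stores the outcome of ONE resolution as a pinned bundle list. */
final class ResolutionSnapshot {

    /** One resolved bundle, pinned to an exact version. */
    record PinnedBundle(String symbolicName, String version) {}

    /** Serializes the list as one "symbolicName;version=x.y.z" line per bundle. */
    static String write(List<PinnedBundle> bundles) {
        StringBuilder out = new StringBuilder();
        for (PinnedBundle b : bundles) {
            out.append(b.symbolicName()).append(";version=").append(b.version()).append('\n');
        }
        return out.toString();
    }

    /** Reads the list back; every node is then assembled from exactly this list. */
    static List<PinnedBundle> read(String text) {
        List<PinnedBundle> bundles = new ArrayList<>();
        for (String line : text.split("\n")) {
            if (line.isBlank()) {
                continue;
            }
            String[] parts = line.trim().split(";version=", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed line: " + line);
            }
            bundles.add(new PinnedBundle(parts[0], parts[1]));
        }
        return bundles;
    }
}
```

Scaling up a few days later then means reading the saved list again, not calling the resolver again, so an updated OBR cannot change what the new node runs.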
The OBR addresses packages; what about services?
The second and trickiest part is to pull the service implementations from the OBR.
By nature, the OBR will only resolve the transitive dependencies, essentially expressed via the Import-Package and Export-Package manifest headers. Package wiring is a static operation, and packages can be versioned and imported with a range to pick amongst different options.
But as we all design clean service architectures built upon the service registry, our package imports all point to an empty shell: the service API. All implementations are properly hidden in private packages. Resolve this with an OBR and you'll end up with a bunch of API bundles that do nothing!
One quick answer is to just package the service implementation along with its API, but
1 - it doesn't resolve the main issue: how do we pick one (or several) implementations of a service amongst our OBR options?
2 - it's actually "dangerous", as we'll see further on (in the "virus" section!)
The two main issues with services are, well, the same as their strength! Dynamicity and Cardinality.
Cardinality: when expressing a dependency on a service, we may need:
- optionally one
- any random implementation (not recommended!)
- one specific implementation (either "hard coded" with a service filter, or to be chosen at deployment time)
- all available implementations (typically plugins)
- a chosen subset of the available implementations
Dynamicity: services can be dynamically created at runtime and, for instance, another component can be (by design) locked in a service dependency until the service is dynamically created. This is a very practical pattern for I/O, for example. But how can we express this statically in an OBR?
So let's review what standard OSGi tools we have at hand (if we dig a bit for deprecated ones!)
Import/Export-Package, we've already mentioned, is great but not enough.
Import/Export-Service: great! this looks just like what we need!
This header gives us the transitive dependencies that we need on the service layer... but:
- first bummer, Bnd doesn't automatically generate these headers like it does for packages, so all these service dependencies we've been coding in Java, annotations or XML must now be expressed a second time for the OBR
- second bummer, even if Bnd could generate these automatically, it wouldn't work! Take the dynamic service example, generate an Import-Service automatically, and you get a required dependency that will never resolve in the OBR. (We can still use Import-Service, but then we need to manually add an Export-Service in the bundle that may eventually register the service at runtime)
- third bummer, being (half way?) deprecated, the Import/Export-Service headers don't support the extended header syntax, which would allow us to specify things like cardinality or service filters. (There's an open OSGi issue on the subject: https://www.osgi.org/bugzilla/show_bug.cgi?id=70)
- and version? You'd think it would be nice for the resolver to pick the highest service version available, or a range, like it does with packages. But as several providers of a service may have different development cycles, the version for a service isn't really relevant; version 2 of provider 2 may be much newer than version 16 of provider 1.
So Import/Export-Service is very limited.
If we keep these limitations in mind though, it is a very convenient expression for the many simple in-house cases where we know there will just be one provider (but even then, the virus bundle is lurking around!).
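Since these headers are plain comma-separated lists of service interface names, a deployer can at least check up front that every imported service has some exporter among the resolved bundles. Here is a rough sketch of that check, assuming each bundle's manifest headers are available as a simple `Map`. The `ServiceHeaders` class is hypothetical, and this is not a description of what the OBR resolver does internally.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical check of Import-Service against Export-Service across a set of bundles. */
final class ServiceHeaders {

    /** Splits a plain "a.b.C, d.e.F" header value (no extended syntax) into names. */
    static Set<String> services(String header) {
        Set<String> names = new TreeSet<>();
        if (header != null) {
            for (String name : header.split(",")) {
                if (!name.isBlank()) {
                    names.add(name.trim());
                }
            }
        }
        return names;
    }

    /** Returns the imported services that no bundle in the list declares as exported. */
    static Set<String> unsatisfiedImports(List<Map<String, String>> bundleHeaders) {
        Set<String> imported = new TreeSet<>();
        Set<String> exported = new TreeSet<>();
        for (Map<String, String> headers : bundleHeaders) {
            imported.addAll(services(headers.get("Import-Service")));
            exported.addAll(services(headers.get("Export-Service")));
        }
        imported.removeAll(exported);
        return imported;
    }
}
```

The dynamic service case shows up immediately here: a bundle that only registers its service at runtime exports nothing statically, so the import stays unsatisfied until an Export-Service header is added to it by hand.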
Bundle-Category: what does this have to do with anything? it's just for display!
Well, it's not just for display, and the good news is that the OBR resolver can take a Bundle-Category constraint as input for its resolution. This is a perfect expression for the "plugins" use case, when you have many possible implementors and you can't know them all in advance.
Ideally, it would be nice to be able to express a dependency on a category directly from a bundle, but as it is, it's already quite useful.
Require-Bundle: the return of the terrible child!
We all know why Require-Bundle has been demonized: it implies the creation of one giant classloader containing all the required bundles, and as such it breaks the fine grained modularity of OSGi. OK.
But if we forget about this runtime aspect for a minute and look at it from an OBR perspective, it's just the simple expression of an explicit assembly. Exactly what we need to artificially create a "transitive dependency" to the chosen implementation bundles.
So with a simple adaptation of its use, this header becomes a very convenient tool in the OBR. Here's what we can do:
- create empty bundles, only providing a manifest with a Require-Bundle header that points to all specific service implementations that need to be specified
- tag the bundle with Bundle-Category: assembly
- use this "assembly bundle" as the root of the OBR resolution, and when deploying the bundles into an actual JVM, simply strip off all the assembly bundles, so that you don't end up with that one big classloader.
Of course, these assemblies don't have to be monolithic; it's perfectly fine to split them into smaller reusable "assembly" components. The point is that they simply add the missing glue to the OBR.
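For what it's worth, such an assembly bundle is trivial to produce with the standard `java.util.jar` classes. The sketch below builds a manifest-only jar carrying the Require-Bundle and Bundle-Category headers, and adds the small check a deployer could use to strip assemblies before installing. The `AssemblyBundles` class and its method names are invented for this example; the bundle names would come from your own OBR.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

/** Hypothetical helper that creates and recognizes manifest-only "assembly" bundles. */
final class AssemblyBundles {

    /** Builds the manifest of an empty bundle that only points at the chosen implementations. */
    static Manifest assemblyManifest(String symbolicName, String version, String... requiredBundles) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version must be set, otherwise the main attributes are not written out.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.putValue("Bundle-ManifestVersion", "2");
        main.putValue("Bundle-SymbolicName", symbolicName);
        main.putValue("Bundle-Version", version);
        main.putValue("Bundle-Category", "assembly");
        main.putValue("Require-Bundle", String.join(",", requiredBundles));
        return manifest;
    }

    /** Writes a jar that contains nothing but META-INF/MANIFEST.MF. */
    static byte[] toJar(Manifest manifest) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
            // Intentionally no entries: the manifest is the whole bundle.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** True if the bundle is tagged as an assembly and should be stripped at deployment. */
    static boolean isAssembly(Manifest manifest) {
        String categories = manifest.getMainAttributes().getValue("Bundle-Category");
        if (categories == null) {
            return false;
        }
        for (String category : categories.split(",")) {
            if (category.trim().equals("assembly")) {
                return true;
            }
        }
        return false;
    }
}
```

For instance, `assemblyManifest("my.assembly", "1.0.0", "org.mortbay.jetty.server", "my.web.application")` yields a root bundle for the resolution, and `isAssembly` lets the deployer skip it when installing into the actual JVM.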
The "Virus bundle"! Now what bundles should actually be started?!
Wow. We're almost there!
But now consider the following use case (which is exactly how I originally bumped into the issue).
I'm trying to assemble a Jetty web server with a web application, and my OBR happens to contain this:
- org.mortbay.jetty.server.jar
- org.mortbay.jetty.servlet-api.jar
- my.web.application.war
- my.assembly.jar (Require-Bundle: org.mortbay.jetty.server, my.web.application)
- and, org.apache.felix.http.jetty.jar
It turns out that org.apache.felix.http.jetty.jar also embeds, and exports, the servlet APIs.
So when I ask the resolver to resolve my.assembly:
- it pulls org.mortbay.jetty.server and my.web.application as requested,
- and following the package imports, it needs to pull the servlet APIs
With my luck, of course, it always picked org.apache.felix.http.jetty as the exporter of the servlet API, not the plain library bundle.
So I deploy this, start all the bundles, and I end up with a port conflict, with both instances of Jetty (mortbay and felix) trying to start concurrently!
This is what I call the virus bundle: I never meant to bring Felix's Jetty (let alone start it!), but it was dragged along by transitivity, for the packages it exports.
What? All this, and it just doesn't work!
We need finer control over which bundles are started and which bundles should only be installed. And we certainly can't go with an "Are you sure?" dialog box each time a bundle starts!
So we'll treat Require-Bundle and Import-Service as statements that the corresponding bundle should be started. We can simply scan the resolved bundles to automatically generate a whitelist of bundles to be started, in the form of an OSGi/LDAP filter:
- each Require-Bundle translates into a Bundle-SymbolicName appended to the whitelist,
- each Import-Service translates into an Export-Service appended to the whitelist.
This way, if an "activable" bundle is dragged along only to be used as a library, it just won't be started.
Note that, even though convenient, the Import-Service rule should only apply to the well known in-house services: in an uncontrolled OBR, it is quite a breach and should probably be removed.
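The two rules above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the `StartWhitelist` class is hypothetical, each resolved bundle is represented by a `Map` of its manifest headers, and the deployer is assumed to match `Export-Service` as a multi-valued attribute (one value per exported service).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical generator of the "bundles to start" whitelist as an LDAP-style filter. */
final class StartWhitelist {

    /** Returns an OR filter over all whitelisted bundles, or empty if nothing should start. */
    static Optional<String> buildFilter(List<Map<String, String>> resolvedBundleHeaders) {
        Set<String> clauses = new LinkedHashSet<>(); // keeps order, drops duplicates
        for (Map<String, String> headers : resolvedBundleHeaders) {
            for (String bundle : names(headers.get("Require-Bundle"))) {
                clauses.add("(Bundle-SymbolicName=" + bundle + ")");
            }
            for (String service : names(headers.get("Import-Service"))) {
                clauses.add("(Export-Service=" + service + ")");
            }
        }
        if (clauses.isEmpty()) {
            return Optional.empty(); // "(|)" would not be a valid filter
        }
        return Optional.of("(|" + String.join("", clauses) + ")");
    }

    /**
     * Extracts the names from a header value, dropping any ";param=value" part.
     * Commas inside double quotes (e.g. a version range) do not split clauses.
     * Names are assumed to contain no characters that need escaping in a filter.
     */
    static List<String> names(String header) {
        List<String> result = new ArrayList<>();
        if (header == null) {
            return result;
        }
        StringBuilder clause = new StringBuilder();
        boolean quoted = false;
        for (char c : (header + ",").toCharArray()) {
            if (c == '"') {
                quoted = !quoted;
            }
            if (c == ',' && !quoted) {
                String name = clause.toString().split(";", 2)[0].trim();
                if (!name.isEmpty()) {
                    result.add(name);
                }
                clause.setLength(0);
            } else {
                clause.append(c);
            }
        }
        return result;
    }
}
```

In the Jetty example, my.assembly contributes org.mortbay.jetty.server and my.web.application to the filter, while org.apache.felix.http.jetty, pulled in only for its exported packages, matches nothing and stays installed but not started. On a real framework, the resulting string could presumably be parsed with `FrameworkUtil.createFilter` and matched against each bundle's headers.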
Conclusion?
My point here is that there is not much to invent: provided we add some simple conventions and a little intelligence in the deployer, the OBR has everything we need to assemble clusters of JVMs (actually not only JVMs but other processes as well).
On the other hand, things could be easier, through standard conventions and better support of the extended header syntax for all headers.