Reuse applied to environmental software: design of an adaptable system for water quality assessment and monitoring
DATE: 2013-05-30
UNIVERSAL IDENTIFIER: http://hdl.handle.net/11093/86
UNESCO SUBJECT: 1203.17 Informática
DOCUMENT TYPE: doctoralThesis
ABSTRACT
One major challenge for the future of software engineering is to make it easier to connect different systems and to enable collaboration between different programs, domains, or people and organizations within the same domain. This may include, for example, defining workflows across system and program boundaries and, especially in science, covering all steps of the scientific research chain. A technology that may help meet this challenge is the service paradigm of software engineering, service-oriented architecture (SOA), which is useful and promising not only for the business sector where it originated but also for a wide range of other domains.
Our contribution towards that goal is a case study in the environmental modeling software domain, in which we analyze how to offer water quality assessment models in a service-oriented manner. With the lessons learned in this case study, we contribute to the research that is still needed in this area to determine the best practices, rules and patterns for effectively implementing the SOA paradigm in scientific domains such as environmental modeling. A general problem remains the complexity of the SOA world and the sheer number of technologies one has to become familiar with in order to design a SOA-based software system, or even just to get an overview. Many authors complain about this. As an example, consider this statement by Dave Thomas:
“In the rush to create a middleware platform dependency larger than your current legacy platform dependency, vendors and their well-paid industry analysts push a plethora of complex technologies on organizations that just want to run their businesses. Some of the most talented technical experts I know find the complexity overwhelming when there are at least 5 different ways to do the same thing”. Finally, he concludes in the summary: “There is just too much stuff in the SOA for most developers to absorb. It is difficult to understand what is real and what is hype. Beyond that, it is difficult to understand the performance and inter-operability of different implementations approaches. Developers need guidelines for when and where to use XML, objects and BPEL and for how to make them play together”.
Apart from gaining an overview of the confusing number of competing technologies in the SOA domain, one also has to find out the limits of what the current state of development in this area supports. For instance, the first problem that arose in our case was how to compose services into flexible workflows in an easy and natural way. We seemed to be able to choose from a variety of service composition languages and composition systems for that purpose, and scientific workflow management systems appeared to be the obvious choice. We noticed, however, that existing ones do not yet fit well into the SOA paradigm for various reasons; for example, they cannot be accessed as services from outside in order to integrate their workflows into one's own programs.
Although most of them are indeed capable of integrating services, they are nevertheless not designed to fit well into a SOA. A second problem arose when we tried to write services that are independent of concrete data sources: we found that, despite the existence of numerous technologies in this problem field, none of them permits including new data sources just by defining new mappings, as was the initial idea.
A third difficult question was how to incorporate the user into the service paradigm properly. We need to create highly interactive workflows, whereas WS-BPEL, the most frequently used language for service orchestration, is best suited to automated workflows with little user interaction, as needed in the business sector. To facilitate the understanding of how different SOA technologies and problem areas are related, where the problems mentioned above fit in, and where to look for a solution, this thesis elaborates a theoretical foundation of SOA in which the basic principles of SOA are grouped into the following two categories: technology-agnostic design and separation of concerns (SoC) techniques.
Technology-agnostic design means using SOA's outstanding capability to bridge differences between systems and platforms in order to overcome platform dependencies, which in most cases is the main reason for choosing a SOA-based approach. In this work we make clear that, because every design decision may introduce unwanted new platform dependencies, a design that is unaware of the dependency problem easily undoes this main advantage of SOA and introduces new dependencies on workflow management systems, databases, concrete linked services and more. For a true SOA, it is necessary to carefully minimize dependencies at all levels.
Technology-agnostic design also covers databases and data structures and the dependencies deriving from them. Many of the currently unsolved problems are a consequence of the data independence problem, and a solution to it would perhaps be the biggest step towards easier service cooperation: a uniform and largely automated forward and backward transformation of data from any kind of data source into the required form. Although we cannot contribute significantly to solving this problem in this work, our analysis concludes that true data independence requires two transformation steps: first to compensate for different database models, and second to compensate for data structured in different ways. None of the existing data independence technologies, however, fully supports both transformation steps. Existing technologies therefore allow only a certain degree of data independence, for instance through design patterns such as data access objects (DAO), which we ultimately used for our implementation. Here, all data access is performed via data access objects and encapsulated in a data access layer (DAL). A change of the data source requires rewriting the DAL but does not affect the rest of the system.
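The DAO idea described above can be illustrated with a minimal Python sketch. It is not taken from the thesis implementation: the names Measurement, MeasurementDao, InMemoryMeasurementDao, CsvMeasurementDao and mean_value, as well as the sample data, are hypothetical and chosen only for illustration. The point it shows is that the assessment code depends solely on the DAO interface, so replacing the data source means rewriting only the data access layer.

```python
from __future__ import annotations

import csv
import io
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Hypothetical record type used by the rest of the system."""
    station: str
    parameter: str
    value: float


class MeasurementDao(ABC):
    """Data access object interface: the only thing client code sees."""

    @abstractmethod
    def find_by_station(self, station: str) -> list[Measurement]:
        ...


class InMemoryMeasurementDao(MeasurementDao):
    """One possible DAL implementation backed by a Python list."""

    def __init__(self, rows: list[Measurement]) -> None:
        self._rows = list(rows)

    def find_by_station(self, station: str) -> list[Measurement]:
        return [m for m in self._rows if m.station == station]


class CsvMeasurementDao(MeasurementDao):
    """Alternative DAL implementation that reads CSV text (assumed columns:
    station, parameter, value)."""

    def __init__(self, csv_text: str) -> None:
        self._csv_text = csv_text

    def find_by_station(self, station: str) -> list[Measurement]:
        reader = csv.DictReader(io.StringIO(self._csv_text))
        return [
            Measurement(row["station"], row["parameter"], float(row["value"]))
            for row in reader
            if row["station"] == station
        ]


def mean_value(dao: MeasurementDao, station: str, parameter: str) -> float:
    """Client code: unaware of where the data actually comes from."""
    values = [m.value for m in dao.find_by_station(station)
              if m.parameter == parameter]
    if not values:
        raise ValueError(f"no {parameter} data for station {station}")
    return sum(values) / len(values)


# Made-up sample data, provided through two different data sources.
memory_dao = InMemoryMeasurementDao([
    Measurement("S1", "oxygen", 7.0),
    Measurement("S1", "oxygen", 8.0),
    Measurement("S2", "oxygen", 3.0),
])
csv_dao = CsvMeasurementDao(
    "station,parameter,value\nS1,oxygen,7.0\nS1,oxygen,8.0\nS2,oxygen,3.0\n"
)
```

Calling mean_value(memory_dao, "S1", "oxygen") and mean_value(csv_dao, "S1", "oxygen") runs the same client code against both sources. Note that this only gives the limited degree of data independence discussed above: each new source still needs a hand-written DAO rather than a declarative mapping.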
The second category of SOA principles comprises various SoC techniques that lend themselves to use in a service-oriented architecture: a layered design (i.e. the SOA stack), the separation of interface and implementation(s), and the separation of components and composition.
When trying to design a SOA stack, one will notice that there is no consensus on which layers a SOA stack should consist of. The many different designs and approaches reflect different views and philosophies of the SOA idea as well as the different conditions found in different companies and projects. This includes, for example, the role of the enterprise service bus (ESB). It is only clear that SOA extends the traditional two- or three-layered architecture with additional layers and a more fine-grained layer design. It can be expected that the ongoing development of WS-* standards may change the view of this topic, as they may establish standard solutions for problems that today still have to be solved in an ad hoc way for every SOA implementation. In the SOA stack we implemented, we concentrated mainly on the novel orchestration layer, where service composition takes place, and on the often neglected user interface layer, which with current technologies cannot be integrated into the SOA stack as seamlessly as other services. In contrast to traditional programming, the user interface, like all other parts of the SOA, should be loosely coupled to the system. The problem here is that the user interface differs from services in its platform-dependent nature. To solve this problem, we propose in this work the development and use of some kind of service contract for the user interface. A basic example is provided by the data viewers developed for NORTIFlow.
The question of separating interfaces and implementations concerns how purely interface-based programming can and should be realized. Only interface-based programming makes it possible to switch components and program parts later and thus contributes to the desired flexibility, adaptability and fault tolerance. This problem area furthermore includes finding a standard way of describing services with all the necessary metadata and semantic information, so that they can easily be published in publicly available distributed repositories, from which a suitable service can easily be located and used. The same applies to complete or partial workflow definitions.
In NORTIFlow we adopt the interface-based approach by describing workflows at the top level in a purely interface-based way. Later, concrete services that implement the needed functionality are mapped to the interface nodes, which can thus easily be exchanged if alternative mappings are available. The second problem, finding and publishing services and workflows, is not tackled any further in this work; we only argue for using established SOA technologies such as UDDI for that purpose.
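The mapping of concrete services to interface nodes can be pictured with the following Python sketch. It does not reproduce NORTIFlow's actual design or API, which the abstract does not describe: the Workflow class, the node names "load" and "assess", the three service functions and the thresholds are all hypothetical, and the thresholds are not real water quality criteria. The sketch only shows a workflow defined purely by named interface nodes, with implementations bound at run time so that one mapping can be swapped for another.

```python
from typing import Callable, Dict, List

# A service is modelled here as a plain function from a data dict to a data dict.
Service = Callable[[dict], dict]


class Workflow:
    """A workflow described only by an ordered list of interface node names."""

    def __init__(self, interface_nodes: List[str]) -> None:
        self.interface_nodes = list(interface_nodes)

    def run(self, bindings: Dict[str, Service], data: dict) -> dict:
        """Execute the nodes in order, using the services bound to them."""
        missing = [n for n in self.interface_nodes if n not in bindings]
        if missing:
            raise KeyError(f"no service mapped to interface node(s): {missing}")
        for node in self.interface_nodes:
            data = bindings[node](data)
        return data


# Hypothetical concrete services (made-up data and thresholds).
def load_samples(data: dict) -> dict:
    return {**data, "oxygen_mg_l": [6.0, 8.0]}


def assess_strict(data: dict) -> dict:
    mean = sum(data["oxygen_mg_l"]) / len(data["oxygen_mg_l"])
    return {**data, "quality": "good" if mean >= 8.0 else "poor"}


def assess_lenient(data: dict) -> dict:
    mean = sum(data["oxygen_mg_l"]) / len(data["oxygen_mg_l"])
    return {**data, "quality": "good" if mean >= 5.0 else "poor"}


# The workflow itself names only interfaces, never implementations.
workflow = Workflow(["load", "assess"])

strict_result = workflow.run({"load": load_samples, "assess": assess_strict}, {})
lenient_result = workflow.run({"load": load_samples, "assess": assess_lenient}, {})
```

The two runs use the same workflow definition and differ only in the bindings dictionary, which is the exchange of mappings that the interface-based description is meant to allow.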
Finally, the problem area of separating components and composition deals with the already mentioned search for an appropriate composition mechanism for the orchestration layer. Here, this thesis still argues for graphical composition mechanisms as used by the (scientific) workflow management systems we originally wanted to use, which allow everyone, not only expert programmers, to define workflows. To this end, NORTIFlow was designed as a lightweight scientific workflow management system that can be integrated into a SOA thanks to its service-oriented design. NORTIFlow was subsequently used to launch the Norti-Online web portal, which is well prepared for future extensions: the addition of new assessment models, workflows, viewers and data sources.