Gonzalo Génova Fuster

DOCTORAL THESIS

"Interlacement of structural and dynamic aspects in UML associations" (2003)

Directed by Prof. Dr. Juan Llorens Morillo

Doctoral Program in Software Engineering, Carlos III University of Madrid

DOWNLOAD (Spanish version only)

· Persistent URL in Open Access Repository at Carlos III University of Madrid: http://hdl.handle.net/10016/682

· Directly from this page: ZIP PDF 1380 KB

MAIN RELATED PUBLICATIONS

International Journals

· Gonzalo Génova, Juan Llorens, José Miguel Fuentes. "UML Associations: A Structural and Contextual View", Journal of Object Technology, 3(7): 83-100, Jul-Aug 2004.

· Gonzalo Génova, Carlos Ruiz del Castillo, Juan Llorens. "Mapping UML Associations into Java Code", Journal of Object Technology, 2(5): 135-162, Sep-Oct 2003.

· Gonzalo Génova, Juan Llorens, Vicente Palacios. "Sending Messages in UML", Journal of Object Technology, 2(1): 99-115, Jan-Feb 2003.

· Gonzalo Génova, Juan Llorens, Paloma Martínez. "The meaning of multiplicity of n-ary associations in UML", Journal on Software and Systems Modeling, 1(2): 86-97, 2002.

International Congresses

· Gonzalo Génova, Juan Llorens, José Miguel Fuentes. "The Baseless Links Problem". Workshop on Consistency Problems in UML-based Software Development II, October 20, 2003, San Francisco, USA. Held in conjunction with The 6th International Conference on the Unified Modeling Language-UML'2003, October 20-24, 2003, San Francisco, California, USA. Published in Blekinge Institute of Technology, Research Report 2003:06.

· Gonzalo Génova, Juan Llorens, Paloma Martínez. "Semantics of the minimum multiplicity in ternary associations in UML". The 4th International Conference on the Unified Modeling Language-UML'2001, October 1-5 2001, Toronto, Ontario, Canada. Published in Lecture Notes in Computer Science 2185, Springer 2001, pp. 329-341.

Technical Reports

· Gonzalo Génova. "Semantics of navigability in UML associations". Technical Report UC3M-TR-CS-2001-06, Computer Science Department, Carlos III University of Madrid, November 2001, pp. 233-251.

SUMMARY

In this Doctoral Thesis we have investigated the concept of association in the Unified Modeling Language, focusing on three main theoretical aspects (multiplicity, navigability and visibility), and always looking for the consequences of its practical application (implementation).

What is an association?

Perhaps one of the main results of this Doctoral Thesis is the clarification of the association concept itself. The definition of association given in the UML Standard (Glossary) is as follows: an association is “the semantic relationship between two or more classifiers that specifies connections among their instances” (UML Specification v. 1.4, p. B-3). In turn, a link is defined as “a semantic connection among a tuple of objects; an instance of an association” (p. B-11). The goal of this work has been to explain precisely the sense of the word “semantic” when it is used to define an association or a link. What is a semantic relationship? What is a semantic connection? Is a link the same as a tuple?

We have come to the conclusion that the semantics, or meaning, of every association includes two intimately interlaced aspects: the static aspect and the dynamic aspect, respectively related to the structure and behavior of the system; these two aspects serve as a basis for a new classification of associations. We have also argued that, in order to achieve a better decoupling among the participants in an association, it is advisable to define an association not between classifiers, but between interfaces (redefining also the concept of interface, since its current definition does not allow this).

The static aspect manifests itself in the “fact” which is expressed by the link; the existence of a link implies the verification of the predicate signified by the association name, and this fact is part of the system’s state. The dynamic aspect manifests itself in the possibility of communication given by the link; the existence of the link implies that the instances know each other along the direction in which the link is navigable, and therefore they can exchange messages within an interaction, taking into account also the visibility of the invoked operations. The use of interfaces makes modeling with associations more flexible, since the association can be defined independently of the connected classifiers.

Therefore, here is our definition: An association is a relationship defined between two or more interfaces. In each association end, the interface specifies the structure and behavior that can be known by navigating towards that end through the association. The association specifies a set of links between instances of the classifiers that realize the respective interfaces in each association end. Every link is a connection between instances that states a fact (the verification of a predicate) and gives a possibility for communication (a navigable path).
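The definition above can be illustrated with a minimal Java sketch. All names here (Employer, Employee, Employs, Company, Person, Robot) are hypothetical and do not come from the thesis. Java interfaces cannot declare state, so the sketch only approximates the extended notion of interface (behavior plus structure) that the thesis proposes; it shows an association defined between two interfaces, with unrelated classes realizing the same end.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interfaces: each one specifies what can be known at one association end.
interface Employer {
    String employerName();
}

interface Employee {
    String employeeName();
}

// A link states a fact ("this employer employs this employee") and offers a path between the two.
final class EmploysLink {
    private final Employer employer;
    private final Employee employee;

    EmploysLink(Employer employer, Employee employee) {
        this.employer = employer;
        this.employee = employee;
    }

    Employer employer() { return employer; }
    Employee employee() { return employee; }
}

// The association is a set of links and depends only on the two interfaces.
final class Employs {
    private final List<EmploysLink> links = new ArrayList<>();

    EmploysLink link(Employer employer, Employee employee) {
        EmploysLink l = new EmploysLink(employer, employee);
        links.add(l);
        return l;
    }

    // Navigation from one employer towards the Employee end.
    List<Employee> employeesOf(Employer employer) {
        List<Employee> result = new ArrayList<>();
        for (EmploysLink l : links) {
            if (l.employer() == employer) {
                result.add(l.employee());
            }
        }
        return result;
    }
}

// Classes that realize the interfaces; Person and Robot share no superclass.
final class Company implements Employer {
    private final String name;
    Company(String name) { this.name = name; }
    public String employerName() { return name; }
}

final class Person implements Employee {
    private final String name;
    Person(String name) { this.name = name; }
    public String employeeName() { return name; }
}

final class Robot implements Employee {
    private final String id;
    Robot(String id) { this.id = id; }
    public String employeeName() { return id; }
}
```

Because Employs mentions only Employer and Employee, a new realizing class can be linked without touching the association code, which is the decoupling the definition aims for.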

The multiplicity of associations

In this Chapter we have tackled two main topics: multiplicity in n-ary associations, and multiplicity in qualified associations and association-classes.

First, we have considered some semantic problems of minimum multiplicity in n-ary associations, as it is currently expressed in UML; nevertheless, our ideas are general enough to be applicable to other modeling techniques more or less based on the Entity/Relationship approach. Minimum multiplicity is closely related to the participation constraint, although in the case of n-ary associations it does not mean the participation of the class in the association, but the participation of tuples of the other n-1 classes. Moreover, we discovered that this latter participation is ambiguously defined, allowing three conflicting interpretations: participation of actual tuples, participation of potential tuples, and participation with limping links.

The only one which is (implicitly) in agreement with the UML documentation is the second interpretation, potential tuples, in spite of the bouncing effect of minimum multiplicity 1. The Standard should clarify this question, without resigning itself to a lack of obviousness in the definition. Besides, if this second interpretation were chosen, the Standard should also warn, since this result is not at all intuitive, that a minimum multiplicity 1 or greater assigned to one class forces all potential tuples of instances of the remaining classes to actually exist within some n-tuple; therefore, minimum multiplicity would be 0 in nearly every n-ary association.
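The consequence of the potential tuples interpretation can be made concrete with a small check. This is only a sketch under assumptions: a ternary association among classes A, B and C, instances identified by strings, and hypothetical names (Triple, PotentialTuples, satisfiesMinimum) that are not taken from the thesis. It tests whether a minimum multiplicity at the C end holds when every potential pair (a, b) is taken into account.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// One link of a hypothetical ternary association A-B-C.
final class Triple {
    final String a;
    final String b;
    final String c;

    Triple(String a, String b, String c) {
        this.a = a;
        this.b = b;
        this.c = c;
    }
}

final class PotentialTuples {
    // Number of distinct C instances linked to the pair (a, b).
    static int countC(List<Triple> links, String a, String b) {
        Set<String> seen = new HashSet<>();
        for (Triple t : links) {
            if (t.a.equals(a) && t.b.equals(b)) {
                seen.add(t.c);
            }
        }
        return seen.size();
    }

    // Potential tuples interpretation: every pair in A x B, whether or not
    // it takes part in any link, must reach at least min instances of C.
    static boolean satisfiesMinimum(List<String> as, List<String> bs,
                                    List<Triple> links, int min) {
        for (String a : as) {
            for (String b : bs) {
                if (countC(links, a, b) < min) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

With two instances of A, one of B and a single link, a minimum of 1 at the C end is violated because the second pair has no link at all; only a minimum of 0 holds, which matches the remark that minimum multiplicity would be 0 in nearly every n-ary association.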

The third interpretation, limping links, which is a variation of the first one, seems intuitive and has also some pragmatic advantages, although it is in contradiction with the definition of n-ary association in UML (maybe more with the letter than with the spirit). We are inclined to support this interpretation as far as it is carefully intended to represent incomplete associations, but not related constraining subassociations. Up to n-2 legs could be allowed to be lacking, and the value "unknown", "empty" or "null" should be considered as a concrete value when applying the restrictions imposed by multiplicity values. However, this topic deserves further research that exceeds the scope of this work.

The eventual clarification of this point leaves another problem unsolved: the participation of each class remains unexpressed in the Chen style of representing multiplicities (which is also the UML style), while the Merise style shows it adequately. Both Chen and Merise styles are correct, but they describe different characteristics of the same association, which cannot be derived from each other in the n-ary case, although they are related by a simple consistency rule.

Since both styles are useful for understanding the nature of associations, we propose a simple extension to the notation of UML n-ary multiplicities that enables the representation of both participation and functional dependency (that is, Merise and Chen styles, or inner and outer multiplicity in the CDIF terminology). Since this notation is compatible with the three alternative interpretations of Chen multiplicities, its use does not by itself remove the ambiguity of the definition of multiplicity: they are independent problems. If this notation were accepted, the Standard should also modify the metamodel accordingly, since it foresees only one multiplicity specification in the AssociationEnd meta-class. If this were not the case, it could at least be recognized that Chen multiplicities are not the only sensible co-occurrence constraints that may be defined in an n-ary association.

Understanding n-ary associations is a difficult problem in itself. If the rules of the language used to represent them are not clear, this task may become unattainable. If the interpretation of n-ary associations is uncertain, direct communication among modelers becomes impossible. If the semantic implications of a model are ambiguous, implementers will have to make decisions that are not theirs to make, and possibly wrong ones. These reasons are more than enough to expect a more precise definition of UML on these topics, which may be reached in version 2.0. In fact, the 3C (Clear, Clean, Concise) proposal for the elaboration of version 2.0, promoted by Financial Systems Architects (New York, U.S.A., http://www.community-ml.org/), has already assumed our ideas about n-ary associations.

The second topic we have tackled in this Chapter is the definition of multiplicity in qualified associations and association-classes. In contrast to n-ary associations, the Standard does explain the meaning of minimum multiplicity on the target end of a qualified association, adopting the equivalent of the potential tuples interpretation. This supports the conclusion that the Standard implicitly assumes the potential tuples interpretation for n-ary associations. Since we have considered that, for practical reasons and intuitiveness, the limping links interpretation is more convenient, we have tried to apply it to qualified multiplicity, too. It has not been possible, because the concept of “incomplete association” makes no sense in a qualified association, since there is no association between the source class and the qualifier attribute. If the representation of an incomplete association between the source class and the qualifier made sense in a certain domain, then the qualified association should be represented rather as an n-ary association. Therefore, we have to abandon the limping links interpretation for qualified multiplicity.

Multiplicity in association-classes is affected by the constraint, shared by every association, that no duplicated tuples are allowed, which complicates the modeling of some common situations, and specifically the representation of associations with “temporal logic”, that is, predicates that are considered valid during a certain period of time. The use of qualified associations instead of association-classes does not help to solve these problems either, because they are subject to this constraint, too. The definition of a qualified association in UML, halfway between binary and n-ary associations, does not make this point clear enough: if the binary analogy prevails, tuples cannot be duplicated; if the n-ary aspect prevails, they can.

Finally, we have examined in detail the root of these difficulties, which is the definition of an association as a set of non-duplicated tuples, a definition that is excessively influenced by database design methodologies and that has been adopted by UML without considering all the consequences. We have shown a possible solution to escape from this restriction in some common modeling problems, consisting in the insertion of a fictitious associative class that permits the repetition of tuples, but introduces an unnecessary complexity in the models. In any case, the analysis of the metamodel, putting aside its internal contradictions, makes it clear that a link is not exactly the same as a tuple: a link is a connection between two (or more) objects, and it determines a tuple; every link in an association is different from the others (as long as each one has its own identity and can be distinguished from the others), but two or more links can connect the same objects, so that they determine the same tuple (they can have the same data content). Therefore, the constraint that there cannot be duplicated tuples does not derive from the very nature of links; it is rather an additional constraint that could be dropped without violating the principles of the language, and that is easy to restore when the nature of the modeled problem requires it.
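The distinction between a link and the tuple it determines can be sketched in Java by giving tuples value equality and links object identity. The names (WorkTuple, WorkLink, WorksFor) and the fromYear attribute are hypothetical, chosen to echo the “temporal logic” situation mentioned above; this is an illustration of the idea, not code from the thesis.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// A tuple has value semantics: two tuples with the same content are equal.
final class WorkTuple {
    final String person;
    final String company;

    WorkTuple(String person, String company) {
        this.person = person;
        this.company = company;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof WorkTuple)) {
            return false;
        }
        WorkTuple t = (WorkTuple) o;
        return person.equals(t.person) && company.equals(t.company);
    }

    @Override
    public int hashCode() {
        return Objects.hash(person, company);
    }
}

// A link keeps the default identity-based equals: each link is distinct,
// even when it determines the same tuple as another link.
final class WorkLink {
    private final WorkTuple tuple;
    private final int fromYear; // hypothetical attribute distinguishing periods

    WorkLink(String person, String company, int fromYear) {
        this.tuple = new WorkTuple(person, company);
        this.fromYear = fromYear;
    }

    WorkTuple tuple() { return tuple; }
    int fromYear() { return fromYear; }
}

// The association stores links, so repeated tuples are permitted.
final class WorksFor {
    private final List<WorkLink> links = new ArrayList<>();

    WorkLink add(String person, String company, int fromYear) {
        WorkLink link = new WorkLink(person, company, fromYear);
        links.add(link);
        return link;
    }

    int linkCount() { return links.size(); }

    int distinctTupleCount() {
        Set<WorkTuple> tuples = new HashSet<>();
        for (WorkLink l : links) {
            tuples.add(l.tuple());
        }
        return tuples.size();
    }
}
```

Restoring the no-duplicates constraint where the problem requires it would only take a check in add that rejects a link whose tuple is already present.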

The navigability of associations

In this Chapter we have considered some semantic problems of associations and navigability in UML. We have tried to clarify some definitions, and we have proposed solutions for some problems. We have searched for a definition of navigability that is missing in the official documentation, we have explored the relationship of navigability to message sending, and we have examined in detail the issue of communication links, highlighting some misunderstandings and conflicts in the present definition of UML (version 1.4). We have reaffirmed the principle that every link is an instance of an association, and our analysis has led us to the distinction between structural and contextual associations, and to a new definition and application of association and link stereotypes. We have pointed out the relationship of navigability to dependency, and we have examined the invertibility, efficiency and notation of navigable associations. We have also applied the concept of navigability to more complex associations (association-classes, qualified associations and n-ary associations), a topic that has been neglected in the UML documentation for the time being.

Associations in object-oriented models, and more specifically in UML models, are not symmetrical. The main asymmetries we can find in associations are: first, linguistic asymmetry, which is basically the non interchangeability between subject and object in the verbal phrase that gives name to the association, and which is graphically expressed by the association name direction triangle; second, whole-part asymmetry, expressed by the aggregation or composition property of the association; and third, communication asymmetry, which means the direction in which knowledge can be obtained through the association, and which is closely related to the concepts of visibility, reference and navigation. An association can be bidirectional in the latter sense (two-way navigable), but this does not make it symmetrical in any sense. These three kinds of asymmetry are independent, but conceptually related. Generalizations and dependencies, which are other kinds of relationships together with associations, are also asymmetric.

To “navigate” or to “traverse” an association is to obtain, through the association, a path or reference to the opposite object that permits handling it; in other words, to form the expression of a path that designates a target object (or set of objects) from a source object. Once the source object has a relative name of the target object that is valid in the source's context, the source can manipulate the target, that is, it can invoke its public operations, get or set its public attributes, pass it as a parameter in messages to other objects, and so on. Navigability, then, is (our definition) the possibility for a source object to designate a target object through an association, in order to manipulate or access it in an interaction with message exchanges. This definition, or a similar one, should be incorporated into the Standard.

The direction of navigability indicates that the object at the source end can know other objects at the target end through the association. The object that has knowledge of the association is responsible for maintaining the state of the association and controlling the interaction that can take place through it. If both ends have knowledge of the association and are responsible for it, then the association is said to be two-way (bidirectional), otherwise it is one-way (unidirectional). No-way navigability makes no sense.

Visibility and navigability are both required for communication between objects to take place: an object can communicate only with other objects it knows about, and that have made available the desired operations in their interface. This idea should be expressed clearly and concisely in the Standard. Navigability is so closely related to the ability to send messages that very often these two concepts are treated as one and the same.

The different kinds of communication links that can exist in a model pose the question of whether every link is or is not an instance of an association, and whether an association must exist whenever there is a communication between objects. The distinction between static and dynamic associations is not adequate to solve this problem, since in object-orientation every association has static and dynamic properties, therefore these aspects do not serve to define two disjoint subtypes of association. Instead, we have proposed the distinction between structural and contextual associations, which, with an adequate redefinition of association and link stereotypes, helps to maintain the principle that every link is an instance of an association. This distinction is not based on the static or dynamic properties of associations, since every association is (or at least may be) involved in the structure and behavior of the modeled system. Instead, our classification is based on the context in which associations are valid. The distinction is graphically expressed in diagrams using the traditional association and link stereotypes, although they are not applied to association and link ends any more, but to associations and links themselves.

We have examined three properties of associations that depend on navigability: dependency, invertibility and efficiency. Since navigability means knowing, and knowing means both communicability and dependency, navigability creates a dependency from the source to the target. When the associations in a model are predominantly unidirectional, the reuse of small portions of the model becomes easier. This is the main argument in favor of one-way associations as a default option, instead of two-way associations as UML promotes. In any case, two-way associations cannot be completely discarded, since sometimes they are required by the nature of the problem or the solution.

According to some passages of the Reference Manual, invertibility-bidirectionality seems a logical property of an association (even of every association), different from the fact that the association is navigable both ways. Navigability would not be a logical property, but an implementation property meaning nearly the same as navigation efficiency. We consider, instead, that the logical possibility of navigation is an important concept in analysis as well as in design. In our view, invertibility, bidirectionality and two-way navigability are synonyms. The essence of association is knowledge, and knowledge can be unidirectional, not as a matter of efficiency, but rather as a matter of principle. Therefore, the navigability arrow should never be used to mean efficient navigation, especially because this makes it impossible to specify an association that is not navigable at all in one direction.

Among the three presentation options recommended by the Standard, we think the best practice is using only the “suppress all” style for the first stages of analysis, and the “show all” style for a detailed analysis and for design. A connection without arrows should not be used to mean two-way navigability but only undecided or unspecified navigability.

The concept of navigability, which is handled in the documentation only with regard to simple binary associations, can be extended without great difficulty to association-classes and qualified associations. On the contrary, it is not so easy for n-ary associations. We have suggested various kinds of navigation expressions, in order to take advantage of n-ary multiplicity: from one end towards another end, from one combination of n-1 ends towards another end (using a similar notation to that of qualified associations), and from one end towards the association itself.

Even though using an n-ary link as a communication infrastructure between the linked objects can make sense, communication is in itself an intrinsic binary phenomenon, that is, a message has exactly one sender and one receiver. This precludes, on the one hand, the joint emission of a message by two or more objects and, on the other hand, the joint reception of messages. The representation of a “binary” message sent through an n-ary link in a collaboration diagram is somewhat problematic, but we have given some simple rules that can solve the problem.

The visibility of associations

In this Chapter we have considered some problems regarding the visibility of associations. Since it is based on the visibility of attributes and operations, it has been necessary to clarify a couple of questions regarding visibility in general in UML. First, attribute and operation visibility is a feature specified for classifiers, not for instances; that is, visibility does not specify whether an object can see another object, but rather whether a class feature can see another feature of the same or another class. Therefore, when both the sender and receiver objects belong to the same class, it is possible to send a message that corresponds to a private operation, even though this operation does not belong to the “native interface” of the class, which consists of its public operations. Second, the four existing kinds of visibility are not four progressively restrictive levels, even though the expression “levels of visibility” suggests this, because sometimes protected is more restrictive than package, and sometimes it is less restrictive.
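The first point, that visibility is specified per classifier and not per instance, holds in Java as well, where private access is checked against the enclosing class rather than the receiving object. The following sketch uses a hypothetical Account class to show one object invoking a private operation on another object of the same class.

```java
// Hypothetical class used only to illustrate class-based (not instance-based) visibility.
final class Account {
    private int balance;

    Account(int balance) {
        this.balance = balance;
    }

    // Private operation: not part of the public interface of the class.
    private void adjust(int delta) {
        balance += delta;
    }

    // The sender (this) invokes a private operation on a different receiver (other).
    // This compiles because both objects belong to the same class.
    void transferTo(Account other, int amount) {
        this.adjust(-amount);
        other.adjust(amount);
    }

    int balance() {
        return balance;
    }
}
```

A call such as other.adjust(amount) written in any other top-level class would be rejected by the compiler, regardless of which objects are involved at run time.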

The definitions of visibility of association ends contained in the Standard are not very rigorous; instead of them, we have proposed the following one: the visibility of an association end specifies the visibility of the association from the point of view of other classifiers when navigating the association towards that association end. In UML, association ends are assimilated to pseudo-attributes of the classifiers participating in the association, thus compromising the unity of the association as a building block in models. Consequently, visibility of association ends is defined similarly to attribute and operation visibility, with the same four possibilities, except for package visibility, in which case the parallelism is lost. To recover this parallelism we have proposed a new definition of package visibility for association ends, one that does not depend on where the association is defined, but on where the associated classifiers are defined. In any case, the ambivalence between “association” as a concept that is independent of the associated classifiers, and “reference” as an element that is included in them and more or less equivalent to an attribute, remains unsolved.

On the other hand, the definition of bidirectional associations between classifiers belonging to different packages turns out to be problematic. First, an element is defined in one package only, and, outside its owner package, it can be used, but it cannot be modified. Second, a navigable association induces a dependency from the source classifier towards the target classifier, a dependency that affects the definition of the source classifier, so that this dependency must be defined in the same package as the source classifier; therefore, a navigable association end must be defined in the same package as the source classifier. Third, it is not possible to define a bidirectional association between classifiers from different packages, since this would require the association (that is, its ends) to be defined in both packages, but an association must be defined in a unique package, as any other model element. Therefore, a bidirectional association can be defined only between classifiers belonging to the same package. Even though this may seem a severe limitation of UML, it is really a natural, though not very evident, consequence of the package concept.

The use of attribute and operation visibility is not enough to achieve an effective decoupling between the classifiers in a model, since this visibility does not discriminate between the different associations a classifier is connected to. Ideally, each association would have its own interface, so that the classifier connected at the opposite end would have a dependency limited to the features included in that interface. In UML there are two very similar mechanisms (too similar, in fact, for the distinction to make sense) that permit tackling this problem: the interface specifier and the interface in the strict sense.

The interface specifier gives a partially adequate solution to this problem, especially if we adopt the revised definition proposed in this Chapter; nevertheless, the source class still depends on the concrete target class, even though it knows only the part revealed by the interface specifier. On the contrary, by means of using a proper interface we achieve full independence from the concrete target class, which can be any class realizing the interface; in this way we achieve, in addition to the specification of the required functionality through the association, the sharing of the association end among several classes without need of creating a superclass. However, since an interface in UML specifies the behavior but not the structure of the classifier that realizes it, an interface is not allowed to take part in bidirectional associations, because this would require that the interface had a structure. In any case, it is reasonable to extend UML with a less restrictive notion of interface (termed by some as “role”) that includes not only the specification of behavior, but also the specification of state. Indeed, the use of interfaces with behavior and state helps not only to express better the interaction required through the association, but also the required structure, thus achieving a better integration of static and dynamic aspects of the association. This new notion of interface implies a new notion of compatibility or realization as well, that includes both aspects.

In this sense, we have proposed a new definition of associations that is no longer established between classifiers, but between interfaces, which can be realized by one or more classifiers, permitting in this way great flexibility in design. The interface on each end specifies the structure and the behavior that can be known through the association on that concrete end, and that the associated objects must satisfy. We have also presented a notation that allows different complexity levels to express the same association with more or less detail; in the simplest levels the interfaces are not shown, or shown in reduced form; in successive levels they are shown with more detail and their role in association ends is emphasized.

Finally, we have applied this new concept of association, showing how its use permits the simplification of attribute and operation visibility, and solves the inconveniences of private visibility for reflexive associations. We have also shown the relevance of whether an interface is connected to only one association or to more than one.

The implementation of associations

In this Chapter we have developed a concrete way of mapping UML associations into Java code: we have written specific code patterns, and we have constructed a tool that reads a UML design model stored in XMI format and generates the necessary Java files. We have paid special attention to three main features of associations: multiplicity, navigability and visibility. Our analysis has encountered difficulties that may reveal some weaknesses of the UML Specification.

Regarding multiplicity, we have shown that it is impossible in practice, with a few primitive operations, to keep the minimum multiplicity constraint at any moment on a mandatory association end; our proposal is to check this constraint only when accessing the links, but not when modifying them. The programmer will be responsible for using the primitives in a consistent way so that a valid system state is reached as soon as possible. On the contrary, it is possible to ensure the fulfilment of the maximum multiplicity constraint during run-time, and so we enforce it in our implementation. Single association ends are easily stored in attributes having the related target class as type, but multiple association ends require the use of collections to store the corresponding set of links; as collections in Java are based on the standard Object class, it is necessary to perform run-time type-checking by means of explicit casting when using collections as parameters in the mutator methods.
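The policy described above (maximum multiplicity enforced when modifying, minimum multiplicity checked only when accessing) might look roughly as follows. This is a sketch under assumptions, not the code generated by the tool: the classes Team and Player, the bounds MIN and MAX, and the method names are hypothetical. It also uses Java generics, which replace the Object-based collections and explicit casts mentioned in the text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class Player {
    final String name;

    Player(String name) {
        this.name = name;
    }
}

// Hypothetical unidirectional association Team -> Player with multiplicity MIN..MAX.
final class Team {
    static final int MIN = 2; // assumed minimum multiplicity
    static final int MAX = 3; // assumed maximum multiplicity

    private final List<Player> players = new ArrayList<>();

    // Mutator: the maximum is enforced at once; duplicates are rejected as well.
    boolean addPlayer(Player p) {
        if (players.size() >= MAX || players.contains(p)) {
            return false;
        }
        return players.add(p);
    }

    // Mutator: the minimum is deliberately NOT checked here, so that the
    // association can pass through temporarily invalid states.
    boolean removePlayer(Player p) {
        return players.remove(p);
    }

    // Accessor: the minimum is checked only when the links are read.
    List<Player> getPlayers() {
        if (players.size() < MIN) {
            throw new IllegalStateException("minimum multiplicity " + MIN + " not reached");
        }
        return Collections.unmodifiableList(players);
    }
}
```

A team under construction can therefore hold a single player without any mutator failing; the inconsistency only surfaces if getPlayers is called before a valid state has been reached.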

Regarding navigability, unidirectional associations are easier to implement by means of attributes than bidirectional associations, because of the difficulties in synchronizing both association ends. An update to a bidirectional association must be performed atomically on both ends to keep them consistent; this is achieved in the source object by issuing a reciprocal update on the target object. We have considered the pros and cons of an alternative implementation, based on the storage of “reified tuples”, and finally we have discarded it in favor of our “synchronized cross-references” scheme. A side consequence of our analysis is that the multiplicity constraint in a design model can be specified only for a navigable association end.
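One possible shape for reciprocal updates on a bidirectional association is sketched below. It is an illustration of the general idea of cross-references kept in sync, not the code patterns produced by the tool; the classes Owner and Car and their methods are hypothetical, and the sketch ignores concurrency, so "atomically" here only means that each public mutator leaves both ends consistent when it returns.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical bidirectional association: Owner (0..1) <-> Car (*).
final class Owner {
    private final List<Car> cars = new ArrayList<>();

    void addCar(Car car) {
        if (cars.contains(car)) {
            return; // already linked: this guard stops the mutual recursion
        }
        cars.add(car);
        car.setOwner(this); // reciprocal update on the other end
    }

    void removeCar(Car car) {
        if (!cars.remove(car)) {
            return; // was not linked: nothing to synchronize
        }
        if (car.getOwner() == this) {
            car.setOwner(null); // reciprocal update on the other end
        }
    }

    // Returns a copy so that callers cannot bypass the mutators.
    List<Car> getCars() {
        return new ArrayList<>(cars);
    }
}

final class Car {
    private Owner owner;

    void setOwner(Owner newOwner) {
        if (owner == newOwner) {
            return; // already consistent: this guard stops the mutual recursion
        }
        Owner old = owner;
        owner = newOwner;
        if (old != null) {
            old.removeCar(this); // unlink from the previous owner
        }
        if (newOwner != null) {
            newOwner.addCar(this); // link on the new owner's side
        }
    }

    Owner getOwner() {
        return owner;
    }
}
```

Each mutator must be able to call its counterpart in the other class, which is why both are package-private here rather than private; restricting either of them further would make the reciprocal update impossible.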

Regarding visibility, in the case of unidirectional associations it can be implemented rather easily by simply mapping the visibility of the association end onto the visibility of the corresponding accessor and mutator methods, because UML and Java visibility levels have approximately the same semantics (except for Java’s protected, which is equivalent to the union of UML’s protected and package). However, bidirectional associations with one or two private (or protected) ends behave paradoxically, because the reciprocal update becomes impossible.

The generated code for each association is easily localized inside the involved Java classes. Each association end presents a uniform programmer's interface. The interface is exactly the same for unidirectional and bidirectional association ends, but there are slight differences for single and multiple association ends.

Our approach is rather exhaustive in checking invariants. We think that it is worth doing as much as we can for the programmer, so our tool will insert code to perform run-time multiplicity and type checking and, of course, to issue reciprocal updates on bidirectional associations. However, different tool options will allow the user to override the automatic multiplicity and type checks when generating code, in favor of efficiency. Besides, we have argued that unidirectional associations should not have a multiplicity constraint on the source end in a design model, and bidirectional associations should not have both ends with private (or protected) visibility; therefore, the tool will refuse to generate code for these associations. Again, the user will be able to disable this model-correctness checking and run the code generation at his/her own risk.