DOCTORAL THESIS
"Interlacement of structural and dynamic aspects in UML
associations" (2003)
Directed by Prof. Dr.
Doctoral Program in Software Engineering, Carlos III University of
Madrid
DOWNLOAD (Spanish version only)
· Persistent URL in Open Access Repository at Carlos III University of Madrid: http://hdl.handle.net/10016/682
· Directly from this page: ZIP PDF 1380 KB
MAIN RELATED PUBLICATIONS
International Journals
· Gonzalo Génova,
· Gonzalo Génova, Carlos Ruiz del Castillo, Juan Lloréns. "Mapping UML Associations into Java Code", Journal of Object Technology, 2(5): 135-162, Sep-Oct 2003.
· Gonzalo Génova, Juan Lloréns, Vicente Palacios. "Sending Messages in UML", Journal of Object Technology, 2(1): 99-115, Jan-Feb 2003.
· Gonzalo Génova, Juan Lloréns, Paloma Martínez. "The meaning of multiplicity of n-ary associations in UML", Journal on Software and Systems Modeling, 1(2): 86-97, 2002.
International Congresses
· Gonzalo Génova,
· Gonzalo Génova,
Technical Reports
· Gonzalo Génova. "Semantics of navigability in UML associations". Technical Report UC3M-TR-CS-2001-06, Computer Science Department, Carlos III University of Madrid, November 2001, pp. 233-251.
SUMMARY
In this Doctoral Thesis we have
conducted research on the concept of association in the Unified
Modeling Language, centered on three main theoretical aspects (multiplicity,
navigability and visibility), always looking for the consequences of their practical
application (implementation).
What is an association?
Maybe one of the main fruits of this
Doctoral Thesis is the clarification of the association concept itself. The
definition of association given in the UML Standard (Glossary) is as follows:
an association is “the semantic relationship between two or more classifiers
that specifies connections among their instances” (UML Specification v. 1.4, p.
B-3). In turn, a link is defined as “a semantic connection among a tuple
of objects; an instance of an association” (p. B-11). The goal of this work has
been to explain precisely the sense of the word “semantic” when it is used to
define an association or a link. What is a semantic relationship, what
is a semantic connection? Is a link the same as a tuple?
We have arrived at the conclusion
that the semantics, or meaning, of every association includes two
intimately interlaced aspects: the static aspect and the dynamic
aspect, respectively related to the structure and behavior of the system; these
two aspects serve as a basis for a new classification of associations. We have
also argued that, in order to achieve a better decoupling among the
participants in an association, it is convenient to define an association not
between classifiers, but between interfaces (which also requires redefining the
concept of interface, since its current definition does not allow this).
The static aspect manifests itself
in the “fact” which is expressed by the link; the existence of a link implies
that the predicate signified by the association name holds, and this
fact is part of the system’s state. The dynamic aspect manifests itself
in the possibility of communication given by the link; the existence of the
link implies that the instances know each other along the direction in which
the link is navigable, and therefore they can exchange messages within
an interaction, taking into account also the visibility of the invoked
operations. The use of interfaces makes modeling with associations more flexible,
since the association can be defined independently of the connected classifiers.
Therefore, here is our definition:
An association is a relationship defined between two or more interfaces. At
each association end, the interface specifies the structure and behavior that
can be known by navigating towards that end through the association. The
association specifies a set of links between instances of the classifiers that
realize the respective interfaces in each association end. Every link is a
connection between instances that states a fact (the verification of a
predicate) and gives a possibility for communication (a navigable path).
The multiplicity of associations
In this Chapter we have tackled two
main topics: multiplicity in n-ary associations, and multiplicity in
qualified associations and association-classes.
First, we have considered some
semantic problems of minimum multiplicity in n-ary associations, as it is
currently expressed in UML; nevertheless, our ideas are general enough to be
applicable to other modeling techniques more or less based on the
Entity/Relationship approach. Minimum multiplicity is closely related to the
participation constraint, although in the case of n-ary associations it does
not mean the participation of the class in the association, but the
participation of tuples of the other n-1 classes. Moreover, we discovered that
this latter participation is defined ambiguously, allowing three
conflicting interpretations: participation of actual tuples, participation of
potential tuples, and participation with limping links.
The only one which is (implicitly)
in agreement with the UML documentation is the second interpretation, potential
tuples, in spite of the bouncing effect of minimum multiplicity 1. The Standard
should clarify this question instead of settling for a definition that is not
explicit. Besides, if this second interpretation were chosen, the
Standard should also warn, since this result is not at all intuitive, that a
minimum multiplicity of 1 or greater assigned to one class forces all potential
tuples of instances of the remaining classes to actually exist within some
n-tuple; therefore, minimum multiplicity would have to be 0 in nearly every n-ary
association.
The third interpretation, limping
links, which is a variation of the first one, seems intuitive and has also some
pragmatic advantages, although it is in contradiction with the definition of
n-ary association in UML (maybe more with the letter than with the spirit). We
are inclined to support this interpretation as long as it is strictly intended
to represent incomplete associations, and not related constraining
subassociations. Up to n-2 legs could be allowed to be missing, and the value
"unknown", "empty" or "null" should be considered
as a concrete value when applying the restrictions imposed by multiplicity
values. However, this topic deserves further research, which exceeds the scope
of this work.
The eventual clarification of this
point leaves another problem unsolved: the participation of each class remains
unexpressed in the Chen style of representing multiplicities (which is also the
UML style), while the Merise style shows it adequately. Both Chen and Merise
styles are correct, but they describe different characteristics of the same
association, which cannot be derived from each other in the n-ary case,
although they are related by a simple consistency rule.
Since both styles are useful for
understanding the nature of associations, we propose a simple extension to the
notation of UML n-ary multiplicities that enables the representation of both
participation and functional dependency (that is, Merise and Chen styles, or
inner and outer multiplicity in the CDIF terminology). Since this notation is
compatible with the three alternative interpretations of Chen multiplicities,
its use does not by itself remove the ambiguity in the definition of
multiplicity: they are independent problems. If this notation were accepted,
the Standard should also modify the metamodel accordingly, since it foresees
only one multiplicity specification in the AssociationEnd meta-class. If this were not the
case, it could at least be recognized that Chen multiplicities are not the only
sensible co-occurrence constraints that may be defined in an n-ary association.
Understanding n-ary associations is
a difficult problem in itself. If the rules of the language used to represent
them are not clear, this task may become intractable. If the interpretation of
n-ary associations is uncertain, clear communication among modelers becomes
impossible. If the semantic implications of a model are ambiguous, implementers
will have to make decisions that are not theirs to make, and possibly wrong
ones. These reasons are more than enough to expect a more precise definition
in UML of these topics, which may be reached in version 2.0. In fact, the
3C (Clear, Clean, Concise) proposal for the elaboration of version 2.0,
promoted by Financial Systems Architects (
The second topic we have tackled in
this Chapter is the definition of multiplicity in qualified associations and
association-classes. Contrary to the case of n-ary associations, the Standard does
explain the meaning of minimum multiplicity on the target end of a qualified
association, adopting the equivalent of the potential tuples interpretation.
This supports the conclusion that the Standard implicitly assumes the potential
tuples interpretation for n-ary associations. Since we have considered that,
for practical reasons and intuitiveness, the limping links interpretation is
more convenient, we have tried to apply it to qualified multiplicity, too. It
has not been possible, because the concept of “incomplete association” makes no
sense in a qualified association, since there is no association between the
source class and the qualifier attribute. If the representation of an
incomplete association between the source class and the qualifier made
sense in a certain domain, then the qualified association should rather be
represented as an n-ary association. Therefore, we have to give up the limping
links interpretation for qualified multiplicity.
Multiplicity
in association-classes is affected by the constraint, shared by every
association, that no duplicated tuples are allowed, which complicates the
modeling of some common situations, and specifically the representation of
associations with “temporal logic”, that is, predicates that are considered
valid during a certain period of time. The use of qualified associations
instead of association-classes does not help to solve these problems either,
because they are subject to this constraint, too. The definition of a qualified
association in UML, halfway between binary and n-ary associations, does not
make this point clear enough: if the binary analogy prevails, tuples cannot be
duplicated; if the n-ary aspect prevails, they can.
Finally, we have examined in detail
the root of these difficulties, which is the definition of an association as
a set of non-duplicated tuples, a definition which is excessively
influenced by database design methodologies and which has been adopted by UML
without considering all the consequences. We have shown a possible solution to
escape from this restriction in some common modeling problems, consisting in
the insertion of a fictitious associative class that permits the repetition of
tuples, but introduces unnecessary complexity in the models. In any case, the
analysis of the metamodel, putting aside its internal contradictions, makes it
clear that a link is not exactly the same as a tuple: a link is a connection
between two (or more) objects, and it determines a tuple; every link
in an association is different from the others (since each one has its own
identity and can be distinguished from the others), but two or more links can
connect the same objects, so that they determine the same tuple (they can have
the same data content). Therefore, the constraint that there cannot be
duplicated tuples does not derive from the very nature of links; it is
rather an additional constraint that could be removed without violating
the principles of the language, and that is easy to restore when the nature of the
modeled problem requires it.
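The difference between a link and a tuple can be illustrated with a minimal Java sketch. It is not taken from the thesis: the classes Person, Company, EmploymentTuple and EmploymentLink are hypothetical names chosen for illustration. A tuple is compared by its data content, whereas a link keeps its own identity, so an extent stored as a set of tuples collapses duplicates while an extent stored as a collection of links does not.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical domain classes, used only for illustration.
final class Person {
    final String name;
    Person(String name) { this.name = name; }
}

final class Company {
    final String name;
    Company(String name) { this.name = name; }
}

// A tuple is pure data content: two tuples that name the same objects are equal.
final class EmploymentTuple {
    final Person employee;
    final Company employer;

    EmploymentTuple(Person employee, Company employer) {
        this.employee = employee;
        this.employer = employer;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof EmploymentTuple)) return false;
        EmploymentTuple t = (EmploymentTuple) o;
        return employee == t.employee && employer == t.employer;
    }

    @Override
    public int hashCode() {
        return Objects.hash(System.identityHashCode(employee),
                            System.identityHashCode(employer));
    }
}

// A link keeps the default identity-based equality and determines a tuple.
final class EmploymentLink {
    final EmploymentTuple tuple;

    EmploymentLink(Person employee, Company employer) {
        this.tuple = new EmploymentTuple(employee, employer);
    }
}

final class LinkVersusTupleDemo {
    // Extent stored as a set of tuples: the second, identical tuple is rejected.
    static int countAsTuples(Person p, Company c) {
        Set<EmploymentTuple> extent = new HashSet<>();
        extent.add(new EmploymentTuple(p, c));
        extent.add(new EmploymentTuple(p, c));
        return extent.size();
    }

    // Extent stored as a collection of links: both links are kept.
    static int countAsLinks(Person p, Company c) {
        List<EmploymentLink> extent = new ArrayList<>();
        extent.add(new EmploymentLink(p, c));
        extent.add(new EmploymentLink(p, c));
        return extent.size();
    }

    public static void main(String[] args) {
        Person p = new Person("Ana");
        Company c = new Company("Acme");
        System.out.println("tuples: " + countAsTuples(p, c));
        System.out.println("links: " + countAsLinks(p, c));
    }
}
```

In this sketch, two successive employment periods of the same person in the same company can coexist as two links, while the no-duplicates constraint is recovered simply by storing tuples in a set.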
The navigability of associations
In this Chapter we have considered
some semantic problems of associations and navigability in UML. We have tried
to clarify some definitions, and we have proposed solutions for some problems.
We have searched for a definition of navigability that is missing in the
official documentation, we have explored the relationship of navigability
to message sending, and we have examined in detail the issue of communication
links, highlighting some misunderstandings and conflicts in the present
definition of UML (version 1.4). We have reaffirmed the principle that every
link is an instance of an association, and our analysis has led us to the
distinction between structural and contextual associations, and to a new
definition and application of association and link stereotypes. We have
pointed out the relationship of navigability to dependency, and we have
examined the invertibility, efficiency and notation of navigable associations.
We have also applied the concept of navigability to more complex associations
(association-class, qualified association and n-ary association), a topic that
has been neglected in the UML documentation so far.
Associations in object-oriented
models, and more specifically in UML models, are not symmetrical. The main asymmetries
we can find in associations are: first, linguistic
asymmetry, which is basically the non-interchangeability between subject
and object in the verbal phrase that gives name to the association, and which
is graphically expressed by the association name direction triangle; second, whole-part asymmetry, expressed by the
aggregation or composition property of the association; and third, communication asymmetry, which means the
direction in which knowledge can be obtained through the association, and which
is closely related to the concepts of visibility, reference and navigation. An
association can be bidirectional in the latter sense (two-way navigable), but this does not make
it symmetrical in any sense. These three kinds of asymmetry are independent,
but conceptually related. Generalizations and dependencies, which are other
kinds of relationships together with associations, are also asymmetric.
To “navigate” or to “traverse” an
association is to obtain, through the association, a path or reference to the
opposite object that permits handling it; in other words, to form
the expression of a path that designates a target object (or set of objects)
from a source object. Once the source object has a relative name of the
target object that is valid in the source's context, the source can manipulate
the target, that is, it can invoke its public operations, get or set its public
attributes, pass it as a parameter in messages to other objects, and so on.
Navigability, then, is (our definition) the
possibility for a source object to designate a target object through an
association, in order to manipulate or access it in an interaction with
message exchanges. This or a similar definition should be incorporated
into the Standard.
The direction of navigability
indicates that the object at the source end can know other objects at the
target end through the association. The object that has knowledge of the
association is responsible for maintaining
the state of the association and controlling
the interaction that can take place through it. If both ends have knowledge
of the association and are responsible for it, then the association is said to be
two-way (bidirectional), otherwise it is one-way (unidirectional). No-way
navigability makes no sense.
Visibility and navigability are both
required for communication between objects to take place: an object can communicate only with other objects it knows about, and
that have made available the desired operations in their interface. This
idea should be expressed clearly and concisely in the Standard. Navigability is
so closely related to the ability to send messages that very often these two
concepts are identified.
The different kinds of communication
links that can exist in a model pose the question of whether every link is or
is not an instance of an association, and whether an association must exist
whenever there is a communication between objects. The distinction between
static and dynamic associations is not adequate to solve this problem, since in
object-orientation every association has static and dynamic properties,
therefore these aspects do not serve to define two disjoint subtypes of
association. Instead, we have proposed the distinction between structural and
contextual associations, which, with an adequate redefinition of association
and link stereotypes, helps to maintain the principle that every link is an
instance of an association. This distinction is not based on the static or
dynamic properties of associations, since every association is (or at least may
be) involved in the structure and behavior of the modeled system. Instead, our
classification is based on the context in which associations are valid. The
distinction is graphically expressed in diagrams using the traditional
association and link stereotypes, although they are not applied to association
and link ends any more, but to associations and links themselves.
We have examined three properties of
associations that depend on navigability: dependency, invertibility and efficiency. Since navigability
means knowing, and knowing means both communicability and dependency,
navigability creates a dependency
from the source to the target. When the associations in a model are
predominantly unidirectional, the reuse of small portions of the model becomes
easier. This is the main argument in favor of one-way associations as a default
option, instead of two-way associations as UML promotes. In any case, two-way
associations cannot be completely discarded, since sometimes they are required
by the nature of the problem or the solution.
According to some passages of the
Reference Manual, invertibility-bidirectionality seems a logical property of an
association (even of every association), different from the fact that the
association is navigable both ways. Navigability would not be a logical property, but an implementation property meaning
nearly the same as navigation efficiency. We consider, instead, that the
logical possibility of navigation is an important concept in analysis as well
as in design. In our view, invertibility, bidirectionality and two-way
navigability are synonyms. The essence of
association is knowledge, and knowledge can be unidirectional, not as a
matter of efficiency, but rather as a matter of principle. Therefore, the
navigability arrow should never be used to mean efficient navigation,
especially because this makes it impossible to specify an association that is not
navigable at all in one direction.
Among the three presentation options recommended by the Standard, we think the best
practice is using only the “suppress all” style for the first stages of
analysis, and the “show all” style for a detailed analysis and for design. A
connection without arrows should not be used to mean two-way navigability but
only undecided or unspecified navigability.
The concept
of navigability, which is handled in the documentation only with regard to
simple binary associations, can be extended without great difficulty to association-classes
and qualified associations. On the contrary, it is not so easy for n-ary
associations. We have suggested various kinds of navigation expressions, in
order to take advantage of n-ary multiplicity: from one end towards another
end, from one combination of n-1 ends towards another end (using a similar
notation to that of qualified associations), and from one end towards the
association itself.
Even though using an n-ary link
as a communication infrastructure between the linked objects can make
sense, communication is in itself an intrinsically binary phenomenon, that is, a
message has exactly one sender and one receiver. This precludes, on the one
hand, the joint emission of a message by two or more objects and, on the other
hand, the joint reception of messages. The representation of a “binary” message
sent through an n-ary link in a collaboration diagram is somewhat problematic,
but we have given some simple rules that can solve the problem.
The visibility of associations
In this Chapter we have considered
some problems regarding the visibility of associations. Since it is based on
the visibility of attributes and operations, it has been necessary to clarify a
couple of questions regarding visibility in general in UML. First, attribute
and operation visibility is a feature specified for classifiers, not for
instances; that is, visibility does not specify whether an object can see
another object, but rather whether a class feature can see another feature of
the same or another class. Therefore, when both the sender and receiver objects
belong to the same class, it is possible to send a message that corresponds to
a private operation, even though this operation does not belong to the “native
interface” of the class, consisting of its public operations. Second, the
four existing kinds of visibility are not four progressively more restrictive levels,
even though the expression “levels of visibility” suggests this, because
sometimes protected is more restrictive than package,
and sometimes it is less restrictive.
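The first point, that visibility is specified for classifiers and not for instances, can be seen directly in Java, whose private access follows the same class-based rule. The following is a minimal sketch; the Account class and its operations are hypothetical names introduced only for illustration.

```java
// Hypothetical class, used only for illustration.
final class Account {
    private int balance;

    Account(int balance) { this.balance = balance; }

    // Private operation: it does not belong to the public interface of the class.
    private void adjust(int delta) { balance += delta; }

    // Sender (this) and receiver (other) are different objects of the same
    // class, so the private operation of 'other' is accessible here.
    void transferTo(Account other, int amount) {
        this.adjust(-amount);
        other.adjust(amount);
    }

    int balance() { return balance; }
}

final class ClassBasedVisibilityDemo {
    public static void main(String[] args) {
        Account a = new Account(100);
        Account b = new Account(0);
        a.transferTo(b, 30);
        System.out.println(a.balance() + " " + b.balance());
    }
}
```

A call such as b.adjust(30) written inside ClassBasedVisibilityDemo would be rejected by the compiler, because there the calling feature belongs to another class.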
The definitions of visibility of
association ends contained in the Standard are not very rigorous; instead of
them, we have proposed the following one: the visibility of an association
end specifies the visibility of the association from the point of view
of other classifiers when navigating the association towards that association
end. In UML, association ends are assimilated to pseudo-attributes of the
classifiers participating in the association, thus compromising the unity of
the association as a building block in models. Consequently, visibility of
association ends is defined similarly to attribute and operation visibility,
with the same four possibilities, except for package visibility, in which case
the parallelism is lost. To recover this parallelism we have proposed a new
definition of package visibility for association ends, one that does not
depend on where the association is defined, but on where the associated
classifiers are defined. In any case, the problem of the
ambivalence between “association” as a concept that is independent from the
associated classifiers, and “reference” as an element that is included in them
and more or less equivalent to an attribute, remains unsolved.
On the other hand, the definition of
bidirectional associations between classifiers belonging to different packages
turns out to be problematic. Indeed: first, an element is defined in one
package only, and, outside its owner package, it can be used, but it cannot be
modified. Second, a navigable association induces a dependency from the source
classifier towards the target classifier, a dependency that affects the
definition of the source classifier, so that this dependency must be defined in
the same package as the source classifier; therefore, a navigable association
end must be defined in the same package as the source classifier. Third, it is
not possible to define a bidirectional association between classifiers from
different packages, since this would require the association (that is, its
ends) to be defined in both packages, but an association must be defined in a
unique package, as any other model element. Therefore, a bidirectional
association can be defined only between classifiers belonging to the same
package. Even though this may seem a severe limitation of UML, it is really
a natural, though not very evident, consequence of the package concept.
The use of attribute and operation
visibility is not enough to achieve an effective decoupling between the
classifiers in a model, since this visibility does not discriminate between the
different associations a classifier is connected to. The ideal would be
to have a different interface for each association, so that the
classifier connected at the opposite end would have a dependency limited to the
features included in the interface. In UML there are two very similar
mechanisms (even too similar for the distinction to make sense) that
permit tackling this problem: the interface specifier, and the interface
in the strict sense.
The interface specifier gives
a partially adequate solution to this problem, especially if we adopt the
revised definition proposed in this Chapter; nevertheless, the source class
still depends on the concrete target class, even though it knows only the part
revealed by the interface specifier. In contrast, by using a proper
interface we achieve full independence from the concrete target class,
which can be any class realizing the interface; in this way we achieve, in
addition to the specification of the required functionality through the
association, the sharing of the association end among several classes without
need of creating a superclass. However, since an interface in UML specifies the
behavior but not the structure of the classifier that realizes it, an interface
is not allowed to take part in bidirectional associations, because this would
require that the interface had a structure. In any case, it is reasonable to
extend UML with a less restrictive notion of interface (termed by some
as “role”) that includes not only the specification of behavior, but also
the specification of state. Indeed, the use of interfaces with behavior and
state helps not only to express better the interaction required through
the association, but also the required structure, thus achieving a
better integration of static and dynamic aspects of the association. This new notion
of interface implies a new notion of compatibility or realization as
well, that includes both aspects.
In this sense, we have proposed a
new definition of associations that is no longer established between classifiers,
but between interfaces, which can be realized by one or more classifiers,
thus permitting great flexibility in design. The interface on each end
specifies the structure and the behavior that can be known through the
association on that concrete end, and that the associated objects must satisfy.
We have also presented a notation that allows different complexity levels to
express the same association with more or less detail; in the simplest levels
the interfaces are not shown, or shown in reduced form; in successive levels
they are shown with more detail and their role in association ends is
emphasized.
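The idea of an association defined between interfaces can be approximated in Java with the following minimal sketch. It is only an illustration under stated assumptions: the names Loanable, Borrower, Book, Dvd and Member are hypothetical, and since Java interfaces declare behavior only, the structure (state) required at each end is approximated here by accessor methods.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Role known when navigating towards the "item" end of the association.
interface Loanable {
    String title(); // state exposed through an accessor
}

// Role known when navigating towards the "borrower" end of the association.
interface Borrower {
    void borrow(Loanable item);
    List<Loanable> loans();
}

// Two unrelated classes share the same association end without a superclass.
final class Book implements Loanable {
    private final String title;
    Book(String title) { this.title = title; }
    @Override public String title() { return title; }
}

final class Dvd implements Loanable {
    private final String title;
    Dvd(String title) { this.title = title; }
    @Override public String title() { return title; }
}

// The source class depends only on the Loanable interface,
// not on the concrete classes that realize it.
final class Member implements Borrower {
    private final List<Loanable> loans = new ArrayList<>();

    @Override public void borrow(Loanable item) { loans.add(item); }

    @Override public List<Loanable> loans() {
        return Collections.unmodifiableList(loans);
    }
}

final class InterfaceAssociationDemo {
    public static void main(String[] args) {
        Borrower member = new Member();
        member.borrow(new Book("UML Distilled"));
        member.borrow(new Dvd("Modeling Talks"));
        for (Loanable item : member.loans()) {
            System.out.println(item.title());
        }
    }
}
```

Any new class that realizes Loanable can take part in the association without changes to Member, which is the decoupling the interface-based definition aims at.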
Finally, we have applied this new
concept of association, showing how its use permits the simplification of
attribute and operation visibility, and solves the inconveniences of private
visibility for reflexive associations. We have also shown the relevance of
whether an interface is connected to only one association or to more than one.
The implementation of associations
In this Chapter we have developed a
concrete way of mapping UML associations into Java code: we have written
specific code patterns, and we have constructed a tool that reads a UML design
model stored in XMI format and generates the necessary Java files. We have paid
special attention to three main features of associations: multiplicity,
navigability and visibility. Our analysis has encountered difficulties that may
reveal some weaknesses of the UML Specification.
Regarding multiplicity, we have
shown that it is impossible in practice, with a few primitive operations, to maintain
the minimum multiplicity constraint at every moment on a mandatory association
end; our proposal is to check this constraint only when accessing the links,
but not when modifying them. The programmer will be responsible for using the
primitives in a consistent way so that a valid system state is reached as soon
as possible. In contrast, it is possible to ensure the fulfilment of the
maximum multiplicity constraint at run-time, and so we enforce it in our
implementation. Single association ends are easily stored in attributes having
the related target class as type, but multiple association ends require the use
of collections to store the corresponding set of links; as collections in Java
are based on the standard Object class, it is necessary to perform run-time type-checking by means of
explicit casting when using collections as parameters in the mutator methods.
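The multiplicity policy described above can be sketched as follows. This is not the code generated by the thesis tool: the classes Course and Teacher, the 1..2 multiplicity, the boolean return values and the choice of IllegalStateException are all assumptions made for illustration. The sketch also uses Java generics, which replace the explicit casting mentioned above with compile-time type checking.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical target class of the association end.
final class Teacher { }

// Hypothetical source class with a navigable end "teachers" of multiplicity 1..2.
final class Course {
    static final int MIN_TEACHERS = 1;
    static final int MAX_TEACHERS = 2;

    private final List<Teacher> teachers = new ArrayList<>();

    // Mutator: the maximum multiplicity is enforced on every modification.
    boolean addTeacher(Teacher t) {
        if (teachers.size() >= MAX_TEACHERS || teachers.contains(t)) {
            return false; // link rejected, state unchanged
        }
        return teachers.add(t);
    }

    // Mutator: the minimum multiplicity is deliberately NOT checked here,
    // so the object may pass through transient invalid states.
    boolean removeTeacher(Teacher t) {
        return teachers.remove(t);
    }

    // Accessor: the minimum multiplicity is checked only when links are read.
    List<Teacher> getTeachers() {
        if (teachers.size() < MIN_TEACHERS) {
            throw new IllegalStateException("minimum multiplicity violated");
        }
        return Collections.unmodifiableList(new ArrayList<>(teachers));
    }
}

final class MultiplicityDemo {
    public static void main(String[] args) {
        Course course = new Course();
        course.addTeacher(new Teacher());
        System.out.println(course.getTeachers().size());
    }
}
```

With this design a new Course starts in an invalid state (no teachers) and the caller is expected to add at least one teacher before reading the end.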
Regarding navigability,
unidirectional associations are easier to implement by means of attributes than
bidirectional associations, because of the difficulties in synchronizing both
association ends. An update to a bidirectional association must be performed
atomically on both ends to keep them consistent; this is achieved in the source
object by issuing a reciprocal update on the target object. We have considered
the pros and cons of an alternative implementation, based on the storage of
“reified tuples”, and finally we have discarded it in favor of our “synchronized cross-references” scheme. A side consequence of
our analysis is that the multiplicity constraint in a design model can be
specified only for a navigable association end.
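A reciprocal update of this kind can be sketched as follows. It is only an illustration of the general idea, not the code pattern of the thesis: the classes Department and Employee, their method names and the guard conditions that stop the mutual recursion are assumptions, and the sketch ignores multiplicity checks and thread safety.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical "one" side holding the multiple end (0..* members).
final class Department {
    private final Set<Employee> members = new HashSet<>();

    boolean addMember(Employee e) {
        if (!members.add(e)) return false; // already linked: stops the recursion
        e.setDepartment(this);             // reciprocal update on the other end
        return true;
    }

    boolean removeMember(Employee e) {
        if (!members.remove(e)) return false; // not linked: nothing to do
        if (e.getDepartment() == this) {
            e.setDepartment(null);            // reciprocal update on the other end
        }
        return true;
    }

    boolean hasMember(Employee e) { return members.contains(e); }
}

// Hypothetical "many" side holding the single end (0..1 department).
final class Employee {
    private Department department;

    void setDepartment(Department d) {
        if (department == d) return;             // already consistent: stops the recursion
        Department old = department;
        department = d;
        if (old != null) old.removeMember(this); // unlink from the previous target
        if (d != null) d.addMember(this);        // reciprocal update on the new target
    }

    Department getDepartment() { return department; }
}

final class CrossReferenceDemo {
    public static void main(String[] args) {
        Department sales = new Department();
        Employee ana = new Employee();
        sales.addMember(ana);
        System.out.println(ana.getDepartment() == sales);
    }
}
```

Note that each end must be able to call the mutator of the opposite end, which is why private or protected ends make the reciprocal update problematic.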
Regarding visibility, in the case of
unidirectional associations it can be implemented rather easily by simply
mapping the visibility of the association end onto the visibility of the
corresponding accessor and mutator methods, because UML and Java visibility
levels have approximately the same semantics (except for Java’s protected,
which is equivalent to the union of UML’s protected and package). However, bidirectional
associations with one or two private (or protected) ends behave paradoxically,
because the reciprocal update becomes impossible.
The generated code for each
association is easily localized inside the involved Java classes. Each
association end presents a uniform programmer's interface. The interface is
exactly the same for unidirectional and bidirectional association ends, but
there are slight differences for single and multiple association ends.
Our approach is rather
exhaustive in checking invariants. We think it is worth doing as much as we can
for the programmer, so our tool will insert code to perform
run-time multiplicity and type checking and, of course, to issue reciprocal
updates on bidirectional associations. However, different tool options will
allow the user to override the automatic multiplicity and type checks when
generating code, in favor of efficiency. Besides, we have argued that
unidirectional associations should not have a multiplicity constraint on the
source end in a design model, and bidirectional associations should not have
both ends with private (or protected) visibility; therefore, the tool will
refuse to generate code for these associations. Again, the user will be able
to disable this model-correctness checking and launch code generation at
his/her own risk.