4 Identifying objects

Now, and only now that we have the intellectual tools necessary to describe and model objects, we can start to think about how to discover and invent them. We have seen that spotting the nouns and verbs in the post-conditions of use cases is a very useful technique for establishing the vocabulary of the domain and thereby the static type model. However, detailed discussion of how to go about identifying objects has been deferred until now, because we could not have described what we were looking for had we dealt with it earlier. A second reason is that object identification is recognized as a key bottleneck, possibly the key bottleneck, in applying both object-oriented analysis and design. In this respect the topic deserves a section of its own, in which we will try to show that the object identification bottleneck is as chimerical as the famous Feigenbaum knowledge acquisition bottleneck. Techniques exist to help apprehend objects, mostly ones already in use in data modelling and knowledge engineering, and good analysts do not, in practice, stumble badly in this area.

Textual analysis

Booch's original object-oriented design method began with a dataflow analysis, which was then used to help identify objects: both concrete and abstract objects in the problem space were found from the bubbles and data stores in a data flow diagram (DFD). Methods were then obtained from the process bubbles of the DFD. An alternative but complementary approach, first suggested by Abbott (1983), is to extract the objects and methods from a textual description of the problem. Objects correspond to nouns and methods to verbs. Verbs and nouns can be further subclassified. For example, there are proper and improper nouns, and verbs to do with doing, being and having. Doing verbs usually give rise to methods, being verbs to classification structures and having verbs to composition structures. Transitive verbs generally correspond to methods, but intransitive ones may refer to exceptions or time-dependent events, as in a phrase such as 'the shop closes'. This process is a helpful guide but cannot be regarded as any sort of formal method. Intuition is still required to arrive at the best design. The technique can be partially automated, as some of the tools developed to support HOOD and the work of Saeki et al. (1989) showed.

For example, a requirements statement transcript might contain the following fragment:

We have emboldened some candidate classes (nouns) and italicized some candidate methods (transitive verbs). Possible attributes or associations are underlined. The process could be continued using the guidelines set out in Table 1, but it must be understood that this is not a formal technique in any sense of the word. The table is intended merely to provoke thought during analysis.

Most of the methods descended from Booch's work, including those based on UML, use some form of textual analysis of this sort.

HOOD, RDD and some other methods used Abbott textual analysis, but otherwise there are no precise, normative techniques. Coad and Yourdon (1991) say that analysts should look for things or events remembered, devices, rôles, sites and organizational units. The Shlaer-Mellor method offers five categories: tangible entities, rôles, incidents, interactions and specifications. This is all very well, but rather fuzzy and certainly not complete. Coad et al. (1999) say that there are exactly five kinds of archetypal objects and give standard features for each one. Each archetype is associated with a colour, based on the colours of commercially available pads of paper stickers. Their archetypes are objects representing:

It is difficult to be completely convinced that the world is really so simple, but the idea is a useful one and emphasizes the recurrence of analysis patterns. This section provides several, quite precise, normative techniques for eliciting objects. They are founded in knowledge engineering, HCI practice and philosophy.


Table 1 Guidelines for textual analysis.

Part of speech      Model component                          Example from SACIS text
proper noun         instance                                 J. Smith
improper noun       class/type/rôle                          toy
doing verb          operation                                buy
being verb          classification                           is an
having verb         composition                              has an
stative verb        invariance-condition                     are owned
modal verb          data semantics, pre-condition,           must be
                    post-condition or invariance-condition
adjective           attribute value or class                 unsuitable
adjectival phrase   association                              the customer with children
                    operation                                the customer who bought the kite
transitive verb     operation                                enter
intransitive verb   exception or event                       depend
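The mapping in Table 1 can be sketched as a simple lexicon-driven scan. The tiny lexicon and sample sentence below are invented for illustration; a real tool, such as those built for HOOD, would use a proper part-of-speech tagger rather than a hand-made dictionary:

```python
# Hypothetical sketch of Abbott-style textual analysis driven by the
# Table 1 mapping. Lexicon entries and the sample text are invented.

LEXICON = {
    "J. Smith": ("proper noun", "instance"),
    "toy": ("improper noun", "class/type/role"),
    "customer": ("improper noun", "class/type/role"),
    "buy": ("doing verb", "operation"),
    "enter": ("transitive verb", "operation"),
    "depend": ("intransitive verb", "exception or event"),
    "is an": ("being verb", "classification"),
    "has an": ("having verb", "composition"),
    "must be": ("modal verb", "pre-/post-/invariance-condition"),
    "unsuitable": ("adjective", "attribute value or class"),
}

def candidates(text: str) -> list[tuple[str, str, str]]:
    """Return (token, part of speech, suggested model component) triples."""
    found = []
    for token, (pos, component) in LEXICON.items():
        if token in text:
            found.append((token, pos, component))
    return found

sample = "The customer must be able to buy a toy that is an unsuitable gift."
for token, pos, component in candidates(sample):
    print(f"{token!r:14} {pos:18} -> {component}")
```

As the text stresses, such output is only a list of candidates to provoke thought; the analyst must still judge which nouns become classes and which verbs become operations.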

As we have suggested, most object-oriented analysis methods give little help on the process of identifying objects. A very reasonable approach is to analyse the nouns, verbs and other parts of speech in an informal written description of the problem, or in a set of use case post-conditions. Rules of thumb here include: match proper nouns to instances and improper nouns to types or attributes; adjectival phrases qualifying nouns, such as 'the employee who works in the salaries department', indicate relations, or may indicate methods if they contain verbs, as in 'the employee who got a rise'.

The Abbott technique is useful, but cannot succeed on its own. This semi-structured approach is only a guide and creative perception must be brought to bear by experienced analysts, as we have already emphasized. Using it involves difficult decisions.

Essential & accidental judgments

Fred Brooks (1986) notes the difference between essence and accidents in software engineering. The distinction is, in fact, a very old one going back to Aristotle and the mediaeval Scholastics. The idea of essence was attacked by modern philosophers from Descartes onwards, who saw objects as mere bundles of properties with no essence. This gave rise to severe difficulties because it fails to explain how we can recognize a chair with no properties in common with all the previous chairs we have experienced. A school of thought known as Phenomenology, represented by philosophers such as Hegel, Brentano, Husserl and Heidegger, arose inter alia from attempts to solve this kind of problem. Another classical problem, important for object-oriented analysis, is the problem of categories. Aristotle laid down a set of fixed pairs of categories through the application of which thought could proceed. These were concepts such as Universal/Individual, Necessary/Contingent, and so on. Kant gave a revised list but, as Hegel once remarked, didn't put himself to much trouble in the doing. The idealist Hegel showed that the categories were related and grew out of each other in thought. Finally, the materialist Marx showed that the categories of thought arose out of human social and historical practice:

So, we inherit categories from our forebears, but also learn new ones from our practice in the world.

All phenomenologists and dialecticians, whether idealist or materialist, acknowledge that the perception or apprehension of objects is an active process. Objects are defined by the purpose of the thinking subject, although for a materialist they correspond to previously existing patterns of energy in the world - including of course patterns in the brain. A chair is a coherent object for the purposes of sitting (or perhaps for bar-room brawling) but not for the purposes of sub-atomic physics. You may be wondering by now what all this has got to do with object-oriented analysis. What does this analysis tell us about the identification of objects? The answer is that it directs attention to the user.

User-centred analysis requires that we ask about purpose when we partition the world into objects. It also tells us that common purpose is required for reusability, because objects designed for one user may not correspond to anything in the world of another. In fact reuse is only possible because society and production determine a common basis for perception. A clear understanding of Ontology helps to avoid the introduction of accidental, as opposed to essential, objects. Thus, Fred Brooks, in our opinion, either learned some Ontology or had it by instinct alone.

Some useful tips for identifying important, rather than arbitrary, objects can be gleaned from a study of philosophy, especially Hegelian philosophy and modern Phenomenology. Stern (1990) analyses Hegel's concept of the object in great detail. The main difference between this notion of objects and other notions is that objects are neither arbitrary 'bundles' of properties (the Empiricist or Kantian view), nor are they based on a mysterious essence, but are conceptual structures representing universal abstractions. The practical import of this view is that it allows us to distinguish between genuine high level abstractions such as Man and completely contingent ones such as Red Objects. Objects may be judged according to various, historically determined, categories. For example 'this rose is red' is a judgment in the category of quality. The important judgments for object-oriented analysis and their relevant uses are those shown in Table 2.

Table 2 Analysis of judgments.

Judgment      Example                   Feature
Quality       this ball is red          attribute
Reflection    this herb is medicinal    relationship
Categorical   Fred is a man             generalization
Value         Fred should be kind       rules

The categorical judgment is the one that reveals genuine high level abstractions. We call such abstractions essential. Qualitative judgments only reveal contingent and accidental properties, unlikely to be reusable but nevertheless of semantic importance within the application. Beware, for example, of abstractions such as 'red roses' or 'dangerous toys'; they are qualitative and probably not reusable without internal restructuring. Objects revealed by qualitative judgments are called accidental. Accidental objects are mere bundles of arbitrary properties, such as 'expensive, prickly, red roses wrapped in foil'. Essential objects are universal, in the sense that they are (or belong to) classes which correspond to objects that have already been identified by human practice and are stable in time and space. What they are depends on human purposes; prior to trade, money was not an object. Reflective judgments are useful for establishing usage relationships and methods; being medicinal connects herbs to the sicknesses that they cure. Value judgments may be outside the scope of a computer system, but can reveal semantic rules. For example, we could have, at a very high business analysis level, 'employees should be rewarded for loyalty', which at a lower level would translate to the rule: 'if five years' service then an extra three days' annual leave'.
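The loyalty example shows how a value judgment refines into an executable rule. A minimal sketch follows; the baseline of 25 days and the function name are our own illustrative assumptions, not part of the text:

```python
# Sketch of the value judgment 'employees should be rewarded for loyalty'
# refined into the concrete rule from the text: five years' service earns
# an extra three days' annual leave. Figures are illustrative assumptions.

BASE_LEAVE_DAYS = 25  # assumed company baseline, not from the text

def annual_leave(years_of_service: int) -> int:
    """Apply the loyalty rule to compute annual leave entitlement."""
    extra = 3 if years_of_service >= 5 else 0
    return BASE_LEAVE_DAYS + extra

print(annual_leave(2))  # 25
print(annual_leave(7))  # 28
```

In an object model, a rule of this kind would sit in the rules compartment of the Employee class rather than being buried in procedural code.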

Attributes are functions that take objects as values; that is, their ranges are classes. They may be distinguished into attributes whose values are abstract (or essential in the sense alluded to above) objects like employee, and those with printable, i.e. accidental, objects as values like redness. This observation has also been made in the context of semantic data models by Hull and King (1987).
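Hull and King's distinction can be made concrete in a type system: an attribute whose range is a class holds an abstract (essential) object, while one whose range is a printable type holds an accidental value. The Employee and Department classes below are invented for illustration:

```python
# Sketch of the distinction between attributes with abstract-valued
# (essential) ranges and printable (accidental) ranges.
# Employee and Department are hypothetical example classes.

from dataclasses import dataclass

@dataclass
class Department:            # an abstract, essential object
    name: str

@dataclass
class Employee:
    surname: str             # printable, accidental value
    hair_colour: str         # printable, accidental value
    department: Department   # abstract-valued attribute: its range is a class

payroll = Department("Payroll")
emp = Employee("Smith", "red", payroll)
assert isinstance(emp.department, Department)
```

The typed `department` attribute is a candidate association in the static model, whereas `hair_colour` remains a mere quality of the instance.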

For business and user-centred design, the ontological view dictates that objects should have a purpose. Operations too should have a goal. In several methods this is accomplished by specifying post-conditions. These conditions should be stated for each method (as in Eiffel) and for the object as a whole in the rules compartment of an object.
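A hedged sketch of this idea in Python follows, imitating Eiffel's style with assertions: each operation checks its own post-condition, and a class-wide invariant stands in for the rules compartment. The Account class is a hypothetical example, not taken from the text:

```python
# Eiffel-style post-conditions sketched in Python assertions: each
# operation checks its stated goal, and a class-level invariant plays
# the role of the rules compartment. Account is an invented example.

class Account:
    def __init__(self, balance: float = 0.0):
        self.balance = balance
        assert self._invariant()

    def _invariant(self) -> bool:
        return self.balance >= 0.0   # class-wide rule (invariance-condition)

    def deposit(self, amount: float) -> None:
        old = self.balance
        self.balance += amount
        # post-condition: the operation achieved its stated goal
        assert self.balance == old + amount and self._invariant()

acct = Account(10.0)
acct.deposit(5.0)
print(acct.balance)  # 15.0
```

Unlike Eiffel, Python offers no built-in contract syntax, so the checks here are ordinary assertions; the point is only that every operation carries an explicit statement of its purpose.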

Lenat and Guha (1990) suggest that instances are things that something definite can be said about, but point out the danger of relying too much on the structure of natural language. They suggest that a concept should be abstracted as a class if:

They also emphasize the point we have made: that purpose is the chief determinant of what is to be a class or type.

A useful rule of thumb for distinguishing essential objects is that one should ask if more can be said about the object than can be obtained by listing its attributes and methods. It is cheating in the use of this rule to merely keep on adding more properties. Examples abound of this type of object. In a payroll system, an employee may have red hair, even though this is not an attribute, or be able to fly a plane, even though this is not a method. Nothing special can be said about the class 'employees who can fly' unless, of course, we are dealing with the payroll for an airline. What is essential is context sensitive.

Very long methods, objects with hundreds of attributes and/or hundreds of methods indicate that you are trying to model something that normal mortals couldn't apprehend in any conceivable perceptive act. This tells us, and we hope your project manager, that you haven't listened to the users.

It is not only the purposes of the immediate users that concern us, but the purposes of the user community at large and, indeed, of software engineers who will reuse your objects. Therefore, analysts should keep reuse in mind throughout the requirements elicitation process. Designing or analysing is not copying user and expert knowledge. As with perception, it is a creative act. A designer, analyst or knowledge engineer takes the purposes and perceptions of users and transforms them. S/he is not a tabula rasa - a blank sheet upon which knowledge is writ - as older texts on knowledge elicitation used to recommend, but a creative participant.

Johnson and Foote (1988) make a few suggestions about when to create a new class rather than add a method to an existing class, and these seem to conform to the ontological insights of this section.

Epistemology has been studied by knowledge engineers involved in building expert systems. Many of the lessons learnt and the techniques they have discovered can be used in building conventional systems, and this is now routine in our work. In particular, they can be applied to HCI design (Johnson, 1992) and within object-oriented analysis and design.