In the first generic systems for model-based diagnosis [5], poor performance on complex applications was the price paid for flexibility. Early systems, based on logical inference, suffered from repeated recomputation during diagnosis. After several years of research and intermediate results [6, ], Raiman, de Kleer and Saraswat succeeded in constructing a diagnosis engine that could solve large combinatorial problems by focusing computation and avoiding the recomputation of values [12].
DRUM-2 emerged from a different line of research. It treats a model of the system as a data structure and computes diagnoses by manipulating this model. It is similar in spirit to the work of Chou and Winslett on model-based belief revision [2]. By directly manipulating the model, DRUM-2 avoids the recomputation of values (which can be accessed through the model) as well as symbolic operations, which tend to slow down computation. Using this approach, DRUM-2 solves combinatorial benchmark circuits with more than 3000 components efficiently [11]. A detailed description of the semantics can be found in [7].
In this section we report the performance of our model-based approach on a library of 32 representative alarm cases. We are currently extending this to 512 alarm cases from our GSM subnetwork. Current results, test cases and references can be found on our web page http://www.kbs.uni-hannover.de/project/alarm.html.
The alarm cases represent a wide range of alarm patterns for the
subnetwork shown in the previous examples (
to
). As stated earlier, the heavy traffic caused
by defects of microwave links (alarm bursts) makes the probability of
suppressed or lost BTS failure alarms rather high. For the
test cases we assumed a probability of 0.1 for a lost alarm message and
0.01 for a faulty microwave link. The probabilities are much lower in
reality, but for the purpose of diagnosis the exact values do not
matter. The probabilities are needed only to discriminate between more
plausible diagnoses (which assume fewer lost messages and faulty microwave
links) and less plausible ones.
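The way these two probabilities separate more plausible from less plausible diagnoses can be illustrated with a minimal sketch. It is not the DRUM-2 implementation: the `Diagnosis` type, the function names, and the scoring rule are assumptions made for illustration. The sketch assumes that faults and message losses are independent and scores a diagnosis only by its abnormality assumptions (the factors for intact links and delivered messages are ignored).

```python
from dataclasses import dataclass

# Values taken from the text; everything else in this sketch is hypothetical.
P_LOST_ALARM = 0.1    # probability that a single alarm message is lost
P_FAULTY_LINK = 0.01  # probability that a single microwave link is faulty


@dataclass(frozen=True)
class Diagnosis:
    """A candidate explanation: which links failed and which alarms were lost."""
    faulty_links: frozenset
    lost_alarms: frozenset


def plausibility(d: Diagnosis) -> float:
    """Unnormalised score under an assumed independence of all abnormalities."""
    return (P_FAULTY_LINK ** len(d.faulty_links)
            * P_LOST_ALARM ** len(d.lost_alarms))


def rank(diagnoses):
    """Order candidate diagnoses from most to least plausible."""
    return sorted(diagnoses, key=plausibility, reverse=True)


# Three hypothetical candidates for the same alarm pattern.
one_link_one_lost = Diagnosis(frozenset({"ml1"}), frozenset({"bts2"}))
two_links = Diagnosis(frozenset({"ml2", "ml3"}), frozenset())
one_link_three_lost = Diagnosis(frozenset({"ml1"}),
                                frozenset({"bts1", "bts2", "bts3"}))

ranking = rank([one_link_three_lost, two_links, one_link_one_lost])
```

With the values above, a single faulty link is about as costly as two lost messages, so the candidate with one faulty link and one lost message ranks ahead of the candidate that assumes two faulty links.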
In all test cases except one the system identifies the correct diagnosis, either as the single plausible diagnosis, or as the most probable diagnosis. We comment on the only exception in the next section.
The running time of our prototype is also very encouraging. Table 1 shows the typical running time of our system for one test case. In the first row we show the time for the subnetwork used above. The second row shows the running time on the complete network of a large German city. All times were measured on a SUN Ultra 1 workstation.
| Network | # BTSs | # MLs | Time |
| --- | --- | --- | --- |
| One Subnetwork | 5 | 5 | 0.8 s |
| A City Network | 22 | 20 | 2.5 s |
Let us now examine three of our test cases in more detail to discuss the scope of our current approach. The first example obeys our first deterministic model as well as the probabilistic approach:
In this example microwave link 1 is faulty and error messages are
generated for all base transceiver stations located downstream. Since
none of these messages is lost, the example can also be explained by
the deterministic model. In the next example the message from
is lost:
This case is still handled correctly by the probabilistic model, because
the other minimal diagnoses
,
,
and
,
,
,
,
are less likely (since they assume more
lost messages). Using the most-probable-diagnosis approach, our system
handles 31 out of 32 alarm cases correctly. In the following case
it produces no diagnosis, since all relevant alarm messages were lost or
suppressed.
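The three situations discussed above (no message lost, one message lost, all relevant messages lost) can be reproduced with a small sketch. The topology in `DOWNSTREAM` is a hypothetical three-link chain, not the subnetwork from the figures, and the exhaustive enumeration is chosen only for brevity; it is not how DRUM-2 computes diagnoses. The scoring again assumes independent faults and message losses.

```python
from itertools import combinations

P_LOST_ALARM = 0.1    # value from the text
P_FAULTY_LINK = 0.01  # value from the text

# Hypothetical topology: each microwave link maps to the BTSs located
# downstream of it. A faulty link should raise an alarm for each of them.
DOWNSTREAM = {
    "ml1": {"bts1", "bts2", "bts3"},
    "ml2": {"bts2", "bts3"},
    "ml3": {"bts3"},
}


def expected_alarms(faulty_links):
    """Alarms the deterministic model predicts for a set of faulty links."""
    alarms = set()
    for link in faulty_links:
        alarms |= DOWNSTREAM[link]
    return alarms


def candidate_diagnoses(observed):
    """Yield (faulty links, lost alarms, score) for every consistent fault set.

    A fault set is consistent if it predicts every observed alarm; predicted
    alarms that were not observed are assumed to be lost messages.
    """
    links = sorted(DOWNSTREAM)
    for size in range(len(links) + 1):
        for faulty in combinations(links, size):
            expected = expected_alarms(faulty)
            if observed <= expected:
                lost = expected - observed
                score = (P_FAULTY_LINK ** len(faulty)
                         * P_LOST_ALARM ** len(lost))
                yield set(faulty), lost, score


def most_probable(observed):
    """Return the highest-scoring candidate for the observed alarm set."""
    return max(candidate_diagnoses(observed), key=lambda c: c[2])


all_alarms = most_probable({"bts1", "bts2", "bts3"})  # no message lost
one_lost = most_probable({"bts1", "bts3"})            # alarm of bts2 lost
nothing_seen = most_probable(set())                   # every alarm lost
```

In this sketch the first two observations both lead to `ml1` as the only faulty link, the second with the `bts2` alarm marked as lost. When no alarm arrives at all, the best candidate is the empty fault set, which mirrors the case in which no diagnosis can be produced.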