% dpaper.cc
% Designing Multi-user KBMSs
% Vinay K. Chaudhri
% Unpublished Report, University of Toronto, February 1991.

KBMSs are a product of building new functionality into databases or of incorporating database-like features into expert systems. Such systems find applications in engineering design, strategic information systems, computer integrated manufacturing systems, etc. ``Knowledge sharing'' and ``connectivity'' are going to be the key features of the intelligent information systems of the future. This means that different users should be able to access the KBMS, the core of such systems, simultaneously. However, arbitrary concurrent access to a resource can lead to many inconsistencies in the stored and retrieved information. This paper identifies the difficulties encountered in the design of such systems and describes possible solution methodologies. It first reviews state-of-the-art KR systems. It then gives examples of concurrency control difficulties faced in such systems, looks at solution methodologies in the areas of concurrency control and performance modelling, and concludes by identifying some directions for future research.

% icde92
% Query Optimization for KBMSs: Temporal, Syntactic and Semantic Transformations
% Thodoros Topaloglou, Arantza Illarramendi, Licia Sbattella
% Appears in the Proceedings of the 8th International Conference on
% Data Engineering, February 3-7, 1992, Phoenix, Arizona

This paper describes a framework for query optimization for KBMSs based on the knowledge representation language Telos. The framework involves temporal and syntactic simplification and semantic modification of queries. Temporal simplification attempts to select the parts of a knowledge base that are relevant to a query from a temporal viewpoint. Syntactic simplification exploits structural properties of the data model and attempts to transform the query into an equivalent and more efficient one. Semantic transformation uses knowledge specific to the application domain, i.e., integrity constraints and rules, to transform a user-specified query into another form that produces the same answer and can be processed more efficiently. The above three steps are integrated into a global algorithm for query optimization that utilizes all features of the considered KBMS.

% canai92
% A Formal Analysis of Solution Caching
% Vinay K. Chaudhri and Russell Greiner
% Appears in the Proceedings of the Canadian Artificial Intelligence
% Conference, 11-13 May, Vancouver, 1992.

Many inference management systems store and maintain the conclusions found during a derivation process in a form that allows these conclusions to be used during subsequent derivations. As this approach, called ``solution caching'', allows the system to avoid repeating these derivations, it can reduce the system's overall cost for answering queries. Unfortunately, as there is a cost for storing these conclusions, it is not always desirable to cache every solution found --- this depends on whether the savings achieved by performing fewer inference steps for future queries exceed the storage overhead incurred.
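To make the tradeoff concrete, here is a minimal sketch of a per-node caching decision under a hypothetical cost model; the function name and parameters below are illustrative assumptions, not the paper's actual formulation:

\begin{verbatim}
# Hypothetical per-node cost model for the caching decision; this is an
# illustration of the tradeoff only, not the paper's FOCL formulation.

def worth_caching(query_freq, update_freq, derive_cost,
                  store_cost, maintain_cost):
    """Cache a node's solutions iff the expected savings from answering
    queries out of the cache exceed the storage and maintenance overhead."""
    savings = query_freq * derive_cost                    # derivations avoided
    overhead = store_cost + update_freq * maintain_cost   # cost of the cache
    return savings > overhead

# A node that is queried often but rarely updated is worth caching:
print(worth_caching(query_freq=50, update_freq=2,
                    derive_cost=10.0, store_cost=5.0, maintain_cost=8.0))
# -> True (savings 500 vs. overhead 21)
\end{verbatim}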
This paper formally characterizes this tradeoff and presents an efficient algorithm, FOCL, that produces an optimal caching strategy: i.e., given an inference graph of a knowledge base, anticipated frequencies of queries and updates for each node in this graph, and various implementation-dependent cost parameters, FOCL determines which of these nodes should cache their solutions to produce a system whose overall cost is minimal. The paper also presents empirical results indicating that a system based on FOCL can significantly outperform one based on any of the standard approaches to solution caching.

% cml-palo92
% A Statistical Approach to Solving the EBL Utility Problem
% Russell Greiner, Igor Jurisica
% Appears in the Proceedings of the Tenth National Conference on Artificial
% Intelligence, San Jose, CA, July 12-16, 1992, pp 241-249
% email: greiner@learning.scr.siemens.com; juris@cs.utoronto.ca

Many ``learning from experience'' systems use information extracted from problem solving experiences to modify a performance element {\sc PE}, forming a new element {\sc PE$'$} that can solve these and similar problems more efficiently. However, as transformations that improve performance on one set of problems can degrade performance on other sets, the new {\sc PE$'$} is not always better than the original {\sc PE}; this depends on the distribution of problems. We therefore seek the performance element whose {\em expected performance}, over this distribution, is optimal. Unfortunately, the actual distribution, which is needed to determine which element is optimal, is usually not known. Moreover, the task of finding the optimal element, even knowing the distribution, is intractable for most interesting spaces of elements. This paper presents a method, {\sc palo}, that side-steps these problems by using a set of samples to estimate the unknown distribution, and by using a set of transformations to hill-climb to a local optimum. This process is based on a mathematically rigorous form of {\em utility analysis}: in particular, it uses statistical techniques to determine whether the result of a proposed transformation will be better than the original system. We also present an efficient way of implementing this learning system in the context of a general class of performance elements, and include empirical evidence that this approach can work effectively.

% cml-MSC92
% Query Optimization for KBMSs: A Machine Learning Approach
% MSc Thesis, University of Toronto, Dept. of Computer Science, 1992
% Igor Jurisica
% email: juris@cs.utoronto.ca

Efficient query processing is a critical issue in the design of knowledge base management systems. Several methods for query optimization already exist and have been used for database management systems since the late seventies. Some of these methods have also served as starting points for the development of optimization algorithms for knowledge base management systems. More recently, machine learning algorithms have been proposed as a promising artificial intelligence contribution to the theory of query optimization. This thesis proposes new machine learning applications to optimize queries in a knowledge base management system. In particular, a machine learning algorithm initially described in \cite{CoGr-CLNL91} is adopted, extended and tested.
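To give a flavor of the statistical machinery such {\sc palo}-style learners rely on, here is a minimal sketch of a Chernoff-style acceptance test; the interface, bound and constants are assumptions made for illustration, not the thesis's actual formulation:

\begin{verbatim}
import math

# Hypothetical sketch: accept a transformation only if the mean observed
# improvement over sampled queries is significantly positive.

def accept_transformation(improvements, delta=0.05, spread=1.0):
    """improvements: per-query cost(PE) - cost(PE') on sampled queries.
    Accept if, with confidence 1 - delta, the true mean improvement > 0."""
    n = len(improvements)
    mean = sum(improvements) / n
    # One-sided Hoeffding/Chernoff bound, assuming each improvement lies
    # in an interval of width `spread`.
    epsilon = spread * math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return mean > epsilon

# Example: 100 sampled queries, each sped up by ~0.3 cost units on average.
samples = [0.3] * 100
print(accept_transformation(samples))  # True: 0.3 exceeds the bound (~0.12)
\end{verbatim}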
The algorithm, called \npalo~(\underline{P}robably \underline{A}pproximately \underline{L}ocally \underline{O}ptimal), is a general model of a learning system and is directly applicable to a variety of systems as a speedup learning module. The algorithm is based on the theoretical work of Valiant \cite{Valiant-CACM84} and uses statistical information to produce a close approximation of a locally optimal search strategy. Some additions are made to the original version of the algorithm to solve a broader range of problems. In addition, the termination condition of the algorithm is changed to make it run faster without any degradation of its performance. The learning module is implemented, and its integration into an architecture of a knowledge base management system is shown. The proposed optimization technique is tested on real and artificial examples to establish its effectiveness.

% cikm92
% A Performance-Oriented Approach to Knowledge Base Management
% J. Mylopoulos, V. K. Chaudhri, D. Plexousakis and T. Topaloglou
% e-mail: jm,vinay,dp,thodoros @ai.toronto.edu
% Appears in Proceedings of the 1st International Conference on Information
% and Knowledge Management

Crafting large knowledge bases requires appropriate management facilities, including efficient and robust implementations, sophisticated user interfaces, version control and configuration management. This paper examines the problem of efficiently implementing a knowledge base management system (hereafter KBMS) by presenting an overview of technical problems addressed in the design of a prototype KBMS at the University of Toronto. In particular, the paper describes the physical storage management scheme developed for the KBMS and outlines a query processing algorithm that has been developed, emphasizing optimization and access planning aspects. Concurrency control is then examined for knowledge bases to support multi-user access. Finally, rule and constraint management is discussed and a comprehensive scheme for compiling and processing them is presented. Throughout, the paper sketches algorithms, presents some formally proven properties of these algorithms and discusses preliminary performance results.

% kr92
% Concurrency Control for Knowledge Bases
% Vinay K. Chaudhri, Vassos Hadzilacos, John Mylopoulos
% Appears in the Third International Conference on Knowledge Representation
% and Reasoning, Boston, October 1992.

As the demand for ever-larger knowledge bases grows, knowledge base management techniques assume paramount importance. In this paper we show that large, multi-user knowledge bases need concurrency control. We discuss known techniques from database concurrency control and explain their inadequacies in the context of knowledge bases. We offer a concurrency control algorithm, called the Dynamic Directed Graph (DDG) policy, that addresses the specific needs of knowledge bases. The DDG policy exploits the rich structure of a knowledge base to support the interleaved, concurrent execution of several user requests, thereby improving overall system performance. We give a proof of correctness of the proposed concurrency control algorithm and an analysis of its properties. We demonstrate that these results from concurrency control interact in interesting ways with knowledge base features, and we highlight the importance of performance-oriented tradeoffs in the design of knowledge-based systems.
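The DDG policy itself handles cyclic graphs and structural updates and is considerably more elaborate than anything shown here; the following minimal sketch conveys only the underlying flavor of graph-based locking, namely that acquiring node locks in one global order derived from the graph rules out deadlock. All names below are hypothetical:

\begin{verbatim}
import threading

# Minimal flavor of graph-based lock ordering (NOT the DDG policy itself):
# if every transaction acquires node locks in one global order derived from
# the graph, deadlock is impossible.

class GraphLockManager:
    def __init__(self, nodes):
        # Hypothetical: a fixed rank per node, e.g., a topological order
        # of the knowledge base's IsA/PartOf structure.
        self.rank = {node: i for i, node in enumerate(nodes)}
        self.locks = {node: threading.Lock() for node in nodes}

    def lock_all(self, nodes):
        """Acquire locks on `nodes` in ascending rank order."""
        for node in sorted(nodes, key=self.rank.__getitem__):
            self.locks[node].acquire()

    def unlock_all(self, nodes):
        for node in nodes:
            self.locks[node].release()

# Two transactions touching overlapping regions of the graph cannot
# deadlock, because both acquire their locks in the same global order.
mgr = GraphLockManager(["Thing", "Vehicle", "Car", "Wheel"])
mgr.lock_all({"Car", "Vehicle"})
mgr.unlock_all({"Car", "Vehicle"})
\end{verbatim}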
% ci93
% Semantical and Ontological Considerations in Telos: a Language for Knowledge
% Representation
% Dimitris Plexousakis
% Appears in Computational Intelligence, 9(1), February 1993

This paper focuses on the semantics of {\tt Telos}, a language for representing knowledge about information systems. {\tt Telos} is intended to support the development of information systems, especially in the requirements modeling phase. {\tt Telos} supports an object-oriented representational framework whose features include {\em aggregation, generalization} and {\em classification}, the treatment of attributes as first-class objects, and the explicit representation of time. {\tt Telos} also provides an assertion sublanguage for expressing deductive rules and integrity constraints. A possible-worlds semantics is defined for {\tt Telos} knowledge bases. This semantics is intended to capture the peculiarities involved in the interpretation of temporal expressions. The integration of time has also inspired the treatment of {\em existence} in {\tt Telos}: an ontology of objects based on the property of existence is proposed. In the spirit of {\tt KRYPTON}, {\tt Telos} knowledge bases are specified functionally, in terms of the operations provided for querying and updating them. This knowledge-level analysis allows us to specify exactly what a knowledge base can be {\tt ASK}-ed or {\tt TELL}-ed about the domain of discourse. Soundness, consistency and completeness results have also been proven to complete the specification of {\tt Telos} knowledge bases. This formal account of the language provides a logical framework that can be used to verify the correctness of any proposed implementation of the system.

% vldb93
% Integrity Constraint and Rule Maintenance in Temporal Deductive Knowledge
% Bases
% Dimitris Plexousakis
% Proceedings of the 19th International Conference on Very Large Data Bases
% Dublin, Ireland, August 1993.

The enforcement of semantic integrity constraints in data and knowledge bases constitutes a major bottleneck for system performance. Integrity constraint simplification methods aim at reducing the complexity of formula evaluation at run-time. This paper proposes such a simplification method for large and semantically rich knowledge bases. Structural, temporal and assertional knowledge, in the form of deductive rules and integrity constraints, is represented in Telos, a hybrid language for knowledge representation. A compilation method performs a number of syntactic, semantic and temporal transformations on integrity constraints and deductive rules, and organizes their simplified forms in a dependence graph that allows for efficient computation of implicit updates. Precomputation of potential implicit updates at compile time is made possible by computing the transitive closure of the dependence graph. To account for dynamic changes to the dependence graph caused by updates of constraints and rules, we propose efficient algorithms for the incremental maintenance of the computed transitive closure.

% cikm93
% Storage Management for Knowledge Bases
% Thodoros Topaloglou
% To appear in the Proceedings of the ISMM/ACM Second International Conference
% on Information and Knowledge Management (CIKM'93)

Secondary memory storage plays a major role in making large knowledge bases usable. This paper presents a secondary-storage architecture for knowledge bases.
First, the Controlled Decomposition Model, a flexible storage model that takes into account the numerous expressive features of the knowledge representation model, is defined. Second, the indexing problem in the knowledge base context is examined and the Temporal Join Index is proposed, along with an implementation technique for temporal indices based on a spatial access method. Finally, an analytical and experimental performance study is conducted to assess the performance limits of the proposed methods.

% vlkb93
% Adapting Database Implementation Techniques to Manage Very
% Large Knowledge Bases
% J. Mylopoulos, V. K. Chaudhri, D. Plexousakis and T. Topaloglou
% e-mail: jm,vinay,dp,thodoros @ai.toronto.edu
% Appears in Proceedings of the International Workshop on Building and
% Sharing Very Large-Scale Knowledge Bases, Dec 3-4, Tokyo, Japan.
% (This is a revised version of the paper cikm92)

The management of very large knowledge bases presupposes efficient and robust implementation techniques, sophisticated user interfaces and tools to support knowledge acquisition, validation and evolution. This paper examines the problem of efficiently implementing a knowledge base management system by adopting database techniques. In particular, the paper describes algorithms for designing logical and physical storage schemes and for efficiently processing queries with respect to a given knowledge base. In addition, the paper offers an overview of a new concurrency control algorithm that exploits knowledge base structure to support efficient multi-user access. Finally, rule and constraint management is discussed and a comprehensive scheme for compiling and processing them is presented. Throughout, the paper sketches algorithms, presents some formally proven properties of these algorithms and discusses performance results. The research presented in this paper was conducted at the University of Toronto for a project titled ``The Telos Knowledge Base Management Systems''.

% cbr-depthp
% Representation and Management Issues for Case-Based Reasoning Systems
% Igor Jurisica
% Unpublished Report, University of Toronto, September, 1993

A case-based reasoning system (CBRS) can be seen as a special type of knowledge-based system. Case-based systems deal with cases or episodes, which represent instances of concepts (or their generic forms). They are special in the kind of knowledge they use and, consequently, in the representation and management requirements they have. The performance of a case-based reasoning system depends directly on the quality and quantity of its encoded knowledge: on the one hand, if there is not enough knowledge, the problem-solving capabilities are limited; on the other hand, the system can become useless under unmanageable amounts of knowledge. The representation and management of large case knowledge bases are not trivial tasks and have not been sufficiently elaborated so far. These tasks can be characterized as a process of representation, visualization and application of episodic knowledge: specifying the content of the case (episode), describing a retrieval algorithm and precisely defining matching criteria, characterizing strategies to make past cases fit new problems (adaptation), and evolving the case base through the application of machine learning. The main theme of this document is to characterize these problems in case-based reasoning systems and to present available solutions to them.
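As an aside on the dependence-graph machinery in the VLDB'93 abstract above: incremental maintenance of a transitive closure under edge insertion can be sketched as follows. This is the textbook update with hypothetical names, not the paper's refined algorithms:

\begin{verbatim}
from collections import defaultdict

# Illustrative sketch: every path through a newly inserted edge (u, v)
# goes from an ancestor of u to a node reachable from v, so only those
# reachability sets need updating.

class DependenceGraph:
    def __init__(self):
        self.reach = defaultdict(set)   # reach[u]: nodes reachable from u
        self.redges = defaultdict(set)  # reverse edges, to find ancestors

    def insert_edge(self, u, v):
        self.redges[v].add(u)
        new_targets = self.reach[v] | {v}
        # Every node that can reach u now also reaches v's descendants.
        for w in self._ancestors(u) | {u}:
            self.reach[w] |= new_targets

    def _ancestors(self, u):
        seen, stack = set(), [u]
        while stack:
            x = stack.pop()
            for p in self.redges[x]:
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
        return seen

g = DependenceGraph()
g.insert_edge("rule1", "constraint1")
g.insert_edge("update1", "rule1")
print("constraint1" in g.reach["update1"])  # True: implicit update detected
\end{verbatim}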
% TSynchronization
% Transaction Synchronization in Knowledge Bases:
% Concepts, Realization and Quantitative Evaluation
% Vinay K. Chaudhri
% e-mail: vinay@cs.toronto.edu
% Preliminary Draft of Ph.D. thesis

As end-user demands on database technology increase, so does the need for ever richer data models which, nevertheless, need to be implemented efficiently and must ``scale up'' to meet the challenge of ever larger and ever more complex applications. Unfortunately, richer data models, which adopt modeling features from object-oriented, deductive, temporal and spatial data models, lead to complex multi-dimensional databases (or knowledge bases) with rich processing requirements involving sophisticated retrieval and inference procedures. In this thesis, we examine the concurrency control problem for databases that support such data models. With a directed graph as a model of knowledge bases, we develop an algorithm called the Dynamic Directed Graph (DDG) policy that meets the requirements of knowledge bases and exploits their semantic structure and multi-dimensional properties. Specifically, the DDG policy deals with a knowledge base whose graph may contain cycles and may undergo updates due to inserts and deletes of nodes and edges. This work is of interest not only for knowledge bases but also for object-oriented databases or, for that matter, any database that can be viewed as a directed graph. The thesis also includes an implementation of the DDG policy and performance results for three real knowledge bases. These results show that the DDG policy can be implemented without unacceptably high overhead and that, for the workloads found in knowledge bases, the proposed algorithm can indeed perform better than the conventional methods.

% cikm94
% Quantitative Evaluation of a Transaction Facility for a
% Knowledge Base Management System
% V. Chaudhri, J. Mylopoulos, V. Hadzilacos, K. Sevcik

Large knowledge bases intended for applications such as CAD, corporate repositories or process control will have to be shared by multiple users. For these systems to scale up, give acceptable performance and exhibit consistent behavior, it is mandatory to synchronize user transactions using a concurrency control algorithm. In this paper, we examine a novel concurrency control policy, the Dynamic Directed Graph (DDG) policy, that effectively exploits the rich semantic structure of a knowledge base. Our analysis is carried out in the context of a real knowledge-based application from which knowledge base structure and workload parameters are computed. These serve as a basis for studying the implementation alternatives that arise from knowledge base characteristics, including the selection of the portions of the knowledge base structure to be exploited for concurrency control and the dependence of concurrency on the traversal strategy used to search through the knowledge base. We analyze the effects of various workload parameters and conclude that the DDG policy substantially improves the response time of short transactions when there is heavy data contention.

% cbr-fss94
% How to Retrieve Relevant Information?
% Igor Jurisica
% Appears in Russell Greiner (Ed.): Proceedings of the AAAI Fall Symposium
% Series on Relevance. New Orleans, Louisiana, November 4-6, 1994
% email: juris@cs.utoronto.ca

Relevant information is required in all kinds of information-based systems.
It is not a trivial task to locate the right piece of information, since it may be difficult (or impossible) to specify query requests precisely and completely. Thus, a flexible retrieval algorithm is required, allowing for imprecise query specification. This document presents an approach to judging the relevance of retrieved information, based on a novel method of similarity assessment. Unlike other systems, we define relevance measures (the context of similarity) at query time. This is necessary since, without a context for similarity, one cannot guarantee that similar items will also be relevant.

% cbr-tr94-5
% A Similarity-Based Retrieval of Relevant Cases
% Igor Jurisica
% Technical report, DKBS-TR-94-5
% email: juris@cs.utoronto.ca

Retrieving relevant cases is a crucial component of case-based reasoning systems. The task is to use a user-defined query to retrieve useful information, i.e., exact matches or partial matches that are close to the query-defined request according to certain measures. The difficulty stems from the fact that it may not be easy (or may even be impossible) to specify query requests precisely and completely, resulting in a situation known as fuzzy querying. This is usually not a problem for small domains, but for large repositories that store varied information (multifunctional information bases or federated databases), request specification becomes a bottleneck. Thus, a flexible retrieval algorithm is required, allowing for imprecise query specification and for changing the viewpoint. Efficient database techniques exist for locating exact matches; finding relevant partial matches, however, can be a problem. This document proposes context-based similarity as a basis for flexible retrieval. Historical background on research in similarity assessment is presented and used as motivation for a formal definition of context-based similarity. We also describe a similarity-based retrieval system for multifunctional information bases.

% cbr-aise95
% A Similarity-Based Retrieval of Relevant Cases
% Igor Jurisica
% Appears in: Proceedings of the Third Workshop on AI and
% Software Engineering: Breaking the Toy Mold.
% IJCAI-95, Montreal, Quebec, August 20-21, 1995
% email: juris@cs.utoronto.ca

In this paper we present a prototype of a flexible similarity-based retrieval system. Its flexibility comes from allowing imprecisely specified queries. Moreover, our algorithm allows for assessing whether the retrieved items are relevant in the initial context specified in the query. The presented system can be used as a supporting tool for a software repository. We also discuss system evaluation with respect to usefulness, scalability, applicability and comparability. Evaluation of the \tatry system on three domains gives us encouraging results, and integration of \tatry into a real software repository as a retrieval tool is ongoing.

% cbr-rkbs95
% Applying Case-Based Reasoning to Control in Robotics
% Igor Jurisica and Janice Glasgow
% 3rd Robotics and Knowledge-Based Systems Workshop
% St. Hubert, Quebec, 1995.
% email: juris@cs.utoronto.ca

The paper describes an application of the case-based reasoning system TA3 to a control task in robotics. It differs from previous methods in its approach to similarity-based retrieval, which allows for greater flexibility and for improved accuracy in performance. The proposed architecture is experimentally evaluated on two real-world domains and the results are compared to other machine learning algorithms applied to the same problem.
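A recurring idea in the retrieval abstracts above is that the context of similarity is supplied at query time rather than fixed in advance. A minimal sketch, with a hypothetical case representation and weighting scheme:

\begin{verbatim}
# Illustrative sketch of context-dependent similarity: the context says
# which attributes matter and by how much, and is given with the query.

def similarity(case, query, context):
    """context maps attribute -> weight; only attributes in it count."""
    total = sum(context.values())
    matched = sum(w for attr, w in context.items()
                  if case.get(attr) == query.get(attr))
    return matched / total if total else 0.0

def retrieve(cases, query, context, threshold=0.5):
    """Return (score, case) pairs relevant under this context, best first."""
    scored = [(similarity(c, query, context), c) for c in cases]
    scored = [sc for sc in scored if sc[0] >= threshold]
    return sorted(scored, key=lambda sc: sc[0], reverse=True)

cases = [{"task": "grasp", "arm": "left", "load": "light"},
         {"task": "grasp", "arm": "right", "load": "heavy"}]
query = {"task": "grasp", "load": "light"}

# Under a context that weighs "load" heavily and ignores which arm is
# used, only the first case is judged relevant:
print(retrieve(cases, query, {"task": 1.0, "load": 2.0}))
\end{verbatim}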
% cbr-ilp96
% Inductive Learning and Case-Based Reasoning
% Igor Jurisica
% Canadian AI Conference, Workshop on What is Inductive Learning?
% Toronto, Ontario, 1996.
% email: juris@cs.utoronto.ca

% cbr-fss96
% Supporting Flexibility: A Case-Based Reasoning Approach
% Igor Jurisica
% The AAAI Fall Symposium. Flexible Computation in Intelligent
% Systems: Results, Issues, and Opportunities
% Cambridge, Massachusetts.
% email: juris@cs.utoronto.ca

This paper presents a prototype of a case-based reasoning system, called \tatry, that allows for flexible retrieval. The flexibility is defined in terms of:
\begin{itemize}
\item Variable accuracy of the final solution, depending on the resources used -- supporting recall-oriented retrieval (all relevant entities should be retrieved, even if some irrelevant ones are included) or precision-oriented retrieval (only relevant entities must be retrieved).
\item Support for imprecise query specification, which allows for retrieving cases even if the query is not specified completely and retrieves similar cases in addition to exact matches.
\item Support for automatic query modification as well as query-by-example and query-by-reformulation.
\end{itemize}
This is necessary since task requirements may change over time and individual users may have different preferences. In addition, the retrieval must be efficient, i.e., the system must scale up as more cases are acquired.

% cbr-dksme96
% A Case-Based Reasoning Approach to Learning Control
% Igor Jurisica and Janice Glasgow
% 5th International Conference on Data and Knowledge Systems for
% Manufacturing and Engineering, DKSME-96
% Phoenix, Arizona.
% email: juris@cs.utoronto.ca

The paper describes an application of the case-based reasoning system \tatry to a control task in robotics. It differs from previous methods in its approach to relevancy-based retrieval, which allows for greater flexibility and for improved accuracy in performance. Most successful robotic manipulators are precise, fast, smooth and have high repeatability. Their major drawback is their tendency to have only a limited repertoire of movements, which can only be extended by costly inverse kinematic calculations or direct teaching. Our approach also starts with a limited repertoire of movements, but can incrementally add new positions and thus allows for higher flexibility. The proposed architecture is experimentally evaluated on two real-world domains, albeit simple ones, and the results are compared to other machine learning algorithms applied to the same problem.

% cbr-tai96
% Case-Based Classification Using Similarity-Based Retrieval
% Igor Jurisica and Janice Glasgow
% 8th IEEE International Conference on Tools with Artificial Intelligence
% ICTAI-96
% Toulouse, France.
% email: juris@cs.utoronto.ca

% cbr-ijait97
% cbr-aim97
% cbr-cascon97
% cbr-dk97
% email: juris@cs.utoronto.ca

% proofs
% Safe Locking Policies for Dynamic Databases
% V. Chaudhri and V. Hadzilacos, November 1994

It was shown by Yannakakis that a locking policy is not safe if and only if there exists a canonical non-serializable schedule of transactions running according to the rules of the policy in which all the transactions except one are executed serially \cite{Yannakakis-JACM-82}. In the present paper, we study the generalization of this result to a dynamic database, that is, a database that may undergo insertions and deletions of entities.
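As background for the serializability terminology used here: a schedule is conflict-serializable exactly when its conflict graph is acyclic, which the following minimal sketch tests. The schedule representation below is a hypothetical simplification:

\begin{verbatim}
# Sketch: transactions are nodes; an edge t1 -> t2 exists when an earlier
# operation of t1 conflicts with a later operation of t2 (same item, at
# least one write). The schedule is conflict-serializable iff the graph
# is acyclic.

def conflict_serializable(schedule):
    """schedule: list of (transaction, action, item), action in {'r','w'}."""
    graph = {}
    for i, (t1, a1, x1) in enumerate(schedule):
        for t2, a2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (a1, a2):
                graph.setdefault(t1, set()).add(t2)

    def cyclic(u, grey, black):
        if u in grey:
            return True
        if u in black:
            return False
        grey.add(u)
        hit = any(cyclic(v, grey, black) for v in graph.get(u, ()))
        grey.discard(u)
        black.add(u)
        return hit

    black = set()
    return not any(cyclic(u, set(), black) for u in list(graph))

# Classic non-serializable interleaving: T1 and T2 each read-then-write x.
s = [("T1", "r", "x"), ("T2", "r", "x"), ("T1", "w", "x"), ("T2", "w", "x")]
print(conflict_serializable(s))  # False: the conflict graph has a cycle
\end{verbatim}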
We illustrate the utility of this generalization by applying it to obtain correctness proofs of three locking policies that handle dynamic databases.

% coocs95.ps.Z
% Simulation and Analysis of Business Processes Using GOLOG
% Dimitris Plexousakis
% Proceedings of the ACM Conference on Organizational Computing Systems
% Milpitas, CA, August 1995.

This paper describes a novel approach to simulating and analyzing business processes using GOLOG, a high-level logic programming language suitable for defining complex behaviors and capable of simulating action execution. The language is based on an extended version of the situation calculus and incorporates a formal theory of action. Business processes can be viewed as actions (physical or perceptual) that affect the state of affairs or an agent's knowledge of this state. Using GOLOG, business processes can be specified, synthesized and tested for feasibility and consistency. The theoretical framework behind GOLOG includes a solution to the frame problem for perceptual and complex actions, as well as a formal method for process analysis. The latter uses a solution to the ramification problem for proving the satisfaction or violation of constraints. When this is not possible, the method proposes strengthenings of the processes' pre- and post-conditions, so that any implementation that meets the process specification provably guarantees that constraints will not be violated. In this manner, business process reengineering can be assisted by a formal analysis and simulation tool for testing the consistency of the process model.

% rids95.ps.Z
% Compilation and Simplification of Temporal Integrity Constraints
% Dimitris Plexousakis
% Proceedings of the 2nd International Workshop on Rules in Database Systems
% Athens, GR, September 1995.

The paper presents a novel compilation scheme for temporal integrity constraints and deductive rules expressed in an interval-based first-order temporal logic. Compilation builds a dependence graph with simplified forms of the constraints and rules. This permits the compile-time simplification of the formulae that have to be verified at run-time, as well as the precomputation of potential implicit updates. We show how simplified forms can be obtained with respect to transactions made up of arbitrary sequences of basic updates. Additional optimization steps exploit the organization of simplified forms in dependence graphs.

% edbt96.ps.Z
% Accommodating Integrity Constraints During Database Design
% Dimitris Plexousakis and John Mylopoulos
% Proceedings of EDBT-96
% Avignon, FR, March 1996.

We address the problem of maintaining the integrity of large knowledge bases using a compile-time transaction modification technique. The novelty of the approach lies in the adaptation of ideas from Artificial Intelligence (AI) planning research. Specifically, starting from the observation that solutions to the frame and ramification problems can be used at database transaction design time, we propose an integrity maintenance technique that modifies transaction specifications by incorporating into them conditions necessary for the constraints' satisfaction. Additions to the transactions' postconditions, whose effect is to maintain the integrity constraints, are generated from a set of (determinate) transaction specifications.
Thus, the implications of constraints are realized by the transaction specifier, and the effort of proving transaction safety is saved, since safety is guaranteed by the correctness of the generation process. We envision the development of a tool that, given a set of determinate transaction specifications, automatically suggests additions to the transaction postconditions whose effect is to maintain the integrity constraints.
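A toy sketch of this kind of compile-time strengthening: conjoin to a transaction's postcondition those constraints that mention a predicate the transaction updates. The real specifications are first-order and situation-calculus based; everything below is propositional and hypothetical:

\begin{verbatim}
# Hypothetical, propositional sketch of transaction strengthening; the
# paper's technique operates on first-order transaction specifications.

def strengthen(transaction, constraints):
    """transaction: {'updates': set_of_predicates, 'post': list_of_formulas}
    constraints: list of (predicates_mentioned, formula) pairs."""
    additions = [formula for preds, formula in constraints
                 if preds & transaction["updates"]]
    return {**transaction, "post": transaction["post"] + additions}

hire = {"updates": {"employee", "salary"},
        "post": ["employee(e)", "salary(e, s)"]}
constraints = [({"salary"}, "salary(e, s) -> s > 0"),
               ({"project"}, "project(p) -> budget(p) > 0")]

# Only the salary constraint mentions an updated predicate, so only it
# is conjoined to the postcondition:
print(strengthen(hire, constraints)["post"])
# ['employee(e)', 'salary(e, s)', 'salary(e, s) -> s > 0']
\end{verbatim}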