From: PEM moderator
To: Multiple recipients of list PEM <>
Date: Fri, 17 Jan 2003 10:42:44 +0100
Subject: PEM | 21.01.03 | M279
Precedence: bulk
X-url: http://www.cwi.nl/~pem
Mime-Version: 1.0
Content-Type: text/plain

Dear colleagues,

Next Tuesday we welcome two international guests from Namur, namely Jean-Luc Hainaut and Jean Henrard. The general topic of their work is reverse engineering and reengineering of legacy database applications:

This announcement can be found at

Database Reverse Engineering and Data-centered application Reengineering

Date: 21.01.03
Time: 10:00
Venue: M279
Speaker: Jean Henrard (Universiteit van Namen)
Title: Database Reverse Engineering and Data-centered application Reengineering

Database engineering is a subdomain of software engineering that addresses the analysis, design, implementation and reengineering of databases and of their applications. Just like SE, it is based on models, techniques, methods and tools. Traditionally, the data structures of a database are known through two schemas, namely the technology-independent conceptual schema that specifies the concepts about which information is recorded in the database (models: ERA, UML classes, NIAM, ORM, etc.), and the logical schema that expresses (implements) the conceptual schema according to the model of a specific data management system or DMS (models: standard files, CODASYL, IMS, relational, OO, object-relational, etc.). Finally, the logical schema, adorned with some physical constructs (such as indexes, storage spaces, clusters, buffer management strategy, etc.), is expressed in some sort of data description language (DDL) to be compiled by the DBMS. The analyst is in charge of building the conceptual schema and of deriving a logical schema of the database. The application programmer develops programs according to the logical schema.

In most databases, and especially in legacy applications, many (if not most) fine-grained data structures and integrity constraints have not been explicitly declared through the DMS-DDL (notably due to the weakness of the DMS data model), but have been manually coded in the application programs, in the dialog boxes, or, quite often, not implemented at all. As a consequence, understanding the intention, i.e., the very meaning of the data structures of a database, often proves much more complex than first expected. The objective of database reverse engineering is to recover the most correct description of the data structures of a database, that is, its logical and conceptual schemas.

The presentation is divided into two parts. First, we will describe a generic database reverse engineering methodology. This method is divided into three processes (project preparation, data structure extraction and data structure conceptualization) and produces two schemas, the logical schema and the conceptual schema. The project preparation identifies the components to be analyzed, the resources to be allotted and the planning. The data structure extraction process aims at rebuilding the logical schema of the database. The data structure conceptualization process tries to recover the semantic structures, i.e., the conceptual schema, from the logical schema. Our experience has shown that, to successfully recover the logical schema, the source code is often the most reliable source from which the constraints can be elicited. Source code analysis requires program understanding techniques such as cliché identification, dependency and data flow analysis and program slicing, as well as tools that support them. We will discuss these techniques and show how they apply to logical structure discovery. A database engineering CASE tool will be used to support the presentation.
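
To give a rough idea of what eliciting a constraint from source code can look like, here is a minimal sketch in Python. It is not the speakers' method or tool: it is a crude, flow-insensitive pattern match standing in for cliché identification. The COBOL fragment, the file and field names and the function candidate_foreign_keys are all invented for illustration. The assumption is that a field moved into the key used to read another file hints at an undeclared foreign key.

```python
import re

# Hypothetical COBOL fragment, invented for this illustration only.
COBOL_SOURCE = """
    READ ORDERS NEXT RECORD.
    MOVE ORD-CUSTOMER TO CUS-CODE.
    READ CUSTOMER KEY IS CUS-CODE
        INVALID KEY DISPLAY "UNKNOWN CUSTOMER".
"""

# Simplified statement patterns; real COBOL syntax is far richer.
MOVE_RE = re.compile(r"MOVE\s+([A-Z0-9-]+)\s+TO\s+([A-Z0-9-]+)")
READ_KEY_RE = re.compile(r"READ\s+([A-Z0-9-]+)\s+KEY\s+IS\s+([A-Z0-9-]+)")


def candidate_foreign_keys(source):
    """Return (source_field, target_file, target_key) triples.

    A triple is reported when some field is moved into a variable that
    is later (or earlier: statement order is ignored here) used as the
    access key of a keyed READ. Each triple is only a hypothesis to be
    validated by an analyst, not a proven constraint.
    """
    # Map each destination variable to the fields moved into it.
    moved_into = {}
    for src, dst in MOVE_RE.findall(source):
        moved_into.setdefault(dst, []).append(src)

    candidates = []
    for file_name, key in READ_KEY_RE.findall(source):
        for src in moved_into.get(key, []):
            candidates.append((src, file_name, key))
    return candidates


if __name__ == "__main__":
    for src, file_name, key in candidate_foreign_keys(COBOL_SOURCE):
        print(f"{src} may reference {file_name} through {key}")
```

A real analysis would follow the data flow through intermediate variables and restrict it to feasible execution paths, which is where dependency analysis and program slicing come in.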

The second part of this presentation describes and analyzes a series of strategies to migrate data-intensive applications from a legacy data management system to a modern DMS. Considering two ways to migrate the data and three ways to propagate the corresponding perturbation to the program code, the paper identifies six reference strategies that provide different levels of quality and induce different costs. Three of them are discussed in detail and illustrated by the conversion of COBOL files into a SQL database and of a small COBOL program.

Jean-Luc Hainaut, Jean Henrard
Laboratory of Database Application Engineering
University of Namur
Belgium
http://www.info.fundp.ac.be/libd

Have a nice day.

_________________________________________________________________

The programming environment meetings are a forum for the presentation and discussion of new ideas, ongoing and finished work. A typical meeting addresses a subject in the area of programming environments, program generation, algebraic specification, term rewriting, parsing, etc. A presentation ideally takes between 45 and 90 minutes. Meetings taking longer than 45 minutes are interrupted by a coffee break.

Most Thursdays, a meeting is held which starts at 10:00 a.m. in one of the rooms at CWI/WINS. Exceptionally, dates or times may change.

The program of the meetings is available on the WWW: http://www.cwi.nl/~pem
_________________________________________________________________