From: PEM moderator
To: Multiple recipients of list PEM <>
Date: Fri, 17 Jan 2003 10:42:44 +0100
Subject: PEM | 21.01.03 | M279

   Dear colleagues,

   Next  Tuesday  we  welcome two international guests from Namur, namely
   Jean-Luc  Hainaut and Jean Henrard. The general topic of their work is
   reverse engineering and reengineering of legacy database applications:

   This announcement can be found at <http://www.cwi.nl/~pem>

         Database Reverse Engineering and Data-centered application
                               Reengineering

   Date:    21.01.03
   Time:    10:00
   Venue:   M279
   Speaker: Jean Henrard (University of Namur)
   Title:   Database Reverse Engineering and Data-centered application
            Reengineering

   Database engineering is a subdomain of software engineering that
   addresses the analysis, design, implementation, and reengineering of
   databases and of their applications. Just like SE, it is based on
   models, techniques, methods and tools. Traditionally, the data
   structures of a database are known through two schemas, namely the
   technology-independent  conceptual  schema that specifies the concepts
   about which information is recorded in the database (models: ERA, UML
   classes,  NIAM,  ORM,  etc.),  and  the  logical schema that expresses
   (implements)  the  conceptual  schema  according  to  the  model  of a
   specific  data  management  system  or  DMS  (models:  standard files,
   CODASYL,  IMS,  relational, OO, object-relational, etc.). Finally, the
   logical  schema,  adorned  with  some  physical  constructs  (such  as
   indexes,  storage spaces, clusters, buffer management strategy, etc.),
   is  expressed  in  some  sort of data description language (DDL) to be
   compiled  by  the  DBMS.  The  analyst  is  in  charge of building the
   conceptual  schema  and  of deriving a logical schema of the database.
   The  application programmer develops programs according to the logical
   schema.

   In most databases, and especially in legacy applications, many (if not
   most)  fine-grained data structures and integrity constraints have not
   been  explicitly  declared  through  the  DMS-DDL  (notably due to the
   weakness  of  the DMS data model), but have been manually coded in the
   application programs, in the dialog boxes, or, quite often, not
   implemented at all. As a consequence, understanding the intention,
   i.e., the very meaning of the data structures of a database, often
   proves much more complex than first expected. The objective of database
   reverse  engineering is to recover the most correct description of the
   data  structures  of  a  database, that is, its logical and conceptual
   schemas.

   The presentation is divided into two parts.

   First,  we  will  describe  a  generic  database  reverse  engineering
   methodology. This method is divided into three processes (project
   preparation, data structure extraction and data structure
   conceptualization) and produces two schemas, the logical schema and
   the conceptual schema. The project preparation identifies the
   components  to  be  analyzed,  the  resources  to  be allotted and the
   planning. The data structure extraction process aims at rebuilding the
   logical  schema  of the database. The data structure conceptualization
   process tries to recover the semantic structures, i.e., the conceptual
   schema, from the logical schema. Our experience has shown that, to
   successfully recover the logical schema, the source code is often the
   most reliable source from which the constraints can be elicited.
   Source code analysis requires program understanding techniques such as
   cliché  identification,  dependency and data flow analysis and program
   slicing,  as  well  as  tools that support them. We will discuss these
   techniques  and  show how they apply to logical structure discovery. A
   database   engineering   CASE   tool  will  be  used  to  support  the
   presentation.
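
   As a rough illustration of how dependency analysis can point to an
   undeclared constraint, the sketch below follows MOVE statements in a
   toy COBOL-like fragment and reports record fields whose values flow
   into the access key of another file, which makes them candidate
   foreign keys. The fragment, the field names and the function names are
   all hypothetical, and the analysis is far simpler than what a real
   CASE tool would perform.

```python
import re

# Hypothetical COBOL-like fragment: ORD-CUST is not declared as a foreign
# key, but its value is copied into CUS-CODE before CUSTOMER is read.
PROGRAM = """
MOVE ORD-CUST TO WS-CODE
MOVE WS-CODE TO CUS-CODE
READ CUSTOMER KEY IS CUS-CODE
MOVE ORD-QTY TO WS-QTY
"""

# Only plain "MOVE a TO b" statements are recognised in this sketch.
MOVE = re.compile(r"^\s*MOVE\s+([\w-]+)\s+TO\s+([\w-]+)\s*$")


def build_dataflow_graph(source):
    """Return an undirected graph linking variables related by a MOVE."""
    graph = {}
    for line in source.splitlines():
        match = MOVE.match(line)
        if match:
            src, dst = match.groups()
            graph.setdefault(src, set()).add(dst)
            graph.setdefault(dst, set()).add(src)
    return graph


def reachable(graph, start):
    """Return every variable connected to `start` through MOVE statements."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


def candidate_foreign_keys(source, fields, keys):
    """Pair each field with every access key it is connected to."""
    graph = build_dataflow_graph(source)
    return sorted(
        (field, key)
        for field in fields
        for key in keys
        if key in reachable(graph, field)
    )


if __name__ == "__main__":
    print(candidate_foreign_keys(PROGRAM, ["ORD-CUST", "ORD-QTY"], ["CUS-CODE"]))
```

   A pair reported this way is only a hypothesis: it still has to be
   confirmed by the analyst, for instance by inspecting the data or the
   rest of the program.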

   The  second  part of this presentation describes and analyzes a series
   of  strategies  to  migrate  data-intensive applications from a legacy
   data  management  system  to  a  modern  DMS.  Considering two ways to
   migrate  the  data  and  three  ways  to  propagate  the corresponding
   perturbation  to  the program code, the paper identifies six reference
   strategies  that  provide  different  levels  of  quality  and  induce
   different costs. Three of them are discussed in detail and illustrated
   by  the  conversion  of COBOL files into a SQL database and of a small
   COBOL program.

   Jean-Luc Hainaut, Jean Henrard
   Laboratory of Database Application Engineering
   University of Namur, Belgium
   http://www.info.fundp.ac.be/libd

   Have a nice day.
     _________________________________________________________________

   The  programming environment meetings are a forum for the presentation
   and  discussion  of  new  ideas,  ongoing and finished work. A typical
   meeting  addresses  a subject in the area of programming environments,
   program  generation, algebraic specification, term rewriting, parsing,
   etc.  A presentation ideally takes between 45 and 90 minutes. Meetings
   taking longer than 45 minutes are interrupted by a coffee break. Most
   Thursdays, a meeting is held which starts at 10:00 a.m. in one of the
   rooms at CWI/WINS. Exceptionally, dates or times may change.

   The program of the meetings is available on the WWW:
   http://www.cwi.nl/~pem
     _________________________________________________________________