Heuristic techniques for semantic conflict resolution in a network of databases

Project Details :

Laboratory : LSIR Semester Completed

Description :

In business and scientific practice, one often needs to integrate data stored in multiple databases. Often the set of databases or data sources is not predefined, or it changes over time. In such situations one can rely on pairwise mappings. A number of commercial and academic tools exist for schema matching. They apply various techniques to construct the matching and work reasonably well, but they often fail to produce a perfect matching. The starting point of this project is the output generated by such schema matching tools.

The goal of the project is to apply heuristic methods, such as simulated annealing or genetic algorithms, to improve such a set of mappings. In particular:

0) pre-process the data (minor effort)
1) implement heuristic algorithms (the overall heuristic scheme will be defined)
2) adjust the parameters of the algorithms, experimentally validate their effectiveness, and adapt the implementation if necessary (major effort)
3) integrate the result into a software framework developed at LSIR through a well-defined interface
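
To give a feel for step 1, here is a minimal simulated annealing sketch in Java. It is not the heuristic scheme of the project, which is still to be defined. Everything in it is an assumption made for illustration: the class and method names (MappingAnnealer, Correspondence, conflicts, score, anneal), the one-to-one constraint used as the notion of "conflict", and the objective (sum of matcher confidences minus a penalty per conflict).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Illustrative sketch only; names, constraint, and scoring are assumptions. */
final class MappingAnnealer {

    /** A candidate correspondence as a schema matcher might propose it (hypothetical shape). */
    static final class Correspondence {
        final String source;
        final String target;
        final double confidence;

        Correspondence(String source, String target, double confidence) {
            this.source = source;
            this.target = target;
            this.confidence = confidence;
        }
    }

    /** Counts surplus uses of each attribute, i.e. violations of the assumed one-to-one constraint. */
    static int conflicts(List<Correspondence> cands, boolean[] selected) {
        Map<String, Integer> uses = new HashMap<>();
        for (int i = 0; i < cands.size(); i++) {
            if (!selected[i]) continue;
            uses.merge("s:" + cands.get(i).source, 1, Integer::sum);
            uses.merge("t:" + cands.get(i).target, 1, Integer::sum);
        }
        int count = 0;
        for (int n : uses.values()) count += Math.max(0, n - 1);
        return count;
    }

    /** Assumed objective: total confidence of the selection minus a penalty per conflict. */
    static double score(List<Correspondence> cands, boolean[] selected, double penalty) {
        double sum = 0.0;
        for (int i = 0; i < cands.size(); i++) {
            if (selected[i]) sum += cands.get(i).confidence;
        }
        return sum - penalty * conflicts(cands, selected);
    }

    /** Simulated annealing over subsets of the candidates; returns the best subset seen. */
    static boolean[] anneal(List<Correspondence> cands, double penalty,
                            double startTemp, double cooling, int steps, long seed) {
        boolean[] current = new boolean[cands.size()];
        if (current.length == 0) return current;
        Arrays.fill(current, true);              // start from the raw matcher output
        Random rng = new Random(seed);
        boolean[] best = current.clone();
        double currentScore = score(cands, current, penalty);
        double bestScore = currentScore;
        double temp = startTemp;

        for (int step = 0; step < steps; step++) {
            int i = rng.nextInt(current.length);
            current[i] = !current[i];            // neighbour: toggle one correspondence
            double candidateScore = score(cands, current, penalty);
            double delta = candidateScore - currentScore;
            // Always accept improvements; accept worse moves with probability exp(delta / temp).
            if (delta >= 0 || rng.nextDouble() < Math.exp(delta / temp)) {
                currentScore = candidateScore;
                if (currentScore > bestScore) {
                    bestScore = currentScore;
                    best = current.clone();
                }
            } else {
                current[i] = !current[i];        // reject: undo the toggle
            }
            temp *= cooling;                     // geometric cooling schedule
        }
        return best;
    }
}
```

A call such as MappingAnnealer.anneal(candidates, 1.0, 1.0, 0.995, 2000, 42L) returns a boolean array marking which candidates to keep. The penalty, start temperature, cooling factor, and step count are exactly the kind of parameters that step 2 would tune experimentally.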

The work should be performed independently; there is no dependency on other tasks. However, the project complements our research at LSIR (http://www.nisb-project.eu/), so the student is expected to interact with the research assistants on a regular basis.

Requirements: Java programming skills, interest in conducting systematic experiments, interest in research

Site:
Contact: Zoltan Miklos