Crowd-sourcing approaches for user feedback on a network of schemas

Project Details

Laboratory: LSIR | Semester / Master | Completed

Description:

Many optimization techniques for semantic reconciliation search for matching instances of a schema network that satisfy some pre-defined constraints. Although these techniques try to minimize the uncertainty about the best possible solutions, there remain cases where the final choice cannot be made with certainty. One way to tackle this problem is to exploit user knowledge in the form of feedback. However, as the network evolves, this approach does not scale well, since there is too much ambiguity for a user to give correct feedback.

The goal of the project is to employ crowd-sourcing algorithms to harness the knowledge of the user community and thereby improve the quality of database schema and instance matching networks. This will be part of the data quality improvement framework of the NisB research project.

The algorithms shall be implemented in a way that they can be integrated into the NisB software framework. In particular, they should be integrated with the NisB testbed, which might require minor adaptations of the testbed as well. The algorithms and metrics shall be implemented with special attention to scalability on large amounts of data. The project also involves the following steps:

  • Maintain and extend the available benchmark datasets
  • Define an aggregation model over worker outputs that is robust to malicious users
  • Formalize the crowd-sourcing markets and working mechanisms
  • Solve the assignment problem of matching workers to the tasks they can do best
  • Design how questions are asked, since question design influences the quality of user feedback
  • Leverage truthful responses to harness the ‘wisdom of the crowd’
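
To give a feel for the aggregation step listed above, here is a minimal sketch of one possible approach: a reliability-weighted vote over yes/no answers about candidate matches. It is an illustration only, not the method used in NisB. The function names, the worker names, the attribute pairs, and the idea of estimating reliability from a few control questions with known answers are all assumptions made for this example.

```python
from collections import defaultdict


def estimate_reliability(answers, gold):
    """Fraction of control questions each worker answered correctly.

    answers: {worker: {candidate_match: bool}}
    gold:    {candidate_match: bool}, hypothetical control questions
    """
    reliability = {}
    for worker, votes in answers.items():
        checked = [m for m in votes if m in gold]
        if not checked:
            reliability[worker] = 0.5  # no evidence: treat as a coin flip
            continue
        correct = sum(votes[m] == gold[m] for m in checked)
        reliability[worker] = correct / len(checked)
    return reliability


def aggregate(answers, reliability):
    """Weighted vote per candidate match, with weight = reliability - 0.5.

    A worker who answers at random gets weight 0; a worker who is
    consistently wrong gets a negative weight, so the vote is inverted.
    """
    score = defaultdict(float)
    for worker, votes in answers.items():
        weight = reliability[worker] - 0.5
        for match, vote in votes.items():
            score[match] += weight if vote else -weight
    return {match: s > 0 for match, s in score.items()}


# Hypothetical feedback on three candidate attribute matches.
answers = {
    "alice":   {("a.name", "b.fullname"): True,  ("a.id", "b.zip"): False,
                ("a.city", "b.town"): True},
    "bob":     {("a.name", "b.fullname"): True,  ("a.id", "b.zip"): False,
                ("a.city", "b.town"): True},
    "mallory": {("a.name", "b.fullname"): False, ("a.id", "b.zip"): True,
                ("a.city", "b.town"): False},
}
gold = {("a.name", "b.fullname"): True, ("a.id", "b.zip"): False}

reliability = estimate_reliability(answers, gold)
result = aggregate(answers, reliability)
```

In this toy data the worker who contradicts every control question ends up with a negative weight, so their answer on the remaining match no longer pulls the result in the wrong direction. A real aggregation model would also have to cope with workers who are only sometimes wrong and with matches that are constrained by the rest of the network.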

Prerequisites

  • Information systems: structured and semi-structured databases
  • Good programming skills (Java, Web Services)
  • Probability theory

Preferred, but not required

  • Machine Learning/Data Mining

Contact: Nguyen Quoc Viet Hung