This repository was archived by the owner on Mar 5, 2019. It is now read-only.

rails: add ui for manual deduplication workflow  #153

@aspiers

Description

At the 2017/2/7 meeting we agreed that we needed a UI in the Rails app which supported manual deduplication ("cleansing") of entities which were still unclean after the automatic deduplication pass. This UI would need to be friendly enough to be usable by non-technical volunteers, for example those attending a "data cleansing hackathon day" event which we also proposed in the same meeting.

Blocked by

Comments, Questions and Considerations

Essentially the workflow needs to support the following sequence of events:

  • allow selection of an entity type (person, organization, or government office)
  • provide a list of unclean entities of that type in the database, preferably sorted in descending order by probability of a match against an existing clean entity already in the database (where the probability is calculated by the automated matching heuristics)
  • allow selection of one individual unclean entity
  • allow browsing of clean entities which could be a possible match for that unclean entity (again preferably sorted in descending order by probability of a match)
  • allow marking of the unclean entity as clean (see db: express whether an entity has been 'cleaned' or not #150), i.e. a new non-duplicate entity, thereby removing it from the unclean list
  • allow marking of the unclean entity as a duplicate of a clean entity, which would cause the unclean entity to be "merged with" the clean one
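
The two sorted lists in the workflow above could be built roughly as follows. This is only a sketch in plain Ruby: the `Entity` struct, `candidates_for`, `unclean_queue` and especially `match_probability` are hypothetical names, and the word-overlap score is a placeholder, since the automated matching heuristics are not described in this issue.

```ruby
# Hypothetical in-memory stand-in for the app's entity records.
Entity = Struct.new(:id, :type, :name, :clean)

# Placeholder score in [0, 1]: share of lowercase words the two names
# have in common. An assumption, NOT the project's real heuristic.
def match_probability(a, b)
  words_a = a.name.downcase.split
  words_b = b.name.downcase.split
  union = (words_a | words_b).size
  return 0.0 if union.zero?

  (words_a & words_b).size.to_f / union
end

# Clean entities of the same type as `unclean`, paired with their
# score and sorted with the most probable match first.
def candidates_for(unclean, entities)
  entities
    .select { |e| e.clean && e.type == unclean.type }
    .map { |e| [e, match_probability(unclean, e)] }
    .sort_by { |pair| -pair[1] }
end

# Unclean entities of one type, each paired with the score of its
# best candidate, sorted so the likeliest duplicates come first.
def unclean_queue(type, entities)
  entities
    .select { |e| !e.clean && e.type == type }
    .map { |u| [u, candidates_for(u, entities).first&.last || 0.0] }
    .sort_by { |pair| -pair[1] }
end
```

In the Rails app itself the scores would more likely be stored or computed in the database so that the lists can be paginated, but the ordering logic would be the same.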

The final step, merging an unclean entity into a clean one, would result in the following sequence of database changes:

  • a new record would be inserted in the entity name database table (see db: extract entity names into separate table #152) with name matching the unclean entity, and with the entity id foreign key equal to the id of the matched clean entity
  • any references from other tables to the unclean entity would be changed to refer to the matched clean entity
  • the unclean entity would be removed from the database.
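
The three changes above can be sketched against an in-memory store. Everything here is an assumption made for illustration: the `Store` class, its table and column names, and the `merge!` method are hypothetical and do not reflect the schema proposed in #150 or #152.

```ruby
# Hypothetical in-memory tables standing in for the real database.
class Store
  attr_reader :entities, :entity_names, :references

  def initialize
    @entities = {}      # id => { name:, clean: }
    @entity_names = []  # rows of { entity_id:, name: }
    @references = []    # rows from other tables: { table:, entity_id: }
  end

  # Merge the unclean entity into the clean one and return the clean id.
  def merge!(unclean_id, clean_id)
    unclean = @entities.fetch(unclean_id)
    clean = @entities.fetch(clean_id)
    raise ArgumentError, "source must be unclean" if unclean[:clean]
    raise ArgumentError, "target must be clean" unless clean[:clean]

    # 1. Keep the unclean entity's name as a name of the clean entity.
    @entity_names << { entity_id: clean_id, name: unclean[:name] }

    # 2. Repoint references from other tables to the clean entity.
    @references.each do |ref|
      ref[:entity_id] = clean_id if ref[:entity_id] == unclean_id
    end

    # 3. Remove the unclean entity.
    @entities.delete(unclean_id)
    clean_id
  end
end
```

In the Rails app the three steps would presumably run inside a single database transaction (for example an `ActiveRecord::Base.transaction` block), so that a failure part-way through cannot leave references pointing at a deleted entity.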

Acceptance Criteria

This story can be considered done when the following acceptance tests
are satisfied:

Given a database populated with both clean and unclean entities of the same type
When I visit the Rails frontend
Then I can go through unclean entities one by one, either deduplicating them or marking them as new, clean entities.
