Monash Health Library



Deduplication is the process of removing duplicate records from search results when multiple databases are searched. Duplicate records occur when search results from multiple databases are exported into one combined file or reference management system, such as EndNote.


When to deduplicate?

Deduplication is done prior to screening results.

Note: When conducting a systematic review or meta-analysis you must:

  • record how many original results you exported from each database prior to deduplication.
  • record the total number of duplicates removed.
  • use a table to keep track of these numbers.
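
As an illustration of the record keeping described above, here is a minimal Python sketch that tallies results per database and the number left to screen. The counts and the function names (`summarise`, `format_table`) are hypothetical and only show the arithmetic; replace the numbers with your own.

```python
# Hypothetical counts for illustration only; replace with your own export numbers.
exported = {"Medline": 120, "Embase": 95}
duplicates_removed = 60  # hypothetical number of duplicates moved to the trash


def summarise(exported, duplicates_removed):
    """Return (total exported, records remaining after deduplication)."""
    total = sum(exported.values())
    if not 0 <= duplicates_removed <= total:
        raise ValueError("duplicates removed must be between 0 and the total exported")
    return total, total - duplicates_removed


def format_table(exported, duplicates_removed):
    """Build a plain-text table of the numbers to report."""
    total, remaining = summarise(exported, duplicates_removed)
    rows = list(exported.items())
    rows += [
        ("Total exported", total),
        ("Duplicates removed", duplicates_removed),
        ("Records to screen", remaining),
    ]
    width = max(len(label) for label, _ in rows)
    return "\n".join(f"{label:<{width}}  {count:>6}" for label, count in rows)


print(format_table(exported, duplicates_removed))
```

Keeping the per-database counts separate from the duplicate count makes it easy to confirm that the total exported equals the duplicates removed plus the records left to screen.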

How to deduplicate

The most efficient method for finding and removing duplicates is by using reference management software such as EndNote, Zotero, RefWorks or Mendeley. Visit our Referencing Guide for more information on using different reference management systems.


It is also possible to identify duplicates in an Excel spreadsheet. Excel features such as conditional formatting to highlight duplicate values can speed up this process.
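
The idea behind highlighting duplicates in a spreadsheet can also be sketched in Python. The example below is only an illustration under stated assumptions: each record is a dictionary with hypothetical `title` and `year` keys, the sample records are invented, and two records are treated as duplicates when their normalised title and year are identical. Flagged records should still be checked by eye.

```python
import re


def normalise(text):
    """Lower-case and replace punctuation with spaces so trivial differences do not hide a match."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def flag_duplicates(records):
    """Return the indexes of records whose (title, year) key has already been seen."""
    seen = set()
    flagged = []
    for index, record in enumerate(records):
        key = (normalise(record["title"]), record["year"])
        if key in seen:
            flagged.append(index)
        else:
            seen.add(key)
    return flagged


# Invented sample records, for illustration only.
records = [
    {"title": "Emergency Medicine: An Overview", "year": "2018", "source": "Medline"},
    {"title": "emergency medicine - an overview.", "year": "2018", "source": "Embase"},
    {"title": "A Different Study", "year": "2018", "source": "Embase"},
]

print(flag_duplicates(records))
```

Normalising the title before comparing means that differences in capitalisation or punctuation between databases do not prevent a match.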

Using EndNote to remove duplicate records

1. Before you start

  • Create or open the library where you want your results to be saved: Click File > New
  • Give your EndNote library a meaningful name, e.g. project name, and date
  • Avoid saving it in the cloud because this can create syncing issues
  • Set up groups in your EndNote Library to keep track of each database’s results: Right-click on My Groups > Create Group

2. Export results from databases to EndNote

  • Place results in corresponding groups e.g. Medline results in the Medline group; Embase results in the Embase group etc.

3. BACKUP! Before removing duplicates, click File > Compressed Library (.enlx) and save the file to your desktop or drive

4. Set up deduplication preferences in EndNote

  • Click Edit > Preferences > Duplicates. Select the fields you want EndNote to match in the deduplication process. Recommended fields for stage 1: Author, Year, Title, Secondary Title (Journal), Pages, Abstract, Database. EndNote will only flag records as duplicates when all of these fields match. This is a precise comparison and a good starting point for large sets of results.
  • Do NOT select Automatically discard duplicates!

5. Find Duplicates in EndNote

  • Click on References > Find Duplicates
  • A window will pop up with 2 columns, allowing you to compare records one at a time for similarity and choose which to keep by clicking “Keep This Record”. This is the slow method.
  • Recommended method: Click on References > Find Duplicates, then click the Cancel button in the pop-up window. This will create a temporary ‘Duplicate References’ folder showing a list of highlighted records identified by EndNote, based on the duplicate preferences you set up earlier.

Stage 1 - Scan the list to check that the records are indeed duplicates (this is not a perfect system and needs some oversight); a matching page number is a good indication. Drag all the highlighted duplicates to the Trash folder. You will see the number in your ‘All References’ folder go down and the number in the ‘Trash’ folder go up.

Stage 2 - Adjust your deduplication preferences in EndNote to compare just Author, Volume and Pages (see step 4 above). This is less precise but still allows EndNote to search for duplicates. Repeat stage 1: click on References > Find Duplicates > Cancel, then scan the results, moving duplicates to the trash.

Stage 3 - The last stage is less exact: scan the remaining results yourself, without using the References > Find Duplicates function. Scan by author, title or journal, and move any duplicates to the trash.
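
The logic of matching on a strict set of fields first and a looser set second can be sketched in Python. This is not EndNote's own matching algorithm; it is a simplified illustration that assumes each record is a dictionary, uses hypothetical field names and invented records, and treats two records as duplicates only when every listed field is identical after trimming and lower-casing.

```python
def find_duplicates(records, fields):
    """Return every record that repeats an earlier record on all of the listed fields."""
    first_seen = {}
    duplicates = []
    for record in records:
        key = tuple(str(record.get(field, "")).strip().lower() for field in fields)
        if key in first_seen:
            duplicates.append(record)
        else:
            first_seen[key] = record
    return duplicates


def staged_deduplicate(records, stages):
    """Apply each field set in turn, moving matches to a trash list for review."""
    remaining = list(records)
    trash = []
    for fields in stages:
        found = find_duplicates(remaining, fields)
        found_ids = {id(record) for record in found}
        remaining = [record for record in remaining if id(record) not in found_ids]
        trash.extend(found)
    return remaining, trash


# Invented sample records, for illustration only.
records = [
    {"author": "Smith J", "year": "2018", "title": "Sepsis care",
     "journal": "Emerg Med J", "volume": "35", "pages": "10-15"},
    {"author": "Smith J", "year": "2018", "title": "Sepsis care",
     "journal": "Emerg Med J", "volume": "35", "pages": "10-15"},
    {"author": "Lee K", "year": "2019", "title": "Triage tools",
     "journal": "Emerg Med J", "volume": "36", "pages": "20-28"},
    {"author": "Lee K", "year": "2019", "title": "Triage tools: a review",
     "journal": "Emergency Medicine Journal", "volume": "36", "pages": "20-28"},
]

stages = [
    ("author", "year", "title", "journal", "pages"),  # stage 1: strict match
    ("author", "volume", "pages"),                    # stage 2: looser match
]

remaining, trash = staged_deduplicate(records, stages)
print(len(remaining), "records remaining;", len(trash), "moved to trash")
```

In this sample the strict stage catches only the exact copy, while the looser stage also catches the record whose title and journal name were formatted differently. Anything a looser match flags still needs a manual check, as in stage 3.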

6. After deduplication is complete

  • Record the total number of records in the trash. This is your number of duplicates.
  • Record the total number of records left in ‘All References’. This is your final number of results (no need to record from which database at this stage).
  • It is recommended that you move the duplicate records in the trash to a NEW library instead of deleting them, in case you want to refer to them or check something in future.

7. BACKUP your EndNote Library AGAIN! You can never back up enough!

University of Calgary (2018) EndNote: Identify and Remove Internal Duplicates [YouTube]  6m42s

Monash Health acknowledges the Traditional Custodians of the land, the Wurundjeri and Boonwurrung peoples, and we pay our respects to them, their culture and their Elders past, present and future.

We are committed to creating a safe and welcoming environment that embraces all backgrounds, cultures, sexualities, genders and abilities.