Find duplicate records in MongoDB

Question

How would I find duplicate fields in a Mongo collection? I'd like to check whether any of the "name" fields are duplicates.

    {
        "name" : "ksqn291",
        "__v" : 0,
        "_id" : ObjectId("540f346c3e7fc1054ffa7086"),
        "channel" : "Sales"
    }

Many thanks!

User · Accepted Answer

Use aggregation on name and get the names with count > 1:

    db.collection.aggregate([
        {"$group": {"_id": "$name", "count": {"$sum": 1}}},
        {"$match": {"_id": {"$ne": null}, "count": {"$gt": 1}}},
        {"$project": {"name": "$_id", "_id": 0}}
    ]);

To sort the results from most to least duplicates:

    db.collection.aggregate([
        {"$group": {"_id": "$name", "count": {"$sum": 1}}},
        {"$match": {"_id": {"$ne": null}, "count": {"$gt": 1}}},
        {"$sort": {"count": -1}},
        {"$project": {"name": "$_id", "_id": 0}}
    ]);

To use a column name other than "name", change "$name" to "$column_name".
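Since the only thing that changes between fields is the "$name" reference, the pipeline can be generated by a small helper. The sketch below is one way to do that; `buildDuplicatePipeline` and its `sortByCount` flag are hypothetical names introduced here for illustration, not part of MongoDB.

```javascript
// Hypothetical helper (not a MongoDB API): builds the duplicate-finding
// pipeline shown above for an arbitrary top-level field.
function buildDuplicatePipeline(field, sortByCount = false) {
  const pipeline = [
    // Count how many documents share each value of the field.
    { $group: { _id: "$" + field, count: { $sum: 1 } } },
    // Drop the null group (field missing or null) and values seen only once.
    { $match: { _id: { $ne: null }, count: { $gt: 1 } } },
  ];
  if (sortByCount) {
    // Most-duplicated values first.
    pipeline.push({ $sort: { count: -1 } });
  }
  // Expose the grouped value under the field's own name and hide _id.
  pipeline.push({ $project: { [field]: "$_id", _id: 0 } });
  return pipeline;
}

// Intended use in the mongo shell (not executed here):
// db.collection.aggregate(buildDuplicatePipeline("channel", true));
```

The helper only assembles plain objects, so the resulting array can be inspected or logged before it is passed to `aggregate`.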

User · Answer

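The answer anhic gave can be very inefficient if you have a large database and the attribute name is present in only some of the documents. To improve efficiency, add a $match as the first stage of the aggregation so that documents without a name are discarded before grouping:

    db.collection.aggregate([
        {"$match": {"name": {"$ne": null}}},
        {"$group": {"_id": "$name", "count": {"$sum": 1}}},
        {"$match": {"count": {"$gt": 1}}},
        {"$project": {"name": "$_id", "_id": 0}}
    ])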

User · Answer

You can find the list of duplicate names using the following aggregate pipeline:

1. Group all the records that have the same name.
2. Match the groups that contain more than one record.
3. Group again to collect all the duplicate names into an array.

The code:

    db.collection.aggregate([
        {$group: {_id: "$name", name: {$first: "$name"}, count: {$sum: 1}}},
        {$match: {count: {$gt: 1}}},
        {$project: {name: 1, _id: 0}},
        {$group: {_id: null, duplicateNames: {$push: "$name"}}},
        {$project: {_id: 0, duplicateNames: 1}}
    ])

Output:

    { "duplicateNames" : [ "ksqn291", "ksqn29123213Test" ] }
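To make the three steps concrete, here is a rough in-memory equivalent in plain JavaScript. It does not talk to MongoDB; the sample documents and the function name `findDuplicateNames` are made up for illustration, and the sketch assumes documents without a name should be skipped.

```javascript
// Made-up sample documents standing in for a collection.
const docs = [
  { name: "ksqn291", channel: "Sales" },
  { name: "ksqn291", channel: "Support" },
  { name: "abc", channel: "Sales" },
  { channel: "Sales" }, // no name field
];

// Hypothetical helper mirroring the group -> match -> collect steps.
function findDuplicateNames(documents) {
  // Step 1: group by name and count the documents in each group.
  const counts = new Map();
  for (const doc of documents) {
    // Assumption: documents with a missing or null name are ignored.
    if (doc.name === undefined || doc.name === null) continue;
    counts.set(doc.name, (counts.get(doc.name) || 0) + 1);
  }
  // Steps 2 and 3: keep groups with more than one document and
  // collect their names into a single array.
  const duplicateNames = [];
  for (const [name, count] of counts) {
    if (count > 1) duplicateNames.push(name);
  }
  return { duplicateNames };
}

// With the sample data, duplicateNames contains only "ksqn291".
const result = findDuplicateNames(docs);
```

For real collections the aggregation pipeline is the better tool, since the grouping runs on the server instead of pulling every document into the client.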

User · Answer

    db.getCollection('orders').aggregate([
        {$group: {
            _id: {name: "$name"},
            uniqueIds: {$addToSet: "$_id"},
            count: {$sum: 1}
        }},
        {$match: {
            count: {"$gt": 1}
        }}
    ])

The first $group stage groups the documents by the chosen field, collects the unique _id values of each group, and counts its documents. If the count is greater than 1, that field value is duplicated in the collection, and the $match stage keeps only those groups.
