Finding Dupes using RegEx?

Mike_PB
Posts: 2
Joined: Fri Mar 24, 2023 10:08 am

Post by Mike_PB »

Hello everyone,
I am still a relatively new user here, but I could imagine that the Everything tool can do what I am looking for...

I would like to find duplicate file names on a drive, let's say x:\. But there is a hurdle: The file names are often not quite the same.

Example:
Filename [Recipient] (2022).docx
Filename [sender] (2022).docx
Filename [Reader] (2023).docx
These files contain almost the same information, but the dupes command alone does not get me any further, because the file names differ.

I could, of course, simply display all the docx files, sort them alphabetically and search for such similarities manually, but that would be very time-consuming, since there are about 40,000 files, only a fraction of which correspond to this scheme. You would be sure to miss something with this method. Unfortunately, the file contents are also sometimes minimally different, so that I don't get any further with a search for "real" duplicates (e.g. with the programme clonespy). The file name is still the most relevant.

I then tried the following:
x:\ regex:^(?!(?:.*[\[\(].*[\]\)])){1}(.*)\.docx

However, this only results in the files containing [] or () no longer being displayed.
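
To illustrate why that happens, here is a minimal Python sketch, not Everything itself. It assumes a simplified form of the expression above: the `{1}` quantifier on the lookahead is dropped, since it does not change what is matched. The sample names, including `Plain.docx`, are made up for the demonstration. The negative lookahead `(?!...)` rejects any name that contains an opening bracket followed later by a closing one, so only bracket-free names survive.

```python
import re

# Simplified form of the attempted expression (assumption: the "{1}" after the
# lookahead is omitted, it has no effect on which names match).
ATTEMPT = re.compile(r"^(?!.*[\[(].*[\])])(.*)\.docx")

# Hypothetical sample names for illustration only.
names = [
    "Filename [Recipient] (2022).docx",
    "Filename [sender] (2022).docx",
    "Plain.docx",
]

# The negative lookahead fails for every name containing [...] or (...),
# so those names are filtered out instead of being grouped.
kept = [name for name in names if ATTEMPT.search(name)]
print(kept)
```

So the expression acts as a filter on bracketed names rather than extracting the common part in front of the brackets.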

My question:
Is there a solution to generate the file list in such a way that either only the name is displayed and the rest is truncated, or is it possible to set the query so that these three files are displayed one below the other in the duplicate search?

Thanks for your help!

Kind regards,
Michael
NotNull
Posts: 5296
Joined: Wed May 24, 2017 9:22 pm

Re: Finding Dupes using RegEx?

Post by NotNull »

For this, you need Everything 1.5:
  • Search for:

    X:   ext:docx   regex:^(.+?)[[(].*\.docx
    (the ext:docx is not strictly needed, but is a fast way of limiting results that have to be processed by the regex)
  • Right-click the result list header (for example Name)
  • From the context menu, select Search => Regular Expression Match 1
  • Sort results by Regular Expression Match 1
  • Right-click the Regular Expression Match 1 header
  • Select Find Regular Expression Match 1 duplicates
When done, double-click DUPE in the status bar to return to the regular layout.
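
For readers who want to see what the search and column steps above amount to, here is a rough Python sketch of the same idea outside Everything. It is an illustration under assumptions, not how Everything implements it: the function name `group_by_match` and the extra sample names are made up, the character class is written `[\[(]` because Python warns about an unescaped `[` inside a set, and `re.IGNORECASE` is an assumption about case handling.

```python
import re
from collections import defaultdict

# Same idea as the search above: capture group 1 is everything before
# the first "[" or "(" in a .docx file name.
PATTERN = re.compile(r"^(.+?)[\[(].*\.docx", re.IGNORECASE)

def group_by_match(names):
    """Group names by capture group 1 and keep only groups with duplicates."""
    groups = defaultdict(list)
    for name in names:
        match = PATTERN.search(name)
        if match:
            groups[match.group(1)].append(name)
    return {key: found for key, found in groups.items() if len(found) > 1}

# The three names from the question plus two hypothetical extras.
names = [
    "Filename [Recipient] (2022).docx",
    "Filename [sender] (2022).docx",
    "Filename [Reader] (2023).docx",
    "Other (2021).docx",
    "Plain.docx",
]

dupes = group_by_match(names)
for key, found in dupes.items():
    print(repr(key), found)
```

Note that the captured text keeps the space before the bracket (`'Filename '`), which does not matter for grouping as long as all names follow the same scheme.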


BTW: What was your regular expression supposed to do?
Mike_PB
Posts: 2
Joined: Fri Mar 24, 2023 10:08 am

Re: Finding Dupes using RegEx?

Post by Mike_PB »

Hi,
thanks for this quick help!

And yes, it's working perfectly. Exactly what I am looking for.

That's what my RegExp should have done as well, so everything is OK with your solution.

Thanks again!
anmac1789
Posts: 561
Joined: Mon Aug 24, 2020 1:16 pm

Re: Finding Dupes using RegEx?

Post by anmac1789 »

Can regex be used to find dupes by doing date and time arithmetic, and to find the number of occurrences of a dupe using a custom column?
NotNull
Posts: 5296
Joined: Wed May 24, 2017 9:22 pm

Re: Finding Dupes using RegEx?

Post by NotNull »

Without further details I have to say that this is not possible.
Maybe there are some options if you tell us exactly what you want to accomplish.