How to filter or select the specific dupe files?

tinyvane
Posts: 5
Joined: Sat Nov 26, 2022 6:47 am

How to filter or select the specific dupe files?

Post by tinyvane »

I have searched the forum for several days.

I ended up with millions of dupe files because of Insync or Resilio Sync, but that is another topic and I won't discuss it here.

I use a search command like the one below:

d:\ file: D:\Insync\***\Google Drive\googlesync\!图片视频-单独划出来 sort:size dupe:size;sha256 secondary-sort:filename-length

The search results look like this:

(Screenshot: https://ibb.co/98zKcn3. The image did not show up in the post preview.)

My goal is to delete the files circled in red in the screenshot. Many thanks.

Yi.
void
Developer
Posts: 15488
Joined: Fri Oct 16, 2009 11:31 pm

Re: How to filter or select the specific dupe files?

Post by void »

Please try adding the following to the end of your search:

 addcolumn:column1 <regex:\((\d)\)\.[^.]*$ column1:="("..regmatch1:..")"> | *
Sort by Column 1.
Files containing (0-9) will show (#) in the column1 column.
Select all your (0-9) files.
Sort by Size.
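
To see which names the regex in the search above picks up, here is a minimal Python sketch. It assumes Python's `re` module treats this pattern the same way Everything does; the helper name `column1_value` and the sample filenames are made up for illustration.

```python
import re

# Python translation of the regex from the search above: one digit in
# parentheses, a dot, then an extension with no further dots, at the end.
COPY_SUFFIX = re.compile(r"\((\d)\)\.[^.]*$")

def column1_value(filename: str) -> str:
    """Return "(N)" when the name ends in "(N).ext", else an empty string."""
    match = COPY_SUFFIX.search(filename)
    return f"({match.group(1)})" if match else ""

# Hypothetical filenames, for illustration only.
for name in ["IMG_0001.jpg", "IMG_0001 (2).jpg", "trip (2) final.mov", "scan(10).pdf"]:
    print(f"{name!r:24} -> {column1_value(name)!r}")
```

Only names where the (#) part sits directly before the extension get a value. Names with (#) in the middle, or with a two-digit number, stay empty.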



Another easy way:
  • Select all your results. (Ctrl + A)
  • Copy all the filenames. (Ctrl + Shift + C)
  • Search for: filelist1: (2).
  • Hold down Ctrl and click the filelist1: text in the search box.
  • Paste your filenames. (Ctrl + V)
  • Click OK.
(this will only list your (#) files)



Another way to instantly find the (2) - (9) files, where the same filename without the (#) part exists and without comparing content:

regex:^(.*)" "\(\d\)\.([^.]*)$ fileexists:\1.\2

(this will only list your (#) files)
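
The same idea can be sketched outside Everything in Python: list the "name (N).ext" files for which "name.ext" also exists. This is only an approximation. It assumes the existence check is made in the same folder as the numbered file, and the function name `numbered_copies_with_original` is hypothetical. Like the search above, it does not compare file content.

```python
import re
from pathlib import Path

# "name (N).ext": group 1 is the name without the " (N)" part,
# group 2 is the extension.
NUMBERED = re.compile(r"^(.*) \(\d\)\.([^.]*)$")

def numbered_copies_with_original(folder: Path) -> list[Path]:
    """Return files named "name (N).ext" whose "name.ext" exists in folder."""
    found = []
    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue
        match = NUMBERED.match(path.name)
        if match and (folder / f"{match.group(1)}.{match.group(2)}").is_file():
            found.append(path)
    return found

# Example call (the path is a placeholder):
# for copy in numbered_copies_with_original(Path(r"D:\yourpath")):
#     print(copy)
```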




A feature to select files based on a filter is on my TODO list.
tinyvane
Posts: 5
Joined: Sat Nov 26, 2022 6:47 am

Re: How to filter or select the specific dupe files?

Post by tinyvane »

Thanks so much, this is really magic.

Before you posted this smooth way of doing it, I had been doing some clumsy work in Excel:

1. Export the results to CSV.
2. Open the CSV in Excel.
3. Add a column before the Name column.
4. Enter the formula =IF(D2=D3,"to delete","DO NOTHING") in the first data row of the new column and fill it down. Column D is the file size column.
5. Filter for the rows marked "to delete" and save them back to a .csv file.
6. Import the .csv into Everything and delete the files.
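
The Excel steps above can be sketched in Python with the standard `csv` module. This is a rough sketch under assumptions: the column name `Size` and the sample rows are invented, and the rows are expected to be sorted by size already, as in the search results.

```python
import csv
import io

def mark_rows(csv_text: str, size_column: str = "Size") -> list[dict]:
    """Mimic =IF(D2=D3,"to delete","DO NOTHING"): flag a row when the
    next row has the same size."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for i, row in enumerate(rows):
        next_size = rows[i + 1][size_column] if i + 1 < len(rows) else None
        row["Flag"] = "to delete" if row[size_column] == next_size else "DO NOTHING"
    return rows

# Hypothetical export, already sorted by size.
sample = "Name,Size\na (2).jpg,100\na.jpg,100\nb.jpg,250\n"
for row in mark_rows(sample):
    print(row["Flag"], row["Name"])
```

Because each row is compared with the row after it, every file in a run of equal sizes is flagged except the last one. Which file survives therefore depends entirely on the sort order inside each size group.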

The files are all my family photos and movies, which is what makes them so important. I can't accept any chance of losing an original version of a file. On the other hand, I can't stand wasting space on dupes either. LOL.

Thanks again.

It is really great software that makes my life easier.
tinyvane
Posts: 5
Joined: Sat Nov 26, 2022 6:47 am

Re: How to filter or select the specific dupe files?

Post by tinyvane »

Sorry about this. Did you just modify your reply?

I remember that the code I copied the first time was

addcolumn:column1 <(2). column1:=(2)> | *
But I just noticed that your reply now contains this instead:

addcolumn:column1 <regex:\((\d)\)\.[^.]*$ column1:="("..regmatch1:..")"> | *
I used the first version and have just deleted over 30k files... ORZ

Is there anything I need to worry about?
void
Developer
Posts: 15488
Joined: Fri Oct 16, 2009 11:31 pm

Re: How to filter or select the specific dupe files?

Post by void »

The first one can match (2). anywhere in the filename.
The second one will only match (2). at the end of the filename, where the extension follows.

The first one will only match (2), not (3-9)
The second one will match (0-9) and also set the matched number in the column1 column.
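
The difference between the two versions can be checked with a small Python sketch. It assumes the first version behaves like a plain substring test for `(2).` and the second like the same regex in Python's `re` module; the filenames are invented.

```python
import re

def first_filter(filename: str) -> bool:
    """First version: "(2)." anywhere in the filename."""
    return "(2)." in filename

SECOND = re.compile(r"\((\d)\)\.[^.]*$")

def second_filter(filename: str) -> bool:
    """Second version: "(N)." only where the extension follows."""
    return SECOND.search(filename) is not None

# Hypothetical filenames, for illustration only.
for name in ["photo (2).jpg", "photo (3).jpg", "notes (2).draft.txt", "photo.jpg"]:
    print(f"{name!r:24} first={first_filter(name)} second={second_filter(name)}")
```

A name like "photo (2).jpg" is matched by both. "photo (3).jpg" is matched only by the second version, and "notes (2).draft.txt" only by the first.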

It's only a minor formatting change.
Please back up your files if unsure.
-or-
Instead of deleting these files, move them to another folder first.
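
For anyone who prefers to script the "move instead of delete" step, a minimal Python sketch follows. The function name `quarantine` and the renaming scheme for name clashes are assumptions for illustration, not an Everything feature.

```python
import shutil
from pathlib import Path

def quarantine(files, holding_folder):
    """Move files into holding_folder instead of deleting them.

    If a name is already taken in the holding folder, a counter is
    inserted before the extension so nothing is overwritten.
    """
    holding = Path(holding_folder)
    holding.mkdir(parents=True, exist_ok=True)
    moved = []
    for source in map(Path, files):
        target = holding / source.name
        counter = 1
        while target.exists():
            target = holding / f"{source.stem}.{counter}{source.suffix}"
            counter += 1
        shutil.move(str(source), str(target))
        moved.append(target)
    return moved

# Example call (both paths are placeholders):
# quarantine([r"D:\yourpath\IMG_0001 (2).jpg"], r"D:\dupes-to-review")
```

Once you are satisfied that nothing important is in the holding folder, it can be deleted in one go.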
tinyvane
Posts: 5
Joined: Sat Nov 26, 2022 6:47 am

Re: How to filter or select the specific dupe files?

Post by tinyvane »

Thanks for your time.

Another reply I just wrote seems to have failed because the site was temporarily not responding.

The reason I worry about (2) and so on is that I cannot rule out that some "ORIGINAL" filenames contain (2) even though those files are not dupes.

That is why I got a little worried about the second, "easier" way of filtering the (2) files in your original first reply. It may not be safe enough.

Thanks again. I would love to learn even more about complicated regexes and expressions in Everything.
brazilsamba
Posts: 1
Joined: Sun Nov 27, 2022 5:52 pm

Re: How to filter or select the specific dupe files?

Post by brazilsamba »

dupe:size;sha-256

add column sha-256 and sort
copy full path

excel/notepad++
remove duplicate lines by column sha-256
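
The last step, removing duplicate lines by the SHA-256 column, could also be scripted. A minimal Python sketch, assuming the copied data has already been split into (hash, full path) pairs; the function name `split_by_hash` and the sample rows are hypothetical.

```python
def split_by_hash(rows):
    """Keep the first path seen for each hash and collect the rest.

    rows: iterable of (sha256_hex, full_path) pairs in the order they
    were copied from the result list.
    """
    seen = set()
    keep, extra = [], []
    for digest, path in rows:
        key = digest.lower()  # treat upper-case and lower-case hex alike
        if key in seen:
            extra.append(path)
        else:
            seen.add(key)
            keep.append(path)
    return keep, extra

# Hypothetical rows, with shortened fake hashes for illustration.
rows = [
    ("aa11", r"D:\photos\a.jpg"),
    ("AA11", r"D:\photos\a (2).jpg"),
    ("bb22", r"D:\photos\b.jpg"),
]
keep, extra = split_by_hash(rows)
print("keep: ", keep)
print("extra:", extra)
```

Which copy is kept depends only on the row order, so sort the list so that the file you want to keep comes first within each hash.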
tinyvane
Posts: 5
Joined: Sat Nov 26, 2022 6:47 am

Re: How to filter or select the specific dupe files?

Post by tinyvane »

UPDATE:

I compared the two methods and found that the author's way would also delete "original" files whose names already contain something like (2) or (3).

So my current approach has TWO main steps:

1. file: D:\yourpath sort:size dupe:size;sha256 secondary-sort:filename-length addcolumn:column1 <regex:\((\d)\)\.[^.]*$ column1:="("..regmatch1:..")"> | *

2. Edit the EFU file list in Excel:

2.1 Add another column before the first column, named "flag", with the formula =IF(OR(F2="FLAG2",F2="FLAG3",F2="FLAG4",F2="FLAG5",F2="FLAG6",F2="FLAG7",F2="FLAG8",F2="FLAG9"),(IF(D2=D1,"delete","NOTHING")),"NOTHING")

Column F is the file name and column D is the file size.

2.2 Filter the data so that only rows with "delete" in the first column remain.
2.3 Copy them to a new xls file and save it as another EFU file. Remember to delete the first column and to add the other columns, such as attributes and date modified. I use VS Code with regex replace for this.
2.4 Load the EFU file into Everything and delete the files.
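
The flagging rule from step 2.1 can be sketched in Python. This is an interpretation, not the exact spreadsheet: it assumes the FLAG2 to FLAG9 test means "the name ends in (2) to (9) before the extension", and that rows are sorted as in step 1 so that an original comes right before its numbered copy. The name `flag_rows` and the sample rows are invented.

```python
import re

# Assumed meaning of the FLAG2..FLAG9 test: "(2)" to "(9)" right before
# the extension.
NUMBERED_2_TO_9 = re.compile(r"\(([2-9])\)\.[^.]*$")

def flag_rows(rows):
    """rows: list of (filename, size), already sorted like the results.

    A row is flagged "delete" only when its name carries a (2)-(9)
    marker AND its size equals the size of the row before it (D2=D1).
    """
    flags = []
    previous_size = None
    for name, size in rows:
        is_numbered = NUMBERED_2_TO_9.search(name) is not None
        flags.append("delete" if is_numbered and size == previous_size else "NOTHING")
        previous_size = size
    return flags

# Hypothetical rows, for illustration only.
rows = [("a.jpg", 100), ("a (2).jpg", 100), ("report (2).pdf", 300), ("b.jpg", 250)]
for (name, size), flag in zip(rows, flag_rows(rows)):
    print(f"{flag:8} {size:>5} {name}")
```

Here "report (2).pdf" is left alone because no file of the same size precedes it. The sketch compares sizes only, so it relies on the search in step 1 having already limited the list to dupe:size;sha256 matches.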

All done.