[Chinese Sorting] Sort by name is messed up

Posted: Mon Apr 23, 2018 3:52 am
by 775405984
Hi, my name is Matt and I'm Chinese.

I find this software very useful. But when I type a random word into the search bar, the sorting method is quite messed up.

The files and folders with Chinese names are not sorted alphabetically (by Pinyin); instead, they use an outdated method called stroke order.

No one uses this kind of ancient technique, except someone with zero knowledge of Pinyin.

Some might say you can change the sorting method in Control Panel on Windows. I did, and nothing happened.

So I was wondering if you guys could fix it, please!!! Thank you!!!

https://en.wikipedia.org/wiki/Stroke_order
https://en.wikipedia.org/wiki/Pinyin

Re: [Chinese Sorting] Sort by name is messed up

Posted: Thu Sep 20, 2018 12:32 pm
by 775405984
were you ever gonna fix this ever? Come on, please.

Re: [Chinese Sorting] Sort by name is messed up

Posted: Thu Sep 20, 2018 12:59 pm
by NotNull
775405984 wrote: were you ever gonna fix this ever? Come on, please.
It is number 227 on the to-do list.

Cut @void some slack ....

Re: [Chinese Sorting] Sort by name is messed up

Posted: Mon Oct 22, 2018 7:37 am
by Debugger
NotNull - a list that long will only get finished in the next century :lol:

Re: [Chinese Sorting] Sort by name is messed up

Posted: Mon Oct 22, 2018 7:40 am
by Debugger
What annoys me most is having to type the prefix expressions first. It is a waste of time for me, and not everyone remembers them all.

Re: [Chinese Sorting] Sort by name is messed up

Posted: Tue Oct 23, 2018 6:54 am
by void
Currently, Everything sorts filenames by Unicode code points, which is completely wrong, but fast!
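
To make the difference concrete, here is a minimal Python sketch (not Everything's code) comparing code point order with the order a pinyin user expects. The `PINYIN` table is a tiny hand-written mapping invented for this example; a real implementation would need readings for every character.

```python
# Three common surnames and their pinyin readings.
# PINYIN is a hypothetical, hand-written table for illustration only.
PINYIN = {"张": "zhang", "李": "li", "王": "wang"}

names = ["王", "张", "李"]

# Plain sorted() compares Python strings by Unicode code point.
by_code_point = sorted(names)

# Sorting by the romanized reading of each character gives pinyin order.
by_pinyin = sorted(names, key=lambda s: [PINYIN[ch] for ch in s])

print(by_code_point)  # ['张', '李', '王']
print(by_pinyin)      # ['李', '王', '张']
```

For these three names the code point order happens to be the exact reverse of the pinyin order, which is why the result list looks scrambled to a pinyin user.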

I have added support for the Unicode Collation Algorithm (UCA) to my TODO list. Hopefully this will be available in the next release of Everything.

While this is not pinyin, it might be 'good enough'.
Implementing Pinyin at this stage is not feasible. The sorting rules are too complex, and I can't use third-party sorting or the Windows API because their ordering could change at any time; it is critical that the Everything database is sorted in a specific way.
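
One way to see why a changing sort order is a problem is a rough Python sketch, under the assumption that a sorted index is searched with binary search (the thread does not describe Everything's actual data structures). `PINYIN`, `pinyin_key` and `lookup` are hypothetical names made up for this example.

```python
from bisect import bisect_left

# Hypothetical, hand-written pinyin table for illustration only.
PINYIN = {"张": "zhang", "李": "li", "王": "wang"}


def pinyin_key(name):
    return [PINYIN[ch] for ch in name]


# Index built and stored in code point order.
stored = sorted(["王", "张", "李"])


def lookup(index, name, key):
    """Binary search that assumes `index` is already sorted by `key`."""
    keys = [key(item) for item in index]
    pos = bisect_left(keys, key(name))
    return pos < len(index) and index[pos] == name


# Searching with the same rule the index was built with finds the entry.
found_same_rule = lookup(stored, "李", key=lambda s: s)

# Searching with a different rule looks in the wrong place and misses it.
found_other_rule = lookup(stored, "李", key=pinyin_key)

print(found_same_rule, found_other_rule)  # True False
```

If an external library or OS API changed its ordering between versions, an index sorted under the old rules would behave like the second lookup.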

I also have concerns about UCA, as storing the collation lookup tables will require quite a bit of data (128K+).
Adding collation lookup tables also comes with a small performance hit.

https://unicode.org/faq/collation.html

Re: [Chinese Sorting] Sort by name is messed up

Posted: Sun Jul 11, 2021 11:49 pm
by void
The Everything 1.5 alpha adds support for sorting by Unicode weights.

Re: [Chinese Sorting] Sort by name is messed up

Posted: Sat Jul 24, 2021 2:19 pm
by 775405984
void wrote: Sun Jul 11, 2021 11:49 pm The Everything 1.5 alpha adds support for sorting by Unicode weights.
I updated to 1269a, and it changed nothing.

Thanks for trying, but I'm not seeing the improvements.

Re: [Chinese Sorting] Sort by name is messed up

Posted: Mon Jul 26, 2021 5:21 pm
by therube
Maybe some hints as to what it is not doing correctly?

And is it (at least) "correct" per the Unicode Collation Algorithm?

Re: [Chinese Sorting] Sort by name is messed up

Posted: Tue Jul 27, 2021 5:11 am
by 775405984
therube wrote: Mon Jul 26, 2021 5:21 pm Maybe some hints as to what it is not doing correctly?

And is it (at least) "correct" per the Unicode Collation Algorithm?
[Attachment: Screenshot_2.png]
This is an example of how messed up sorting by name is.

I didn't find anything wrong with UCA though. What I did find is that the default Unicode ordering doesn't support pinyin.

You could map the UCA characters to GBK, since GBK supports pinyin.
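
As a rough illustration of the GBK idea (a sketch only, not something Everything does), Python's built-in `gbk` codec can serve as a sort key. This relies on the assumption that all characters are common ones from the GB2312 level-1 block, which is laid out in pinyin order; rarer characters are ordered differently, and characters with several readings get only one position. `gbk_key` is a made-up helper name.

```python
def gbk_key(name):
    # Encode with the standard-library "gbk" codec and compare the raw bytes.
    # errors="replace" keeps the sort from failing on characters GBK lacks.
    return name.encode("gbk", errors="replace")


names = ["王", "张", "李"]

by_gbk = sorted(names, key=gbk_key)

print(by_gbk)  # ['李', '王', '张']
```

The result matches pinyin order (li, wang, zhang) for these common surnames.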

I'm not a developer, so I don't know how to do it. I'll post this to a Chinese website to see if anyone has any ideas.

Thank you!