Regular expression: "Italian sentences"

Debugger
Posts: 591
Joined: Thu Jan 26, 2017 11:56 am

Post by Debugger »

Is there a regular expression that can recognize at least some Italian words in file names, or match any file name that contains at least one Italian word?
This is not an easy task; it needs someone who is an expert in regular expressions. I'm just trying to find the names of Italian songs in Everything.
For example, the title of this Italian song:
https://www.youtube.com/watch?v=1SuJud96yjM

These patterns did not work:
\b[\p{IsBasicLatin}À-ÖØ-öø-ÿ]+\b
\b[A-Za-zÀ-ÖØ-öø-ÿ]+\b
\b\p{Script=Latin}+\b
\b[\p{L}'àèéìíîòóùú]+\b
void
Developer
Posts: 15401
Joined: Fri Oct 16, 2009 11:31 pm

Re: Regular expression: "Italian sentences"

Post by void »

You'll need to make your own Italian word list.

For example:

\b(Pensami|Anche|Tu)\b
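
A word-list pattern like the one above gets tedious to maintain by hand, so here is a minimal Python sketch that builds it from a list. The list contents beyond Pensami, Anche and Tu, and the names ITALIAN_WORDS, build_word_pattern and looks_italian, are made up for illustration. Python's re module is not PCRE, but a plain alternation like this should behave the same way in both.

```python
import re

# Hypothetical starter list; only the first three words come from the post.
# Extend it with words taken from your own song titles.
ITALIAN_WORDS = ["Pensami", "Anche", "Tu", "amore", "perché"]


def build_word_pattern(words):
    """Build a whole-word alternation from a list of words."""
    # Longest words first, so a long word is tried before a shorter one
    # that happens to be its prefix. re.escape protects against any regex
    # metacharacters that might appear in the list.
    ordered = sorted(words, key=len, reverse=True)
    alternatives = "|".join(re.escape(word) for word in ordered)
    return r"\b(?:" + alternatives + r")\b"


pattern_text = build_word_pattern(ITALIAN_WORDS)
pattern = re.compile(pattern_text, re.IGNORECASE)


def looks_italian(file_name):
    """Return True if the file name contains at least one listed word."""
    return pattern.search(file_name) is not None


if __name__ == "__main__":
    # Paste the printed text into Everything after the regex: prefix.
    print(pattern_text)
    print(looks_italian("Pensami - Julio Iglesias.mp3"))
```

The \b anchors matter for short words: without them "Tu" would also match inside "Tuesday".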



Use \p{Latin} to match Latin-script characters, including accented letters.
Everything uses Perl Compatible Regular Expressions.



To match words containing at least one diacritic:
regex:\b\w*[ñáéíóúïüçâêîôûàèìòù]\w*\b

regex: = enable regular expressions
\b = match a word boundary.
\w* = match a word-character zero or more times.
[...] = match a character from a set.
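
To try the diacritic pattern outside Everything, a rough Python sketch follows. It uses the same pattern without the regex: prefix; the function name words_with_diacritics and the sample file names are invented for illustration. Python's re module is not PCRE, and re.IGNORECASE is added here as an assumption so that capital letters such as È are also caught.

```python
import re

# Same pattern as in the post, minus Everything's "regex:" prefix.
# For str patterns, \w and \b are Unicode-aware in Python 3.
DIACRITIC_WORD = re.compile(
    r"\b\w*[ñáéíóúïüçâêîôûàèìòù]\w*\b",
    re.IGNORECASE,
)


def words_with_diacritics(file_name):
    """Return every word in file_name that contains a listed diacritic."""
    return DIACRITIC_WORD.findall(file_name)


if __name__ == "__main__":
    for name in ["Perché no - Città vuota.mp3", "Yesterday.mp3"]:
        print(name, "->", words_with_diacritics(name))
```

Keep in mind that the set also contains letters from Spanish and French (ñ, ç), so a match only suggests a non-English title, not specifically an Italian one.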
ChrisGreaves
Posts: 609
Joined: Wed Jan 05, 2022 9:29 pm

Re: Regular expression: "Italian sentences"

Post by ChrisGreaves »

Debugger wrote: Mon Apr 22, 2024 9:48 am
Is there a regular expression that can recognize at least some "Italian words" in file names

Void's suggestion to create a list of Italian words could be carried out semi-automatically: take your existing list of file names, titles, etc., and use Selenium to translate each term into Italian.
I found this video to be a good introduction.
Cheers, Chris