How to fix the coding into Polish letters? (charset fix) (REGEX ISSUE)

Posted: Thu Dec 06, 2018 2:55 pm
by Debugger
How can I fix the encoding of Polish letters (charset fix)? I need a regex that replaces all of the garbled characters throughout the text.

Mini-example:

Code:

Na drodze mojej sstanąłes. Pojawiles się z nikąd. Żeśmy się

"Ä…"=>"ą"
"ć"=>"ć"
"Ä™"=>"ę"
"ó"=>"ó"
"Å‚"=>"ł"
"Å„"=>"ń"
"Å›"=>"ś"
"ż"=>"ż"
"ź"=>"ź"
"Å�"=>"Ł"
"Ó"=>"Ó"
"ü"=>"ü"
"ä"=>"ä"
"Å‘"=>"ö"
"Å�"=>"Ö"
"Å»"=>"Ż"
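
A table like the one above can be applied in a single pass by joining its keys into one regex alternation. The following is only a minimal Python sketch of that idea, not an EmEditor feature: the names `TABLE`, `PATTERN` and `replace_garbled` are made up for illustration, and the keys are written as `\u` escapes so that the script's own encoding cannot corrupt them. The two entries that contain the replacement character are left out, because the table uses the same garbled form for both "Ł" and "Ö" and a lookup cannot tell them apart.

```python
import re

# Subset of the table above: garbled pair -> intended letter.
# Keys use \u escapes (e.g. U+00C4 U+2026 is the garbled form of "ą").
TABLE = {
    "\u00c4\u2026": "ą",
    "\u00c4\u2021": "ć",
    "\u00c4\u2122": "ę",
    "\u00c3\u00b3": "ó",
    "\u00c5\u201a": "ł",
    "\u00c5\u201e": "ń",
    "\u00c5\u203a": "ś",
    "\u00c5\u00bc": "ż",
    "\u00c5\u00ba": "ź",
    "\u00c5\u00bb": "Ż",
}

# One alternation of all keys; longest keys first so that a longer key
# would win over a shorter prefix if such entries were added later.
PATTERN = re.compile(
    "|".join(re.escape(key) for key in sorted(TABLE, key=len, reverse=True))
)


def replace_garbled(text):
    """Replace every garbled pair listed in TABLE with its intended letter."""
    return PATTERN.sub(lambda match: TABLE[match.group(0)], text)
```

The same alternation, with the pairs typed literally, could in principle be pasted into any editor that supports regex search, but the replacement side would then need one pass per letter unless the editor supports a lookup of this kind.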


Alternatively, how can I convert *.html files to plain text without losing the original encoding?
All of the HTML markup must be removed, of course; the source is something like a blog page.
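
For the HTML-to-text question, one possible approach is sketched below using Python's standard `html.parser` module. It is only an illustration under assumptions: the class and function names (`TextExtractor`, `html_to_text`, `convert_file`) are hypothetical, the files are assumed to be UTF-8 unless another encoding is passed in, and dropping the contents of `script` and `style` elements is a choice made here, not something the question specifies.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring tags and script/style contents."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # "&oacute;" becomes "ó"
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(markup):
    """Return the text content of an HTML string."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)


def convert_file(src, dst, encoding="utf-8"):
    """Read src with the given encoding and write its text with the same one."""
    markup = Path(src).read_text(encoding=encoding)
    Path(dst).write_text(html_to_text(markup), encoding=encoding)
```

Reading and writing with the same explicit `encoding` is what keeps the Polish letters intact; if the wrong encoding is passed to `read_text`, the output will be garbled in the same way as the sample above.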

Re: How to fix the coding into Polish letters? (charset fix) (REGEX ISSUE)

Posted: Fri Dec 07, 2018 7:03 am
by void
Fixing encoding with regex would be difficult.

Did you want to use regex to find (and not fix) possible encoding issues?

Please try converting from ANSI to UTF-8:
https://superuser.com/questions/762473/ ... in-notepad
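
The conversion suggested above can also be combined with a regex, which may be closer to what was asked. Most of the garbled pairs in the table look like UTF-8 bytes that were decoded as Windows-1252, so the following Python sketch assumes exactly that: it uses a regex only to find candidate pairs and then re-decodes each pair. The names `_CANDIDATE`, `_repair` and `fix_mojibake` are made up for illustration, and the sketch covers two-byte sequences only, which is enough for Polish letters.

```python
import re

# A character in U+00C2..U+00DF (what a UTF-8 lead byte looks like after being
# decoded as Windows-1252) followed by one character that is neither ASCII,
# a C1 control, nor in U+00C0..U+00FF (so it could be a continuation byte).
_CANDIDATE = re.compile(r"[\u00C2-\u00DF][^\x00-\x9F\u00C0-\u00FF]")


def _repair(match):
    """Re-decode one candidate pair; leave it untouched if that fails."""
    pair = match.group(0)
    try:
        return pair.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return pair


def fix_mojibake(text):
    """Repair pairs that look like UTF-8 misread as Windows-1252."""
    return _CANDIDATE.sub(_repair, text)
```

Text that is already correct passes through unchanged, because correctly decoded Polish letters fall outside the range the pattern looks for. Pairs in which a byte has already been turned into the replacement character cannot be recovered this way, since the original byte is gone.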

Re: How to fix the coding into Polish letters? (charset fix) (REGEX ISSUE)

Posted: Fri Dec 07, 2018 11:32 am
by Debugger
I do not understand any of this; it seems complicated, so I still do not know how to do it. I used to have a regular expression that did this in a few seconds. Now I need either a regex or step-by-step instructions for doing it in EmEditor (because that is what I use, and it is intuitive).