Revision 17 as of 2015-04-14 20:18:23
netUpperLowerCase
LowerCase and UpperCase
The two functions ToLowercase and ToUppercase convert text to lower and upper case through the .NET System.String class, so they also work on Unicode characters outside the ASCII range:
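ToLowercase←{
    (0=1↑0⍴⍵):''                                 ⍝ numeric argument (e.g. ⍬): return ''
    ⎕USING←',mscorlib.dll'                       ⍝ use mscorlib; .NET names are fully qualified
    (⎕NEW System.String(⊂,⍵)).ToLowerInvariant
}

ToUppercase←{
    (0=1↑0⍴⍵):''                                 ⍝ numeric argument (e.g. ⍬): return ''
    ⎕USING←',mscorlib.dll'                       ⍝ use mscorlib; .NET names are fully qualified
    (⎕NEW System.String(⊂,⍵)).ToUpperInvariant
}

      ToUppercase 'monday'
MONDAY
      ToUppercase¨ 'sunday' 'monday' 'tuesday'
SUNDAY MONDAY TUESDAY
      ToUppercase 'Вторник'
ВТОРНИК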
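A common use of these functions is comparing text without regard to case. The following is a minimal sketch and not part of the original page: SameText is a hypothetical helper name, and it assumes that ToLowercase is defined as above and returns a simple character vector.

```apl
⍝ Hypothetical helper (not from the original page): case-insensitive match.
⍝ Assumes ToLowercase is defined as shown above.
SameText←{(ToLowercase ⍺)≡ToLowercase ⍵}

      'Monday' SameText 'MONDAY'
1
      'Вторник' SameText 'ВТОРНИК'
1
      'Monday' SameText 'Sunday'
0
```

Folding both arguments with the same function keeps the comparison symmetric. ToUppercase would serve equally well here.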
Remove the Accents (diacritics)
The following function removes the accents (diacritics) from Unicode text. This is useful for normalizing user input before it is used in a search.
r←RemoveAccents string;str;strFormD;stringBuilder;⎕USING
⍝ Function to remove the accents.
⍝ For example: 'Crème Brûlée' becomes 'Creme Brulee'
⍝ Adapted from the following posts:
⍝ http://www.siao2.com/2005/02/19/376617.aspx
⍝ http://www.siao2.com/2007/05/14/2629747.aspx
⎕USING←'System,mscorlib.dll' 'System.Text,mscorlib.dll' 'System.Globalization,mscorlib.dll'
str←⎕NEW String(⊂string)
⍝ Decompose each accented character into a base character plus combining marks
strFormD←str.Normalize(NormalizationForm.FormD)
stringBuilder←⎕NEW StringBuilder
⍝ Keep every character that is not a non-spacing (combining) mark
{UnicodeCategory.NonSpacingMark≠CharUnicodeInfo.GetUnicodeCategory(⍵):{}stringBuilder.Append(⍵)}¨strFormD
str←⎕NEW String(⊂stringBuilder.ToString ⍬)
⍝ Recompose whatever remains into its composed form
r←str.Normalize(NormalizationForm.FormC)
Here are some examples of its use:
      RemoveAccents 'Crème Brûlée'
Creme Brulee
      RemoveAccents 'âãäåçèéêë ìíîïðñòó ôõöùúûüý'
aaaaceeee iiiiðnoo ooouuuuy
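For the search use case mentioned above, accent removal and case conversion can be combined into a single normalization step. This is only a sketch under stated assumptions: SearchKey and Matches are hypothetical names that do not appear in the original page, and both RemoveAccents and ToLowercase are assumed to be defined as shown on this page.

```apl
⍝ Hypothetical helpers (not from the original page).
⍝ SearchKey: strip the accents, then convert to lower case.
SearchKey←{ToLowercase RemoveAccents ⍵}
⍝ Matches: 1 if the two texts are identical once normalized.
Matches←{(SearchKey ⍺)≡SearchKey ⍵}

      SearchKey 'Crème Brûlée'
creme brulee
      'CRÈME BRÛLÉE' Matches 'creme brulee'
1
```

Characters that have no decomposition into a base letter plus a combining mark, such as 'ð' in the second example above, pass through unchanged. A search that must treat them as plain letters would need an additional mapping.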
CategoryDyalog CategoryDyalogDotNet CategoryDyalogExamplesDotNet