netUpperLowerCase

LowerCase and UpperCase

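The two functions ToLowercase and ToUppercase convert text to lower case and upper case. They call the .NET methods ToLowerInvariant and ToUpperInvariant of System.String, so they work for Unicode characters in general and not only for the letters A to Z: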

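 ToLowercase←{
     ⍝ Convert a character vector to lower case using .NET.
     (0=1↑0⍴⍵):''                ⍝ numeric argument (e.g. ⍬): return ''
     ⎕USING←',mscorlib.dll'      ⍝ assignment is local to the dfn
     (⎕NEW System.String(⊂,⍵)).ToLowerInvariant
 }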

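 ToUppercase←{
     ⍝ Convert a character vector to upper case using .NET.
     (0=1↑0⍴⍵):''                ⍝ numeric argument (e.g. ⍬): return ''
     ⎕USING←',mscorlib.dll'      ⍝ assignment is local to the dfn
     (⎕NEW System.String(⊂,⍵)).ToUpperInvariant
 }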

      ToUppercase 'monday'
MONDAY

      ToUppercase¨ 'sunday' 'monday' 'tuesday'
 SUNDAY  MONDAY  TUESDAY

      ToUppercase 'Вторник'
ВТОРНИК
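
ToLowercase is used in the same way. The following session lines are a short sketch, assuming both functions are defined exactly as above; the last line shows the effect of the guard on a numeric argument:

```apl
      ToLowercase 'MONDAY'
monday

      ToLowercase 'Вторник'
вторник

      ⍴ToUppercase ⍬      ⍝ numeric argument: the guard returns ''
0
```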

Remove the Accents (diacritics)

The following function removes the accents (diacritics) from Unicode text. This is useful for normalizing text entered by a user before it is used in a search.

 r←RemoveAccents string;str;strFormD;stringBuilder;⎕USING
⍝ Function to remove the accents.
⍝ For example: 'Crème Brûlée' becomes 'Creme Brulee'

⍝ Adapted from the following posts:
⍝ http://www.siao2.com/2005/02/19/376617.aspx
⍝ http://www.siao2.com/2007/05/14/2629747.aspx

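 ⎕USING←'System,mscorlib.dll' 'System.Text,mscorlib.dll' 'System.Globalization,mscorlib.dll'

⍝ Decompose each accented character into a base character plus combining mark(s):
 str←⎕NEW String(⊂string)
 strFormD←str.Normalize(NormalizationForm.FormD)

⍝ Keep every character that is not a non-spacing (combining) mark:
 stringBuilder←⎕NEW StringBuilder
 {UnicodeCategory.NonSpacingMark≠CharUnicodeInfo.GetUnicodeCategory(⍵):{}stringBuilder.Append(⍵)}¨strFormD

⍝ Recompose whatever remains into the canonical composed form:
 str←⎕NEW String(⊂stringBuilder.ToString ⍬)
 r←str.Normalize(NormalizationForm.FormC)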
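
The effect of the FormD step can be checked on a single character. This is a minimal sketch, assuming the same assemblies as in RemoveAccents are available; the name s is only illustrative. Decomposing 'é' should give the base letter e (code point 101) followed by the combining acute accent (code point 769), which is the character the function filters out as a NonSpacingMark, while FormC gives back the single precomposed character:

```apl
      ⎕USING←'System,mscorlib.dll' 'System.Text,mscorlib.dll'
      s←⎕NEW String(⊂,'é')      ⍝ s is an illustrative name
      ⎕UCS s.Normalize(NormalizationForm.FormD)
101 769
      ⎕UCS s.Normalize(NormalizationForm.FormC)
233
```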

Here are some examples of its use:

      RemoveAccents 'Crème Brûlée'
Creme Brulee

      RemoveAccents 'âãäåçèéêë ìíîïðñòó ôõöùúûüý'
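aaaaceeee iiiiðnoo ooouuuuy

Note that ð (eth) is left unchanged: it is a separate letter with no canonical decomposition, so FormD produces no combining mark to remove.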
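
For searching, the two techniques on this page can be combined. The following is only a sketch: Fold is a hypothetical helper name, not part of the original page, and it assumes that ToLowercase and RemoveAccents are defined as shown above. It reduces both the stored text and the user input to an accent-free, lower-case form before comparing them:

```apl
      Fold←{ToLowercase RemoveAccents ⍵}    ⍝ hypothetical helper

      (Fold 'CRÈME brûlée')≡Fold 'creme brulee'
1
```

Removing the accents first and changing the case afterwards keeps each step simple; the order should not matter for the Latin letters in these examples.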



netUpperLowerCase (last edited 2015-04-14 20:18:23 by PierreGilbert)