sanitizestr

sanitizestr(string=None, alphanumeric=False, nospaces=False, asciify=False, lower=False, validvariable=False, spacechar='_', symchar='?')

Remove all non-“standard” characters from a string.

Can be used, for example, to generate a valid variable name from arbitrary input or to remove non-ASCII characters (replacing them with equivalent ASCII ones where possible).

Parameters:
  • string (str) – the string to sanitize

  • alphanumeric (bool) – allow only alphanumeric characters

  • nospaces (bool) – remove spaces

  • asciify (bool) – remove non-ASCII characters, replacing them with equivalent ASCII ones where possible

  • lower (bool) – convert uppercase characters to lowercase

  • validvariable (bool) – convert to a valid Python variable name (similar to alphanumeric=True, nospaces=True; uses spacechar to substitute)

  • spacechar (str) – if nospaces is True, character to replace spaces with (can be blank)

  • symchar (str) – character to replace non-alphanumeric characters with (can be blank)
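
To illustrate what "replacing with equivalent ASCII ones if possible" can mean in practice, here is a minimal standard-library sketch. It is an assumption about one way to get this behaviour, not necessarily how sanitizestr implements it, and the name asciify_sketch is hypothetical. It decomposes accented characters with unicodedata, drops the accents, and substitutes symchar for anything that has no ASCII equivalent.

```python
import unicodedata

def asciify_sketch(text, symchar='?'):
    """Hypothetical helper: approximate ASCII conversion using only the stdlib."""
    out = []
    # NFKD splits characters such as 'á' into a base letter plus a combining accent
    for ch in unicodedata.normalize('NFKD', text):
        if ord(ch) < 128:
            out.append(ch)       # already ASCII: keep as-is
        elif unicodedata.combining(ch):
            continue             # combining accent: drop it, keeping the base letter
        else:
            out.append(symchar)  # no ASCII equivalent: substitute symchar
    return ''.join(out)

print(asciify_sketch('Lukáš wanted €500‽', symchar='*'))  # Lukas wanted *500*
```

Unlike the sanitizestr example with nospaces=True, this sketch leaves spaces untouched.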

Examples:

import sciris as sc

string1 = 'This Is a String'
sc.sanitizestr(string1, lower=True) # Returns 'this is a string'

string2 = 'Lukáš wanted €500‽'
sc.sanitizestr(string2, asciify=True, nospaces=True, symchar='*') # Returns 'Lukas_wanted_*500*'

string3 = '"Ψ scattering", María said, "at ≤5 μm?"'
sc.sanitizestr(string3, asciify=True, alphanumeric=True, nospaces=True, spacechar='') # Returns '??scattering??Mariasaid??at?5?m??'

string4 = '4 path/names/to variable!'
sc.sanitizestr(string4, validvariable=True, spacechar='') # Returns '_4pathnamestovariable'
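
The result of validvariable=True can be checked with the standard library alone. The sketch below uses str.isidentifier() and keyword.iskeyword() on the output documented above for string4. The helper name is_valid_variable is hypothetical and is not part of Sciris.

```python
import keyword

def is_valid_variable(name):
    """Hypothetical helper: True if name can be used as a Python variable name."""
    return name.isidentifier() and not keyword.iskeyword(name)

# The documented output for string4 is a valid name
print(is_valid_variable('_4pathnamestovariable'))      # True
# Without the leading underscore, the name would start with a digit
print(is_valid_variable('4pathnamestovariable'))       # False
# The raw input contains spaces, slashes and punctuation
print(is_valid_variable('4 path/names/to variable!'))  # False
```

This shows why the documented output begins with an underscore: a Python identifier cannot start with a digit.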

New in version 3.0.0.