Problem
I need to strip a string of all special characters, punctuation, and spaces, leaving only letters and numbers.
Asked by user664546
Solution #1
This can be accomplished without the use of regex:
>>> string = "Special $#! characters spaces 888323"
>>> ''.join(e for e in string if e.isalnum())
'Specialcharactersspaces888323'
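This relies on str.isalnum, which returns True if every character in the string is alphanumeric and the string contains at least one character.
If you insist on using regex, the other solutions will do fine. However, if it can be done without a regular expression, that is the preferred approach.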
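As a minimal sketch, the same idea can be wrapped in small helpers. The names keep_alnum and keep_ascii_alnum are hypothetical, chosen only for illustration. Note that str.isalnum is Unicode-aware, so accented letters are kept; the second helper assumes only ASCII letters and digits are wanted and relies on str.isascii, which requires Python 3.7 or later.

```python
def keep_alnum(text):
    """Keep only characters for which str.isalnum() is true (Unicode-aware)."""
    return ''.join(ch for ch in text if ch.isalnum())


def keep_ascii_alnum(text):
    """Keep only ASCII letters and digits (assumption: non-ASCII letters are unwanted)."""
    return ''.join(ch for ch in text if ch.isascii() and ch.isalnum())


print(keep_alnum("Special $#! characters spaces 888323"))  # Specialcharactersspaces888323
print(keep_alnum("café 42"))        # café42
print(keep_ascii_alnum("café 42"))  # caf42
```

Which variant fits depends on whether non-English letters count as "special" for your data.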
Answered by user225312
Solution #2
A regex to match a string of characters that aren’t letters or numbers is as follows:
[^A-Za-z0-9]+
Here’s how to do a regex substitution in Python:
re.sub('[^A-Za-z0-9]+', '', mystring)
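For a version that runs on its own, the substitution might be written as below. This is only a sketch: the names NON_ALNUM and strip_non_alnum are hypothetical, and precompiling the pattern is a design choice that helps mainly when the same pattern is applied many times.

```python
import re

# Matches one or more characters that are not ASCII letters or digits.
NON_ALNUM = re.compile(r'[^A-Za-z0-9]+')


def strip_non_alnum(mystring):
    """Return mystring with every run of non-alphanumeric characters removed."""
    return NON_ALNUM.sub('', mystring)


print(strip_non_alnum("Special $#! characters spaces 888323"))  # Specialcharactersspaces888323
```

Unlike \W, this character class also removes underscores and non-ASCII letters.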
Answered by Andy White
Solution #3
A shorter way:
import re
cleanString = re.sub(r'\W+', '', string)
If you want to keep spaces between words and numbers, replace '' with ' '.
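The space-preserving variant could look like the sketch below. The strip() call is an addition not in the original answer; it assumes you do not want a leftover space at either end of the result.

```python
import re

string = "Special $#! characters spaces 888323"

# Replace each run of non-word characters with a single space,
# then trim any space left at the ends.
spaced = re.sub(r'\W+', ' ', string).strip()
print(spaced)  # Special characters spaces 888323
```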
Answered by tuxErrante
Solution #4
I timed the provided answers.
import re
re.sub(r'\W+', '', string)
is typically 3x faster than the next fastest top answer provided.
Use this option with caution: it may not remove some special characters. For example, \W never matches the underscore, and in Python 3 it treats non-ASCII letters as word characters, so both are left in place.
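The caution about \W can be illustrated with a short sketch, assuming Python 3 and a str pattern. The sample string is made up for this example.

```python
import re

sample = "snake_case søk 42!"

# Default (Unicode) matching in Python 3: '_' and 'ø' count as word characters.
unicode_result = re.sub(r'\W+', '', sample)

# re.ASCII narrows \w to [a-zA-Z0-9_], so 'ø' is removed but '_' still survives.
ascii_result = re.sub(r'\W+', '', sample, flags=re.ASCII)

print(unicode_result)  # snake_casesøk42
print(ascii_result)    # snake_casesk42
```

If underscores must go as well, an explicit class such as [^A-Za-z0-9]+ is the safer choice.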
After seeing this, I wanted to expand on the provided answers by finding out which one runs in the least amount of time, so I used timeit to check some of the proposed answers against two of the example strings. The examples compared were:
Example 1
''.join(e for e in string if e.isalnum())
Example 2
import re
re.sub('[^A-Za-z0-9]+', '', string)
Example 3
import re
re.sub(r'\W+', '', string)
The comparison is based on the lowest result returned from an average of: (3, 2000000)
Example 3 can be up to 3x faster than Example 1.
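A comparison along these lines could be reproduced with a sketch like the following. It is not the answerer's original benchmark script: the dictionary layout, the single test string, and the reduced run count (20000 instead of 2000000) are assumptions made so the sketch finishes quickly. Timings vary by machine and Python version, so no expected numbers are shown.

```python
import timeit

# Shared setup executed once per timing run.
setup = "import re; string = 'Special $#! characters spaces 888323'"

candidates = {
    "Example 1": "''.join(e for e in string if e.isalnum())",
    "Example 2": "re.sub('[^A-Za-z0-9]+', '', string)",
    "Example 3": r"re.sub(r'\W+', '', string)",
}

# For each candidate, run 3 repeats of 20000 executions and keep the fastest repeat.
results = {
    name: min(timeit.repeat(stmt, setup=setup, repeat=3, number=20000))
    for name, stmt in candidates.items()
}

for name, seconds in results.items():
    print(f"{name}: {seconds:.4f} s")
```

Taking the minimum of the repeats is the usual way to reduce noise from other processes on the machine.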
Answered by mbeacom
Solution #5
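#!/usr/bin/env python3
import re
strs = "how much for the maple syrup? $20.99? That's ricidulous!!!"
print(strs)
# Remove a chosen set of punctuation characters ('|' is not needed inside a character class)
nstr = re.sub(r'[?$.!]', '', strs)
print(nstr)
# Remove everything that is not an ASCII letter, digit, or space
nestr = re.sub(r'[^a-zA-Z0-9 ]', '', nstr)
print(nestr)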
More special characters can be added to the character class; they will be replaced by '' (the empty string), which means they will be removed.
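One way to make the list of removed characters easy to extend is to build the character class from a plain string. This is a sketch only: remove_chars is a hypothetical helper, and it assumes the chars argument is non-empty.

```python
import re


def remove_chars(text, chars):
    """Remove every character listed in chars from text (chars must be non-empty)."""
    # re.escape keeps characters such as ']' or '-' literal inside the class.
    pattern = '[' + re.escape(chars) + ']'
    return re.sub(pattern, '', text)


print(remove_chars("how much? $20.99!", "?$.!"))  # how much 2099
```

Adding another character to remove then only requires appending it to the second argument.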
Answered by pkm
Post is based on https://stackoverflow.com/questions/5843518/remove-all-special-characters-punctuation-and-spaces-from-string