Coder Perfect

Remove all special characters, punctuation, and spaces from a string.


I need to strip all special characters, punctuation, and spaces from a string, leaving only letters and numbers.

Asked by user664546

Solution #1

This can be done without regex, using str.isalnum:

>>> string = "Special $#! characters   spaces 888323"
>>> ''.join(e for e in string if e.isalnum())
'Specialcharactersspaces888323'

str.isalnum returns True for a character that is a letter or a digit, so only those characters are kept.

If you insist on using regex, the other solutions will work fine. However, when a task can be done without a regular expression, that is usually the better way to go.
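As a rough sketch of how this might be wrapped for reuse: the helper names keep_alnum and keep_ascii_alnum below are made up for illustration and are not part of the answer. The second one assumes you want ASCII letters and digits only, since str.isalnum also accepts non-ASCII letters.

```python
def keep_alnum(text: str) -> str:
    """Keep only characters for which str.isalnum() is true."""
    return ''.join(ch for ch in text if ch.isalnum())


def keep_ascii_alnum(text: str) -> str:
    """Stricter variant (an assumption, not from the answer): ASCII letters and digits only."""
    # str.isascii() requires Python 3.7 or later.
    return ''.join(ch for ch in text if ch.isascii() and ch.isalnum())


sample = "Special $#! characters   spaces 888323"
print(keep_alnum(sample))                            # Specialcharactersspaces888323
print(keep_alnum("na\u00efve caf\u00e9 42"))         # accented letters are kept
print(keep_ascii_alnum("na\u00efve caf\u00e9 42"))   # navecaf42
```

Which variant fits depends on whether accented or other non-Latin letters count as "special" for your data.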

Answered by user225312

Solution #2

A regex that matches a run of characters that are not letters or digits:

[^A-Za-z0-9]+

Here’s how to do the regex substitution in Python:

import re
re.sub(r'[^A-Za-z0-9]+', '', mystring)
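If the same substitution runs on many strings, one possible arrangement is to compile the pattern once. This is only a sketch; the names NON_ALNUM and strip_non_alnum are hypothetical.

```python
import re

# Compile once so the pattern object can be reused across many strings.
NON_ALNUM = re.compile(r'[^A-Za-z0-9]+')


def strip_non_alnum(mystring: str) -> str:
    """Remove every run of characters that are not ASCII letters or digits."""
    return NON_ALNUM.sub('', mystring)


print(strip_non_alnum("Special $#! characters   spaces 888323"))
# Specialcharactersspaces888323
print(strip_non_alnum("snake_case-and.dots"))
# snakecaseanddots
```

Because the class lists A-Z, a-z, and 0-9 explicitly, underscores and non-ASCII letters are removed as well.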

Answered by Andy White

Solution #3

A shorter way:

import re
cleanString = re.sub(r'\W+', '', string)

If you want spaces between words and numbers, replace '' with ' ' as the second argument.
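To make the two replacement choices concrete, here is a small sketch on a made-up sample string. It also shows that \W leaves underscores alone, and one way to remove them too.

```python
import re

# Sample text chosen for illustration only.
text = "Hello, world! foo_bar 42!"

# Remove every run of non-word characters; "_" survives because \w includes it.
compact = re.sub(r'\W+', '', text)

# Replace each run with a single space instead, then trim the ends.
spaced = re.sub(r'\W+', ' ', text).strip()

# Add "_" to the class to drop underscores as well.
no_underscores = re.sub(r'[\W_]+', '', text)

print(compact)         # Helloworldfoo_bar42
print(spaced)          # Hello world foo_bar 42
print(no_underscores)  # Helloworldfoobar42
```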

Answered by tuxErrante

Solution #4

I timed the provided answers.

import re
re.sub(r'\W+', '', string)

is typically 3x faster than the next fastest top answer.

Use this option with caution. It may leave some special characters in place: \W does not match underscores, and in Python 3 it does not match non-ASCII letters or digits either.
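The caveat can be seen in a short sketch. The sample string is an assumption chosen for illustration; re.ASCII is the standard flag that restricts \w and \W to ASCII in Python 3.

```python
import re

# Illustrative input with accented letters and an underscore.
text = "caf\u00e9_na\u00efve 100%"

# Default Python 3 behaviour: accented letters and "_" count as word characters.
unicode_result = re.sub(r'\W+', '', text)

# With re.ASCII, \W also matches non-ASCII letters, but "_" is still kept.
ascii_result = re.sub(r'\W+', '', text, flags=re.ASCII)

# The explicit class removes everything except ASCII letters and digits.
strict_result = re.sub(r'[^A-Za-z0-9]+', '', text)

print(unicode_result)  # accented letters and "_" remain
print(ascii_result)    # caf_nave100
print(strict_result)   # cafnave100
```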

I wanted to expand on the provided answers by finding out which one runs in the least amount of time, so I used timeit to check some of the proposed answers against two of the sample strings:

Example 1

''.join(e for e in string if e.isalnum())

Example 2

import re
re.sub(r'[^A-Za-z0-9]+', '', string)

Example 3

import re
re.sub(r'\W+', '', string)

The results are the lowest value returned from runs with the parameters (3, 2000000), presumably timeit's repeat count and number of executions.

Example 3 can be about 3x faster than Example 1.
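A comparison along these lines could be set up roughly as follows. This is a sketch, not the answerer's script: the sample string is borrowed from Solution #1, the candidates dictionary is a hypothetical arrangement, and number is kept far below 2000000 so the run finishes quickly. Actual figures will vary by machine and Python version.

```python
import re
import timeit

# Assumed sample string, taken from Solution #1.
string = "Special $#! characters   spaces 888323"

candidates = {
    "Example 1": lambda: ''.join(e for e in string if e.isalnum()),
    "Example 2": lambda: re.sub(r'[^A-Za-z0-9]+', '', string),
    "Example 3": lambda: re.sub(r'\W+', '', string),
}

# repeat=3 and number=20_000 are illustrative; raise number for steadier figures.
timings = {
    name: min(timeit.repeat(func, repeat=3, number=20_000))
    for name, func in candidates.items()
}

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

Taking the minimum of the repeats is the usual way to reduce noise from other processes.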

Answered by mbeacom

Solution #5

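import re

strs = "how much for the maple syrup? $20.99? That's ridiculous!!!"
print(strs)
# Remove selected punctuation; inside [...] no "|" separators are needed.
nstr = re.sub(r'[?$.!]', '', strs)
print(nstr)
# Then drop anything that is not a letter, digit, or space.
nestr = re.sub(r'[^a-zA-Z0-9 ]', '', nstr)
print(nestr)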

You can add more special characters to the character class; each one is replaced by '' (the empty string), which means it is removed.
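If the list of characters to remove grows, one option is to build the character class from a plain string. The variable unwanted below is a hypothetical list, and re.escape is used so that characters with a special meaning inside a class do not break the pattern.

```python
import re

# Hypothetical set of characters to strip; extend as needed.
unwanted = "?$.!,'"

# re.escape guards characters that are special inside a class, such as "]" or "^".
pattern = re.compile('[' + re.escape(unwanted) + ']')

strs = "how much for the maple syrup? $20.99? That's ridiculous!!!"
cleaned = pattern.sub('', strs)
print(cleaned)  # how much for the maple syrup 2099 Thats ridiculous
```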

Answered by pkm
