Coder Perfect

Url decode UTF-8 in Python


I have spent plenty of time as far as I am newbie in Python. How could I ever decode such a URL:

to the following in Python 2.7: title==прaвовая+защита

url=urllib.unquote(url.encode(“utf8”)) returns a very unattractive result.

There is still no solution; any assistance would be greatly appreciated.

Asked by swordholder

Solution #1

The data is UTF-8 encoded bytes escaped with URL quoting, so you want to decode, with urllib.parse.unquote(), which handles decoding from percent-encoded data to UTF-8 bytes and then to text, transparently:

from urllib.parse import unquote

url = unquote(url)


>>> from urllib.parse import unquote
>>> url = ''
>>> unquote(url)

urllib.unquote() is the Python 2 counterpart, however it returns a bytestring, which you’d have to decode manually:

from urllib import unquote

url = unquote(url).decode('utf8')

Answered by Martijn Pieters

Solution #2

You can use urllib.parse if you’re running Python 3.

url = """"""

import urllib.parse



Answered by pavan

Solution #3

With the requests library, you can also achieve the desired result:

import requests

url = ""

print(f"Before: {url}")
print(f"After:  {requests.utils.unquote(url)}")


$ python3


If you’re currently using requests and don’t want to use another library, this could be useful.

Answered by ivanleoncz

Solution #4

HTML entities can be included in URLs. This also replaces them.

#from urllib import unquote #earlier python version
from urllib.request import unquote
from html import unescape

Answered by Roland Puntaier

Post is based on