New in version 2.3.
This module implements RFC 3490 (Internationalized Domain Names in
Applications) and RFC 3491 (Nameprep: A Stringprep Profile for
Internationalized Domain Names (IDN)). It builds upon the punycode encoding (RFC 3492) and stringprep.
These RFCs together define a protocol to support non-ASCII characters in domain names. A
domain name containing non-ASCII characters (such as "www.Alliancefrançaise.nu") is
converted into an ASCII-compatible encoding (ACE, such as "www.xn--alliancefranaise-npb.nu").
The ACE form of the domain name is then used in all places where the protocol does not allow
arbitrary characters, such as DNS queries, HTTP Host fields, and so on. This conversion is
carried out in the application and, if possible, is invisible to the user: the application
should transparently convert Unicode domain labels to IDNA on the wire, and convert ACE labels
back to Unicode before presenting them to the user.
Python supports this conversion in several ways: the idna codec converts between Unicode and
the ACE form. Furthermore, the socket module transparently converts Unicode host names to
ACE, so applications need not convert host names themselves when they pass them to the socket
module. On top of that, modules that take host names as function parameters, such as httplib
and ftplib, accept Unicode host names (httplib then also transparently sends an IDNA host name
in the Host field if it sends that field at all).
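As a minimal sketch of the codec route described above, the following round-trips a host name through the idna codec. It assumes Python 3 syntax (where str is Unicode and encoding yields bytes), and the host name is a hypothetical example rather than one taken from this documentation.

```python
# Assumption: Python 3, where str is Unicode and .encode() returns bytes.
unicode_name = "www.bücher.example"  # hypothetical host name

# Unicode -> ACE: each non-ASCII label is nameprepped, Punycode-encoded
# and prefixed with "xn--"; ASCII-only labels pass through unchanged.
ace_name = unicode_name.encode("idna")

# ACE -> Unicode, for presenting the name to the user.
round_trip = ace_name.decode("idna")

print(ace_name)    # b'www.xn--bcher-kva.example'
print(round_trip)  # www.bücher.example
```

The codec works on whole dotted names, splitting them into labels and converting each label separately.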
When receiving host names from the wire (such as in reverse name lookup), no automatic
conversion to Unicode is performed: Applications wishing to present such host names to the
user should decode them to Unicode.
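One way an application might do this decoding is sketched below. The helper name host_for_display and the sample host names are hypothetical, the fallback behaviour is a design assumption rather than anything the module prescribes, and Python 3 syntax is assumed.

```python
def host_for_display(wire_name):
    """Return a Unicode form of a host name received as ASCII bytes.

    Hypothetical helper: falls back to the raw ASCII text if the
    name is not valid IDNA.
    """
    try:
        return wire_name.decode("idna")
    except UnicodeError:
        # Not decodable as IDNA: show the bytes as ASCII, replacing
        # anything that is not ASCII.
        return wire_name.decode("ascii", "replace")


# Stand-in for a name returned by a reverse lookup (no network access here).
received = b"xn--mnchen-3ya.example"
print(host_for_display(received))  # münchen.example
```

Names without ACE labels come back unchanged, so the helper can be applied to every received host name.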
The module encodings.idna also implements the nameprep procedure,
which performs certain normalizations on host names, to achieve case-insensitivity of
international domain names, and to unify similar characters. The nameprep functions can be
used directly if desired.
- nameprep(label)
- Return the nameprepped version of label. The implementation currently assumes query strings, so AllowUnassigned is true.
- ToASCII(label)
- Convert a label to ASCII, as specified in RFC 3490. UseSTD3ASCIIRules is assumed to be false.
- ToUnicode(label)
- Convert a label to Unicode, as specified in RFC 3490.
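The label-level functions can be tried directly, as in this short sketch. It assumes Python 3, where ToASCII returns bytes, and the label "Bücher" is a hypothetical example. Note that these functions operate on a single label, not on a dotted host name.

```python
from encodings import idna

label = "Bücher"  # a single label, not a full host name

# nameprep case-folds and normalizes the label.
prepped = idna.nameprep(label)

# ToASCII nameprepps the label, Punycode-encodes it and adds the ACE prefix.
ace_label = idna.ToASCII(label)

# ToUnicode reverses the conversion for an ACE label.
restored = idna.ToUnicode(ace_label)

print(prepped)    # bücher
print(ace_label)  # b'xn--bcher-kva'
print(restored)   # bücher
```

Because nameprep folds case, the restored label is the lowercase form rather than the original spelling.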