URLs and email addresses with unusable characters like %...

Sometimes they contain the symbol % followed by a few characters not usually found in URLs and addresses.
_________________________________________________________
Thanks to Brian Wilson, whose article URL Encoding, at http://www.blooberry.com/indexdot/html/topics/urlencoding.htm, explains the reason, the method in the madness, and helps decode.

Introduction & the basic scheme:
- % is the prefix for encoded characters in the URL.
- Following % is the two-digit hexadecimal form of the character's decimal number in the ASCII character list.

For example, a space is character 32 in the ISO-Latin character set. Decimal 32 is represented in hex notation as 20. Thus when encoded in a URL it is %20, to avoid software acting as though it is something other than a simple space. So get out your ISO-Latin ASCII character list. (Brian provides a link for you.)

Essentially URLs should only contain a limited subset of the US-ASCII character set:
0-9 a-z A-Z $ - _ . + ! * ' ( ) ,
Anything else is at best subject to misinterpretation. (In the above list even $ + , are questionable.)

(Hexadecimal notation is a number system with base 16, using the digits 0 through 9 plus the letters A through F, whereas binary is base 2 and the usual number system we use is base 10.)

My original information follows, not yet updated/replaced.
_________________________________________________________
Here are a few common cases:
%20 is a space (as used between words. MSIE6 will routinely show that. I wish people wouldn't use Microsoft Office style file naming on the Internet.)
%2F is / (forward slash, for example as used in pairs after http: and to denote sub-directories)
%26 is & - "ampersand"
%252D is - (dash, often used in the middle of a domain or directory or file name)
%255F is _ (underscore, often used in the middle of a domain or directory or file name)
%257E is ~ (tilde, often used at the beginning of a main directory name especially for personal sites hosted in umbrella services)
%3A is :
%3D is =
%3F is ?
%40 is @ ("at" symbol in email addresses)
%5C is backslash \

So a URL that looks like http%3A%2F%2Fourspace%2F%257Ehalloween%255Fboy%252Dchild/ would actually be http://ourspace/~halloween_boy-child/, but don't bet your life on my decoding.

You can carefully edit the URL to replace the characters. You can try to extract the URL from a web page by using View|Source to see the raw HTML code rather than depending on links that open in a frame on the original page the link is from - that way you can identify the actual site.

The substitute characters in the list above end with the two-digit hex representation of the decimal number of the original character in the ASCII list of 128 characters and control codes. None of them are in the "extended ASCII" character set, which Internet email would have to handle specially as more than 7 bits are needed. (Some have two additional characters between % and the hex value, always 25 in the above list. Hex 25 is the % character itself, so those entries were encoded twice: %252D decodes first to %2D and then to -.)

You've seen the =20 or such that sometimes shows up when a line is broken in transmission because it was longer than some software could handle. (That has to do with the "quoted printable" option in email settings. It uses = rather than % as the prefix for a hex value, so =20 is an encoded space, and an = at the very end of a line marks a break that was inserted to keep lines short.)

You can find tables of ASCII characters with decimal and hex values on the Internet.

So much learning, so little time.
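If you would rather not decode by hand, here is a minimal sketch in Python using the standard library's urllib.parse.unquote and quopri modules. The helper name decode_fully and its max_rounds limit are my own inventions for illustration, and the example URL is written with %3A for the colon, per the list above. Repeating the decoding is an assumption that suits doubly encoded URLs like this one; a URL that is meant to contain a literal % sequence would be over-decoded, so treat the result as a guess to check, not as gospel.

```python
import quopri
from urllib.parse import unquote


def decode_fully(text, max_rounds=5):
    """Percent-decode repeatedly until the text stops changing.

    Hypothetical helper: max_rounds is an arbitrary safety limit.
    """
    for _ in range(max_rounds):
        decoded = unquote(text)
        if decoded == text:
            break
        text = decoded
    return text


# Hex 20 is decimal 32, which is the space character.
print(int("20", 16))        # 32
print(repr(chr(0x20)))      # ' '

example = "http%3A%2F%2Fourspace%2F%257Ehalloween%255Fboy%252Dchild/"

# One round turns each %25 into a plain %, leaving %7E, %5F and %2D behind.
print(unquote(example))     # http://ourspace/%7Ehalloween%5Fboy%2Dchild/

# A second round finishes the job.
print(decode_fully(example))  # http://ourspace/~halloween_boy-child/

# Quoted-printable in email uses = instead of % as the prefix.
print(quopri.decodestring(b"two=20words"))  # b'two words'
```

The first two prints just confirm the arithmetic from the space example; the rest apply the same table lookup that you would otherwise do with an ASCII chart.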
:-)
-------------------------------------------------------------------------
Copyright smilin' Keith Sketchley 2018.06.09
Legalities detailed on http://www.keithsketchley.com/ apply.
-------------------------------------------------------------------------