more intelligent parsing of websites & URLs / URL fields -- check for illegal characters; strip those before inputting #64

@jwmh

SUMMARY:
When website URLs are imported (or even copied and pasted) from other sources, they sometimes carry character-set problems, including invalid but invisible characters.
This makes the URL invalid, but the average user will have no clue what's wrong; they'll just think the business's website is having problems.

STEPS TO REPRODUCE:
0. DON'T change the venue below; keep it as an example use case (until we've fully documented the issue and its causes, and have kept a good example of the 'bad' URL string elsewhere).

  1. Take a look at this venue:
    http://portland.activatehub.org/venues/922
  2. Note it was imported via a FB page.
  3. Click the 'website' link:
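    Actual URL: http://www.thewaypost.com/%E2%80%8E (the trailing %E2%80%8E is the percent-encoded UTF-8 form of U+200E, LEFT-TO-RIGHT MARK, an invisible formatting character)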
    User-visible URL: http://www.thewaypost.com/
  4. Get a 404 Page Not Found error: "The requested URL /‎ was not found on this server."
  5. Delete the trailing slash (and accompanying invisible character(s)) from the address bar, and load the domain name again.
  6. Watch it work perfectly this time.
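
One possible way to strip these characters before saving a URL, sketched in Python for illustration only (it is not taken from this project's code). The helper name `strip_invisible_chars` is hypothetical. The sketch assumes that anything in the Unicode "format" (Cf) or "control" (Cc) categories is unwanted in a URL, whether it appears literally or percent-encoded; that assumption would need review, since it also drops characters such as the zero-width joiner.

```python
import re
import unicodedata
from urllib.parse import quote, unquote

# Assumption: Unicode "format" (Cf) and "control" (Cc) characters are never wanted in a URL.
_INVISIBLE_CATEGORIES = {"Cf", "Cc"}

# One or more consecutive percent-escapes, e.g. "%E2%80%8E".
_ESCAPE_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def _visible(text):
    """Return text with format/control characters removed."""
    return "".join(
        ch for ch in text if unicodedata.category(ch) not in _INVISIBLE_CATEGORIES
    )


def _clean_escape_run(match):
    """Drop invisible characters hidden inside a run of percent-escapes."""
    run = match.group(0)
    try:
        decoded = unquote(run, errors="strict")
    except UnicodeDecodeError:
        return run  # not valid UTF-8; leave the run untouched
    cleaned = _visible(decoded)
    if cleaned == decoded:
        return run  # nothing invisible here; keep the original escapes as-is
    return quote(cleaned, safe="")  # re-encode whatever remains


def strip_invisible_chars(url):
    """Hypothetical helper: remove literal and percent-encoded invisible characters."""
    url = _visible(url.strip())
    return _ESCAPE_RUN.sub(_clean_escape_run, url)


print(strip_invisible_chars("http://www.thewaypost.com/%E2%80%8E"))
```

Only escape runs that actually contain an invisible character are rewritten, so legitimate escapes such as `%2F` or `%20` pass through unchanged.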
