Validating slugs against existing routes in Rails

Written October 18, 2008. Tagged Ruby, Ruby on Rails.

On a web site, it's neat to provide URLs like http://community.com/some_username or http://blog.com/some_category. They're only slightly shorter than /users/some_username or /u/some_username, but they look much better.

Having usernames or tags directly under the root of the site means they can collide with other routes, though. If you're using /login, you don't want users to be able to have that username.

The first thing to do is, of course, to put the username route at the very end. You need to, or its wildcard nature will catch every request (well, every request whose path doesn't include the separators . and /). This means that if a user somehow ended up with a name like "login", the login action would still work – but the less important user page would be eclipsed.
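To see why the order matters, here's a minimal sketch of first-match-wins routing in plain Ruby. It is not the Rails router: ROUTES, recognize and the regexes are stand-ins made up for this illustration, assuming a fixed /login route and a wildcard username route.

```ruby
# Toy first-match-wins router (an illustration, not Rails internals).
# Each entry pairs a pattern with a lambda that builds the params hash.
ROUTES = [
  # Fixed route: matches only the exact path /login.
  [%r{\A/login\z},
   lambda { |_m| { :controller => 'sessions', :action => 'new' } }],
  # Wildcard route: matches any single segment without "/" or ".".
  [%r{\A/(?<username>[^/.]+)\z},
   lambda { |m| { :controller => 'users', :action => 'show', :username => m[:username] } }]
]

# Returns the params of the first route that matches, or nil if none does.
def recognize(path, routes = ROUTES)
  routes.each do |pattern, build|
    match = pattern.match(path)
    return build.call(match) if match
  end
  nil
end

recognize('/login')[:controller]                  # "sessions"
recognize('/login', ROUTES.reverse)[:controller]  # "users": the wildcard got there first
```

With the wildcard listed first, /login never reaches the sessions controller, which is exactly the situation the ordering avoids.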

Let's assume our routes are

map.login 'login',     :controller => 'sessions', :action => 'new'
map.user  ':username', :controller => 'users',    :action => 'show'

You could then do this to avoid route collisions in usernames:

class User < ActiveRecord::Base
  # \z rather than \Z: \Z would also accept a name with a trailing newline.
  validates_format_of :name, :with => /\A[\w-]+\z/
  validates_uniqueness_of :name
  validate :name_is_not_a_route

protected

  def name_is_not_a_route
    path = ActionController::Routing::Routes.recognize_path("/#{name}", :method => :get) rescue nil
    errors.add(:name, "conflicts with existing path (/#{name})") if path && !path[:username]
  end

end

ActionController::Routing::Routes.recognize_path("/#{name}", :method => :get) takes a path (must begin with a slash) and an optional environment hash. With the routes specified above, we could have left out the :method, but we'll need it for RESTful routes or other routes with method conditions (otherwise they may be recognized by the wildcard route instead, or fail to be recognized altogether).

If the method fails to match a route, you get an ActionController::RoutingError. If it succeeds, you get a hash. The inline rescue above turns any such error into nil, so the validation simply adds no error when recognition fails.

Note that the user wildcard route will recognize a lot of stuff (the controller, not the route, then gets to decide whether there is such a user), so we shouldn't actually get routing errors as long as usernames can't contain periods or slashes (path separators). It's still good to be defensive, especially since the model above runs the route validation independently of the format validation.

When the method does recognize a route, the hash looks like this:

>> ActionController::Routing::Routes.recognize_path('/login')
=> {:controller=>"sessions", :action=>"new"}
>> ActionController::Routing::Routes.recognize_path('/some_username')
=> {:username=>"some_username", :controller=>"users", :action=>"show"}

If the hash has a value for the :username key, it was recognized by that route. That implies two things: no earlier route matched the username, and it has a format that means it can be properly routed as a username (again, usernames containing slashes or periods would fail here).

If you have other routes that also have a :username key, you may want to make this condition more strict, also confirming the controller and action. No need to worry about query strings, though – they're not part of the recognition process, so a query string can't add a :username key. In this example, the path with a query string simply isn't recognized:

>> ActionController::Routing::Routes.recognize_path('/login?username=foo', :method => :get)
# ActionController::RoutingError: No route matches "/login?username=foo" with {:method=>:get}
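The stricter condition mentioned above (confirming the controller and action as well as the :username key) could look something like this. The helper name user_page_route? is made up for this sketch; it only inspects a params hash of the shape recognize_path returns.

```ruby
# Hypothetical helper: true only when the recognized params point at the
# user page itself, not merely at some route that has a :username key.
def user_page_route?(params)
  !params.nil? &&
    params[:controller] == 'users' &&
    params[:action] == 'show' &&
    !params[:username].nil?
end

user_page_route?(:username => 'some_username', :controller => 'users', :action => 'show')  # true
user_page_route?(:controller => 'sessions', :action => 'new')                                # false
```

In the validation, the condition would then presumably become `path && !user_page_route?(path)` instead of `path && !path[:username]`.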

The uniqueness validation is there to make this point: the route validation only avoids collisions with earlier routes; it doesn't run any controller code, so it has no idea if the username is already taken.

Another thing to note is that (non-wildcard) routing is case-sensitive, so if you have a /login page, a user could pick the name "LOGIN", and both would work. If you don't like this, downcase the slug before checking for a route. Mixed-case routes would take more effort.
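A rough sketch of the downcasing idea, in plain Ruby. TAKEN_PATHS is a made-up stand-in for the route lookup (in the app, that would be the recognize_path call from the validation), and it assumes all your fixed routes are lowercase.

```ruby
# Stand-in for the real route lookup: paths claimed by earlier,
# non-wildcard routes. Assumed to be all lowercase.
TAKEN_PATHS = ['/login'].freeze

# Downcase the candidate name first, so "LOGIN" and "Login" are
# treated the same as "login" when checking for a collision.
def collides_with_route?(name)
  TAKEN_PATHS.include?("/#{name.downcase}")
end

collides_with_route?('LOGIN')          # true
collides_with_route?('some_username')  # false
```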

Of course, this code can't look into the future and prevent users from taking the names of routes you add later. You could run a check on every deploy to confirm that no existing usernames collide with newly added routes.
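Such a deploy-time check might be sketched like this. Everything here is an assumption for illustration: colliding_names is a made-up helper, and FAKE_RECOGNIZER stands in for a lambda wrapping the recognize_path call (with its rescue) so the example runs without Rails. In the app, the names would come from the User model.

```ruby
# Returns the names that are now claimed by a non-username route.
# `names` is any list of existing usernames; `recognizer` is any callable
# mapping a path to a params hash, or nil if nothing matched.
def colliding_names(names, recognizer)
  names.select do |name|
    params = recognizer.call("/#{name}")
    params && !params[:username]
  end
end

# Fake recognizer: pretend /signup was added after "signup" registered.
FAKE_RECOGNIZER = lambda do |path|
  case path
  when '/login', '/signup'
    { :controller => 'sessions', :action => 'new' }
  else
    { :controller => 'users', :action => 'show', :username => path[1..-1] }
  end
end

colliding_names(%w[alice signup bob], FAKE_RECOGNIZER)  # ["signup"]
```

What to do with the collisions (fail the deploy, mail someone, rename the route) is up to you.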

A simpler, but limited, solution I've heard of is to keep all routes shorter than, say, six characters (/login is fine but /logout would be too long) and only allow longer usernames.
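As a plain-Ruby sketch of that rule, with the six-character cutoff taken from the example above (the constant and method names are made up):

```ruby
# Hypothetical cutoff: every fixed route path stays at five characters
# or fewer, so any name of six or more characters can't collide.
MAX_ROUTE_LENGTH = 5

def long_enough_for_username?(name)
  name.length > MAX_ROUTE_LENGTH
end

long_enough_for_username?('login')   # false: could be a route
long_enough_for_username?('logout')  # true
```

In the model itself, this would likely just be `validates_length_of :name, :minimum => 6`.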