Selecting carefully with Laravel joins

Laravel comes packaged with an ORM called Eloquent, which is one of the better PHP tools for dealing with your database layer using objects. It provides all of the features you’d expect from a modern relational mapper, and cleanly uses some PHP magic to make the process pleasant.

But it has its quirks, and today i’m setting my sights on a not-uncommon operation – the join.

The tables

Let’s say we have two tables: users and posts. users has an id and a name. posts has an id, name, content and user_id (a foreign key to the users table). Each table has the following data

users 
id | name 
1  | jp 
2  | cosmo

posts 
id | name         | content      | user_id 
18 | jp's post    | hello there! | 1 
22 | cosmo's post | hi there!    | 2

Here are their definitions in Eloquent

class User extends Eloquent {
  protected $fillable = ['name'];
  public function posts()
  {
    return $this->hasMany('Post');
  }
}

class Post extends Eloquent {
  protected $fillable = ['name', 'content', 'user_id'];
}

With this definition, we can retrieve the first user easily, and the results are what we’d expect

$user = User::first();
print $user->id;   //prints 1
print $user->name; //prints "jp"

Where it gets tricky

Retrieving a User object directly is nice, but many times when dealing with the database you’ll be issuing joins to retrieve your data. A join is fairly simple in Eloquent

$user = 
  User
    ::join('posts', 'posts.user_id', '=', 'users.id')
    ->where('posts.name', 'LIKE', '%jp%')
    ->first(); 

The join takes a table name, and then the pieces of the join clause. In this case we’re joining on the relationship between the posts.user_id and the users.id. A little bit clunky, but usable.

What isn’t usable, are the results.

print $user->id;   //prints 18
print $user->name; //prints "jp's post"

The object looks the same, but we’re getting what seems to be invalid results. When we ran it before, we printed an id of 1 and a name of “jp” – now we’re seeing an id of 18, which isn’t even available in the users table at all. And the name “jp’s post” is the from the posts table. What’s going wrong?

Under the hood

By default when Eloquent issues a query, it selects all columns

User::all(); //SELECT * FROM users

When you perform a join, it continues issuing the same select call, additionally adding the join and where clauses. So our join from earlier results in the following query

SELECT * 
FROM users 
INNER JOIN posts ON users.id = posts.user_id 
WHERE posts.name LIKE '%jp%' 

The result is a row with two id and name columns. Raw sample output from a database tool would look something like this

id | name | id | name      | content      | user_id
1  | jp   | 18 | jp's post | hello there! | 1

This means that when Eloquent parses the results into the User object, it’s contending with two versions of the same column name. Most likely it’s using a hash internally, and the second key overrides the first. The result is a User object with the id and name of the related Post.

Fixing the issue

The solution turns out to be very simple. You have to be specific with your select clause.


$user = 
  User
    ::join('posts', 'posts.user_id', '=', 'users.id')
    ->where('posts.name', 'LIKE', '%jp%')
    ->select('users.*')
    ->first(); 

Now our select is specifically selecting columns from the users table only, and our results are consistent with our expectations again

print $user->id;   //prints 1
print $user->name; //prints "jp"

How Rails does it

One reason this tripped me up in the first place, is coming to Laravel from Ruby on Rails. In Ruby on Rails, this problem is taken care of for you.

User.joins(:posts).where(name: '%jp%').first
SELECT users.* 
FROM users 
INNER JOIN posts ON users.id = posts.user_id
WHERE name LIKE '%jp%'

I assumed a similar behavior in Laravel without ever checking the results. Personally I think the Rails behavior is the more sensible default, since it keeps your class mapping directly to your table (unless you explicitly tell it otherwise using #select).

Why not use #with?

For people used to using eloquent, it may be suggested to instead use the #with method. Using #with does technically fix the select issue, but the use case it solves and queries it generates aren’t the same.

#with allows you to work around the N + 1 problem and also can give some form of filtering. To achieve a similar affect to the join clause using #with

User::with(['posts' => function ($query)
{
  $query->where('name', 'LIKE', '%jp%');
}])->first();

Using #with you might expect a similar result to the join, but the queries generated work differently.

SELECT * FROM users
SELECT * FROM posts WHERE posts.user_id IN (1) AND name LIKE '%jp%'

#with generates two queries, so if you’re trying to perform an operation on the data with one call, it’s not the way to go. It first fully retrieves the users table information, then uses that to construct a query on the posts table. It’s less efficient and may not always be accurate.

Also – it’s up for debate, but I find that adding a where clause to a #with method is more complicated to read.

Conclusion

The way joins work in Eloquent has bitten me a few times, and it teaches you to be careful about making assumptions when moving around between frameworks and abstraction layers. Laravel feels right at home when you’re used to using other systems like Rails, but truly understanding the framework you’re using and implications its tools have is important to making sure mistakes like this don’t slip through.