Selecting carefully with Laravel joins

Laravel comes packaged with an ORM called Eloquent, which is one of the better PHP tools for dealing with your database layer using objects. It provides all of the features you’d expect from a modern relational mapper, and cleanly uses some PHP magic to make the process pleasant.

But it has its quirks, and today i’m setting my sights on a not-uncommon operation – the join.

The tables

Let’s say we have two tables: users and posts. users has an id and a name. posts has an id, name, content and user_id (a foreign key to the users table). Each table has the following data

users 
id | name 
1  | jp 
2  | cosmo

posts 
id | name         | content      | user_id 
18 | jp's post    | hello there! | 1 
22 | cosmo's post | hi there!    | 2

Here are their definitions in Eloquent

class User extends Eloquent {
  protected $fillable = ['name'];
  public function posts()
  {
    return $this->hasMany('Post');
  }
}

class Post extends Eloquent {
  protected $fillable = ['name', 'content', 'user_id'];
}

With this definition, we can retrieve the first user easily, and the results are what we’d expect

$user = User::first();
print $user->id;   //prints 1
print $user->name; //prints "jp"

Where it gets tricky

Retrieving a User object directly is nice, but many times when dealing with the database you’ll be issuing joins to retrieve your data. A join is fairly simple in Eloquent

$user = 
  User
    ::join('posts', 'posts.user_id', '=', 'users.id')
    ->where('posts.name', 'LIKE', '%jp%')
    ->first(); 

The join takes a table name, and then the pieces of the join clause. In this case we’re joining on the relationship between the posts.user_id and the users.id. A little bit clunky, but usable.

What isn’t usable, are the results.

print $user->id;   //prints 18
print $user->name; //prints "jp's post"

The object looks the same, but we’re getting what seems to be invalid results. When we ran it before, we printed an id of 1 and a name of “jp” – now we’re seeing an id of 18, which isn’t even available in the users table at all. And the name “jp’s post” is the from the posts table. What’s going wrong?

Under the hood

By default when Eloquent issues a query, it selects all columns

User::all(); //SELECT * FROM users

When you perform a join, it continues issuing the same select call, additionally adding the join and where clauses. So our join from earlier results in the following query

SELECT * 
FROM users 
INNER JOIN posts ON users.id = posts.user_id 
WHERE posts.name LIKE '%jp%' 

The result is a row with two id and name columns. Raw sample output from a database tool would look something like this

id | name | id | name      | content      | user_id
1  | jp   | 18 | jp's post | hello there! | 1

This means that when Eloquent parses the results into the User object, it’s contending with two versions of the same column name. Most likely it’s using a hash internally, and the second key overrides the first. The result is a User object with the id and name of the related Post.

Fixing the issue

The solution turns out to be very simple. You have to be specific with your select clause.


$user = 
  User
    ::join('posts', 'posts.user_id', '=', 'users.id')
    ->where('posts.name', 'LIKE', '%jp%')
    ->select('users.*')
    ->first(); 

Now our select is specifically selecting columns from the users table only, and our results are consistent with our expectations again

print $user->id;   //prints 1
print $user->name; //prints "jp"

How Rails does it

One reason this tripped me up in the first place, is coming to Laravel from Ruby on Rails. In Ruby on Rails, this problem is taken care of for you.

User.joins(:posts).where(name: '%jp%').first
SELECT users.* 
FROM users 
INNER JOIN posts ON users.id = posts.user_id
WHERE name LIKE '%jp%'

I assumed a similar behavior in Laravel without ever checking the results. Personally I think the Rails behavior is the more sensible default, since it keeps your class mapping directly to your table (unless you explicitly tell it otherwise using #select).

Why not use #with?

For people used to using eloquent, it may be suggested to instead use the #with method. Using #with does technically fix the select issue, but the use case it solves and queries it generates aren’t the same.

#with allows you to work around the N + 1 problem and also can give some form of filtering. To achieve a similar affect to the join clause using #with

User::with(['posts' => function ($query)
{
  $query->where('name', 'LIKE', '%jp%');
}])->first();

Using #with you might expect a similar result to the join, but the queries generated work differently.

SELECT * FROM users
SELECT * FROM posts WHERE posts.user_id IN (1) AND name LIKE '%jp%'

#with generates two queries, so if you’re trying to perform an operation on the data with one call, it’s not the way to go. It first fully retrieves the users table information, then uses that to construct a query on the posts table. It’s less efficient and may not always be accurate.

Also – it’s up for debate, but I find that adding a where clause to a #with method is more complicated to read.

Conclusion

The way joins work in Eloquent has bitten me a few times, and it teaches you to be careful about making assumptions when moving around between frameworks and abstraction layers. Laravel feels right at home when you’re used to using other systems like Rails, but truly understanding the framework you’re using and implications its tools have is important to making sure mistakes like this don’t slip through.

  • sagalbot

    Very helpful, thanks!

    • You’re welcome! Glad it was helpful for you

  • Kigen Enock

    Quite informative. Thanks Camara

  • Nana

    Very good and works better than User::with

  • Midix

    I find that in general Eloquent is not good for picking partial data and doing custom joins. If you mix leftJoin and with(), then really bad stuff happens (multiple queries, field collisions etc.).

    So, I stick to the rule of thumb – let Eloquent deal with ActiveRecord functionality (that is – processing a single record with its entire hierarchy, possibly using with() for eager loading) and go to the good old Laravel DB::anyjoinyouwant() for cases when you want to fetch multiple records, filter, pick columns etc.

    If you keep in mind that Eloquent is in essence ActiveRecord pattern, then it becomes clear that picking columns and doing custom joins is somewhat contrary to the ActiveRecord idea. When you use ActiveRecord, you expect to work with entire record, the way you have specified it, and working with just some columns is a bit dangerous and can lead to surprises. Maybe that’s one of the reasons why Laravel does not bother to automatically create column projections and deduce model hierarchy from results of leftJoin (in contrary to some other frameworks). Eloquent is meant to be as a somewhat lightweight implementation. I am even surprised that they decided to implement join capabilities on Eloquent because it is asking for troubles with incorrect ActiveRecord pattern usage.

    • Hey Midix – thanks for the thoughtful response. What you’re describing makes sense, and there probably are more stable approaches for handling this kind of scenario than what i’ve outlined here.
      I think it’s valuable to understand the circumstances around joining directly on an Eloquent model though, since it is a feature with some unexpected behavior and one that someone coming from a background in popular MVC frameworks would expect to work a certain way (and one I hadn’t seen anyone write about). But if you can, avoiding it and using more explicit constructs (like straight DB calls) might be the better answer!

      Thanks!

  • tooleks

    Thanks for your article. I have met this issue several times too.
    Have you created an issue on the Laravel framework GitHub page?