A few months back I was working on a JavaScript frontend for a large dataset using TanStack Table. The relevant constraints were:
- Up to 50k rows of content
- Grouped by up to 3 columns
Using React and virtualized rendering, showing 50k rows performed well. But when the TanStack Table grouping feature was enabled, I saw slowdowns with a few thousand rows and huge slowdowns at 50k rows.
It might have gone unnoticed if it had been 100ms slower, or even 500ms slower. But in the worst case, renders went from less than a second without grouping to 30-40 seconds with it.
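For context, grouping with TanStack Table v8 in React is wired up through the row model options passed to the useReactTable hook. Here's a rough sketch - not the actual app code; the Person type, the columns, and the component are placeholders:
import { useState } from 'react'
import {
  useReactTable,
  getCoreRowModel,
  getGroupedRowModel,
  getExpandedRowModel,
  type ColumnDef,
  type GroupingState,
} from '@tanstack/react-table'

type Person = { city: string; name: string }

const columns: ColumnDef<Person>[] = [
  { accessorKey: 'city', header: 'City' },
  { accessorKey: 'name', header: 'Name' },
]

function PeopleTable({ data }: { data: Person[] }) {
  const [grouping, setGrouping] = useState<GroupingState>(['city'])

  // All of the table logic - including grouping - runs inside this one hook
  const table = useReactTable({
    data,
    columns,
    state: { grouping },
    onGroupingChange: setGrouping,
    getCoreRowModel: getCoreRowModel(),
    getGroupedRowModel: getGroupedRowModel(),
    getExpandedRowModel: getExpandedRowModel(),
  })

  // ...render table.getRowModel().rows with a virtualizer
  return null
}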
Tracking down the issue
Initially I tried using the Chrome JavaScript profiler, but it can be tough to use when performance is this bad. The profiler adds noticeable overhead to your code, and since the code already took 30-40 seconds, it was basically unusable[1].
Unable to use the profiler, I reached for an old, simple standby: console.time[2]. This is a convenient way to see how long a section of code takes, logged to your console:
console.time('expensive code');
thisIsExpensive();
console.timeEnd('expensive code');
// console.time
// expensive code: 1000ms
A side note about optimizing code: as programmers we are full of ideas about what needs to be optimized and are terrible at getting it right.
We make educated guesses about what's important to optimize and where the problem is, and until we measure we're usually wrong. I now try to wrap as much as possible in a benchmark to make sure I'm even looking in the right place, then narrow the measurement from there.
When tracking down a performance issue, the general outline looks something like this:
console.time('everything');
elements.forEach(() => {
  console.time('methodCall');
  methodCall(() => {
    console.time('build');
    build();
    console.timeEnd('build');
  });
  console.timeEnd('methodCall');
});
console.timeEnd('everything');
// build 49ms
// methodCall 50ms
// build 51ms
// methodCall 52ms
// everything 102ms
Back to the table rendering
As usual, before measuring, all of my guesses about the performance problem were wrong.
- My own code was fine, so this was a case where it was actually a library bug, which was a surprise. There was no code path of mine that was performing poorly - all of the time was spent in the library. When using TanStack Table in React, all of the logic happens in a pretty centralized place - the useReactTable hook - so it was easy to see the time was all there:
console.time('everything');
customCode();
console.time('useReactTable');
useReactTable(...);
console.timeEnd('useReactTable');
console.timeEnd('everything');
// useReactTable 31500 ms
// everything 31537 ms
- One of the nicest things about developing JavaScript with packages is that at any time I can open up the node_modules folder and play around with third-party code. In this case I was able to modify the TanStack Table source code directly to add some timing information.
- Turning on grouping was when everything slowed down, so it made the most sense to start timing that code.
This is an abbreviated version of the grouped row source code, with my first pass at timing what I thought were the likeliest culprits. Pay attention primarily to the console.time statements - you don't have to understand everything going on.
function getGroupedRowModel<TData extends RowData>() {
  console.time('everything');
  //...
  console.time('grouping filter')
  const existingGrouping = grouping.filter(columnId =>
    table.getColumn(columnId)
  )
  console.timeEnd('grouping filter')

  const groupUpRecursively = (
    rows: Row<TData>[],
    depth = 0,
    parentId?: string
  ) => {
    if (depth >= existingGrouping.length) {
      return rows.map(row => {
        row.depth = depth
        //...
        if (row.subRows) {
          console.time('subRows')
          row.subRows = groupUpRecursively(
            row.subRows,
            depth + 1
          )
          console.timeEnd('subRows')
        }
        return row
      })
    }

    const columnId: string = existingGrouping[depth]!
    const rowGroupsMap = groupBy(
      rows,
      columnId
    )

    const aggregatedGroupedRows = Array.from(rowGroupsMap.entries()).map(
      ([groupingValue, groupedRows], index) => {
        let id = `${columnId}:${groupingValue}`
        id = parentId ? `${parentId}>${id}` : id
        console.time(
          'aggregatedGroupedRows groupUpRecursively'
        )
        const subRows = groupUpRecursively(
          groupedRows,
          depth + 1,
          id
        )
        console.timeEnd(
          'aggregatedGroupedRows groupUpRecursively'
        )
        //...
      }
    )

    //...
  }

  //...
  console.timeEnd('everything');
}
My hunch was that the groupUpRecursively function was to blame - it made logical sense that tens of thousands of recursive calls could cause a slowdown (spoiler: as usual, I was wrong 😑):
The first pass was a bust - it logged thousands of subRows timers. Every iteration was fast and there were too many of them to be useful, so I cut that timer.
console.time
subRows: 0 ms
at map (packages/table-core/src/utils/getGroupedRowModel.ts:48:25)
at Array.map (<anonymous>)
at Array.map (<anonymous>)
at Array.map (<anonymous>)
at Array.map (<anonymous>)
console.time
subRows: 0 ms
at map (packages/table-core/src/utils/getGroupedRowModel.ts:48:25)
at Array.map (<anonymous>)
at Array.map (<anonymous>)
at Array.map (<anonymous>)
at Array.map (<anonymous>)
Removing that, I started to get closer. I was accounting for most of the time, but I had two problems: I wasn't accounting for all of it (everything was 33 seconds while groupUpRecursively was only 23 seconds), and the chunk of time I was logging was too large to usefully identify the problem code:
console.time
grouping filter: 1 ms
at fn (packages/table-core/src/utils/getGroupedRowModel.ts:22:17)
console.time
aggregatedGroupedRows groupUpRecursively: 23248 ms
at map (packages/table-core/src/utils/getGroupedRowModel.ts:71:23)
at Array.map (<anonymous>)
at Array.map (<anonymous>)
at Array.map (<anonymous>)
console.time
everything: 33509 ms
I realized I had missed a function call - a little unassuming function called groupBy - so I added a console.time block there next:
console.time('groupBy')
const rowGroupsMap = groupBy(rows, columnId)
console.timeEnd('groupBy')
Got it! Almost the entirety of the 31 seconds was concentrated into 3 calls to groupBy.
console.time
grouping filter: 2 ms
at fn (packages/table-core/src/utils/getGroupedRowModel.ts:22:17)
console.time
groupBy: 10279 ms
at groupUpRecursively (packages/table-core/src/utils/getGroupedRowModel.ts:59:19)
console.time
groupBy: 10868 ms
at groupUpRecursively (packages/table-core/src/utils/getGroupedRowModel.ts:59:19)
at Array.map (<anonymous>)
console.time
groupBy: 10244 ms
at groupUpRecursively (packages/table-core/src/utils/getGroupedRowModel.ts:59:19)
at Array.map (<anonymous>)
at Array.map (<anonymous>)
console.time
aggregatedGroupedRows groupUpRecursively: 21159 ms
at map (packages/table-core/src/utils/getGroupedRowModel.ts:71:23)
at Array.map (<anonymous>)
at Array.map (<anonymous>)
console.time
everything: 31537 ms
For each grouped column it was calling groupBy, and each call took roughly 10 seconds.
So what the heck was going on in the groupBy function that was causing such a massive slowdown? Can you pick it out?
function groupBy<TData extends RowData>(rows: Row<TData>[], columnId: string) {
  const groupMap = new Map<any, Row<TData>[]>()
  return rows.reduce((map, row) => {
    const resKey = `${row.getValue(columnId)}`
    const previous = map.get(resKey)
    if (!previous) {
      map.set(resKey, [row])
    } else {
      map.set(resKey, [...previous, row])
    }
    return map
  }, groupMap)
}
I started chopping up the function. I switched the Map to an object literal in case that was causing some kind of memory overhead, and tried changing the reduce to a for loop in case the number of iterations plus closures was causing an issue[3].
Those had no effect so next I started commenting out lines of the function. When I commented out this line, all of a sudden everything started finishing instantly:
// map.set(resKey, [...previous, row])
What was the purpose of that line and why was it so slow?
const resKey = `${row.getValue(columnId)}`
const previous = map.get(resKey)
if (!previous) {
  map.set(resKey, [row])
} else {
  map.set(resKey, [...previous, row])
}
On each iteration of the reduce call, the code:
- Used the value of that column cell as a map key. Let's say the value is the string "New York"
- If there was no value associated with "New York", it set the current row, wrapped in an array, as the value
- If there was already a value, it used the JavaScript spread operator to concatenate the current row onto the end of the previous array value
- This means that on each iteration of reduce, the spread operator was creating a new, incrementally larger array, getting slower with each iteration[4]
// 1st iteration
[...previous, row] => [1, 2]
// 2nd iteration
[...previous, row] => [1, 2, 3]
// 3rd iteration
[...previous, row] => [1, 2, 3, 4]
// ...
// 50,000th iteration
[...previous, row] => [1, 2, 3, ..., 49999, 50000]
What does spread do behind the scenes?
In our case the spread operator takes an iterable object and loops over it to create a new array. It's probably more nuanced than this, but the spread operator is likely O(n): as the array grows, so does the time it takes to spread its values into a new array.
I'm certain there are additional efficiencies in the language internals, but the following code is essentially equivalent[5]:
const a = [1, 2, 3, 4, 5, 6, 7];

// spread
const b = [...a, 8];

// manual loop
let c = [];
for (let i = 0; i < a.length; i++) {
  c.push(a[i]);
}
c.push(8);
Which means that the original version of that groupBy code was equivalent to this:
// original code
map.set(resKey, [...previous, row])

// manual spread
for (let i = 0; i < rows.length; i++) {
  const tempPrevious = [];
  for (let j = 0; j < previous.length; j++) {
    tempPrevious.push(previous[j]);
  }
  tempPrevious.push(rows[i]);
  previous = tempPrevious;
  map.set(resKey, tempPrevious);
}
Seeing the manual spread, one thing sticks out immediately: we have a nested loop.
A nested loop is a good shorthand for identifying one of the slower forms of algorithmic complexity: a “Big O” complexity of O(n^2)[6]. This means that for an array size of 50,000, the number of iterations required would be 50,000^2, or 2.5 billion iterations 😱.
In our specific case the array started out empty and grew to a size of 50,000, so it wasn't quite as bad. Technically it was still a complexity of O(n^2), but practically it was O(n^2/2)[7]. Either way it still meant:
- 1,249,975,000, or roughly 1.25 billion, iterations (see the quick arithmetic after this list)
- 49,999 discarded arrays allocated for 1,249,925,001 entries
- Times three groupBy calls, that means 3,749,925,000 iterations, 149,997 discarded arrays, and 3,749,775,003 allocated entries
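Where does that 1,249,975,000 figure come from? In the worst case every row lands in the same group, so the k-th spread copies roughly k existing entries and the total is a triangular sum. A quick back-of-the-envelope check (the snippet itself is just illustrative):
// Worst case: one group, so the k-th iteration's spread copies ~k existing entries.
// Total entries copied = 1 + 2 + ... + 49,999 = 49,999 * 50,000 / 2
const totalCopies = (49_999 * 50_000) / 2
console.log(totalCopies) // 1249975000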
Pour one out for the garbage collector and the JavaScript runtime 🙅🏻♂️☠️🙅🏻♂️. Honestly, for how many iterations and how much garbage we accumulated, it was operating pretty well 🫠.
So the question was - what could be done to improve it?
Spread is an immutable pattern, so should the code be kept immutable?
Aside from providing a convenient syntax for combining iterables, the spread operator also lets us keep code immutable. This can be beneficial because avoiding mutations means we never change data out from under other code that is using it.
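As a hypothetical illustration (not the table code - just the hazard immutability guards against): a mutation is visible to any other code holding a reference to the same array, while an immutable update leaves the original untouched.
const original = [1, 2, 3]
const alias = original             // some other code holds the same reference

original.push(4)                   // mutation: alias also "sees" [1, 2, 3, 4] now

const updated = [...original, 5]   // immutable update: new array; original and alias are unchanged
console.log(alias)                 // [1, 2, 3, 4]
console.log(updated)               // [1, 2, 3, 4, 5]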
If this code needed to stay immutable, there were some other options we could try for improving performance that behave the same as the spread operator:
// Array.from()
// ~10 seconds, just as slow 😒
const arr = Array.from(previous)
arr.push(row)
map.set(resKey, arr)

// slice()
// ~10 seconds, just as slow 🙃
const arr = previous.slice()
arr.push(row)
map.set(resKey, arr)

// concat(row)
// ~10 seconds, just as slow 🧐
map.set(resKey, previous.concat(row))

// hand written for loop
// ~14 seconds, slower than spread! 😱
let arr = []
for (let i = 0; i < previous.length; i++) {
  arr.push(previous[i])
}
arr.push(row)
map.set(resKey, arr)

// concat([row])
// ~4 seconds 🏃♂️
map.set(resKey, previous.concat([row]))

// using the `immer` node package
// ~100ms 🚀
return produce(groupMap, (draft) => {
  return rows.reduce((map, row) => {
    const resKey = `fixedKey`
    let previous = map.get(resKey)
    if (!previous) {
      map.set(resKey, [row])
    } else {
      previous.push(row)
    }
    return map
  }, draft)
})
Array.from, slice, and concat all had the same performance characteristics as the spread operator. There were a few surprises, however:
- A manual for loop was the slowest option, coming in at around 14 seconds. The native implementations of these methods are clearly much more optimized than the manual loop we can write
- concat, when handed an array as the argument instead of a plain object, was consistently faster than spread on any v8 platform (Node/Chrome). It took half the time! Either there was a hard-to-find mistake in my code or the runtime has a special optimization for this use case. Still too slow, however
- Using immer performed significantly faster than all other options. immer provides regular JavaScript interfaces on immutable data, using proxies and structural sharing to make efficient changes to data structures without changing the original source object. In this case, though, the main benefit of the immer approach was that we were not copying any arrays at all - just pushing directly to them, because they were created inline in our function
But did we even need immutability at all? Was the immer approach good enough? Could we get even faster?
Using push directly
For performance, you can’t beat mutations.
Mutations modify data in place, without copying and creating duplicated data. Looking over our groupBy function, there wasn't any benefit to using immutable operations. Everything was created inside the function, so we could safely mutate it before returning it. I don't think the original code was written with immutability in mind - using the spread syntax was just convenient.
When using array push, each insert is effectively O(1) (amortized constant time). That means that as our array grows, the cost of inserting new elements does not. It's as fast for 1 item as it is for 50,000.
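To make the difference concrete, here's a rough micro-benchmark sketch in the same console.time style (hypothetical code, not from the library; absolute numbers will vary by machine and runtime, and the spread version may take several seconds):
const N = 50_000

console.time('spread accumulate')
let withSpread = []
for (let i = 0; i < N; i++) {
  withSpread = [...withSpread, i] // copies the whole array every iteration: O(n^2) overall
}
console.timeEnd('spread accumulate')

console.time('push accumulate')
const withPush = []
for (let i = 0; i < N; i++) {
  withPush.push(i) // appends in place: amortized O(1) per insert
}
console.timeEnd('push accumulate')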
Here’s the final version of the fix, now 1000x faster for my use case:
function groupBy<TData extends RowData>(rows: Row<TData>[], columnId: string) {
  const groupMap = new Map<any, Row<TData>[]>()
  return rows.reduce((map, row) => {
    const resKey = `${row.getValue(columnId)}`
    const previous = map.get(resKey)
    if (!previous) {
      map.set(resKey, [row])
    } else {
      previous.push(row)
    }
    return map
  }, groupMap)
}
Can you spot the difference? It’s this one simple change:
previous.push(row)
previous.push(row) mutates the existing array and brings our time down from 10 seconds per groupBy call to around 10ms, 1000x faster than the original code 🚀.
The takeaway here is that while spread is an expressive and convenient language feature, it's important to see it for what it is at its core: a for loop. If you see a spread inside of a loop, you'll want to rewrite it if there is any chance it will operate on a large set of values.
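As a general pattern to watch for - hypothetical code, with items and transform standing in for whatever you're iterating over:
const items = [1, 2, 3]
const transform = (x) => x * 2

// Accumulating with spread inside reduce: each iteration copies the accumulator - O(n^2) overall
const slow = items.reduce((acc, item) => [...acc, transform(item)], [])

// The accumulator is locally owned, so pushing to it (or just using map) is safe and O(n)
const fast = items.map(transform)

console.log(slow) // [2, 4, 6]
console.log(fast) // [2, 4, 6]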
Source code
You can see the full PR here https://github.com/TanStack/table/pull/4495. Because it was such a small fix, the majority of the PR is just a spec to guard against a future performance regression.
And here’s the source code for each attempt, runnable on the Replit platform. If you run it there, the timings are much slower because it runs on a lower-powered CPU than my local machine, but as a relative measure the results are consistent with the numbers in this post.
1. I also find it can be hard to get a good sense of a profile when profiling React code, since certain internal React code paths are hit so often that the results are tough to decipher ↩︎
2. It’s non-standard, but in the future I need to remember to reach for console.profile as well. That could have been helpful here for profiling smaller blocks of code rather than deciphering the whole React render stack. Considering how slow the code was, it still may not have been usable anyway. You can learn more about it here: https://developer.mozilla.org/en-US/docs/Web/API/console/profile ↩︎
3. What I’ve found is that JavaScript engines have gotten so optimized that a forEach/map/reduce call can be as efficient as a hand-written for loop. They’re almost never the performance problem in JavaScript, least of all in a large set of iterations where the JIT can aggressively optimize. ↩︎
4. I should note that the performance was at its worst for poorly distributed groupings. By that I mean that if you had a grouped column with only one value, the grouping was concentrated into one key with all 50,000 values. If there were lots of unique values in the column, the arrays were smaller and built more quickly ↩︎
5. The v8 internals are, unsurprisingly, very complicated. Logic for spread seems to exist in a variety of places because it’s such a core feature, so I wasn’t able to track down a specific implementation ↩︎
6. If “Big O” complexity doesn’t mean anything to you, there are lots of great resources online for it, and it’s a great concept to be familiar with. The basic idea is that it helps you identify how costly a particular piece of code will be to run. For a deep dive, https://frontendmasters.com/courses/algorithms/ is a great free resource ↩︎
7. In terms of algorithmic complexity, O(n^2) and O(n^2/2) are considered equivalent, since you generally drop the constant (in this case, 1/2). But from a practical perspective, if you’re going to have a slow algorithm, you’d still rather have it be half as slow as its full potential slowness 😅 ↩︎