FT.AGGREGATE ... SORTBY ... GROUPBY ... TOLIST doesn't preserve order or allow me to truncate the list

I have documents which represent live events which happen at different times and are tagged with categories (e.g. nightlife, adventure). My goal is to find events happening, group them by category, and show the next upcoming (up to) 10 events for each category.

This is a pared down version of the real problem, but here’s some demo data to demonstrate

> JSON.SET ev:1 $ '{"eventId":"1", "startTime":1692115200, "category":"nightlife"}'
"OK"

> JSON.SET ev:2 $ '{"eventId":"2", "startTime":1690913704, "category":"nightlife"}'
"OK"

> JSON.SET ev:3 $ '{"eventId":"3", "startTime":1690992000, "category":"adventure"}'
"OK"

> FT.CREATE eventIndex ON JSON PREFIX 1 "ev:" SCHEMA "$.category" as category TAG SORTABLE "$.startTime" as startTime NUMERIC SORTABLE
"OK"

> FT.AGGREGATE eventIndex "*" LOAD 3 "$" AS event SORTBY 2 @startTime ASC GROUPBY 1 @category REDUCE TOLIST 1 @event as events
1) "2"
2) 1) "category"
   2) "nightlife"
   3) "events"
   4) 1) "{\"eventId\":\"1\",\"startTime\":1692115200,\"category\":\"nightlife\"}"
      2) "{\"eventId\":\"2\",\"startTime\":1690913704,\"category\":\"nightlife\"}"
3) 1) "category"
   2) "adventure"
   3) "events"
   4) 1) "{\"eventId\":\"3\",\"startTime\":1690992000,\"category\":\"adventure\"}"

A couple of questions / problems that I could use your help with:

(1) The sort order does not appear to persist in the results of TOLIST. You can see that the timestamp on the first result under “nightlife” is greater than the second, even though I asked it to sort by @startTime ASC earlier in the pipeline. Is there any way to sort the results in TOLIST?

(2) I would like to limit the number of results under each category. This doesn’t appear to be possible from what I can tell. By providing MAX after SORTBY I can control the number of total events that will appear in the results, but that’s not what I want. Is there a way to do this? It seems to me that I want the FIRST_VALUE reducer, except with more than 1.

(3) Also, is there a better way to get the document (or select fields from it) than what I’m doing with LOAD 3 "$" AS event? Multiple TOLIST reducers seemed risky since it would produce multiple lists which are potentially not in the same order, so I just grabbed the whole doc to be sure.

Thanks for any help you can offer

The other solution that occurs to me is to do N+1 queries, with FT.AGGREGATE up front to determine the set of categories (tags) for which to search, and then N FT.SEARCH commands, one for each tag, with a sort & limit.

This strategy is usually considered to be an anti-pattern, but in my use case N is usually going to be less than 50, so maybe that’s better than the FT.AGGREGATE with thousands of documents returned in each group which then need to be sorted and truncated server-side. Assuming that there isn’t support for what I’m trying to do in FT.AGGREGATE directly, is this a better way to go about it?

Thanks again

Noodled on this quite a bit and came to most of the same conclusions that you did. I do think a FIRST_VALUES reducer would be nice. I wonder if the SORYBY would be preserved if sorted by @category and then @startTime?

As far as calling LOAD, you could LOAD just the fields you need as defined in FT.CREATE. Like this:

FT.AGGREGATE eventIndex "*" LOAD 2 category startTime  SORTBY 2 @startTime ASC GROUPBY 1 @category REDUCE TOLIST 1 @event as events

I think that would return all the fields you care about and avoid parsing some JSON.