RedisGraph "Recommended Products" query has a huge execution time

I’m evaluating a migration from ArangoDB to RedisGraph, and so far I’ve rewritten our “Recommended Products” query as shown below. However, it performs much worse than the ArangoDB version.

Below are the details and the query I use to get the recommended products to show to an online shopper. The recommendations are based on what other users who purchased the same product also purchased, excluding products the current user has already purchased, and sorted by the products that occurred the most.
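To make the intended logic concrete, here is a minimal Python sketch of the same computation over toy in-memory data. The function name `recommend`, the `purchases` dictionary, and the sample IDs are hypothetical and only for illustration; this is not how either database executes the query.

```python
from collections import Counter

def recommend(purchases, seed_product, current_user, min_count=10, limit=5):
    """Rank products co-purchased with seed_product.

    purchases: dict mapping a user id to the set of product ids they purchased.
    min_count=10 mirrors the "count > 9" filter in the query.
    """
    already_bought = purchases.get(current_user, set())
    counts = Counter()
    for user, products in purchases.items():
        if seed_product not in products:
            continue  # only users who purchased the seed product
        for product in products:
            # skip the seed itself and anything the current user already owns
            if product != seed_product and product not in already_bought:
                counts[product] += 1
    ranked = [(p, c) for p, c in counts.most_common() if c >= min_count]
    return ranked[:limit]

# Toy data (assumed): user 321 is the current shopper, 123 is the seed product.
toy_purchases = {
    1: {123, 7, 8},
    2: {123, 7},
    3: {123, 8, 9},
    321: {123, 8},
}
toy_result = recommend(toy_purchases, seed_product=123, current_user=321, min_count=2)
```

With the threshold lowered to 2 for the toy data, only product 7 qualifies, since product 8 is excluded as already purchased by user 321.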

Schema:

  • Nodes:
  1. user (indexed on ID)
  2. product (indexed on ID)
  • Relations:
  1. purchased
  2. viewed

Query:

MATCH (:product {id: 123})<-[:purchased]-(:user)-[r:purchased]->(p:product)
WHERE NOT (:user {id: 321})-[:purchased]->(p)
WITH p, COUNT(r) as count
WHERE count > 9
RETURN p.id as id, count
ORDER BY count DESC
LIMIT 5

Execution time: 6177.225 ms

As you can see, this took over 6 seconds! Compare that with what I have on ArangoDB:

Schema:

  • Vertices:
    1. users
    2. products
  • Edges:
    1. purchases (._from = ‘users/[id]’ , ._to = ‘products/[id]’)
    2. views (._from = ‘users/[id]’ , ._to = ‘products/[id]’)

Query:

LET userPurchases = (FOR purchase IN OUTBOUND 'users/321' purchases RETURN purchase._id)
FOR product,purchase IN 2..2 ANY 'products/123' purchases
FILTER purchase._to NOT IN userPurchases
FILTER purchase._from != 'user/321'
COLLECT id = product._key WITH COUNT INTO count
FILTER count > 9
FILTER id != null
SORT count DESC
LIMIT 5
RETURN DISTINCT {id:id,count:count}

Execution time: 33.256 ms

Similarly, I’ve tried using this query to get recommended products based on other users’ views, while filtering out the current user’s views and purchases:

MATCH (:product {id: 4483906})<-[:viewed]-(:user)-[r:viewed]->(p:product)
WHERE NOT (:user {id: 65738229})-[:viewed]->(p)
AND NOT (:user {id: 65738229})-[:purchased]->(p)
WITH p, COUNT(r) as count
WHERE count > 9
RETURN p.id as id, count
ORDER BY count DESC
LIMIT 5

This one failed to execute because it took longer than 30 seconds.

Is there a way I can enhance the performance of these queries?

Thanks,

Try running:

MATCH (:product {id: 123})<-[:purchased]-(:user)-[r:purchased]->(p:product), (u:user {id: 321})
WHERE NOT (u)-[:purchased]->(p)
WITH p, COUNT(r) as occurrence
WHERE occurrence > 9
RETURN p.id as product_id, occurrence ORDER BY occurrence DESC LIMIT 5

Also, which version of RedisGraph are you using?

Can you please share the output of running this query with the GRAPH.PROFILE command?

Hello, thanks for your kind reply.

I’m using RedisGraph v2.0.4.

I ran your suggested query and it improved performance a lot: execution time is now 798.497424 milliseconds. However, that’s still nowhere near ArangoDB’s 33 milliseconds.

Here is the result of running the profile command:

>> GRAPH.PROFILE "products" "MATCH (:product {id: 123})<-[:purchased]-(:user)-[r:purchased]->(p:product), (u:user {id: 321}) WHERE NOT (u)-[:purchased]->(p) WITH p, COUNT(r) as occurrence WHERE occurrence > 9 RETURN p.id as product_id, occurrence ORDER BY occurrence DESC LIMIT 5"

 1) "Results | Records produced: 5, Execution time: 0.002056 ms"
 2) "    Limit | Records produced: 5, Execution time: 0.001222 ms"
 3) "        Sort | Records produced: 5, Execution time: 0.031058 ms"
 4) "            Project | Records produced: 195, Execution time: 0.140768 ms"
 5) "                Filter | Records produced: 195, Execution time: 0.864014 ms"
 6) "                    Aggregate | Records produced: 5829, Execution time: 56.235779 ms"
 7) "                        Anti Semi Apply | Records produced: 17352, Execution time: 12.975857 ms"
 8) "                            Cartesian Product | Records produced: 19342, Execution time: 8.581590 ms"
 9) "                                Conditional Traverse | (anon_2:user)-[r:purchased]->(p:product) | Records produced: 19342, Execution time: 35.753414 ms"
10) "                                    Conditional Traverse | (anon_2:user)->(anon_2:user) | Records produced: 1899, Execution time: 2.035268 ms"
11) "                                        Index Scan | (anon_0:product) | Records produced: 1, Execution time: 0.141423 ms"
12) "                                Index Scan | (u:user) | Records produced: 1, Execution time: 0.092827 ms"
13) "                            Expand Into | (u:user)->(p:product) | Records produced: 1990, Execution time: 625.021633 ms"
14) "                                Argument | Records produced: 19342, Execution time: 5.059943 ms"

This version is very old. Please consider upgrading to a newer version.

From the profile, I see that the Expand Into operation is what makes the query slow (about 625 ms), because it had to run for 19342 records.
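As a rough aid for reading profiles like the one above, here is a small Python sketch that ranks operations by their reported execution time. The helper name `slowest_ops` and the regular expression are assumptions for illustration; the sample lines are copied from the profile output above.

```python
import re

# Matches one line of GRAPH.PROFILE output as printed by redis-cli, e.g.
#   13) "    Expand Into | (u:user)->(p:product) | Records produced: 1990, Execution time: 625.021633 ms"
PROFILE_LINE = re.compile(
    r'^\s*(?:\d+\)\s*)?"?\s*(?P<op>[^|"]+?)\s*\|.*'
    r'Records produced: (?P<records>\d+), '
    r'Execution time: (?P<ms>[\d.]+) ms'
)

def slowest_ops(profile_lines, top=3):
    """Return up to `top` tuples (operation, records, milliseconds), slowest first."""
    ops = []
    for line in profile_lines:
        match = PROFILE_LINE.search(line)
        if match:
            ops.append((match.group("op"),
                        int(match.group("records")),
                        float(match.group("ms"))))
    return sorted(ops, key=lambda op: op[2], reverse=True)[:top]

# A few lines taken from the profile above.
SAMPLE_PROFILE = [
    ' 6) "                    Aggregate | Records produced: 5829, Execution time: 56.235779 ms"',
    ' 7) "                        Anti Semi Apply | Records produced: 17352, Execution time: 12.975857 ms"',
    ' 9) "                                Conditional Traverse | (anon_2:user)-[r:purchased]->(p:product) | Records produced: 19342, Execution time: 35.753414 ms"',
    '13) "                            Expand Into | (u:user)->(p:product) | Records produced: 1990, Execution time: 625.021633 ms"',
]

ranked_ops = slowest_ops(SAMPLE_PROFILE)
```

On these sample lines, Expand Into comes out first, well ahead of Aggregate and Conditional Traverse.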

Could you run GRAPH.PROFILE on the following query?

MATCH (:product {id: 123})<-[:purchased]-(pu:user)
WITH DISTINCT pu
MATCH (pu)-[r:purchased]->(p:product), (u:user {id: 321})
WHERE NOT (u)-[:purchased]->(p)
WITH p, COUNT(r) as occurrence
WHERE occurrence > 9
RETURN p.id as product_id, occurrence
ORDER BY occurrence DESC
LIMIT 5

This is only a first step, but I hope it will help me understand your data better so that we can optimize further.

Hello, thanks for the recommendation to upgrade. I will look into it.

For the recent query you provided, I ran it but it didn’t yield the same results as the previous 2 queries.

But here is the PROFILE result anyway:

>> GRAPH.PROFILE "products" "MATCH (:product {id: 123})<-[:purchased]-(pu:user) WITH DISTINCT pu MATCH (pu)-[r:purchased]->(p:product), (u:user {id: 321}) WHERE NOT (u)-[:purchased]->(p) WITH p, COUNT(r) as occurrence WHERE occurrence > 9 RETURN p.id as job_id, occurrence ORDER BY occurrence DESC LIMIT 5"

 1) "Results | Records produced: 1, Execution time: 0.001077 ms"
 2) "    Limit | Records produced: 1, Execution time: 0.000805 ms"
 3) "        Sort | Records produced: 1, Execution time: 0.002429 ms"
 4) "            Project | Records produced: 1, Execution time: 0.003602 ms"
 5) "                Filter | Records produced: 1, Execution time: 0.003129 ms"
 6) "                    Aggregate | Records produced: 1, Execution time: 1.011681 ms"
 7) "                        Anti Semi Apply | Records produced: 1899, Execution time: 1.440182 ms"
 8) "                            Cartesian Product | Records produced: 1899, Execution time: 0.787042 ms"
 9) "                                Distinct | Records produced: 1899, Execution time: 4.205953 ms"
10) "                                    Project | Records produced: 3798, Execution time: 1.378827 ms"
11) "                                        Conditional Traverse | (pu:user)->(pu:user) | Records produced: 3798, Execution time: 3.788199 ms"
12) "                                            Index Scan | (anon_0:product) | Records produced: 2, Execution time: 0.139739 ms"
13) "                                Conditional Traverse | (pu)-[r:purchased]->(p:product) | Records produced: 2, Execution time: 0.136204 ms"
14) "                                    All Node Scan | (pu) | Records produced: 16, Execution time: 0.008360 ms"
15) "                                Index Scan | (u:user) | Records produced: 1, Execution time: 0.088830 ms"
16) "                            Expand Into | (u:user)->(p:product) | Records produced: 0, Execution time: 72.874393 ms"
17) "                                Argument | Records produced: 1899, Execution time: 0.609279 ms"

The results differ because they are now more accurate: previously, the count included duplicates. You can see this in the Distinct operation, which reduces the number of user records from 3798 to 1899.

The Expand Into operation’s time has also dropped, from 625 ms to 72 ms.
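To illustrate how duplicated users inflate the counts, here is a minimal Python sketch with toy data. The user and product names are hypothetical, and the duplication is simulated by listing the same user twice, as would happen if that user were reached through two matching paths.

```python
from collections import Counter

# Hypothetical toy data: products purchased by each user.
purchases = {
    "u1": ["p2", "p3"],
    "u2": ["p2"],
}

# Users reached from the seed product; "u1" appears twice (a duplicate).
reached_users = ["u1", "u1", "u2"]

def count_products(users):
    """Count how often each product is reached through the given users."""
    counts = Counter()
    for user in users:
        counts.update(purchases[user])
    return counts

# Without deduplication, u1's purchases are counted twice.
with_duplicates = count_products(reached_users)

# dict.fromkeys keeps one entry per user, similar in spirit to WITH DISTINCT.
deduplicated = count_products(dict.fromkeys(reached_users))
```

In this toy case the count for `p2` drops from 3 to 2 once each user is considered only once, which is the same kind of change the DISTINCT step introduces.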

How long does this query take when run without GRAPH.PROFILE?