RedisGraph algo.SPpaths

I am working on a PoC to find the shortest path between two nodes in a graph of 6.3K nodes and 6.4K edges/relationships. That size is just for the PoC; we expect production graphs to have more than 20K nodes and 30K edges. The “stop” nodes have an index on the property stop_id. I know that at least one path exists between the two nodes I am querying.

I am running the latest RedisGraph edge Docker image, redislabs/redisgraph:edge.
When I execute this query on a cloud VM (32 logical cores, hence a thread count of 32), the initial response looks like this:

GRAPH.QUERY "nz_auc" "MATCH (a:stop),(g:stop) WHERE ID(a) = 4751 AND ID(g) = 5400 CALL algo.SPpaths( {sourceNode: a, targetNode: g, relTypes: ['path'], weightProp: 'pathCost', maxLen: 50} ) YIELD path, pathWeight RETURN pathWeight, [n in nodes(path) | n.stop_id] as pathNodes"

    1) 1) "pathWeight"
       2) "pathNodes"
    2) 1) 1) "30"
          2) "[7238, 1761, 7239, 1777, 2005, 2313, 1793, 7401, 8210, 8216, 8218, 8220, 8222, 8224, 8226, 8228, 8035, 8033, 8879, 8031, 8029, 8019, 8017, 8015, 8013, 8011, 8009, 8007, 8005, 8003, 8001]"
    3) 1) "Cached execution: 0"
       2) "Query internal execution time: 2991.219199 milliseconds"

When I re-run the same query, I expect the query internal execution time to be much lower because of cached results, but it takes almost the same time as the first run, or sometimes even longer.

GRAPH.QUERY "nz_auc" "MATCH (a:stop),(g:stop) WHERE ID(a) = 4751 AND ID(g) = 5400 CALL algo.SPpaths( {sourceNode: a, targetNode: g, relTypes: ['path'], weightProp: 'pathCost', maxLen: 50} ) YIELD path, pathWeight RETURN pathWeight, [n in nodes(path) | n.stop_id] as pathNodes"

    1) 1) "pathWeight"
       2) "pathNodes"
    2) 1) 1) "30"
          2) "[7238, 1761, 7239, 1777, 2005, 2313, 1793, 7401, 8210, 8216, 8218, 8220, 8222, 8224, 8226, 8228, 8035, 8033, 8879, 8031, 8029, 8019, 8017, 8015, 8013, 8011, 8009, 8007, 8005, 8003, 8001]"
    3) 1) "Cached execution: 1"
       2) "Query internal execution time: 2922.613366 milliseconds"

I have an index on the node property stop_id, and our requirement is an average response time of about 20 ms for a shortest-path query between two nodes.

When I run the same query in my local environment (THREAD_COUNT=8, same Docker image redislabs/redisgraph:edge) on the same graph, the response time is much lower:

GRAPH.QUERY "nz_auc" "MATCH (a:stop),(g:stop) WHERE ID(a) = 4751 AND ID(g) = 5400 CALL algo.SPpaths( {sourceNode: a, targetNode: g, relTypes:['path'], weightProp: 'pathCost', maxLen: 50} ) YIELD path, pathWeight RETURN pathWeight, [n in nodes(path) | n.stop_id] as pathNodes"

    1) 1) "pathWeight"
       2) "pathNodes"
    2) 1) 1) "44"
          2) "[7238, 7810, 7303, 1865, 8977, 1701, 1469, 1861, 7627, 1536, 7503, 7501, 7615, 8069, 8067, 8065, 8063, 8061, 8059, 1076, 8055, 8053, 8051, 8049, 8047, 7999, 1077, 1881, 8868, 1845, 8122, 8124, 8126, 8128, 8130, 8019, 8017, 8015, 8013, 8011, 8009, 8007, 8005, 8003, 8001]"
    3) 1) "Cached execution: 0"
       2) "Query internal execution time: 109.811500 milliseconds"

Re-running the query returns:
GRAPH.QUERY "nz_auc" "MATCH (a:stop),(g:stop) WHERE ID(a) = 4751 AND ID(g) = 5400 CALL algo.SPpaths( {sourceNode: a, targetNode: g, relTypes:['path'], weightProp: 'pathCost', maxLen: 50} ) YIELD path, pathWeight RETURN pathWeight, [n in nodes(path) | n.stop_id] as pathNodes"

    1) 1) "pathWeight"
       2) "pathNodes"
    2) 1) 1) "44"
          2) "[7238, 7810, 7303, 1865, 8977, 1701, 1469, 1861, 7627, 1536, 7503, 7501, 7615, 8069, 8067, 8065, 8063, 8061, 8059, 1076, 8055, 8053, 8051, 8049, 8047, 7999, 1077, 1881, 8868, 1845, 8122, 8124, 8126, 8128, 8130, 8019, 8017, 8015, 8013, 8011, 8009, 8007, 8005, 8003, 8001]"
    3) 1) "Cached execution: 1"
       2) "Query internal execution time: 57.586100 milliseconds"

The graph data is the same, and I used the same commands in both environments to create the graph.
My questions are:

  1. Why do I see different shortest-path results in the two environments even though the graph data is the same?
  2. Why does the response time differ between the environments?
  3. “Cached execution: 1”: does this mean the result is being fetched from the cache? If so, shouldn't the response time be much lower?

Any help would be greatly appreciated.

Re-running the query is not expected to be much faster. We cache query execution plans, not query results, and the time saved by reusing a plan is generally negligible for long-running queries.
Note that, in theory, the graph could change between executions of the same query.

No, “Cached execution: 1” means the execution plan is being fetched from the cache.
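
As a rough illustration of that distinction, here is a toy model in Python. It is not RedisGraph's implementation, and every name in it (`ToyEngine`, `plan_cache`, and so on) is made up for the example. The point is only that a cached plan skips the planning step, while the execution step still runs every time against the current graph.

```python
# Toy model only: "plan cached, result recomputed".
# None of these names come from RedisGraph.

class ToyEngine:
    def __init__(self):
        self.plan_cache = {}   # query text -> compiled plan
        self.executions = 0    # how many times a plan was actually run

    def _compile(self, query):
        # Stand-in for parsing and planning; cheap relative to execution.
        return ("plan", query)

    def _execute(self, plan, graph):
        # Stand-in for the real work (e.g. a path search); always runs.
        self.executions += 1
        return sum(graph)

    def query(self, query, graph):
        cached = query in self.plan_cache
        if not cached:
            self.plan_cache[query] = self._compile(query)
        result = self._execute(self.plan_cache[query], graph)
        return result, cached


engine = ToyEngine()
graph = [1, 2, 3]
print(engine.query("Q", graph))   # (6, False): plan compiled, then executed
graph.append(4)                   # the graph changes between runs
print(engine.query("Q", graph))   # (10, True): plan reused, still executed
```

In this sketch the second call reports a cached plan yet still does the full execution, which is why a "Cached execution: 1" run can take about as long as the first one.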

I don’t think the environment alone accounts for such a large difference. Are you sure that the graphs are indeed identical? A basic verification would be to check that the number of nodes and the number of relationships match:
"match (n) return count(n)"
"match ()-[r]->() return count(r)"

You can also build a list of (indegree, number of nodes with that indegree):
"MATCH (n) RETURN indegree(n), count(1) ORDER BY indegree(n)"
and a similar list for outdegree, to verify that the distributions are the same for both graphs.
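
If it helps, the comparison can be scripted. The sketch below is a hypothetical helper, not part of any RedisGraph client: it assumes you have already run the degree queries in each environment with whatever client you use, and that each result is a list of [degree, count] rows. The sample rows are invented for illustration and are not data from this thread.

```python
# Hypothetical helper; rows are assumed to be the [degree, count] pairs
# returned by the degree queries below, fetched with any client.
INDEGREE_QUERY = "MATCH (n) RETURN indegree(n), count(1) ORDER BY indegree(n)"
OUTDEGREE_QUERY = "MATCH (n) RETURN outdegree(n), count(1) ORDER BY outdegree(n)"


def histogram_diff(rows_a, rows_b):
    """Return {degree: (count_in_a, count_in_b)} for degrees whose counts differ."""
    a = {int(d): int(c) for d, c in rows_a}
    b = {int(d): int(c) for d, c in rows_b}
    return {
        d: (a.get(d, 0), b.get(d, 0))
        for d in sorted(set(a) | set(b))
        if a.get(d, 0) != b.get(d, 0)
    }


# Made-up rows for illustration only.
cloud = [[0, 10], [1, 6000], [2, 290]]
local = [[0, 10], [1, 5990], [2, 295], [3, 5]]
print(histogram_diff(cloud, local))
```

An empty result only tells you the degree distributions match. It does not compare the pathCost values on the relationships, so those would need a separate check.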

Hi Smitha

I am having a similar issue. Did you find a solution?

Thanks
Kushal