Concurrency bug with RedisGraph

Hello!

I’m encountering a bug where a certain query sent to RedisGraph non-deterministically returns different results when concurrency is turned on (i.e. THREAD_COUNT > 1). I’m requesting a large amount of data from RedisGraph, and sometimes the same query returns 8035 rows and other times roughly 5000 (e.g. 5020 or 5212). The correct answer to the query is 8035 rows.

Another oddity is that this does not seem to happen when I send the requests one-by-one, in which case I always get the correct output. When I instead send two or more requests at once (on different threads), some will come back correctly while others yield an incorrect response.

I set CACHE_SIZE to 1 and THREAD_COUNT to 1, and every request returns 8035 rows. With THREAD_COUNT set to 2 and CACHE_SIZE still at 1, the same query sometimes returns ~5000 rows. Is there some other knob I can try to turn?

It seems like something in the RedisGraph code may be caching a matrix or node set but it’s getting cut off or evicted. Any help would be much appreciated!

We also upgraded to the latest version of RedisGraph (2.10.4) and are still having this problem. Our Redis version is 7.0.

Would you mind sharing the query?

We consider the query IP so I don’t think we can share the query directly.

OK, if you’re a Redis Enterprise customer, you can share it via our customer service.
Another option might be to alter or simplify the query to the point where it still reproduces the issue but is OK to share.

We’re looking into this more offline. I’ll check with some folks at my company whether we have Redis Enterprise.

Using PROFILE, we have narrowed it down to a filter acting incorrectly. If we filter the rows manually in Python, we no longer see this bug.
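
A minimal sketch of that kind of client-side filtering, assuming the filtered property is returned as an extra column so the predicate can be applied in Python instead of in the query. The filter_rows helper, the row layout, and the sample values are all hypothetical and only illustrate the idea:

```python
def filter_rows(rows, column_index, expected):
    """Keep only the rows whose value at column_index equals expected."""
    return [row for row in rows if row[column_index] == expected]


# Hypothetical rows shaped like a result set: [fr, fi, name].
rows = [
    ["/a", "f1", "test"],
    ["/a", "f2", "other"],
    ["/b", "f3", "test"],
]

# Apply the equivalent of a `name = "test"` predicate on the client.
kept = filter_rows(rows, column_index=2, expected="test")
print(len(kept))
```

This moves work from the server to the client, so it is a workaround for narrowing down the problem rather than a fix.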

Apologies for the delay. We have now fully anonymized our data so we can share the query.

Script to run the query (test.py):

import redis
from redis.commands.graph import Graph
import sys
import random
import string

uid = ''.join(random.choices(string.digits, k=5))
r = redis.Redis(unix_socket_path='/test_socket.sock')
g = Graph(r, name="tg")

q = """
MATCH (r :N1 {path: $path_""" + uid + """})
-[:N1PARENT*0..99]->(fr :N1)
<-[:N1PARENT*0..]-(:N1)
<-[e :N2_TO_N1]-(fi)
<-[ :N3_TO_N2]-(:N3 {name: $name})
RETURN DISTINCT fr, fi
"""
params = {"name": "test", f"path_{uid}": "/"}

def run(name):
    res = g.query(q, params)
    print(name, len(res.result_set))

run(sys.argv[1])

To trigger the bug, we need to launch several concurrent requests. Here’s a simple driver script to do that (test.sh):

#!/bin/bash

for i in {1..50}
do
   python test.py "r$i" &
done

wait < <(jobs -p)
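
The same kind of driver can also be sketched in Python, which makes it easy to flag the requests whose row count deviates. This is only a sketch under a few assumptions: it uses threads rather than the separate processes that test.sh launches, the helper names run_concurrently and find_outliers are made up for illustration, and it treats the most common row count as the expected one.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(query_fn, n_requests):
    """Call query_fn(name) for r1..rN in parallel; return {name: row_count}."""
    names = [f"r{i}" for i in range(1, n_requests + 1)]
    with ThreadPoolExecutor(max_workers=n_requests) as pool:
        counts = list(pool.map(query_fn, names))
    return dict(zip(names, counts))


def find_outliers(counts):
    """Return (most common row count, requests that deviate from it)."""
    if not counts:
        return None, {}
    expected, _ = Counter(counts.values()).most_common(1)[0]
    outliers = {name: c for name, c in counts.items() if c != expected}
    return expected, outliers


# With a real server, query_fn could be something like (assumption):
#   lambda name: len(g.query(q, params).result_set)
# Here a stand-in function is used so the sketch runs without Redis.
def fake_query(name):
    return 105


expected, outliers = find_outliers(run_concurrently(fake_query, 5))
print(expected, outliers)
```

Taking the majority count as the reference is a heuristic; if the correct row count is known, comparing against it directly is more reliable.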

Graph building with dummy data:

In case it helps, here is our configuration file:

A few notes:

  • Sorry about the mix of raw code & pastebin links; as a new user I was only allowed to have at most 2 links in my post
  • The strange variable mangling with uid in test.py is only done to circumvent any caching that Redis would otherwise do for our queries
  • To make the bug easier to reproduce, I added THREAD_COUNT 2 to our config file, but we can trigger it with our full dataset quite easily on a higher thread count as well
  • With these scripts, on RedisGraph 2.10.4, we can consistently trigger the bug every single run
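
The uid mangling mentioned above can be isolated as a small helper. This is a sketch, assuming the query cache is keyed on the query text, so that a different parameter name yields a different cache entry; build_query and QUERY_TEMPLATE are illustrative names that do not appear in test.py.

```python
import random
import string

# Same query as in test.py; doubled braces are literal braces for str.format.
QUERY_TEMPLATE = """
MATCH (r :N1 {{path: $path_{uid}}})
-[:N1PARENT*0..99]->(fr :N1)
<-[:N1PARENT*0..]-(:N1)
<-[e :N2_TO_N1]-(fi)
<-[ :N3_TO_N2]-(:N3 {{name: $name}})
RETURN DISTINCT fr, fi
"""


def build_query(uid=None):
    """Return (query, params) with a parameter name unique to this uid."""
    if uid is None:
        uid = "".join(random.choices(string.digits, k=5))
    query = QUERY_TEMPLATE.format(uid=uid)
    params = {"name": "test", f"path_{uid}": "/"}
    return query, params


q1, p1 = build_query("11111")
q2, p2 = build_query("22222")
print(q1 != q2)  # the two query texts differ only in the parameter name
```

Because the parameter values are identical in both cases, any difference in results between such requests cannot come from the parameters themselves.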

Please let me know if there is anything else I can provide to make debugging easier. Thank you!

Thank you,
I’m looking into it, will update as soon as I have anything.

Hi,
I’ve used the provided scripts:

  1. build.py
  2. test.py
  3. test.sh

I’ve run test.sh multiple times against RedisGraph v2.10.4 and I’m consistently getting the same number of results:

(python_venv) ➜  concurency_bug ./runner.sh
./runner.sh: line 11: syntax error near unexpected token `<'
./runner.sh: line 11: `wait < <(jobs -p)'
(python_venv) ➜  concurency_bug r14 105
r3 105
r4 105
r2 105
r1 105
r11 105
r10 105
r21 105
r8 105
r17 105
r20 105
r18 105
r5 105
r13 105
r35 105
r34 105
r47 105
r24 105
r45 105
r12 105
r42 105
r26 105
r15 105
r7 105
r16 105
r23 105
r29 105
r50 105
r39 105
r19 105
r36 105
r27 105
r31 105
r41 105
r49 105
r37 105
r32 105
r33 105
r28 105
r43 105
r25 105
r46 105
r22 105

Every run of the script reported 105 results.

Would you mind sharing the output of the MONITOR command while your test is running? Simply run redis-cli MONITOR before executing the test.

Hello,

Running test.sh (modified for just 5 processes) gives the following output:

[root@f286fe84b1da container]# ./test.sh 
r3 9
r4 105
r2 105
r1 105
r5 8

The MONITOR command you suggested logged the following:

[root@f286fe84b1da container]# redis-cli -s /test_socket.sock MONITOR 
OK
1671113371.517533 [0 unix:/test_socket.sock] "GRAPH.QUERY" "tg" "CYPHER name=\"test\" path_36369=\"/\" \nMATCH (r :N1 {path: $path_36369})\n-[:N1PARENT*0..99]->(fr :N1)\n<-[:N1PARENT*0..]-(:N1)\n<-[e :N2_TO_N1]-(fi)\n<-[ :N3_TO_N2]-(:N3 {name: $name})\nRETURN DISTINCT fr, fi\n" "--compact"
1671113371.517662 [0 unix:/test_socket.sock] "GRAPH.QUERY" "tg" "CYPHER name=\"test\" path_41176=\"/\" \nMATCH (r :N1 {path: $path_41176})\n-[:N1PARENT*0..99]->(fr :N1)\n<-[:N1PARENT*0..]-(:N1)\n<-[e :N2_TO_N1]-(fi)\n<-[ :N3_TO_N2]-(:N3 {name: $name})\nRETURN DISTINCT fr, fi\n" "--compact"
1671113371.518175 [0 unix:/test_socket.sock] "GRAPH.QUERY" "tg" "CYPHER name=\"test\" path_70638=\"/\" \nMATCH (r :N1 {path: $path_70638})\n-[:N1PARENT*0..99]->(fr :N1)\n<-[:N1PARENT*0..]-(:N1)\n<-[e :N2_TO_N1]-(fi)\n<-[ :N3_TO_N2]-(:N3 {name: $name})\nRETURN DISTINCT fr, fi\n" "--compact"
1671113371.518210 [0 unix:/test_socket.sock] "GRAPH.QUERY" "tg" "CYPHER name=\"test\" path_76569=\"/\" \nMATCH (r :N1 {path: $path_76569})\n-[:N1PARENT*0..99]->(fr :N1)\n<-[:N1PARENT*0..]-(:N1)\n<-[e :N2_TO_N1]-(fi)\n<-[ :N3_TO_N2]-(:N3 {name: $name})\nRETURN DISTINCT fr, fi\n" "--compact"
1671113371.519342 [0 unix:/test_socket.sock] "GRAPH.QUERY" "tg" "CYPHER name=\"test\" path_23161=\"/\" \nMATCH (r :N1 {path: $path_23161})\n-[:N1PARENT*0..99]->(fr :N1)\n<-[:N1PARENT*0..]-(:N1)\n<-[e :N2_TO_N1]-(fi)\n<-[ :N3_TO_N2]-(:N3 {name: $name})\nRETURN DISTINCT fr, fi\n" "--compact"
1671113371.528587 [0 unix:/test_socket.sock] "GRAPH.RO_QUERY" "tg" "CALL DB.LABELS()" "--compact"
1671113371.532376 [0 unix:/test_socket.sock] "GRAPH.RO_QUERY" "tg" "CALL DB.LABELS()" "--compact"
1671113371.538152 [0 unix:/test_socket.sock] "GRAPH.RO_QUERY" "tg" "CALL DB.PROPERTYKEYS()" "--compact"
1671113371.538166 [0 unix:/test_socket.sock] "GRAPH.RO_QUERY" "tg" "CALL DB.PROPERTYKEYS()" "--compact"
1671113371.542063 [0 unix:/test_socket.sock] "GRAPH.RO_QUERY" "tg" "CALL DB.LABELS()" "--compact"
1671113371.542077 [0 unix:/test_socket.sock] "GRAPH.RO_QUERY" "tg" "CALL DB.LABELS()" "--compact"
1671113371.542371 [0 unix:/test_socket.sock] "GRAPH.RO_QUERY" "tg" "CALL DB.PROPERTYKEYS()" "--compact"
1671113371.542401 [0 unix:/test_socket.sock] "GRAPH.RO_QUERY" "tg" "CALL DB.PROPERTYKEYS()" "--compact"
1671113371.547584 [0 unix:/test_socket.sock] "GRAPH.RO_QUERY" "tg" "CALL DB.LABELS()" "--compact"
1671113371.547896 [0 unix:/test_socket.sock] "GRAPH.RO_QUERY" "tg" "CALL DB.PROPERTYKEYS()" "--compact"
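
A log like this can be summarized with a few lines of Python, for example to confirm that every test process issued exactly one GRAPH.QUERY. This is a rough sketch that assumes the line layout shown above (timestamp, bracketed database and client, then quoted arguments); parse_monitor_line and count_commands are illustrative helper names.

```python
import re
import shlex
from collections import Counter

# Layout assumed from the log above: <timestamp> [<db> <client>] "arg" "arg" ...
LINE_RE = re.compile(r'^(\d+\.\d+) \[(\d+) ([^\]]+)\] (.*)$')


def parse_monitor_line(line):
    """Split one MONITOR line into its parts; return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    timestamp, db, client, rest = match.groups()
    return {
        "timestamp": float(timestamp),
        "db": int(db),
        "client": client,
        "args": shlex.split(rest),  # handles the double-quoted arguments
    }


def count_commands(lines):
    """Count how often each command name (the first argument) appears."""
    counts = Counter()
    for line in lines:
        parsed = parse_monitor_line(line)
        if parsed is not None and parsed["args"]:
            counts[parsed["args"][0]] += 1
    return counts


sample = [
    "OK",
    '1671113371.528587 [0 unix:/test_socket.sock] "GRAPH.RO_QUERY" "tg" "CALL DB.LABELS()" "--compact"',
    '1671113371.538152 [0 unix:/test_socket.sock] "GRAPH.RO_QUERY" "tg" "CALL DB.PROPERTYKEYS()" "--compact"',
]
print(count_commands(sample))
```

Lines that do not follow the assumed layout, such as the initial OK, are skipped.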

Thanks!

Thank you,
Can you share additional details on how you obtained RedisGraph v2.10.4?
Did you build it from source yourself? Are you using Docker to run it?
What is the underlying hardware used to run the server?

Hello,

We are running RedisGraph in a Singularity container with some basic scaffolding for Python development on CentOS 7. To verify that Singularity itself is not the source of the problem, I also deployed with the Podman container runtime and observed the same bug.

We are building Redis from source (from https://download.redis.io/redis-stable.tar.gz) and have copied the redisgraph.so from /usr/lib/redis/modules/redisgraph.so in the docker://redisfab/redisgraph:2.10.4-x64-centos7 image.

Thanks for your continued help with this!

Thank you for these details.

Is the Docker image you’re using to test this issue publicly available?
If so, can you please provide a link to it?

You’ve mentioned that you’re building Redis from source (unlike RedisGraph, which you’re copying from redisfab/redisgraph:2.10.4). Is the compilation done on the same machine type you’re running your tests on?

I want to make sure the same OS and the same architecture are used for both building and running Redis and RedisGraph.

Hello,

The image we are using is not public, but I was able to reproduce this on a publicly available image, docker://redisfab/redisgraph:2.10.4-x64-centos7:

Terminal 1:

$ singularity instance start \
>     --no-home \
>     --fakeroot \
>     --writable \
>     --bind /path/to/scripts:/root:rw \
>     docker://redisfab/redisgraph:2.10.4-x64-centos7 \
>     testcontainer

Singularity> cp /usr/lib/redis/modules/redisgraph.so /
Singularity> redis-server /root/redis.conf

Note that /path/to/scripts contains the 4 files I previously uploaded.

Terminal 2:

$ singularity shell instance://testcontainer
Singularity> yum install -y python3
Singularity> python3 -m pip install redis
Singularity> cd /root
Singularity> python3 build.py
Singularity> ./test.sh # Note that this script needs to reference python3 instead of python in this env

I’m not familiar with Singularity.
Can you please explain this command: cp /usr/lib/redis/modules/redisgraph.so /
The redisfab/redisgraph Docker container already comes with a prebuilt RedisGraph module. Are you copying a different RedisGraph module, built on a different system, into testcontainer?

Apologies for the confusion. This is just me copying that redisgraph.so to the root directory, because the Redis config I posted earlier assumes this location:

loadmodule /redisgraph.so CACHE_SIZE 1 THREAD_COUNT 2

Good news, I’m able to reproduce the issue using the redisfab/redisgraph:2.10.4-x64-centos7 container given the CACHE_SIZE 1 and THREAD_COUNT 2 configuration.

I’ll update as soon as I’ve got any insights.

Do we have any updates on this please?

Hi,
A fix to this issue is being implemented and will be merged into our master branch within the next few days.

Please follow progress here: "do not cache query params" by swilly22, Pull Request #2803 in RedisGraph/RedisGraph on GitHub.
