Design for quick repeat creation of relationships

Warren_Stephens · January 29, 2020, 5:52pm

Hey! Say I have a project with a set of nodes like this:

- 250m persons
- 2m companies
- 200k locations

I use the bulk-loader to create those nodes once.

And then I essentially want to repeat the following:

1. Create 2m to 10m relationships
2. Perform some analysis, queries, etc.
3. Delete all of the relationships (without disturbing the nodes)

What would be the most efficient way to accomplish this? Through an API?
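
For concreteness, the cycle above might be sketched roughly like this in Python. This is only an illustration under assumptions: `run_query` is a hypothetical callable that sends one Cypher string to the graph, and the relationship type `works_at` is made up for the example.

```python
from typing import Callable, Iterable, List

# Hypothetical relationship type; substitute whatever type the graph uses.
# Deleting only the matched relationships leaves the nodes in place.
DELETE_RELATIONSHIPS = "MATCH ()-[r:works_at]->() DELETE r"


def run_cycle(
    run_query: Callable[[str], object],
    create_queries: Iterable[str],
    analysis_queries: Iterable[str],
) -> List[object]:
    """Create relationships, run the analysis, then delete the relationships."""
    # Step 1: create the temporary relationships.
    for query in create_queries:
        run_query(query)
    try:
        # Step 2: run the analysis queries and collect their results.
        return [run_query(query) for query in analysis_queries]
    finally:
        # Step 3: remove the relationships even if an analysis query fails.
        run_query(DELETE_RELATIONSHIPS)
```

The `try`/`finally` is a design choice so that a failed analysis does not leave stale relationships behind for the next round.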

Is this viable with that number of relationships?

Thanks for any help.

jeffreylovitz · January 30, 2020, 3:14pm

Hi Warren,

There currently is no API to perform large batch modifications on existing graphs, but we have heard this feature request a few times and hope to roll out a solution that will operate similarly to the bulk-loader within 2-3 months.

At present, you would have to create your relationships through a Cypher query, which at that scale I would expect to take at least 30-60 seconds to execute (very rough estimate). We are currently investigating some potential bottlenecks in relationship creation, but this is likely to be a rather expensive operation regardless.
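
As a rough sketch of what creating relationships through Cypher could look like, the Python below splits (source, destination) id pairs into batches and builds one `UNWIND ... MATCH ... CREATE` query per batch. The labels `person` and `company`, the `id` property, the `works_at` type, and the helper names are all assumptions for illustration, not part of any RedisGraph API.

```python
from typing import Iterable, Iterator, List, Tuple

Pair = Tuple[int, int]  # (person id, company id); hypothetical schema


def chunked(pairs: Iterable[Pair], size: int) -> Iterator[List[Pair]]:
    """Yield successive batches of at most `size` pairs."""
    batch: List[Pair] = []
    for pair in pairs:
        batch.append(pair)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def create_query(batch: List[Pair]) -> str:
    """Build one Cypher query that creates a relationship per pair."""
    # int() keeps anything other than integer ids out of the query text.
    literal = ", ".join(f"[{int(src)}, {int(dst)}]" for src, dst in batch)
    return (
        f"UNWIND [{literal}] AS p "
        "MATCH (a:person {id: p[0]}), (b:company {id: p[1]}) "
        "CREATE (a)-[:works_at]->(b)"
    )


# Each query string could then be sent with the GRAPH.QUERY command, e.g.
# via redis-py: client.execute_command("GRAPH.QUERY", "mygraph", query)
# ("mygraph" is a placeholder graph name).
```

The batch size is a tuning knob: larger batches mean fewer round trips but longer individual queries. Looking up nodes by `id` will likely need an index on that property to be reasonably fast at this scale.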

I’m curious about the use case you’re trying to address with this approach, though. Are we able to re-model the problem such that you can use persistent relationships rather than ephemeral ones? This would be dramatically more efficient than any approach that relies heavily on writing and deleting. I’d be happy to help you try to model this if you can provide additional information!

Jeff Lovitz

Warren_Stephens · January 31, 2020, 3:32pm

Jeff,

Hey! Thank you for the response. A batching feature similar to the bulk-loader sounds promising!

In this sample use case, suppose I have many billions of relationships, more in total than can be stored on a single RedisGraph server, and it makes sense to analyze subsets of 1m to 100m relationships at a time. That is what drives the desire to swap subsets of relationships in and out.

Which leads to a related question. Would it be advantageous to represent some information as nodes, rather than as relationships, in order to make the matrix algebra behind those queries run faster? Or does it not make a speed difference whether something is a node or a relationship?
