Newbie: efficient way to bulk-delete some items

bliako · November 30, 2021, 9:28am

I am totally new to Redis (and its genre). So far it looks and performs great. However, I need some guidance on best practice for my use case, which is explained below. I also have a particular question at the end.

I use Redis to store 3 or 4 types of data, all as key-value pairs (key=text, value=JSON string). Some of these data types are ephemeral, meaning that they need to be deleted after some short time interval. Others are persistent and need to be kept for the life of the app, for example total statistics: how long it has been running, how many results were found, etc.

I use C to analyse the data in a big loop, at the end of which it stores results in Redis (unix socket, same computer). Then my web app, living in my web server (same computer), reads Redis at regular intervals and, if data is available, serves it as HTML to clients via a browser.

I use hiredis to interact with Redis from C.
My web-app is Perl-based, Mojolicious and uses Mojo::Redis to interact with Redis.

All this works well, and thanks to all the free open-source software for this. That is, until I want to optimise or implement particular behaviour. For example, at the end of each loop I want to delete all ephemeral items (the results of that loop, which have already been presented) but keep all persistent items (the statistics). I have already made sure that my key names are grouped so that I could do (if only I could!) DEL ephemeral-*

So the question is how to bulk-delete keys matching a pattern in a good, best-practice way. Like: delete ephemeral-*

I have searched for this and the majority of the answers regurgitate this: fetch all matching keys using SCAN and then DEL each (with demo bash script provided).

How to do this via C using hiredis in an optimised way?

From what I gathered, it must use a pipeline (hiredis: redisAppendCommand) but what I can’t understand is whether I need to fetch all the keys which match a pattern (e.g. with SCAN) in my C app and then construct another pipeline to tell Redis to delete those keys. Is there a more optimised way to do this? I mean why fetch the keys back (with a cursor!) and then send them again to Redis with DEL??? Can’t I tell Redis to just SCANDEL pattern? Can I tell Redis to execute a small script doing exactly that but on the Redis engine? I simply don’t like this paper-pushing so-to-speak, so I am not sure whether this is the right way to go. It definitely requires a lot of extra programming effort, it’s not just the communication overhead.
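On the question of running this inside the Redis engine: one option is EVAL with a small Lua script, so that the key names never travel back to the client. The sketch below only builds the argument vector; with hiredis it could be handed to redisCommandArgv(c, argc, argv, argvlen). The names SCAN_DEL_SCRIPT and build_scan_del_eval are made up for illustration. The script assumes Redis 5 or later (where, as far as I know, a script may write after calling SCAN) and a single non-clustered instance, since it touches keys that are not declared in KEYS.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Lua: walk the keyspace with SCAN and delete every key matching ARGV[1].
   Returns the number of keys deleted. Runs entirely on the server. */
const char SCAN_DEL_SCRIPT[] =
    "local cursor = '0'\n"
    "local deleted = 0\n"
    "repeat\n"
    "  local page = redis.call('SCAN', cursor, 'MATCH', ARGV[1], 'COUNT', 1000)\n"
    "  cursor = page[1]\n"
    "  for _, key in ipairs(page[2]) do\n"
    "    deleted = deleted + redis.call('DEL', key)\n"
    "  end\n"
    "until cursor == '0'\n"
    "return deleted\n";

/* Fill argv/argvlen (room for 4 entries each) with
   "EVAL <script> 0 <pattern>" and return the argument count. */
int build_scan_del_eval(const char *pattern,
                        const char **argv, size_t *argvlen)
{
    assert(pattern != NULL && argv != NULL && argvlen != NULL);
    argv[0] = "EVAL";
    argv[1] = SCAN_DEL_SCRIPT;
    argv[2] = "0";          /* no declared KEYS; the pattern goes in ARGV[1] */
    argv[3] = pattern;
    for (int i = 0; i < 4; i++)
        argvlen[i] = strlen(argv[i]);
    return 4;
}
```

The script runs atomically and blocks the server until it finishes, so it removes the round trips but not the cost of walking the whole keyspace.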

I can also settle for storing all the ephemeral keys in a Redis LIST when they are inserted into Redis (meaning transferring the same key data twice, unfortunately), and then telling Redis to read that LIST, treat its items as KEYS and delete them. If there was a command like DEL_LIST_ITEMS_AS_KEYS list it would be great!

I would like to achieve the above with just a single database if possible, because I am just not acquainted with multi-db operation and its quirks. If anyone thinks this is a better solution (i.e. create 2 Redis dbs, ephemeral-db and persistent-db, and then simply FLUSHDB ephemeral-db at the end of each loop, which sounds like the nuclear option) then please guide me. The vibe on the net was not so positive, with love-and-hate for multi-dbs.
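If the two-database route is tried, note that FLUSHALL empties every database on the server, whereas FLUSHDB empties only the one the connection has currently selected. SELECT is per-connection state, so the web app would also have to SELECT the same database to read the ephemeral results. A sketch of the end-of-loop command sequence, assuming (hypothetically) that ephemeral data lives in database 1 and persistent data in database 0; the names FLUSH_EPHEMERAL and flush_step are made up:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One Redis command as a short, fixed-size argument list. */
struct cmd {
    int argc;
    const char *argv[2];
};

/* End-of-loop sequence: switch to the ephemeral database, empty it,
   switch back. The database numbers are assumptions for this sketch. */
const struct cmd FLUSH_EPHEMERAL[] = {
    { 2, { "SELECT", "1" } },    /* ephemeral database (assumed index 1) */
    { 1, { "FLUSHDB", NULL } },  /* clears only the selected database */
    { 2, { "SELECT", "0" } },    /* back to the persistent database */
};

const size_t FLUSH_EPHEMERAL_LEN =
    sizeof FLUSH_EPHEMERAL / sizeof FLUSH_EPHEMERAL[0];

/* Copy step i of the sequence into argv/argvlen (room for 2 entries each)
   and return its argument count. */
int flush_step(size_t i, const char **argv, size_t *argvlen)
{
    assert(i < FLUSH_EPHEMERAL_LEN);
    for (int a = 0; a < FLUSH_EPHEMERAL[i].argc; a++) {
        argv[a] = FLUSH_EPHEMERAL[i].argv[a];
        argvlen[a] = strlen(argv[a]);
    }
    return FLUSH_EPHEMERAL[i].argc;
}
```

Each step could be queued with hiredis's redisAppendCommandArgv and the three replies read back with redisGetReply.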

If you think I should shorten this or ask specific short questions, please let me know and I can do that.

Many thanks for your help,

bliako · November 30, 2021, 5:00pm

I have implemented, in C, redis_mdel(redisContext *c, const char *pattern), which uses the hiredis C client to SCAN for the input wildcard pattern and then DEL the keys using a pipeline. It’s quite fast, I can’t complain. And I am sure it can be made faster with your comments. It’s just that I feel there are unnecessary key transfers, but perhaps I don’t see the big picture.
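One detail that can cut the number of commands in such a pipeline: DEL accepts any number of keys, so each page returned by SCAN can be deleted with a single command rather than one DEL per key. A minimal sketch of that step follows. The name build_multi_del and the caller-supplied buffers are hypothetical, keys are assumed to be NUL-terminated text, and this is not a description of what redis_mdel in the gist actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fill argv/argvlen with "DEL key1 key2 ... keyN".
   argv and argvlen must each have room for `cap` entries.
   Returns the argument count (nkeys + 1), or 0 when there is nothing
   to delete or the buffers are too small. */
int build_multi_del(const char **keys, size_t nkeys,
                    const char **argv, size_t *argvlen, size_t cap)
{
    assert(argv != NULL && argvlen != NULL);
    if (nkeys == 0 || nkeys + 1 > cap)
        return 0;
    argv[0] = "DEL";
    argvlen[0] = 3;
    for (size_t i = 0; i < nkeys; i++) {
        argv[i + 1] = keys[i];               /* borrowed, not copied */
        argvlen[i + 1] = strlen(keys[i]);
    }
    return (int)(nkeys + 1);
}
```

The result could go to redisAppendCommandArgv(c, argc, argv, argvlen), followed by one redisGetReply per queued command. On Redis 4 or later, swapping "DEL" for "UNLINK" (length 6) removes the keys immediately but reclaims their memory in a background thread.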

Comments welcome.

https://gist.github.com/hadjiprocopis/32e1ac08aad5253d26b180fc613ea64b

redis_mdel.c

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <time.h>

/* delete keys matching a pattern, comments welcome
   author: Andreas Hadjiprocopis (https://github.com/hadjiprocopis) / bliako
   date  : 30/11/2021
*/
/* cvector is a headers-only vector implementation

(Preview truncated here; the full file is at the gist URL above.)

KyleB · December 1, 2021, 12:09am

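The problem with using SCAN is that a full iteration is O(N) in the total number of keys, not in the number of matches. The MATCH filter is applied after the keys are fetched, so you’ll need to scan the entire keyspace to know that you’ve found all keys matching ephemeral-*. This might be okay if your keyspace is small.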

Sometimes Redis users will keep track of keys in a Redis set. This way, you can just iterate over the key names in the set and then delete them. This avoids the O(n) scan of the entire keyspace.
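A sketch of that set-based bookkeeping, under stated assumptions: the index set is called ephemeral:index (a made-up name), every write of an ephemeral key is paired with an SADD to it, and the cleanup runs as a Lua script via EVAL so the key names do not have to come back to the client. Because the script deletes keys that are not listed in KEYS, it assumes a single non-clustered Redis instance. Only the argument vectors are built here; with hiredis they could be passed to redisAppendCommandArgv or redisCommandArgv.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EPHEMERAL_INDEX "ephemeral:index"  /* hypothetical set name */

/* Lua: delete every key named in the set KEYS[1], then the set itself.
   Returns how many tracked keys still existed and were deleted. */
const char PURGE_SET_SCRIPT[] =
    "local deleted = 0\n"
    "for _, key in ipairs(redis.call('SMEMBERS', KEYS[1])) do\n"
    "  deleted = deleted + redis.call('DEL', key)\n"
    "end\n"
    "redis.call('DEL', KEYS[1])\n"
    "return deleted\n";

/* Set argvlen[i] = strlen(argv[i]) for the first argc entries. */
static void fill_lengths(int argc, const char **argv, size_t *argvlen)
{
    for (int i = 0; i < argc; i++)
        argvlen[i] = strlen(argv[i]);
}

/* "SADD ephemeral:index <key>": record a key at the time it is written.
   argv/argvlen need room for 3 entries. Returns the argument count. */
int build_track_key(const char *key, const char **argv, size_t *argvlen)
{
    assert(key != NULL);
    argv[0] = "SADD";
    argv[1] = EPHEMERAL_INDEX;
    argv[2] = key;
    fill_lengths(3, argv, argvlen);
    return 3;
}

/* "EVAL <script> 1 ephemeral:index": purge everything that was tracked.
   argv/argvlen need room for 4 entries. Returns the argument count. */
int build_purge_eval(const char **argv, size_t *argvlen)
{
    argv[0] = "EVAL";
    argv[1] = PURGE_SET_SCRIPT;
    argv[2] = "1";              /* one declared key: the index set */
    argv[3] = EPHEMERAL_INDEX;
    fill_lengths(4, argv, argvlen);
    return 4;
}
```

The SADD can be queued in the same pipeline as the SET that writes the value, so tracking costs one extra command per key but no extra round trip.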

If you’re using hashes, another option would be to add an “ephemeral: true” field to them. Then you can index those hashes with RediSearch (the source-available Redis module), quickly find them, and delete them.
