FT.CREATE PREFIX vs wildcard pattern / regex

We have a scenario where our keys have some structure to them. This becomes a problem when we want to create a separate index for each kind of object, because we can only specify a prefix, and many objects share the same prefix for a given path.

For example given these keys

task:0
task:0:parameter:0
task:0:parameter:1
task:0:parameter:2
task:1
task:1:parameter:0
task:1:parameter:1

If we create an index with prefix “task:” we would get all the items above.

Is it possible to somehow use FT.CREATE that would only create an index for “task:[0-9]” and separately one for “task:*:parameter:”?

Would regex ever be an option here, or would performance be so bad it wouldn’t be considered?

Have you looked into index filter expressions? See the FILTER options on FT.CREATE: Command Reference - RediSearch - Redis Secondary Index & Query Engine

I looked at that and maybe I don’t fully understand it, but for my use case it didn’t appear to be powerful enough to do anything beyond startswith(), which it seems like the PREFIX already gives you.

Using regex to match keys isn’t an option at this time. Only prefix matching is supported, and it works the same way as prefix search queries (i.e., it really will only match a prefix of the key).

However, with FILTER you have access to the fields in the hash. So you could throw the ID value that you’re including in the key name into the hash as well, and match on that in a filter expression. Consider these examples of filter expressions:

FT.CREATE books-fiction-idx ON HASH PREFIX 1 ru203:book:details: FILTER "@categories=='Fiction'" (…)

FT.CREATE books-newer-idx ON HASH PREFIX 1 ru203:book:details: FILTER "@published_year>=1990" (…)

FT.CREATE books-older-idx ON HASH PREFIX 1 ru203:book:details: FILTER "@published_year<1990" (…)

In all cases, the field used in the comparison is a hash field. Does that help?

I appreciate your help, but unfortunately that wouldn’t work for my use case. My item keys are essentially a group of different sequential INTs, and I wouldn’t be able to use the filters for tags.

I specifically need to be able to say a key has terms in the key itself.

With the example keys below, I would need an index that only includes keys containing both “task:” && “parameter:”

Likewise, I would also need an index that filtered on “task:[0-9]” so that the task index wouldn’t also return entries for the parameter keys that share the hash prefix “task:”.

task:0
task:0:parameter:0
task:0:parameter:1
task:0:parameter:2
task:1
task:1:parameter:0
task:1:parameter:1

Ok, with that schema for your key names, you have two options with the current RediSearch feature set:

  • Restructure your key names so that they work with PREFIX (task:parameter:: maybe)
  • Include the sequential ID value from the key name in the hash (as well as the key) so that FILTER can use it
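For what it's worth, here is a rough sketch of how the second option could look. It only builds the hash fields and the FT.CREATE arguments in plain Python, so nothing here talks to Redis. The field names `kind`, `task_id` and `parameter_id`, the index names, and the SCHEMA clause are all assumptions made up for illustration; the FILTER expression follows the same `@field=='value'` form as the book examples earlier in the thread.

```python
def hash_fields(key: str) -> dict:
    """Derive extra hash fields from a key like 'task:0' or 'task:0:parameter:2'.

    The field names (kind, task_id, parameter_id) are hypothetical; use
    whatever fits your data model.
    """
    parts = key.split(":")
    fields = {"task_id": parts[1]}
    if len(parts) == 4 and parts[2] == "parameter":
        fields["kind"] = "parameter"
        fields["parameter_id"] = parts[3]
    else:
        fields["kind"] = "task"
    return fields


def create_index_args(index_name: str, kind: str) -> list:
    """Build FT.CREATE arguments for an index limited to one kind of hash.

    Both indexes share the 'task:' prefix; the FILTER on the hypothetical
    @kind field is what separates them. The SCHEMA shown is an assumption.
    """
    return [
        "FT.CREATE", index_name, "ON", "HASH",
        "PREFIX", "1", "task:",
        "FILTER", f"@kind=='{kind}'",
        "SCHEMA", "task_id", "NUMERIC",
    ]


if __name__ == "__main__":
    # Show the fields that would be stored alongside each key.
    for key in ["task:0", "task:0:parameter:1"]:
        print(key, hash_fields(key))
    # Show the two index definitions as command lines.
    print(" ".join(create_index_args("task-idx", "task")))
    print(" ".join(create_index_args("parameter-idx", "parameter")))
```

The idea is that each hash is written with these extra fields at the same time as the key is created, so the filter never has to look at the key name itself. The argument lists could then be passed to whatever client you use to send raw commands.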

And to request something like a REGEX_PREFIX option, you can open a ticket here: Issues · RediSearch/RediSearch · GitHub

I thought we had a field where you can specify whether the hash will be indexed in idx1 or idx2, but we don’t.
That could solve the issue. Can anyone think of the disadvantages of such a solution?