Working with huge datasets

mkrzyszc · September 2, 2021, 3:25pm

Hi everyone,

I am experimenting with RediSearch on a large dataset (~500k documents). I have the following data structure, stored as JSON:

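// Class name "Invoice" is assumed here; the snippet as posted had no declaration.
public class Invoice
{
    public Guid Id;
    public string InvoiceNumber;
    public int Items;
    public Instant? InvoiceDate;
}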

For test purposes, InvoiceNumber = $"INV_{n}" and Items = n % 20, where n is in the range <0, 500000>. I have the following index:

FT.CREATE invoice on JSON PREFIX 1 "invoice:" SCHEMA $.InvoiceNumber as InvoiceNumber TEXT SORTABLE $.Items as Items NUMERIC SORTABLE
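To make the test setup concrete, here is a minimal Python sketch of how such documents could be generated. The function names `make_invoice` and `make_dataset` are hypothetical, as is the key layout `invoice:<Id>` (only the `invoice:` prefix comes from the index definition). It also assumes n runs from 0 to 499999 and that InvoiceDate is left unset.

```python
import json
import uuid


def make_invoice(n: int) -> dict:
    """Build one test document following the pattern described above."""
    return {
        "Id": str(uuid.uuid4()),
        "InvoiceNumber": f"INV_{n}",
        "Items": n % 20,
        "InvoiceDate": None,  # assumption: not populated in the test data
    }


def make_dataset(count: int = 500_000):
    """Yield (key, json_payload) pairs; the key layout is an assumption."""
    for n in range(count):
        doc = make_invoice(n)
        yield f"invoice:{doc['Id']}", json.dumps(doc)


if __name__ == "__main__":
    # Number of documents that should satisfy @Items:[10 10].
    expected = sum(1 for n in range(500_000) if n % 20 == 10)
    print(expected)  # 25000
```

Each payload could then be written with JSON.SET under its key. The computed 25000 is the count a query on Items = 10 alone should report for this dataset.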

I ran into a few issues with FT.SEARCH:

ft.search invoice "@InvoiceNumber:INV_*" limit 0 20

shows that 200 items are available

ft.search invoice "@InvoiceNumber:INV_* @Items:[10 10]" limit 0 20

shows that only 10 items are available

ft.search invoice "@Items:[10 10]" limit 0 20

shows that 25000 items are available

Could you please tell me why MAXEXPANSIONS is applied only to TEXT fields?
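The counts of 200 and 10 are consistent with a cap on prefix expansion (the documented default for MAXEXPANSIONS is 200), applied before the numeric filter is intersected. The toy model below is only a sketch of that idea, not RediSearch's implementation. The helper `capped_prefix_matches` is hypothetical, and it assumes the first 200 terms in order of n are the ones picked, whereas the real engine may select a different 200.

```python
MAX_EXPANSIONS = 200  # documented RediSearch default for MAXEXPANSIONS


def capped_prefix_matches(terms, prefix, cap=MAX_EXPANSIONS):
    """Return at most `cap` terms starting with `prefix` (toy stand-in)."""
    matches = []
    for term in terms:
        if term.startswith(prefix):
            matches.append(term)
            if len(matches) == cap:
                break
    return matches


# Same test data pattern as in the post: one unique term per document.
items_by_term = {f"INV_{n}": n % 20 for n in range(500_000)}

# Prefix query alone: capped at MAX_EXPANSIONS terms, one document each.
expanded = capped_prefix_matches(items_by_term, "INV_")
print(len(expanded))  # 200

# Prefix query intersected with Items == 10: only the capped set is checked.
both = [t for t in expanded if items_by_term[t] == 10]
print(len(both))  # 10

# Numeric filter alone: no term expansion involved, so no cap applies.
numeric_only = [t for t, items in items_by_term.items() if items == 10]
print(len(numeric_only))  # 25000
```

In this model the cap limits how many distinct terms a prefix expands to, so it only matters when every document carries a unique term, as with InvoiceNumber here. A numeric range does not go through term expansion at all.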

ft.search invoice "@InvoiceNumber:INV_*"

returns

(integer) 99

(empty array)
(1.52s)

The array is empty, but according to the documentation it should return the top results accumulated so far.

I am not really sure what is happening here. The only solution I have found (and probably not the most elegant one) is setting TIMEOUT to 0.

With MAXEXPANSIONS set to 500k and TIMEOUT set to 0, LIMIT behaves really strangely: queries take a long time, and it makes no difference whether the LIMIT parameter is present.

Thank you in advance for your help. I have been researching this for a few days, but I did not manage to find an answer.

Regards,
Mkrzyszc
