I am new to RediSearch and have run into a performance problem. I have an index with 7 fields, one of which is communityNumber, of type NUMERIC. The index holds about 2 million documents. When I run this query:
127.0.0.1:6379> ft.aggregate userIdx "*" groupby 1 @communityNumber limit 0 0
1) (integer) 500
(4.80s)
it takes about 5 seconds. When I load the same data into Elasticsearch, the equivalent aggregation takes less than 500 milliseconds. Is this the expected performance for aggregation queries in RediSearch, or am I querying incorrectly? What can I do to make the aggregation faster?
My current RediSearch version is 2.8.4.
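In case it helps anyone reproduce the timing outside redis-cli, here is a minimal sketch of how the same query could be timed from Python. It assumes the redis-py client is installed and uses only its generic `execute_command` call; the helper names (`aggregate_args`, `profile_args`, `time_command`) are my own for illustration and not part of any library. The `FT.PROFILE` variant wraps the same query and should return a per-stage timing breakdown, which may show where the 5 seconds go.

```python
import time


def aggregate_args(index, field):
    """Arguments for the same FT.AGGREGATE call as the redis-cli query above."""
    return ["FT.AGGREGATE", index, "*", "GROUPBY", "1", "@" + field, "LIMIT", "0", "0"]


def profile_args(index, field):
    """Same query wrapped in FT.PROFILE ... AGGREGATE QUERY for a timing breakdown."""
    query_part = aggregate_args(index, field)[2:]  # everything after the index name
    return ["FT.PROFILE", index, "AGGREGATE", "QUERY"] + query_part


def time_command(client, args, runs=5):
    """Run the command `runs` times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        client.execute_command(*args)
        best = min(best, time.perf_counter() - start)
    return best


def main():
    # Assumption: redis-py is installed and the server from this post is reachable.
    import redis

    client = redis.Redis(host="127.0.0.1", port=6379)
    seconds = time_command(client, aggregate_args("userIdx", "communityNumber"))
    print(f"fastest FT.AGGREGATE run: {seconds:.3f}s")
    print(client.execute_command(*profile_args("userIdx", "communityNumber")))
```

Calling `main()` against the instance above runs the aggregation five times and prints the fastest run, followed by the raw profile reply. Taking the minimum over several runs is just a way to reduce noise from other activity on the server.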
Here is the ft.explain output:
127.0.0.1:6379> ft.explain userIdx "*" groupby 1 @communityNumber limit 0 0
"<WILDCARD>}\n"
Below is the ft.info output:
127.0.0.1:6379> ft.info userIdx
1) index_name
2) userIdx
3) index_options
4) (empty array)
5) index_definition
6) 1) key_type
2) JSON
3) prefixes
4) 1) user
5) default_score
6) "1"
7) attributes
8) 1) 1) identifier
2) $.id
3) attribute
4) id
5) type
6) NUMERIC
2) 1) identifier
2) $.communityNumber
3) attribute
4) communityNumber
5) type
6) NUMERIC
3) 1) identifier
2) $.name
3) attribute
4) name
5) type
6) TAG
7) SEPARATOR
8)
4) 1) identifier
2) $.age
3) attribute
4) age
5) type
6) NUMERIC
5) 1) identifier
2) $.createId
3) attribute
4) createId
5) type
6) NUMERIC
6) 1) identifier
2) $.createName
3) attribute
4) createName
5) type
6) TAG
7) SEPARATOR
8)
7) 1) identifier
2) $.createTime
3) attribute
4) createTime
5) type
6) TAG
7) SEPARATOR
8)
9) num_docs
10) "2000000"
11) max_doc_id
12) "2000000"
13) num_terms
14) "0"
15) num_records
16) "14000000"
17) inverted_sz_mb
18) "31.286308288574219"
19) vector_index_sz_mb
20) "0"
21) total_inverted_index_blocks
22) "8118141"
23) offset_vectors_sz_mb
24) "0"
25) doc_table_size_mb
26) "147.71356201171875"
27) sortable_values_size_mb
28) "0"
29) key_table_size_mb
30) "55.313194274902344"
31) records_per_doc_avg
32) "7"
33) bytes_per_record_avg
34) "2.3432908058166504"
35) offsets_per_term_avg
36) "0"
37) offset_bits_per_record_avg
38) "-nan"
39) hash_indexing_failures
40) "0"
41) total_indexing_time
42) "53497.718999999997"
43) indexing
44) "0"
45) percent_indexed
46) "1"
47) number_of_uses
48) (integer) 20
49) gc_stats
50) 1) bytes_collected
2) "0"
3) total_ms_run
4) "0"
5) total_cycles
6) "0"
7) average_cycle_time_ms
8) "-nan"
9) last_run_time_ms
10) "0"
11) gc_numeric_trees_missed
12) "0"
13) gc_blocks_denied
14) "0"
51) cursor_stats
52) 1) global_idle
2) (integer) 0
3) global_total
4) (integer) 0
5) index_capacity
6) (integer) 128
7) index_total
8) (integer) 0
53) dialect_stats
54) 1) "dialect_1"
2) (integer) 1
3) "dialect_2"
4) (integer) 0
5) "dialect_3"
6) (integer) 0
I’ve filed an issue on GitHub for this problem: https://github.com/RediSearch/RediSearch/issues/3805