Size of Redis cache required for a CSV file?

I have a 3 GB CSV file.

How can I estimate how much Redis cache memory is needed to load this CSV, with each CSV line stored as a JSON string?

Approximately how large will the 3 GB CSV be in the Redis cache once converted to JSON?

Even a 1 GB cache on Azure is about $40 just for this, and all I need is one connection, with the cache updated once a week.


The project will not approve $50 just for caching alone.
Might as well put the data in a MySQL database and query it each time.

I did this to set the data from the first 1000 rows of the 3 GB CSV:

import json, redis

# db=1 keeps this test data out of the default database
r = redis.Redis(host='localhost', port=6379, db=1)

# JSON file holding the first 1000 CSV rows, pre-converted
with open('aws-pricing-1000.json', encoding='utf-8-sig') as data_file:
    data = json.load(data_file)

try:
    # store all 1000 rows as a single JSON string under one key
    json_string = json.dumps(data)
    r.set('aws_pricing_1000', json_string)
except Exception as e:
    print(e)
redis-cli info memory | grep 'used_memory.*human';
used_memory_human:4.60M
used_memory_rss_human:8.82M
used_memory_peak_human:7.17M
used_memory_lua_human:37.00K
used_memory_scripts_human:0B
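
As a cross-check, MEMORY USAGE (available since Redis 4.0) reports the footprint of a single key, excluding the server's baseline overhead:

redis-cli -n 1 memory usage aws_pricing_1000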

So on an approximate basis, 1000 lines of CSV stored as JSON in the Redis cache take about 5 MB.

So for the ~4,154,110 lines of the 3 GB CSV (wc -l index.csv), would that mean I need about 20 GB of Redis cache?
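
Sketch of the arithmetic, assuming memory usage scales linearly with row count:

rows = 4_154_110              # wc -l index.csv
mb_per_1000_rows = 5          # measured above via INFO memory
print(rows / 1000 * mb_per_1000_rows / 1024)  # ~20.3 GB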

Paying $300 per month for this cache is out of the question.

First, you should notice that converting CSV to JSON has major overhead, since you now pay for each field name on every value.
I would suggest you try RedisJSON; it might be more efficient in your case.
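
Here is a minimal sketch of that approach with redis-py 4.x, assuming a Redis server with the RedisJSON module loaded (the file name, key pattern, and 1000-row cap are just for illustration):

import csv, redis

r = redis.Redis(host='localhost', port=6379, db=1)

with open('index.csv', newline='', encoding='utf-8-sig') as f:
    reader = csv.DictReader(f)
    for i, row in enumerate(reader):
        if i >= 1000:
            break
        # JSON.SET stores each row as its own JSON document at the root path
        r.json().set(f'aws_pricing:{i}', '$', row)

Whether this actually shrinks the footprint depends on your data, since RedisJSON keeps the parsed document rather than a flat string, so re-run the same INFO memory check on a 1000-row sample to compare.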
