Scaling with unused Laptops a need

Thalapathy Krishnamurthy
2 min readNov 21, 2019

--

I have been coding a persistence layer for a while now, may be a year. My need was to store a large number of key-Object pairs on disk. It sounds simple and you may jump out to show the large number of key-value stores out there. For some reason, partly due to techie-arrogance and partly due to the need to have control on this module I did not want to use any open source database. I decided to code that myself. I have done that and rewrote it recently to a better one. I wanted it to perform better on a single node, on my laptop, to be precise. Why is this requirement ? Partly because I do not want to put it out on a cloud server with a higher compute configuration spending a lot of money, partly because even if I end up doing that, I want the code to do well on a single node.

What I find it strange on a single node is that, I have to oscillate between Disk, Memory and CPU. Initially it was saving raw JSON on disk. Later I had stripped a JSON into objects and arrays and saved key-object, key-array kind of thing. Keys can also support a hierarchy to reconstruct the JSON back. But I found I was writing all the keys and attributes which are English words. So I put a dictionary look up to have a code for each English phrase I am writing in the Key or as data. Thinking that this will compress and save time during disk writes. Now I see it takes more time than before. And I can now see the dictionary with a large number of keys and codes and the processing overhead to convert back and forth from English words to Integers by dictionary lookup and reverse dictionary lookup is hitting the memory limits (8 GB) on my laptop and hence making it slower than before. I can reduce my batch work load which results in generating all the key and data and try the same and it should be faster with each iteration but increased number of iterations I guess. I can go that way.

With ever increasing data and pressure on computing, the whole point of solving vertical, horizontal scalability is a very big concern for every App developer who needs an ever expanding compute machine at a reasonable cost. This is a barrier and needs to be solved. While we have distributed computing, distributed memory and so on, I feel something seamless like the way I use my Laptop and write my code and not bother about physical limitations which is all transparently taken care of , a sort of distributed OS that automatically spans across CPU, disk, memory transparently keeping all my programming interfaces like languages and editors is a real need.

Many a time I feel If only I can combine my 3–4 laptops or computers I have invested but not being used much into a seamless computing fabric, it would be excellent to continue scaling without wasting time learning cloud options.

--

--

Thalapathy Krishnamurthy
Thalapathy Krishnamurthy

No responses yet