What is a Database and a DBMS?
A database is an organized collection of structured information, or data, typically stored electronically in a computer system. A database is usually controlled by a database management system (DBMS). Together, the data and the DBMS, along with the applications that are associated with them, are referred to as a database system, often shortened to just database.
Different Types of Databases (DBMS).
Databases can be classified along several axes: SQL or NoSQL, in-memory or persistent, embedded or server-based.
The most familiar kind is the server database, such as MySQL, PostgreSQL, and MongoDB.
These databases run as a separate process on top of the OS, whether on a VM, on bare metal, or in the cloud. The database listens on a particular port, and clients on the same machine or anywhere else on the network communicate with it through that port. In this architecture the database can live on a separate host from its clients.
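To make the client/server split concrete, here is a minimal sketch of a toy key-value server. It is not how MySQL or PostgreSQL work internally; the line-based SET/GET protocol and the names `KeyValueHandler`, `start_server`, and `send_command` are made up for illustration.

```python
import socket
import socketserver
import threading


class KeyValueHandler(socketserver.StreamRequestHandler):
    """Handles one client connection; one text command per line (hypothetical protocol)."""

    def handle(self):
        store = self.server.store  # dict shared by all connections
        for raw_line in self.rfile:
            parts = raw_line.decode().strip().split(" ", 2)
            if parts[0] == "SET" and len(parts) == 3:
                store[parts[1]] = parts[2]
                reply = "OK"
            elif parts[0] == "GET" and len(parts) == 2:
                reply = store.get(parts[1], "NOT_FOUND")
            else:
                reply = "ERROR"
            self.wfile.write((reply + "\n").encode())


def start_server(host="127.0.0.1", port=0):
    """Start the toy server in a background thread; port 0 lets the OS pick a free port."""
    server = socketserver.ThreadingTCPServer((host, port), KeyValueHandler)
    server.store = {}
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_command(address, command):
    """Client side: connect to the exposed port, send one command, read one reply."""
    with socket.create_connection(address) as conn:
        conn.sendall((command + "\n").encode())
        with conn.makefile() as reader:
            return reader.readline().strip()


if __name__ == "__main__":
    server = start_server()
    address = server.server_address  # (host, port) the server is listening on
    print(send_command(address, "SET sku-1 keyboard"))
    print(send_command(address, "GET sku-1"))
    server.shutdown()
    server.server_close()
```

Every read and write here crosses a socket, even when client and server share a machine. That hop is exactly what an embedded database avoids.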
While server-based databases are the most common, there is another type, the embedded database, which is interesting and very useful.
In this post we are going to discuss:
- What is an embedded database.
- Creating a simple embedded db.
- Popular Embedded Databases.
- Benefits of Embedded Databases.
- Use cases for Embedded Databases.
What is an embedded database.
There are two definitions of embedded databases:
- Databases for embedded systems, such as mobile or consumer devices. These need a small footprint and must provide adequate performance in an environment with limited resources like CPU and memory.
- Databases embedded in applications. The application doesn’t need to communicate with the database through a server, because the database lives in the application itself.
Under both definitions, an embedded database is a set of libraries that provide built-in database functionality without a separate database process running.
Creating a simple embedded db.
Let’s imagine an e-commerce monolith with a products service. Recently you have noticed that some of its queries are expensive. To mitigate the issue, you need to implement some caching on the fly. For some hypothetical reason, you can’t use a centralized cache like Redis.
So you create an in-memory hashmap (call it ProductsCache) as a cache. The first time a query runs, we store the query and its result; subsequent reads return the value from the hashmap.
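# Naive in-memory cache using Python.
from typing import Any, Optional

class ProductsCache:
    def __init__(self) -> None:
        self._cache = {}  # maps query string -> cached result

    def get(self, query: str) -> Optional[Any]:
        # Return the cached result, or None if the query is not in the cache.
        return self._cache.get(query)

    def put(self, query: str, result: Any) -> None:
        self._cache[query] = result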
Our ProductsCache is an example of a simple embedded database. It lives in the same process as the application, we communicate with it directly through method calls, and it cannot be accessed from outside the program. In a sense, it is confined to our program, or embedded in it.
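As a rough sketch of how the products service might use such a cache, the helper below checks the cache first and only runs the expensive query on a miss. The names `cached_query` and `fake_expensive_query` are hypothetical, and the class is repeated so the snippet runs on its own.

```python
from typing import Any, Callable, Optional


class ProductsCache:
    """Same idea as the cache above: a dict living inside the application process."""

    def __init__(self) -> None:
        self._cache = {}

    def get(self, query: str) -> Optional[Any]:
        return self._cache.get(query)

    def put(self, query: str, result: Any) -> None:
        self._cache[query] = result


def cached_query(cache: ProductsCache, query: str, run_query: Callable[[str], Any]) -> Any:
    """Return the cached result if present; otherwise run the query and remember it."""
    result = cache.get(query)
    if result is None:  # cache miss (a stored None also counts as a miss here)
        result = run_query(query)
        cache.put(query, result)
    return result


calls = []


def fake_expensive_query(query: str) -> Any:
    """Hypothetical stand-in for the real, slow database call."""
    calls.append(query)
    return [{"id": 1, "name": "keyboard"}]


cache = ProductsCache()
cached_query(cache, "SELECT * FROM products", fake_expensive_query)
cached_query(cache, "SELECT * FROM products", fake_expensive_query)
print(len(calls))  # 1: the second read was served from the cache
```

The lookup is a plain method call in the same process, with no serialization or network hop involved.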
Note: An embedded database doesn’t need to be in-memory; it can also persist data to disk.
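One way to see persistence without leaving the process is Python’s standard `shelve` module, which stores a dict-like object in a file. The sketch below is an assumption about how the cache could be persisted, not a recommendation; `PersistentProductsCache` and the file name are made up for illustration.

```python
import os
import shelve
import tempfile


class PersistentProductsCache:
    """Dict-like cache whose entries are written to a file on disk via shelve."""

    def __init__(self, path: str) -> None:
        self._path = path

    def get(self, query: str):
        # Opening the shelf on every call keeps the sketch simple, not fast.
        with shelve.open(self._path) as db:
            return db.get(query)

    def put(self, query: str, result) -> None:
        with shelve.open(self._path) as db:
            db[query] = result


cache_path = os.path.join(tempfile.mkdtemp(), "products_cache")
PersistentProductsCache(cache_path).put("SELECT 1", [1])
# A brand-new object (think: the application after a restart) still sees the entry.
print(PersistentProductsCache(cache_path).get("SELECT 1"))  # [1]
```

There is still no server involved: the data lives in a local file and is read and written by library code inside the application.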
Some Popular Embedded Databases.
Our ProductsCache is a good example of a simple embedded db. But production will bring a lot of heat. Sorry to say, but our silly cache won’t survive requirements like persistence, concurrent reads and writes, and a bunch of other things.
Rather than implementing these things ourselves, we should use a time-tested existing solution (don’t reinvent the wheel).
Embedded databases have been around for a long time, and they are interesting for the features and use cases they offer.
Let’s look at some of the popular ones:
SQLite: SQLite is a C-language library that implements a small, fast, self-contained, highly reliable, full-featured SQL database engine. It is built into virtually all mobile phones. It is a relational database and uses SQL to query data.
LevelDB: Although SQLite works well in the majority of situations, it has a notable limitation under write-heavy load. SQLite allows many concurrent readers but only one writer at a time, so workloads with heavy write throughput can become slow.
LevelDB, developed at Google, targets that kind of workload, and a single LevelDB database object can be safely shared by multiple threads within one process.
It is a fast key-value storage library that provides an ordered mapping from string keys to string values.
RocksDB: a fork of LevelDB, developed by Meta (formerly Facebook) and optimized for fast storage such as flash and memory.
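The phrase "ordered mapping" used for LevelDB is worth a small illustration. The sketch below is not LevelDB or RocksDB and does not use their APIs; `OrderedKV` is a toy in-memory class, made up here, that only shows why keeping keys sorted makes prefix and range scans cheap.

```python
import bisect


class OrderedKV:
    """Toy in-memory store that keeps keys sorted, so range scans are cheap."""

    def __init__(self) -> None:
        self._keys = []    # sorted list of keys
        self._values = {}  # key -> value

    def put(self, key: str, value: str) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)  # insert while keeping the list sorted
        self._values[key] = value

    def get(self, key: str):
        return self._values.get(key)

    def scan(self, start: str, stop: str):
        """Yield (key, value) pairs with start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        for key in self._keys[lo:hi]:
            yield key, self._values[key]


store = OrderedKV()
for key in ["product:2", "order:7", "product:1", "product:3"]:
    store.put(key, "value-of-" + key)

# ";" sorts right after ":", so this range covers every key starting with "product:".
print([key for key, _ in store.scan("product:", "product;")])
# ['product:1', 'product:2', 'product:3']
```

A plain hashmap like ProductsCache cannot answer "all keys between A and B" without looking at every key. An ordered store answers it with two binary searches.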
Most of these embedded databases are written in C or C++ (for performance) but have wrappers for most programming languages.
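Python’s standard library, for example, ships the `sqlite3` module as a wrapper around SQLite. The following is a minimal sketch of using it as an embedded relational database; the `products` table, its columns, and the helper names `create_catalog` and `products_cheaper_than` are assumptions made up for this example.

```python
import sqlite3


def create_catalog() -> sqlite3.Connection:
    """Create a small products table inside the current process."""
    # ":memory:" keeps the database in RAM; a file path would persist it to disk.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO products (name, price) VALUES (?, ?)",
        [("keyboard", 49.0), ("mouse", 19.5), ("monitor", 199.0)],
    )
    conn.commit()
    return conn


def products_cheaper_than(conn: sqlite3.Connection, max_price: float) -> list:
    """Return product names below max_price, cheapest first."""
    rows = conn.execute(
        "SELECT name FROM products WHERE price < ? ORDER BY price", (max_price,)
    )
    return [name for (name,) in rows]


conn = create_catalog()
print(products_cheaper_than(conn, 50))  # ['mouse', 'keyboard']
conn.close()
```

No server was started and no port was opened. The SQL engine runs as library code inside the Python process.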
Benefits of Embedded Databases.
Because of their architecture, and because they don’t need a server, embedded databases provide some notable benefits:
High Performance :- Embedded databases have a simple architecture; they don’t need a bulky server module to run. Most communication happens within the same process, so latency is very low and write throughput can be high, which makes these databases very performant for particular tasks.
# Latency Comparisons
L1 cache reference ......................... 0.5 ns
Branch mispredict ............................ 5 ns
L2 cache reference ........................... 7 ns
Mutex lock/unlock ........................... 25 ns
Main memory reference ...................... 100 ns
Compress 1K bytes with Zippy ............. 3,000 ns = 3 µs
Send 2K bytes over 1 Gbps network ....... 20,000 ns = 20 µs
SSD random read ........................ 150,000 ns = 150 µs
Read 1 MB sequentially from memory ..... 250,000 ns = 250 µs
Round trip within same datacenter ...... 500,000 ns = 0.5 ms
Read 1 MB sequentially from SSD* ..... 1,000,000 ns = 1 ms
Disk seek ........................... 10,000,000 ns = 10 ms
Read 1 MB sequentially from disk .... 20,000,000 ns = 20 ms
Send packet CA->Netherlands->CA .... 150,000,000 ns = 150 ms
Credits: https://gist.github.com/hellerbarde/2843375
Low resource consumption :- Embedded databases have a small footprint and can be as small as 1 MB. This can be a game changer in resource-scarce environments like the browser, IoT devices, and mobile phones.
No Administration Overhead :- There is no separate database server to install, configure, or monitor, so embedded databases need little to no administration.
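To put the performance benefit in numbers, the back-of-the-envelope calculation below uses two figures from the latency comparison above. It is a simplification: a real lookup costs more than one memory reference, and a real query more than one round trip. The helper name `total_latency_ms` is made up for this example.

```python
# Figures in nanoseconds, taken from the latency comparison above.
MAIN_MEMORY_REFERENCE_NS = 100
DATACENTER_ROUND_TRIP_NS = 500_000


def total_latency_ms(operations: int, per_operation_ns: int) -> float:
    """Total time in milliseconds for a number of sequential operations."""
    return operations * per_operation_ns / 1_000_000


# How many main-memory references fit into one datacenter round trip.
ratio = DATACENTER_ROUND_TRIP_NS // MAIN_MEMORY_REFERENCE_NS
print(ratio)  # 5000

# 1,000 sequential lookups: in-process memory versus one network round trip each.
print(total_latency_ms(1_000, MAIN_MEMORY_REFERENCE_NS))  # 0.1
print(total_latency_ms(1_000, DATACENTER_ROUND_TRIP_NS))  # 500.0
```

Even as a rough estimate, the gap of several orders of magnitude explains why keeping the database in the application process pays off for latency-sensitive reads.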
Use Cases for Embedded Databases.
Despite being fast and performant, many embedded databases lack features that are common in server databases, such as sharding and replication, and the simpler key-value stores also lack full ACID transactions and secondary indexes (SQLite is a notable exception: it supports both transactions and indexes).
So embedded databases fit particular niche use cases, such as:
- When you need very low latency but don’t need features like indexing or replication, for example a persistent or in-memory cache.
- When storing data on embedded systems or in mobile applications, where it is safe to store data locally.
- Storing local data in browsers using IndexedDB.