Saturday, November 12, 2016

Why NoSQL (vs SQL)

NoSQL databases that enable storing unstructured and heterogeneous data at scale have gained in popularity. For most developers, relational databases are the default or go-to option because a table structure is easy to understand and is familiar, but there are many reasons to explore beyond relational databases. E.g. Increased need to process higher volumes, velocities, and varieties of data at a rapid rate has altered the nature of data storage needs for application developers.
NoSQL is a category of databases distinctly different from SQL databases. NoSQL is often used to refer to data management systems that are “Not SQL” or an approach to data management that includes “Not only SQL". There are a number of technologies in the NoSQL category, including document databases, key value stores, column family stores, and graph databases, which are popular with gaming, social, and IoT apps.










When to use NOSQL:
Let's imagine you're building a new social engagement site. Users can create posts and add pictures, videos and music to them. Other users can comment on the posts and give points (likes) to rate the posts. The landing page will have a feed of posts that users can share and interact with.
So how do you store this data? If you're familiar with SQL, you might start drawing something like this:




Here you see to s show the post and the associated images, audio, video, comments, points, and user info on a website or application, you'd have to perform a query with eight table joins just to retrieve the content. Now imagine a stream of posts that dynamically load and appear on the screen and you can easily predict that it's going to require thousands of queries and many joins to complete the task.

In such situation NoSQL option that simplifies the approach for this specific scenario. By using a single document like the following and storing it in DocumentDB, an Azure NoSQL document database service, you can increase performance and retrieve the whole post with one query and no joins. It's a simpler, more straightforward, and more performant result.
{
    "id":"ew12-res2-234e-544f",
    "title":"post title",
    "date":"2016-01-01",
    "body":"this is an awesome post stored on NoSQL",
    "createdBy":User,
    "images":["http://myfirstimage.png","http://mysecondimage.png"],
    "videos":[
        {"url":"http://myfirstvideo.mp4", "title":"The first video"},
        {"url":"http://mysecondvideo.mp4", "title":"The second video"}
    ],
    "audios":[
        {"url":"http://myfirstaudio.mp3", "title":"The first audio"},
        {"url":"http://mysecondaudio.mp3", "title":"The second audio"}
    ]
}
In addition, this data can be partitioned by post id allowing the data to scale out naturally and take advantage of NoSQL scale characteristics. Also NoSQL systems allow developers to loosen consistency and offer highly available apps with low-latency. Finally, this solution does not require developers to define, manage and maintain schema in the data tier allowing for rapid iteration.

Note: In case you still advocate SQL, you could use a relational solution like SQL Server to store the data and query it using joins, as SQL supports dynamic data formatted as JSON.

The following table compares the main differences between NoSQL and SQL.























Hope this helps.

Few links: http://highscalability.com/blog/2010/12/21/sql-nosql-yes.html

Arun Manglick

1 comment: