File, Object or Block: What's the Difference?
Storage is something everyone has, uses and wants to shrink the cost of.
There is almost nothing out there that can start an internal war like a storage discussion.
An easy way to start one is the classic religious war over vendors: whether you like NetApp, EMC, IBM, HPE, Fujitsu, Huawei or any other brand out there. This is always a good way to create an internal discussion.
Another topic that starts a war is what kind of interface you are going to use. Should it be direct-attached storage (DAS) or network-attached storage (NAS)? The internal dialogs do not stop there, because at the same time you are debating whether to use Object Storage, File Storage or Block Storage.
In this blog we are not going to push for any vendor or technology, but we aim to give you an overview of the difference between Object, File and Block Storage and when you should use what.
What is File Storage?
File Storage is probably the most common type of storage and the easiest to explain, because it works much like the ordinary desk you probably have in front of you right now.
File Storage, also called file-level or file-based storage, is something you can browse through to view your files in an easy way.
Imagine you have a pile of documents in front of you. You can put those documents into standard manila folders and put the folders in your drawer. Now, when you want to get hold of a specific document, you go to your drawer, open that folder and get hold of your document. That is a file, and you can very easily find it by opening that folder.
File Storage also secures your data by controlling who is granted access, using ACLs (Access Control Lists). It also keeps a limited amount of information about each file, such as:
- When the file was created
- When it was last modified
- Who has access to the file
- When it was last backed up
- And more
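To make that list concrete, here is a minimal sketch of reading the information a file system keeps about a file. It uses Python's standard `os.stat`; the folder and file names are made up for the example, and it shows basic permission bits rather than a full ACL. Creation time is left out because how it is reported differs between operating systems.

```python
import os
import stat
import tempfile
from datetime import datetime, timezone


def describe_file(path):
    """Return a small dict of the metadata the file system keeps for `path`."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "last_modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        # Basic permission bits (e.g. "-rw-r--r--"), not a full ACL.
        "permissions": stat.filemode(info.st_mode),
    }


# Create a throwaway "drawer", "folder" and document so the example is self-contained.
with tempfile.TemporaryDirectory() as drawer:
    folder = os.path.join(drawer, "invoices")  # hypothetical folder name
    os.makedirs(folder)
    document = os.path.join(folder, "2024-01.txt")  # hypothetical file name
    with open(document, "w") as handle:
        handle.write("hello")
    details = describe_file(document)
    print(details)
```

Notice that you find the document by walking a path (drawer, folder, file), which is exactly the manila-folder model described above.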
The problem is that if you have hundreds of documents in each folder, and thousands of folders located in multiple drawers, it can be very hard to find anything in the end.
Another drawback of file-level storage is that a file can end up in the wrong drawer, so it gets delivered to you very slowly.
But the good thing about file-based storage is that it can be (depending on vendor) a very scalable solution, where you can scale out with more servers or scale up by expanding existing storage with more disks.
Basically, it is a very simple, structured way to view your files.
What is Block Storage?
Block Storage chops your data into blocks and writes them down to a physical disk.
When you view the blocks, you have no idea what kind of data they hold. Each block has its own unique identifier, which is used to match the blocks together again so you can read your data back.
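As a rough illustration of that idea, the sketch below chops a piece of data into fixed-size blocks and stores each one under a unique identifier. Everything here is an assumption made for the example: the 4-byte block size is tiny on purpose, the dictionary stands in for a physical disk, and random IDs are used only to show that a block says nothing about its content (real devices typically use numeric block addresses).

```python
import uuid

BLOCK_SIZE = 4  # bytes; tiny on purpose, real systems use far larger blocks


def write_blocks(data, block_size=BLOCK_SIZE):
    """Chop `data` into fixed-size blocks, each stored under a unique identifier."""
    disk = {}       # block id -> raw bytes, stands in for the physical disk
    block_ids = []  # ordered ids, needed to put the data back together
    for offset in range(0, len(data), block_size):
        block_id = uuid.uuid4().hex
        disk[block_id] = data[offset:offset + block_size]
        block_ids.append(block_id)
    return disk, block_ids


def read_blocks(disk, block_ids):
    """Reassemble the original data by following the ordered block ids."""
    return b"".join(disk[block_id] for block_id in block_ids)


disk, block_ids = write_blocks(b"hello block storage")
restored = read_blocks(disk, block_ids)
print(len(block_ids), restored)
```

Looking at a single entry in `disk` tells you nothing about which file or application it belongs to. Only the ordered list of identifiers makes the data readable again.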
Each system gets its own chunk of blocks dedicated to that system. A Linux machine can write its data in its own format, and a Windows machine can write its data in its own format, without having to explain anything to other systems.
Block storage is very often configured so that the blocks are spread out across multiple disks, and those disks form a RAID set to help you perform faster, in terms of both throughput and response time.
This is normally done by using a Storage Area Network (SAN) or some sort of direct-attached storage with multiple disks behind it.
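Spreading blocks over several disks can be sketched as simple round-robin striping, which is the idea behind RAID 0. This is only an illustration under our own assumptions: the function names are made up, the "disks" are plain Python lists, and real RAID levels add parity or mirroring that is not shown here.

```python
def stripe(blocks, disk_count):
    """Distribute blocks round-robin over `disk_count` disks (RAID 0 style)."""
    disks = [[] for _ in range(disk_count)]
    for index, block in enumerate(blocks):
        disks[index % disk_count].append(block)
    return disks


def unstripe(disks, total_blocks):
    """Read the blocks back in their original order."""
    disk_count = len(disks)
    return [disks[i % disk_count][i // disk_count] for i in range(total_blocks)]


blocks = [b"B0", b"B1", b"B2", b"B3", b"B4", b"B5", b"B6"]
disks = stripe(blocks, disk_count=3)
for number, content in enumerate(disks):
    print(f"disk {number}: {content}")
```

Because neighbouring blocks land on different disks, a large read can be served by all disks at once, which is where the speed gain comes from.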
Block storage is a great solution for serving applications that need quick responses and transfer data back and forth all the time, like a database, and for developers who create their own structure.
The downside is that block storage is very simple, so it’s hard to understand what kind of data you have. There is normally no metadata involved to explain what a block holds. Many vendors have tried to create functionality like heatmaps to help you increase performance for your applications.
Block Storage is normally very expensive, because it is moving further away from spinning disks into flash storage like NVMe, and the infrastructure behind it is also a cost driver.
Simply put, block storage is a very basic but extremely fast storage solution, so use it carefully.
What is Object Storage?
Object Storage is probably the newest storage methodology of the three, and is the fastest growing on the market.
Object Storage is a flat structure: there is no folder where you put your data, nor a specific drawer that you choose. It is more like a closet where you just drop off your photo, bag, luggage, shoes or whatever, and it gets stored there. The main difference is that when you drop off your object in this closet, you also attach a Post-it note with information about that object. For the shoes, you write down their color, brand, age, quality, usage, size, owner and much more. This is called metadata. The closet then automatically splits your object into multiple chunks, even duplicates those chunks, and puts each chunk on a different shelf.
So when you ask for an object again, you basically just reach in with your hand and think of the object you want (in practice, an HTTP request to an Application Programming Interface, API), and based on the metadata that’s been written down you get your object in your hand with no more hassle. You don’t need to search and you don’t need to open any folder. Just ask for your object and you get it.
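The closet can be sketched as a small in-memory class. To be clear about the assumptions: `ToyObjectStore`, its parameters and the example key and metadata are all invented for illustration, and a real object store is reached over an HTTP API instead of a Python method call. The sketch only shows the three ideas from the text: a flat key space, metadata attached to the object, and chunks copied onto different shelves.

```python
class ToyObjectStore:
    """In-memory stand-in for an object store: flat keys, metadata, replicated chunks."""

    def __init__(self, shelf_count=3, chunk_size=4, copies=2):
        self.shelves = [{} for _ in range(shelf_count)]  # each shelf: chunk id -> bytes
        self.index = {}  # object key -> {"metadata": ..., "chunks": [...]}
        self.chunk_size = chunk_size
        self.copies = copies  # must not exceed shelf_count

    def put(self, key, data, metadata):
        """Store an object: split it into chunks and copy each chunk to several shelves."""
        chunk_ids = []
        for number, offset in enumerate(range(0, len(data), self.chunk_size)):
            chunk_id = f"{key}#{number}"
            piece = data[offset:offset + self.chunk_size]
            # Put each copy of the chunk on a different shelf.
            for copy in range(self.copies):
                shelf = self.shelves[(number + copy) % len(self.shelves)]
                shelf[chunk_id] = piece
            chunk_ids.append(chunk_id)
        self.index[key] = {"metadata": dict(metadata), "chunks": chunk_ids}

    def get(self, key):
        """Return (data, metadata) for a key; no folders, no searching."""
        entry = self.index[key]
        pieces = []
        for chunk_id in entry["chunks"]:
            # Any shelf that still holds a copy of the chunk will do.
            pieces.append(next(s[chunk_id] for s in self.shelves if chunk_id in s))
        return b"".join(pieces), entry["metadata"]


store = ToyObjectStore()
store.put("shoes-001", b"running shoes", {"color": "red", "size": 42})
data, metadata = store.get("shoes-001")
print(data, metadata)
```

Since every chunk lives on two shelves, emptying one shelf (`store.shelves[0].clear()`) still leaves the object readable, which is the point of doubling the chunks.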
Another great thing about Object Storage is that it is almost unlimited in size. If you are filling up your closet, it can easily scale and grow, so in theory your closet is as big as space itself and there is no end.
The downside of Object Storage is that it is accessed over IP using HTTP, the same protocol that you use for browsing websites. That means it can deliver data fairly fast, but not even close to as fast as Block Storage, and response times are normally slow. Applications like a database or a virtual machine should therefore not use Object Storage as their primary target.
Another downside is that you can’t modify a specific part of an object. Instead, you need to download the object, make your modification and then upload the entire object again.
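The download, modify and upload cycle looks roughly like this. The sketch assumes a hypothetical store that only offers whole-object `put_object` and `get_object` calls, simulated here with a dictionary; the function names are ours and not taken from any product.

```python
objects = {}  # flat key -> bytes, stands in for an object store


def put_object(key, data):
    """Store a whole object; there is no call that writes only part of one."""
    objects[key] = bytes(data)


def get_object(key):
    """Fetch a whole object."""
    return objects[key]


def replace_in_object(key, old, new):
    """Change part of an object the only way available: download, edit, re-upload."""
    data = get_object(key)            # 1. download the entire object
    updated = data.replace(old, new)  # 2. modify the local copy
    put_object(key, updated)          # 3. upload the entire object again
    return len(updated)               # bytes uploaded, even for a one-word change


put_object("report.txt", b"status: draft")
uploaded = replace_in_object("report.txt", b"draft", b"final")
print(get_object("report.txt"), uploaded)
```

For a 13-byte object this costs nothing, but the same cycle on a multi-gigabyte video file is why frequently changing data fits block or file storage better.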
But a perfect match for object storage is applications like print queues, video files, file archives, backups, web content and much more.
It is normally built on cheap servers with a lot of internal disks, so the maintenance cost is very low.
Object Storage is perfect for applications that are happy to get their files within a short time, not necessarily instantly, and that want getting them to be simple.
What should I choose?
This is always a tricky question, and Cristie Nordic, PEDAB Group and Load Systems can always help you make the right decision, but here are a few things that work better with each storage solution:
File Storage:
- Traditional file servers, like your home drive.
- Running containers with Docker, Kubernetes, OpenShift and more.
- AI projects where you have millions of files.
Block Storage:
- You have a virtual environment like VMware, Microsoft Hyper-V or KVM.
- You have highly demanding applications like Oracle and IBM DB2 databases.
Object Storage:
- Perfect for video streaming.
- Backup and archive data.
- A great complement for archiving file server data.
- Online file share applications similar to Dropbox, OneDrive or Google Drive.
- AI projects where you need metadata about the objects you want to use.
Both Object Storage and File Storage are a great start for your Machine Learning (ML) and Artificial Intelligence (AI) journey.