Setting up my home NAS

This article outlines my process for setting up a home NAS (Network Attached Storage) using a Raspberry Pi 4 (8GB RAM) with an external 4TB hard drive, along with many of the things I learnt along the way. I have split it into smaller sections so that, while some may become irrelevant someday, others will hopefully stay relevant for longer.

I will keep adding to and updating sections of this blog as I learn more, since this is more of a brain dump of what I learnt than a definitive guide on how to set up a NAS.

Topics discussed:

How to choose the setup
OMV vs Raspberry Pi OS with Samba
Setting up OMV (Docker and Portainer too)
Testing speeds
Backing up files
Extras I have installed

Choosing the setup

The setup I went with was:

Raspberry Pi 4 (8GB RAM) running Open Media Vault
4TB Seagate IronWolf (NAS drive)
USB3 external 3.5in HDD enclosure
Gigabit AC router
Backup 1TB USB hard drive

The three most important choices for me while setting up a NAS:

Data transfer bottleneck:
This is the slowest point in the chain. No SSD or Thunderbolt connection will speed up your NAS if the slowest thing in the chain is bogging you down. (For me it was the gigabit port)
CPU and RAM:
As the number of hosted applications on the NAS increases and you have more concurrent and CPU intensive tasks running, such as a Plex server transcoding 4K movies, the CPU and RAM will throttle performance. (This is where an entry level out of the box NAS solution failed)
Redundancy:
No NAS is complete without data redundancy. Not all redundancy is the same and you may not need to back up everything redundantly. (In most cases, the data you absolutely cannot lose is far smaller than the total size of the NAS. Just back that up and you should be good)
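
As a rough illustration of the bottleneck idea, the usable speed of the chain is simply the slowest link in it. A minimal sketch, where slowest_link is a hypothetical helper name and the arguments are nominal link speeds in Mbps:

```shell
#!/bin/sh
# Print the smallest of the given link speeds (whole numbers, all in Mbps).
slowest_link() {
    min=$1
    for speed in "$@"; do
        if [ "$speed" -lt "$min" ]; then
            min=$speed
        fi
    done
    echo "$min"
}
```

For example, slowest_link 5000 1000 1000 (USB 3.0, the Pi's Ethernet port, the router) prints 1000, so on a gigabit network the drive interface is never the limit.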

Reasons for picking each of these:

Raspberry Pi 4

Prior Raspberry Pis had slower I/O. With the RPi 4 we get:

1.5GHz CPU
Up to 8GB RAM
USB 3.0 (5Gbps)
Gigabit Ethernet

This is more powerful than an entry level Synology NAS, with our only bottleneck being the Gigabit Ethernet. All for a low price of $75 ($35 for the 2GB RAM model).

The Pi also supported Open Media Vault (an open source NAS software) and all major NAS applications. It was well suited for a home server.

4TB Seagate NAS drive + Enclosure

The options here were an off the shelf external HDD, an off the shelf external SSD, or an internal HDD with an enclosure. The SSD vs HDD argument was easy for 2 reasons:

SSD Electron Decay
The real bottleneck is still going to be the Gigabit Ethernet

The internal drive choice was less clear, but I went with an internal drive for better flexibility in terms of drive choice (choosing a NAS drive instead of a desktop drive for around the same price). At the time of writing, this and the enclosure cost me $120.

Gigabit AC router

At the time of writing, without the setup getting too expensive, the fastest and most reasonable communication port was a Gigabit Ethernet port. Every device I had in the chain supported it and the transfer speeds were reasonable for almost all my use cases. With this in mind, I just bought the cheapest gigabit router on Amazon.

Total cost of setup

RPI4: $75
Case: $25 (Yes I bought the fanciest one cause why not)
Router: $45
HDD: $100
Enclosure: $25

Total: $270

If you already have a gigabit router, get the cheaper Pi 4 (2GB RAM) and buy a cheap $3 case. The total is then ~$160, cheaper than an entry level Synology NAS (Synology DS120j) with far more flexibility and power.

OMV vs Pi with Samba

This was a big decision given that my use case for a home server was very simple: a NAS that is easily accessible from any device on the network. The benefit of NAS software like Open Media Vault was not immediately obvious to me. Since giving it a go, here are my lessons:

OMV (and similar NAS software) makes setting up a NAS far easier. Once I had OMV set up on the Pi, the next steps were just clicking buttons, with a very small learning curve.
You realize that there are a lot of auxiliary services that you never knew you needed and probably should have. Things such as:
- FTP, RSync
- A Simple Web GUI
- Access control and monitoring
- Disk management (Mounting, Formatting, Sharing, Disk Allocation etc.)
- Update management
- RAID (not something I used but still)
Samba is not the easiest/best way to file share (yeah, sometimes you just have to RSync)

I took a leap of faith with this one but I am so glad I did. OMV with Docker apps is a beast I did not expect, and it makes this process actually enjoyable. I would not be doing justice to this section if I didn't give credit to Techno Dad Life (a YouTube channel for all things OMV related). It helped me out more than I like to admit.

Setup Open Media Vault

I preferred to keep my OMV setup separate from my base Raspberry Pi OS image so that I can still use my Pi for other projects I may have. To do this, you can just install OMV as an application on a headless version of RPi OS. This process was fairly simple so I won't spend too much time here. The tutorial I followed:

Installing OpenMediaVault to a Raspberry Pi
https://pimylifeup.com/raspberry-pi-openmediavault/

Setting up Docker and Portainer

OMV 5 is very useful once we have Docker and Portainer installed. Since we have a headless system, Docker and Portainer let us add almost any Docker app to OMV and manage our NAS completely through a web GUI. To install them we need OMV Extras.

Once again, the instructions are fairly simple and can be found here: https://omv-extras.org/

Testing speeds

Before you start using the NAS, you may want to make sure that you are getting the speeds you expect from it. Even if you set everything up as intended, you may not get the promised speeds because of a silly miss here or there, so testing will help avoid annoyances in the future. The two major bottlenecks to test are:

The network speed between the client device and the NAS
The NAS disk read write speed

Based on my calculations, my disk speed should have been 5Gbps (limited by USB 3.0) and the network speed 1Gbps (gigabit port). What I found when testing the first time: 30Mbps. The reason: my computer was using the 2.4GHz Wi-Fi connection instead of Gigabit Ethernet or the 5GHz network.

After a few more similar debugging steps, I got the speed up to the theoretical max bandwidth of 1Gbps. Moral of the story: always test your hypothesis.
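
Network tools usually report bits per second while disk tools report bytes per second, so it helps to convert before comparing the two. A small sketch of that conversion; the helper names are made up, and it assumes decimal prefixes and 8 bits per byte:

```shell
#!/bin/sh
# Convert megabits per second to megabytes per second (8 bits per byte).
mbps_to_mbytes() {
    awk -v mbps="$1" 'BEGIN { printf "%.2f\n", mbps / 8 }'
}

# Convert megabytes per second back to megabits per second.
mbytes_to_mbps() {
    awk -v mbytes="$1" 'BEGIN { printf "%.0f\n", mbytes * 8 }'
}
```

With these, mbps_to_mbytes 1000 prints 125.00 (the gigabit ceiling), mbps_to_mbytes 5000 prints 625.00 (USB 3.0), and mbps_to_mbytes 30 prints 3.75, which is why the first test felt so slow.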

Testing the network

Here I used iperf, a simple command line tool to measure network throughput. You need it installed on both systems. For me this was a Mac and the RPi.

Mac: brew install iperf
RPi: sudo apt-get install iperf

Once installed, pick one machine as the server and run iperf -s to start up the server. On the other machine, run iperf -c IP-ADDRESS-OF-THE-SERVER. iperf will run and tell you the throughput of your network.

Common pitfalls:

Non-gigabit router
Wi-Fi instead of wired Ethernet
2.4GHz instead of 5GHz
Anything along the chain (For me the last bottleneck was my USB-C hub. Using a non-Thunderbolt cable to connect to the hub reduced the speeds to 30%)
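
One quick way to rule out the first two pitfalls on the Pi is to ask Linux what speed the wired interface actually negotiated. A minimal sketch, assuming the interface is called eth0 (check with ip link); the function names are hypothetical, and the optional second argument exists only so the functions can be pointed at a test directory:

```shell
#!/bin/sh
# Print the negotiated speed of a wired interface in Mbps, as reported by Linux
# in /sys/class/net/IFACE/speed.
link_speed_mbps() {
    iface=${1:-eth0}
    sysfs=${2:-/sys/class/net}
    cat "$sysfs/$iface/speed"
}

# Succeed only if the interface negotiated at least 1000 Mbps.
is_gigabit() {
    speed=$(link_speed_mbps "$@") || return 1
    [ "$speed" -ge 1000 ]
}
```

If is_gigabit eth0 fails, the usual suspects are the cable, the router port, or the link having silently dropped to 100Mbps.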

Testing drive

Here the easiest method I found was to use dd.

dd is a powerful tool and has the potential to wipe your disk. Be very careful before running it.

With that warning out of the way, this is how we will use dd:

To test write speed we use

dd if=/dev/zero of=/path/to/disk/test.img bs=1G count=1 oflag=dsync

When it finishes, dd prints the number of bytes copied, the time taken and the resulting write speed.

Breaking down the command:

dd: The copy and convert command (said to be called dd because cc was already in use by the C compiler)
if: Input file. Here we read from /dev/zero, which supplies an endless stream of zero bytes, to create dummy data. This will not change since we are only testing write speed and do not care what we write.
of: Output file. This is the dummy file we write to. The most important thing here is the path of the file. I did the test after partially filling my NAS so writing directly into the mounted system was not an option. I instead chose to write a dummy file called test.img into a folder in the mounted drive. This may decrease your speed but will be a more accurate representation of the real world write speeds.
bs=1G: The block size, i.e. the number of bytes read and written at a time. Here I wanted to test with 1 Gigabyte. Make sure you have at least this much free RAM on the system before running the command.
count: Limits the number of blocks to copy. This in combination with bs ensures that no more than 1GB of data is written.
oflag: With dsync, output is written using synchronized I/O, which prevents the cache from being used and giving false results.
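
Given how destructive a mistyped of= can be, the write test can be wrapped in a small function that only ever creates a new scratch file inside an existing folder. This is a sketch rather than the exact command used above: it assumes GNU dd (as shipped with Raspberry Pi OS), the function and scratch file names are made up, and it swaps bs=1G with oflag=dsync for 1MB blocks with conv=fdatasync, which flushes to disk once at the end and does not need 1GB of free RAM.

```shell
#!/bin/sh
# Hypothetical helper: write SIZE_MB of zeros into a scratch file inside DIR,
# let dd print its summary, then delete the scratch file.
disk_write_test() {
    dir=$1
    size_mb=${2:-1024}
    target="$dir/nas-speedtest.$$.img"   # scratch file name is an assumption

    # Never write under /dev, where a typo could hit a raw disk.
    case "$dir" in
        /dev|/dev/*) echo "refusing to write under /dev" >&2; return 1 ;;
    esac
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    if [ -e "$target" ]; then
        echo "already exists: $target" >&2
        return 1
    fi

    # conv=fdatasync makes dd flush the data to disk before it reports timing.
    dd if=/dev/zero of="$target" bs=1M count="$size_mb" conv=fdatasync
    status=$?
    rm -f "$target"
    return "$status"
}
```

Calling disk_write_test /path/to/disk 1024 then writes and removes a 1GB file in that folder, so there is nothing left to clean up afterwards.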

Now to test read speed

We will use the temporary file that we just wrote to test read speeds.

dd if=/path/to/disk/test.img of=/dev/null bs=1G count=1 iflag=nocache

As before, dd prints the bytes copied, the time taken and the speed. In my case, the read speed was significantly faster than the write.

Now since this is not a direct write to the drive, we do have some I/O overhead. However, this is much closer to our theoretical 5Gbps bottleneck than before.

Clean up temporary file

Make sure you delete the temporary file we created to test our disk speed. Simply run rm /path/to/disk/test.img.

With this we have tested and validated our speeds and identified our bottlenecks. In most cases, this will be the network.

Backing up

This has been the most contentious section for me and I still don't have a definitive answer for it. Here are some of your options with their pros and cons:

SMB/CIFS Share:

Pros	Cons
Shows up as a normal folder in file explorer	Copies are slower, especially with multiple smaller files
	If one file in the copy fails, the whole copy fails (Hard to synchronise)

…

FTP

Pros	Cons
Syncing is easier since each file is copied separately and one failure does not fail the whole sync.	Not as easy to view files as SMB/CIFS
The failed copy can also be retried easily	Still not ideal for many smaller files

…

RSync

Pros	Cons
Fastest option for syncing, can handle small and large files with ease	No good GUI, must know how to use the CLI tool first
Only syncs the diff, so stopping and restarting the sync can be done without worries	Very powerful, especially with the --delete option. Can wipe out data accidentally.
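
Because --delete removes anything on the destination that is missing from the source, it is worth previewing a sync before running it for real. A minimal sketch, assuming rsync is installed on the machine running it; mirror_dir and its apply argument are hypothetical names:

```shell
#!/bin/sh
# Hypothetical wrapper: preview a mirroring sync, and only apply it when asked.
mirror_dir() {
    src=${1%/}/      # trailing slash: copy the contents of SRC, not SRC itself
    dest=$2
    mode=${3:-preview}

    if [ "$mode" = "apply" ]; then
        rsync -a --delete "$src" "$dest"
    else
        # --dry-run lists what would be copied or deleted without touching DEST.
        rsync -a --delete --dry-run --itemize-changes "$src" "$dest"
    fi
}
```

Running mirror_dir SRC DEST only prints the planned changes; adding apply as a third argument performs them.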

…

Beyond Compare

Pros	Cons
Extremely easy to visualize the diff between files and sync partials	Pro features are paid. This is less of an issue because most features are free and the license is very reasonable for the price.
Granular control over syncs	Same as FTP and SMB, copying small files and looking up their diff takes time

…

Other considerations

Borg Backup: This is a tool that's very useful for incremental backups, but since I need the files on my server to be accessible independently, it was left out of the evaluation.
Syncthing: Realtime syncing. This was overkill for my use case and required something to always be running on the devices, so I did not go for it.

So what did I end up going with? A combination of each, depending on the use case. The best part of a custom NAS solution like this is that you are not limited to any one solution. For example:

To browse, view and do one-off file copies I use SMB
To sync photos and documents I use Beyond Compare (I need fine grained control over what goes where)
To move a lot of data, rsync
And FTP when rsync is too cumbersome to use

Note: One way to deal with a lot of small files while backing up archival data is to simply bundle them into a single file. We can do this using tar like:

# To bundle
tar -cf TAR_FILE_NAME.tar FOLDER_NAME
# To bundle and compress as well
tar -czf TAR_FILE_NAME.tar.gz FOLDER_NAME

# To expand
tar -xf TAR_FILE_NAME.tar

# To individually tar each folder in a parent folder
for i in */
do
  i=${i%/}
  # remove the echo when you are sure the output is what you want
  echo tar -czf "$i.tar.gz" "$i"
done
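
Before deleting the original small files, it can be reassuring to confirm that the archive reads back cleanly. A sketch building on the tar commands above; bundle_folder is a hypothetical helper name, and the .tar.gz naming is just a convention:

```shell
#!/bin/sh
# Hypothetical helper: archive one folder next to itself and check that the
# archive is readable. Prints the archive path on success.
bundle_folder() {
    folder=${1%/}
    archive="$folder.tar.gz"

    if [ ! -d "$folder" ]; then
        echo "not a directory: $folder" >&2
        return 1
    fi
    # -C switches to the parent directory so the archive stores relative paths.
    tar -czf "$archive" -C "$(dirname "$folder")" "$(basename "$folder")" || return 1
    # Listing (-t) reads the whole archive; tar fails if it is corrupt.
    tar -tzf "$archive" >/dev/null || return 1
    echo "$archive"
}
```

Note that listing only proves the archive can be read, not that every file matches the original, so keep the source around until a backup copy exists.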

Extra Packages

Docker: To allow containerised applications (e.g. Plex) to run using NAS data without interfering with each other
Portainer: To manage Docker containers using the web UI. It is itself a container
Plex: To stream local media content
Heimdall: A start screen to access all the apps on the server
NetData: To monitor RPi system stats
Code Server: VS Code in the browser, running on the NAS
More coming soon….