SXLAR WASTELAND OFFICIAL WEBSITE


Overview

Data Hoarding (sometimes referred to as digital hoarding) is the practice of collecting, storing, and archiving digital assets and media, usually for the purpose of preservation. While the common adage "what you post online is permanent" rings true 99% of the time, lost media absolutely does exist, and sometimes things just disappear. The obvious source of this is small or niche communities revolving around a certain artist, musician, show, or other piece of obscure media, but this isn't the only data that falls victim to disappearance. Moist Cr1tikal has talked many times about the original YouTube videos he made when he first started his channel and how he'd love to have those videos. And yet despite years of searching, as well as the help of his community, which is millions deep, his first videos remain lost media.

Because of this, many people have become what are known as "Data Hoarders," archiving and indexing massive amounts of data on the off chance that the media disappears one day. The Internet Archive is probably the largest and best-known example of data hoarding. However, individuals and their own personal libraries are still immeasurably valuable, both as a source to add to the Internet Archive and as a decentralized backup for it, as the Internet Archive has faced copyright threats many times over its lifespan.

Getting Started

Getting started with "archiving all the media you like" can definitely seem like a daunting task, so here's a simple way to begin: a script you can set up to automatically download YouTube videos from a playlist. You can link either your liked videos playlist or a custom "backup" playlist, and then on a monthly or weekly basis you can run the script and it will back up any videos from that playlist that haven't been downloaded already. Keep in mind that yt-dlp can only read a private playlist (which your liked videos playlist normally is) if you give it your browser cookies, so a public or unlisted "backup" playlist is the easier option.

To get started, go to the yt-dlp page on GitHub and download the latest exe file. Then open your command prompt (press Start and type in cmd), navigate to the folder where you saved yt-dlp.exe (or add that folder to your PATH), and type "yt-dlp --version" to verify the file is installed correctly. Next, open a text editor like Notepad or Notepad++ and paste the following script:

@echo off
:: Set the output template: the download directory plus yt-dlp's filename fields
:: (change M:\YT to your preferred directory; %% is how a batch file writes a literal %)
set "DOWNLOAD_DIR=M:\YT\%%(channel)s\%%(title)s.%%(ext)s"
:: Set the playlist URL (replace the placeholder with your playlist link)
set "PLAYLIST_URL=https://www.youtube.com/yourplaylistlinkhere"
:: Run yt-dlp to download new videos (ffmpeg must be installed for yt-dlp to merge the separate video and audio streams)
yt-dlp.exe -f "bestvideo+bestaudio/best" -o "%DOWNLOAD_DIR%" --download-archive archive.txt "%PLAYLIST_URL%"
:: Exit the script
exit

This is what's known as a batch file: a script used to run commands in a specific order in Windows. All of the lines that start with "::" are comments that explain what the line of code below them does. To save it, set the file type to "All files" (or select batch file if you're using Notepad++) and save it with any name ending in .bat (you don't need to add this in Notepad++, as selecting batch file sets the extension automatically). Save it in the same folder as yt-dlp.exe so the script can find it. This gives you a file that looks like "youtubeDownloader.bat"
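The "only back up what's new" behaviour comes from the --download-archive archive.txt option in the script. yt-dlp records each finished download in that file as one line made of the site name and the video ID (for example "youtube" followed by the ID), and skips anything already listed. The Python sketch below only illustrates that idea; it is not yt-dlp's actual code, and the function names and video IDs are made up for the example.

```python
def load_archive(text):
    """Turn the contents of an archive file into a set of 'site id' entries."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def videos_to_download(playlist_ids, archive, site="youtube"):
    """Return the playlist's video IDs that are not in the archive yet."""
    return [vid for vid in playlist_ids if f"{site} {vid}" not in archive]


if __name__ == "__main__":
    # Hypothetical archive contents and playlist, for illustration only.
    archive = load_archive("youtube aaa111\nyoutube bbb222\n")
    playlist = ["aaa111", "ccc333", "bbb222"]
    print(videos_to_download(playlist, archive))
```

This is also why you shouldn't delete archive.txt: without it, the next run has no record of what was already downloaded.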

FAQ


>> How Can I store that much data?

You have two general types of storage at your disposal, Physical Storage and Cloud Storage, and best practice recommends using both to guarantee the safety of your backups. Generally, people go by what is known as the 3-2-1 Rule: have at least 3 copies of your data, 2 of which are local (ideally on separate devices) and 1 of which is off-site. For a home setup this would likely be your main storage, a backup of your main storage, and then a second backup either at a second location or in the cloud.

The idea behind this is that if something happens to the physical location itself (e.g. a flood or fire), the on-site backup of your data is likely to be damaged or destroyed along with the original, so you still have the off-site copy to restore from.
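If you want to sanity-check your own setup against the 3-2-1 Rule, the check is simple enough to write down. This is a minimal Python sketch, assuming that the two local copies should sit on separate devices; the BackupCopy class and the device labels are hypothetical names used only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    device: str    # hypothetical label, e.g. "main drive"
    offsite: bool  # True if the copy lives at another location or in the cloud


def satisfies_3_2_1(copies):
    """At least 3 copies, 2 local ones on separate devices, 1 off-site."""
    local_devices = {c.device for c in copies if not c.offsite}
    has_offsite = any(c.offsite for c in copies)
    return len(copies) >= 3 and len(local_devices) >= 2 and has_offsite


if __name__ == "__main__":
    home_setup = [
        BackupCopy("main drive", offsite=False),
        BackupCopy("backup drive", offsite=False),
        BackupCopy("cloud", offsite=True),
    ]
    print(satisfies_3_2_1(home_setup))
```

Two copies on the same drive only count as one local device here, since a single drive failure would take out both.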


>> What Physical Storage Should I use?

In terms of physical storage you have 2 options as well: Hard Drives, more commonly referred to as HDDs, and Solid State Drives, or SSDs. While SSDs are certainly important pieces of equipment, when it comes to the large amounts of storage seen with datahoarding, HDDs will almost always be the preferred choice.

The biggest reason for this is price. While you don't need a full 500TB setup to start datahoarding, you'll find that storage costs add up pretty quickly. SSDs are great for data you load frequently or need to load quickly (especially when it comes to things like gaming or video editing), but for pure storage they offer very little benefit, especially when you take their increased cost into consideration.

Once you're storing terabytes of data, you can start looking into setting up a proper NAS, or Network Attached Storage, which is the most common way for individuals to store massive amounts of data locally. But you don't have to worry about that if you're just starting out, and if you're already ready for it, you can read more about it here.


>> What Cloud Storage Should I use?

Google Workspace (formerly G Suite) costs $10 a month for the unlimited plan (it's labeled as being 1TB; however, at the time of writing, this limit is still not enforced), with the only limits being around 700GB per day uploading and around 10TB per day downloading. This is by far the best choice, as it still offers support for Rclone, a very popular command-line file manager that is specialized for cloud services.
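The daily upload limit matters more than it looks once your library gets large. As a rough Python sketch, assuming the approximate 700GB per day figure above and counting 1TB as 1,000GB, you can estimate how many days a first full upload would take. The function name is made up for the example, and the estimate ignores your own internet speed.

```python
import math


def days_to_upload(total_gb, daily_limit_gb=700):
    """Estimate the days needed to upload total_gb under a daily upload cap."""
    if total_gb < 0 or daily_limit_gb <= 0:
        raise ValueError("sizes must be non-negative and the limit positive")
    return math.ceil(total_gb / daily_limit_gb)


if __name__ == "__main__":
    # A 10TB library, counted as 10,000GB.
    print(days_to_upload(10_000))  # prints 15
```

So a 10TB library needs a little over two weeks to reach the cloud, which is worth planning for before you rely on it as your off-site copy.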