Automated YouTube Archival

bash python3 youtube

This repository has been archived on 2023-11-11. You can view files and clone it, but cannot push or open issues or pull requests.

Go to file

Cyberes b84710b222 limit filename length, disable yt-dlp updating, resolve config paths		2023-05-06 18:24:47 -06:00
process	limit filename length, disable yt-dlp updating, resolve config paths	2023-05-06 18:24:47 -06:00
ydl	limit filename length, disable yt-dlp updating, resolve config paths	2023-05-06 18:24:47 -06:00
.gitignore	…
Example systemd Service.md	…
LICENSE	…
README.md	…
downloader.py	limit filename length, disable yt-dlp updating, resolve config paths	2023-05-06 18:24:47 -06:00
requirements.txt	…
targets.sample.txt	…
targets.sample.yaml	…

README.md

automated-youtube-dl

Automated YouTube Archival.

A wrapper for youtube-dl used for keeping very large amounts of data from YouTube in sync. It's designed to be simple and easy to use.

I have a single, very large playlist that I add any videos I like to. On my NAS is a service uses this program to download new videos (see [Example systemd Service.md]).

Features

Uses yt-dlp instead of youtube-dl.
Skip videos that are already downloaded which makes checking a playlist for new videos quick because youtube-dl doesn't have to fetch the entire playlist.
Automatically update yt-dlp on launch.
Download the videos in a format suitable for archiving:
- Complex format that balances video quality and file size.
- Embedding of metadata: chapters, thumbnail, english subtitles (automatic too), and YouTube metadata.
Log progress to a file.
Simple display using tqdm.
Limit the size of the downloaded videos.
Parallel downloads.
Daemon mode for running as a system service.

Installation

sudo apt update && sudo apt install ffmpeg atomicparsley phantomjs
pip install -r requirements.txt

Usage

./downloader.py <URL to download or path of a file containing the URLs of the videos to download> <output directory>

To run as a daemon, do:

/usr/bin/python3 /home/user/automated-youtube-dl/downloader.py --daemon --sleep 60 <url> <ouput folder>

--sleep is how many minutes to sleep after completing all downloads.

Folder Structure

Output Directory/
├─ logs/
│  ├─ youtube_dl-<UNIX timestamp>.log
│  ├─ youtube_dl-errors-<UNIX timestamp>.log
├─ download-archive.log
├─ Example Video.mkv

download-archive.log contains the videos that have already been downloaded. You can import videos you've already downloaded by adding their ID to this file.

Videos will be saved using this name format:

[%(id)s] [%(title)s] [%(uploader)s] [%(uploader_id)s]

Arguments

Argument	Flag	Help
`--no-update`	`-n`	Don't update yt-dlp at launch.
`--max-size`		Max allowed size of a video in MB. Default: 1100.
`--rm-cache`	`-r`	Delete the yt-dlp cache on start.
`--threads`		How many download processes to use (threads). Default is how many CPU cores you have. You will want to find a good value that doesn't overload your connection.
`--daemon`	`-d`	Run in daemon mode. Disables progress bars and sleeps for the amount of time specified in `--sleep`.
`--sleep`		How many minutes to sleep when in daemon mode.
`--silent`	`-s`	Don't print any error messages to the console. Errors will still be logged in the log files.
`--ignore-downloaded`	`-i`	Ignore videos that have been already downloaded and let youtube-dl handle everything. Videos will not be re-downloaded, but metadata will be updated.