some final fixes

2023-03-14 22:42:23 -06:00 · 1abc01ef2f
parent b6f52310a4
commit 1abc01ef2f
2 changed files with 14 additions and 5 deletions
--- a/README.md
+++ b/README.md
@@ -2,20 +2,28 @@

 _Ultra-high quality PDFs from VitalSource._

-This is an automated, all-in-one scraper to convert VitalSource textbooks into PDFs. Features include:
+This is an automated, all-in-one scraper to convert VitalSource textbooks into PDFs with no compromises. Features include:

 - Automated download of pages.
 - Automated OCR.
 - Correct page numbering (including Roman numerals at the beginning).
 - Table of contents creation.
-- No funny stuff. No weird endpoints are used and no hacky scraping is preformed.
-- Almost completly transparent. All actions are ones that a normal user would do.
+- No funny stuff. No weird endpoints and no hacky scraping.
+- Almost completely transparent. All actions are ones that a normal user would do.

-The goal of this project is for this to "just work." There are many other VitalSource scrapers out there that are weird, poorly
+The goal of this project is for it to "just work." There are many other VitalSource scrapers out there that are weird, poorly
 designed, or broken. I designed my scraper to be simple while producing the highest-quality PDF possible.

+**This only works with PDF books!** The URL must look something like this: https://bookshelf.vitalsource.com/reader/books/{isbn}/pageid/{page_id}
+
+**This URL format won't work!** https://bookshelf.vitalsource.com/reader/books/{isbn}/epubcfi/6/22[%3Bvnd.vst.idref%3Dt{author}{isbn}c00_02]!/4
+
+Maybe someday the scraper could be updated to work with more book formats...
+
 ## Install

+This program only works on Linux. You can use WSL on Windows.
+
 ```bash
 sudo apt install ocrmypdf jbig2dec
 pip install -r requirements.txt
--- a/vitalsource2pdf.py
+++ b/vitalsource2pdf.py
@@ -145,7 +145,8 @@ if not args.skip_scrape or args.only_scrape_metadata:
    if not args.only_scrape_metadata:
        _, total_pages = get_num_pages()

-        print('You specified a start page so ignore the very large page count.')
+        if args.start_page > 0:
+            print('You specified a start page so ignore the very large page count.')
        total_pages = 99999999999999999 if args.start_page > 0 else total_pages

        print('Total number of pages:', total_pages)
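
The README change in this commit separates supported `pageid` URLs from unsupported `epubcfi` URLs. A minimal sketch of how that distinction could be checked up front is below. The helper name `is_pdf_book_url` and the regular expression are assumptions for illustration only; nothing in this commit shows such a check in `vitalsource2pdf.py`.

```python
import re

# Hypothetical helper, not part of vitalsource2pdf.py.
# Matches the supported format from the README:
#   https://bookshelf.vitalsource.com/reader/books/{isbn}/pageid/{page_id}
_PAGEID_URL = re.compile(
    r"^https://bookshelf\.vitalsource\.com/reader/books/"
    r"(?P<isbn>[^/]+)/pageid/(?P<page_id>[^/?#]+)"
)


def is_pdf_book_url(url: str) -> bool:
    """Return True if the URL looks like a PDF-book (pageid) reader URL."""
    return _PAGEID_URL.match(url) is not None
```

A script could call this before scraping and exit with a clear message when an `epubcfi` URL is passed, rather than failing partway through.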