youtube-dl

Commit Graph

Author	SHA1	Message	Date
dirkf	a25e9f3c84	[compat] Use `compat_open()`	2023-07-25 13:19:43 +01:00
dirkf	b2ba24bb02	[InfoExtractor] Add `_match_valid_url()` class method and refactor * API compatible with yt-dlp * also support Sequence of patterns in _VALID_URL * one place to compile _VALID_URL * TODO: remove existing extractor shims	2023-07-19 22:14:50 +01:00
dirkf	b2741f2654	[InfoExtractor] Add search methods for Next/Nuxt.js from yt-dlp * add _search_nextjs_data(), from https://github.com/yt-dlp/yt-dlp/pull/1386 thanks selfisekai * add _search_nuxt_data(), from https://github.com/yt-dlp/yt-dlp/pull/1921, thanks Lesmiscore, pukkandan * add tests for the above * also fix HTML5 type recognition and tests, from `222a230871`, thanks Lesmiscore * update extractors in PR using above, fix tests.	2023-07-19 22:14:50 +01:00
dirkf	1e8ccdd2eb	[InfoExtractor] Support groups in `_search_regex()`, etc	2023-07-19 22:14:50 +01:00
dirkf	42b098dd79	[InfoExtractor] Handle unquoted values in OpenGraph searches	2023-02-14 02:53:16 +00:00
dirkf	604762a9f8	[common:jwplayer] Improve jwplayer extraction and parsing (#31000) * don't crash parser if jwplayer_data is invalid (empty, or no formats) * use `label` in `sources[n]` as `format_id` * relax `jwplayer().setup(...)` RE (also rework PR #27274 enhancement) * detect more manifest formats in _parse_jwplayer_formats() (from PR #29596) * improve metadata extraction (from PR #25433) * remember URLs in a set * use parse_resolution() in format * extract filesize in format (from yt-dlp) Co-authored-by: kikuyan <kikuyan@users.noreply.github.com> Co-authored-by: martin54 <martin54@users.noreply.github.com>	2022-11-11 00:49:13 +00:00
dirkf	11b284c81f	[Common:JWPlayer] Fix x1000 scaling error See https://github.com/yt-dlp/yt-dlp/issues/5106#issuecomment-1264625161	2022-10-11 12:36:44 +00:00
Sergey M․	70d0d4f9be	[compat] Use more conventional name for compat SimpleCookie	2021-04-06 14:22:28 +07:00
Remita Amine	162bf9e10a	[compat] add compat_SimpleCookie	2021-04-04 19:49:24 +01:00
Remita Amine	6beb1ac65b	[extractor/common] keep support for non standard JSON-LD VideoObject author values	2021-04-04 19:16:17 +01:00
Remita Amine	e165f5641f	[extractor/common] fix JSON-LD VideoObject author extraction	2021-04-04 16:28:26 +01:00
Remita Amine	1df2596f81	[extractor/common] fix _get_cookies method for python 2(#20673, #23256, #20326, closes #28640)	2021-04-03 07:54:16 +01:00
Sergey M․	477bff6906	Introduce release_timestamp meta field (refs #28386)	2021-03-10 03:36:31 +07:00
Remita Amine	67299f23d8	[youtube] Rewrite Extractor - improve format sorting - remove unused code(swf parsing, ...) - fix series metadata extraction - fix trailer video extraction - improve error reporting - extract video location	2021-02-01 14:53:01 +01:00
Remita Amine	22feed08a1	[common] remove unwanted query params from unsigned akamai manifest URLs	2020-12-19 20:14:44 +01:00
Sergey M․	1727541315	[extractor/common] Improve JSON-LD interaction statistic extraction (refs #23306)	2020-12-13 20:24:13 +07:00
Sergey M․	eae19a4473	[extractor/common] Document duration meta field for playlists	2020-12-13 16:53:23 +07:00
Sergey M․	5a1fbbf8b7	[extractor/common] Fix inline HTML5 media tags processing and add test (closes #27345)	2020-12-09 00:05:21 +07:00
Sergey M․	91dd25fe1e	[extractor/common] Add support for dl8-* media tags (closes #27283)	2020-12-07 01:08:22 +07:00
Sergey M․	06bf2ac20f	[extractor/common] Eliminate media tag name regex duplication	2020-12-07 00:56:29 +07:00
Sergey M․	6ad0d8781e	[extractor/common] Fix media type extraction for HTML5 media tags in start/end form	2020-12-07 00:45:16 +07:00
Remita Amine	da4304609d	[extractor/commons] improve Akamai HTTP formats extraction	2020-12-03 00:33:55 +01:00
Remita Amine	664dd8ba85	[extractor/common] improve Akamai HTTP format extraction - Allow m3u8 manifest without an additional audio format - Fix extraction for qualities starting with a number Solution provided by @nixxo based on: https://stackoverflow.com/a/5984688	2020-12-02 21:49:09 +01:00
Remita Amine	193422e12a	[extractor/common] add generic support for akamai http format extraction	2020-11-22 12:54:55 +01:00
Josh Soref	71ddc222ad	Fix typos (#27084) * spelling: authorization Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: brightcove Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: creation Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: exceeded Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: exception Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: extension Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: extracting Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: extraction Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: frontline Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: improve Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: length Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: listsubtitles Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: multimedia Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: obfuscated Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: partitioning Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: playlist Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: playlists Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: restriction Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: services Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: split Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: srmediathek Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: support Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: thumbnail Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: verification Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: whitespaces Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>	2020-11-21 22:00:05 +07:00
Sergey M․	c7178f0f7a	[extractor/common] Output error for invalid URLs in _is_valid_url (refs #21400, refs #24151, refs #25617, refs #25618, refs #25586, refs #26068, refs #27072)	2020-11-18 23:31:35 +07:00
Sergey M․	ce5b904050	[extractor/common] Relax interaction count extraction in _json_ld	2020-09-19 06:33:17 +07:00
Sergey M․	ad06b99dd4	[extractor/common] Extract author as uploader for VideoObject in _json_ld	2020-09-19 06:13:42 +07:00
Sergey M․	f8c7bed133	[extractor/common] Handle ssl.CertificateError in _request_webpage (closes #26601) ssl.CertificateError is raised on some python versions <= 3.7.x	2020-09-18 03:41:16 +07:00
Sergey M․	6c22cee673	[extractor/common] Use compat_cookiejar_Cookie for _set_cookie (closes #23256, closes #24776) To always ensure cookie name and value are bytestrings on python 2.	2020-05-05 06:00:37 +07:00
Sergey M․	4433bb0245	[extractor/common] Extract multiple JSON-LD entries	2020-05-02 23:40:30 +07:00
Sergey M․	13b08034b5	[extractor/common] Skip malformed ISM manifest XMLs while extracting ISM formats (#24667)	2020-04-07 22:55:59 +07:00
Sergey M․	7947a1f7db	Remove no longer needed compat_str around geturl	2020-02-29 19:19:24 +07:00
Sergey M․	e2f8bf5888	[extractor/common] Convert ISM manifest to unicode before processing on python 2 (#24152)	2020-02-29 17:29:30 +07:00
Remita Amine	5ef62fc4ce	[dailymotion] improve extraction - extract http formats included in m3u8 manifest - fix user extraction(closes #3553)(closes #21415) - add suport for User Authentication(closes #11491) - fix password protected videos extraction(closes #23176) - respect age limit option and family filter cookie value(closes #18437) - handle video url playlist query param - report alowed countries for geo-restricted videos	2019-11-26 22:18:21 +01:00
Sergey M․	7360c06fac	[extractor/common] Add data, headers and query to all major extract methods preserving standard order for potential future use	2019-11-16 05:55:54 +07:00
Remita Amine	f81dd65ba2	[extractor/common] clean jwplayer description HTML tags	2019-11-09 13:11:59 +01:00
Remita Amine	3ec86619e3	[common] initialize headers param with empty dict	2019-11-06 07:18:29 +01:00
Remita Amine	57033e35e5	[common] fix typo	2019-11-05 23:41:57 +01:00
Remita Amine	b6139cb0c3	[common] pass headers to _extract_(m3u8|mpd)_formats methods	2019-11-05 22:56:25 +01:00
Sergey M․	25e911a968	[extractor/common] Make _is_valid_url more relaxed	2019-10-03 00:53:07 +07:00
Petr Vaněk	5e1c39ac85	[extractor/common] Fix typo in thumbnails resolution description (#21817)	2019-07-17 22:47:53 +07:00
Sergey M․	f856816b94	[extractor/common] Strip src attribute for HTML5 entries code (closes #18485, closes #21169)	2019-05-23 23:52:11 +07:00
Sergey M․	ce2fe4c01c	[extractor/common] Add doc string for _apply_first_set_cookie_header	2019-05-20 23:23:18 +07:00
Sergey M․	e3c1266f49	[extractor/common] Move workaround for applying first Set-Cookie header into a separate method	2019-05-18 03:17:15 +07:00
Sergey M․	8ed7a23328	[extractor/common] Fix typo	2019-05-11 04:53:48 +07:00
Sergey M․	3089bc748c	Fix W504 and disable W503 (closes #20863)	2019-05-11 03:57:40 +07:00
Remita Amine	c25720ef6a	[vimeo] add support live streams and improve info extraction(closes #19144)	2019-04-21 17:20:52 +01:00
Sergey M․	d493f15c11	[extractor/common] Improve HTML5 entries extraction and add some realworld tests	2019-03-17 09:09:32 +07:00
Sergey M․	79d2077edc	[extractor/common] Fix url meta field for unfragmented DASH formats (closes #20346)	2019-03-15 00:42:14 +07:00

590 Commits