Commit Graph

285 Commits

Author SHA1 Message Date
sebdelsol f5b47dbdd5 forgot truncate 2022-04-04 13:22:28 +02:00
sebdelsol 8084b3f9c4 some headless evasions 2022-04-04 13:20:25 +02:00
sebdelsol 843173827a fix find_chrome_executable() on x86 Windows 2022-03-18 16:11:55 +01:00
sebdelsol 0bf986ee8b fix corrupt prefs when the file already exists 2022-03-18 15:41:53 +01:00
UltrafunkAmsterdam b2e804e977 quickfixes 2022-03-16 23:08:43 +01:00
Leon 174554c600
3.1.5r3
fixes fixes and more fixes, and a big thank you to @sebdelsol
2022-03-16 22:39:32 +01:00
sebdelsol 4879698118 - fix unlinking driver at exit
- speedup exit process
- fix creation of driver in multithreaded scenario
- experimental_option now supports "nested" strings (e.g. options.add_experimental_option("prefs", {"profile.default_content_setting_values.images": 2}))

Author:    sebdelsol <seb.morin@gmail.com>
Author:    UltrafunkAmsterdam
2022-03-16 22:24:05 +01:00
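One way to read the "nested" string support mentioned in the commit above is that a dotted key such as "profile.default_content_setting_values.images" is expanded into nested dictionaries before the prefs are written. The sketch below illustrates that idea only; the helper names (undot_key, merge_nested, expand_prefs) are hypothetical and are not taken from the library's source.

```python
def undot_key(key, value):
    """Turn ("a.b.c", v) into {"a": {"b": {"c": v}}}."""
    if "." in key:
        head, rest = key.split(".", 1)
        return {head: undot_key(rest, value)}
    return {key: value}


def merge_nested(target, source):
    """Recursively merge source into target, keeping existing sibling keys."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            merge_nested(target[key], value)
        else:
            target[key] = value
    return target


def expand_prefs(prefs):
    """Expand every dotted key of a flat prefs dict into nested dicts."""
    expanded = {}
    for key, value in prefs.items():
        merge_nested(expanded, undot_key(key, value))
    return expanded


# Hypothetical input: two dotted prefs that share a common prefix.
flat_prefs = {
    "profile.default_content_setting_values.images": 2,
    "profile.default_content_setting_values.popups": 1,
}
nested = expand_prefs(flat_prefs)
```

Merging rather than overwriting matters here: two dotted keys with the same prefix should end up side by side in the same inner dictionary.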
sebdelsol 5c467b31eb fix unlinking at exit and fix driver creation file handling for multithread
Author:    UltrafunkAmsterdam<ultrafunkamsterdam@users.noreply.github.com>
Author:    sebdelsol <seb.morin@gmail.com>
2022-03-16 21:56:17 +01:00
Leon fdd8e3c705
Merge pull request #543 from ultrafunkamsterdam/3.1.5
3.1.5
2022-03-14 00:40:55 +01:00
Leon 5c0d2e4cb8
Update __init__.py 2022-03-14 00:37:12 +01:00
UltrafunkAmsterdam fa007b1742 added quic test cloudflare script for windows 2022-03-14 00:23:09 +01:00
UltrafunkAmsterdam a448fc685d Merge branch '3.1.5' of http://github.com/ultrafunkamsterdam/undetected-chromedriver into 3.1.5 2022-03-14 00:21:26 +01:00
UltrafunkAmsterdam bf1cf1bc14 changed how the patcher works (for those using multiple sessions/processes).
when not specifying an executable_path (the default, and recommended!), the filename
    gets randomized to <somehex>_chromedriver[.exe]. this should fix the issue for multiprocessing
    (although Chrome/driver itself has restrictions here as well, which you can see using processhacker).
    As I said before, webdriver is a purely io-based operation which only sends and pulls data, so multiprocessing/threading isn't going to help much. You'd be better off using asyncio.

    added google-chrome-stable to the list, as some distros use this name.

 Chrome(advanced_elements): bool, optional, default: False

        makes it easier to recognize elements like you know them from html/browser inspection, especially when working in an interactive environment

        default webelement repr:
        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>

        advanced webelement repr
        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

    note: when retrieving large numbers of elements (example: find_elements_by_tag_name("*")) and **printing** them, it takes a little more time to fetch all the reprs

**driver_executable_path=None**
also known as executable_path
    if you really need to specify your own chromedriver binary.
    (don't log issues when you are not using the default. the downloading per session happens for a reason. remember this is a detection-focussed fork)

**browser_executable_path=None**
        ( = browser binary path )
    to specify your browser in case you use exotic locations instead of the more default install folders

 **advanced_elements=False**
        if set to True, webelements get a nicer REPR showing. this is very convenient when working
        interactively (like ipython for example).

        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

        instead of

        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2",element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>
2022-03-14 00:20:35 +01:00
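The randomized driver filename described in the commit above (<somehex>_chromedriver[.exe]) can be sketched as follows. This is a minimal illustration, not the patcher's actual code: the function name random_driver_name and the choice of 8 random bytes are assumptions.

```python
import secrets
import sys


def random_driver_name(platform=sys.platform):
    """Build a name of the form <somehex>_chromedriver[.exe].

    A fresh random prefix per session means two processes never try
    to patch or delete the same binary on disk.
    """
    prefix = secrets.token_hex(8)  # 8 random bytes -> 16 hex characters
    suffix = ".exe" if platform.startswith("win") else ""
    return f"{prefix}_chromedriver{suffix}"


windows_name = random_driver_name("win32")
linux_name = random_driver_name("linux")
```

Passing the platform as a parameter keeps the sketch easy to exercise on any operating system.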
UltrafunkAmsterdam 2da29ae4f7 changed how the patcher works (for those using multiple sessions/processes).
when not specifying an executable_path (the default, and recommended!), the filename
    gets randomized to <somehex>_chromedriver[.exe]. this should fix the issue for multiprocessing
    (although Chrome/driver itself has restrictions here as well, which you can see using processhacker).
    As I said before, webdriver is a purely io-based operation which only sends and pulls data, so multiprocessing/threading isn't going to help much. You'd be better off using asyncio.

    added google-chrome-stable to the list, as some distros use this name.

 Chrome(advanced_elements): bool, optional, default: False

        makes it easier to recognize elements like you know them from html/browser inspection, especially when working in an interactive environment

        default webelement repr:
        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>

        advanced webelement repr
        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

    note: when retrieving large numbers of elements (example: find_elements_by_tag_name("*")) and **printing** them, it takes a little more time to fetch all the reprs

**driver_executable_path=None**
also known as executable_path
    if you really need to specify your own chromedriver binary.
    (don't log issues when you are not using the default. the downloading per session happens for a reason. remember this is a detection-focussed fork)

**browser_executable_path=None**
        ( = browser binary path )
    to specify your browser in case you use exotic locations instead of the more default install folders

 **advanced_elements=False**
        if set to True, webelements get a nicer REPR showing. this is very convenient when working
        interactively (like ipython for example).

        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

        instead of

        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2",element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>
2022-03-14 00:06:59 +01:00
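The "advanced" repr described in the commit above boils down to overriding __repr__ so that an element prints as its opening HTML tag. The class below is a hypothetical stand-in (FakeElement is not a Selenium or undetected-chromedriver class) that only reproduces the output format shown in the commit message; sorting attributes alphabetically is an assumption made to get a stable result.

```python
class FakeElement:
    """Hypothetical stand-in for a web element with a tag and attributes."""

    def __init__(self, tag, attrs):
        self.tag = tag
        self.attrs = attrs

    def __repr__(self):
        # Render attributes as name="value" pairs in alphabetical order.
        attr_text = " ".join(
            f'{name}="{value}"' for name, value in sorted(self.attrs.items())
        )
        opening = f"<{self.tag} {attr_text}>" if attr_text else f"<{self.tag}>"
        return f"<WebElement({opening})>"


element = FakeElement(
    "a",
    {
        "id": "main-cat-switcher-mobile",
        "href": "#",
        "class": "mobile-show-inline-block mc-update-infos init-ok",
    },
)
text = repr(element)
```

In a real driver each repr has to ask the browser for the element's tag and attributes, which is consistent with the note that printing many elements takes extra time.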
UltrafunkAmsterdam 7c25fff16e 3.1.5r2
changed how the patcher works (for those using multiple sessions/processes).

    when not specifying an executable_path (the default, and recommended!), the filename
    gets randomized to <somehex>_chromedriver[.exe]. this should fix the issue for multiprocessing
    (although Chrome/driver itself has restrictions here as well, which you can see using processhacker).
    As I said before, webdriver is a purely io-based operation which only sends and pulls data, so multiprocessing/threading isn't going to help much. You'd be better off using asyncio.

find_chrome_executable:
    added google-chrome-stable to the list, as some distros use this name.

advanced_webelements:  bool, optional, default: False
        makes it easier to recognize elements like you know them from html/browser inspection, especially when working in an interactive environment

        default webelement repr:
        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>

        advanced webelement repr
        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

    note: when retrieving large numbers of elements (example: find_elements_by_tag_name("*")) and **printing** them, it takes a little more time to fetch all the reprs

Chrome() parameters

    driver_executable_path=None
     ( = executable_path )
    if you really need to specify your own chromedriver binary.

    (don't log issues when you are not using the default. the downloading per session happens for a reason. remember this is a detection-focussed fork)

    browser_executable_path=None
        ( = browser binary path )
    to specify your browser in case you use exotic locations instead of the more default install folders

    advanced_elements=False
        if set to True, webelements get a nicer REPR showing. this is very convenient when working
        interactively (like ipython for example).

        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

        instead of

        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>
2022-03-14 00:05:27 +01:00
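The find_chrome_executable change in the commit above adds google-chrome-stable to the names that are searched. A lookup of that kind might look like the sketch below, which tries each candidate on PATH in order. Only google-chrome-stable comes from the commit message; the other candidate names, the function name find_browser, and the injectable which parameter are assumptions for illustration.

```python
import shutil

# Only "google-chrome-stable" is named in the commit; the rest are assumed.
CANDIDATES = (
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
    "chrome",
)


def find_browser(candidates=CANDIDATES, which=shutil.which):
    """Return the path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


def only_stable(name):
    """Fake lookup: pretend only google-chrome-stable is installed."""
    return "/usr/bin/" + name if name == "google-chrome-stable" else None


found = find_browser(which=only_stable)
missing = find_browser(which=lambda name: None)
```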
UltrafunkAmsterdam 4cf3eb70ac 3.1.5r1
changed how the patcher works (for those using multiple sessions/processes).

    when not specifying an executable_path (the default, and recommended!), the filename
    gets randomized to <somehex>_chromedriver[.exe]. this should fix the issue for multiprocessing
    (although Chrome/driver itself has restrictions here as well, which you can see using processhacker).
    As I said before, webdriver is a purely io-based operation which only sends and pulls data, so multiprocessing/threading isn't going to help much. You'd be better off using asyncio.

find_chrome_executable:
    added google-chrome-stable to the list, as some distros use this name.

advanced_webelements:  bool, optional, default: False
        makes it easier to recognize elements like you know them from html/browser inspection, especially when working in an interactive environment

        default webelement repr:
        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>

        advanced webelement repr
        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

    note: when retrieving large numbers of elements (example: find_elements_by_tag_name("*")) and **printing** them, it takes a little more time to fetch all the reprs

Chrome() parameters

    driver_executable_path=None
     ( = executable_path )
    if you really need to specify your own chromedriver binary.

    (don't log issues when you are not using the default. the downloading per session happens for a reason. remember this is a detection-focussed fork)

    browser_executable_path=None
        ( = browser binary path )
    to specify your browser in case you use exotic locations instead of the more default install folders

    advanced_elements=False
        if set to True, webelements get a nicer REPR showing. this is very convenient when working
        interactively (like ipython for example).

        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

        instead of

        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>
2022-03-13 23:49:25 +01:00
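The commit above argues that driving a browser is mostly waiting on I/O, so asyncio is a better fit than multiprocessing. The sketch below shows the general pattern with asyncio.gather. It does not call any webdriver API: fake_session is a hypothetical coroutine, and asyncio.sleep stands in for the time spent waiting on a driver response.

```python
import asyncio


async def fake_session(session_id, delay):
    """Stand-in for one browser session; the sleep simulates an I/O wait."""
    await asyncio.sleep(delay)
    return f"session {session_id} done"


async def run_sessions(count):
    # All waits overlap on a single thread; gather keeps the input order.
    return await asyncio.gather(
        *(fake_session(i, 0.01) for i in range(count))
    )


results = asyncio.run(run_sessions(3))
```

Because the coroutines only wait, running them concurrently takes roughly as long as the slowest one rather than the sum of all delays.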
UltrafunkAmsterdam b876db7e9a 3.15
changed how the patcher works (for those using multiple sessions/processes).

    when not specifying an executable_path (the default, and recommended!), the filename
    gets randomized to <somehex>_chromedriver[.exe]. this should fix the issue for multiprocessing
    (although Chrome/driver itself has restrictions here as well, which you can see using processhacker).
    As I said before, webdriver is a purely io-based operation which only sends and pulls data, so multiprocessing/threading isn't going to help much. You'd be better off using asyncio.

find_chrome_executable:
    added google-chrome-stable to the list, as some distros use this name.

advanced_webelements:  bool, optional, default: False
        makes it easier to recognize elements like you know them from html/browser inspection, especially when working in an interactive environment

        default webelement repr:
        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>

        advanced webelement repr
        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

    note: when retrieving large numbers of elements (example: find_elements_by_tag_name("*")) and **printing** them, it takes a little more time to fetch all the reprs

Chrome() parameters

    driver_executable_path=None
     ( = executable_path )
    if you really need to specify your own chromedriver binary.

    (don't log issues when you are not using the default. the downloading per session happens for a reason. remember this is a detection-focussed fork)

    browser_executable_path=None
        ( = browser binary path )
    to specify your browser in case you use exotic locations instead of the more default install folders

    advanced_elements=False
        if set to True, webelements get a nicer REPR showing. this is very convenient when working
        interactively (like ipython for example).

        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

        instead of

        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>
2022-03-13 23:47:57 +01:00
UltrafunkAmsterdam a6cf33b0e2 3.15
changed how the patcher works (for those using multiple sessions/processes).

    when not specifying an executable_path (the default, and recommended!), the filename
    gets randomized to <somehex>_chromedriver[.exe]. this should fix the issue for multiprocessing
    (although Chrome/driver itself has restrictions here as well, which you can see using processhacker).
    As I said before, webdriver is a purely io-based operation which only sends and pulls data, so multiprocessing/threading isn't going to help much. You'd be better off using asyncio.

find_chrome_executable:
    added google-chrome-stable to the list, as some distros use this name.

advanced_webelements:  bool, optional, default: False
        makes it easier to recognize elements like you know them from html/browser inspection, especially when working in an interactive environment

        default webelement repr:
        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>

        advanced webelement repr
        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

    note: when retrieving large numbers of elements (example: find_elements_by_tag_name("*")) and **printing** them, it takes a little more time to fetch all the reprs

Chrome() parameters

    driver_executable_path=None
     ( = executable_path )
    if you really need to specify your own chromedriver binary.

    (don't log issues when you are not using the default. the downloading per session happens for a reason. remember this is a detection-focussed fork)

    browser_executable_path=None
        ( = browser binary path )
    to specify your browser in case you use exotic locations instead of the more default install folders

    advanced_elements=False
        if set to True, webelements get a nicer REPR showing. this is very convenient when working
        interactively (like ipython for example).

        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

        instead of

        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>
2022-03-13 23:43:11 +01:00
UltrafunkAmsterdam 087fa8d732 Patcher:
changed how the patcher works (for those using multiple sessions/processes).

    when not specifying an executable_path (the default, and recommended!), the filename
    gets randomized to <somehex>_chromedriver[.exe]. this should fix the issue for multiprocessing
    (although Chrome/driver itself has restrictions here as well, which you can see using processhacker).
    As I said before, webdriver is a purely io-based operation which only sends and pulls data, so multiprocessing/threading isn't going to help much. You'd be better off using asyncio.

find_chrome_executable:
    added google-chrome-stable to the list, as some distros use this name.

 advanced_webelements:  bool, optional, default: False
        makes it easier to recognize elements like you know them from html/browser inspection, especially when working in an interactive environment

        default webelement repr:
        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>

        advanced webelement repr
        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

    note: when retrieving large numbers of elements (example: find_elements_by_tag_name("*")) and **printing** them, it takes a little more time to fetch all the reprs

Chrome() parameters

    driver_executable_path=None
     ( = executable_path )
    if you really need to specify your own chromedriver binary.

    (don't log issues when you are not using the default. the downloading per session happens for a reason. remember this is a detection-focussed fork)

    browser_executable_path=None
        ( = browser binary path )
    to specify your browser in case you use exotic locations instead of the more default install folders

    advanced_elements=False
        if set to True, webelements get a nicer REPR showing. this is very convenient when working
        interactively (like ipython for example).

        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

        instead of

        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>
2022-03-13 23:22:41 +01:00
UltrafunkAmsterdam 2710213a7e Patcher:
changed how the patcher works (for those using multiple sessions/processes).

    when not specifying an executable_path (the default, and recommended!), the filename
    gets randomized to <somehex>_chromedriver[.exe]. this should fix the issue for multiprocessing
    (although Chrome/driver itself has restrictions here as well, which you can see using processhacker).
    As I said before, webdriver is a purely io-based operation which only sends and pulls data, so multiprocessing/threading isn't going to help much. You'd be better off using asyncio.

find_chrome_executable:
    added google-chrome-stable to the list, as some distros use this name.

 advanced_webelements:  bool, optional, default: False
        makes it easier to recognize elements like you know them from html/browser inspection, especially when working in an interactive environment

        default webelement repr:
        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>

        advanced webelement repr
        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

    note: when retrieving large numbers of elements (example: find_elements_by_tag_name("*")) and **printing** them, it takes a little more time to fetch all the reprs

Chrome() parameters

    driver_executable_path=None
     ( = executable_path )
    if you really need to specify your own chromedriver binary.

    (don't log issues when you are not using the default. the downloading per session happens for a reason. remember this is a detection-focussed fork)

    browser_executable_path=None
        ( = browser binary path )
    to specify your browser in case you use exotic locations instead of the more default install folders

    advanced_elements=False
        if set to True, webelements get a nicer REPR showing. this is very convenient when working
        interactively (like ipython for example).

        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

        instead of

        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>
2022-03-13 23:05:22 +01:00
Leon b13d94e08a
[win] quick and clean test to check cloudflare bypass
QUICK TEST FOR UNDETECTED-CHROMEDRIVER TO CHECK IF CLOUDFLARE IAUAM CAN BE PASSED
   
To make it as clean as possible without interfering packages or plugins:
     - this creates a new python virtual environment
     - installs undetected chromedriver
     - executes a test
     - cleans up the virtual environment
2022-03-13 20:04:26 +01:00
Leon a4cc4a8b72
Update setup.py 2021-12-28 16:36:15 +01:00
Leon 8dfad76703
Merge pull request #323 from MattWaller/master
Update README.md
2021-12-26 12:34:15 +00:00
Leon 1044a7b767
Merge pull request #420 from brandfocus/master
Fix for Newlines are not allowed in setuptools Python 3.9.9
2021-12-26 12:32:35 +00:00
brandfocus 8568c8946d Fix for Newlines are not allowed in setuptools Python 3.9.9 2021-12-26 11:33:37 +00:00
Leon 33aa8c3905
Merge pull request #418 from ultrafunkamsterdam/bugfix
bugfix
2021-12-24 14:52:53 +00:00
Leon 1245e160c5
Update __init__.py 2021-12-24 14:51:15 +00:00
Leon 3bf4cdf7a9 3.1.2 - some 'bug' fixes 2021-12-24 15:39:16 +01:00
admin f9e9e77218 bug fix 2021-12-23 18:23:25 +01:00
Leon f93426478c
Update README.md 2021-12-23 17:51:30 +01:00
Leon f7306fad3e
bugfix spawn 2021-12-23 17:10:04 +01:00
Leon 403b491e76
Update setup.py 2021-12-22 15:15:55 +01:00
Leon 6fbd50ef21
Update setup.py 2021-12-22 15:12:54 +01:00
Leon 77509e6da2
3.1.0 2021-12-22 13:18:47 +00:00
UltrafunkAmsterdam 154f7fcdb3 3.1.0! 2021-12-22 14:07:27 +00:00
Leon abac314741
removed executable_path in favor of browser_executable_path
This makes it easier, in the edge cases where it is needed, to specify your browser executable.
2021-12-21 16:42:09 +00:00
Leon 8a3870bd6d
removed "delay" from constructor, added user_data_dir
simplifies specifying a custom user_data_dir by passing it directly to the constructor.
however, if a user_data_dir is specified in the options object, the one in options takes precedence.
2021-12-21 16:31:04 +00:00
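The precedence rule in the commit above (a user_data_dir set in the options object wins over the one passed to the constructor) can be sketched as below. The function name resolve_user_data_dir is hypothetical, and detecting the options value by scanning for a --user-data-dir= argument is an assumption about how such a check could work, not the library's confirmed implementation.

```python
def resolve_user_data_dir(option_args, constructor_dir=None):
    """Pick the effective profile directory.

    option_args: list of Chrome command-line arguments from the options object.
    constructor_dir: value passed directly to the constructor, if any.
    """
    prefix = "--user-data-dir="
    for arg in option_args:
        if arg.startswith(prefix):
            # The options object takes precedence over the constructor value.
            return arg[len(prefix):]
    return constructor_dir


from_options = resolve_user_data_dir(["--user-data-dir=/tmp/a"], "/tmp/b")
from_constructor = resolve_user_data_dir(["--headless"], "/tmp/b")
unset = resolve_user_data_dir([])
```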
Leon e62ccc68b1
Update README.md 2021-12-16 06:13:31 +01:00
UltrafunkAmsterdam b60820a600 3.1.0rc1
-----------
  this version is for test purposes only and contains breaking changes
  -----------

  - v2 is now the "main/default" module. usage:

        import undetected_chromedriver as uc
        driver = uc.Chrome()
        driver.get('https://nowsecure.nl')

  - The above is the README for this version. or use the regular instructions, but
    skip the `with` black magic and skip references to v2.
  - v1 moved to _compat for now.
  - fixed wrong dependencies
  - ~~~~ added "new" anti-detection mechanic ~~~~

  - the above ^^ makes all recent changes and additions obsolete
  - Removed ChromeOptions black magic to fix compatibility issues

  - restored .get() to (near) original.
       - most changes from 3.0.4 to 3.0.6 are obsolete, as t
       - no `with` statements needed anymore, although it will still
         work for the sake of backward-compatibility.
       - no sleeps, stop-start-sessions, delays, or async cdp black magic!
       - this will solve a lot of other "issues" as well.
  - test success to date: 100%
  - just to mention it another time, since some people have a hard time reading:
    headless is still WIP. Raising issues is needless
2021-12-16 05:53:41 +01:00
Leon c1d02484d9
Update __init__.py 2021-11-29 14:40:10 +01:00
Leon a84c53f3d5
Update cdp.py 2021-11-29 14:39:21 +01:00
Leon 9e41928375
Update dprocess.py 2021-11-17 09:38:01 +01:00
Leon e7a2908e4c
Merge pull request #357 from ultrafunkamsterdam/3.0.4
3.0.4
2021-11-16 18:47:17 +01:00
UltrafunkAmsterdam ec49c0086b 3.0.4 2021-11-16 18:43:52 +01:00
UltrafunkAmsterdam 77a3c3020f ^ 2021-11-14 13:06:34 +01:00
UltrafunkAmsterdam 9f9bd66d79 ^ 2021-11-14 12:32:22 +01:00
Matthew Waller 2ba10800c7
Update README.md
fix enable_cdp_event to enable_cdp_events -- typo causes script to fail.
2021-10-01 00:04:32 -07:00
Leon 1e363b18be
Update README.md 2021-07-30 01:47:49 +02:00
Leon 9ad1bb3e0b
3.0.3 https://github.com/ultrafunkamsterdam/undetected-chromedriver/pull/255
read https://github.com/ultrafunkamsterdam/undetected-chromedriver/pull/255
2021-07-30 01:31:15 +02:00
UltrafunkAmsterdam 97288266bc 3.0.3 - fixed a bug where the driver hangs for a long time on quit - passing executable_path explicitly now causes chromedriver to not redownload, since some people have issues downloading 3mb but expect to build next-gen scrapers 2021-07-30 00:23:13 +02:00