Commit Graph

21 Commits

Author SHA1 Message Date
unknown e55104be8d added more compatibility for multithreading / multiprocessing using Chrome(user_multi_procs=True) => ensure you have at least 1 undetected_chromedriver binary in the roaming appdata/undetected_chromedriver folder 2023-05-09 22:08:53 +02:00
UltrafunkAmsterdam ca5fe635b9 patch referenced before assignment 2023-02-10 19:19:10 +01:00
UltrafunkAmsterdam 5b636cb768 3.4.5
damn versioning

    **patch to fix headless mode**
    currently headless undetected:
    https://i.imgur.com/CME9ElR.png

    (but still unsupported!)

    -https://stackoverflow.com/a/73840130/7058266
    -https://support.google.com/chrome/a/answer/7679408#hdlssMod110

    thanks @mdmintz for this info
2023-02-08 17:48:52 +01:00
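
The headless patch above points at the linked notes on Chrome's newer headless mode. A minimal sketch of picking the flag by browser major version follows. The cut-over at version 109 is an assumption drawn from the linked discussion, not something this commit states, and `headless_argument` and `new_mode_from` are hypothetical names used only for illustration.

```python
def headless_argument(chrome_major_version: int, new_mode_from: int = 109) -> str:
    """Return a headless command-line flag for the given Chrome major version.

    new_mode_from is an assumed cut-over version, not a value taken from the commit.
    """
    if chrome_major_version >= new_mode_from:
        # Newer headless implementation.
        return "--headless=new"
    # Flag used by earlier releases that shipped the reworked headless mode.
    return "--headless=chrome"


if __name__ == "__main__":
    for version in (100, 110):
        print(version, headless_argument(version))
```

The returned string could then be passed to `options.add_argument(...)`; headless operation is still described as unsupported in the commit itself.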
UltrafunkAmsterdam 6e471aaac2 PATCH WEDNESDAY
3.4.5
damn versioning

    **patch to fix headless mode**

    -https://stackoverflow.com/a/73840130/7058266
    -https://support.google.com/chrome/a/answer/7679408#hdlssMod110

    thanks @mdmintz for this info
2023-02-08 17:32:16 +01:00
UltrafunkAmsterdam 93adbcf0ef 4.4.5
**patch to fix headless mode**

-https://stackoverflow.com/a/73840130/7058266
-https://support.google.com/chrome/a/answer/7679408#hdlssMod110
thanks @mdmintz for the info
2023-02-08 17:08:44 +01:00
UltrafunkAmsterdam d3fe33fceb 3.4.4 - Fixed 2023-02-08 01:27:50 +01:00
UltrafunkAmsterdam 305803ca95 fix for linux find_elements: SyntaxError: missing ) after argument list 2023-02-05 18:37:28 +01:00
UltrafunkAmsterdam 166438cde2 fix for #1035 2023-02-05 15:14:39 +01:00
UltrafunkAmsterdam c12fcfc0a8 Oh yes, do wanna rockin' with the best
--------------------

https://youtu.be/kMjhrh_XDWk?t=48

--------------------

Big update! be careful as it -potentially- could break your code.

- rewrote the anti-detection mechanism
  instead of removing and renaming variables, we just keep them, but prevent them from being injected in the first place

- rewrote the file naming, to prevent ending up with thousands of {randomstring}_chromedriver.exe files
  instead it is just called undetected_chromedriver.exe

- cleanup
  removed compat, v2 files and tests folder
2023-02-04 22:02:46 +01:00
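
One way to read "prevent them from being injected in the first place" is to overwrite the code block that sets the marker variables with same-length filler, so the binary keeps its size. The sketch below works on a made-up byte string, not a real driver binary; the regular expression, the sample bytes and the name `neutralize_injected_block` are all assumptions for illustration.

```python
import re

# Made-up stand-in for a slice of a driver binary; real binaries differ.
SAMPLE = b"prefix;{window.cdc_abc123 = 1;}suffix"


def neutralize_injected_block(content: bytes) -> bytes:
    """Overwrite the first '{window.cdc...;}' block with same-length filler."""
    match = re.search(rb"\{window\.cdc.*?;\}", content)
    if match is None:
        # Nothing to patch; return the input unchanged.
        return content
    target = match.group(0)
    # Pad with spaces so every byte offset after the block stays the same.
    filler = b"{}".ljust(len(target), b" ")
    return content.replace(target, filler, 1)


if __name__ == "__main__":
    patched = neutralize_injected_block(SAMPLE)
    print(patched)
    print(len(patched) == len(SAMPLE))
```

Keeping the length constant matters because changing the size of a compiled binary would shift everything that follows the patched block.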
UltrafunkAmsterdam 3bce88f82c 3.2.0 2022-12-28 15:46:52 +01:00
UltrafunkAmsterdam 07abe814a6 more refactoring; fix bug that browser stays opened when script exits 2022-12-26 01:48:01 +01:00
Leon 5df8e00a5a
Merge pull request #643
Set a specific data_path for Patcher if the environment is AWS Lambda
2022-12-17 19:02:16 +00:00
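
A minimal sketch of choosing a data path when running on AWS Lambda, where `/tmp` is the writable location. It assumes the environment is detected through the `LAMBDA_TASK_ROOT` variable set by the Lambda runtime; the function name `default_data_path`, the folder name and the non-Lambda fallback location are hypothetical and not taken from the pull request.

```python
import os
from pathlib import Path


def default_data_path(env=None) -> Path:
    """Return a folder for the driver binary, with a special case for AWS Lambda."""
    env = os.environ if env is None else env
    if "LAMBDA_TASK_ROOT" in env:
        # Assumed detection: the Lambda runtime sets this variable.
        return Path("/tmp") / "undetected_chromedriver"
    # Hypothetical fallback for other environments.
    return Path.home() / ".local" / "share" / "undetected_chromedriver"


if __name__ == "__main__":
    print(default_data_path({"LAMBDA_TASK_ROOT": "/var/task"}))
    print(default_data_path({}))
```

Passing the environment as a parameter keeps the function easy to check without actually running inside Lambda.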
UltrafunkAmsterdam 6adeb2d285 optimized import maze 2022-11-28 23:47:38 +01:00
AktanKasymaliev 444d9e4aba path for AWS Lambda 2022-05-23 23:07:24 +06:00
sebdelsol 5c467b31eb fix unlinking at exit and fix driver creation file handling for multithread
Author:    UltrafunkAmsterdam<ultrafunkamsterdam@users.noreply.github.com>
Author:    sebdelsol <seb.morin@gmail.com>
2022-03-16 21:56:17 +01:00
UltrafunkAmsterdam b876db7e9a 3.15
changed how the patcher works (for those using multiple sessions/processes).

    when not specifying an executable_path (the default, and recommended!), the filename
    gets randomized to <somehex>_chromedriver[.exe]. this should fix the issue for multiprocessing
    (although Chrome/driver itself has restrictions in this as well; you can see them using processhacker).
    As I said before, webdriver is a purely IO-based operation which only sends and receives data, so multiprocessing/threading isn't going to help much. You'd be better off using asyncio.
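
The randomized naming described above could be sketched as follows. This is an illustration only: the length of the hex prefix and the function name `random_driver_filename` are assumptions, and the standard `secrets` module is used here as one convenient source of random hex.

```python
import secrets
import sys


def random_driver_filename(platform: str = sys.platform) -> str:
    """Build a name of the form <somehex>_chromedriver[.exe]."""
    # Windows binaries carry the .exe suffix; other platforms do not.
    suffix = ".exe" if platform.startswith("win") else ""
    # 4 random bytes give an 8-character hex prefix (assumed length).
    return f"{secrets.token_hex(4)}_chromedriver{suffix}"


if __name__ == "__main__":
    print(random_driver_filename("win32"))
    print(random_driver_filename("linux"))
```

Because each process gets its own file name, two processes patching a driver at the same time no longer write to the same path.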

find_chrome_executable:
    added google-chrome-stable to the list, as some distros use this name.

advanced_webelements:  bool, optional, default: False
        makes it easier to recognize elements like you know them from html/browser inspection, especially when working in an interactive environment

        default webelement repr:
        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>

        advanced webelement repr
        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

    note: when retrieving a large number of elements ( example: find_elements_by_tag("*") ) and **printing** them, it takes a little more time to fetch all the reprs
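
To illustrate the idea of an HTML-like repr, here is a small sketch using a stand-in class. `FakeElement` is hypothetical and is not selenium's WebElement; it only shows how tag and attributes can be rendered in the style of the "advanced" repr quoted above.

```python
class FakeElement:
    """Stand-in for a web element; not selenium's WebElement."""

    def __init__(self, tag: str, attrs: dict):
        self.tag = tag
        self.attrs = attrs

    def __repr__(self) -> str:
        # Render attributes as key="value" pairs, in insertion order.
        attrs = " ".join(f'{key}="{value}"' for key, value in self.attrs.items())
        strattrs = f" {attrs}" if attrs else ""
        return f"<WebElement(<{self.tag}{strattrs}>)>"


if __name__ == "__main__":
    print(repr(FakeElement("a", {"href": "#", "id": "main-cat-switcher-mobile"})))
```

With a real driver the attributes would have to be fetched from the browser for each element, which is why printing many elements is slower, as the note above says.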

Chrome() parameters

    driver_executable_path=None
     ( = executable_path )
    if you really need to specify your own chromedriver binary.

    (don't log issues when you are not using the default. the downloading per session happens for a reason. remember this is a detection-focussed fork)

    browser_executable_path=None
        ( = browser binary path )
    to specify your browser in case you use exotic locations instead of the more default install folders

    advanced_elements=False
        if set to True, webelements get a nicer REPR showing. this is very convenient when working
        interactively (like ipython for example).

        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

        instead of

        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>
2022-03-13 23:47:57 +01:00
UltrafunkAmsterdam 087fa8d732 Patcher:
changed how the patcher works (for those using multiple sessions/processes).

    when not specifying an executable_path (the default, and recommended!), the filename
    gets randomized to <somehex>_chromedriver[.exe]. this should fix the issue for multiprocessing
    (although Chrome/driver itself has restrictions in this as well; you can see them using processhacker).
    As I said before, webdriver is a purely IO-based operation which only sends and receives data, so multiprocessing/threading isn't going to help much. You'd be better off using asyncio.

find_chrome_executable:
    added google-chrome-stable to the list, as some distros use this name.

 advanced_webelements:  bool, optional, default: False
        makes it easier to recognize elements like you know them from html/browser inspection, especially when working in an interactive environment

        default webelement repr:
        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>

        advanced webelement repr
        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

    note: when retrieving a large number of elements ( example: find_elements_by_tag("*") ) and **printing** them, it takes a little more time to fetch all the reprs

Chrome() parameters

    driver_executable_path=None
     ( = executable_path )
    if you really need to specify your own chromedriver binary.

    (don't log issues when you are not using the default. the downloading per session happens for a reason. remember this is a detection-focussed fork)

    browser_executable_path=None
        ( = browser binary path )
    to specify your browser in case you use exotic locations instead of the more default install folders

    advanced_elements=False
        if set to True, webelements get a nicer REPR showing. this is very convenient when working
        interactively (like ipython for example).

        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

        instead of

        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>
2022-03-13 23:22:41 +01:00
UltrafunkAmsterdam 2710213a7e Patcher:
changed how the patcher works (for those using multiple sessions/processes).

    when not specifying an executable_path (the default, and recommended!), the filename
    gets randomized to <somehex>_chromedriver[.exe]. this should fix the issue for multiprocessing
    (although Chrome/driver itself has restrictions in this as well; you can see them using processhacker).
    As I said before, webdriver is a purely IO-based operation which only sends and receives data, so multiprocessing/threading isn't going to help much. You'd be better off using asyncio.

find_chrome_executable:
    added google-chrome-stable to the list, as some distros use this name.

 advanced_webelements:  bool, optional, default: False
        makes it easier to recognize elements like you know them from html/browser inspection, especially when working in an interactive environment

        default webelement repr:
        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>

        advanced webelement repr
        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

    note: when retrieving a large number of elements ( example: find_elements_by_tag("*") ) and **printing** them, it takes a little more time to fetch all the reprs

Chrome() parameters

    driver_executable_path=None
     ( = executable_path )
    if you really need to specify your own chromedriver binary.

    (don't log issues when you are not using the default. the downloading per session happens for a reason. remember this is a detection-focussed fork)

    browser_executable_path=None
        ( = browser binary path )
    to specify your browser in case you use exotic locations instead of the more default install folders

    advanced_elements=False
        if set to True, webelements get a nicer REPR showing. this is very convenient when working
        interactively (like ipython for example).

        <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>

        instead of

        <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>
2022-03-13 23:05:22 +01:00
UltrafunkAmsterdam 97288266bc 3.0.3 - fixed a bug where driver hangs for a long time on quit - and passing executable_path explicitly now causes chromedriver to not redownload since some people have issues downloading 3mb but expect to build next-gen scrapers 2021-07-30 00:23:13 +02:00
Leon e598e1ca1b
Quickfix 3.0.1
3.0.1 bugfix (bigfux)
2021-06-02 13:46:23 +02:00
Leon ebafbe1db6
3.0.0 (#180)
*3.0.0
added lots of features and bugfixes

- You can now subscribe to Chrome Devtools Protocol Events like networking.
- split the project up into separate modules
- fixed locale (accept-language)
- you can enter your user-data-folder as property of
    ChromeOptions() now.
- The ChromeOptions had a makeover, and I took the one from alpha 4,
    people having trouble with mobile emulation and other bullshit
    can try again now.
- fixed the logic where sometimes options did not
    respect the given values
- for headless (though still not supported for undetectability),
    added some real cool features (which need to be set in
    the options object):

    defaults:
    emulate_touch = True
    mock_permissions = True  # headless had notification permissions
                                set up in a distinguishable way.
    mock_chrome_global = False
    mock_canvas_fp = True  # patch fingerprint

EXTENSIONS ARE NOT SUPPORTED BY CHROME IN HEADLESS MODE
YET. IF YOU WANT TO USE THEM, CREATE A PROFILE AND INSTALL
EXTENSIONS BY USING A REGULAR CHROME SESSION FIRST.
ALSO LOGIN TO GMAIL WHILE YOU'RE ON A GENUINE SESSION.

WHEN FINISHED, COPY THE USERDATA FOLDER OF CHROME TO SOME KNOWN
LOCATION (and maybe make 2 copies?). HAVING GMAIL LOGGED IN
ALSO FIXES THE UNSAFE BROWSER MESSAGE FROM GOOGLE (AT LEAST FOR
ME IT WORKS)


* 2.2.2

* fixed a number of bugs
- specifying custom profile
- specifying custom binary path
- downloading, patching and storing now (if not explicitly specified)
    happens in a writable folder, instead of the current working dir.

Committer: UltrafunkAmsterdam <UltrafunkAmsterdam@github>

* tidy up

* uncomment block

* - support for specifying and reusing the user profile folder.
    if a user-data-dir is specified, that folder will NOT be
    deleted on exit.
    example:
        options.add_argument('--user-data-dir=c:\\temp')

- uses a platform specific app data folder to store driver instead
    of the current workdir.

- improved headless mode. fixed detection by notification perms.

- eliminates the "restore tabs" notification at startup

- added methods find_elements_by_text and find_element_by_text

- updated docs (partly)

-known issues:
    - extensions not running. this is due to the inner workings
        of chromedriver. still working on this.
    - driver window is not always closing along with a program exit.
    - MacOS: startup nag notifications. might be solved by
        (re)using a profile directory.

- known stuff:
    - some specific use cases, network conditions or behaviour
      can cause being detected.

* Squashed commit of the following:

commit 7ce8e7a236cbee770cb117145d4bf6dc245b936a
Author: ultrafunkamsterdam <info@blackhat-security.nl>
Date:   Fri Apr 30 18:22:39 2021 +0200

    readme change

commit f214dcf33f26f8b35616d7b61cf6dee656596c3f
Author: ultrafunkamsterdam <info@blackhat-security.nl>
Date:   Fri Apr 30 18:18:09 2021 +0200

    - make sure options cannot be reused as it will
        cause double and conflicting arguments to chrome



    - support for specifying and reusing the user profile folder.
        if a user-data-dir is specified, that folder will NOT be
        deleted on exit.
        example:
            options.add_argument('--user-data-dir=c:\\temp')

    - uses a platform specific app data folder to store driver instead
        of the current workdir.

    - improved headless mode. fixed detection by notification perms.

    - eliminates the "restore tabs" notification at startup

    - added methods find_elements_by_text and find_element_by_text

    - updated docs (partly)

    -known issues:
        - extensions not running. this is due to the inner workings
            of chromedriver. still working on this.
        - driver window is not always closing along with a program exit.
        - MacOS: startup nag notifications. might be solved by
            (re)using a profile directory.

    - known stuff:
        - some specific use cases, network conditions or behaviour
          can cause being detected.
2021-05-24 10:26:02 +02:00