Torsel is a Python module designed to manage multiple Tor instances and automate web tasks using Selenium. It is particularly useful for web automation and web scraping tasks that require IP rotation to enhance anonymity.
This project is under active development, so features and functionality may change as it evolves. It hasn't been tested on macOS yet, so any feedback is welcome! If you're interested in collaborating, check out the Contributing section below.
- Cross-Platform Support: Designed for Linux, Windows, and macOS (macOS is currently untested).
- Automated IP Rotation: Seamlessly rotate IP addresses using multiple Tor instances.
- Web Scraping and Automation: Ideal for tasks that require anonymity.
- Easy Configuration: Automatically sets up, configures, and manages Tor instances.
- Integration with Selenium: Run your Selenium scripts with the added anonymity of Tor.
- Flexible and Advanced Cookie Management: Load and manage custom cookies across multiple instances with support for both simple and advanced mapping configurations.
- Bypassing IP-Based Restrictions: Torsel can help bypass some IP-based restrictions by rotating IP addresses through Tor nodes.
- Tor Exit Node Blocking: Be aware that some websites actively block traffic from Tor exit nodes, which may limit the effectiveness of this approach.
- Cookie Loading Limitations: Some sites have restrictions that prevent cookies from being loaded, so cookie loading will not always work.
You can install Torsel directly from PyPI:
pip install torsel
On Linux, make sure tor and chromium are installed. On Debian-based distributions, for example:
sudo apt install tor chromium
You need the Tor binary available so that you can pass its path to the Torsel object (the tor_path option).
The Tor expert bundle, available from the Tor Project's download page, contains the Tor binaries.
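Before passing a path to Torsel, it can help to confirm that the Tor executable is actually reachable. The snippet below is a minimal sketch using only the Python standard library; `find_tor_binary` is a hypothetical helper name and is not part of Torsel.

```python
import shutil
from typing import Optional


def find_tor_binary(explicit_path: Optional[str] = None) -> Optional[str]:
    """Return a usable path to the Tor executable, or None if none is found.

    Hypothetical helper, not part of Torsel. If explicit_path is given it is
    checked directly; otherwise "tor" is looked up on the system PATH.
    """
    candidate = explicit_path or "tor"
    return shutil.which(candidate)


tor_path = find_tor_binary()
if tor_path is None:
    print("[-] Tor executable not found; install Tor or pass its full path")
else:
    print(f"[+] Tor executable found at: {tor_path}")
```

The returned value, when it is not None, can then be used as the tor_path argument.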
This simple example scrapes the IP address 10 times, demonstrating IP rotation using Tor:
from torsel import Torsel

# Selenium function to invoke in the Torsel object
def collect_ip(driver, wait, EC, By):
    driver.get("http://icanhazip.com")
    wait.until(EC.text_to_be_present_in_element((By.TAG_NAME, "body"), "."))
    ip_address = driver.find_element(By.TAG_NAME, "body").text.strip()
    print(f"[+] Current Tor IP: {ip_address}")

# Torsel object
torsel = Torsel(headless=True,                     # Invoke Torsel in headless mode and run
                tor_path="/usr/bin/tor",           # path to executable
                tor_data_dir="/tmp/tor_profiles")  # tor profiles path dest

torsel.run(10, collect_ip)
For detailed examples on how to use Torsel, please refer to the examples directory.
- Detailed simple example (Single thread IP rotation)
- Verify Tor IP rotation with multithreading
- Script to analyze the frequency of IP usage
- Simple session cookie loading with a single instance and single URL
- Simple session cookie loading with multiple instances across the same URL
- Load and verify session cookies for two different URLs with a single instance
- Load and verify different session cookies for two URLs across multiple instances
Torsel is highly configurable to suit various use cases:
- total_instances: Number of Tor instances to create.
- max_threads: Maximum number of concurrent threads.
- tor_base_port: Starting port for Tor SOCKS connections.
- tor_control_base_port: Starting port for Tor control connections.
- tor_path: Path to the Tor executable.
- tor_data_dir: Directory to store Tor profile data.
- user_agent: User agent string to use; if None, a random one is selected.
- headless: Run Selenium in headless mode if True.
- verbose: Enable detailed logging if True.
- cookies_dir: Directory to store and load cookies (optional).
- cookies_mapping: A mapping of URLs to specific cookie files, allowing for advanced session management across multiple instances (optional).
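As a rough illustration of how these options might be combined, the sketch below collects them in a dictionary and passes them to the constructor. It assumes every option listed above is accepted as a keyword argument by `Torsel(...)`; this README only demonstrates that for headless, tor_path, and tor_data_dir. All values are illustrative and are not Torsel's defaults, and `build_torsel` is a hypothetical helper.

```python
# Illustrative values only; none of these are Torsel's documented defaults.
config = {
    "total_instances": 4,             # number of Tor instances to create
    "max_threads": 2,                 # at most two actions run concurrently
    "tor_base_port": 9050,            # first SOCKS port
    "tor_control_base_port": 9150,    # first control port
    "tor_path": "/usr/bin/tor",
    "tor_data_dir": "/tmp/tor_profiles",
    "user_agent": None,               # None: let Torsel pick a random one
    "headless": True,
    "verbose": False,
}


def build_torsel(options):
    """Create a Torsel object from a dict of options (hypothetical helper).

    Assumes the constructor accepts all of these names as keyword arguments.
    Torsel is imported lazily so the dict can be inspected without it installed.
    """
    from torsel import Torsel
    return Torsel(**options)
```

With Torsel installed, `torsel = build_torsel(config)` followed by `torsel.run(10, collect_ip)` would mirror the quick example with a more explicit configuration.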
Additionally, Torsel automatically supplies the following parameters to the Selenium functions you pass to it:
- driver: Managed by Torsel and passed automatically to your function. No need to instantiate or manage it yourself.
- wait: An instance of WebDriverWait configured with a 10-second timeout, provided by Torsel.
- By: The By class from Selenium, used for locating elements (e.g., by ID, class name).
- EC (expected_conditions): The expected_conditions module from Selenium, used to define conditions like element visibility or text presence.
- action_num: The number of the current action being executed, provided automatically by Torsel.
- instance_num: The instance number of the Tor connection in use, passed automatically to your function.
- log: A logging function provided by Torsel to output messages during execution.
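The quick example declares only driver, wait, EC, and By, which suggests a function can ask for just the parameters it needs by name. The sketch below shows one way such name-based passing can work, using `inspect.signature`. It is not Torsel's actual implementation: `call_with_known_args`, the placeholder values, and the assumption that log takes a single string are all hypothetical.

```python
import inspect


def call_with_known_args(func, available):
    """Call func with only the arguments its signature asks for.

    Hypothetical stand-in for whatever Torsel does internally.
    """
    wanted = inspect.signature(func).parameters
    kwargs = {name: available[name] for name in wanted if name in available}
    return func(**kwargs)


# A callback that requests a subset of the parameters listed above.
def report(driver, action_num, instance_num, log):
    log(f"action {action_num} on instance {instance_num}: {driver}")
    return action_num, instance_num


messages = []
available = {
    "driver": "<driver placeholder>",  # a real WebDriver in practice
    "wait": None,
    "By": None,
    "EC": None,
    "action_num": 3,
    "instance_num": 1,
    "log": messages.append,            # assumed: log accepts one string
}

result = call_with_known_args(report, available)
print(result)
print(messages)
```

Here `report` never mentions wait, By, or EC, so they are simply not passed to it.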
Hey! Any kind of contribution is welcome. Send a PR if you have improvements or usage examples to contribute!
This project is licensed under the MIT License - see the LICENSE file for details.