A Python command-line tool designed to collect TikTok data from Google's search results using SerpAPI.

estebanpdl/tik-spyder

TikSpyder


TikSpyder is a command-line tool that collects TikTok data from Google's search results using SerpAPI and downloads TikTok videos with yt-dlp. It uses Python's asynchronous capabilities and multithreading to make data collection and video downloading efficient.






🔍 Description

TikSpyder extracts TikTok video links from Google's search results and downloads the videos. It also supports storing and retrieving collected data in a SQLite database and exporting the data to CSV files.

Given the dynamic nature of search results and the constantly evolving landscape of TikTok's platform, it's important to note that the data collected by TikSpyder represents a sample rather than a comprehensive dataset. However, this sample can still be valuable for monitoring trends and identifying emerging narratives.

To get the most out of TikSpyder, test your query with Google's advanced search features first. Doing so helps you refine the query, check specific keywords, and improve the relevance of the results, so that the data you collect fits your research or analysis.


🚀 Features

  • Collects TikTok video links using SerpAPI.
  • Collects and downloads thumbnails for TikTok videos.
  • Collects related content to the search query.
  • Stores collected data in a SQLite database.
  • Exports data to CSV files.
  • Downloads TikTok videos using yt-dlp.
  • Supports asynchronous and multithreaded downloading for improved performance.
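
As a rough illustration of the storage and export features listed above, the sketch below writes a few collected records to a SQLite database and exports them to CSV using only the standard library. The table name `videos`, the column names, and both helper functions are hypothetical. TikSpyder's actual schema may differ.

```python
import csv
import sqlite3

# Hypothetical schema: TikSpyder's real tables and columns may differ.
COLUMNS = ("link", "title", "source")


def store_results(conn, rows):
    """Insert result dicts into a hypothetical `videos` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS videos (link TEXT, title TEXT, source TEXT)"
    )
    conn.executemany(
        "INSERT INTO videos (link, title, source) VALUES (:link, :title, :source)",
        rows,
    )
    conn.commit()


def export_csv(conn, path):
    """Write every stored row to a CSV file, preceded by a header line."""
    cursor = conn.execute("SELECT link, title, source FROM videos")
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(COLUMNS)
        writer.writerows(cursor)
```

Passing `sqlite3.connect(":memory:")` as `conn` is a convenient way to try the two functions without creating a database file.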

⚙️ Requirements

  • Python >= 3.11.7
  • SerpAPI key
  • Python libraries listed in requirements.txt

🔧 Installation

  1. Clone the repository
git clone https://github.com/estebanpdl/TikSpyder.git
cd TikSpyder
  2. Install the required packages
pip install -r requirements.txt

or

pip3 install -r requirements.txt
  3. Obtain an API key from SerpAPI and add it to the config/config.ini file, replacing api_key_value with your key.
[SerpAPI Key]
api_key = api_key_value
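
To check that the file is filled in correctly, a minimal sketch like the one below reads the key with Python's standard `configparser` module. The section and option names come from the snippet above. The `read_api_key` helper is hypothetical, and how TikSpyder itself loads the file is an assumption.

```python
import configparser

# Mirrors the config/config.ini layout shown above.
SAMPLE = """
[SerpAPI Key]
api_key = api_key_value
"""


def read_api_key(text):
    """Return the api_key value from config text, rejecting the placeholder."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    key = parser["SerpAPI Key"]["api_key"].strip()
    if not key or key == "api_key_value":
        raise ValueError("Replace api_key_value with your SerpAPI key")
    return key
```

For a file on disk, `read_api_key(open("config/config.ini").read())` would apply the same check.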

📚 Usage

python main.py [OPTIONS]

Command Line Arguments

python main.py --help

# or

python main.py -h
Help options:
  -h, --help        Show this help message and exit.

SerpAPI options:
  --q               The search term or phrase for which to retrieve TikTok data.
  --user            Specify a TikTok user to search for videos from.
  --google-domain   Defines the Google domain to use. It defaults to google.com.
  --gl              Defines the country to use for the search. Two-letter country code.
  --hl              Defines the language to use for the search. Two-letter language code.
  --cr              Defines one or multiple countries to limit the search to.
  --lr              Defines one or multiple languages to limit the search to.
  --depth           Depth of iterations to follow related content links.

Google advanced search options:
  --before          Limit results to posts published before the specified date. Format: YYYY-MM-DD.
  --after           Limit results to posts published after the specified date. Format: YYYY-MM-DD.

Optional arguments and parameters:
  -o, --output      Specify the output data path. By default, output is saved in the ./data/ directory with a timestamp as the filename.
  --download        Specify whether to download TikTok videos from SerpAPI response.
  --max-workers     Specify the maximum number of threads to use for downloading TikTok videos.
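
The SerpAPI options above share their names with parameters of SerpAPI's Google Search API. The sketch below shows one way such options could be collected into a request parameter dictionary. The `build_params` helper and the choice of the `google` engine are assumptions for illustration, not TikSpyder's confirmed implementation.

```python
def build_params(api_key, q, google_domain="google.com",
                 gl=None, hl=None, cr=None, lr=None):
    """Map CLI-style options to SerpAPI Google Search parameter names."""
    params = {
        "engine": "google",  # assumed engine; the tool may use another one
        "api_key": api_key,
        "q": q,
        "google_domain": google_domain,
        "gl": gl,
        "hl": hl,
        "cr": cr,
        "lr": lr,
    }
    # Leave out options the user did not set.
    return {name: value for name, value in params.items() if value is not None}
```

Dropping unset options keeps the request limited to what the user asked for and lets SerpAPI apply its own defaults for the rest.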

Search query

python main.py --q "F-16 AND Enemy AND (Ukraine OR Russia)" --gl us --hl en --after 2024-02-01 --before 2024-05-31 --output {output_directory}/ --download

# Note: Replace '{output_directory}' with the desired output path.

Explanation

The search query --q "F-16 AND Enemy AND (Ukraine OR Russia)" tells TikSpyder to search for TikTok videos that match both "F-16" and "Enemy", together with either "Ukraine" or "Russia".

The --gl us option sets the country used for the search to the United States, while the --hl en option sets the search language to English. To restrict results to specific countries or languages, use --cr and --lr.

The --after 2024-02-01 and --before 2024-05-31 options limit the search results to videos published between February 1, 2024 and May 31, 2024.
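
Google Search accepts `before:` and `after:` operators with YYYY-MM-DD dates, so one plausible way to apply these options is to validate the dates and append the operators to the query. The sketch below assumes that approach, along with a `site:tiktok.com` restriction. The `compose_query` helper is hypothetical, and TikSpyder may build its query differently.

```python
from datetime import date


def compose_query(q, before=None, after=None, site="tiktok.com"):
    """Append Google date operators and a site restriction to a query."""
    parts = [f"site:{site}", q]
    # date.fromisoformat raises ValueError for invalid dates.
    start = date.fromisoformat(after) if after else None
    end = date.fromisoformat(before) if before else None
    if start and end and start >= end:
        raise ValueError("--after must be earlier than --before")
    if start:
        parts.append(f"after:{start.isoformat()}")
    if end:
        parts.append(f"before:{end.isoformat()}")
    return " ".join(parts)
```

Parsing the dates before sending the request catches mistakes such as a thirteenth month or a reversed range early, before an API call is spent on them.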

By default, collected data is saved in the ./data/ directory with a timestamp as the filename. To customize the output location, use the --output option followed by the desired directory path. For example, --output my_directory/ would save the data in the my_directory/ directory.
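
The default-versus-custom behavior can be pictured with the short sketch below. The `resolve_output` helper and the `%Y%m%d%H%M%S` timestamp format are assumptions. The README only states that a timestamp is used, not how it is formatted.

```python
from datetime import datetime
from pathlib import Path


def resolve_output(output=None, now=None):
    """Return the user-supplied path, or ./data/<timestamp> by default."""
    if output:
        return Path(output)
    # Assumed timestamp format; the tool's actual naming may differ.
    stamp = (now or datetime.now()).strftime("%Y%m%d%H%M%S")
    return Path("data") / stamp
```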

The --download option specifies that TikSpyder should download the TikTok videos associated with the search results.
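
Combined with --max-workers, downloading can be thought of as a thread pool that hands each video link to yt-dlp. The sketch below is a minimal version of that idea, not TikSpyder's actual code. The function names, the default of five workers, and the output filename template are assumptions. `yt_dlp` is a third-party package and is imported only when a real download runs.

```python
from concurrent.futures import ThreadPoolExecutor


def ytdlp_download(url, output_dir):
    """Download one video with yt-dlp (third-party, imported lazily)."""
    import yt_dlp

    options = {"outtmpl": f"{output_dir}/%(id)s.%(ext)s", "quiet": True}
    with yt_dlp.YoutubeDL(options) as ydl:
        return ydl.download([url])


def download_all(urls, output_dir, max_workers=5, downloader=ytdlp_download):
    """Run downloads on a thread pool and collect failures per URL."""
    failures = {}

    def attempt(url):
        try:
            downloader(url, output_dir)
        except Exception as error:  # keep going when a single video fails
            failures[url] = str(error)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Consume the iterator so every task finishes before returning.
        list(pool.map(attempt, urls))
    return failures
```

Recording failures instead of raising lets one unavailable video be skipped without stopping the rest of the batch. The `downloader` argument also makes it possible to swap in a stand-in function for a dry run.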


☕ Support

If you find TikSpyder helpful, please consider buying me a coffee to support ongoing development and maintenance. Your donation will help me continue to improve the tool and add new features.



📝 TODO

  • Image classification using AI multimodal models
  • Whisper integration
  • Network mapping
  • Handle duplicate link values
  • Explore doccano annotation tool
  • Ollama integration
