hkolatan/DS_Bootcamp_IMDb_Movie_Scraping-Analysis

IMDb Movie Analysis Project

This project analyzes movie data scraped from the IMDb website and predicts movie ratings with a series of linear regression models.

Data Sources

  • IMDb: We will scrape movie data from the IMDb website, which is one of the most comprehensive sources for movie information.
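
As a rough illustration of the scraping step, here is a minimal sketch that parses movie fields out of an HTML fragment with Beautiful Soup. The markup, the class names (`movie`, `title`, `year`, `rating`), and the helper `parse_movies` are hypothetical placeholders chosen for this example; they are not IMDb's actual page structure, which would need to be inspected before writing real selectors. The sample rows are made up.

```python
from bs4 import BeautifulSoup

# Hypothetical, made-up markup standing in for a downloaded page.
SAMPLE_HTML = """
<ul>
  <li class="movie">
    <span class="title">Example Movie A</span>
    <span class="year">1999</span>
    <span class="rating">8.7</span>
  </li>
  <li class="movie">
    <span class="title">Example Movie B</span>
    <span class="year">2010</span>
    <span class="rating">7.4</span>
  </li>
</ul>
"""


def parse_movies(html):
    """Return one dict per <li class="movie"> element in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    movies = []
    for item in soup.find_all("li", class_="movie"):
        movies.append(
            {
                "title": item.find("span", class_="title").get_text(strip=True),
                "year": int(item.find("span", class_="year").get_text(strip=True)),
                "rating": float(item.find("span", class_="rating").get_text(strip=True)),
            }
        )
    return movies


movies = parse_movies(SAMPLE_HTML)
```

Parsing from a string keeps the sketch independent of the network; in practice the HTML would come from a downloaded page or a browser session.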

Tools and Techniques

  • Web scraping with Beautiful Soup: We will use this Python library to parse HTML pages from the IMDb website and extract movie data.
  • Selenium: We will use this browser automation library to drive the scraping process, so that pages can be loaded and navigated without manual effort.
  • EDA: We will perform exploratory data analysis to gain insights into the data and identify patterns.
  • Linear regression: We will fit a series of linear regression models to predict movie ratings, which can in turn inform recommendations based on our viewing history.
  • Feature engineering: We will create new features from the existing data to improve the accuracy of our predictive model.
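
The feature engineering and linear regression steps above can be sketched as follows. This is a minimal example under stated assumptions, not the project's actual pipeline: the rows are made-up toy values, the column choice (release year, runtime, vote count) and the reference year are hypothetical, and the fit uses NumPy's ordinary least squares rather than any particular modeling library.

```python
import numpy as np

# Made-up toy rows: (release_year, runtime_minutes, vote_count, rating).
rows = [
    (1994, 142, 2_500_000, 9.0),
    (2008, 152, 2_400_000, 8.8),
    (2015, 120, 900_000, 8.1),
    (2019, 98, 40_000, 6.2),
    (2003, 110, 150_000, 6.9),
    (2021, 131, 12_000, 5.8),
]
data = np.array(rows, dtype=float)
year, runtime, votes, rating = data.T

# Feature engineering: derive new columns from the raw ones.
REFERENCE_YEAR = 2024  # assumption, used only to compute a movie's age
age = REFERENCE_YEAR - year
log_votes = np.log10(votes)  # compress the heavy-tailed vote counts

# Design matrix with an intercept column, then an ordinary least squares fit.
X = np.column_stack([np.ones(len(rating)), runtime, age, log_votes])
coef, _, _, _ = np.linalg.lstsq(X, rating, rcond=None)

# In-sample predictions and the coefficient of determination (R^2).
pred = X @ coef
ss_res = np.sum((rating - pred) ** 2)
ss_tot = np.sum((rating - rating.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
```

With so few rows the fit says nothing about real movies; the point is only the shape of the workflow: build features, assemble a design matrix, fit, and score. A real evaluation would hold out a test set instead of scoring in-sample.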

Deliverables

  • Presentation File: We will create a visual and oral presentation to showcase our project and findings.
  • Project Repository: We will create a GitHub repository to share our code and project details.
  • Blog Post: We will publish a blog post on the internet (e.g. Medium) to share our project and findings with the broader data science community.

Conclusion

This project aims to provide insight into which movies we, as a team, would enjoy watching, based on our Netflix viewing history and IMDb data. We will combine web scraping, exploratory data analysis, feature engineering, and linear regression to build a predictive model. The project will showcase our data science skills and give us valuable experience in working with real-world datasets.

License

This project is licensed under the MIT License.

Contributors

Muhammed Maral
Halil Kolatan

Contact

Please feel free to contact the project team for any questions or feedback.
