Creating an Anime Database With Web Scraping: Introduction
- Tejas Haritsa V K
- Aug 6, 2020
- 2 min read

Web Scraping with Python
Introduction
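What is Web Scraping? Web scraping is an automated method for extracting large amounts of data from websites. Most of that data is unstructured: it sits inside HTML pages laid out for human readers. Web scraping collects this unstructured data and stores it in a structured form, such as a table or a CSV file. Other terms used for web scraping include Web Data Extraction and Web Harvesting. Web Crawling is often used interchangeably too, although strictly speaking it refers to following links to discover pages rather than extracting data from them.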
After hearing all that, the next question on anyone's mind would be:
Is Web Scraping legal? In general, web scraping is not illegal in itself; after all, you can scrape a website for educational purposes. As a rule of thumb, data scraped from sites that do not allow web scraping should not be used for commercial purposes, since doing so may breach the site's terms of use or the applicable law. To see whether a website allows web scraping, look at its “robots.txt” file. You can find this file by appending “/robots.txt” to the root URL of the site you want to scrape, e.g. “https://....YOUR URL HERE…./robots.txt”
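As a rough illustration of the robots.txt check described above, here is a minimal sketch that uses only the Python standard library. The helper names `robots_txt_url` and `is_allowed`, the example.com URLs, and the two rules in `example_rules` are all made up for illustration and are not taken from any real site.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_txt_url(page_url):
    """Return the robots.txt URL for the site that hosts page_url."""
    parts = urlsplit(page_url)
    # robots.txt lives at the root of the host, so drop path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def is_allowed(robots_lines, page_url, user_agent="*"):
    """Check page_url against the lines of an already downloaded robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, page_url)


# Hypothetical rules, for illustration only (not from a real website).
example_rules = [
    "User-agent: *",
    "Disallow: /private/",
]

print(robots_txt_url("https://example.com/anime/list?page=2"))
print(is_allowed(example_rules, "https://example.com/anime/list"))
print(is_allowed(example_rules, "https://example.com/private/data"))
```

Against a live site you would download the file first, for example with `RobotFileParser.set_url()` followed by `read()`, instead of passing the lines in by hand. Keep in mind that robots.txt only states the site owner's wishes for crawlers; it is not a legal ruling either way.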
Now, with that out of the way, let's get into it.
Overview
In this post we’ll be going through:
1. How to inspect & analyze a web page for web scraping
2. How to get started with web scraping
3. How to collect unstructured data in a structured format

After we are through with this post, we’ll have an Anime database containing all the Anime currently available on the website, which includes:
1. Anime Title
2. Description
3. Current/Latest Season
4. Episodes Aired
5. Status
6. Initial Air Date
7. Genre
8. Sub/Dub
9. Series/Movie
10. URL
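To make the "structured format" concrete, the sketch below shows one possible way to lay out those ten fields as CSV columns with Python's `csv` module. The exact column spellings, the `write_rows` helper, and the placeholder row are assumptions for illustration; the notebooks may organize the data differently.

```python
import csv
import io

# Column names follow the ten fields listed above; the spellings are an assumption.
FIELDNAMES = [
    "Anime Title", "Description", "Current/Latest Season", "Episodes Aired",
    "Status", "Initial Air Date", "Genre", "Sub/Dub", "Series/Movie", "URL",
]


def write_rows(rows, out_file):
    """Write scraped records (dicts) to out_file as CSV with a header row."""
    writer = csv.DictWriter(out_file, fieldnames=FIELDNAMES)
    writer.writeheader()
    for row in rows:
        # Fields the scraper could not find are written as empty strings.
        writer.writerow({name: row.get(name, "") for name in FIELDNAMES})


# Placeholder record, not real scraped data.
placeholder_row = {
    "Anime Title": "Example Title",
    "Genre": "Example Genre",
    "URL": "https://example.com/anime/example-title",
}

# An in-memory buffer stands in for a real file here.
buffer = io.StringIO()
write_rows([placeholder_row], buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, open the target with `open("anime.csv", "w", newline="", encoding="utf-8")` (the file name is hypothetical) and pass that file object to `write_rows`.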
Requirements
1. Python 3
2. Jupyter Notebook
3. urllib
4. BeautifulSoup
5. regex
6. string
7. csv
8. time
Installation
* For Conda Users:
  1. conda install jupyter notebook
  2. conda install beautifulsoup4
* For PIP Users:
  1. pip install jupyter notebook
  2. pip install beautifulsoup4
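Once the install commands have run, a quick sanity check like the one below can confirm that the required modules are importable. This is only a sketch: the `missing_modules` helper is made up for illustration, `bs4` is the import name of the beautifulsoup4 package, and `re` is assumed here to be what the "regex" requirement refers to.

```python
import importlib.util

# Import names for the requirements above.
# Assumption: "regex" in the list refers to Python's built-in `re` module.
REQUIRED_MODULES = ["urllib", "bs4", "re", "string", "csv", "time"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means everything needed is installed.
print(missing_modules(REQUIRED_MODULES))
```

If `bs4` shows up in the printed list, the beautifulsoup4 install did not land in the Python environment that Jupyter is using.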
Contents
1. Web Scraping Tutorial.ipynb (Notebook): How to inspect & analyze a webpage for web scraping
2. Anime Website Data Scraping.ipynb (Notebook): How to get started with web scraping & collect unstructured data in a structured format
Link to the Notebooks:
Link to Part-II with Hands on Tutorial:
Hope this has been useful. Please leave a like if you enjoyed this article.