web.project package¶
Submodules¶
web.project.csv_database module¶
web.project.prod_extract module¶
web.project.web_crawler module¶
Web Crawler designed to find products, and ratings for those products, that are developed for and targeted at seniors.
class project.web_crawler.WebCrawler(url=None, about=None, sub_url=None, page=None, data=None, clean=False)[source]¶
Bases: object

Web Crawler.
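To make the constructor signature and accessor pattern concrete, here is a minimal stand-in that mirrors the documented parameters and getter methods. This is a sketch only; the real class lives in `project.web_crawler`, and the attribute meanings below are inferred from the method descriptions on this page.

```python
# Illustrative stand-in for the documented WebCrawler interface.
# Attribute roles are inferred from the reference above, not the real code.
class WebCrawlerSketch:
    def __init__(self, url=None, about=None, sub_url=None, page=None,
                 data=None, clean=False):
        self.url = url          # target URL the crawler will access
        self.about = about      # product description text
        self.sub_url = sub_url  # sub-url within the target site
        self.page = page        # fetched page content
        self.data = data        # parsed page data
        self.clean = clean      # whether CSV output has been cleaned up

    def get_url(self):
        return self.url

    def get_data(self):
        return self.data

crawler = WebCrawlerSketch(url="https://example.com/products")
print(crawler.get_url())  # https://example.com/products
```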
cleanup()[source]¶
Cleans up the CSV files in the current directory and saves them to the csv folder.
Returns:
self.clean: bool - file cleaned.
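A hedged sketch of what the cleanup step likely does, based on the description above: move every CSV file in the working directory into a `csv` subfolder and report whether anything was moved. The function name, paths, and boolean return are assumptions for illustration, not the actual implementation.

```python
# Illustrative sketch only: mimic cleanup() by moving *.csv files from a
# directory into a "csv" subfolder. All names here are assumptions.
import shutil
import tempfile
from pathlib import Path

def cleanup_csvs(directory: Path) -> bool:
    """Move all CSV files in `directory` into a `csv` subfolder."""
    csv_dir = directory / "csv"
    csv_dir.mkdir(exist_ok=True)
    moved = False
    for csv_file in directory.glob("*.csv"):
        shutil.move(str(csv_file), str(csv_dir / csv_file.name))
        moved = True
    return moved  # mirrors the documented self.clean flag

# Demonstrate in a throwaway directory.
workdir = Path(tempfile.mkdtemp())
(workdir / "products.csv").write_text("name,rating\n")
cleaned = cleanup_csvs(workdir)
print(cleaned)  # True
```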
csv_to_database()[source]¶
Writes the extracted CSV data to an SQL database.
Returns:
self.clean: bool - file cleaned.
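The CSV-to-database step can be sketched with the standard library: read rows from CSV text and insert them into an SQL table. The actual method's database engine, table schema, and column names are not documented here; sqlite3 and the `products` schema below are assumptions.

```python
# Illustrative sketch only: load CSV rows into an SQL table.
# The sqlite3 backend and the column names are assumptions.
import csv
import io
import sqlite3

csv_text = "name,rating\nGrabber Tool,4.5\nPill Organizer,4.8\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

conn = sqlite3.connect(":memory:")  # in-memory database for the demo
conn.execute("CREATE TABLE products (name TEXT, rating REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(r["name"], float(r["rating"])) for r in rows],
)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
print(count)  # 2
```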
get_data()[source]¶
Gets the data that the web crawler is parsing.
Returns:
self.data: string - page data.
Example:
>>> example_data = crawler.get_data()
get_description()[source]¶
Gets the description of the product, located within the targeted web page.
Returns:
self.about: string - description of product.
Example:
>>> example_description = crawler.get_description()
get_nav_categories()[source]¶
Gets the categories parsed by the web crawler.
Returns:
self.categories: list - list of categories within the navigation bar.
Example:
>>> example_categories = crawler.get_nav_categories()
get_nav_catlinks()[source]¶
Gets the category links parsed by the web crawler.
Returns:
self.catlinks: list - list of category links within the navigation bar.
Example:
>>> example_catlinks = crawler.get_nav_catlinks()
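A hedged sketch of how category names and category links might be collected from a navigation bar. The real parsing logic is not shown in this reference; the `html.parser`-based approach and the sample markup below are assumptions for illustration only.

```python
# Illustrative sketch only: gather nav-bar category names and hrefs,
# analogous to the documented self.categories and self.catlinks lists.
from html.parser import HTMLParser

class NavParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_nav = False
        self.in_link = False
        self.categories = []  # link text, like get_nav_categories()
        self.catlinks = []    # href values, like get_nav_catlinks()

    def handle_starttag(self, tag, attrs):
        if tag == "nav":
            self.in_nav = True
        elif tag == "a" and self.in_nav:
            self.in_link = True
            self.catlinks.append(dict(attrs).get("href", ""))

    def handle_endtag(self, tag):
        if tag == "nav":
            self.in_nav = False
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link and data.strip():
            self.categories.append(data.strip())

parser = NavParser()
parser.feed('<nav><a href="/mobility">Mobility</a>'
            '<a href="/hearing">Hearing</a></nav>')
print(parser.categories)  # ['Mobility', 'Hearing']
print(parser.catlinks)    # ['/mobility', '/hearing']
```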
get_page()[source]¶
Gets the page that the web crawler is parsing data from.
Returns:
self.page: string - the page of the url.
Example:
>>> example_page = crawler.get_page()
get_sub_url()[source]¶
Gets the sub-url that the web crawler will be accessing.
Returns:
self.sub_url: string - the sub-url.
Example:
>>> example_sub_url = crawler.get_sub_url()
get_url()[source]¶
Gets the url that the web crawler will be accessing.
Returns:
self.url: string - the url.
Example:
>>> example_url = crawler.get_url()