
Following The Information Using Scrapy In Nested Div And Span Tags

I am trying to build a web crawler with Scrapy (Python) that extracts the information Google shows on the right-hand side of the page when you run a search. For example, I want to extract

Solution 1:

XPath is not always a good way to get data. XPath expressions often break when the DOM changes, and on some pages the structure changes on every load.

Also, use these modules with Scrapy when crawling well-known websites:

  1. scrapy-rotating-proxies
  2. scrapy-user-agents

Otherwise Google will detect your requests as robot requests and block the page load.
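As a rough sketch, the two modules could be enabled in a project's `settings.py` along these lines. The middleware paths and priority numbers follow each package's README as I understand them, so check them against the versions you install. The proxy addresses are placeholders, not working proxies.

```python
# settings.py (sketch): enable scrapy-rotating-proxies and scrapy-user-agents.

# Placeholder proxies for illustration only; replace with proxies you control.
ROTATING_PROXY_LIST = [
    "proxy1.example.com:8000",
    "proxy2.example.com:8031",
]

DOWNLOADER_MIDDLEWARES = {
    # Turn off Scrapy's built-in User-Agent middleware so it does not
    # override the randomly chosen one.
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": None,
    # scrapy-user-agents: pick a random User-Agent for each request.
    "scrapy_user_agents.middlewares.RandomUserAgentMiddleware": 400,
    # scrapy-rotating-proxies: rotate proxies and detect banned ones.
    "rotating_proxies.middlewares.RotatingProxyMiddleware": 610,
    "rotating_proxies.middlewares.BanDetectionMiddleware": 620,
}
```

Even with both modules enabled, a site may still block the crawler, so treat this as a starting point rather than a guarantee.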

A better way to find something on the page is by its class and id attributes.

(Note: first check that the class and id values stay the same on every load and for every query.)
