Online Web Scraping & Data Extraction Tool | URLitor

URLitor - Web Scraping & Data Extraction tool

Simply add a list of URLs, specify the HTML element you want to retrieve from each page and hit submit. That's it: no more copying and pasting from the browser.
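For readers who do program, the workflow amounts to roughly the following. This is a minimal Python sketch, not URLitor's actual implementation: the names `fetch`, `scrape` and `fetch_page` are hypothetical, the limits are the ones stated in the tool's description (100 URLs, 10 expressions), and the standard library's ElementTree is assumed as the parser, which only accepts well-formed markup and only a subset of xPath written relative to the root (starting with `.`).

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

MAX_URLS = 100         # limit taken from the tool's description
MAX_EXPRESSIONS = 10   # limit taken from the tool's description


def fetch(url):
    """Download a page and return its body as text (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape(urls, expressions, fetch_page=fetch):
    """Return one (url, expression, matches) row per URL and expression."""
    if len(urls) > MAX_URLS or len(expressions) > MAX_EXPRESSIONS:
        raise ValueError("too many URLs or expressions")
    rows = []
    for url in urls:
        # ElementTree needs well-formed markup; real-world HTML often is not.
        root = ET.fromstring(fetch_page(url))
        for expression in expressions:
            matches = [(el.text or "").strip() for el in root.findall(expression)]
            rows.append((url, expression, matches))
    return rows


# Usage with an in-memory page, so the sketch runs without network access.
pages = {
    "https://example.com/": (
        "<html><head><title>Example</title></head>"
        "<body><h2>One</h2><h2>Two</h2></body></html>"
    )
}
rows = scrape(list(pages), [".//title", ".//h2"], fetch_page=pages.get)
for url, expression, matches in rows:
    print(url, expression, matches)
```

Passing the download function as a parameter keeps the sketch testable offline; a real scraper would also need an HTML-tolerant parser and error handling for failed requests.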

xPath is a language for finding information in an XML document. It uses expressions to select nodes or node-sets, and these expressions look very similar to the paths you see when working with a traditional computer file system.

You can use xPath from many programming languages, but I have built this tool so that users without programming knowledge can extract (scrape) information from XML and HTML pages on the web. To make things easier, I have predefined some of the most commonly used xPath expressions, which you can choose from the drop-down menu. Users who are familiar with xPath can add their own custom expressions.

The tool can scrape a maximum of 100 URLs at a time and accepts up to 10 xPath expressions. Here are some custom expressions you can either add or modify:
xPath Expression                                 Description
//div[2]                                         Selects every div that is the second div child of its parent.
//link[@rel='canonical']/@href                   Selects the location (href) of the link tag whose rel attribute is set to canonical.
/html/head/meta[@name='description']/@content    Selects the content attribute of the meta description tag.
//*[@class='class-name']                         Selects all elements whose class attribute is exactly 'class-name'.
//h2 | //title                                   Selects all H2 elements and the page title.
//*[name()='h1' or name()='title']               Selects all H1 elements and the page title (the same kind of union as above, written with name()).
//*[contains(@class, 'thumb')]                   Selects all elements whose class attribute contains 'thumb'.
//parent::*[text()='Welcome']                    Selects every element whose own text is 'Welcome'; append /.. to select its parent instead.
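To see how expressions like these behave, here is a small Python sketch run against a made-up page. It assumes the standard library's ElementTree, which supports only a subset of xPath: paths must be relative (start with `.`), attributes cannot be selected inside the path (so `.get()` is used instead of `/@href`), and `contains()`, `name()` and `|` are not available. The sample page and all variable names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed sample page (invented for this example).
PAGE = """
<html>
  <head>
    <title>Demo page</title>
    <link rel="canonical" href="https://example.com/demo"/>
    <meta name="description" content="A demo page"/>
  </head>
  <body>
    <div id="a">
      <div id="a1">first</div>
      <div id="a2">second</div>
    </div>
    <div id="b">
      <h2 class="thumb-title">Welcome</h2>
    </div>
  </body>
</html>
""".strip()

root = ET.fromstring(PAGE)

# //div[2]: every div that is the second div child of its parent,
# so more than one element can match.
second_divs = [div.get("id") for div in root.findall(".//div[2]")]
print(second_divs)  # ['a2', 'b']

# //link[@rel='canonical']/@href: find the element, then read the attribute.
canonical = root.find(".//link[@rel='canonical']").get("href")

# /html/head/meta[@name='description']/@content, relative to the html root.
description = root.find("./head/meta[@name='description']").get("content")

# //*[@class='class-name']: exact match on the whole class attribute.
by_class = [el.tag for el in root.findall(".//*[@class='thumb-title']")]

# Parent of the element whose text is 'Welcome': match the element,
# then step up one level with "..".
welcome_parent = root.find(".//*[.='Welcome']/..").get("id")

print(canonical, description, by_class, welcome_parent)
```

A full xPath 1.0 engine accepts the expressions exactly as written in the table; the adjustments above are only needed because of ElementTree's reduced syntax.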
Please note that the web scraper has just been launched as a Beta version and might be error-prone. If you encounter any issues or have suggestions for improvement, please contact me.