JavaScript Web Scraper

JavaScript - JavaScript Web Scraper 8598806 by farazkelhini @ CodeCanyon

Script \ JavaScript \ Media

JavaScript Web Scraper is a web application that generates JavaScript code to automatically fetch HTML from a web page and display it on your website or blog. For example, to show tennis rankings from cbssports.com, you give the app the URL of that page and a CSS selector; it then generates code for you to paste into the source of your page, so that whenever someone visits your page they see the tennis rankings pulled directly from cbssports.com.

Typically, fetching HTML from other websites is done with PHP, which adds load to your server; JavaScript Web Scraper lets you do it with JavaScript in the visitor's browser, without putting pressure on your server. There is also an option to fall back to the traditional method via AJAX.

You can easily do complex content extraction in a flexible manner; the content can be a full web page or a set of tags.

Note: This app assumes that you have a basic knowledge of HTML.

Cross-Browser and Cross-Platform Compatible
JavaScript Web Scraper generates code that works across browser versions and devices, including older browsers such as Internet Explorer 6.

Installation
JavaScript Web Scraper relies on the jQuery library to get and display HTML. Include jQuery right before the closing body tag of your page, as shown below. (If your page already includes jQuery, do not include it a second time.)

<script src="//ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
<script>window.jQuery || document.write('<script src="js/jquery-1.11.1.min.js"><\/script>')</script>

The first line downloads the jQuery library from Google's servers. If for some reason that copy is not accessible, the second line loads the library from your js folder (remember to keep a copy of jQuery in your js folder).

Afterward, you just need to generate your code and paste it after the jQuery library.

Scraping a Webpage
To get a piece of HTML from a website, first look at the source code of that page and find the element that contains the data you want. (To view the source code of a page in Firefox, right-click on the page and choose “View Page Source” or “Inspect Element”, or use a tool like Firebug.)

For example, say you want to get currency exchange rates from this page. If you look at the source code of the page, you can see that the data you want is located inside this element:

<div id="contentR" class="grid-cell sidebar">
..
</div>

To scrape this element, enter a CSS selector into the “HTML element” field to pinpoint its location. In this case you can enter either #contentR or .grid-cell.sidebar and then generate the code.
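To see why both selectors point at the same element, here is a minimal sketch. It is not the app's code: `matchesSimpleSelector` is a hypothetical helper that only understands selectors built from `#id` and `.class` parts, and the plain object stands in for the div shown above.

```javascript
// Illustrative only: a tiny matcher for simple selectors made of #id and .class parts.
// It is not the app's code and does not cover the full CSS selector syntax.
function matchesSimpleSelector(element, selector) {
  // Split "#contentR" into ["#contentR"], or ".grid-cell.sidebar" into [".grid-cell", ".sidebar"].
  const tokens = selector.match(/[#.][\w-]+/g) || [];
  // Reject anything this sketch does not understand (tag names, combinators, attributes).
  if (tokens.length === 0 || tokens.join('') !== selector) return false;
  const classes = (element.className || '').split(/\s+/).filter(Boolean);
  return tokens.every(function (token) {
    const name = token.slice(1);
    return token[0] === '#' ? element.id === name : classes.includes(name);
  });
}

// Plain-object stand-in for <div id="contentR" class="grid-cell sidebar">.
const sidebar = { id: 'contentR', className: 'grid-cell sidebar' };

console.log(matchesSimpleSelector(sidebar, '#contentR'));          // true
console.log(matchesSimpleSelector(sidebar, '.grid-cell.sidebar')); // true
console.log(matchesSimpleSelector(sidebar, '.grid-cell.footer'));  // false
```

The id selector is usually the safer choice, since an id should be unique on a page while a class combination may match several elements.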

Now you just need to paste the generated code after the jQuery library right before the closing body tag.

Retrieving Methods
JavaScript Web Scraper provides three retrieval methods, all of which give you the same result. (Note that you do NOT need to know how these techniques work to scrape a webpage.)

YQL – This method uses the Yahoo Query Language (YQL) Web Service, which enables the app to access Internet data with SQL-like commands. The cap is 2,000 requests per hour per IP, which is more than enough for most applications. The YQL method might not work on some websites.

JSONP – Allows cross-domain JSONP access. The requested data is processed by a remote server and sent back to the browser in JSON format. It requires less code than the other methods. The downside of this method is that the remote server does not guarantee 100% uptime.

AJAX – This approach requires the cooperation of a PHP script. It’s handy when you want your own server to handle the requests. Once your server gets the data, it hands it to the JavaScript code via AJAX. Both scripts are generated for you automatically. You just need to copy the files to your server and make sure the path in the JavaScript code correctly points to the PHP file.
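As a rough illustration of the “SQL-like commands” behind the YQL method, the sketch below builds a request URL. Everything specific in it is an assumption rather than something taken from the app's generated code: the endpoint is the historical public YQL address, the `html` table with `url` and `xpath` keys is how that service exposed page content, `buildYqlUrl` is a hypothetical helper, and `http://example.com/rates` is a placeholder page.

```javascript
// Assumed values: the historical public YQL endpoint and its "html" table.
// Neither is taken from the app's generated code.
const YQL_ENDPOINT = 'https://query.yahooapis.com/v1/public/yql';

// Hypothetical helper: builds the request URL for one page and one XPath expression.
// This sketch assumes neither argument contains a double quote.
function buildYqlUrl(pageUrl, xpath) {
  // The SQL-like statement that the YQL service would evaluate.
  const query = 'select * from html where url="' + pageUrl +
                '" and xpath="' + xpath + '"';
  // The statement travels in the "q" parameter; the response is requested as JSON.
  return YQL_ENDPOINT + '?q=' + encodeURIComponent(query) + '&format=json';
}

// "//*[@id='contentR']" is the XPath counterpart of the CSS selector #contentR.
const requestUrl = buildYqlUrl('http://example.com/rates', "//*[@id='contentR']");
console.log(requestUrl);
```

In a page that already loads jQuery, a URL like this could be requested with `$.getJSON`; appending `&callback=?` asks jQuery to make the request as JSONP. Whether the endpoint still answers is outside the scope of this sketch.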

Limitations
JavaScript Web Scraper cannot scrape certain types of content.

Videos – YouTube, Dailymotion and most other video sharing websites.

Dynamically Generated Content – Content which is dynamically generated based on certain criteria like geographic location of the user, or capabilities of the user’s device, or things that are visible only to logged in users.

Flash Based Content – The contents of flash files (the flash file itself can be retrieved).

AJAX – Content that is loaded into the page via AJAX after the initial HTML is served.

Support
Upon purchasing this plugin, you will receive support directly from the script developer; nothing is outsourced.

FAQ
How can I get the entire page of a website?

Just leave the optional “HTML element” field empty. There are some restrictions, though: the JSONP and AJAX methods only get the contents of the head and body tags, and the YQL method only gets the contents of the body tag and removes any JavaScript code inside it.
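To make the YQL restriction concrete, the sketch below shows what “body contents with JavaScript removed” means for a small page. These are illustrative helpers, not the app's or the service's implementation; `bodyContents` and `stripScripts` are hypothetical names, and the regular expressions are a rough approximation that a real HTML parser would handle more reliably.

```javascript
// Illustrative helpers, not the app's implementation. Regular expressions are a
// rough tool for HTML and will miss edge cases that a real parser handles.

// Returns the markup between <body ...> and </body>, or the input if no body tag is found.
function bodyContents(html) {
  const match = /<body\b[^>]*>([\s\S]*?)<\/body\s*>/i.exec(html);
  return match ? match[1] : html;
}

// Removes <script>...</script> blocks, including their contents.
function stripScripts(html) {
  return html.replace(/<script\b[^>]*>[\s\S]*?<\/script\s*>/gi, '');
}

const page = '<html><head><title>Demo</title></head>' +
             '<body class="home"><p>Rates</p><script>alert(1)</script></body></html>';

console.log(stripScripts(bodyContents(page))); // <p>Rates</p>
```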

Why can’t I scrape the HTML of a particular page?

Check the limitations and see if any of them apply to that page. If not, try using a different CSS selector or select another method (e.g. AJAX). If it still doesn’t work, feel free to contact me.

Note
We assume no responsibility for any abusive use of this software product or for any violation of the terms of use of the grabbed web pages. If you decide to use this software product, use it responsibly and make sure that you are allowed to display the grabbed HTML content by checking the source page's terms of use. This software product is sold exclusively on codecanyon.net.

Changelog

Version 1.3 (December 21, 2014)

- The YQL method now fully supports HTML5 tags.

- Improved the performance of JSONP method.

Version 1.2 (November 14, 2014)

- Web scraper now works in WordPress.

Version 1.1 (October 28, 2014)

- Improved the documentation.

- Fixed a few minor bugs.

Last Update: 21 December 14; Compatible Browsers: IE6, IE7, IE8, IE9, IE10, IE11, Firefox, Safari, Opera, Chrome; Files Included: JavaScript JS, HTML, CSS; Software Version: jQuery.

Keywords: get data, get html, grab site content, html, html extractor, html grabber, html ripper, html5, javascript, jquery, tag extractor, tag ripper, web extractor, web grabber, web ripper.