You can extract content flexibly, from a full web page down to a specific set of tags.
Note: This app assumes that you have a basic knowledge of HTML.
Cross-Browser and Cross-Platform Compatible
The first line of code downloads the jQuery library from Google's servers. If for some reason the library is not accessible, the second line of code loads it from your js folder (remember to include a copy of the jQuery library in your js folder).
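The two lines themselves are not shown in this description. The second one usually comes down to the check sketched below: if the copy from Google failed to load, window.jQuery is undefined, so a script tag pointing at the local copy is written into the page. The version number, the local file path, and the fallbackMarkup helper are assumptions for illustration, not the app's own code.

```javascript
// Assumed locations: adjust the version and the local file name to your setup.
var CDN_URL = 'https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js';
var LOCAL_URL = 'js/jquery.min.js';

// Hypothetical helper: returns the markup for the fallback <script> tag,
// or an empty string when jQuery is already available on the global object.
function fallbackMarkup(globalObj, localUrl) {
  if (globalObj && globalObj.jQuery) {
    return '';
  }
  // The escaped slash keeps an inline script from being closed early by the HTML parser.
  return '<script src="' + localUrl + '"><\/script>';
}

// In a page, this runs in an inline script placed right after the
// <script src="CDN_URL"> tag. It is skipped outside a browser.
if (typeof window !== 'undefined' && typeof document !== 'undefined') {
  var markup = fallbackMarkup(window, LOCAL_URL);
  if (markup) {
    document.write(markup);
  }
}
```

Because the check runs synchronously while the page is still being parsed, the local copy is in place before any later script that depends on jQuery.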
Afterward, you just need to generate your code and paste it after the jQuery library.
Scraping a Webpage
To get a piece of HTML from a website, first look at the source code of that page and find the element that contains the data you want. (To view the source code of a page in Firefox, right-click the page and click “View Page Source” or “Inspect Element”, or use a tool like Firebug.)
For example, say you want to get currency exchange rates from this page. If you look at the source code of the page, you can see that the data you want is located inside this element:
<div id="contentR" class="grid-cell sidebar">
To scrape this element, simply enter a CSS selector into the “HTML element” field to pinpoint its location. In this case you can enter either #contentR or .grid-cell.sidebar and then generate the code.
Now you just need to paste the generated code after the jQuery library, right before the closing body tag.
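The generated code is not reproduced in this description. As a rough illustration of what selecting by CSS selector means, the sketch below uses jQuery to pull the matching element out of an HTML string that has already been fetched. The extractBySelector helper and the sample markup are hypothetical and are not the app's actual output.

```javascript
// Hypothetical helper, not the code the app generates.
// `$` is the jQuery function; `html` is markup that has already been fetched.
function extractBySelector($, html, selector) {
  // Parse into a detached container so nothing is added to the live page.
  // $.parseHTML leaves out <script> elements unless asked to keep them.
  var container = $('<div>').append($.parseHTML(html));
  // Collect the outer HTML of every element that matches the selector.
  return container.find(selector).map(function () {
    return this.outerHTML;
  }).get();
}

// Browser-only usage; skipped where jQuery is not loaded.
if (typeof jQuery !== 'undefined') {
  var sample = '<div id="contentR" class="grid-cell sidebar">rates</div><p>other</p>';
  var byId = extractBySelector(jQuery, sample, '#contentR');
  var byClass = extractBySelector(jQuery, sample, '.grid-cell.sidebar');
  console.log(byId, byClass);
}
```

Both calls should return the same single element, since #contentR and .grid-cell.sidebar describe the same div in the sample markup.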
YQL – This method uses the Yahoo Query Language (YQL) Web Service, which lets the app access Internet data with SQL-like commands. Usage is capped at 2,000 requests per hour per IP, which is more than enough for most applications. The YQL method might not work on some websites.
JSONP – Allows cross-domain access via JSONP. The requested data is processed by a remote server and sent back to the browser in JSON format. It requires less code than the other methods. The downside of this method is that the remote server does not guarantee 100% uptime.
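To give a sense of the SQL-like commands mentioned for the YQL method, here is a sketch of how a request URL for Yahoo's public YQL endpoint was typically put together. The buildYqlUrl helper, the example page URL, and the XPath expression are assumptions for illustration and not the code the app generates. Yahoo retired the public YQL service in January 2019, so this shows the request format only.

```javascript
// Public YQL endpoint as it was documented by Yahoo.
var YQL_ENDPOINT = 'https://query.yahooapis.com/v1/public/yql';

// Hypothetical helper: builds a request URL that asks YQL's `html` table
// for the nodes of `pageUrl` matching `xpath`, with the result returned as JSON.
// Assumes `pageUrl` contains no double quotes and `xpath` no single quotes.
function buildYqlUrl(pageUrl, xpath) {
  var query = 'select * from html where url="' + pageUrl + '"' +
              " and xpath='" + xpath + "'";
  return YQL_ENDPOINT + '?q=' + encodeURIComponent(query) + '&format=json';
}

// Example: the element with id "contentR" on a made-up page.
var exampleUrl = buildYqlUrl('http://example.com/rates', '//div[@id="contentR"]');
console.log(exampleUrl);
```

The `html` table selected nodes by XPath, so an id selector such as #contentR is written here as //div[@id="contentR"].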
Videos – YouTube, Dailymotion and most other video sharing websites.
Dynamically Generated Content – Content that is generated dynamically based on criteria such as the user’s geographic location or the capabilities of the user’s device, or that is visible only to logged-in users.
Flash-Based Content – The contents of Flash files (the Flash file itself can be retrieved).
Ajax – Content that is loaded using Ajax.
Upon purchasing this plugin, you will receive excellent support directly from the script developer; nothing is outsourced.
How can I get the entire page of a website?
Why can’t I scrape the HTML of a particular page?
Check the limitations and see if any of them apply to that page. If not, try using a different CSS selector or select another method (e.g. AJAX). If it still doesn’t work, feel free to contact me.
We assume no responsibility for any abusive use of this software product and/or violation of any terms of usage of the grabbed web pages. If you decide to use this software product, use it responsibly and make sure that you are allowed to display the grabbed HTML content by checking the web page’s terms of usage. This software product is sold exclusively on codecanyon.net.
Version 1.3 (December 21, 2014)
- The YQL method now fully supports HTML5 tags.
- Improved the performance of the JSONP method.
Version 1.2 (November 14, 2014)
- Web scraper now works in WordPress.
Version 1.1 (October 28, 2014)
- Improved the documentation.
- Fixed a few minor bugs.