Category Archives: parsing
Extract Urls from a remote webpage using PHP
Scraping data from websites is extremely popular nowadays. I have written a simple website parser class that grabs all the URLs from a website, and I am sharing it below for anyone to use. We will use the parser class to extract all image sources and hyperlinks from a website. Uses: Create…
Render HTML file in Rails using Nokogiri
You may need to render a static HTML template file from a controller action without modifying anything in the file, or you may need to replace some content during rendering. A template may contain relative paths for CSS, images, etc., which may raise an exception because Rails may not route them automatically. In that case you have…
Extract href links from a website content using regular expression
In one of my previous posts, I discussed how to grab website content using cURL. In this post I give a sample code snippet that extracts all hyperlinks from the grabbed content using a regular expression. By using the following class you can grab site content and extract all hyperlinks: class ScrapWebsite{ var $target_url =…
Grab website content using cURL
Website content can be grabbed easily using cURL by following the steps below: (1) enable cURL; (2) copy the following code into a PHP file; (3) change the target URL; (4) run the page from a web server. $target_url = "http://www.morshed-alam.com"; $options = array( CURLOPT_RETURNTRANSFER => true, CURLOPT_HEADER => false, CURLOPT_FOLLOWLOCATION => true, CURLOPT_ENCODING => "", CURLOPT_USERAGENT => "spider", CURLOPT_AUTOREFERER => true, CURLOPT_CONNECTTIMEOUT =>…