PHILADELPHIA REFLECTIONS
The musings of a Philadelphia Physician who has served the community for nearly six decades

Related Topics

Computers and Websites
Much of the early development of the electronic computer took place in Philadelphia. We lost the lead, but it might return.

Website Development
The website technology supporting Philadelphia Reflections is PHP, MySQL and DHTML. The web hosting service is Internet Planners.

Regex URL Matching

On this site we check that each URL in an entry actually exists whenever the entry is updated.

There are two key technologies at work:

  • A PHP function that checks whether a URL is valid (thanks to marufit at gmail dot com in the PHP Manual)

  • Regex (regular expression) in a preg_replace_callback routine; this one is mine, all mine

function url_exists($url)
{
    //
    // checks whether a URL actually exists on the Internet
    //
    $handle = curl_init($url);
    if (false === $handle) {
        return false;
    }
    curl_setopt($handle, CURLOPT_HEADER, false);         // leave response headers out of the output
    curl_setopt($handle, CURLOPT_FAILONERROR, true);     // treat HTTP status codes of 400 and above as failures
    curl_setopt($handle, CURLOPT_NOBODY, true);          // ask for headers only, not the page body
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, false); // curl_exec() then returns true or false
    $connectable = curl_exec($handle);
    curl_close($handle);
    return $connectable;
}
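Before spending a network round trip on url_exists, it may be worth filtering out values that cURL could never fetch over HTTP, such as fragment links or mailto: addresses. The sketch below is one way to do that. The name is_checkable_url is hypothetical and not part of the original code, and the decision to accept only http and https is an assumption.

```php
<?php
// Hypothetical helper (not from the original post): decides whether a
// link target is worth passing to url_exists() at all.
function is_checkable_url(string $url): bool
{
    // parse_url() returns null when there is no scheme and false when
    // the URL is seriously malformed; neither is a string.
    $scheme = parse_url($url, PHP_URL_SCHEME);
    if (!is_string($scheme)) {
        return false;
    }

    // Assumption: only http and https links are checked.
    if (!in_array(strtolower($scheme), ['http', 'https'], true)) {
        return false;
    }

    // FILTER_VALIDATE_URL rejects syntactically invalid URLs.
    return filter_var($url, FILTER_VALIDATE_URL) !== false;
}
```

A callback could then test is_checkable_url($srcURL) && url_exists($srcURL), so that anything failing the syntax test never triggers a request.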


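function aExists($matches)
{
    //
    // function called by preg_replace_callback
    //
    // $matches[0] is the complete match
    // $matches[1] is the match for the first subpattern
    //   enclosed in '(...)' and so on
    //
    // checks to see if a regular link exists
    // something similar is done for img src= also
    //
    // the string returned here replaces the complete match in the
    // result; return $matches[0] instead to leave the link unchanged
    //
    $srcURL = $matches[3];   // third subpattern: the href value

    if (url_exists($srcURL)) {
        // the link works: do something here
        return "";
    }
    // the link is broken: do something else here
    return "";
}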
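The comment above mentions that something similar is done for img src=. The post does not show that pattern, so the following is only a guess at what it might look like. The pattern, the sample markup and the variable names are all assumptions.

```php
<?php
// Assumed pattern for image tags; the original pattern is not shown.
// Group 1: from "<img " up to and including src="
// Group 2: the image URL
// Group 3: the closing quote and the remainder of the tag
$imgPattern = '/(<img [^>]*?src=")([^"]*)("[^>]*>)/i';

// Made-up sample markup for illustration.
$sample = '<p><img class="photo" src="http://example.com/a.png" alt="A"></p>';

$imgMatches = [];
if (preg_match($imgPattern, $sample, $imgMatches)) {
    $imgURL = $imgMatches[2];   // the value that would go to url_exists()
    echo $imgURL, "\n";         // prints http://example.com/a.png
}
```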

$foo = preg_replace_callback(
            '/(.*?)(<a .*?href=")([^"]*)("[^>]*>)(.*?)(<\/a>)/i',
            "aExists",
            $source_string);
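To see which part of a link lands in which element of $matches, the pattern can be run against a small sample without touching the network. In this sketch the sample markup, the check_links function and the $isLive stand-in for url_exists are all hypothetical. The pattern itself is the one used in the call above.

```php
<?php
// Same pattern as in the call above.
$pattern = '/(.*?)(<a .*?href=")([^"]*)("[^>]*>)(.*?)(<\/a>)/i';

// Hypothetical wrapper: $isLive stands in for url_exists() so the
// example runs without network access.
function check_links(string $html, string $pattern, callable $isLive): string
{
    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($isLive) {
            // $m[1] text before the tag, $m[3] href value, $m[5] link text
            if ($isLive($m[3])) {
                return $m[0];                      // keep a working link as is
            }
            return $m[1] . $m[5] . ' [dead link]'; // strip the tags, flag the text
        },
        $html
    );
    return $result ?? $html;   // preg_replace_callback returns null on error
}

// Made-up sample with one "live" and one "dead" link.
$html = 'A <a href="http://good.example/">good</a> and a '
      . '<a href="http://bad.example/">bad</a> link.';

$checked = check_links($html, $pattern, function (string $url): bool {
    return $url === 'http://good.example/';   // pretend only this URL exists
});

echo $checked, "\n";
// prints: A <a href="http://good.example/">good</a> and a bad [dead link] link.
```

One thing to keep in mind: without the s modifier the dot does not match a newline, so a link whose text is split across lines would not be matched by this pattern.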

(my thanks to http://centricle.com/tools/html-entities/ for HTML encoding)

