Philadelphia Reflections

The musings of a physician who has served the community for over six decades

Volumes

Computers, Websites, and other Digital Gadgetry
What is novel today is old-hat tomorrow; but what is old-hat to someone today is still novel for someone else. These are our own thoughts about a variety of electronic novelties, for whoever finds them of interest.

George IV

The Age of the Philadelphia Computer
Computers have a long slow history. The computer industry, however, had an abrupt start and sudden decline, in Philadelphia.

Website Development

The website technology supporting Philadelphia Reflections is PHP, MySQL and DHTML. The web hosting service is Internet Planners. The development of this website has provided an opportunity to learn new technology, to try out different techniques for getting noticed by the search engines, and to experience the trials and tribulations of dealing with malicious hackers and spammers, who range from the annoying to the abusive. This collection of articles documents some of our experiences, and we hope it will benefit people surfing the web for solutions to problems we've encountered.

The primary purpose of this website is to deliver high quality content on the subjects of Philadelphia, Philadelphia History, medicine, medical economics and other subjects of interest to its author, Dr. George R. Fisher.

However, early in 2006 the site was attacked by spammers who broke in using security holes in the previous implementation of PHP. In the subsequent reconstruction of the site, there's been an opportunity to try out lots of new technology and techniques, some of which are detailed here.



XHTML vs. HTML

The markup language used by web browsers continues to evolve. The most current version (as of April 2009) is XHTML 1.1, an XML version of HTML.

Many browsers, most particularly IE, do not support XHTML. Technically speaking, they support only the "text/html" mime type, not "application/xhtml+xml". Lots of web developers have gone to the trouble of sticking closing tags ( />) in their BR, HR, META and INPUT tags and a DOCTYPE at the top but then serve the code as "text/html".

This produces a syntactic mish mash which may be worse than using strict HTML 4.01.

Why "worse"? Because of the possibility of unintended results from providing incorrect instructions to the browser. If you care about the output produced by the browser, which most developers and content providers emphatically do, then you have to be careful about what instructions you give the browser. You simply cannot count on getting what you want if what you're telling the browser to do is syntactically incorrect.

However, it's a little difficult to see just what good XHTML is:

Internet cognoscenti speak disparagingly of "tag soup" but the Internet is a lot more about content than it is about syntax, so who really cares?

Well, somehow, I do. A little. Since we use PHP on this site, we have the opportunity to figure out what features are supported by a browser and render the correct types of tags, mime-types, etc.

Check out the HTTP headers and the page source to see the following script in action:

  1. It renders XHTML 1.1 whenever it encounters a browser that can support it
  2. It uses output buffering (which demonstrably if illogically improves rendering response time)
  3. It sends the whole thing using gzip compression if the browser will support it
  4. But also, it concedes certain issues based on experience for the sake of a smoothly-operating website
<?php
//
//  This script figures out what kind of mime type (HTML vs XHTML) the browser supports and sends the correct headers
//  It also initiates compression, specifies cache-ing and sends other <meta http-equiv headers
// 
//  My thanks to http://www.workingwith.me.uk/articles/scripting/mimetypes for the basic idea and structure
// 
//  $_SERVER["HTTP_ACCEPT"] describes the mime_types a browser supports in a comma-separated list:
// 
//    mime_type,mime_type,mime_type
// 
//  If a browser prefers one mime_type or group of mime_types, it adds a q-value
// 
//    mime_type,mime_type;q=x.x, mime_type,mime_type,mime_type,...,mime_type;q=x.x
// 
//  The q-value is a number between 0.0 and 1.0 ... the higher the number, the greater the preference
//  The idea is that if we can serve more than one mime_type we should serve the browser's higher preference
//
//  ob_start("ob_gzhandler"); does all the work to compress the output if the browser can handle it
//  ob_start("fix_code"); calls the "fix_code" function instead, so initiating gzip is my responsibility
//
//  $_SERVER["HTTP_USER_AGENT"] is an opaque description of the browser itself
//
//  $_SERVER['HTTP_ACCEPT_ENCODING'] describes compression capabilities
//
//  I output these three variables as an HTML comment so I can debug things more easily
//
//  Despite my desire to do things "right", you will see I accommodate myself to the reality of user-supplied content 
//  and browser peculiarities in order to have a working website
//

function fix_code($buffer)
  {
  #
  # Called for HTML browsers to delete all the lovely close-brackets
  # it's up to me to initiate the gzipping because ob_start is called by "fix_code" instead of "ob_gzhandler"
  #
  if (stristr($_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip'))
    {
    header("Content-Encoding: gzip"); // notifies the far-end to un-gzip 
    return (gzencode(str_replace(" />", ">", $buffer),6,FORCE_GZIP));
    }
    else
      {
      return (str_replace(" />", ">", $buffer));
      }
  }

#
# default values
#
$charset          = "UTF-8";       # See http://en.wikipedia.org/wiki/UTF-8
$mime             = "text/html";   # Plain vanilla
$cache_control    = "max-age=200"; # Cache expires after 200 seconds

$xhtml_q          = 0;
$html_q           = 0;

# see http://www.w3.org/QA/2002/04/valid-dtd-list.html
$DOCTYPE_xhtml11  = "<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.1//EN' 'http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd'>\n"; 
$DOCTYPE_xhtml10  = "<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Strict//EN' 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'>\n";
$DOCTYPE_wap      = "<!DOCTYPE html PUBLIC '-//WAPFORUM//DTD XHTML Mobile 1.2//EN' 'http://www.openmobilealliance.org/tech/DTD/xhtml-mobile12.dtd'>\n";
$DOCTYPE_html401  = "<!DOCTYPE html PUBLIC '-//W3C//DTD HTML 4.01//EN' 'http://www.w3.org/TR/html4/strict.dtd'>\n";
$DOCTYPE_html401l = "<!DOCTYPE html PUBLIC '-//W3C//DTD HTML 4.01 Transitional//EN' 'http://www.w3.org/TR/html4/loose.dtd'>\n";

$html_xhtml       = "<html xmlns='http://www.w3.org/1999/xhtml' xml:lang='en'>\n\n";
$html_iphone      = "<html xmlns='http://www.w3.org/1999/xhtml' xml:lang='en' manifest='iphone.manifest'>\n\n";
$html_html401     = "<html lang='en'>\n\n";
$html_html401_IE  = "<html lang='en' xmlns:v='urn:schemas-microsoft-com:vml'>\n\n";  # xmlns:v='urn:schemas-microsoft-com:vml' is recommended by Google for maps display using IE
$html_plain       = "<html>\n\n";

# parental control tag
$pics_Label       = '(pics-1.1 "http://www.icra.org/pics/vocabularyv03/" l 
	gen true for "http://philadelphia-reflections.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 0) 
	gen true for "http://www.philadelphia-reflections.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 0) 
	gen true for "http://search.freefind.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 0) 
	gen true for "http://www.search.freefind.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 0) 
	gen true for "http://statcounter.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 0) 
	gen true for "http://www.statcounter.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 0) 
	gen true for "http://c3.statcounter.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 0) 
	gen true for "http://www.c3.statcounter.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 0))';

# I include the following HTML comment for my ongoing debugging purposes
$show_info        = "<!-- \nHTTP_USER_AGENT      $_SERVER[HTTP_USER_AGENT]\nHTTP_ACCEPT_ENCODING $_SERVER[HTTP_ACCEPT_ENCODING]\nHTTP_ACCEPT          $_SERVER[HTTP_ACCEPT]\n -->\n\n";

# note that I eval $prolog_type below so that the xml header (if any) gets the right charset
$prolog_type      = '$DOCTYPE_html401l $html_plain $show_info';

#
# the logic
# 

# W3C Validator
if (stristr($_SERVER["HTTP_USER_AGENT"],"W3C_Validator")) 
  {
  ob_start("ob_gzhandler");
  $mime        = "application/xhtml+xml";
    # UTF-8 produces character-type errors
    $charset     = "iso-8859-1";
  $prolog_type = '$xml_header $DOCTYPE_xhtml11 $html_xhtml $show_info';
  }
  else
    {
    # fancy wap-enabled handheld device
    if(stristr($_SERVER["HTTP_ACCEPT"],"application/vnd.wap.xhtml+xml")) 
      { 
      ob_start("ob_gzhandler");
        # per http://www.ready.mobi/ and http://www.w3.org/TR/mobileOK-basic10-tests/ application/xhtml+xml is preferred
//      $mime        = "application/vnd.wap.xhtml+xml";
        $mime        = "application/xhtml+xml";
      $prolog_type = '$xml_header $DOCTYPE_wap $html_plain $show_info';
      }
      else
        {
        # non-wap xhtml-enabled browser
        if(stristr($_SERVER["HTTP_ACCEPT"],"application/xhtml+xml")) 
          { 
          # retrieve the q values for "application/xhtml+xml" and "text/html"

          # a q-value is 0 or 1 with optional decimals (q=1, q=1.0, q=0.9, q=0.05 all match)
          if (preg_match('%application/xhtml\+xml[^;]*?;q=([01](?:\.\d+)?)%i', $_SERVER["HTTP_ACCEPT"], $matches)) 
            {
            $xhtml_q = (float)$matches[1];
            }

          if (preg_match('%text/html[^;]*?;q=([01](?:\.\d+)?)%i', $_SERVER["HTTP_ACCEPT"], $matches)) 
            {
            $html_q = (float)$matches[1];
            }

          # if the q value for HTML is greater than for XHTML
          # then treat output as HTML 4.01 strict (Opera 9.64, for instance)

          if($html_q > $xhtml_q) 
            {
            ob_start("fix_code");
            $mime        = "text/html";
              # UTF-8 produces character-type errors
              $charset     = "iso-8859-1";
            $prolog_type = '$DOCTYPE_html401 $html_html401 $show_info';
            }

            # otherwise, go with XHTML
            else
              {
              ob_start("ob_gzhandler");
                # for the time-being application/xhtml+xml is too strict for us: unless your tags are PERFECT, it blows up
//              $mime        = "application/xhtml+xml";
                $mime        = "text/html";
                # UTF-8 produces character-type errors
                $charset = "iso-8859-1";

              # see "Safari Web Content Guide for iPhone OS" for cache manifest description
              if (stristr($_SERVER["HTTP_USER_AGENT"],"iPhone")) 
                {
                $prolog_type = '$xml_header $DOCTYPE_xhtml11  $html_iphone $show_info';
                }
                else
                 {
                  $prolog_type = '$xml_header $DOCTYPE_xhtml11  $html_xhtml $show_info';
                 }
              }
            }
          
          else
            {
            # plain text/html browser
            if(stristr($_SERVER["HTTP_ACCEPT"],"text/html")) 
              { 
              ob_start("fix_code");
              $mime        = "text/html";
                # UTF-8 produces character-type errors
                $charset     = "iso-8859-1";
              $prolog_type = '$DOCTYPE_html401 $html_html401 $show_info';
              }
              else
                {
                # if the browser doesn't specify any X/HTML mime type, treat like HTML 4.01 Transitional (IE 7, for instance)
                ob_start("fix_code");
                $mime        = "text/html";
                  # UTF-8 produces character-type errors
                  $charset     = "iso-8859-1";
                $prolog_type = '$DOCTYPE_html401l $html_plain $show_info';
                # if IE then include Google's recommended "xmlns:v  ..." 
                if(stristr($_SERVER["HTTP_USER_AGENT"],"MSIE")) 
                  {
                  $prolog_type = '$DOCTYPE_html401l $html_html401_IE $show_info';
                  }
                }
            }
        }
    }

#
# output the mime type, prolog type and other <meta http-equiv= variables
#
header("Content-Type: $mime; charset=$charset");
header("Content-Language: en-us");
header("Vary: Accept");

header("Cache-Control: $cache_control");

header("Content-Script-Type: text/javascript");
header("Content-Style-Type: text/css");
header("imagetoolbar: no");

// parental controls from http://www.icra.org/
header("pics-Label: $pics_Label");

// privacy header created at http://www.p3pwiz.com/
header("P3P: policyref=\"http://www.philadelphia-reflections.com/w3c/p3p.xml\", CP=\"NID DSP NOI COR\"");

$xml_header       = "<?xml version='1.0' encoding='$charset' ?>\n";
eval("\$prolog_type = \"$prolog_type\";");

print $prolog_type;
?>
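
The q-value rules described in the header comments can also be read more strictly than the script above does. What follows is only a sketch under stated assumptions: parse_accept() and prefers_xhtml() are hypothetical names, not functions used on this site, and a media type with no q parameter is treated as q=1.0. Note that this stricter reading does not reproduce the script's handling of browsers such as Opera 9.64, which the script deliberately treats as HTML browsers.

```php
<?php
// Hypothetical helper (not part of the script above): parse an Accept header
// into an array of media type => q-value, treating a missing q as 1.0.
function parse_accept($header)
{
    $types = array();
    foreach (explode(',', $header) as $range) {
        $parts = explode(';', $range);
        $type  = strtolower(trim(array_shift($parts)));
        if ($type === '') {
            continue;                       // skip empty list members
        }
        $q = 1.0;                           // default when no q parameter is given
        foreach ($parts as $param) {
            if (preg_match('/^\s*q\s*=\s*([01](?:\.\d{0,3})?)\s*$/i', $param, $m)) {
                $q = (float) $m[1];
            }
        }
        $types[$type] = $q;
    }
    return $types;
}

// Returns true when the client lists application/xhtml+xml with a q-value
// at least as high as the one it gives text/html.
function prefers_xhtml($header)
{
    $types = parse_accept($header);
    $xhtml = isset($types['application/xhtml+xml']) ? $types['application/xhtml+xml'] : 0.0;
    $html  = isset($types['text/html'])             ? $types['text/html']             : 0.0;
    return $xhtml > 0.0 && $xhtml >= $html;
}
```

In a page script this might be called as prefers_xhtml($_SERVER['HTTP_ACCEPT']) once you have checked that the header is present.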

Here's an interesting article on Doctype Switching: http://gutfeldt.ch/matthias/articles/doctypeswitch.html

The Philadelphia Reflections webmaster: George IV

(my thanks to http://centricle.com/tools/html-entities/ for HTML encoding)

Webpage Printing

This site offers a Print button for all Reflections and Topics. Formatting the text on the pages to print nicely works quite well, but how to specify what to do with images remains a bit unclear (as of August 2006). Although 95% of users employ Internet Explorer because Microsoft supplies it free with new computers, IE is just about the worst browser to use for printing. Safari is much better, Firefox is pretty good, and Opera is satisfactory. These other browsers are free; find them with Google and download them. For the usual user, that's all you have to know.

If you are curious about the technicalities, read on. The "trick", if it can be called that, to special print formatting is the media attribute for CSS styling. The main stylesheet for this website is called in a LINK statement as follows:

<link rel="stylesheet" type="text/css" media="all" href="stylesheets/reflectionsLayout.css">

The media attribute tells the browser to use this stylesheet for all media types, i.e., for screens and printers. The pages that are formatted to print contain a second stylesheet that comes after the main stylesheet in the cascade and therefore supersedes it. This stylesheet controls the printing. IE seems to have its own views on font size, so we use some conditional comments to coax it to our way of thinking.

Here and there throughout the website are pages that contain onscreen navigation ("jump to top" and that sort of thing). We hide them when printing by saying class="navstrip" which you can see will result in those elements being hidden.

The specification of

<body onload="window.print()">

(all lower case for XHTML purposes) is what forces the print dialog to appear.

The remaining problem is how to specify CSS formatting for images so that text flows around them as we want. The formatting seems to work on screen for all browsers but only on some browsers for printing.

<style type="text/css" media="print">

  body        { margin: 0; padding: 0; width: 100%; }				
				
    #wrapper    { margin: 0; padding: 0; width: 100%; }
		
      #center     { margin: 0; padding: 0; }

      #content    { font-size: 11pt; line-height: 100%; font-family: "Times New Roman", Times, serif; }

        .navstrip   { visibility: hidden; }
				
</style>

<!--[if IE]>
<style type="text/css" media="print">

      #content    { font-size: 14pt; }
				
</style>
<![endif]-->


Floating Three-Column CSS Layout

A current fad in web page styling is to use CSS exclusively to define the basic page sections. The "old" way of doing this was to use tables, but that's no longer stylish.

A very common page layout has a head and a foot with three columns sandwiched in between. Philadelphia Reflections uses this layout.

Most descriptions of this layout style that I have found Googling around the Internet involve absolute positioning which very often does not adapt well to differing screen sizes and browser window sizes. What we use here makes use of floating columns, which re-size themselves very nicely.

Several anomalies and quirks should be noted:

These quirks and anomalies make me think that maybe this either isn't quite kosher or else may be superseded by later CSS definitions. But for the time being, this works very happily and both the HTML and the CSS validate perfectly well.

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
  <head>
    <title> Floating Three-column CSS example</title>

    <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">

<style type="text/css">


 #head {
  background-color:blue;
  color:white;
  text-align:center;
  }

 #wrap {
  }

 #left {
  float:left;
  width: 30%;
  }

 #right {
  float:right;
  width:30%;
  }

 #center {
  }

 #clear {
  clear:both;
  }

 #foot {
  background-color:red;
  color:white;
  text-align:center;
  }

</style>
  </head>

  <body>

    <div id="head">
      <p>Head</p>
    </div>

    <div id="wrap">

      <div id="left">
        <p>Left</p>
      </div>

      <div id="right">
        <p>Right</p>
      </div>

      <div id="center">
        <p>Center</p>
      </div>

      <div id="clear"></div>

    </div>

    <div id="foot">
      <p>Foot</p>
    </div>

  </body>
</html>

Editing HTML with PHP scripts

When creating scripts that allow a user to edit HTML, you have to ensure that the browser doesn't confuse the input with HTML to be rendered. I struggled with this long and hard and throughout the utilities section of this website are various hacks that I created with brute force. They work, but they are mostly ugly and all were time consuming.

Well, guess what? The PHP manual has a section on this subject and the solution is really rather elegant: Chapter 56, PHP and HTML. It's worth reading, but the essential bits are reproduced below:


Example 56-1. A hidden HTML form element
<?php
   echo "<input type='hidden' value='" . htmlspecialchars($data) . "' />\n";
?> 

Example 56-2. Data to be edited by the user
<?php
   echo "<textarea name='mydata'>\n";
   echo htmlspecialchars($data)."\n";
   echo "</textarea>";
?> 


Example 56-3. In a URL
<?php
   echo "<a href='" . htmlspecialchars("/nextpage.php?stage=23&data=" .
       urlencode($data)) . "'>\n";
?> 
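
One caveat about single-quoted attributes such as the ones in the examples above: before PHP 8.1, htmlspecialchars() by default left single quotes unescaped, so a value containing an apostrophe could end the attribute early. The following is a minimal sketch of one way around this, assuming UTF-8 content; attr() is a hypothetical helper name, not something from the manual or from this site.

```php
<?php
// Hypothetical helper: escape a value for use inside an HTML attribute,
// whether the attribute is delimited by single or double quotes.
function attr($value, $charset = 'UTF-8')
{
    // ENT_QUOTES converts both ' and " in addition to &, < and >
    return htmlspecialchars($value, ENT_QUOTES, $charset);
}

$data = "O'Reilly <b>\"Regex\"</b> & more";
echo "<input type='hidden' name='mydata' value='" . attr($data) . "' />\n";
```

Passing ENT_QUOTES explicitly keeps the behavior the same across PHP versions, whatever the default flags happen to be.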

Open a new window with XHTML

Once upon a time you could say

<a href="link" target="_blank">Click to open a new window</a>

and a new window would open. Highly annoying if used very often, but sometimes it's the right thing to do.

And then XHTML comes along and this is no longer legal.

target="_blank" is "deprecated" without a single word as to what a poor developer is to substitute.

Here's what you do:

<a href="link" onclick="window.open(this.href); return false;">Click to open a new window</a>

DHTML, PHP and MySQL References

Start with

DHTML and CSS for the World Wide Web:
Visual QuickStart Guide
by Jason Cranford Teague

DHTML = HTML + CSS + DOM + JavaScript

This book will get you up and running quickly with client-side programming.

Then you need to learn server-side programming.

PHP is an open source server-side scripting language that is easy to learn and very powerful. MySQL is the same ... open source relational database.

The text book for these technologies is

PHP and MySQL for Dynamic Web Sites:
Visual QuickPro Guide
by Larry Ullman

Master those two books and you'll be creating very powerful scripts on both the client and server side that produce dynamic and elegant results.

While you're in the process of doing this, you will constantly need to reference manuals for syntax, functions, etc. There are many, but two will suffice for 90% of what you need:

PHP Manual

W3 Schools

Web hosting providers

Choosing a web hosting service provider is difficult. There's no brand to rely on and an Internet search turns up confusing claims and offers. We have used two providers:

Based on our experience, Internet Planners is a reasonable choice for web hosting; Network Solutions is a bad choice.

Regular Expressions

Anyone who has used the expression *.doc to search for Word files has used a simple cousin of Regular Expressions ("regex") without realizing it: a wildcard pattern. Regex arose from mathematical theory and is available in many programming languages; it is often the only practical way to deal with large amounts of text. And yet most people are completely unaware of it.

Philadelphia Reflections uses regex extensively for two primary purposes: (1) checking input from forms and (2) modifying HTML input during the creation of articles for the site.

The text PHP and MySQL by Larry Ullman has a very good introduction to regex in his chapter on security.

The great advantage of regex is that it can identify very complex patterns in a mass of text. The great disadvantage of regex is that it has developed in sort of an underground way and there exist numerous varieties that are essentially incompatible. PHP offers two families of regex functions: one for the POSIX Extended variety of regex (the ereg functions) and the other for the Perl-compatible version called PCRE (the preg functions). POSIX is less powerful but far easier to learn. JavaScript offers its own variety of regex, which isn't quite the same as either of the two PHP versions.

References include the Ullman book; the PHP online manual, which has a number of handy tips on regex use in its two supported varieties; the O'Reilly book Mastering Regular Expressions, which is interesting; and Jan Goyvaerts' very helpful website (http://www.regular-expressions.info/) and book Regular Expressions: The Complete Tutorial.

My experience is that this area requires diligent hacking, which may be suboptimal but is unavoidable ... for this purpose, Jan Goyvaerts' Regex Buddy is indispensable; you simply must get this program if you hope to make anything of Regex.

Here are examples of checking for a valid email address in both Javascript and PHP:

Javascript


// check email

var namePattern = /^[a-z0-9][a-z0-9_.-]*@[a-z0-9.-]+\.[a-z]{2,4}$/i;

// trim() is assumed to be a helper function defined elsewhere on the page
document.comment_form.email.value = trim(document.comment_form.email.value);

if	(
	(document.comment_form.email.value.length > 0)
			&&
	// "!value == x" would negate the value first and never be true; use != instead
	(document.comment_form.email.value != "[none]")
			&&
	(! document.comment_form.email.value.match(namePattern))
	)
		{
			alert("Please enter a valid email address");
			document.comment_form.email.focus();
			document.comment_form.email.select();
			problem = "yes";
			return false;
		}

PHP


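// check email

// PCRE version of the pattern: eregi() was deprecated in PHP 5.3 and removed in PHP 7
$emailpattern = '/^[a-z0-9][a-z0-9_.-]*@[a-z0-9.-]+\.[a-z]{2,4}$/i';

// trim first, then measure: strlen(trim(...)), not trim(strlen(...))
$email = stripslashes(trim($_POST['email']));

if	(
	(strlen($email) > 0)

			and

	// "!$x == y" would negate $x first and never be true; use != instead
	($email != "[none]")

			and

	(!preg_match($emailpattern, $email))
	)
		{
		$inputerror		=	TRUE;
		$inputerrormessage	.=	"<br />* An invalid email address was entered";
		}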


Incomprehensible? Yes, absolutely.

Useful? More than you can realize until you are actually faced with the problem of, say, verifying that a user has input a valid email format, or trying to figure out whether a user-input IMG tag is using the correct syntax; or else maybe trying to convert a huge web page from XHTML 1.1 to HTML 4.01 because you've determined that the browser is syntactically crippled.

And, once you get deep into it, the stuff is actually intriguing and fun.
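
As a small illustration of the email case, here is a sketch that wraps the same pattern in a reusable function using PHP's PCRE functions. The name looks_like_email() is hypothetical, and the pattern is the one shown earlier in this article; it is deliberately simple and will reject some legitimate addresses (top-level domains longer than four letters, for instance).

```php
<?php
// Hypothetical helper wrapping this article's pattern in a PCRE call.
// The /i modifier makes the match case-insensitive.
function looks_like_email($address)
{
    $pattern = '/^[a-z0-9][a-z0-9_.-]*@[a-z0-9.-]+\.[a-z]{2,4}$/i';
    return preg_match($pattern, trim($address)) === 1;
}

// Try a few candidate strings and report which ones pass
foreach (array('dr.fisher@example.com', 'no-at-sign.example.com', 'a@b') as $candidate) {
    echo $candidate, ' => ', looks_like_email($candidate) ? 'ok' : 'rejected', "\n";
}
```

PHP also ships filter_var() with FILTER_VALIDATE_EMAIL (since PHP 5.2), which may be a reasonable alternative if you would rather not maintain the pattern yourself.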

Javascript: document.write and XHTML

For reasons that make no sense to me, the Javascript command document.write does not work when your page is rendered properly as XHTML (that is, served as application/xhtml+xml, as described elsewhere in this Topic).

I have searched the web in vain to find a Javascript solution. Many are offered but none work worth a damn.

So don't bother. Use PHP's echo function. It works perfectly.
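
As a minimal sketch of what that substitution can look like, assume the page used document.write only to print the current year in a footer. The function name copyright_line() and the footer itself are hypothetical, not taken from this site.

```php
<?php
// Hypothetical example: a copyright line that might otherwise have been
// produced in the browser with document.write(new Date().getFullYear()).
// $timestamp is optional and exists mainly so the function can be tested.
function copyright_line($owner, $timestamp = null)
{
    $year = date('Y', $timestamp === null ? time() : $timestamp);
    return '<p>&#169; ' . $year . ' ' . htmlspecialchars($owner, ENT_QUOTES, 'UTF-8') . "</p>\n";
}

echo copyright_line('Philadelphia Reflections');
```

Because the markup is complete before it leaves the server, the browser's XML parser never sees a script trying to write into the document.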

Captcha

With the rise of spam entries in web forms, a security feature called "captcha" has been developed.

CAPTCHA stands for "Completely Automated Public Turing test to tell Computers and Humans Apart". The idea is that only a human could read the letters contained in the image and then enter them in the form. "Accessibility", i.e., designing websites to accommodate people with handicaps, is obviously hindered by captcha; but at least given our experience with this website, spamming is a huge problem, and the inability of handicapped people to leave comments is a price we are willing to pay to rid ourselves of spam. The W3C, the Godhead of web standards, does not agree with me and lectures at length on the futility of captcha: Inaccessibility of CAPTCHA. Whatever. I may get around to implementing some of their recommendations later, if we continue to be spammed.

Spammers have countered captcha in a number of ways. The first is OCR, which is why the images have fuzzy backgrounds and distorted letters: they are trying to defeat OCR programs. As OCR techniques have improved, captcha programs have moved from letters to "objects" such as kittens, boxes, etc., which are thought to be harder for computers to recognize; harder for people, too: cat vs kitten, for example. I was amazed to learn during my captcha research that there are spammers who offer micro-payments to people in India, etc. to enter hundreds of spam messages manually in captcha-ed websites that have defeated their automated spamming systems. Move, counter move; seemingly endlessly.

In this website captcha has been implemented using PHP: the comments form that appears at the end of every page has an image created using the PHP image-creation routines which has random characters in it. If the characters in the image are entered correctly in the form, the comments are entered into the database.
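
The general shape of such a scheme can be sketched as follows. This is not the code used on this site (which is credited below); the function names, the alphabet and the image dimensions are all assumptions made for illustration, and the image is left undistorted, so it would offer little resistance to OCR.

```php
<?php
// Hypothetical helpers; the names and the alphabet are assumptions.

// Build a random challenge string, omitting look-alike characters (0/O, 1/I).
function captcha_code($length = 5)
{
    $alphabet = 'ABCDEFGHJKLMNPQRSTUVWXYZ23456789';
    $code = '';
    for ($i = 0; $i < $length; $i++) {
        $code .= $alphabet[mt_rand(0, strlen($alphabet) - 1)];
    }
    return $code;
}

// Compare what the visitor typed with the stored challenge, ignoring case
// and surrounding whitespace. An empty challenge never matches.
function captcha_matches($expected, $typed)
{
    return $expected !== '' && strtoupper(trim($typed)) === $expected;
}

// Render the challenge as PNG bytes with PHP's GD image routines, or return
// null when the GD extension is not installed.
function captcha_png($code)
{
    if (!function_exists('imagecreatetruecolor')) {
        return null;
    }
    $img = imagecreatetruecolor(20 + 12 * strlen($code), 30);
    imagefill($img, 0, 0, imagecolorallocate($img, 235, 235, 235));
    imagestring($img, 5, 10, 7, $code, imagecolorallocate($img, 40, 40, 120));
    ob_start();                 // imagepng() writes to output; capture it instead
    imagepng($img);
    return ob_get_clean();
}
```

A typical arrangement would store captcha_code() in the visitor's session when the form is shown, send captcha_png() with a Content-Type of image/png, and call captcha_matches() when the form comes back. mt_rand() is not a cryptographic generator; on PHP 7 and later random_int() would be the safer choice.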

I cribbed the PHP captcha code from http://www.white-hat-web-design.co.uk/articles/php-captcha.php and it worked right out of the box with the minor exception that the form HTML didn't quite pass XHTML muster; easily fixed. (I have subsequently discovered that PHP security and sessions don't play well together; this problem remains unresolved and I've had to turn off captcha processing for my secure pages.)

I implemented a number of other spam countermeasures before I got around to captcha, which involved noticing what the spammers did and writing code to frustrate it. I am constantly on the lookout for new security techniques to implement.

RSS, Atom, Syndication, etc.

The world is full of XML and XML-like file formats for syndication purposes.

Here's the list of files we generate automatically for submission to search engines and such.

(For right now, things are a bit abbreviated)

http://www.philadelphia-reflections.com/reflectionsRSS.xml (RSS Syndication file)
http://www.philadelphia-reflections.com/reflectionsATOM.xml (Atom Syndication file)
http://www.philadelphia-reflections.com/sitemap.xml (Google sitemap)
http://www.philadelphia-reflections.com/siteinfo.xml (A9/Amazon siteinfo.xml)
http://www.philadelphia-reflections.com/reflectionsIDIF1.xml (Yahoo IDIF file 1)
http://www.philadelphia-reflections.com/reflectionsIDIF2.xml (Yahoo IDIF file 2)
http://www.philadelphia-reflections.com/reflectionsIDIF3.xml (Yahoo IDIF file 3)
http://www.philadelphia-reflections.com/reflectionsIDIF4.xml (Yahoo IDIF file 4)
http://www.philadelphia-reflections.com/reflectionsIDIF5.xml (Yahoo IDIF file 5)
http://www.philadelphia-reflections.com/IDIFpointer.txt (Yahoo IDIF pointer file)
http://www.philadelphia-reflections.com/urllist.txt (Yahoo urllist.txt)

Validate Short RSS | The Short RSS File itself
Validate Short rss (lower case) | The Short RSS File itself (lower case)
Validate Short ATOM | The Short ATOM File itself

Weblogs.com extended successfully pinged
Weblogs.com successfully pinged
blo.gs successfully pinged
Technorati successfully pinged
Ping-O-Matic successfully pinged
Syndic8 successfully pinged (Feed ID 477463)

Ping Blogroller manually
Ping MyYahoo manually



The RSS and Atom validator (http://feedvalidator.org/) has a length restriction. I don't know what it is, exactly, but it bombs if your file is "too long". Since most syndication readers run the validator before they'll accept a feed, I have resorted to creating a short file, which is what I point to in my meta tags.



Here's how I provide change frequency and priority for our Google sitemap (in PHP ... $mod is the variable containing the date last modified)

$GOOGLEpriority = "0.0"; $GOOGLEfreq = "yearly";	// default

if ($mod > mktime(0,0,0) - 86400*210)	{$GOOGLEpriority = "0.1"; $GOOGLEfreq = "monthly";}	// past 210 days
if ($mod > mktime(0,0,0) - 86400*180)	{$GOOGLEpriority = "0.2"; $GOOGLEfreq = "monthly";}	// past 180 days
if ($mod > mktime(0,0,0) - 86400*150)	{$GOOGLEpriority = "0.3"; $GOOGLEfreq = "monthly";}	// past 150 days
if ($mod > mktime(0,0,0) - 86400*120)	{$GOOGLEpriority = "0.4"; $GOOGLEfreq = "monthly";}	// past 120 days
if ($mod > mktime(0,0,0) - 86400*90)	{$GOOGLEpriority = "0.5"; $GOOGLEfreq = "monthly";}	// past 90 days
if ($mod > mktime(0,0,0) - 86400*60)	{$GOOGLEpriority = "0.6"; $GOOGLEfreq = "monthly";}	// past 60 days
if ($mod > mktime(0,0,0) - 86400*30)	{$GOOGLEpriority = "0.7"; $GOOGLEfreq = "monthly";}	// past 30 days
if ($mod > mktime(0,0,0) - 86400*7)	{$GOOGLEpriority = "0.8"; $GOOGLEfreq = "weekly";}	// past 7 days
if ($mod > mktime(0,0,0) - 86400)	{$GOOGLEpriority = "0.9"; $GOOGLEfreq = "daily";}	// yesterday
if (date("Y-m-d", $mod) == date("Y-m-d"))	{$GOOGLEpriority = "1.0"; $GOOGLEfreq = "hourly";}	// today
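
The same cascade can also be written as a lookup table, which some may find easier to adjust. This is only a sketch: sitemap_hint() is a hypothetical name, both arguments are assumed to be Unix timestamps, and the second argument exists so the function can be exercised with a fixed "now".

```php
<?php
// Hypothetical table-driven variant of the cascade above.
// Returns array(priority, change frequency) for a page last modified at $mod.
function sitemap_hint($mod, $now)
{
    if (date('Y-m-d', $mod) === date('Y-m-d', $now)) {
        return array('1.0', 'hourly');              // modified today
    }

    // midnight at the start of the day containing $now
    $midnight = mktime(0, 0, 0, (int) date('n', $now), (int) date('j', $now), (int) date('Y', $now));

    // days back => (priority, change frequency), most recent first
    $steps = array(
        1   => array('0.9', 'daily'),
        7   => array('0.8', 'weekly'),
        30  => array('0.7', 'monthly'),
        60  => array('0.6', 'monthly'),
        90  => array('0.5', 'monthly'),
        120 => array('0.4', 'monthly'),
        150 => array('0.3', 'monthly'),
        180 => array('0.2', 'monthly'),
        210 => array('0.1', 'monthly'),
    );

    foreach ($steps as $days => $hint) {
        if ($mod > $midnight - 86400 * $days) {
            return $hint;                           // first (most recent) window that fits
        }
    }
    return array('0.0', 'yearly');                  // older than 210 days
}
```

A caller might write list($GOOGLEpriority, $GOOGLEfreq) = sitemap_hint($mod, time()); and then emit the two values in the sitemap entry's priority and changefreq elements.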



IDIF is a stupid format: it includes the entire blog_contents, so the files are huge. In the process of setting this up, I learned that flat files have a maximum size of 1.4 megs or so (the size of an old floppy disk), so I had to create more than one.

Which explains the stupid concept of a "pointer file"; instead of just giving Yahoo the IDIF file itself, you give it a pointer file with URLs pointing to the multitude of IDIF files. Really stupid.

News flash: after finding the Journal Of Ovid on the web, I learned about length restrictions for the input fields (described below). This information was not contained on the Yahoo web site describing their file format. It considerably reduced the file sizes, but I retained the structure of multiple files because who knows what I'll learn next?

IDIF title must be a maximum of 80 characters
IDIF description must be a maximum of 180 characters
IDIF body must be a maximum of 1000 characters
I'm only guessing about keywords

Thanks to the Journal Of Ovid on the web for this secret information

From the inside out: trim, replace whitespace (thanks to the PHP manual for this), shorten to maximum length

$IDIFtitle		= substr( preg_replace ('/\s\s+/', ' ', trim($title) ), 0, 80 );
$IDIFdescription	= substr( preg_replace ('/\s\s+/', ' ', trim($description) ), 0, 180 );
$IDIFkeywords		= substr( preg_replace ('/\s\s+/', ' ', trim($keywords) ), 0, 79 ) . " ";
$IDIFblog_contents	= substr( preg_replace ('/\s\s+/', ' ', trim($blog_contents) ), 0, 1000 );
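
One thing to watch with substr() is that it counts bytes, so it can cut a multi-byte UTF-8 character in half at the limit. The following is a minimal sketch of a helper that avoids this when the mbstring extension is available; idif_field() is a hypothetical name, and unlike the lines above it collapses every run of whitespace (including a single tab or newline) to one space.

```php
<?php
// Hypothetical helper: collapse whitespace, trim, and cut to a maximum
// number of characters. Uses mb_substr() when the mbstring extension is
// loaded so a multi-byte UTF-8 character is not split; otherwise it falls
// back to the byte-based substr().
function idif_field($text, $max)
{
    $clean = trim(preg_replace('/\s+/', ' ', $text));
    if (function_exists('mb_substr')) {
        return mb_substr($clean, 0, $max, 'UTF-8');
    }
    return substr($clean, 0, $max);
}
```

With this in place the title line might read $IDIFtitle = idif_field($title, 80); and similarly for the other fields.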

Yahoo is said to support a simple text file list of URLs, "urllist.txt". Documentation, of course, is scarce.



Web Standards Validation

There are two primary aspects of a website that need validation:



1. (X)HTML

You can use the W3C's QA Markup Validation Service.
The URL to test the main page of Philadelphia Reflections is http://validator.w3.org/

Firefox has several useful add-ons for (X)HTML validation; one that uses Tidy is here: Html Validator

2. CSS

The W3C has a validation service for CSS, too.
For Philadelphia Reflections, the following URL checks all the CSS definitions in the main page: http://jigsaw.w3.org/css-validator/ (note: this validator is a little flaky: it produces different answers for the same file; you have to refresh a couple of times to get the whole story)

Firefox has several useful web developer add-on tools; try this one: Web Developer


Once you've gotten the HTML and CSS basics under control, there are other aspects of your site that you will want to validate:

Broken Links

The W3C will check all your links for both response time and validity.
http://validator.w3.org/checklink/checklink

Tidy

There is an absolutely lovely program called HTML Tidy, originally written by Dave Raggett and described by the W3C here: http://www.w3.org/People/Raggett/tidy/

Calls to Tidy are available in some newer renditions of PHP (sadly, not the one we are using). However, on Windows (only) versions of Firefox and Mozilla, you can download an extension that will provide all the Tidy functions in your browser! ... https://addons.mozilla.org/firefox/249/. This is a fantastic feature that I use all the time.

Syndication XML Validation

Validating RSS and Atom files is greatly facilitated by http://feedvalidator.org/. It has a number of quirks, the worst of which is a length limitation that our feeds exceed. Since all the feed aggregators use this facility and reject any feed that it doesn't validate, we have to provide "short" syndication files.

Google Sitemap Validation

If you submit a sitemap to Google through their Webmaster Tools facility they will validate your sitemap when they load it. An external validation tool is available here: Validome Google Sitemap(s) Validator

Yahoo and Microsoft have agreed to support Google's Sitemap protocol and to support the inclusion of the line "Sitemap: http://www.philadelphia-reflections.com/sitemap.xml" in robots.txt. If other search engines adopt this facility it will make it much easier to get into the world's many search engines ... they'll pick up this line instead of us having to hunt them down.

Meta Tag Validator

As you puzzle the mysteries of search-engine indexing, you'll want to check your meta tags: http://www.widexl.com/remote/search-engines/metatag-analyzer.html

gzip Compression & Headers

When you start getting really fancy and want to include automatic gzip compression, you'll want to see it in action and you'll want to check out all of your HTTP headers: http://www.gidnetwork.com/tools/gzip-test.php

Response Time

Of course, the reason you're experimenting with gzip is that you're concerned about response time.
(1) Try this site for a detailed analysis: http://www.websitepulse.com/help/tools.php
(2) Firefox to the rescue again: FasterFox is another lovely add-in: https://addons.mozilla.org/firefox/1269/

Geo Tags

Check the validity of your geo tags here: Geo-Tag Validator

Big List

http://uitest.com/en/analysis/ is the mother of all lists of validation routines

Ampersand Madness: Convert & to &amp; to prevent XHTML errors

The whole subject of "encoding" gives me a headache.



Encoding In General

The first thing you have to know is: what is HTML encoding ... so look here:
http://htmlhelp.com/reference/html40/entities/
or here:
http://www.cookwood.com/html/extras/entities.html

(These are HTML encodings; URL encoding is something else again ... look here:
» http://www.blooberry.com/indexdot/html/topics/urlencoding.htm)

Ampersand Encoding and Conversion

Later on, you'll find out that the ampersand is a huge source of XHTML errors because it has to be written as &amp; but you will struggle endlessly with how to get the darn thing to stay converted. First of all, content providers feel justified in including bare naked "&"s wherever they please; second of all, you will find that encoded ampersands get stripped back to their bare naked selves by browsers and other well-meaning sorts.

So, my undying thanks to Michael Ash's Regex Blog for providing the regex pattern in the following bit of PHP code:


$pattern = '/&(?!(?i:\#((x([\dA-F]){1,5})|(104857[0-5]|10485[0-6]\d|1048[0-4]\d\d|104[0-7]\d{3}|10[0-3]\d{4}|0?\d{1,6}))|([A-Za-z\d.]{2,31}));)/i';
				
$replacement = '&amp;';
				
$string = preg_replace ( $pattern, $replacement, $string);

I don't know how it can possibly work, and I may yet eat my words, but for the moment it seems to do the trick.
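A quick way to see what the pattern does is to run it over a few sample strings. The sketch below wraps the same pattern in a function; the name encode_bare_ampersands and the sample strings are made up for illustration and are not part of the site's code.

```php
<?php
// Hypothetical wrapper around the pattern shown above
function encode_bare_ampersands($string)
{
    $pattern = '/&(?!(?i:\#((x([\dA-F]){1,5})|(104857[0-5]|10485[0-6]\d|1048[0-4]\d\d|104[0-7]\d{3}|10[0-3]\d{4}|0?\d{1,6}))|([A-Za-z\d.]{2,31}));)/i';
    return preg_replace($pattern, '&amp;', $string);
}

// bare ampersands are encoded ...
echo encode_bare_ampersands('Tom & Jerry, AT&T'), "\n";
// prints: Tom &amp; Jerry, AT&amp;T

// ... but ampersands that already start an entity are left alone
echo encode_bare_ampersands('&amp; &#38; &#x26; &copy;'), "\n";
// prints: &amp; &#38; &#x26; &copy;
```

Because existing entities are skipped, running the function twice over the same text gives the same result as running it once, which is what makes it safe to apply to content of unknown origin.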

Ampersand Encoding In RSS

Another thing: &#x26; is the only ampersand encoding form acceptable to both RSS and Atom. So, look at the source of this page and you will find that I use this encoding in the title ... that's because the title goes into the Title field of my RSS and Atom feeds.
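For feed titles, something along these lines would produce the &#x26; form. This is only a sketch: the function name feed_title is hypothetical, and the lookahead is a simplified stand-in for the longer pattern, not the code this site runs.

```php
<?php
// Hypothetical helper: make every ampersand in a feed title a numeric &#x26;
function feed_title($title)
{
    // first normalize ampersands that are already encoded as &amp;
    $title = str_replace('&amp;', '&#x26;', $title);

    // then encode bare ampersands; the lookahead skips anything
    // that is shaped like an entity (simplified assumption)
    return preg_replace('/&(?!#?[A-Za-z0-9]+;)/', '&#x26;', $title);
}

echo feed_title('Fish &amp; Chips & Peas'), "\n";
// prints: Fish &#x26; Chips &#x26; Peas
```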

CSS Zen Garden Suggestions

Here is a list of links (that open in their own pages) that show some of my favorite web designs. The CSS Zen Garden is a website that illustrates what can be done with clever CSS design. The HTML and the content are exactly the same in each of these links, only the CSS changes; but what a difference!


C Note

Dark Rose

Dead or Alive

Egyptian Dawn

Invitation

Mediterranean

Mozart

Odyssey

Tranquille

Zen Pool

webZine

White Lily

Another website to consider is http://www.freecsstemplates.org/

Font Families

The following link shows the results of a survey done to find out which font families are installed on Windows machines. This should help determine which fonts to use.

http://www.codestyle.org/css/font-family/sampler-WindowsResults.shtml

Identifont is a site that helps identify good font choices.

Geo Positioning

Geo Tagging refers to adding latitude and longitude information to websites and photographs. This has been around for a long time but it has taken the advent of Google Earth for it to really start to catch on.

This blog entry has geo meta tags that you can see if you look at the HTML source ("View > [Page] Source"). The input was as follows:

Address: 82 Devonshire St Boston MA

Lat: 42.3578 Lon: -71.0577

Descriptive Place Name: Fidelity Investments headquarters

Region: US-MA Country Code: US Country Name: United States


This creates meta tags in the HTML Header as follows:

<!-- geo tags for 82 Devonshire St Boston MA -->
<meta name="ICBM"          content="42.3578, -71.0577" />

<meta name="geo.country"   content="US" />
<meta name="geo.region"    content="US-MA" />
<meta name="geo.placename" content="Fidelity Investments headquarters" />
<meta name="geo.position"  content="42.3578; -71.0577" />

<meta name="tgn.name"      content="Fidelity Investments headquarters" />
<meta name="tgn.nation"    content="United States" />

The Region, Country Code and Country Name can be found here: ISO-3166-1 Country Names

geo.placename and tgn.name are often rendered as the city name but are intended to describe the geographical feature ("Pyramids of Giza" or something). This tag is optional.
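Generating these tags from the input fields is mostly string assembly. The following is a minimal sketch, not the site's actual code: the function name geo_meta_tags and the choice of four decimal places are assumptions made for the example.

```php
<?php
// Hypothetical helper: build the geo meta tags from the input fields.
// Formatting coordinates to four decimal places is an assumption of this sketch.
function geo_meta_tags($lat, $lon, $placename, $region, $country_code, $country_name)
{
    $lat = sprintf('%.4F', $lat);
    $lon = sprintf('%.4F', $lon);

    $tags = array(
        'ICBM'          => "$lat, $lon",   // comma-separated
        'geo.country'   => $country_code,
        'geo.region'    => $region,
        'geo.placename' => $placename,
        'geo.position'  => "$lat; $lon",   // semicolon-separated
        'tgn.name'      => $placename,
        'tgn.nation'    => $country_name,
    );

    $html = '';
    foreach ($tags as $name => $content) {
        // escape the content so quotes or ampersands cannot break the attribute
        $html .= '<meta name="' . $name . '" content="'
               . htmlspecialchars($content, ENT_QUOTES, 'UTF-8') . '" />' . "\n";
    }
    return $html;
}

echo geo_meta_tags(42.3578, -71.0577, 'Fidelity Investments headquarters',
                   'US-MA', 'US', 'United States');
```

Note that ICBM separates latitude and longitude with a comma while geo.position uses a semicolon; that difference is easy to get wrong when writing the tags by hand.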

HTML geo meta tags can be validated here: {geo tag validation}



There is a search engine of long standing that reads HTML geo meta tags and indexes the website based upon its location; for searching, it groups sites based on their geographic proximity: GeoURL.



Photographs can also contain geo meta data, so-called EXIF data (Firefox has an EXIF viewer AddOn).

JPEG is the most common image format and the easiest to deal with. The combination of Picasa 2 and Google Earth allows you to add this information to your own photos easily.

The process of adding lat and lon to your photographs is this:

1. Select one or more photos in Picasa
2. Select Tools > GeoTag > GeoTag with Google Earth ...
3. This starts Google Earth and you can "fly" to the location of the picture

4. A small Picasa window will appear in Earth's lower-right corner displaying thumbnails of the pictures you selected; press the "Geotag" button.
5. When all of your pictures are tagged, press the "Done" button

Slowly, camera manufacturers are providing GPS capability. Some few have GPS devices built in and some others allow an external GPS device to be attached, although both Canon and Nikon are way behind the curve ... if you own either, you can essentially forget it: the best - lousy - solution is to carry around a GPS with you and synchronize the times ... ugly.


The Google Maps API allows maps to be embedded in a website as is done here. Google Maps API

The JavaScript required to embed the map on this page can also be seen in the HTML source ("View > [Page] Source"). In addition to the JavaScript, you need a DIV whose ID is "map" (or whatever ID is passed to document.getElementById in the JavaScript); the DIV's height and width set the size of the map to be displayed.

To embed these maps you must register with Google


In addition, there are extensions to ATOM and RSS to include lat and lon in your syndication feeds; there are three standards that I have found: GeoRSS (ATOM example) , W3C Geo (RSS example) and an "ICBM RSS Module". This website extends the namespaces of both its ATOM feed and its RSS feed to include all the tags.
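As a rough illustration of the first two formats, a feed entry's location can be written as below. This is a sketch under stated assumptions: the function name feed_geo_elements is hypothetical, the feed's root element is assumed to declare the two namespaces named in the comment, and the ICBM module is left out.

```php
<?php
// Hypothetical helper: the location elements for one feed entry.
// Assumes the feed's root element declares
//   xmlns:georss="http://www.georss.org/georss"
//   xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
function feed_geo_elements($lat, $lon)
{
    $lat = sprintf('%.4F', $lat);
    $lon = sprintf('%.4F', $lon);

    // GeoRSS puts latitude and longitude in one element, separated by a space;
    // W3C Geo uses one element for each
    return "<georss:point>$lat $lon</georss:point>\n"
         . "<geo:lat>$lat</geo:lat>\n"
         . "<geo:long>$lon</geo:long>\n";
}

echo feed_geo_elements(42.3578, -71.0577);
```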

Google, Yahoo and Microsoft all now support GeoRSS as a feed to their map programs. My sense of it is that KML is a richer protocol, allowing more features, but fundamentally all these XML variants do mostly the same thing.

Google Sitemaps can include links to KML files (and ATOM, now, too). Part of the sitemap generation on this site is some code that picks up every *.kml and *.kmz file in the /kml/ folder and adds them to our sitemap.xml file.
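The folder scan can be sketched roughly as follows. This is not the site's actual sitemap code: the function name kml_sitemap_entries is hypothetical, and the demonstration uses a temporary folder in place of the real /kml/ folder.

```php
<?php
// Hypothetical helper: one sitemap <url> entry per *.kml or *.kmz file in $dir.
// $base_url is the public URL of that folder, with a trailing slash.
function kml_sitemap_entries($dir, $base_url)
{
    $entries = '';
    foreach (scandir($dir) as $file) {
        if (preg_match('/\.km[lz]$/i', $file)) {
            $loc = htmlspecialchars($base_url . rawurlencode($file), ENT_QUOTES, 'UTF-8');
            $entries .= "  <url><loc>$loc</loc></url>\n";
        }
    }
    return $entries;
}

// demonstration in a temporary folder standing in for /kml/
$dir = sys_get_temp_dir() . '/kml_demo_' . uniqid();
mkdir($dir);
touch("$dir/Franklin.kmz");
touch("$dir/tour.kml");
touch("$dir/notes.txt");          // ignored: not a KML or KMZ file

$entries = kml_sitemap_entries($dir, 'http://www.philadelphia-reflections.com/kml/');
echo $entries;

// clean up the demonstration folder
array_map('unlink', glob("$dir/*"));
rmdir($dir);
```

The returned lines would then be written into sitemap.xml alongside the entries for the regular pages.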


Google Earth is filled with delights, not the least of which is a Flight Simulator! Google Earth Flight Simulator Keyboard Controls


KML ( Keyhole Markup Language, Keyhole being the predecessor to Google Earth) is an XML protocol that allows you to incorporate Google Earth into graphical presentations. Google KML Overview

Google Earth Outreach helps you get started: Google Earth Outreach

An extraordinary collection of KML files you can view is found here: Spectacular satellite images of the world

I found a KML editor here: NorthGates' KML Editor for Windows. It's rudimentary but very handy for what it does do.

Here's the Google Earth tools list where I found the KML editor: EarthPlot Software Tools For Google Earth


The way we serve the KML in the link that connects to Google Earth from individual blogs uses the following PHP script as its base:

<?php

// See Google Earth's KML 2.1 Reference
// http://code.google.com/apis/kml/documentation/kml_tags_21.html

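// numeric inputs are cast to float and the place name is escaped,
// so that request parameters cannot inject markup into the KML
$lat			= isset($_GET['lat'])       ? (float) $_GET['lat']       : 0.0;
$lon			= isset($_GET['lon'])       ? (float) $_GET['lon']       : 0.0;
$placename		= isset($_GET['placename']) ? htmlspecialchars($_GET['placename'], ENT_QUOTES, 'UTF-8') : '';
	
// optional view parameters, with defaults
$altitude		= isset($_GET['altitude'])  ? (float) $_GET['altitude']  : 0.0;
$range			= isset($_GET['range'])     ? (float) $_GET['range']     : 1000.0;
$heading		= isset($_GET['heading'])   ? (float) $_GET['heading']   : 0.0;
$tilt			= isset($_GET['tilt'])      ? (float) $_GET['tilt']      : 0.0;
	
$description	= "<h3><font color=\"#ea9f20\"><a href=\"http://www.philadelphia-reflections.com/\">
		PHILADELPHIA REFLECTIONS</a></font></h3>
		<p>The musings of a Philadelphia physician who has served the community for six decades.</p>";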
	
	
header('Content-Type: application/vnd.google-earth.kml+xml');
header('Content-Disposition: inline; filename="philadelphia-reflections.kml"');

echo '<?xml version="1.0" encoding="UTF-8"?>'; 

?>

<kml xmlns="http://earth.google.com/kml/2.1">

  <Placemark>
    <name><?php echo $placename; ?></name>
    <description>
        <![CDATA[<?php echo $description; ?>]]>
    </description>
    
    <LookAt>
      <longitude><?php echo $lon; ?></longitude>
      <latitude><?php echo $lat; ?></latitude>
      
      <altitude><?php echo $altitude; ?></altitude>
      <range><?php echo $range; ?></range>
      <tilt><?php echo $tilt; ?></tilt>
      <heading><?php echo $heading; ?></heading>
      
      <altitudeMode>relativeToGround</altitudeMode>
    </LookAt>
    
    <Point>
      <coordinates><?php echo "$lon,$lat,$altitude"; ?></coordinates>
    </Point>
    
  </Placemark>

</kml>


Of course, GPS devices are an integral part of this process of Geo Positioning. GPS devices are supposed to support the open-source protocol GPX, which is an XML-based description of waypoints and routes. Wikipedia describes GPX here: GPS eXchange Format

The GPX protocol's official website is here: GPX The GPS Exchange Format

Google Earth supports raw GPX (File > Open ...) and when you open a GPX file in Google Earth, it converts it to KML. But if you want stand alone programs to do this:

If you have non-standard GPS data, you may want to have a look at GPS Babel for conversion of native GPS formats as well as the GPS Utility and G7ToWin

A nice blog on these things relative to Google Maps is here: Using XSL to Transform Google Earth (KML) and GPX to Google Maps API


At Philadelphia Reflections, we are creating tours by carrying a GPS and a camera around on our travels. The GPS track becomes a path and waypoints become placemarks. When you come home, download the GPS data in GPX format and open up the GPX file in Google Earth. Use Google Earth to edit the placemark balloons, including pictures and text.



There are many, many sightseeing blogs around that take you to interesting places on Google Maps and Google Earth. A place to start looking is Sightseeing with Google Satellite Maps


Somehow, the concept of "mashup" is related to all of this but it sort of sounds like the term "multimedia" a few years ago ... fancy in concept but somewhat vague in reality.

Google has a Mashup Editor and Wikipedia has a definition but it's not clear what it all adds up to.




Process .htm and .html as php

It is sometimes helpful to include php scripting in files that do not have the file extension of php.

There is quite a lot of discussion on the web about this but at least on this server the answer is not what most people think.

In the .htaccess file in the root folder include these two lines:

AddType application/x-httpd-php .html .php .htm
AddHandler application/x-httpd-php .html .php

Call KML files from within a blog or topic

Here's how to create a button in your blogs or topics that calls KML or KMZ files you create. In the Modify A Blog utility you can include a KML or KMZ file, but to call it explicitly from within the blog, you can create a button as shown here.

  1. Create and save your KML file.
  2. FTP the file to the kml folder in Philadelphia Reflections
  3. Use the following code in a blog or topic to call the kml file


<button onclick="location.href='http://www.philadelphia-reflections.com/kml-read.php?file=Franklin.kmz'">Button Label</button>

This creates a button labeled "Button Label".



Instead of Franklin.kmz, put any .kml or .kmz file that is in the kml folder:
http://www.philadelphia-reflections.com/kml/

Send a KML file from disk using PHP

Sending a kml or kmz disk file is as easy as clicking on it. But different browsers react differently: some ask you which program to use, others store the file on your desktop, etc. Preprocessing the file through PHP can reduce some of these annoyances.

<?php

//
// reads and sends a kml or kmz file
// located in /whatever/kml/
//
// calling protocol:
// this-program.php?file=somefile.kml
//

// read the input and check that it's a kmz or kml file
// ....................................................

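// basename() strips any directory part, so a request such as
// file=../somewhere/else.kml cannot reach outside the kml folder
$kml_file	= isset($_GET['file']) ? basename($_GET['file']) : "";

if ($kml_file == "") {exit ("error message");}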
		
if ((substr($kml_file, -4) != ".kmz") AND (substr($kml_file, -4) != ".kml"))	
	{
	exit ("error message");
	}


// prepend the file path information to the file name and check that it exists
// ...........................................................................

$kml_file_name = "/whatever/kml/" . $kml_file;
	
if (!file_exists($kml_file_name)) { die ("error message");}


// send out the HTTP header information followed by the file contents
// ..................................................................
	
header("Cache-Control: no-cache, no-store, must-revalidate"); // trying to keep from getting the files stored on the local computer
header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");

if (substr($kml_file, -4) == ".kml") {header('Content-Type: application/vnd.google-earth.kml+xml');}
if (substr($kml_file, -4) == ".kmz") {header('Content-Type: application/vnd.google-earth.kmz');}

header("Content-Disposition: inline");
header("Content-Description: KML or KMZ data intended for Google Earth");

readfile ($kml_file_name);

?>
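Since the file name comes straight from the query string, it is worth being strict about what is accepted. The checks at the top of the script could be gathered into one function along these lines; the name safe_kml_name and the set of allowed characters are assumptions made for this sketch.

```php
<?php
// Hypothetical helper: return a safe file name, or NULL if the request is unacceptable
function safe_kml_name($requested)
{
    // drop any directory part such as "../"
    $name = basename((string) $requested);

    // allow only plain names ending in .kml or .kmz (assumed character set)
    if (!preg_match('/^[A-Za-z0-9_.-]+\.km[lz]$/', $name)) {
        return NULL;
    }
    return $name;
}

var_dump(safe_kml_name('Franklin.kmz'));            // string(12) "Franklin.kmz"
var_dump(safe_kml_name('../secret/Franklin.kml'));  // string(12) "Franklin.kml"
var_dump(safe_kml_name('../../etc/passwd'));        // NULL
```

The second call shows the effect of basename(): the directory part is discarded, so the file is always looked up inside the kml folder.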

Static vs Dynamic URLs

It used to be that no spiders or search engines could index a dynamic URL, namely one that contained a "?" followed by parameters to be used by PHP, ASP or other server-side scripting languages to drive a website using a database.

Nowadays, Google and Yahoo seem to do a perfectly fine job of indexing dynamic URLs but Google has a disclaimer warning that it may still encounter problems with dynamic URLs and the SEO literature is still full of warnings that other spiders and search engines may be blind to everything to the right of the "?".

Furthermore, a *.php extension is an invitation to bad guys to try to break in and wreak many sorts of havoc: this site was hacked by Nigerians a few years ago using PHP tricks and they managed to use it as an email factory until our ISP shut us down. I came on the scene at that point and implemented every safeguard I could find, but the concern still lingers.

Finally, dynamic URLs are not user friendly ... human beings generally do not know what to make of long strings of obscure parameters.

Apache has a feature called "mod_rewrite" that allows you to specify, via regex, that you want incoming URLs to be transformed in some way. Apache's instructions on this subject are here: URL Rewriting Guide. I have used that facility at Philadelphia Reflections to use static URLs for public use while still allowing me to use parameters to drive the website with the database.

Two excellent articles on this got me started:

Here's what I did:

Step 1: htaccess

I added these lines to the htaccess file

Options +FollowSymLinks
RewriteEngine on
RewriteRule ^(blog|topic)/([0-9]+)\.html?$ reflections.php?type=$1&key=$2
  1. ^(blog|topic)/([0-9]+)\.html?$ is the pattern to be compared against all incoming URLs.

    If matched, it changes the URL to reflections.php?type=$1&key=$2

  2. The ^ ... $ sequence in the pattern says that we will match the whole string, not just some part in the middle
     
  3. (blog|topic)/ matches either   "blog"   or   "topic"   followed by   "/"  
     
  4. ([0-9]+) matches one or more digits
     
  5. \.htm matches   ".htm"  
     
  6. l? matches 0 or 1 lower-case Ls (so that we will match either htm or html)
     
  7. In the replacement string $1 is replaced with the contents of the first () in the pattern: either "blog" or "topic"
    ... and $2 is replaced by the second () in the pattern, namely the numeric ID on the database of the blog or topic

The result is that

http://www.philadelphia-reflections.com/blog/906.htm

is transformed into

http://www.philadelphia-reflections.com/reflections.php?type=blog&key=906

The latter is what is passed in to me in the reflections.php routine, which tells me to pull up blog #906 from the database.

Both of these URLs are equivalent to the old, ugly dynamic URL

http://www.philadelphia-reflections.com/reflections.php?content=blogs_alpha/zmadame_butterfly.html

which still works, in case there are any legacy bookmarks or links out there, but going forward the new, simple, static URL is the face we will present to the world.
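The pattern can be tried out in PHP before it goes anywhere near the server, since preg_replace understands the same regex and the same $1, $2 references. The function name rewrite_path below is hypothetical; it only mimics the RewriteRule for testing purposes.

```php
<?php
// Hypothetical helper: applies the RewriteRule's pattern and substitution
// to the path portion of a URL (no leading slash, as in a .htaccess file)
function rewrite_path($path)
{
    return preg_replace('#^(blog|topic)/([0-9]+)\.html?$#',
                        'reflections.php?type=$1&key=$2',
                        $path);
}

echo rewrite_path('blog/906.htm'), "\n";    // reflections.php?type=blog&key=906
echo rewrite_path('topic/12.html'), "\n";   // reflections.php?type=topic&key=12
echo rewrite_path('about.html'), "\n";      // about.html (no match, left unchanged)
```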

Step 2. SMOP

After the htaccess regex was debugged, all that was left was a simple matter of programming. In fact, I had to completely rewrite the driver script, reflections.php, and the XML creation script which creates the RSS, sitemap, etc. files; plus a lot more besides. It was a lot of work but the breakthrough was in figuring out the htaccess trick; everything else was just work.


In July 2008, after Volumes were implemented, another RewriteRule was implemented:

RewriteRule ^volumes?/([0-9]+)\.html?$ volume.php?table_key=$1
Converts
http://www.philadelphia-reflections.com/volumes/3.htm
into
http://www.philadelphia-reflections.com/volume.php?table_key=3

HTML Forms

How do you (a) open a form when a radio button is clicked (b) in a new window?

Here's how it's done on this website.

<html>
<head>

<script type="text/javascript">

	/* javascript function called by the radio buttons 
	     to submit the form when clicked */

	function formSubmit()
	{
	document.getElementById("form_x").submit()
	}
	
</script>

</head>
<body>

  <form name="form_x" id="form_x"
	action="some_routine.php"
	target="newIMGwin"
	method="post" 
	style="whatever">

	<fieldset>
	<legend>legend surrounding the form</legend>

	<input type="radio" name="key" value="1269" onclick="formSubmit()" />

	</fieldset>
  </form>

</body>
</html>


Regex URL Matching

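On this site we check for the existence of a URL whenever an entry is updated.

There are two key technologies at work: cURL, which tests whether a URL responds, and preg_replace_callback, which finds each link in the text and hands it to a checking function.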


function url_exists($url) 
{
// 
// checks whether a URL actually exists on the Internet
//
$handle   = curl_init($url);
if (false === $handle)
   {
    return false;
   }
curl_setopt($handle, CURLOPT_HEADER, false);
curl_setopt($handle, CURLOPT_FAILONERROR, true); 
curl_setopt($handle, CURLOPT_NOBODY, true);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, false);
$connectable = curl_exec($handle);
curl_close($handle);   
return $connectable;
}


function aExists($matches)
{
//
// function called by preg_replace_callback
//
// $matches[0] is the complete match
// $matches[1] the match for the first subpattern
//	enclosed in '(...)' and so on

//
// checks to see if a regular link exists
// something similar is done for img src= also
//

$srcURL = $matches[3];
		
if (url_exists($srcURL)) { /* do something */ return ""; }
else { /* do something else */ return ""; }
}

$foo = preg_replace_callback(
            '/(.*?)(<a .*?href=")([^"]*)("[^>]*>)(.*?)(<\/a>)/i',
            "aExists",
            $source_string);
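To see which pieces of a link land in which subpattern, the same regex can be run with preg_match_all on a sample string, with no network access involved. The sample HTML below is made up for the illustration.

```php
<?php
// made-up sample text containing two links
$source_string = '<p>See <a href="http://www.example.com/">Example</a> and '
               . '<a class="x" href="/blog/906.htm">blog 906</a>.</p>';

preg_match_all('/(.*?)(<a .*?href=")([^"]*)("[^>]*>)(.*?)(<\/a>)/i',
               $source_string, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[3] is the URL, $m[5] is the link text
    echo $m[3], ' => ', $m[5], "\n";
}
// prints:
// http://www.example.com/ => Example
// /blog/906.htm => blog 906
```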


Parsing name-value pair attributes in an HTML tag

Not only do the attributes in an HTML tag come in random order, but many are optional.

Here's a regex solution:

<?php
function tagAttr($matches) {print_r($matches);}

$string = '<img src="/images/picture.jpg" width="300" class="left" alt="alt keywords" />';

$foo	= preg_replace_callback(
'/<img\b(?>\s+(?:alt="([^"]*)"|class="([^"]*)"|style="([^"]*)"|src="([^"]*)"|height="([^"]*)"|width="([^"]*)")|[^\s>]+|\s+)*>/i',
"tagAttr",
$string);
?>

Produces the following:

Array
(
    [0] => <img src="/images/picture.jpg" width="300" class="left" alt="alt keywords" />
    [1] => alt keywords
    [2] => left
    [3] => 
    [4] => /images/picture.jpg
    [5] => 
    [6] => 300
)

The regex is a series of alternating sequences; so, add href="([^"]*)"| in front of alt="([^"]*)" to select an additional attribute.

$matches[0] is the complete match
$matches[1] is alt=
$matches[2] is class=
$matches[3] is style=
$matches[4] is src=
$matches[5] is height=
$matches[6] is width=
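Numbered subpatterns are easy to mix up, so one option is to map them to names straight away. The sketch below does that with preg_match; the function name img_attributes is hypothetical, and it treats an empty attribute the same as a missing one.

```php
<?php
// Hypothetical helper: return the selected attributes of an <img> tag by name
function img_attributes($tag)
{
    // subpattern number => attribute name, in the order used in the pattern
    $names   = array(1 => 'alt', 'class', 'style', 'src', 'height', 'width');
    $pattern = '/<img\b(?>\s+(?:alt="([^"]*)"|class="([^"]*)"|style="([^"]*)"|src="([^"]*)"|height="([^"]*)"|width="([^"]*)")|[^\s>]+|\s+)*>/i';

    $attributes = array();
    if (preg_match($pattern, $tag, $matches)) {
        foreach ($names as $i => $name) {
            // subpatterns that took no part in the match are empty or missing
            if (isset($matches[$i]) && $matches[$i] !== '') {
                $attributes[$name] = $matches[$i];
            }
        }
    }
    return $attributes;
}

print_r(img_attributes('<img src="/images/picture.jpg" width="300" class="left" alt="alt keywords" />'));
```

For the sample tag this yields the keys alt, class, src and width; style and height are simply absent.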

My thanks (a) to Flagrant Badassery for putting me onto the idea and (b) to http://centricle.com/tools/html-entities/ for HTML encoding

Server-Side gzip Compression

Compression can reduce the size of the text (not images) of your web pages as they are transmitted outbound to the client. This will have only a small impact on response time over modern fiber connections but it will significantly reduce your bandwidth consumption (70% on average on this site.)

In XHTML vs. HTML I show how I implemented gzip compression on this site. The problem with that method is that it's a pain. So on another website I tried out the Apache htaccess method to instruct the server to compress all outbound pages. Works like a charm.


# See http://httpd.apache.org/docs/2.0/mod/mod_deflate.html

# Insert filter
SetOutputFilter DEFLATE

# Netscape 4.x has some problems...
BrowserMatch ^Mozilla/4 gzip-only-text/html

# Netscape 4.06-4.08 have some more problems
BrowserMatch ^Mozilla/4\.0[678] no-gzip

# MSIE masquerades as Netscape, but it is fine
# BrowserMatch \bMSIE !no-gzip !gzip-only-text/html

# NOTE: Due to a bug in mod_setenvif up to Apache 2.0.48
# the above regex won't work. You can use the following
# workaround to get the desired effect:
BrowserMatch \bMSI[E] !no-gzip !gzip-only-text/html

# Don't compress images
SetEnvIfNoCase Request_URI \
\.(?:gif|jpe?g|png)$ no-gzip dont-vary

# Make sure proxies don't deliver the wrong content
Header append Vary User-Agent env=!dont-vary

SQL To Exclude A List Of Items

Let's say you have a table "Primary" that contains an "Email" field.

You would like to select all the email addresses in Primary except for the list of email addresses in the Email field in a table "Exclude".

This SQL will exclude the emails in "Primary" based on those contained in "Exclude". (PRIMARY is a reserved word in MySQL, so the table name is quoted with backticks.)

SELECT * FROM `Primary` WHERE `Primary`.Email NOT IN (SELECT Email FROM Exclude)

HTML Anchor without an HREF?

Sometimes I want to execute a JavaScript function when a user clicks a link, but nothing else.

If I omit the href entirely, the cursor doesn't change and some browsers don't recognize the text as a link:

<a onclick="myFunction();">

If I include the pound sign, which seems to be a very popular trick, I get sent to the top of the current page, which messes up both History and the backspace button; to say nothing of the fact that I don't want to jump to the top of the page:

<a href="#" onclick="myFunction();">

Including the function in the href and omitting the onclick seems to be the answer to my specific problem:

<a href="javascript: myFunction();">

Once again, my thanks to http://centricle.com/tools/html-entities/ for HTML encoding.

Website Statistics

Today's Philadelphia Reflections was born in June 2006. It had a prior incarnation but it was hacked by Nigerian spammers who took it over and turned it into an email factory.

We scrubbed everything down and rebuilt from scratch, implementing as many PHP and MySQL security features as we could find.

We have done all of the standard things to improve our search engine standings but we are really at a loss to explain the inflection points that can be seen in the graphs.

Our home page has a Google Page Rank of 5/10 and the pages vary as follows (as of December 2008):

Google Images is by far the largest source of referrals but we also have many visitors who come to us via the search engines and who like what they see and come back; we would like to express our appreciation to all of our visitors.

{website total visitor statistics}

The dips in the Unique Visitors graph were the result of problems with our ISP ... once they were simply off the air and twice they made software changes without notification or testing.


{website returning visitor statistics}

 

PHP script to display Google PageRank

Like so many things on this website, the code to find the Google PageRank of the pages was lifted from someone else's work. This work is particularly praiseworthy because it worked exactly as described the minute I got it implemented.

See PHP script to display Google PageRank


pagerank.php

<?php
define('GOOGLE_MAGIC', 0xE6359A60);
class pageRank{
var $pr; 
 function zeroFill($a, $b){
 $z = hexdec(80000000);
  if ($z & $a){
   $a = ($a>>1);
   $a &= (~$z);
   $a |= 0x40000000;
   $a = ($a>>($b-1));
  }else{
   $a = ($a>>$b);
  }
 return $a;
 } 
 
 function mix($a,$b,$c) {
   $a -= $b; $a -= $c; $a ^= ($this->zeroFill($c,13));
   $b -= $c; $b -= $a; $b ^= ($a<<8);
   $c -= $a; $c -= $b; $c ^= ($this->zeroFill($b,13));
   $a -= $b; $a -= $c; $a ^= ($this->zeroFill($c,12));
   $b -= $c; $b -= $a; $b ^= ($a<<16);
   $c -= $a; $c -= $b; $c ^= ($this->zeroFill($b,5));
   $a -= $b; $a -= $c; $a ^= ($this->zeroFill($c,3));
   $b -= $c; $b -= $a; $b ^= ($a<<10);
   $c -= $a; $c -= $b; $c ^= ($this->zeroFill($b,15));
   return array($a,$b,$c);
 }
 
 function GoogleCH($url, $length=null, $init=GOOGLE_MAGIC) {
  if(is_null($length)) {
   $length = sizeof($url);
  }
  $a = $b = 0x9E3779B9;
  $c = $init;
  $k = 0;
  $len = $length;
  while($len >= 12) {
   $a += ($url[$k+0] +($url[$k+1]<<8) +($url[$k+2]<<16) +($url[$k+3]<<24));
   $b += ($url[$k+4] +($url[$k+5]<<8) +($url[$k+6]<<16) +($url[$k+7]<<24));
   $c += ($url[$k+8] +($url[$k+9]<<8) +($url[$k+10]<<16)+($url[$k+11]<<24));
   $mix = $this->mix($a,$b,$c);
   $a = $mix[0]; $b = $mix[1]; $c = $mix[2];
   $k += 12;
   $len -= 12;
  }
  $c += $length;
  switch($len){
   case 11: $c+=($url[$k+10]<<24);
   case 10: $c+=($url[$k+9]<<16);
   case 9 : $c+=($url[$k+8]<<8);
   /* the first byte of c is reserved for the length */
   case 8 : $b+=($url[$k+7]<<24);
   case 7 : $b+=($url[$k+6]<<16);
   case 6 : $b+=($url[$k+5]<<8);
   case 5 : $b+=($url[$k+4]);
   case 4 : $a+=($url[$k+3]<<24);
   case 3 : $a+=($url[$k+2]<<16);
   case 2 : $a+=($url[$k+1]<<8);
   case 1 : $a+=($url[$k+0]);
  }
  $mix = $this->mix($a,$b,$c);
 /* report the result */
 return $mix[2];
 }
 
 //converts a string into an array of integers containing the numeric value of the char
 
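 function strord($string) {
  $result = array();
  for($i=0;$i<strlen($string);$i++) {
   // square brackets: the curly-brace offset syntax is no longer accepted by current PHP
   $result[$i] = ord($string[$i]);
  }
 return $result;
 }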
 
 function printrank($url){
  $ch = "6".$this->GoogleCH($this->strord("info:" . $url));
  
  $fp = fsockopen("www.google.com", 80, $errno, $errstr, 30);
  if (!$fp) {
     echo "$errstr ($errno)<br />\n";
  } else {
     $out = "GET /search?client=navclient-auto&ch=" . $ch .
            "&features=Rank&q=info:" . $url . " HTTP/1.1\r\n" ;
     $out .= "Host: www.google.com\r\n" ;
     $out .= "Connection: Close\r\n\r\n" ; 
     fwrite($fp, $out);
     while (!feof($fp)) {
       $data = fgets($fp, 128);
       $pos = strpos($data, "Rank_");
         if($pos === false){
         }else{
           $pagerank = substr($data, $pos + 9);
           $this->pr_image($pagerank);
         }
     }
     fclose($fp);
  }
 }

//
// Display pagerank image. Create your own or download images I made for this script. 
// If you make your own make sure to call them pr0.gif, pr1.gif, pr2.gif etc.
//

 function pr_image($pagerank){
  // the value arrives as text read from the response, e.g. "5\n"
  $rank = (int) trim($pagerank);
  // anything outside 0-9 is shown with pr10.gif
  if ($rank < 0 || $rank > 9) { $rank = 10; }
  $this->pr = "<img src=\"images/pr" . $rank . ".gif\" alt=\"PageRank " . $rank . " out of 10\">" ;
 }
 function get_pr(){
  return $this->pr;
 }
}
?>

Usage

Do the following:

   1. Save the code above as pagerank.php.
   2. Download or create your own images to display each rank.
   3. Create a directory "images" containing all page rank images. 
   4. See code below on how to use the class. 

<?php
include("pagerank.php");
$gpr = new pageRank();
$gpr->printrank("http://www.yahoo.com");
//display image
echo $gpr->get_pr();
?>



Create and send CSV files from PHP

Here's how CSV files are created and downloaded on this site. No saving the file or data import into Excel ... Excel just opens with the data automatically. Very handy.

This function as shown pulls all the field names to create a CSV header and then pulls every field from every row in the table. There is no need to know the field names, the data types or the size of the table. Each field is surrounded by double quotes and any quotes inside the data are doubled.

Strictly speaking, CSV only requires quotes around fields that contain commas, quotes or line breaks, but quoting every field is always safe.


Pull data from a database using standard PHP MySQL functions:

<?php

$db_link        = mysql_connect(DB_HOST, DB_USER, DB_PSWD);
$db_selected    = mysql_select_db(DB_DATABASE, $db_link);

# Create the CSV file header from the database-table field names
$query          = "SHOW COLUMNS FROM table";
$result         = mysql_query($query);

$csv_output     = NULL;
while ($row = mysql_fetch_assoc($result))
	{
	$csv_output .= '"' . str_replace('"', '""', $row["Field"]) . '",';
	}
$csv_output  = substr($csv_output, 0, -1) . "\n";  // remove trailing "," and add a line break

# Pull all the rows
$query          = "SELECT * FROM table";
$result         = mysql_query($query);

# loop through database records creating one comma-delimited line per row
while ($row = mysql_fetch_assoc($result))
	{
	$fields = array();
	foreach ($row as $value)
	  {
	  // double any embedded quotes and wrap the field in quotes
	  $fields[] = '"' . str_replace('"', '""', $value) . '"';
	  }
	// join with commas so that no trailing comma is left on the line
	$csv_output .= implode(",", $fields) . "\n";
	}

# send the file
$size_in_bytes		= strlen($csv_output);
$csv_file		= "filename_" . date("Y-m-d") . ".csv";
	
$ContentType		= "Content-type: application/vnd.ms-excel";
$ContentLength		= "Content-Length: $size_in_bytes";
$ContentDisposition	= "Content-Disposition: attachment; filename=\"$csv_file\"";

header($ContentType);
header($ContentLength);
header($ContentDisposition);

echo "$csv_output"; 

?>
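
The quoting rule itself can be checked independently of PHP and MySQL. Here is a minimal sketch in Python using the standard csv module; the header and rows are made-up sample data standing in for a database table, and rows_to_csv is a hypothetical name.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Return CSV text with every field quoted and embedded quotes doubled."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample data standing in for the database table
sample = rows_to_csv(["id", "title"],
                     [[1, 'The "Main Line"'], [2, "Germantown, PA"]])
print(sample)
```

The embedded quotes come out doubled and the comma inside the second title stays within its quoted field, which is what lets Excel split the columns correctly.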

I generally use compression with output buffering to speed things up:

ob_start("ob_gzhandler");
      .
      .
      .
ob_end_flush();

To use in your HTML:

<button onclick="window.location='CSVoutput.php'" 
	style="font-size:85%;width:100px;">Download<br />CSV file</button>

(my thanks to http://centricle.com/tools/html-entities/ for HTML encoding)

PHP out of memory condition

Fatal error: Allowed memory size of 18388608 bytes exhausted 
(tried to allocate 724 bytes) in /home/dir1/dir2/script.php on line ###

When working with large amounts of data in memory (very long concatenated strings and/or very large arrays in my case), the server may hit a memory max. No amount of "unset" of variables will do the trick past a certain point.

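This is the result of a memory allocation ceiling set in php.ini (the memory_limit directive) that can be overridden (judiciously) as follows. A value of -1 removes the limit altogether; a value such as "256M" merely raises it: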

ini_set  ("memory_limit", -1  );

Be sure to test your code with smaller amounts of data first: this limit is set for a reason ... programs have been known to be poorly written (not yours, of course; but test anyway).

Ternary Operator and the IIF function

The standard PHP if/else statement can often be shortened with the ternary operator, which is described under Comparison Operators in the PHP manual. The IIF function wraps the ternary operator in a function.

Ternary Operator

conditional ? if_true : if_not_true;

is the same as

if (conditional)
  {
  if_true
  }
  else
    {
    if_not_true
    }

IIF Function

To return the result of the ternary operator

function iif($expression, $returntrue, $returnfalse = '') {
    return ($expression ? $returntrue : $returnfalse);
}
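
One difference is worth knowing: a function such as iif evaluates both of its value arguments before it is called, whereas the ternary operator evaluates only the branch it needs. The same is true of Python's conditional expression, so the effect can be sketched there; the helper tracked is hypothetical and exists only to record which branches run.

```python
def iif(expression, return_true, return_false=""):
    """Function form: both arguments are evaluated before the call."""
    return return_true if expression else return_false


calls = []


def tracked(label):
    """Record that this branch was evaluated, then return its label."""
    calls.append(label)
    return label


# Conditional expression: only the chosen branch runs
inline_result = tracked("yes") if True else tracked("no")
inline_calls = list(calls)

calls.clear()

# Function form: both branches run before iif() chooses one
function_result = iif(True, tracked("yes"), tracked("no"))
function_calls = list(calls)

print(inline_calls, function_calls)
```

This matters when a branch is expensive or has side effects, such as a database query.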

MySQL server has gone away

MySQL timeout? Probably a new error after years of working perfectly, resulting from an ISP change which they will neither acknowledge nor fix. Sound familiar?

Charming people.

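Try this, which asks the server to apply its interactive_timeout setting rather than wait_timeout before it drops an idle connection: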

$db_link = @mysql_connect(DB_HOST, DB_USER, DB_PSWD,'',MYSQL_CLIENT_INTERACTIVE);

... instead of what you used to do:

$db_link = @mysql_connect(DB_HOST, DB_USER, DB_PSWD);

Images

The panel below shows every image (2000+) in every blog (800+) on Philadelphia Reflections starting with most recent additions.

It works better in some browsers, especially Firefox, than in others, and -- with 2000 images -- it takes a while to load, as much as 10 minutes on a slow connection. An icon in the corner of the picture-wall starts a slideshow. Note: Mouse-clicking enlarges each thumbnail picture, displaying an icon linked to the website source page. We suggest you try out every little icon to see the amazing versatility of Cooliris.

Embed Flash as Valid XHTML

The problem to be solved is that you want to embed YouTube (or other Flash movies) but the <embed> tag is deprecated in XHTML and the <param> tags don't validate, either. Here are the steps to clean things up:

The original HTML from YouTube

<object
  width="425"
  height="344">
<param
  name="movie"
  value="http://www.youtube.com/v/zhWkTiMVWVI&color1=0xb1b1b1&color2=0xcfcfcf&hl=en&feature=player_embedded&fs=1">
</param>
<param
  name="allowFullScreen"
  value="true">
</param>
<param
  name="allowscriptaccess"
  value="always">
</param>
<embed
  src="http://www.youtube.com/v/zhWkTiMVWVI&color1=0xb1b1b1&color2=0xcfcfcf&hl=en&feature=player_embedded&fs=1"
  type="application/x-shockwave-flash"
  allowfullscreen="true"
  width="425"
  height="344">
</embed>
</object>

Step 1: replace the closing "</param>" tags with trailing " />"

<object
  width="425"
  height="344">
<param
  name="movie"
  value="http://www.youtube.com/v/zhWkTiMVWVI&color1=0xb1b1b1&color2=0xcfcfcf&hl=en&feature=player_embedded&fs=1"  />
<param
  name="allowFullScreen"
  value="true"  />
<param
  name="allowscriptaccess"
  value="always"  />
<embed
  src="http://www.youtube.com/v/zhWkTiMVWVI&color1=0xb1b1b1&color2=0xcfcfcf&hl=en&feature=player_embedded&fs=1"
  type="application/x-shockwave-flash"
  allowfullscreen="true"
  width="425"
  height="344">
</embed>
</object>

Step 2: put type="application/x-shockwave-flash" into the <object> tag:

<object
  width="425"
  height="344"
  type="application/x-shockwave-flash">
<param
 name="movie"
  value="http://www.youtube.com/v/zhWkTiMVWVI&color1=0xb1b1b1&color2=0xcfcfcf&hl=en&feature=player_embedded&fs=1" />
<param
  name="allowFullScreen"
  value="true" />
<param
  name="allowscriptaccess"
  value="always" />
<embed
  src="http://www.youtube.com/v/zhWkTiMVWVI&color1=0xb1b1b1&color2=0xcfcfcf&hl=en&feature=player_embedded&fs=1"
  type="application/x-shockwave-flash"
  allowfullscreen="true"
  width="425"
  height="344">
</embed>
</object>

Step 3: move src="..." from the <embed> tag to a data="..." attribute in the <object> tag:

<object
  width="425"
  height="344"
  type="application/x-shockwave-flash"
  data="http://www.youtube.com/v/zhWkTiMVWVI&color1=0xb1b1b1&color2=0xcfcfcf&hl=en&feature=player_embedded&fs=1">
<param
  name="movie"
  value="http://www.youtube.com/v/zhWkTiMVWVI&color1=0xb1b1b1&color2=0xcfcfcf&hl=en&feature=player_embedded&fs=1" />
<param
  name="allowFullScreen"
  value="true" />
<param
  name="allowscriptaccess"
  value="always" />
<embed
  src="http://www.youtube.com/v/zhWkTiMVWVI&color1=0xb1b1b1&color2=0xcfcfcf&hl=en&feature=player_embedded&fs=1"
  type="application/x-shockwave-flash"
  allowfullscreen="true"
  width="425"
  height="344">
</embed>
</object>

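Step 4: remove the <embed> tag and write each "&" in the remaining URLs as "&amp;", which XHTML requires (the URL is also trimmed here to the hl and fs parameters):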

<object
  width="425"
  height="344"
  type="application/x-shockwave-flash"
  data="http://www.youtube.com/v/zhWkTiMVWVI&amp;hl=en&amp;fs=1">
<param
  name="movie"
  value="http://www.youtube.com/v/zhWkTiMVWVI&amp;hl=en&amp;fs=1" />
<param
  name="allowFullScreen"
  value="true" />
<param
  name="allowscriptaccess"
  value="always" />
</object>


You can View > Source to see that the code shown here does actually produce the YouTube video displayed.


Step X: it should be noted that in Firefox you don't need any "<param>" tags at all, which makes things very simple and clean:

<object
  width="425"
  height="344"
  type="application/x-shockwave-flash"
  data="http://www.youtube.com/v/zhWkTiMVWVI&amp;hl=en&amp;fs=1">
</object>

Not in IE, though; nope. (Why the </object> instead of a closing " />"? Because it seems to work more reliably in Firefox 3.0.5; I don't know why.)
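
The four steps can be sketched as a small generator. This is an illustration in Python under stated assumptions, not the code behind this site: flash_object is a made-up name, and parsing the result as XML only shows that the markup is well-formed, which is a weaker check than validating it as XHTML.

```python
from html import escape
import xml.etree.ElementTree as ET


def flash_object(url, width=425, height=344):
    """Build <object> markup without <embed>, with ampersands escaped."""
    safe_url = escape(url, quote=True)
    return (
        f'<object width="{width}" height="{height}" '
        f'type="application/x-shockwave-flash" data="{safe_url}">\n'
        f'<param name="movie" value="{safe_url}" />\n'
        '<param name="allowFullScreen" value="true" />\n'
        '<param name="allowscriptaccess" value="always" />\n'
        '</object>'
    )


video_url = "http://www.youtube.com/v/zhWkTiMVWVI&hl=en&fs=1"
markup = flash_object(video_url)

# Parsing as XML confirms the markup is at least well-formed
root = ET.fromstring(markup)
print(markup)
```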

Stylesheet for ATOM feed

Even though neither IE 7 nor Firefox 3.0.8 will render a stylesheet for an ATOM or RSS feed delivered over the network, we have set them up in hopes that someday this anomaly will be cured.

No aggregator I have ever seen goes to the basic trouble of sorting their input feeds by modified date, relying on the feed creator to "push" the latest onto the top of the stack. Our XSL stylesheet does sort by modified date, among other nice things.

Here's the XSL stylesheet for our ATOM feed, including sorting the entries:

<?xml version="1.0" encoding="utf-8"?>

<!--                                   -->
<!--     Philadelphia Reflections      -->
<!--   XSL Stylesheet for ATOM feed    -->
<!--                                   -->

<!-- Grateful acknowledgement to http://24ways.org/2006/beautiful-xml-with-xsl -->

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns:dc="http://purl.org/dc/elements/1.1/">
  
    <xsl:output method="html" encoding="utf-8"/>
	
    <xsl:template match="/">
      <html>
      <head>
      <title>ATOM Feed for Philadelphia Reflections</title>
      <link rel="stylesheet" href="http://www.philadelphia-reflections.com/stylesheets/rssxsl.css" type="text/css"/>
      <link rel="stylesheet" href="http://www.philadelphia-reflections.com/stylesheets/images.css" type="text/css"/>
      <style type="text/css">
        .notvisible {visibility: hidden;}
      </style>
      </head>
      <body>
        <xsl:apply-templates select="/atom:feed"/>
      </body>
      </html>
    </xsl:template>
	
    <xsl:template match="/atom:feed">
      <div class="topbox">
        <p><img src="http://www.philadelphia-reflections.com/images/Newsfeed-Atom-24x24.png" alt="ATOM feed icon" /> 
        This is the <strong>ATOM-feed </strong> 
        for the <a href="http://www.philadelphia-reflections.com/"><xsl:value-of select="atom:title"/></a>
        website.<br />
        ATOM feeds allow you to stay up to date with the latest additions and changes 
        on  <xsl:value-of select="atom:title"/>.</p>
      </div>
        
      <div class="contbox">
        <table><tr>
          <td>
            <div class="mainbox">
            <div class="itembox">
              <h1><xsl:value-of select="atom:title"/></h1>
              <p><xsl:value-of select="atom:subtitle"/></p>

              <ul id="entries">
                <xsl:apply-templates select="atom:entry">
                  <xsl:sort select="atom:updated" order="descending"/>
                </xsl:apply-templates>
              </ul>

            </div>
            </div>
          </td>

          <td valign="top" width="30%">
            <div class="subscrbox">
            <div class="padrhsbox">
              <h2>Subscribe to this feed</h2>
              <p>If you use one of the following web-based News Readers,
                click on the appropriate button to subscribe to the RSS feed.</p>
              <a href="#" onClick="window.location='http://add.my.yahoo.com/rss?url=' + window.location;return false;">
                <img height="17" width="91" vspace="3" border="0" alt="my yahoo" src="http://www.philadelphia-reflections.com/images/addtomyyahoo4.gif"/>
              </a><br/>
              <a href="#" onClick="window.location='http://www.bloglines.com/sub/'+ window.location;return false;">
                <img height="18" width="91" vspace="3" border="0" alt="bloglines" src="http://www.philadelphia-reflections.com/images/rss-bloglines.gif"/>
              </a><br/>
              <a href="#" onClick="window.location='http://www.newsgator.com/ngs/subscriber/subext.aspx?url='+ window.location;return false;">
                <img height="17" width="91" vspace="3" border="0" alt="newsgator" src="http://www.philadelphia-reflections.com/images/rss-newsgator.gif"/>
              </a><br/>
              <a href="#" onClick="window.location='http://client.pluck.com/pluckit/prompt.aspx?GCID=C12286x053&amp;a=' + window.location + '&amp;t={title}';return false;">
                <img src="http://www.philadelphia-reflections.com/images/pluspluck.png" vspace="3" border="0" alt="Subscribe with Pluck RSS reader"/>
              </a><br/>
              <a href="#" onClick="window.location='http://www.rojo.com/add-subscription?resource=' + window.location;return false;">
                <img src="http://www.philadelphia-reflections.com/images/rss-rojo.gif" vspace="3" border="0" alt="Subscribe in Rojo"/>
              </a><br/>
              <a href="#" onClick="window.location='http://fusion.google.com/add?feedurl=' + window.location;return false;">
                <img src="http://gmodules.com/ig/images/plus_google.gif" vspace="3" border="0" alt="Add to Google"/>
              </a>
              <hr />
              <p>If you would like to receive an email whenever changes are made, please send me an email and I'll be glad to add you. 
              <br /><br /><a href="mailto:grfisheriii@gmail.com?subject=Add%20me%20to%20Philadelphia%20Reflections%20email%20list">Click to send an email</a></p>
            </div>
            </div>
          </td>

        </tr></table>

      </div>
    </xsl:template>
	
    <xsl:template match="atom:entry">
      <li style="margin-bottom: 25px; height: auto;">
        <a href="{atom:link/@href}">
          <xsl:value-of select="atom:title"/>
        </a>
        <small>
          &#8212; <xsl:value-of select="substring-before(atom:updated,'T')"/>
        </small>
        <br/>
        <div class="item_desc">
          <xsl:value-of select="atom:summary" disable-output-escaping="yes"/> <!-- disable-output-escaping="yes" does not work with Firefox 3.0.8 -->
        </div>
      </li>
    </xsl:template>

</xsl:stylesheet>
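
What the xsl:sort and substring-before calls do to the entries can be sketched outside XSLT. The following Python uses a made-up three-entry feed; like the stylesheet, it sorts the atom:updated values as text, which gives the right order only when every timestamp uses the same format and time zone.

```python
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical three-entry feed, deliberately out of order
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sample</title>
  <entry><title>Middle</title><updated>2009-03-02T10:00:00Z</updated></entry>
  <entry><title>Newest</title><updated>2009-04-15T08:30:00Z</updated></entry>
  <entry><title>Oldest</title><updated>2008-12-25T12:00:00Z</updated></entry>
</feed>"""


def entries_newest_first(feed_xml):
    """Return (title, date) pairs sorted by atom:updated, newest first."""
    root = ET.fromstring(feed_xml)
    entries = root.findall("atom:entry", ATOM)
    # Same idea as <xsl:sort select="atom:updated" order="descending"/>
    entries.sort(key=lambda e: e.findtext("atom:updated", "", ATOM), reverse=True)
    return [
        (e.findtext("atom:title", "", ATOM),
         # Same idea as substring-before(atom:updated, 'T')
         e.findtext("atom:updated", "", ATOM).split("T")[0])
        for e in entries
    ]


sorted_entries = entries_newest_first(FEED)
print(sorted_entries)
```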

How to detect an iPhone and other mobile devices

Handheld/mobile devices have been exploding in popularity and with the advent of the iPhone they have become the device of choice. The Blackberry was a lovely device but once you try an iPhone you will never want a Blackberry again. Of course, all of this will change as each new device comes out but what won't change is the fact that mobile devices are supplanting PCs for everything but the most keyboard- or large-screen-intensive work.

Therefore, the popularity of a website/blog/whatever depends on making it accessible to mobile devices. Step one is knowing when you've been visited by such a thing.

iPhones

» Articles

The iPhone is very well-behaved with respect to CSS. Simply include the following link tag in an otherwise-ordinary web page:

<!--[if !IE]>-->
<link rel="stylesheet" 
  type="text/css" 
  media="only screen and (max-device-width: 480px)" 
  href="iphone.css" />
<!--<![endif]-->

... which points to the iphone.css CSS stylesheet:

body { margin: 0; 
       padding: 0; 
       width: 100%; }				
				
    #wrapper { margin: 0; padding: 0; width: 100%; }
    
      #left    { display: none; }
      
      #right   { display: none; }
		
      #center  { margin: 0; }
      
        #welcome  { display: none; }
        
        #content  { line-height: 115%; 
                    font-family: "Times New Roman", Times, serif; 
                    padding: 5px 20px 0 20px; 
                    margin: 0;
                    font-size: 28px; 
                    background-color: white; }
        
        #comments { display: none; }
        
      #footer  { display: none; }

iphone.css uses { display: none; } to keep the surrounding boxes from displaying on the smaller screen. No other changes are required.

An excellent article on using CSS is here: http://www.bushidodesigns.net/blog/mobile-device-detection-css-without-user-agent/

(Subsequent to writing this article we wrote an iPhone-specific article page. The stylesheet method described does work but we had other reasons to make modifications to the content.)

» Index Page

The index page was a complete rewrite of the standard index page to turn it into a simple table of contents. The breakthrough here was an exquisite PHP script found at http://detectmobilebrowsers.mobi/ which analyzes the HTTP headers to determine if a device is a handheld and if so, what type. Using this, we redirect from the regular index page to the iPhone-specific index page (from index.php to indexiphone.php ... it looks much better on an iPhone than on a PC).

Generic Handheld Devices

We may build more device-specific CSS files and pages as we learn what our visitors use but for the time being we simply support iPhones and Other.

Because many handheld devices have very small screens and a very small buffer capacity, we strip out all HTML comments, images and tables; we also convert to UTF-8 encoding because of the claim that handhelds support it better than iso-8859-1:

$content = preg_replace('/<!-- .*? -->/si', '', $content);
$content = preg_replace('%<table [^>]*?>.*?</table>%si', '', $content);
$content = preg_replace('/<img [^>]*?>/i', '', $content);
echo utf8_encode($content);
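
The effect of those three patterns is easy to try on a small fragment. Here is a rough Python equivalent, assuming a made-up page string; as in the PHP, the patterns expect a space after "<table" and "<img", so a bare <table> tag with no attributes would be left in place.

```python
import re


def strip_for_handheld(content):
    """Remove HTML comments, tables and images, then return UTF-8 bytes."""
    content = re.sub(r"<!-- .*? -->", "", content, flags=re.S | re.I)
    content = re.sub(r"<table [^>]*?>.*?</table>", "", content, flags=re.S | re.I)
    content = re.sub(r"<img [^>]*?>", "", content, flags=re.I)
    return content.encode("utf-8")


# Hypothetical page fragment
page = ('<p>Hello</p><!-- note --><table border="0"><tr><td>x</td></tr></table>'
        '<img src="a.gif" alt="a" /><p>World</p>')
stripped = strip_for_handheld(page)
print(stripped)
```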

We include these meta tags (cribbed from Google's mobile page):

<meta name="viewport" content="width=device-width,minimum-scale=1.0,maximum-scale=1.0"/>
<meta name="HandheldFriendly" content="true" />

We also send handheld-specific HTTP headers ... see XHTML vs. HTML for the script we use for all page headers.

» Articles

It turns out in real life that many handhelds don't handle CSS correctly, so the { display: none; } trick that works so well for iPhones is unreliable for many other devices. Furthermore, many devices choke if sent a large data stream. Therefore, we had to write a very stripped-down page for each individual article. This required writing only one new program since all of our content is served from a database.

The http://detectmobilebrowsers.mobi/ function is used to redirect from the regular page to the handheld page.

Examples:

A glance at the page source of each of these pages will show you the extent to which the handheld version is stripped down; rather elegant, really.

Finally, in the regular pages we include this link tag; wishful thinking I suspect:

<link rel="alternate" 
  media="handheld" 
  href="http://www.philadelphia-reflections.com/reflectionsHandheld.php?type=blog&amp;key=####" />

» Index Page

The generic handheld index page (http://www.philadelphia-reflections.com/indexhandheld.php) is an even-more stripped down version of the iPhone index page (http://www.philadelphia-reflections.com/indexiphone.php).
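
The redirect decision described above can be sketched as follows. This is not the detectmobilebrowsers.mobi script, which examines far more than this; it is a simplified Python illustration that looks only at the User-Agent header, and the list of substrings is an assumption chosen for the example.

```python
def classify_device(user_agent):
    """Very rough classification from the User-Agent header (illustrative only)."""
    ua = (user_agent or "").lower()
    if "iphone" in ua or "ipod" in ua:
        return "iphone"
    # A few substrings commonly seen in handheld browsers; far from complete
    handheld_hints = ("blackberry", "opera mini", "windows ce", "symbian", "mobile")
    if any(hint in ua for hint in handheld_hints):
        return "handheld"
    return "desktop"


def index_page_for(user_agent):
    """Pick the index page to redirect to, using the page names from the article."""
    pages = {"iphone": "indexiphone.php",
             "handheld": "indexhandheld.php",
             "desktop": "index.php"}
    return pages[classify_device(user_agent)]


print(index_page_for("BlackBerry8300/4.2.2 Profile/MIDP-2.0"))
```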

Testing and Validating

The place to start testing your new pages is the Opera Mini Simulator: http://www.opera.com/mini/demo/. Elegant in its simplicity, it has never choked or failed, unlike most other emulators. Plus, it is completely intuitive; unlike all other emulators. And free.

Once you get the basic design underway, you will want to validate ... at http://www.ready.mobi/.

For fine-tuning of displays on cell phones ... http://www.wap-proof.com.


To run the Blackberry emulators on a Windows machine:

I found help for this ridiculous process at http://www.cantoni.org/2007/12/18/blackberrysimulator. You may, too.

Only a sadistic socially-crippled geek-savant could have dreamed up such a convoluted mess, but ultimately it does work and does allow you to see how the different models operate. Actually, once you get it working, it's sort of fun to try different models.


(my thanks to http://centricle.com/tools/html-entities/ for HTML encoding)

mysql_insert_assoc

Function to make inserting new rows into a database table easier (and safe because quote_smart logic is included inline)

thanks to R. Bradley @ php.net; I have fixed a number of bugs and added quote_smart functionality

My own contribution to php.net is here: george at georgefisher dot com

<?php
function mysql_insert_assoc ($my_table, $my_array) {
   
//
// Insert values into a MySQL database
// Includes quote_smart code to foil SQL Injection
//
// A call to this function of:
//
//  $val1 = "foobar";
//  $val2 = 495;
//  mysql_insert_assoc("tablename", array("col1"=>$val1, "col2"=>$val2, "col3"=>"val3", "col4"=>720));
//
// Sends the following query:
//  INSERT INTO tablename (col1, col2, col3, col4) values ('foobar', 495, 'val3', 720)
//
 
    global $db_link;
    
    // Find all the keys (column names) from the array $my_array
    $columns = array_keys($my_array);

    // Find all the values from the array $my_array
    $values = array_values($my_array);
       
    // quote_smart the values
    $values_number = count($values);
    for ($i = 0; $i < $values_number; $i++)
      {
      $value = $values[$i];
      if (get_magic_quotes_gpc()) { $value = stripslashes($value); }
      if (!is_numeric($value))    { $value = "'" . mysql_real_escape_string($value, $db_link) . "'"; }
      $values[$i] = $value;
      }
         
    // Compose the query
    $sql = "INSERT INTO $my_table ";

    // create comma-separated string of column names, enclosed in parentheses
    $sql .= "(" . implode(", ", $columns) . ")";
    $sql .= " values ";

    // create comma-separated string of values, enclosed in parentheses
    $sql .= "(" . implode(", ", $values) . ")";
       
    $result = @mysql_query ($sql) 
              OR die ("<br />\n<span style=\"color:red\">Query: $sql UNsuccessful :</span> " . mysql_error() . "\n<br />");

    return ($result) ? true : false;
}
?>
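
The same dictionary-to-INSERT idea can be sketched with bound parameters, which leave the quoting of values to the database driver. The example below is Python with an in-memory SQLite database rather than MySQL, purely so that it runs on its own; the table and column names follow the comment above, and insert_assoc is a hypothetical name.

```python
import sqlite3


def insert_assoc(conn, table, values):
    """Insert one row from a dict of column -> value using bound parameters."""
    columns = ", ".join(values.keys())
    placeholders = ", ".join("?" for _ in values)
    # Table and column names cannot be bound; they must come from trusted code
    sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    cursor = conn.execute(sql, list(values.values()))
    return cursor.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (col1 TEXT, col2 INTEGER, col3 TEXT, col4 INTEGER)")
ok = insert_assoc(conn, "tablename",
                  {"col1": "foo'bar", "col2": 495, "col3": "val3", "col4": 720})
stored = conn.execute("SELECT col1, col2, col3, col4 FROM tablename").fetchall()
print(ok, stored)
```

Note that in both versions only the values are protected; the table and column names go into the query as written.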

mysql_update_assoc is a similar function that updates existing records.

Also thanks to http://centricle.com/tools/html-entities/ for encoding

mysql_update_assoc

Function to make updating rows in a database table easier (and safe: quote_smart logic is implemented inline).

<?php
function mysql_update_assoc ($my_table, $my_array, $where_conditions) {

//
// Update values in a MySQL database table
// Includes quote_smart code to foil SQL Injection
//
// A call to this function of:
//
//  $val1 = "foobar";
//  $val2 = 495;
//  mysql_update_assoc("tablename", array("col1"=>$val1, "col2"=>$val2), array("table_key"=>52, "age"=>"old"));
//
// Sends the following query:
//  UPDATE tablename SET col1 = 'foobar', col2 = 495 WHERE table_key = 52 AND age = 'old'
// 
//                  -- and --
//
//  $table_name = "tablename";
//  mysql_update_assoc($table_name, array("col1"=>$val1, "col2"=>$val2), array("table_key"=>52));
//
// Sends this:
//  UPDATE tablename SET col1 = 'foobar', col2 = 495 WHERE table_key = 52
//
// Note: the WHERE clause is always "=" and always AND
//

global $db_link;

$sql = "UPDATE $my_table SET ";

// quote_smart the data values and create a comma-separated string of column_name = value
foreach ($my_array as $key => $value)
  {
  if (get_magic_quotes_gpc()) { $value = stripslashes($value); }
  if (!is_numeric($value))    { $value = "'" . mysql_real_escape_string($value, $db_link) . "'"; }
  $sql .= "$key = $value, ";
  }
$sql = substr($sql, 0, -2);  // remove trailing ", "

// quote_smart the conditional values and create an AND-separated string of column_name = value
$conditional_pairs = NULL;
foreach ($where_conditions as $key => $value)
  {
  if (get_magic_quotes_gpc()) { $value = stripslashes($value); }
  if (!is_numeric($value))    { $value = "'" . mysql_real_escape_string($value, $db_link) . "'"; }
  $conditional_pairs .= "$key = $value AND ";
  }
$conditional_pairs = substr($conditional_pairs, 0, -5);  // remove trailing " AND "

$sql .= " WHERE $conditional_pairs";

$result = @mysql_query ($sql) 
          OR die ("<br />\n<span style=\"color:red\">Query: $sql UNsuccessful :</span> " . mysql_error() . "\n<br />");

return ($result) ? true : false;
}
?>
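
A comparable sketch for updates, again in Python with an in-memory SQLite database and bound parameters instead of MySQL and inline escaping. The rows are made-up sample data and update_assoc is a hypothetical name; as in the PHP, every condition is an equality test joined with AND.

```python
import sqlite3


def update_assoc(conn, table, values, where):
    """Update rows matching every column = value pair in `where`."""
    set_clause = ", ".join(f"{column} = ?" for column in values)
    where_clause = " AND ".join(f"{column} = ?" for column in where)
    # Table and column names are interpolated, so they must be trusted
    sql = f"UPDATE {table} SET {set_clause} WHERE {where_clause}"
    cursor = conn.execute(sql, list(values.values()) + list(where.values()))
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (table_key INTEGER, age TEXT, col1 TEXT, col2 INTEGER)")
conn.executemany("INSERT INTO tablename VALUES (?, ?, ?, ?)",
                 [(52, "old", "a", 1), (52, "new", "b", 2), (53, "old", "c", 3)])
changed = update_assoc(conn, "tablename",
                       {"col1": "foobar", "col2": 495},
                       {"table_key": 52, "age": "old"})
rows = conn.execute("SELECT * FROM tablename ORDER BY col2").fetchall()
print(changed, rows)
```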

mysql_insert_assoc is a similar function that adds new records.

Thanks to http://www.primitivetype.com/resources/htmlentities.php for encoding

An iPhone web app

The iPhone is the best PDA to come along since the Blackberry 15 years ago. It is to the Blackberry what the Blackberry was to cell phones.

Philadelphia Reflections is now a fully-fledged iPhone web app. The application will appear on your iPhone in the appropriate format automatically: just navigate to http://www.philadelphia-reflections.com with the iPhone browser; we will detect it and do the right thing.

A two-step process is required to get a little icon on your iPhone home page so you can go there directly:

  1. Click the "+" plus sign at the bottom of the iPhone screen
  2. Click the "Add to Home Screen" button that appears.

That's it. We do all the rest.

We are listed in the Apple web-app store: click here to go see our special page.

{Click the plus sign}
{Click Add to Home Screen}

Valid XHTML YouTube embed code generator

Escaping for PHP Output to JavaScript

To send data to a JavaScript script from PHP, three levels of escaping are required, as shown in the snippet below: quotes, backslashes and line breaks for the JavaScript string, "</" for XHTML, and a CDATA wrapper around the script body:

<script type="text/javascript">
// <![CDATA[
<?php

// escape for JavaScript
$message   = preg_replace("/\r?\n/", "\\n", addslashes($message));

// escape for XHTML
$message   = preg_replace('%</%i', '<\/', $message);

// send to JavaScript
echo "   var message = \"$message\";\n";

?>
// ]]>
</script>
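
The same two string-level escapes can be sketched in Python. This version swaps in json.dumps to produce the quoted string, which is a different technique from addslashes but escapes the same characters (quotes, backslashes and line breaks); js_string_literal is a made-up name and the message is sample text.

```python
import json


def js_string_literal(message):
    """Return a quoted JavaScript string literal safe to place in a <script> block."""
    literal = json.dumps(message)          # escapes quotes, backslashes, newlines
    return literal.replace("</", "<\\/")   # keep "</" from ending the script early


message = 'He said "hi"\nthen left </script>'
line = "var message = " + js_string_literal(message) + ";"
print(line)
```

Because "\/" is a legal escape for a plain slash, the browser's JavaScript engine reads the original text back unchanged.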

(my thanks to http://centricle.com/tools/html-entities/ for HTML encoding)

Google Maps Icons

Google Maps/Earth do not make icon creation & manipulation easy. Here are a couple of tips:

GIcon (look here: http://code.google.com/apis/maps/documentation/reference.html#GIcon) has a number of methods for creating and modifying an icon. I've found it's best to start with the default because adding features you expect is harder than you think.

// Here's how to create a new icon with the defaults
var newIcon = new GIcon(G_DEFAULT_ICON);

// To create a new icon like the default but yellow:
var yellowIcon = new GIcon(G_DEFAULT_ICON, "http://www.philadelphia-reflections.com/images/googlemapsmarkeryellow.png");

// To make use of that yellow icon and give it a tooltip
var point   = new GLatLng(40.39, -75.34);
var marker  = new GMarker(point, {icon:yellowIcon, title:"View Above Philadelphia"});
map.addOverlay(marker);

// To open a balloon when clicked
var message = " ... fill with text and HTML ... I've found tables are very helpful ";
GEvent.addListener(marker, 'click', function() {marker.openInfoWindowHtml(message);});

// Change icon on mouseover (see http://www.cems.uwe.ac.uk/~cjwallac/apps/phpxml/showIcons.php)
var msoverIcon = new GIcon(G_DEFAULT_ICON, "http://maps.google.com/mapfiles/kml/pal2/icon1.png");
GEvent.addListener(marker, 'mouseover', function() { marker.setImage(msoverIcon.image); });

Note: the "message" part of that snippet has to be escaped correctly.
See http://www.philadelphia-reflections.com/blog/1783.htm


(my thanks to http://centricle.com/tools/html-entities/ for HTML encoding)

QR Codes

QR Codes are similar to bar codes in that they are read optically. They are most common in Japan, where all cell phones can read them; in America, the fancier phones (iPhone, etc.) can read them with a downloadable app (semacode is a free QR Code app for the iPhone, and there are many others for every platform).

One application that is becoming common is encoding a website's URL and including the image in a print advertisement.

The QR Codes below were created by http://qrcode.kaywa.com/; create QR and semacode/DataMatrix from text: http://invx.com/code/.

QR Codes
{philadelphia reflections qrcode} {george fisher qr code}
Philadelphia Reflections George Fisher (Flash... N/G on iPhone)
{chemical heritage society qr code} {kaiser qr code}
Chemical Heritage Society Kaiser Permanente
{george fisher advisors qr code} {george fisher advisors semacode/DataMatrix}
George Fisher Advisors QR Code George Fisher Advisors semacode/DataMatrix

Display Image

This clever website will create an image with the specified dimensions

<img src="http://dummyimage.com/340x123" alt="A Dummy Image" />

{A Dummy Image}
Image created with specific dimensions

Python URL Handling

In case you're wondering "How the heck does Python handle headers and data under 3.2.2?", here's an example that works using IDLE and Python 3.2.2 installed on a 64-bit Windows 7 machine.

import urllib.request
url = "http://www.philadelphia-reflections.com"

uf = urllib.request.urlopen(url)

# header information
print('--- headers ---')
info = uf.info()  # headers

#headers = info._headers # a list of all the headers

print('charsets:',info.get_charsets())
print('content_charset:',info.get_content_charset()) 
print('content_type:',info.get_content_type())
print('content_maintype:',info.get_content_maintype())
print('content_subtype:',info.get_content_subtype())
print('default_type:',info.get_default_type())
print('filename:',info.get_filename())
print('params:',info.get_params())
print('payload:',info.get_payload())

print()
print('--- data ---')

# fall back to UTF-8 when the server does not declare a charset
charset = info.get_content_charset() or 'utf-8'
data = uf.read().decode(charset) # content
print(data[:500])

print()
print('--- image ---')

imageurl = url + "/images/001.JPG"
image = urllib.request.urlretrieve(imageurl, 'python_001.jpg')
print(image)

(my thanks to http://centricle.com/tools/html-entities/ for HTML encoding)

Website Test Results

The site www.webpagetest.org is an excellent facility for testing the performance of a web site. Philadelphia Reflections passes with flying colors:

{Response time test with lots of images}
Response time test with lots of images

REFERENCES


Excellent site for website testing www.webpagetest.org

Iterate through a Word document, modifying picture properties (Blog 2300)

We have a facility on this website to download books of many chapters (made up of volumes of topics on the site) to Microsoft Word for subsequent editing and eventual publishing. In many cases we download lots of pictures (via an img src= tag). I have not found a way to set the way text flows around the images in Word using HTML or CSS, so I built a Word macro to do it. This should allow you to change the size of images, as well as move them around. Moving the captions requires the use of the captions feature in Word's image menu (right-click).

------------------------------------

Instructions for use of a Macro named Sub ImageFlow():

  1. Open Word

  2. In Word, enter File>Open

  3. Enter the URL of the file you want to modify into the File Entry box and press the Enter key to load the document. It may take a minute or two, but a working screen should appear, with the file loaded and ready for you to move the pictures around.

  4. Press Alt + F11 which will open the VBA screen

  5. Copy the macro found on this page from
    Sub ImageFlow()
    to
    End Sub
  6. In the right-hand panel of the VBA screen press Ctrl+A, Ctrl+V to paste it in

  7. In the VBA screen press F5 to run the macro

If you want to do a lot of these manipulations, save the macro in the Macro Library of Windows Word.

------------------------------------
Sub ImageFlow()
'
'  this Macro goes through an entire Word document and
'  changes the way text flows around each picture
'  ("Tight" in this example but see below for choices)
'
    Dim shpIn As InlineShape, shp As Shape

    For Each shpIn In ActiveDocument.InlineShapes
        If (shpIn.Type = wdInlineShapeLinkedPicture) Then
            Set shp = shpIn.ConvertToShape
            shp.WrapFormat.Type = wdWrapTight
        End If
    Next shpIn

    For Each shp In ActiveDocument.Shapes
        shp.WrapFormat.Type = wdWrapTight
    Next shp

End Sub
----------------------------------------

Change wdWrapTight to any of the following:
wdWrapBehind
wdWrapFront
wdWrapInline
wdWrapNone
wdWrapSquare
wdWrapThrough
wdWrapTight
wdWrapTopBottom

My thanks to http://www.phrebh.com/Jenius/252-center-pictures-in-word-with-vba/ for showing me the essential technique of iterating through the pictures.

What are the InlineShapes' Types? See http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word.inlineshape.type(v=office.11).aspx; it is possible we may also need to select on wdInlineShapePicture (as well as wdInlineShapeLinkedPicture) but for my specific purpose I did not need to.

SMTP Authorization and Handling Bounced Emails with PEAR Mail

Recently our ISP started requiring user signon in order to send emails. PHP's mail function stopped working as a result.

Naturally, the ISP did not notify us of this change so we were quite surprised when many thousands of emails on our newsletter list were rejected (every one of them, in fact).

What error message was returned to us to notify us of what the problem was? Why this helpful note:

Mail sent by user nobody being discarded due to sender restrictions in WHM->Tweak Settings

Doesn't that just say it all?

I'm being snide, but our ISP is really quite good about keeping its software up to date and aside from an occasional surprise like this, they are very reliable. Being up to date included the automatic incorporation of the PEAR Mail facility which we are now using.

PEAR's Mail system works quite well but two problems were very vexing until we stumbled our way to a solution:

  1. How, exactly, do we sign on to the SMTP server?
  2. How do we ensure that bounced emails (the bane of all email lists) get returned to us?

You might not think that the first question would be so hard but it actually took a good deal of trial and error to get it right. As for the second question, there is an awful lot of wrong information available out in Internet land (including but not limited to VERP and XVERP which I advise you to avoid).

With PEAR Mail you first set up a "factory" and then send emails, either singly or in a loop. We keep the user id, password, etc. in a file "above" the web server in hopes that will keep them secret ... here's the code (it actually is in production and it does in fact work):

<?php
include('Mail.php');

# the email constants are contained in a file outside the web server
include("/level1/level2/level3/constants.php");

$headers = array (
         'From' => '"name"<addr@domain.com>',
         'Sender' => '"name"<addr@domain.com>',
         'Reply-To' => '"name"<addr@domain.com>',
         'Return-Path' => 'addr@domain.com',
         'Content-type' => 'text/html; charset=iso-8859-1',
         'X-Mailer' => 'PHP/' . phpversion(),
         'Date' => date("D, j M Y H:i:s O",time()),
         'Content-Language' => 'en-us',
         'MIME-Version' => '1.0'
         );

// call the PEAR mail "factory"
$smtp = Mail::factory('smtp',
      array (
            'host' => EMAIL_HOST,
            'port' => EMAIL_PORT,
            'auth' => true,
            'username' => EMAIL_USERNAME,
            'password' => EMAIL_PASSWORD,
            'persist' => true,
            'debug' => false
            ), '-f addr@domain.com'
      );

# to send emails:
#
# $headers['To']      = $to;        # provide the "$to" variable, something like $to = '"name"<addr@domain.com>';
#                                   # note that the first parameter of $smtp->send can be "decorated" this way or just a naked email address
# $headers['Subject'] = $subject;   # provide the "$subject" variable
# $mail = $smtp->send($to, $headers, $contents_of_the_email);
#                          -------- ................................> except for 'To' and 'Subject',
#                                                                     $headers is provided by this module but can be over-ridden
# if (PEAR::isError($mail))
# {
#   echo "<p style='color:red;'>The email failed; debug information follows:<br />";
#   echo $mail->getDebugInfo() . "<br />";
#   echo $mail->getMessage()   . "</p>";
# }
# else
# {
#   echo "<p>email successfully sent</p>";
# }

?>

My thanks to http://htmlentities.net/ for the HTML entities conversion.

Quick Analysis of Financial-Industry Big-Data Analytic Needs

DRAFT George Fisher July 24, 2017

Abstract

Databricks intends to create a Finance Vertical position to support the Sales and SA teams when working with financial-industry organizations. This article attempts to describe the structure of the worldwide financial industry, who the major players are and what their needs might be in the context of Apache Spark and Databricks offerings.

Contents

1 Executive Summary

2 Introduction

3 Risk Mitigation

4 Opportunity Discovery

5 Finance-Industry Sponsored Kaggle Contests

6 Spark and Finance on YouTube

7 APPENDIX


1 Executive Summary

The opportunities in the finance sector lie on a wide spectrum: at one end are the quant funds for whom large-scale analytics are the entire business, at the other are traditional depositories for many of whom a daily batch cycle and a quarterly book closing have long sufficed. Quite often both extremes exist in the same company.

For this entire spectrum the easy-to-use, streaming, multi-source, big-data analytics offered by Databricks can offer advantages (see footnote 1), perhaps with quick adoption by the quants and slower adoption by the others. Early adoption may involve a lot of discovery but a growing collection of proven use cases will ease later sales.

• streaming will supplant batch

• predictive analytics will replace BI

• easy multi-sourcing can unite stove pipes

• pooling can dramatically reduce operational complexity and cost

In addition, in the larger companies, the pressure to comply with data-related regulations company-wide has become almost overwhelming and nearly all are struggling with multitudes of incompatible systems that Spark might unite.

2 Introduction

The finance industry is vast, far too large and diverse to make a comprehensive enumeration of all the functions performed or of the firms that perform them. The Economist Intelligence Unit [14] might be a good source to begin with for such a survey.

The Appendix of this report contains lists of the major financial organizations grouped by function.

The questions of interest to Databricks are (1) which finance firms are most likely to benefit from the manipulation and analysis of large datasets and (2) what are the types of manipulation and analysis of interest?

The two main concerns for the finance industry are:

• Risk Mitigation

• Opportunity Discovery

Footnote 1: I wonder whether the entirely cloud-based solution offered by Databricks leaves a lot on the table, given the pervasiveness of proprietary datacenters in this world. IBM mainframes, at that.


3 Risk Mitigation

[7]

Simply put, risk mitigation means don’t lose money, don’t go out of business and don’t go to jail.

Risk Categories

1. Business Risk: Risks undertaken by the business itself to maximize shareholder value and profits. For example: the cost to launch a new product. Risk mitigation takes the form of competent management controls.

2. Exogenous Risk: Political upheaval, natural disaster, economic disruption. Insurance is the most common risk mitigation tool in these cases.

3. Financial Risk: Financial risk arises from volatility in equities, derivatives, currencies, interest rates etc. In the case of financial firms these risks are also Business Risks since finance is the business.

• Market Risk: Changes in prices, their magnitude, direction and volatility.

• Credit Risk: The effect of counter-party default or the repercussions of providing services to bad actors.

• Liquidity Risk: The inability to make timely payment. Margin calls often precipitate this when illiquid securities cannot be sold or collateralized.

• Operational Risk: Failures of judgment, integrity, controls, procedures or technology.

Cyber Security: An aspect of Operational Risk that gains clarity at senior levels with every report of the losses incurred and chaos engendered by widespread sophisticated hacking.

Financial-firm financial-risk mitigation is a field of study unto itself. For example, there is a rigorous, multi-part Financial Risk Manager (FRM) Certification [5] created by the Global Association of Risk Professionals (GARP).

4. Regulatory Compliance: While perhaps not a risk per se, this is a huge concern to financial firms, particularly since the Financial Crisis of a decade ago and the rules promulgated as a response.

For example, one of the main tenets of BCBS 239 [15] is that all ‘material risk data’ must be automatically aggregated and analyzed across the entire banking group on a near-real-time basis, even while facing severe economic stresses. Multitudes of incompatible systems are a huge barrier.

[11]


4 Opportunity Discovery

If Risk Mitigation is Operations, Opportunity Discovery is Research & Development.

A non-exhaustive list:

• Fundamental Analysis: The study of the financial characteristics of individual firms, seeking undiscovered value. Warren Buffett is the world’s most famous fundamental analyst.

• Macro: The study of economy-wide signals. George Soros’ famous short of the UK Pound is an example [12]. The ‘Big Short’ of 2007-2008 is another [22].

• Relative: The study of relative movements of securities. Long/Short hedge funds are an example.

• Technical Analysis: The study of trendlines.

• Quantitative Analysis: The intersection of big data and machine learning. Jim Simons’ Renaissance Technologies [16] is the most successful example I know of but there are many others; some are listed in the appendix under Quant Funds. Some Kaggle contests focused on this; see Section 5.

• Product Development: Swaps are an example of building a product to meet very specific customer needs. Even more sophisticated products are possible with analytical support using all available data.

• Customer Enhancement: Using machine learning to reduce customer churn; using predictive analytics for product-customer targeting; consistent customer support across multiple access channels; etc. ... using Amazonian techniques in a banking environment to take on the characteristics of the fintechs.

• Cost Control: Route optimization for filling ATMs; redundant process identification; risk reduction not just as a regulatory requirement, but as a cost saver and a profit enhancer.

• Risk System Integration: The regulators are forcing the larger firms to create “living wills”, which has resulted in a much better understanding of the numerous piece parts. The Basel risk data requirements are now forcing a near-real-time integration of numerous disparate systems. This seems like fertile ground for innovation, both for compliance and to build upon the results.


5 Finance-Industry Sponsored Kaggle Contests

Over the past several years a number of financial firms have sponsored Kaggle contests. Someone at these firms thought that these subjects were worth paying for crowd-sourced analysis and was willing to go to the considerable trouble of setting up and monitoring a contest with thousands of participants lasting three months or more.

Two Sigma is a quant fund, listed in the appendix. The challenge was to predict daily price changes. (In this contest I earned a Kaggle Silver Medal for coming in 37th out of 2,070 contestants. [9])

Opportunity Discovery

Improve credit risk models by predicting the probability of default on consumer credit.

Risk Mitigation

Improve the quality of information within transaction data.

Risk Mitigation

Predict which customers will leave an insurance company in the next 12 months.

Risk Mitigation

Given a dataset of 2D dashboard camera images, State Farm is challenging Kagglers to classify each driver’s behavior. Are they driving attentively, wearing their seatbelt, or taking a selfie with their friends in the backseat?

Risk Mitigation

Santander (Spain-based bank) is challenging Kagglers to predict which products their existing customers will use in the next month based on their past behavior and that of similar customers.

Opportunity Discovery

Santander Bank is asking Kagglers to help them identify dissatisfied customers early in their relationship.

Risk Mitigation, Opportunity Discovery

Using terabytes of noisy, non-stationary data, Winton Capital is looking for data scientists who excel at finding the hidden signal in the proverbial haystack, and who are excited by creating novel statistical modeling and data mining techniques.

Opportunity Discovery

Using a customer’s shopping history, can you predict what insurance policy they will end up choosing?

Opportunity Discovery


Claims management may require different levels of check before a claim can be approved and a payment can be made. With the new practices and behaviors generated by the digital economy, this process needs adaptation thanks to data science to meet the new needs and expectations of customers. Kagglers are challenged to predict the category of a claim based on features available early in the process.

Risk Mitigation, Opportunity Discovery

The life insurance application process is antiquated. Customers provide extensive information to identify risk classification and eligibility, including scheduling medical exams, a process that takes an average of 30 days.

The result? People are turned off. That’s why only 40% of U.S. households own individual life insurance. Prudential wants to make it quicker and less labor intensive for new and existing customers to get a quote while maintaining privacy boundaries.

Opportunity Discovery

Predict a transformed count of hazards or pre-existing damages using a dataset of property information. This will enable Liberty Mutual to more accurately identify high risk homes that require additional examination to confirm their insurability.

Risk Mitigation

Fire losses account for a significant portion of total property losses. High severity and low frequency, fire losses are inherently volatile, which makes modeling them difficult. In this challenge, your task is to predict the transformed ratio of loss to total insured value. This will enable more accurate identification of each policyholder’s risk exposure and the ability to tailor the insurance coverage for their specific operation.

Risk Mitigation

The Benchmark Bond Trade Price Challenge is a competition to predict the next price that a US corporate bond might trade at.

Opportunity Discovery

Determine whether a loan will default and the loss incurred. We are building a bridge between traditional banking, where we are looking at reducing the consumption of economic capital, to an asset-management perspective, where we optimize on the risk to the financial investor.

Risk Mitigation

Develop models to predict the stock market’s short-term response following large trades. Contestants are asked to derive empirical models to predict the behavior of bid and ask prices following such “liquidity shocks”.

Modeling market resiliency will improve trading strategy evaluation methods by increasing the realism of back testing simulations, which currently assume zero market resiliency.

Risk Mitigation, Opportunity Discovery

Bodily Injury Liability Insurance covers other people’s bodily injury or death for which the insured is responsible. The goal of this competition is to predict Bodily Injury Liability Insurance claim payments based on the characteristics of the insured’s vehicle.

Risk Mitigation


Allstate is currently developing automated methods of predicting the cost, and hence severity, of claims. Kagglers are invited to create an algorithm which accurately predicts claims severity.

Risk Mitigation


6 Spark and Finance on YouTube

• Apache Spark on IBM z Systems Demo for Finance

https://www.youtube.com/watch?v=yw0dQFMyxFQ

References to IMS, CICS and VSAM make me think this is Spark on an IBM mainframe. Considering the fact that IBM mainframes are still quite widely used, this might be worth understanding.

Opportunity Discovery, Risk Mitigation

• Using Spark to Analyze Activity and Performance in High Speed Trading Environments

https://www.youtube.com/watch?v=zdz9Cj1-hjA

Corvil: Irish data monitoring and analytics for financial data using Spark. Non-intrusive low-latency electronic trading monitoring, regulatory compliance through the use of streaming telemetry.

Risk Mitigation

• Spark in Finance Quantitative Investing

https://www.youtube.com/watch?v=WPc-DoSeCpU&t=7s

Reading historical and live tick data, determine a trend and propose trades.

Opportunity Discovery

• Financial Modeling Using Apache Spark

https://www.youtube.com/watch?v=jCXOa6doXEs

BlackRock analysis of mortgage data. Using Spark, Scala and D3 to visualize a large loan-level mortgage dataset, extract distributions and cluster boundaries. Also uses K-Means to reveal similar borrower groups and the corresponding discriminant attributes.

Opportunity Discovery

• Estimating Financial Risk with Spark

https://www.youtube.com/watch?v=0OM68k3np0E

VaR with Monte Carlo using market risk factors explained by Cloudera

Risk Mitigation


• Apache Spark in Financial Modeling at BlackRock

https://www.youtube.com/watch?v=wLJi8YQcWjc&t=2881s

BlackRock mortgage security analysis: Why Scala? Why Spark?

Opportunity Discovery

• A Distributed Time Series Analysis Framework for Spark

https://www.youtube.com/watch?v=x2iM5he2gAU

Two Sigma equity price prediction with multi-terabyte datasets; built own time series system on top of Spark

Opportunity Discovery

• Credit Fraud Prevention with Spark and Graph Analysis

https://www.youtube.com/watch?v=q5HFMVoN_rc

Capital One fraud detection

Risk Mitigation

• Stratio’s Big Data - Use case in finance

https://www.youtube.com/watch?v=wmuG3nU9fiY

Cartoon marketing video for Stratio Big Data Inc. which I did not investigate

• Estimating Financial Risk with Spark

https://www.youtube.com/watch?v=t2RmlshHBvI

Duplicate Cloudera VaR presentation

• IBM LinuxONE Scalable Financial Trading Analysis & Insight

https://www.youtube.com/watch?v=Uw2ZioWa-Ak

Combine streaming market data, Twitter, news feed using Spark on an IBM LinuxONE machine. Did not investigate IBM LinuxONE.

• Streaming Stock Market Data with Apache Spark and Kafka

https://www.youtube.com/watch?v=0tSZo8I2924&t=3139s

MapR presentation: high-velocity stream processing, post-Hadoop, of NYSE data at 20 megabytes per second with time windowing. One Kafka topic per stock, parallelized.

Risk Mitigation, Opportunity Discovery

• An Example Application for Processing Stock Market Trade Data

https://www.youtube.com/watch?v=CXJK4SII0IY

MapR presentation on streaming NYSE data pub/sub

Opportunity Discovery

• Time Series Stream Processing with Spark and Cassandra

https://www.youtube.com/watch?v=fBWLzB0FMX4

Cloudance Ltd: multi-station weather data, group-by on petabytes in an operational setting. Not trade data but similar structure.

• Realtime Risk Management Using Kafka, Python, and Spark Streaming

https://www.youtube.com/watch?v=ObBdwhbyv1M

• DATA & ANALYTICS: Analyzing 25 billion stock market events in an hour with NoOps on GCP

https://www.youtube.com/watch?v=fqOpaCS117Q
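
For readers unfamiliar with the "VaR with Monte Carlo" idea mentioned in the Cloudera entry above, here is a minimal single-machine sketch in plain Python. It is not taken from any of the talks and does not use Spark. The lognormal price model, the single risk factor, the parameter values and the function names simulate_pnl and value_at_risk are all assumptions made purely for illustration.

```python
import math
import random


def simulate_pnl(position_value, mu, sigma, horizon_days, n_paths, seed=0):
    """Simulate profit-and-loss outcomes for one position.

    Assumption: the price follows a lognormal model with annual drift `mu`
    and annual volatility `sigma`; 252 trading days per year.
    """
    rng = random.Random(seed)
    dt = horizon_days / 252.0
    drift = (mu - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    pnl = []
    for _ in range(n_paths):
        shock = rng.gauss(0.0, 1.0)  # one standard-normal market shock per path
        pnl.append(position_value * (math.exp(drift + vol * shock) - 1.0))
    return pnl


def value_at_risk(pnl, confidence=0.99):
    """Return VaR as a positive number: the loss that is exceeded in
    roughly (1 - confidence) of the simulated paths."""
    ordered = sorted(pnl)  # worst outcomes first
    index = int((1.0 - confidence) * len(ordered))
    return -ordered[index]


# Hypothetical position: 1,000,000 held for 10 trading days.
pnl = simulate_pnl(position_value=1_000_000, mu=0.05, sigma=0.20,
                   horizon_days=10, n_paths=20_000)
var_99 = value_at_risk(pnl, 0.99)
print(f"10-day 99% VaR: {var_99:,.0f}")
```

A production system would presumably simulate many correlated risk factors across a whole portfolio, which is where a distributed engine such as Spark becomes attractive; the sketch only shows the simulate, sort and read-off-the-quantile pattern.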



7 APPENDIX

Global Financial Services Companies by Revenue

[20]

Company | Industry | Revenue | Country
Berkshire Hathaway | Conglomerate | 210.8 | United States
AXA | Insurance | 147.5 | France
Allianz | Insurance | 140.3 | Germany
ICBC | Banking | 134.8 | China
Fannie Mae | Investment Services | 131.9 | United States
ING | Banking | 130.0 | Netherlands
BNP Paribas | Banking | 126.2 | France
Generali Group | Insurance | 116.7 | Italy
China Construction Bank | Banking | 113.1 | China
Banco Santander | Banking | 108.8 | Spain
JP Morgan Chase | Banking | 108.2 | United States
Société Générale | Banking | 107.8 | France
HSBC | Banking | 104.9 | United Kingdom
Agricultural Bank of China | Banking | 103.0 | China
Bank of America | Banking | 100.1 | United States
Bank of China | Banking | 98.1 | China
Wells Fargo | Banking | 91.2 | United States
Citigroup | Banking | 90.7 | United States
Prudential | Insurance | 90.2 | United Kingdom
Munich Re | Insurance | 88.0 | Germany
Prudential Financial | Insurance | 84.8 | United States
Freddie Mac | Investment Services | 80.6 | United States
Banco Bradesco | Banking | 78.3 | Brazil
Lloyds Banking Group | Banking | 75.6 | United Kingdom
Itaú Unibanco Holding | Banking | 70.5 | Brazil
Zurich Insurance Group | Insurance | 70.4 | Switzerland
Aviva | Insurance | 69.0 | United Kingdom
Banco do Brasil | Banking | 69.0 | Brazil
MetLife | Insurance | 68.2 | United States
American International Group | Insurance | 65.7 | United States
China Life Insurance | Insurance | 63.2 | China
Mitsubishi UFJ Financial Group | Banking | 59.0 | Japan
Legal & General Group | Insurance | 56.9 | United Kingdom
Dai-ichi Life | Insurance | 56.5 | Japan
Barclays | Banking | 55.7 | United Kingdom
Aegon | Insurance | 55.2 | Netherlands
Deutsche Bank | Banking | 55.0 | Germany
UniCredit | Banking | 54.2 | Italy
CNP Assurances | Insurance | 53.2 | France
BBVA | Banking | 52.1 | Spain
Credit Agricole | Banking | 51.2 | France
Ping An Insurance Group | Insurance | 51.1 | China
National Australia | Banking | 49.2 | Australia
Commonwealth Bank | Banking | 47.8 | Australia
Intesa Sanpaolo | Banking | 47.7 | Italy
UBS | Investment Services | 47.7 | Switzerland
Sumitomo Mitsui Financial Group | Banking | 47.3 | Japan
Westpac Banking Group | Banking | 43.9 | Australia
Bank of Communications | Banking | 43.5 | China
Credit Suisse Group | Investment Services | 42.5 | Switzerland
MS&AD Insurance Group | Insurance | 42.2 | Japan
Royal Bank of Scotland | Banking | 42.1 | United Kingdom
Goldman Sachs | Investment Services | 41.7 | United States
People’s Insurance Company | Insurance | 41.3 | China
Tokio Marine Holdings | Insurance | 39.4 | Japan
Royal Bank of Canada | Banking | 38.3 | Canada
ANZ | Banking | 37.5 | Australia
Manulife Financial | Insurance | 37.3 | Canada
Sberbank | Banking | 36.1 | Russia
State Bank of India | Banking | 35.1 | India
Talanx | Insurance | 34.9 | Germany
Power Corporation of Canada | Insurance | 34.2 | Canada
Swiss Re | Insurance | 33.6 | Switzerland
American Express | Financial Services | 33.4 | United States
Allstate | Insurance | 33.3 | United States
Mizuho Financial Group | Banking | 32.8 | Japan
Old Mutual | Investment Services | 32.2 | United Kingdom
Morgan Stanley | Investment Services | 32.0 | United States
Standard Life | Insurance | 31.2 | United Kingdom
Sompo Holdings | Insurance | 30.9 | Japan
TD Bank Group | Banking | 30.6 | Canada
China | Banking | 28.4 | China
China | Banking | 27.9 | China
Bank of Nova Scotia | Banking | 27.6 | Canada
Onex | Investment Services | 27.4 | Canada
China | Insurance | 27.3 | China
Mapfre | Insurance | 27.1 | Spain
Standard Chartered | Banking | 26.9 | United Kingdom
Dexia | Banking | 26.6 | Belgium
Hartford Financial Services | Insurance | 26.4 | United States
Travelers Cos | Insurance | 25.7 | United States
Commerzbank | Banking | 25.5 | Germany
Aflac | Insurance | 25.4 | United States
Shanghai Pudong Development | Banking | 25.4 | China


Major Stock Exchanges

[21]

Exchange | Country or region | City
New York Stock Exchange | United States | New York
NASDAQ | United States | New York
London Stock Exchange Group | United Kingdom | London
Japan Exchange Group | Japan | Tokyo
Shanghai Stock Exchange | China | Shanghai
Hong Kong Stock Exchange | Hong Kong | Hong Kong
Euronext | European Union | Amsterdam, Brussels, Lisbon, London, Paris
Shenzhen Stock Exchange | China | Shenzhen
Toronto Stock Exchange | Canada | Toronto
Deutsche Börse | Germany | Frankfurt
Bombay Stock Exchange | India | Mumbai
National Stock Exchange of India | India | Mumbai
SIX Swiss Exchange | Switzerland | Zurich
Australian Securities Exchange | Australia | Sydney
Korea Exchange | South Korea | Seoul
OMX Nordic Exchange | Sweden | Stockholm
JSE Limited | South Africa | Johannesburg
BME Spanish Exchanges | Spain | Madrid
Taiwan Stock Exchange | Taiwan | Taipei
BM&F Bovespa | Brazil | São Paulo

Quant Funds

[13]

• D. E. Shaw (New York, NY)

• Renaissance Technologies (East Setauket, NY)

• Morgan Stanley PDT (New York, NY)

• Point72 Asset Management (SAC Capital)

• AQR Capital

• Two Sigma Investments (New York, NY)

• Citadel (Chicago, IL)

• Jane Street Capital (New York and London)

• RG Niederhoffer

• Jump Trading


• KCG Holdings

• Bridgewater Associates

• Hudson River Trading

• Man Group AHL

• Highbridge

• Millennium/WorldQuant

• Winton

• Bluecrest

• Ellington Capital

• Tower Research Capital

• Parametrica Global Master Ltd

• Camox Ltd

• Voloridge Trading

• Senvest Partners Ltd

• BlackRock European Hedge

Credit Card Issuers

[1]

1. Visa - 323M Cardholders

2. MasterCard - 191M Cardholders

3. Chase - 93M Cardholders

4. American Express - 58M Cardholders

5. Discover - 57M Cardholders

6. Citibank - 48M Cardholders

7. Capital One - 45M Cardholders

8. Bank of America - 32M Cardholders

9. Wells Fargo - 24M Cardholders

10. US Bank - 18.5M Cardholders


11. USAA - 10M Cardholders

12. Credit One - 6M Cardholders

13. Barclaycard US - 418K Cardholders

14. First PREMIER Bank (subprime)

15. PNC

Mortgage Risk

[10]

Prior to the financial collapse of 2007-2008 mortgage securitization was the hot thing. Many institutions and individuals got burned and a residual fear of securitization remains.

The result is that for jumbo and subprime mortgages, the originators are now holding many more of the loans. This reduces the systemic risk, but an unanticipated consequence is that Fannie Mae and Freddie Mac [3] are now holding 50% of the $11 trillion outstanding in the middle market.

Therefore the US government has taken on a huge amount of default and interest-rate risk.


Insurance Companies by Premium Income

[8]

Property/Casualty Insurance

Company | Premium income
State Farm Mutual Automobile Insurance | 62,189,311
Berkshire Hathaway Inc. | 33,300,439
Liberty Mutual | 32,217,215
Allstate Corp. | 30,875,771
Progressive Corp. | 23,951,690
Travelers Companies Inc. | 23,918,048
Chubb Ltd. | 20,786,847
Nationwide Mutual Group | 19,756,093
Farmers Insurance Group of Companies | 19,677,601
USAA Insurance Group | 18,273,675

Life Insurance/Annuities

Company | Premium income
MetLife Inc. | 95,110,802
Prudential Financial Inc. | 45,902,327
New York Life Insurance Group | 30,922,462
Principal Financial Group Inc. | 28,186,098
Massachusetts Mutual Life Insurance Co. | 23,458,883
American International Group | 22,463,202
Jackson National Life Group | 22,132,278
AXA | 21,920,627
AEGON | 21,068,180
Lincoln National Corp. | 19,441,555

Homeowners Insurance

Company | Premium income
State Farm Mutual Automobile Insurance | 17,516,715
Allstate Corp. | 7,926,984
Liberty Mutual | 5,993,803
Farmers Insurance Group of Companies | 5,284,511
USAA Insurance Group | 5,000,407
Travelers Companies Inc. | 3,305,427
Nationwide Mutual Group | 3,249,456
American Family Insurance Group | 2,609,366
Chubb Ltd. (4) | 2,485,193
Erie Insurance Group | 1,471,544

Private Passenger Auto Insurance

Company | Premium income
State Farm Mutual Automobile Insurance | 39,194,660
Berkshire Hathaway Inc. | 25,531,762
Allstate Corp. | 20,813,858
Progressive Corp. | 19,634,834
USAA Insurance Group | 11,691,051
Liberty Mutual | 10,774,426
Farmers Insurance Group of Companies | 10,304,622
Nationwide Mutual Group | 7,640,558
American Family Insurance Group | 4,005,549
Travelers Companies Inc. | 3,896,786

Commercial Auto Insurance

Company | Premium income
Progressive Corp. | 2,625,929
Travelers Companies Inc. | 2,124,182
Nationwide Mutual Group | 1,735,614
Zurich Insurance Group | 1,624,621
Liberty Mutual | 1,604,461
Old Republic International Corp. | 1,123,042
Berkshire Hathaway Inc. | 951,775
American International Group (AIG) | 867,567
Auto-Owners Insurance Co. | 739,495
Chubb Ltd. | 695,210

Commercial Lines Insurance

Company | Premium income
Chubb Ltd. | 16,528,891
Travelers Companies Inc. | 16,463,566
Liberty Mutual | 15,056,251
American International Group (AIG) | 13,144,961
Zurich Insurance Group | 12,554,597
CNA Financial Corp. | 9,763,122
Nationwide Mutual Group | 8,335,275
Hartford Financial Services | 7,679,737
Berkshire Hathaway Inc. | 7,650,236
Tokio Marine Group | 6,256,196

Workers’ Compensation Insurance

Company | Premium income
Travelers Companies Inc. | 4,467,425
Hartford Financial Services | 3,324,361
AmTrust Financial Services | 2,972,901
Zurich Insurance Group | 2,851,695
Liberty Mutual | 2,481,479
Berkshire Hathaway Inc. | 2,479,354
State Insurance Fund Workers’ Comp (NY) | 2,437,325
Chubb Ltd. | 2,368,918
American International Group | 2,345,247
State Compensation Insurance Fund (CA) | 1,638,849


Global Asset Management Firms by Revenue

[18]

Firm | Country | Revenue
BlackRock | United States | 4,890
The Vanguard Group | United States | 3,149
UBS | Switzerland | 2,716
State Street Global Advisors | United States | 2,460
Fidelity Investments | United States | 2,025
Allianz | Germany | 1,949
J.P. Morgan Asset Management | United States | 1,760
BNY Mellon Investment Management | United States | 1,740
PIMCO | United States | 1,590
Credit Agricole Group | France | 1,527

Global Investment Banks by Revenue

[2]

Bank | Revenue
JPMorgan | 3,361
Goldman Sachs | 2,858
Bank of America Merrill Lynch | 2,684
Morgan Stanley | 2,501
Citi | 2,378
Barclays | 1,884
Credit Suisse | 1,760
Deutsche Bank | 1,387
RBC Capital Markets | 994
UBS | 904
Wells Fargo Securities | 871
HSBC | 793
Jefferies LLC | 750
BNP Paribas | 619
Lazard | 565
BMO Capital Markets | 448
Nomura | 445
Mizuho | 435
Sumitomo Mitsui Financial Group | 413
Evercore Partners Inc | 407


Hedge Funds By Assets Under Management

[6]

OrgCRD | PrimaryBusinessName | May2017AUM
110814 | NOMURA ASSET MANAGEMENT CO., LTD. | 367.6
105129 | BRIDGEWATER ASSOCIATES, LP | 239.3
158117 | MILLENNIUM MANAGEMENT LLC | 207.6
158319 | SAMSUNG ASSET MANAGEMENT COMPANY, LTD. | 182.2
148826 | CITADEL ADVISORS LLC | 152.7
143161 | APOLLO CAPITAL MANAGEMENT, L.P. | 125
140074 | PICTET ASSET MANAGEMENT SA. | 122.8
110997 | NIKKO ASSET MANAGEMENT CO LTD | 120.6
282598 | VANGUARD ASSET MANAGEMENT, LIMITED | 120.2
111128 | THE CARLYLE GROUP | 101.9
106661 | RENAISSANCE TECHNOLOGIES LLC | 97
144533 | KOHLBERG KRAVIS ROBERTS | 90
168122 | ANNALY MANAGEMENT COMPANY | 87.9
152719 | ALPHADYNE ASSET MANAGEMENT PTE. LTD. | 84.6
133720 | PINE RIVER CAPITAL MANAGEMENT L.P. | 82.8
159732 | TPG GLOBAL ADVISORS, LLC | 79.5
138111 | BALYASNY ASSET MANAGEMENT L.P. | 75.1
144603 | EASTSPRING INVESTMENTS (SINGAPORE) LIMITED | 74.5
155587 | FIELD STREET CAPITAL MANAGEMENT, LLC | 63.3
107580 | BLACKSTONE ALTERNATIVE ASSET MANAGEMENT LP | 62.3
148823 | BLUECREST CAPITAL MANAGEMENT LIMITED | 62.2
142979 | BLACKSTONE REAL ESTATE ADVISORS L.P. | 60.1
160795 | APG ASSET MANAGEMENT US, INC | 59.3
130074 | ARES MANAGEMENT LLC | 58.4
136979 | BLACKSTONE MANAGEMENT PARTNERS L.L.C. | 57.4
161600 | AGNC MANAGEMENT, LLC | 56.9
129612 | FORTRESS INVESTMENT GROUP | 56.9
156601 | ELLIOTT MANAGEMENT CORPORATION | 56
160309 | ELEMENT CAPITAL MANAGEMENT LLC | 55.9
139345 | MACQUARIE FUNDS MANAGEMENT | 54.7
160188 | MOORE CAPITAL MANAGEMENT, LP | 53.8
107913 | OZ MANAGEMENT LP | 51.7
159738 | TPG CAPITAL ADVISORS, LLC | 51.6
137137 | TWO SIGMA INVESTMENTS, LP | 49.3
152254 | TWO SIGMA ADVISERS, LP | 48.7
110338 | MACKENZIE INVESTMENTS | 48.6
156078 | HUDSON AMERICAS L.P. | 48.4
160000 | LONE STAR NORTH AMERICA ACQUISITIONS, LLC | 48.1
152175 | CERBERUS CAPITAL MANAGEMENT, L.P. | 48
173355 | CANDRIAM LUXEMBOURG S.C.A. | 47.1
156934 | 3G CAPITAL PARTNERS LP | 46.3
143158 | APOLLO MANAGEMENT, L.P. | 46.2
157589 | CAPULA INVESTMENT US LP | 45.8
156945 | WARBURG PINCUS LLC | 45.7
132272 | VIKING GLOBAL INVESTORS LP | 43.4
160679 | ADAGE CAPITAL MANAGEMENT, L.P. | 42
146629 | KKR CREDIT ADVISORS (US) LLC | 41.5
159215 | ALPINVEST PARTNERS B.V. | 41.2
108679 | D. E. SHAW | 37

Largest private equity firms by PE capital raised

[17]

Firm | Headquarters | PE capital raised
The Carlyle Group | Washington D.C. | $30,650.33
Kohlberg Kravis Roberts | New York City | $27,182.33
The Blackstone Group | New York City | $24,639.84
Apollo Global Management | New York City | $22,298.02
TPG | Fort Worth/San Francisco | $18,782.59
CVC Capital Partners | Luxembourg | $18,082.35
General Atlantic | New York City | $16,600.00
Ares Management | Los Angeles | $14,113.58
Clayton Dubilier & Rice | New York City | $13,505.00
Advent International | Boston | $13,228.09
EnCap Investments | Houston | $12,400.20
Goldman Sachs Principal Investment Area | New York City | $12,343.32
Warburg Pincus | New York City | $11,213.00
Silver Lake | Menlo Park | $10,986.40
Riverstone Holdings | New York City | $10,384.26
Oaktree Capital Management | Los Angeles | $10,147.28
Onex | Toronto | $10,097.21
Ardian (formerly AXA Private Equity) | Paris | $9,805.25
Lone Star Funds | Dallas | $9,731.81


Investment Banking Private Equity Groups

[19]

Bank | Private equity group(s)
ABN AMRO | AAC Capital Partners
Barclays Capital | Equistone Partners Europe
BNP Paribas | PAI Partners
CIBC World Markets | Trimaran Capital Partners
Citigroup | Court Square; CVC; Welsh, Carson, Anderson & Stowe; Bruckmann, Rosser, S
Deutsche Bank | MidOcean Partners
Globus Capital Holdings | Globus Capital Banca
Goldman Sachs | Goldman Sachs Capital Partners
JPMorgan Chase | CCMP Capital; One Equity Partners
Lazard | Lazard Alternative Investments
Merrill Lynch | Merrill Lynch Global Private Equity
Morgan Stanley | Metalmark Capital; Morgan Stanley Capital Partners New York
National Westminster Bank | Bridgepoint Capital
Nomura Group | Terra Firma Capital Partners
UBS | UBS Capital; Affinity Equity Partners; Capvis; Lightyear Capital
Wells Fargo | Pamlico Capital
William Blair & Company | William Blair Capital Partners


Federal Reserve System

{Privateers}

The St. Louis Fed is well known among economics geeks as a fantastic source of data, analysis and commentary. [4] In fact, all the Fed banks are avid consumers of data, analysis and risk-management metrics.


References

[1] cardrates.com. 15 Largest Credit Card Issuers. http://www.cardrates.com/news/credit-card-companies/, 2017.

[2] dealogic. Global IB Revenue Ranking. http://fn.dealogic.com/fn/IBRank.htm, July 2017.

[3] Federal Housing Finance Agency. About Fannie Mae and Freddie Mac. https://www.fhfa.gov/SupervisionRegulation/FannieMaeandFreddieMac/Pages/About-Fannie-Mae---Freddie-Mac.aspx.

[4] Federal Reserve Bank of St. Louis. FRED Economic Data. https://fredblog.stlouisfed.org/.

[5] Global Association of Risk Professionals (GARP). Financial Risk Manager (FRM) Certification. http://www.garp.org/.

[6] Rayner Gobran. Biggest hedge funds by assets under management May 2017. https://www.raynergobran.com/2017/05/biggest-hedge-funds-by-assets-under-management-may-2017/, May 2017.

[7] https://www.simplilearn.com/. Financial Risk and its Types. https://www.simplilearn.com/financial-risk-and-types-rar131-article, 2016.

[8] Insurance Information Institute. Insurance Company Rankings. http://www.iii.org/fact-statistic/insurance-company-rankings, December 2016.

[9] Kaggle. Two Sigma Financial Modeling Challenge. https://www.kaggle.com/c/two-sigma-financial-modeling/leaderboard, 2017.

[10] MarketWatch. http://www.marketwatch.com/story/why-the-federal-government-now-holds-nearly-50-of-all-residential-mortgages-2015-10-16, 2015.

[11] McKinsey & Company. Living with BCBS 239. http://www.mckinsey.com/business-functions/risk/our-insights/living-with-bcbs-239, May 2017.

[12] Priceonomics. The Trade of the Century: When George Soros Broke the British Pound. https://priceonomics.com/the-trade-of-the-century-when-george-soros-broke/.

[13] Quora. What are the best quant funds? https://www.quora.com/What-are-the-best-quant-hedge-funds.

[14] The Economist of London. The Economist Intelligence Unit. https://www.eiu.com/home.aspx.

[15] Wikipedia. BCBS 239. https://en.m.wikipedia.org/wiki/BCBS_239.

[16] Wikipedia. James Harris Simons. https://en.wikipedia.org/wiki/James_Harris_Simons.

[17] Wikipedia. Largest private equity firms by PE capital raised. https://en.wikipedia.org/wiki/List_of_private_equity_firms.

[18] Wikipedia. List of asset management firms. https://en.wikipedia.org/wiki/List_of_asset_management_firms.

[19] Wikipedia. List of investment banking private equity groups. https://en.wikipedia.org/wiki/List_of_private_equity_firms.

[20] Wikipedia. List of largest financial services companies by revenue. https://en.wikipedia.org/wiki/List_of_largest_financial_services_companies_by_revenue.

[21] Wikipedia. Major Stock Exchanges. https://en.wikipedia.org/wiki/List_of_stock_exchanges.

[22] Wikipedia. The Big Short. https://en.wikipedia.org/wiki/The_Big_Short.

When can you start?
Posted by: Eipmwlod   |   Jan 25, 2012 3:55 PM
I have not been to this blog for a long time, however it was a joy to find it again. It is such an important topic and ignored by so many, even professionals! I thank you for helping to make people more aware of these issues. Just great stuff as per usual. For more information go to <www.brandwebdirect.com>.
Posted by: Johnmaurice   |   Sep 26, 2011 6:40 AM
this post is fantastic %3
Posted by: Yutqlnvp   |   Feb 5, 2011 11:09 PM
Some pretty amazingly esoteric information here, mate!
Posted by: Late Nights   |   Mar 16, 2007 10:06 PM
Good work. Interesting posts, besides those spam...
Posted by: Oopa Jopa   |   Oct 11, 2006 6:01 PM
C'est trouis bien. Nice, i mean. Thanks!
Posted by: Junior Lee   |   Sep 26, 2006 11:07 AM


Blogs

XHTML vs. HTML
XHTML is advanced HTML. Not all browsers support it, so pages must test first and serve what's supported.

PHP output buffering improves response time. Gzip speeds internet transmission, compressing text volume (not images) up to 80%.

Webpage Printing
Webpage printing is supported on this site. It seems to work pretty well except for text flow-around for some browsers.

Floating Three-Column CSS Layout
A popular web page layout is three columns with a header and footer. This is achieved on this web site with CSS using floating columns.

Editing HTML with PHP scripts
To provide a PHP script that allows a user to edit HTML requires a few tricks that are hard to hack through but are elegantly documented in the manual.

Open a new window with XHTML
Prior to XHTML, you could open a new window with a link by saying target="_blank". That's no longer allowed, but what can you do?

DHTML, PHP and MySQL References
What are good references for advanced website development?

Web hosting providers
We have used two service providers: one good, the other poor.

Regular Expressions
Regular Expressions, regex, are an obscure but very powerful pattern-matching tool that every developer should learn.

Javascript: document.write and XHTML
Document.write does not work with "true" XHTML. Don't bother trying to fix the javascript.

Captcha
Captcha is the term for security codes the user must enter into a form

RSS, Atom, Syndication, etc.
The world is full of XML and XML-like file formats for syndication-feed purposes. Why they must all be different, God alone can tell. But the reason there is lousy documentation is the evil work of Man.

Web Standards Validation
It is important to confirm that your website conforms to standards

Ampersand Madness: Convert & to &amp; to prevent XHTML errors
A regex solution to the huge problem of ampersand encoding in XHTML

CSS Zen Garden Suggestions
Here's a list of the designs I think are worth a look, illustrating the great power of CSS.

Font Families
A survey of the most-commonly installed fonts found on Windows machines

Geo Positioning
With the advent of Google Earth, the tagging of websites, blogs and photographs with latitude and longitude information has taken a great leap forward.

Process .htm and .html as php
How to include php scripts in html files

Call KML files from within a blog or topic
How to create a button to call a KML or KMZ file

Send a KML file from disk using PHP
Preprocessing a kml or kmz disk file improves the user experience

Static vs Dynamic URLs
Implementing static URLs for a website driven by PHP and MySQL is as easy as a little regex and htaccess magic.

HTML Forms
How to open a form in a new window when a radio button is clicked.

Regex URL Matching
On this site we check for the existence of a URL whenever an entry is updated. A Regex (regular expression) string was the breakthrough.

Parsing name-value pair attributes in an HTML tag
Regexp HTML Attribute Parsing: Pulling out the value of numerous attributes in an HTML tag is a mind bender

Server-Side gzip Compression
You can include code in a PHP program to compress an outbound web page. It works, but instructing the web server to do it to every page is much easier.

SQL To Exclude A List Of Items
How do you select everything from one table except for a list contained in another table?

HTML Anchor without an HREF?
How to execute a JavaScript function and nothing else.

Website Statistics
Philadelphia Reflections' popularity has grown quite dramatically over two and a half years.

PHP script to display Google PageRank
It is very handy to know the Google PageRank of your pages. Here's a PHP script that figures it out for you.

Create and send CSV files from PHP
The ability to create a CSV file from MySQL data in PHP, download it and have it open automatically in Excel is very handy.

PHP out of memory condition
PHP scripts sometimes run out of internal memory. Here's how to get around the problem.

Ternary Operator and the IIF function
Shorten the PHP if statement with the ternary operator and/or wrap it in a function

MySQL server has gone away
What to do when your MySQL connection is timing out for no apparently good reason, all of a sudden.

Images
The 1000+ images on Philadelphia Reflections are displayed at this location. It works better with some browsers than others.

Embed Flash as Valid XHTML
The embed tag doesn't validate with XHTML. Here's what to do.

Stylesheet for ATOM feed
We have equipped our ATOM and RSS feeds with XSL style sheets which include, among other features, sorting the entries in descending order by modified date. For the moment, this is unsupported in IE 7 and Firefox 3.0.8.

How to detect an iPhone and other mobile devices
iPhones are definitely the wave of the future and websites, blogs, etc. must adapt to retain their audience. Luckily, if a website was developed using CSS, it's a breeze.

mysql_insert_assoc
How to make MySQL insertions easier (and safe)

mysql_update_assoc
Easily (and quote_smart-ly) update a record in a MySQL database table

An iPhone web app
Philadelphia Reflections is now available on the iPhone as a web app

Valid XHTML YouTube embed code generator
This free tool will create a valid XHTML embed code for any YouTube video. The code YouTube shows on the embed field is not valid XHTML! However, you can simply use this simple tool to make it Valid XHTML 1.0 Transitional.

Escaping for PHP Output to JavaScript
To send data to a JavaScript script from PHP, three levels of escaping are required

Google Maps Icons
Google Maps/Earth do not make icon creation & manipulation easy.

QR Codes
QR Codes are becoming common in print ads to encode the sponsor's URL

ZNOTE: Website Development
.

Display Image
Create images with specific dimensions

Python URL Handling
An example of URL handling using Python 3.2.2 installed on Windows 7

Website Test Results
Response time test with lots of images: Philadelphia Reflections has excellent response time

Iterate through a Word document, modifying picture properties (Blog 2300)
Iterate through a Word document, modifying the Wrap Text property of every picture

SMTP Authorization and Handling Bounced Emails with PEAR Mail
Sign on to an SMTP server and get bounced emails returned to you

Quick Analysis of Financial-Industry
New blog 2017-07-24 19:45:12 description