Website Development
The website technology supporting Philadelphia Reflections is PHP, MySQL and DHTML. The web hosting service is Internet Planners.
The development of this website has provided an opportunity to learn new technology, to try out different techniques for getting noticed by the search engines, and to work through the trials and tribulations of dealing with malicious hackers and spammers, who range from the annoying to the abusive.
This collection of articles documents some of our experiences, and we hope that people surfing the web looking for solutions to problems we've encountered will benefit.
The primary purpose of this website is to deliver high quality content on the
subjects of Philadelphia, Philadelphia History, medicine, medical economics and
other subjects of interest to its author, Dr. George R. Fisher.
However, early in 2006 the site was attacked by spammers who broke in using
security holes in the previous implementation of PHP. In the subsequent
reconstruction of the site, there's been an opportunity to try out lots of
new technology and techniques, some of which are detailed here.
Today's Philadelphia Reflections was born in June 2006. It had a prior incarnation but it was hacked by Nigerian spammers who took it over and turned it into an email factory.
We scrubbed everything down and rebuilt from scratch, implementing as many PHP and MySQL security features as we could find.
We have done all of the standard things to improve our search engine standings but we are really at a loss to explain the inflection points that can be seen in the graphs.
We make an effort to produce clean XHTML 1.1 or HTML 4.01 depending on the user's browser, for both the tags and content.
We have keywords, description and other relevant meta tags for every page.
The content is rich, varied, relevant and frequently updated.
The website structure is simple with complete linkages between pages.
A current Google sitemap is maintained programmatically, along with robots.txt, RSS, ATOM and a few other more obscure syndication file types.
Both the sitemaps, etc. and the URLs themselves have been submitted to the search engines.
We use static URLs, which the Apache mod_rewrite function translates into dynamic ones.
These and other techniques we've picked up along the way are described in the topic "Website Development".
Our home page has a Google Page Rank of 5/10 and the pages vary as follows (as of December 2008):
Page Rank   Share of pages
No Info     36%
0           2%
1           28%
2           23%
3           9%
4           1%
5           1%
Google Images is by far the largest source of referrals but we also have many visitors who come to us via the search engines and who like what they see and come back; we would like to express our appreciation to all of our visitors.
The dips in the Unique Visitors graph were the result of problems with our ISP ... once they were simply off the air and twice they made software changes without notification or testing.
A current fad in web page styling is to use CSS exclusively to define the basic page sections.
The "old" way of doing this was to use tables, but that's no longer stylish. Instead, we are exhorted
to use CSS exclusively.
A very common page layout has a head and a foot with three columns sandwiched in between. Philadelphia Reflections
uses this layout.
Most descriptions of this layout style that I have found Googling around the Internet involve absolute
positioning which very often does not adapt well to differing screen sizes and browser window sizes.
What we use here makes use of floating columns, which re-size themselves very nicely.
Several anomalies and quirks should be noted:
Each element is defined as a DIV
The left, right and center DIVs must be enclosed in a "wrapper" DIV
The three columns must be followed by a clear:both DIV
The center column must be below the left and right columns
The center column actually is as wide as the whole page (try including border-style:solid)
These quirks and anomalies make me think that maybe this either isn't quite kosher or else may be
superseded by later CSS definitions. But for the time being, this works very happily and both the HTML and
the CSS validate perfectly well.
DHTML and CSS for the World Wide Web:
Visual QuickStart Guide
by Jason Cranford Teague
DHTML = HTML + CSS + DOM + JavaScript
This book will get you up and running quickly with client-side programming.
Then you need to learn server-side programming.
PHP is an open source server-side scripting language that is easy to learn
and very powerful. MySQL is the same ... open source relational database.
The textbook for these technologies is
PHP and MySQL for Dynamic Web Sites:
Visual QuickPro Guide
by Larry Ullman
Master those two books and you'll be creating very powerful scripts on both
the client and server side that produce dynamic and elegant results.
While you're in the process of doing this, you will constantly need to reference
manuals for syntax, functions, etc. There are many, but two will suffice for 90% of what you need:
The markup language used by web browsers continues to evolve.
The most current version (as of April 2009) is XHTML 1.1, an XML version
of HTML.
Many browsers, most particularly IE, do not support XHTML. Technically speaking,
they support only the "text/html" mime type, not "application/xhtml+xml". Lots of
web developers have gone to the trouble of sticking closing tags ( />) in their
BR, HR, META and INPUT tags and a DOCTYPE at the top but then serve the code as
"text/html".
This produces a syntactic mishmash which may be worse than using strict
HTML 4.01.
Why "worse"? Because of the possibility of unintended results
from providing incorrect instructions to the browser. If you care about the output produced
by the browser, which most developers and content providers emphatically do,
then you have to be careful about what instructions you give the browser. You simply cannot count
on getting what you want if what you're telling the browser to do is syntactically incorrect.
However, it's a little difficult to see just what good XHTML is:
There are rumors that it renders the non-image portion of a page as much as
50% faster than HTML, but what with gzip and broadband being pretty common these days,
it's hard to see that as an especially compelling reason to be bothered.
Furthermore, those browsers that do render XHTML (Mozilla, Firefox) are very picky
about syntax and blow up much too easily.
And the claim that XHTML is the way to get your web pages onto cell phones
and toaster ovens leaves me cold. It's just not believable that the format
required for these special devices will be the same as for a computer monitor.
(For the current status of handheld support on this site, see
How to detect an iPhone and other mobile devices.)
Internet cognoscenti speak disparagingly of "tag soup" but the Internet is a lot more about content
than it is about syntax, so who really cares?
Well, somehow, I do. A little. Since we use PHP on this site, we have the opportunity to figure out what features
are supported by a browser and render the correct types of tags, mime-types, etc.
Check out the HTTP headers and the page source to see the following script in action:
It renders XHTML 1.1 whenever it encounters a browser that can support it
It uses output buffering (which demonstrably if illogically improves rendering response time)
It sends the whole thing using gzip compression if the browser will support it
But also, it concedes certain issues based on experience for the sake of a smoothly-operating website
<?php
//
// This script figures out what kind of mime type (HTML vs XHTML) the browser supports and sends the correct headers
// It also initiates compression, specifies caching and sends other <meta http-equiv headers
//
// My thanks to http://www.workingwith.me.uk/articles/scripting/mimetypes for the basic idea and structure
//
// $_SERVER["HTTP_ACCEPT"] describes the mime_types a browser supports in a comma-separated list:
//
// mime_type,mime_type,mime_type
//
// If a browser prefers one mime_type or group of mime_types, it adds a q-value
//
// mime_type,mime_type;q=x.x, mime_type,mime_type,mime_type,...,mime_type;q=x.x
//
// The q-value is a number between 0.0 and 1.0 ... the higher the number, the greater the preference
// The idea is that if we can serve more than one mime_type we should serve the browser's higher preference
//
// ob_start("ob_gzhandler"); does all the work to compress the output if the browser can handle it
// ob_start("fix_code"); calls the "fix_code" function instead, so initiating gzip is my responsibility
//
// $_SERVER["HTTP_USER_AGENT"] is an opaque description of the browser itself
//
// $_SERVER['HTTP_ACCEPT_ENCODING'] describes compression capabilities
//
// I output these three variables as an HTML comment so I can debug things more easily
//
// Despite my desire to do things "right", you will see I accommodate myself to the reality of user-supplied content
// and browser peculiarities in order to have a working website
//
function fix_code($buffer)
{
#
# Called for HTML browsers to delete all the lovely close-brackets
# it's up to me to initiate the gzipping because ob_start is called by "fix_code" instead of "ob_gzhandler"
#
if (stristr($_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip'))
{
header("Content-Encoding: gzip"); // notifies the far-end to un-gzip
return (gzencode(str_replace(" />", ">", $buffer),6,FORCE_GZIP));
}
else
{
return (str_replace(" />", ">", $buffer));
}
}
#
# default values
#
$charset = "UTF-8"; # See http://en.wikipedia.org/wiki/UTF-8
$mime = "text/html"; # Plain vanilla
$cache_control = "max-age=200"; # Cache expires after 200 seconds
$xhtml_q = 0;
$html_q = 0;
# see http://www.w3.org/QA/2002/04/valid-dtd-list.html
$DOCTYPE_xhtml11 = "<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.1//EN' 'http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd'>\n";
$DOCTYPE_xhtml10 = "<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Strict//EN' 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'>\n";
$DOCTYPE_wap = "<!DOCTYPE html PUBLIC '-//WAPFORUM//DTD XHTML Mobile 1.2//EN' 'http://www.openmobilealliance.org/tech/DTD/xhtml-mobile12.dtd'>\n";
$DOCTYPE_html401 = "<!DOCTYPE html PUBLIC '-//W3C//DTD HTML 4.01//EN' 'http://www.w3.org/TR/html4/strict.dtd'>\n";
$DOCTYPE_html401l = "<!DOCTYPE html PUBLIC '-//W3C//DTD HTML 4.01 Transitional//EN' 'http://www.w3.org/TR/html4/loose.dtd'>\n";
$html_xhtml = "<html xmlns='http://www.w3.org/1999/xhtml' xml:lang='en'>\n\n";
$html_iphone = "<html xmlns='http://www.w3.org/1999/xhtml' xml:lang='en' manifest='iphone.manifest'>\n\n";
$html_html401 = "<html lang='en'>\n\n";
$html_html401_IE = "<html lang='en' xmlns:v='urn:schemas-microsoft-com:vml'>\n\n"; # xmlns:v='urn:schemas-microsoft-com:vml' is recommended by Google for maps display using IE
$html_plain = "<html>\n\n";
# parental control tag
$pics_Label = '(pics-1.1 "http://www.icra.org/pics/vocabularyv03/" l
gen true for "http://philadelphia-reflections.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 0)
gen true for "http://www.philadelphia-reflections.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 0)
gen true for "http://search.freefind.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 0)
gen true for "http://www.search.freefind.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 0)
gen true for "http://statcounter.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 0)
gen true for "http://www.statcounter.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 0)
gen true for "http://c3.statcounter.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 0)
gen true for "http://www.c3.statcounter.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 0))';
# I include the following HTML comment for my ongoing debugging purposes
$show_info = "<!-- \nHTTP_USER_AGENT $_SERVER[HTTP_USER_AGENT]\nHTTP_ACCEPT_ENCODING $_SERVER[HTTP_ACCEPT_ENCODING]\nHTTP_ACCEPT $_SERVER[HTTP_ACCEPT]\n -->\n\n";
# note that I eval $prolog_type below so that the xml header (if any) gets the right charset
$prolog_type = '$DOCTYPE_html401l $html_plain $show_info';
#
# the logic
#
# W3C Validator
if (stristr($_SERVER["HTTP_USER_AGENT"],"W3C_Validator"))
{
ob_start("ob_gzhandler");
$mime = "application/xhtml+xml";
# UTF-8 produces character-type errors
$charset = "iso-8859-1";
$prolog_type = '$xml_header $DOCTYPE_xhtml11 $html_xhtml $show_info';
}
else
{
# fancy wap-enabled handheld device
if(stristr($_SERVER["HTTP_ACCEPT"],"application/vnd.wap.xhtml+xml"))
{
ob_start("ob_gzhandler");
# per http://www.ready.mobi/ and http://www.w3.org/TR/mobileOK-basic10-tests/ application/xhtml+xml is preferred
// $mime = "application/vnd.wap.xhtml+xml";
$mime = "application/xhtml+xml";
$prolog_type = '$xml_header $DOCTYPE_wap $html_plain $show_info';
}
else
{
# non-wap xhtml-enabled browser
if(stristr($_SERVER["HTTP_ACCEPT"],"application/xhtml+xml"))
{
# retrieve the q values for "application/xhtml+xml" and "text/html"
if (preg_match('%application/xhtml\+xml[^;]*?;q=([01](?:\.[0-9]+)?)%i', $_SERVER["HTTP_ACCEPT"], $matches))
{
$xhtml_q = (float)$matches[1];
}
if (preg_match('%text/html[^;]*?;q=([01](?:\.[0-9]+)?)%i', $_SERVER["HTTP_ACCEPT"], $matches))
{
$html_q = (float)$matches[1];
}
# if the q value for HTML is greater than for XHTML
# then treat output as HTML 4.01 strict (Opera 9.64, for instance)
if($html_q > $xhtml_q)
{
ob_start("fix_code");
$mime = "text/html";
# UTF-8 produces character-type errors
$charset = "iso-8859-1";
$prolog_type = '$DOCTYPE_html401 $html_html401 $show_info';
}
# otherwise, go with XHTML
else
{
ob_start("ob_gzhandler");
# for the time-being application/xhtml+xml is too strict for us: unless your tags are PERFECT, it blows up
// $mime = "application/xhtml+xml";
$mime = "text/html";
# UTF-8 produces character-type errors
$charset = "iso-8859-1";
# see "Safari Web Content Guide for iPhone OS" for cache manifest description
if (stristr($_SERVER["HTTP_USER_AGENT"],"iPhone"))
{
$prolog_type = '$xml_header $DOCTYPE_xhtml11 $html_iphone $show_info';
}
else
{
$prolog_type = '$xml_header $DOCTYPE_xhtml11 $html_xhtml $show_info';
}
}
}
else
{
# plain text/html browser
if(stristr($_SERVER["HTTP_ACCEPT"],"text/html"))
{
ob_start("fix_code");
$mime = "text/html";
# UTF-8 produces character-type errors
$charset = "iso-8859-1";
$prolog_type = '$DOCTYPE_html401 $html_html401 $show_info';
}
else
{
# if the browser doesn't specify any X/HTML mime type, treat like HTML 4.01 Transitional (IE 7, for instance)
ob_start("fix_code");
$mime = "text/html";
# UTF-8 produces character-type errors
$charset = "iso-8859-1";
$prolog_type = '$DOCTYPE_html401l $html_plain $show_info';
# if IE then include Google's recommended "xmlns:v ..."
if(stristr($_SERVER["HTTP_USER_AGENT"],"MSIE"))
{
$prolog_type = '$DOCTYPE_html401l $html_html401_IE $show_info';
}
}
}
}
}
#
# output the mime type, prolog type and other <meta http-equiv= variables
#
header("Content-Type: $mime; charset=$charset");
header("Content-Language: en-us");
header("Vary: Accept");
header("Cache-Control: $cache_control");
header("Content-Script-Type: text/javascript");
header("Content-Style-Type: text/css");
header("imagetoolbar: no");
// parental controls from http://www.icra.org/
header("pics-Label: $pics_Label");
// privacy header created at http://www.p3pwiz.com/
header("P3P: policyref=\"http://www.philadelphia-reflections.com/w3c/p3p.xml\", CP=\"NID DSP NOI COR\"");
$xml_header = "<?xml version='1.0' encoding='$charset' ?>\n";
eval("\$prolog_type = \"$prolog_type\";");
print $prolog_type;
?>
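The q-value comparison at the heart of the script can be tried out in isolation. What follows is only a rough sketch, written in JavaScript so that it can be run in Node or a browser console without a web server; the function names parseAccept and prefersXhtml are made up for the illustration and are not part of this site's code. It also makes two assumptions that differ from the script above: a q-value is taken to belong only to the mime type it is attached to, and a mime type with no q-value is taken to have q=1, which is how the HTTP specification describes the header.

```javascript
// Parse an HTTP Accept header into a map of mime type -> q-value.
// A mime type without an explicit q parameter defaults to q=1.
function parseAccept(header) {
  const prefs = {};
  for (const part of header.split(",")) {
    const [type, ...params] = part.trim().split(";");
    if (!type) continue; // skip empty entries
    let q = 1;
    for (const p of params) {
      const m = /^\s*q=([01](?:\.[0-9]{0,3})?)\s*$/i.exec(p);
      if (m) q = parseFloat(m[1]);
    }
    prefs[type.trim().toLowerCase()] = q;
  }
  return prefs;
}

// True when the browser lists application/xhtml+xml and likes it at
// least as much as text/html (a tie goes to XHTML, as in the script).
function prefersXhtml(header) {
  const prefs = parseAccept(header);
  const xhtml = prefs["application/xhtml+xml"];
  if (xhtml === undefined || xhtml === 0) return false;
  const html = prefs["text/html"] ?? 0;
  return xhtml >= html;
}

// A Firefox-style header: XHTML and HTML both default to q=1.
console.log(prefersXhtml("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"));
// A header that explicitly ranks HTML above XHTML.
console.log(prefersXhtml("text/html,application/xhtml+xml;q=0.8"));
```

Whether to trust the browser's stated preference at all is a separate question; as the script shows, experience may argue for serving text/html even when XHTML is nominally preferred.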
There are two primary aspects of a website that need validation:
1. (X)HTML
You can use the W3C's QA Markup Validation Service.
The URL to test the main page of Philadelphia Reflections is
http://validator.w3.org/
Firefox has several useful add-ons for (X)HTML validation; one that uses Tidy is here: Html Validator
2. CSS
The W3C has a validation service for CSS, too.
For Philadelphia Reflections, the following URL checks all the CSS definitions in the main page: http://jigsaw.w3.org/css-validator/ (note: this validator is a little flakey: it produces different answers for the same file; you have to refresh a couple of times to get the whole story)
Firefox has several useful web developer add-on tools; try this one: Web Developer
Once you've gotten the HTML and CSS basics under control, there are other aspects of your site that you will want to validate:
There is an absolutely lovely program called HTML Tidy, originally written by Dave Raggett and described by the W3C here: http://www.w3.org/People/Raggett/tidy/
Calls to Tidy are available in some newer versions of PHP (sadly, not the one we are using). However, on the Windows versions (only) of Firefox and Mozilla, you can download an extension that will provide all the Tidy functions in your browser! ... https://addons.mozilla.org/firefox/249/. This is a fantastic feature that I use all the time.
Syndication XML Validation
Validating RSS and Atom files is greatly facilitated by http://feedvalidator.org/. It has a number of quirks, the worst of which is that it has a length limitation that we exceed and so we have to provide "short" syndication files since all the feed aggregators use this facility and reject any feeds that aren't validated by it.
Yahoo and Microsoft have agreed to support Google's Sitemap protocol and to support the inclusion of the line "Sitemap: http://www.philadelphia-reflections.com/sitemap.xml" in robots.txt. If other search engines adopt this facility it will make it much easier to get into the world's many search engines ... they'll pick up this line instead of us having to hunt them down.
When you start getting really fancy and want to include automatic gzip compression, you'll want to see it in action and you'll want to check out all of your HTTP headers: http://www.gidnetwork.com/tools/gzip-test.php
Here is a list of links (that open in their own pages) that show some of my favorite web designs. The CSS Zen Garden is a website that illustrates what can be done with clever CSS design. The HTML and the content are exactly the same in each of these links, only the CSS changes; but what a difference!
On this site we check for the existence of a URL whenever an entry is updated
There are two key technologies at work
A PHP function that checks whether a URL is valid (thanks to marufit at gmail dot com
in the PHP Manual)
Regex (regular expression) in a preg_replace_callback routine; this one is mine, all mine
function url_exists($url)
{
//
// checks whether a URL actually exists on the Internet
//
$handle = curl_init($url);
if (false === $handle)
{
return false;
}
curl_setopt($handle, CURLOPT_HEADER, false);
curl_setopt($handle, CURLOPT_FAILONERROR, true);
curl_setopt($handle, CURLOPT_NOBODY, true);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, false);
$connectable = curl_exec($handle);
curl_close($handle);
return $connectable;
}
function aExists($matches)
{
//
// function called by preg_replace_callback
//
// $matches[0] is the complete match
// $matches[1] the match for the first subpattern
// enclosed in '(...)' and so on
//
// checks to see if a regular link exists
// something similar is done for img src= also
//
$srcURL = $matches[3];
if (url_exists($srcURL)) { /* the link is live: do something */ return ""; }
else { /* the link is dead: do something else */ return ""; }
}
$foo = preg_replace_callback(
'/(.*?)(<a .*?href=")([^"]*)("[^>]*>)(.*?)(<\/a>)/i',
"aExists",
$source_string);
The regex used for IMG tags is a series of alternating sequences, one per attribute; so, add href="([^"]*)"| in front of alt="([^"]*)" to select an additional attribute. Its capture groups are:
$matches[0] is the complete match
$matches[1] is alt=
$matches[2] is class=
$matches[3] is style=
$matches[4] is src=
$matches[5] is height=
$matches[6] is width=
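To make the capture-group numbering of the link pattern in aExists more concrete, here is a minimal sketch of the same idea in JavaScript. It is an illustration under assumptions, not the code this site runs: the network check is replaced by a hypothetical fixed list of dead URLs (deadUrls), and the choice to keep the link text while dropping the tag for a dead link is mine, since the actions in aExists are left open above.

```javascript
// Same pattern as in the PHP call above, with the g flag so that every
// link in the string is visited (preg_replace_callback does that by default).
const linkPattern = /(.*?)(<a .*?href=")([^"]*)("[^>]*>)(.*?)(<\/a>)/gi;

// Stand-in for url_exists(): a real check needs a network request.
// Here a hypothetical list of known-dead URLs is consulted instead.
const deadUrls = new Set(["http://example.com/gone.html"]);
function urlExists(url) {
  return !deadUrls.has(url);
}

// The callback receives the full match followed by the six capture
// groups in order; the URL is the third group, as in $matches[3].
function checkLinks(html) {
  return html.replace(
    linkPattern,
    (match, before, openStart, url, openEnd, text, close) => {
      if (urlExists(url)) return match; // live link: leave untouched
      return before + text;             // dead link: keep the words, drop the tag
    }
  );
}

console.log(checkLinks(
  'See <a href="http://example.com/gone.html">this</a> and ' +
  '<a href="http://example.com/ok.html" class="x">that</a>.'
));
```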
Anyone who has used the expression *.doc to search for Word files has used a close relative of Regular Expressions ("regex") without realizing it. Regex arose from mathematical theory and is available in many programming languages; for finding patterns in large amounts of text it is often the only practical tool. And yet most people are completely unaware of it.
Philadelphia Reflections uses regex extensively for two primary purposes: (1) checking input from forms and (2) modifying HTML input during the creation of articles for the site.
The text PHP and MySQL by Larry Ullman has a very good introduction to regex in its chapter on security.
The great advantage of regex is that it can identify very complex patterns in a mass of text. The great disadvantage of regex is that it has developed in sort of an underground way and there exist numerous varieties that are essentially incompatible. PHP offers two families of regex functions: one for the POSIX Extended variety of regex (the ereg functions, deprecated as of PHP 5.3 and removed in PHP 7) and the other for the Perl-compatible version called PCRE (the preg functions). POSIX is less powerful but far easier to learn. JavaScript offers its own variety of regex which isn't quite the same as either of the two PHP versions.
References include the Ullman book; the PHP online manual, which has a number of handy tips on regex use in its two supported varieties; the O'Reilly book Mastering Regular Expressions; and Jan Goyvaerts' very helpful website (http://www.regular-expressions.info/) and book Regular Expressions: The Complete Tutorial.
My experience is that this area requires diligent hacking, which may be suboptimal but is unavoidable ... for this purpose, Jan Goyvaerts' Regex Buddy is indispensable; you simply must get this program if you hope to make anything of Regex.
Here are examples of checking for a valid email address in both Javascript and PHP:
Javascript
// check email
var namePattern = /^[a-zA-Z0-9][a-z0-9_.-]*@[a-z0-9.-]+\.[a-z]{2,4}$/i;
document.comment_form.email.value = trim(document.comment_form.email.value);
if (
( document.comment_form.email.value.length > 0)
&&
( document.comment_form.email.value != "[none]")
&&
(! document.comment_form.email.value.match(namePattern))
)
{
alert("Please enter a valid email address");
document.comment_form.email.focus();
document.comment_form.email.select();
problem = "yes";
return false;
}
PHP
// check email
// PCRE form of the pattern; the trailing i makes the match case-insensitive
$emailpattern = '/^[a-zA-Z0-9][a-z0-9_.-]*@[a-z0-9.-]+\.[a-z]{2,4}$/i';
$email = isset($_POST['email']) ? stripslashes(trim($_POST['email'])) : '';
if (
(strlen($email) > 0)
and
($email != "[none]")
and
(!preg_match($emailpattern, $email))
)
{
$inputerror = TRUE;
$inputerrormessage .= "<br />* An invalid email address was entered";
}
Incomprehensible? Yes, absolutely.
Useful? More than you can realize until you are actually faced with the problem of, say, verifying
that a user has input a valid email format, or trying to figure out whether a user-input IMG tag
is using the correct syntax; or else maybe trying to convert a huge web page from XHTML 1.1 to HTML 4.01
because you've determined that the browser is syntactically crippled.
And, once you get deep into it, the stuff is actually intriguing and fun.
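To make the email pattern above a little less incomprehensible, here is a small sketch that takes it apart piece by piece and runs it against a handful of sample addresses. The addresses are invented for the illustration.

```javascript
// The email pattern from the examples above, piece by piece:
//   ^[a-zA-Z0-9]     the first character must be a letter or digit
//   [a-z0-9_.-]*     then any run of letters, digits, "_", "." or "-"
//   @                a literal at-sign
//   [a-z0-9.-]+      the host name: letters, digits, "." or "-"
//   \.[a-z]{2,4}$    a dot and a 2 to 4 letter top-level domain, at the very end
// The trailing i makes the whole match case-insensitive.
const emailPattern = /^[a-zA-Z0-9][a-z0-9_.-]*@[a-z0-9.-]+\.[a-z]{2,4}$/i;

// Hypothetical sample inputs, chosen to exercise each part of the pattern.
const samples = [
  "reader@example.com",          // ordinary address
  "first.last@mail.example.org", // dots on both sides of the @
  "_leading@example.com",        // starts with "_"
  "no-at-sign.example.com",      // no @ at all
  "user@example",                // no top-level domain
  "user@example.museum",         // top-level domain longer than four letters
];

for (const s of samples) {
  console.log(s, emailPattern.test(s) ? "accepted" : "rejected");
}
```

The last sample is worth noting: the {2,4} limit turns away perfectly real addresses in longer top-level domains, so it may be worth loosening to {2,} if that matters for your visitors.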
When creating scripts that allow a user to edit HTML, you have to ensure that the browser doesn't
confuse the input with HTML to be rendered. I struggled with this long and hard and throughout the
utilities section of this website are various hacks that I created with brute force. They work, but
they are mostly ugly and all were time consuming.
Well, guess what? The PHP manual has a section on this subject and the solution is really rather
elegant. Chapter 56. PHP and HTML. It's worth
reading, but the essential bits are reproduced below:
Example 56-1. A hidden HTML form element
<?php
echo "<input type='hidden' value='" . htmlspecialchars($data, ENT_QUOTES) . "' />\n"; // ENT_QUOTES also escapes the single quotes that delimit the attribute
?>
Example 56-2. Data to be edited by the user
<?php
echo "<textarea name='mydata'>\n";
echo htmlspecialchars($data)."\n";
echo "</textarea>";
?>
Example 56-3. In a URL
<?php
echo "<a href='" . htmlspecialchars("/nextpage.php?stage=23&data=" .
urlencode($data)) . "'>\n";
?>
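The same precaution applies when HTML is assembled on the client side. The following is a minimal JavaScript sketch of the idea behind htmlspecialchars with ENT_QUOTES; the names escapeHtml and textareaFor are hypothetical helpers written for this illustration, not functions the browser or this site provides.

```javascript
// Escape the five characters that are significant in HTML markup.
// The ampersand must be replaced first, or the "&" in the other
// entities would itself be escaped a second time.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#039;");
}

// Build a textarea whose contents are shown literally rather than rendered.
function textareaFor(name, data) {
  return "<textarea name='" + escapeHtml(name) + "'>\n" +
         escapeHtml(data) + "\n</textarea>";
}

// Without escaping, this input would close the textarea early.
console.log(textareaFor("mydata", "</textarea><b>bold</b> & 'quoted'"));
```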
It used to be that no spiders or search engines could index a dynamic URL,
namely one that contained a "?" followed by parameters to be used by
PHP, ASP or other server-side scripting languages to drive a website
using a database.
Nowadays, Google and Yahoo seem to do a perfectly fine job of indexing
dynamic URLs but Google has a disclaimer warning that it may still
encounter problems with dynamic URLs and the SEO literature is still
full of warnings that other spiders and search engines may be blind to
everything to the right of the "?".
Furthermore, a *.php extension is an invitation to bad guys to try to
break in and wreak many sorts of havoc: this site was hacked by
Nigerians a few years ago using PHP tricks and they managed to use it as
an email factory until our ISP shut us down. I came on the scene at that
point and implemented every safeguard I could find, but the concern
still lingers.
Finally, dynamic URLs are not user friendly ... human beings generally do
not know what to make of long strings of obscure parameters.
Apache has a feature called "mod_rewrite" that allows you to specify, via
regex, that you want incoming URLs to be transformed in some way. Apache's instructions
on this subject are here: URL Rewriting Guide. I have
used that facility at Philadelphia Reflections to use static URLs for public use while still
allowing me to use parameters to drive the website with the database.
Options +FollowSymLinks
RewriteEngine on
RewriteRule ^(blog|topic)/([0-9]+)\.html?$ reflections.php?type=$1&key=$2
^(blog|topic)/([0-9]+)\.html?$ is the pattern to be compared against all incoming URLs.
If matched, it changes the URL to reflections.php?type=$1&key=$2
The ^ ... $ sequence in the pattern says that we will match the whole string, not just some part in the middle
(blog|topic)/ matches either "blog" or "topic" followed by "/"
([0-9]+) matches one or more digits
\.htm matches ".htm"
l? matches 0 or 1 lower-case Ls (so that we will match either htm or html)
In the replacement string $1 is replaced with the contents of the first () in the pattern: either "blog" or "topic"
... and $2 is replaced by the second () in the pattern, namely the numeric ID on the database of the blog or topic
The old dynamic form of the URL still works, in case there are any legacy bookmarks or links out
there, but going forward the new, simple, static URL is the face we will
present to the world.
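A quick way to convince yourself of what the RewriteRule pattern does and does not match is to try the same regex outside Apache. This is only a sketch in JavaScript, under the assumption that the path is presented without a leading slash, as it is to rules in an .htaccess file; the function name rewrite and the sample paths are invented for the illustration.

```javascript
// The pattern and substitution from the RewriteRule above.
const pattern = /^(blog|topic)\/([0-9]+)\.html?$/;
const substitution = "reflections.php?type=$1&key=$2";

// Return the rewritten URL, or the path unchanged when the rule does not match.
function rewrite(path) {
  return pattern.test(path) ? path.replace(pattern, substitution) : path;
}

console.log(rewrite("blog/1269.html"));    // matches: static URL becomes dynamic
console.log(rewrite("topic/42.htm"));      // matches: the "l" is optional
console.log(rewrite("blog/abc.html"));     // no match: the key must be digits
console.log(rewrite("blog/12.html/more")); // no match: $ anchors the end
```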
Step 2. SMOP
After the htaccess regex was debugged, all that was left was a simple
matter of programming. In fact, I had to completely rewrite the driver
script, reflections.php, and the XML creation script which creates the
RSS, sitemap, etc. files; plus a lot more besides. It was a lot of work
but the breakthrough was in figuring out the htaccess trick; everything
else was just work.
In July 2008, after Volumes were implemented, another RewriteRule was implemented:
Later on, you'll find out that the ampersand is a huge source of XHTML errors because it has to be written &amp; or &#38; or &#x26;
but you will struggle endlessly with how to get the darn thing to stay converted. First of all, content providers feel justifiably justified in including bare naked "&"s wherever they please; second of all, you will find that encoded ampersands get stripped back to their bare naked selves by browsers and other well-meaning sorts.
So, my undying thanks to Michael Ash's Regex Blog for providing the regex pattern in the following bit of PHP code:
I don't know how it can possibly work, and I may yet eat my words, but for the moment it seems to do the trick.
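For readers who want to experiment, here is a rough JavaScript sketch of the general technique: encode every ampersand that is not already the start of an entity. To be clear, the pattern below is my own illustration and is not Michael Ash's; the function name encodeBareAmpersands is hypothetical.

```javascript
// Encode every "&" that is not already the start of an entity such as
// &amp; (named), &#38; (decimal) or &#x26; (hexadecimal).
// The negative lookahead (?!...) is what skips the ones already encoded.
function encodeBareAmpersands(text) {
  return text.replace(
    /&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#x[0-9a-fA-F]+);)/g,
    "&amp;"
  );
}

const sample = "Q&A &amp; R&D &#38; x &copy; a & b";
console.log(encodeBareAmpersands(sample));

// Running it twice changes nothing, so already-converted text stays converted.
console.log(encodeBareAmpersands(encodeBareAmpersands(sample)));
```

One limitation to keep in mind: anything that merely looks like an entity, such as "&D;", is left alone even if no such entity exists.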
Ampersand Encoding In RSS
Another thing: & is the only ampersand encoding form acceptable to both RSS and Atom. So, look at the source of this page and you will find that I use this encoding in the title ... that's because the title goes into the Title field of my RSS and Atom feeds.
How do you (a) open a form when a radio button is clicked (b) in a new window?
Here's how it's done on this website.
The radio button is activated by a little JavaScript routine
The new window is simply a matter of including the target attribute in the form tag
<html>
<head>
<script type="text/javascript">
/* javascript function called by the radio buttons
to submit the form when clicked */
function formSubmit()
{
document.getElementById("form_x").submit()
}
</script>
</head>
<body>
<form name="form_x" id="form_x"
action="some_routine.php"
target="newIMGwin"
method="post"
style="whatever">
<fieldset>
<legend>legend surrounding the form</legend>
<input type="radio" name="key" value="1269" onclick="formSubmit()" />
</fieldset>
</form>
</body>
</html>
Geo Tagging refers to adding latitude and longitude information to websites and photographs. This has been around for a long time but it has taken the advent of Google Earth for it to really start to catch on.
This blog entry has geo meta tags that you can see if you look at the HTML source ("View > [Page] Source"). The input was as follows:
Address: 82 Devonshire St Boston MA
Lat: 42.3578 Lon: -71.0577
Descriptive Place Name: Fidelity Investments headquarters
Region: US-MA Country Code: US Country Name: United States
This creates meta tags in the HTML Header as follows:
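As a rough illustration of the kind of tags involved, the sketch below builds a set of geo meta tags from the input shown above. It is an assumption-laden example, not this site's generator: the function name geoMetaTags is invented, and the particular set of tags (geo.position, ICBM, geo.placename, geo.region) is simply the commonly used convention, which may differ from what this page actually emits.

```javascript
// Build commonly used geo meta tags from a place description.
// Text values are escaped so that they cannot break out of the attribute.
function geoMetaTags(place) {
  const esc = (s) =>
    String(s).replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");
  return [
    `<meta name="geo.position" content="${place.lat};${place.lon}" />`,
    `<meta name="ICBM" content="${place.lat}, ${place.lon}" />`,
    `<meta name="geo.placename" content="${esc(place.placename)}" />`,
    `<meta name="geo.region" content="${esc(place.region)}" />`,
  ].join("\n");
}

console.log(geoMetaTags({
  lat: 42.3578,
  lon: -71.0577,
  placename: "Fidelity Investments headquarters",
  region: "US-MA",
}));
```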
To be precise you can go to the location with a GPS device and record the exact coordinates.
(see below for a discussion of GPX, the GPS transfer protocol)
Google Earth is a very good way to find coordinates of any spot on earth
and both GE itself and the KML Editor will allow you to pick up coordinates from GE.
But for most lookups, the easiest method is to use the Firefox Sidebar AddOn called Minimap; it's right in your browser so you don't have to switch back and forth. (The AddOn download is found here: Mini Map Sidebar).
geo.placename and tgn.name are often rendered as the city name but are intended to describe the geographical feature ("Pyramids of Giza" or something). This tag is optional.
HTML geo meta tags can be validated here:
There is a search engine of long standing that reads HTML geo meta tags and indexes the website based upon its location; for searching, it groups sites based on their geographic proximity: GeoURL.
Photographs can also contain geo meta data, so-called EXIF data (Firefox has an EXIF viewer AddOn).
JPEG is the most common image format and the easiest to deal with. The combination of Picasa 2 and Google Earth allows you easily to add this information to your own photos.
The process of adding lat and lon to your photographs is this:
1. Select one or more photos in Picasa
2. Select Tools > GeoTag > GeoTag with Google Earth ...
3. This starts Google Earth and you can "fly" to the location of the picture
4. A small Picasa window will appear in Earth's lower-right corner displaying thumbnails of the pictures you selected; press the "Geotag" button.
5. When all of your pictures are tagged, press the "Done" button
Slowly, camera manufacturers are providing GPS capability. Some few have GPS devices built in and some others allow an external GPS device to be attached, although both Canon and Nikon are way behind the curve ... if you own either, you can essentially forget it: the best - lousy - solution is to carry around a GPS with you and synchronize the times ... ugly.
The Google Maps API allows maps to be embedded in a website as is done here. Google Maps API
The JavaScript required to embed the map on this page can also be seen in the HTML source ("View > [Page] Source"). In addition to JavaScript, you need a DIV with an ID of "map" or whatever is specified in the JavaScript document.getElementById entry, which specifies the height and width of the map to be displayed.
Google, Yahoo and Microsoft all now support GeoRSS as a feed to their map programs. My sense of it is that KML is a richer protocol, allowing more features, but fundamentally all these XML variants do mostly the same thing.
Google Sitemaps can include links to KML files (and ATOM, now, too). Part of the sitemap generation on this site is some code that picks up every *.kml and *.kmz file in the /kml/ folder and adds them to our sitemap.xml file.
KML ( Keyhole Markup Language, Keyhole being the predecessor to Google Earth) is an XML protocol that allows you to incorporate Google Earth into graphical presentations. Google KML Overview
The way we serve the KML in the link that connects to Google Earth from individual blogs uses the following PHP script as its base:
<?php
// See Google Earth's KML 2.1 Reference
// http://code.google.com/apis/kml/documentation/kml_tags_21.html
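// Cast the numeric inputs to float and escape the place name so that
// request data cannot inject markup into the generated XML
$lat = isset($_GET['lat']) ? (float) $_GET['lat'] : 0;
$lon = isset($_GET['lon']) ? (float) $_GET['lon'] : 0;
$placename = isset($_GET['placename']) ? htmlspecialchars($_GET['placename'], ENT_QUOTES, 'UTF-8') : '';
// Optional view parameters; fall back to defaults when missing or empty
$altitude = !empty($_GET['altitude']) ? (float) $_GET['altitude'] : 0;
$range = !empty($_GET['range']) ? (float) $_GET['range'] : 1000;
$heading = !empty($_GET['heading']) ? (float) $_GET['heading'] : 0;
$tilt = !empty($_GET['tilt']) ? (float) $_GET['tilt'] : 0;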
$description = "<h3><a href=\"http://www.philadelphia-reflections.com/\"><font color=\"#ea9f20\">
PHILADELPHIA REFLECTIONS</font></a></h3>
<p>The musings of a Philadelphia physician who has served the community for six decades.</p>";
header('Content-Type: application/vnd.google-earth.kml+xml');
header('Content-Disposition: inline; filename="philadelphia-reflections.kml"');
echo '<?xml version="1.0" encoding="UTF-8"?>';
?>
<kml xmlns="http://earth.google.com/kml/2.0">
<Placemark>
<name><?php echo $placename; ?></name>
<description>
<![CDATA[<?php echo $description; ?>]]>
</description>
<LookAt>
<longitude><?php echo $lon; ?></longitude>
<latitude><?php echo $lat; ?></latitude>
<altitude><?php echo $altitude; ?></altitude>
<range><?php echo $range; ?></range>
<tilt><?php echo $tilt; ?></tilt>
<heading><?php echo $heading; ?></heading>
<altitudeMode>relativeToGround</altitudeMode>
</LookAt>
<Point>
<coordinates><?php echo "$lon,$lat,$altitude"; ?></coordinates>
</Point>
</Placemark>
</kml>
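A script like the one above is driven entirely by its query string, so the link that calls it has to be URL-encoded properly. Here is a small Python sketch of building such a link; the script address in the usage note (kml.php on example.com) and the function name kml_link are hypothetical, while the parameter names are the ones the PHP script reads.

```python
from urllib.parse import urlencode

def kml_link(script_url, lat, lon, placename, **view):
    """Build the URL that asks the KML script for one placemark.

    Optional view settings (altitude, range, heading, tilt) are passed as
    keyword arguments and appended only when given.
    """
    params = {"lat": lat, "lon": lon, "placename": placename}
    params.update(view)
    # urlencode escapes spaces and punctuation in the place name
    return script_url + "?" + urlencode(params)
```

For example, kml_link("http://example.com/kml.php", 39.9489, -75.15, "Independence Hall", range=500) produces a link whose placename is sent as Independence+Hall.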
Of course, GPS devices are an integral part of this process of Geo Positioning. GPS devices are supposed to support the open-source protocol GPX,
which is an XML-based description of waypoints and routes. Wikipedia describes GPX here: GPS eXchange Format
Google Earth supports raw GPX (File > Open ...) and when you open a GPX file in Google Earth, it converts it to KML. If you want standalone programs to do this, or if you have non-standard GPS data, have a look at GPSBabel for conversion of native GPS formats, as well as GPS Utility and G7ToWin.
At Philadelphia Reflections, we are creating tours by carrying a GPS and a camera around on our travels. The GPS track becomes a path and waypoints become placemarks. When you come home, download the GPS data in GPX format and open up the GPX file in Google Earth. Use Google Earth to edit the placemark balloons, including pictures and text.
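Google Earth does the GPX-to-KML conversion for you, but it can help to see how little is involved. The following Python sketch turns GPX 1.1 waypoints into KML placemarks; it handles only the wpt elements (not tracks or routes), and the function name gpx_waypoints_to_kml is an assumption for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}

def gpx_waypoints_to_kml(gpx_text):
    """Convert the <wpt> elements of a GPX 1.1 document into KML placemarks."""
    root = ET.fromstring(gpx_text)
    placemarks = []
    for wpt in root.findall("gpx:wpt", GPX_NS):
        name = wpt.findtext("gpx:name", default="", namespaces=GPX_NS)
        # KML lists longitude first, then latitude, then altitude
        coords = wpt.get("lon") + "," + wpt.get("lat") + ",0"
        placemarks.append(
            "<Placemark><name>" + escape(name) + "</name>"
            "<Point><coordinates>" + coords + "</coordinates></Point></Placemark>"
        )
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
            + "".join(placemarks) + "</Document></kml>")
```

Pictures and balloon text would still be added afterwards in Google Earth, as described above.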
There are many, many sightseeing blogs around that take you to interesting places on Google Maps and Google Earth. A place to start looking is Sightseeing with Google Satellite Maps
Somehow, the concept of "mashup" is related to all of this but it sort of sounds like the term "multimedia" a few years ago ... fancy in concept but somewhat vague in reality.
Google has a Mashup Editor and Wikipedia has a definition but it's not clear what it all adds up to.
Sending a kml or kmz disk file is as easy as clicking on it. But different browsers react differently, some asking you which program to use, others storing the file on your desktop, etc. Preprocessing the file through PHP can reduce some of these annoyances.
<?php
//
// reads and sends a kml or kmz file
// located in /whatever/kml/
//
// calling protocol:
// this-program.php?file=somefile.kml
//
// read the input and check that it's a kmz or kml file
// ....................................................
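// basename() drops any directory part so the request cannot reach outside the kml folder
$kml_file = isset($_GET['file']) ? basename($_GET['file']) : "";
if ($kml_file == "") {exit ("error message");}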
if ((substr($kml_file, -4) != ".kmz") AND (substr($kml_file, -4) != ".kml"))
{
exit ("error message");
}
// prepend the file path information to the file name and check that it exists
// ...........................................................................
$kml_file_name = "/whatever/kml/" . $kml_file;
if (!file_exists($kml_file_name)) { die ("error message");}
// send out the HTTP header information followed by the file contents
// ..................................................................
header("Cache-Control: no-cache, no-store, must-revalidate"); // trying to keep from getting the files stored on the local computer
header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
if (substr($kml_file, -4) == ".kml") {header('Content-Type: application/vnd.google-earth.kml+xml');}
if (substr($kml_file, -4) == ".kmz") {header('Content-Type: application/vnd.google-earth.kmz');}
header("Content-Disposition: inline");
header("Content-Description: KML or KMZ data intended for Google Earth");
readfile ($kml_file_name);
?>
This site offers a Print button for all Reflections and Topics. Formatting
the text on the pages to print nicely works quite well; but how to specify what to do
with images remains a bit unclear (as of August 2006). Although 95% of users employ Internet Explorer because Microsoft supplies it free with new computers, IE is just about the worst browser to use for printing. Safari is much better, and Firefox is pretty good. Opera is also satisfactory, but Internet Explorer is not recommended. The other browsers are free; find them in Google and download them. For the usual user, that's all you have to know.
If you are curious about the technicalities, read on. The "trick", if it can be called that, to special print formatting is the media attribute
for CSS styling. The main stylesheet for this website is called in a LINK statement as follows:
The media attribute tells the browser to use this stylesheet for all media types, i.e., for screens and printers. In the
pages that are formatted to print is a stylesheet that cascades below the main stylesheet and therefore supersedes it.
This stylesheet controls the printing. IE seems to have its own views on font size so we use some conditional comments to
coax it to our way of thinking.
Here and there throughout the website are pages that contain onscreen navigation ("jump to top" and that sort of thing).
We hide them when printing by saying class="navstrip" which you can see will result in those elements being hidden.
The specification of
<body onload="window.print()">
(all lower case for XHTML purposes) is what forces the print dialog to appear.
The remaining problem is how to specify CSS formatting for images so that text flows around them as
we want. The formatting seems to work on screen for all browsers but only on some browsers for
printing.
The following link shows the results of a survey done to find out which font families are installed on Windows machines. This should help determine which fonts to use.
Sometimes I want to execute a JavaScript function when a user clicks a link, but nothing else.
If I omit the href entirely, the cursor doesn't change and some browsers don't recognize the text as a link:
<a onclick="function();">
If I include the pound sign, which seems to be a very popular trick, I get sent to the top of the current page, which messes up both History and the backspace button; to say nothing of the fact that I don't want to jump to the top of the page:
<a href="#" onclick="function();">
Including the function in the href and omitting the onclick seems to be the answer to my specific problem:
<a href="javascript:function();">
Fatal error: Allowed memory size
of 18388608 bytes exhausted
(tried to allocate 724 bytes) in
/home/dir1/dir2/script.php on line ###
When working with large amounts of data in memory (very long concatenated strings and/or very large arrays in my case), the server may hit a memory max. No amount of "unset" of variables will do the trick past a certain point.
This is the result of a memory allocation ceiling set in php.ini that can be overridden (judiciously) as follows; a value of -1 removes the limit entirely:
ini_set ("memory_limit", -1 );
Be sure to test your code with smaller amounts of data first: this limit is set for a reason ... programs have been known to be
poorly written (not yours, of course; but test anyway).
For reasons that make no sense to me, the JavaScript command document.write does not work when your page is rendered properly in XHTML (as described elsewhere in this Topic).
I have searched the web in vain to find a JavaScript solution. Many are offered but none work worth a damn.
So don't bother. Use PHP's echo function. It works perfectly.
With the rise of spam entries in web forms, a security feature called "captcha" has been developed.
CAPTCHA stands for "Completely Automated Public Turing test to tell Computers and Humans Apart". The idea is that only a human could read the letters contained in the image and then enter them in the form. "Accessibility", i.e., designing websites to accommodate people with handicaps, is obviously hindered by captcha; but at least given our experience with this website, spamming is a huge problem and the inability of handicapped people to leave comments is a price we are willing to pay to rid ourselves of spam. The W3C, the Godhead of web standards, does not agree with me and lectures at length on the futility of captcha: Inaccessibility of CAPTCHA. Whatever. I may get around to implementing some of their recommendations later, if we continue to be spammed.
Spammers have countered captcha in a number of ways. The first is OCR, which is why the images have fuzzy backgrounds and distorted letters: they are trying to defeat OCR programs. As OCR techniques have improved, captcha programs have moved from letters to "objects" such as kittens, boxes, etc., which are thought to be harder for computers to recognize; harder for people, too: cat vs kitten, for example. I was amazed to learn during my captcha research that there are spammers who offer micro-payments to people in India, etc. to enter hundreds of spam messages manually in captcha-ed websites that have defeated their automated spamming systems. Move, counter move; seemingly endlessly.
In this website captcha has been implemented using PHP: the comments form that appears at the end of every page has an image created using the PHP image-creation routines which has random characters in it. If the characters in the image are entered correctly in the form, the comments are entered into the database.
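The site's captcha is written in PHP and draws the characters into an image; that code is not reproduced here. As a rough illustration of the two non-graphical steps (choosing the random characters and checking what the visitor typed), here is a minimal Python sketch. The function names, the five-character length and the case-insensitive comparison are all assumptions, not a description of the production code.

```python
import hmac
import secrets
import string

def new_captcha_code(length=5):
    """Pick the random characters that would be drawn into the captcha image."""
    alphabet = string.ascii_uppercase + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))

def captcha_matches(expected, typed):
    """Compare the visitor's entry with the stored code, ignoring case and outer spaces."""
    return hmac.compare_digest(expected.upper().encode(),
                               typed.strip().upper().encode())
```

The code would be stored on the server (in the session, for example) when the form is shown, and the comment saved to the database only if captcha_matches returns True.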
I cribbed the PHP captcha code from http://www.white-hat-web-design.co.uk/articles/php-captcha.php and it worked right out of the box with the minor exception that the form HTML didn't quite pass XHTML muster; easily fixed. (I have subsequently discovered that PHP security and sessions don't play well together; this problem remains unresolved and I've had to turn off captcha processing for my secure pages.)
I implemented a number of other spam counter measures before I got around to captcha, which involved noticing what the spammers did and writing code to frustrate it. I am constantly on the lookout for new security techniques to implement.
The RSS and Atom validator (http://feedvalidator.org/) has a length restriction.
I don't know what it is, exactly, but it bombs if your file is "too long". Since most syndication readers run the validator before
they'll accept a feed, I have resorted to creating a short file, which is what I point to in my meta tags.
Here's how I provide change frequency and priority for our Google sitemap (in PHP ... $mod is the variable
containing the date last modified)
$GOOGLEpriority = "0.0"; $GOOGLEfreq = "yearly"; // default
if ($mod > mktime(0,0,0) - 86400*210) {$GOOGLEpriority = "0.1"; $GOOGLEfreq = "monthly";} // past 210 days
if ($mod > mktime(0,0,0) - 86400*180) {$GOOGLEpriority = "0.2"; $GOOGLEfreq = "monthly";} // past 180 days
if ($mod > mktime(0,0,0) - 86400*150) {$GOOGLEpriority = "0.3"; $GOOGLEfreq = "monthly";} // past 150 days
if ($mod > mktime(0,0,0) - 86400*120) {$GOOGLEpriority = "0.4"; $GOOGLEfreq = "monthly";} // past 120 days
if ($mod > mktime(0,0,0) - 86400*90) {$GOOGLEpriority = "0.5"; $GOOGLEfreq = "monthly";} // past 90 days
if ($mod > mktime(0,0,0) - 86400*60) {$GOOGLEpriority = "0.6"; $GOOGLEfreq = "monthly";} // past 60 days
if ($mod > mktime(0,0,0) - 86400*30) {$GOOGLEpriority = "0.7"; $GOOGLEfreq = "monthly";} // past 30 days
if ($mod > mktime(0,0,0) - 86400*7) {$GOOGLEpriority = "0.8"; $GOOGLEfreq = "weekly";} // past 7 days
if ($mod > mktime(0,0,0) - 86400) {$GOOGLEpriority = "0.9"; $GOOGLEfreq = "daily";} // yesterday
if (date("Y-m-d", $mod) == date("Y-m-d")) {$GOOGLEpriority = "1.0"; $GOOGLEfreq = "hourly";} // today
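The same age bands can be kept in a table, which makes them easier to adjust. This Python sketch is a simplification made for illustration: it measures elapsed time from "now" rather than from midnight as the PHP does, and the names BANDS and sitemap_hint are hypothetical.

```python
import time

DAY = 86400  # seconds in a day

# (age limit in days, priority, change frequency), newest band first
BANDS = [
    (1, "1.0", "hourly"), (2, "0.9", "daily"), (7, "0.8", "weekly"),
    (30, "0.7", "monthly"), (60, "0.6", "monthly"), (90, "0.5", "monthly"),
    (120, "0.4", "monthly"), (150, "0.3", "monthly"), (180, "0.2", "monthly"),
    (210, "0.1", "monthly"),
]

def sitemap_hint(mod, now=None):
    """Return (priority, changefreq) for a page last modified at Unix time mod."""
    now = time.time() if now is None else now
    age_days = (now - mod) / DAY
    for limit, priority, freq in BANDS:
        if age_days < limit:
            return priority, freq
    return "0.0", "yearly"  # default for anything older than 210 days
```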
IDIF is a stupid format: it includes the entire blog_contents, so the files are huge.
In the process of setting this up, I learned that flat files have a maximum size of
1.4 megs or so (the size of an old floppy disk), so I had to create more than one.
Which explains the stupid concept of a "pointer file"; instead of just giving Yahoo the
IDIF file itself, you give it a pointer file with URLs pointing to the multitude of
IDIF files. Really stupid.
News flash, after finding the Journal Of Ovid on the web, I learned about length
restrictions for the input fields (described below). This information was not contained
on the Yahoo web site describing their file format. It considerably reduced the file sizes
but I retained the structure of multiple files because who knows what I'll learn next?
IDIF title must be a maximum of 80 characters
IDIF description must be a maximum of 180 characters
IDIF body must be a maximum of 1000 characters
I'm only guessing about keywords
Thanks to the Journal Of Ovid on the web for this secret information
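Applying those limits is a matter of cutting each field before it is written to the file. A minimal Python sketch, assuming the fields arrive as a dictionary of strings (the names IDIF_LIMITS and clip_idif_fields are made up for this example, and keywords are left alone because their limit is only a guess):

```python
# maximum lengths, in characters, as listed above
IDIF_LIMITS = {"title": 80, "description": 180, "body": 1000}

def clip_idif_fields(fields):
    """Return a copy of fields with title, description and body cut to the limits."""
    clipped = dict(fields)
    for name, limit in IDIF_LIMITS.items():
        if name in clipped:
            clipped[name] = clipped[name][:limit]
    return clipped
```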
Choosing a web hosting service provider is difficult. There's no brand to rely on
and an Internet search turns up confusing claims and offers. We have used two
providers:
Internet Planners is the service currently running this web site. They have provided what they promised. They do not run the latest versions of MySQL or PHP which means certain newer features are not available but it probably improves their stability and what they offer is, in fact, provided. What was missing that we noticed included
and HTML Tidy, which we've just lived without, substituting our own Regular Expressions.
Network Solutions, on the other hand, has provided very poor service and has high fees. They run later versions of PHP and MySQL than Internet Planners, but their implementation is poor and the result is that critical functions are not available: user authentication for example (how can an Internet service provider not support user authentication?).
Based on our experience, Internet Planners is a reasonable choice for web hosting; Network Solutions is a bad choice.
Compression can reduce the size of the text (not images) of your web pages as they are transmitted outbound to the client. This will have only a
small impact on response time over modern fiber connections but it will significantly reduce your bandwidth consumption (70% on average on this
site.)
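To get a feel for the saving on your own pages before changing any server settings, you can compress a saved page locally. This is a small Python sketch, not the measurement method used for the figure above; the function name gzip_saving is an assumption.

```python
import gzip

def gzip_saving(text, encoding="utf-8"):
    """Return the fraction of bytes saved by gzip-compressing text.

    The result can be negative for very short input, where the gzip
    header outweighs the saving.
    """
    raw = text.encode(encoding)
    if not raw:
        return 0.0
    packed = gzip.compress(raw)
    return 1.0 - len(packed) / len(raw)
```

Repetitive markup compresses especially well, which is why HTML pages tend to shrink far more than already-compressed images would.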
In XHTML vs. HTML I show how I implemented
gzip compression on this site. The problem with that method is that it's a pain. So, on another website, I tried out the Apache .htaccess method to instruct the server to compress all outbound pages. Works like a charm.
# See http://httpd.apache.org/docs/2.0/mod/mod_deflate.html
# Insert filter
SetOutputFilter DEFLATE
# Netscape 4.x has some problems...
BrowserMatch ^Mozilla/4 gzip-only-text/html
# Netscape 4.06-4.08 have some more problems
BrowserMatch ^Mozilla/4\.0[678] no-gzip
# MSIE masquerades as Netscape, but it is fine
# BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
# NOTE: Due to a bug in mod_setenvif up to Apache 2.0.48
# the above regex won't work. You can use the following
# workaround to get the desired effect:
BrowserMatch \bMSI[E] !no-gzip !gzip-only-text/html
# Don't compress images
SetEnvIfNoCase Request_URI \
\.(?:gif|jpe?g|png)$ no-gzip dont-vary
# Make sure proxies don't deliver the wrong content
Header append Vary User-Agent env=!dont-vary
Here's how to create a button in your blogs or topics that
calls KML or KMZ files you create. In the Modify A Blog utility
you can include a KML or KMZ file, but to call it explicitly from
within the blog, you can create a button as shown here.
Create and save your KML file.
FTP the file to the kml folder in Philadelphia Reflections
Use the following code in a blog or topic to call the kml file
The standard PHP If statement can be reduced by the ternary operator, which is described in the PHP manual
Comparison Operators. The IIF function puts the ternary
operator into a function.
Ternary Operator
conditional ? if_true : if_not_true;
is the same as
if (conditional)
{
if_true
}
else
{
if_not_true
}
IIF Function
To return the result of the ternary operator
function iif($expression, $returntrue, $returnfalse = '') {
return ($expression ? $returntrue : $returnfalse);
}
The panel below shows every image (2000+) in every blog (800+) on Philadelphia Reflections starting with most recent additions.
It works better on some browsers especially Firefox than others, and -- with 2000 images -- it takes a while to load, as much as 10 minutes on a slow connection. An icon in the corner of the picture-wall starts a slideshow. Note: Mouse-clicking enlarges each thumbnail picture, displaying an icon linked to the website source page. We suggest you try out every little icon to see the amazing versatility of Cooliris.
The problem to be solved is that you want to embed YouTube (or other Flash movies) but the <embed> tag is deprecated in XHTML and the <param> tags don't validate, either. Here are the steps to clean things up:
Even though neither IE 7 nor Firefox 3.0.8 will render a stylesheet for an ATOM or RSS feed
delivered over the network, we have set them up in hopes that someday this anomaly will be cured.
No aggregator I have ever seen goes to the basic trouble of sorting their input feeds by
modified date, relying on the feed creator to "push" the latest onto the top of the stack. Our
XSL stylesheet does sort by modified date, among other nice things.
Here's the XSL stylesheet for our ATOM feed, including sorting the entries:
<?xml version="1.0" encoding="utf-8"?>
<!-- -->
<!-- Philadelphia Reflections -->
<!-- XSL Stylesheet for ATOM feed -->
<!-- -->
<!-- Grateful acknowledgement to http://24ways.org/2006/beautiful-xml-with-xsl -->
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<xsl:output method="html" encoding="utf-8"/>
<xsl:template match="/">
<html>
<head>
<title>ATOM Feed for Philadelphia Reflections</title>
<link rel="stylesheet" href="http://www.philadelphia-reflections.com/stylesheets/rssxsl.css" type="text/css"/>
<link rel="stylesheet" href="http://www.philadelphia-reflections.com/stylesheets/images.css" type="text/css"/>
<style type="text/css">
.notvisible {visibility: hidden;}
</style>
</head>
<body>
<xsl:apply-templates select="/atom:feed"/>
</body>
</html>
</xsl:template>
<xsl:template match="/atom:feed">
<div class="topbox">
<p><img src="http://www.philadelphia-reflections.com/images/Newsfeed-Atom-24x24.png" alt="ATOM feed icon" />
This is the <strong>ATOM-feed </strong>
for the <a href="http://www.philadelphia-reflections.com/"><xsl:value-of select="atom:title"/></a>
website.<br />
ATOM feeds allow you to stay up to date with the latest additions and changes
on <xsl:value-of select="atom:title"/>.</p>
</div>
<div class="contbox">
<table><tr>
<td>
<div class="mainbox">
<div class="itembox">
<h1><xsl:value-of select="atom:title"/></h1>
<p><xsl:value-of select="atom:subtitle"/></p>
<ul id="entries">
<xsl:apply-templates select="atom:entry">
<xsl:sort select="atom:updated" order="descending"/>
</xsl:apply-templates>
</ul>
</div>
</div>
</td>
<td valign="top" width="30%">
<div class="subscrbox">
<div class="padrhsbox">
<h2>Subscribe to this feed</h2>
<p>If you use one of the following web-based News Readers,
click on the appropriate button to subscribe to the RSS feed.</p>
<a href="#" onClick="window.location='http://add.my.yahoo.com/rss?url=' + window.location;return false;">
<img height="17" width="91" vspace="3" border="0" alt="my yahoo" src="http://www.philadelphia-reflections.com/images/addtomyyahoo4.gif"/>
</a><br/>
<a href="#" onClick="window.location='http://www.bloglines.com/sub/'+ window.location;return false;">
<img height="18" width="91" vspace="3" border="0" alt="bloglines" src="http://www.philadelphia-reflections.com/images/rss-bloglines.gif"/>
</a><br/>
<a href="#" onClick="window.location='http://www.newsgator.com/ngs/subscriber/subext.aspx?url='+ window.location;return false;">
<img height="17" width="91" vspace="3" border="0" alt="newsgator" src="http://www.philadelphia-reflections.com/images/rss-newsgator.gif"/>
</a><br/>
<a href="#" onClick="window.location='http://client.pluck.com/pluckit/prompt.aspx?GCID=C12286x053&amp;a=' + window.location + '&amp;t={title}';return false;">
<img src="http://www.philadelphia-reflections.com/images/pluspluck.png" vspace="3" border="0" alt="Subscribe with Pluck RSS reader"/>
</a><br/>
<a href="#" onClick="window.location='http://www.rojo.com/add-subscription?resource=' + window.location;return false;">
<img src="http://www.philadelphia-reflections.com/images/rss-rojo.gif" vspace="3" border="0" alt="Subscribe in Rojo"/>
</a><br/>
<a href="#" onClick="window.location='http://fusion.google.com/add?feedurl=' + window.location;return false;">
<img src="http://gmodules.com/ig/images/plus_google.gif" vspace="3" border="0" alt="Add to Google"/>
</a>
<hr />
<p>If you would like to receive an email whenever changes are made, please send me an email and I'll be glad to add you.
<br /><br /><a href="mailto:grfisheriii@gmail.com?subject=Add%20me%20to%20Philadelphia%20Reflections%20email%20list">Click to send an email</a></p>
</div>
</div>
</td>
</tr></table>
</div>
</xsl:template>
<xsl:template match="atom:entry">
<li style="margin-bottom: 25px; height: auto;">
<a href="{atom:link/@href}">
<xsl:value-of select="atom:title"/>
</a>
<small>
— <xsl:value-of select="substring-before(atom:updated,'T')"/>
</small>
<br/>
<div class="item_desc">
<xsl:value-of select="atom:summary" disable-output-escaping="yes"/> <!-- disable-output-escaping="yes" does not work with Firefox 3.0.8 -->
</div>
</li>
</xsl:template>
</xsl:stylesheet>
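The sort that xsl:sort performs in the stylesheet can be reproduced outside the browser, for instance to check a feed from a script. The following Python sketch reads an Atom feed and returns its entries newest first; the function name entries_newest_first is hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}

def entries_newest_first(feed_text):
    """Return (updated, title) pairs from an Atom feed, most recently updated first."""
    root = ET.fromstring(feed_text)
    pairs = [
        (entry.findtext("atom:updated", default="", namespaces=ATOM),
         entry.findtext("atom:title", default="", namespaces=ATOM))
        for entry in root.findall("atom:entry", ATOM)
    ]
    # Like xsl:sort, this compares the timestamp text; that orders correctly
    # when every entry uses the same timestamp layout and time zone.
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)
```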
MySQL timeout? Probably a new error after years of working perfectly, resulting from an ISP change
which they will neither acknowledge nor fix. Sound familiar?
Charming people.
Try this:
$db_link = @mysql_connect(DB_HOST, DB_USER, DB_PSWD,'',MYSQL_CLIENT_INTERACTIVE);
... instead of what you used to do:
$db_link = @mysql_connect(DB_HOST, DB_USER, DB_PSWD);
Handheld/mobile devices have been exploding in popularity and with the advent of the iPhone they have become the device of choice. The Blackberry
was a lovely device but once you try an iPhone you will never want a Blackberry again. Of course, all of this will change as each new device comes out
but what won't change is the fact that mobile devices are supplanting PCs for everything but the most keyboard- or large-screen-intensive work.
Therefore, the popularity of a website/blog/whatever depends on making it accessible to mobile devices. Step one is knowing when you've been
visited by such a thing.
iPhones
» Articles
The iPhone is very well-behaved with respect to CSS. Simply include the following meta tag in an otherwise-ordinary web page:
(Subsequent to writing this article we wrote an iPhone-specific article page. The stylesheet method described does work but we had other
reasons to make modifications to the content.)
» Index Page
The index page was a complete rewrite of the standard index page to turn it into a simple table of contents.
The breakthrough here was an exquisite PHP script found at
http://detectmobilebrowsers.mobi/
which analyzes the HTTP headers to determine if a device is a handheld and if so, what type. Using this, we redirect from the regular index page
to the iPhone-specific index page (from index.php to
indexiphone.php
... it looks much better on an iPhone than on a PC).
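The script mentioned above is far more thorough than anything that fits here, but the basic idea of header-based detection can be sketched in a few lines of Python. This is not that script: the marker list is a small, illustrative sample and the function name classify_device is an assumption.

```python
# substrings that suggest a handheld browser (illustrative, far from complete)
MOBILE_MARKERS = ("blackberry", "opera mini", "windows ce", "palm")

def classify_device(user_agent):
    """Return 'iphone', 'handheld' or 'desktop' for a User-Agent header value."""
    ua = (user_agent or "").lower()
    if "iphone" in ua or "ipod" in ua:
        return "iphone"
    if any(marker in ua for marker in MOBILE_MARKERS):
        return "handheld"
    return "desktop"
```

A page would then redirect to its iPhone-specific or stripped-down version depending on the value returned.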
Generic Handheld Devices
We may build more device-specific CSS files and pages as we learn what our visitors use but for the time being we simply support iPhones and Other.
Because many handheld devices have very small screens and a very small buffer capacity, we strip out all HTML comments, images and tables; we also convert to UTF-8 encoding
because of the claim that handhelds support it better than iso-8859-1:
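The site does this in PHP; as a rough illustration only, here is a Python sketch of the same stripping and re-encoding. It uses simple regular expressions, which will not cope with nested tables, and the function name strip_for_handheld is hypothetical.

```python
import re

def strip_for_handheld(raw_bytes):
    """Remove HTML comments, images and tables from an iso-8859-1 page
    and return the result encoded as UTF-8."""
    html = raw_bytes.decode("iso-8859-1")
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    html = re.sub(r"<img\b[^>]*>", "", html, flags=re.IGNORECASE)
    # non-greedy match: removes each table up to its first closing tag
    html = re.sub(r"<table\b.*?</table\s*>", "", html,
                  flags=re.DOTALL | re.IGNORECASE)
    return html.encode("utf-8")
```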
We also send handheld-specific HTTP headers ... see XHTML vs. HTML for the script we use for all page headers.
» Articles
It turns out in real life that many handhelds don't handle CSS correctly, so the { display: none; }
trick that works so well for iPhones is unreliable for many other devices. Furthermore, many devices choke if sent a large data stream. Therefore, we had to
write a very stripped-down page for each individual article. This required writing only one new program since all of our content is served from a database.
The place to start testing your new pages is the Opera Mini Simulator:
http://www.opera.com/mini/demo/.
Elegant in its simplicity, it has never choked or failed, unlike most other emulators. Plus, it is completely intuitive; unlike all other emulators. And free.
Once you get the basic design underway, you will want to validate ... at
http://www.ready.mobi/.
Third, modify the run.bat file located in
C:\Program Files\Research In Motion\BlackBerry Email and MDS Services Simulators 4.1.2\MDS
to include on the second line
set JAVA_HOME=C:\Program Files\Java\jre1.6.0_07
(the locations may/will be different on your machine)
Only a sadistic socially-crippled geek-savant could have dreamed up such a convoluted mess, but ultimately it does work and does allow you to see how the different models operate. Actually, once you get it working, it's sort of fun to try different models.
The iPhone is the best PDA to come along since the Blackberry 15 years ago. It is to the Blackberry what the Blackberry was to cell phones.
Philadelphia Reflections is now a fully-fledged iPhone web app. The application will appear on your iPhone in the appropriate format automatically: just navigate to http://www.philadelphia-reflections.com with the iPhone browser; we will detect it and do the right thing.
A two-step process is required to get a little icon on your iPhone home page so you can go there directly:
Click the "+" plus sign at the bottom of the iPhone screen
Click the "Add to Home Screen" button that appears.
// Here's how to create a new icon with the defaults
var newIcon = new GIcon(G_DEFAULT_ICON);
// To create a new icon like the default but yellow:
var yellowIcon = new GIcon(G_DEFAULT_ICON, "http://www.philadelphia-reflections.com/images/googlemapsmarkeryellow.png");
// To make use of that yellow icon and give it a tooltip
var point = new GLatLng(40.39, -75.34);
var marker = new GMarker(point, {icon:yellowIcon, title:"View Above Philadelphia"});
map.addOverlay(marker);
// To open a balloon when clicked
var message = " ... fill with text and HTML ... I've found tables are very helpful ";
GEvent.addListener(marker, 'click', function() {marker.openInfoWindowHtml(message);});
// Change icon on mouseover (see http://www.cems.uwe.ac.uk/~cjwallac/apps/phpxml/showIcons.php)
var msoverIcon = new GIcon(G_DEFAULT_ICON, "http://maps.google.com/mapfiles/kml/pal2/icon1.png");
GEvent.addListener(marker, 'mouseover', function() { marker.setImage(msoverIcon.image); });
In case you're wondering "How the heck does Python handle headers and data under 3.2.2?", here's an example that works using IDLE and Python 3.2.2 installed on a 64-bit Windows 7 machine.
import re
import urllib.request
url = "http://www.philadelphia-reflections.com"
uf = urllib.request.urlopen(url)
# header information
print('--- headers ---')
info = uf.info() # headers
#headers = info._headers # a list of all the headers
print('charsets:',info.get_charsets())
print('content_charset:',info.get_content_charset())
print('content_type:',info.get_content_type())
print('content_maintype:',info.get_content_maintype())
print('content_subtype:',info.get_content_subtype())
print('default_type:',info.get_default_type())
print('filename:',info.get_filename())
print('params:',info.get_params())
print('payload:',info.get_payload())
print()
print('--- data ---')
charset = info.get_content_charset() or 'utf-8' # fall back to UTF-8 when the server names no charset
data = uf.read().decode(charset) # content
print(data[:500])
print()
print('--- image ---')
imageurl = url + "/images/001.JPG"
image = urllib.request.urlretrieve(imageurl, 'python_001.jpg')
print(image)
QR Codes are similar to bar codes in that they are read optically.
They are most common in Japan, where all cell phones can read them; all fancy phones in America
(iPhone, etc.) have downloadable apps that can read QR Codes (semacode is a free QR Code app for
the iPhone but there are many for all).
One application that is becoming common is encoding a website's URL and including the image in a print advertisement.
The site www.webpagetest.org is an excellent facility for testing the performance of a web site. Philadelphia Reflections passes with flying colors:
A First Byte Time ... the PHP program is efficient
A Keep-alive Enabled ... the TCP connection is kept open throughout the entire process
A Compress Text ... the text is gzip compressed to reduce the amount of data sent
A Compress Images ... we reduce the size of our images and use JPEG compression
A Cache Static Content ... this can be seen in the Repeat View: none of the images are sent on subsequent page loads
X CDN Detected ... large commercial sites have copies scattered around the country and the world to balance the load and to reduce latency but it would not pay for a small site like this one to go to this effort and expense
(Blog 2300) We have a facility on this website to download books of many chapters (made up of volumes of topics on the site) to Microsoft Word for subsequent editing and eventual publishing. In many cases we download lots of pictures (via an img src= tag). I have not found a way to set the way text flows around the images in Word using HTML or CSS, so I built a Word macro to do it. This should allow you to change the size of images, as well as move them around. Moving the captions requires the use of the captions feature in Word's image menu (right-click).
------------------------------------
Instructions for use of a Macro named Sub ImageFlow():
Open Word
In Word, enter File>Open
enter the URL of the file you want to modify into the File Entry box and press the Enter key to load the document. It may take a minute or two, but a working screen should appear, loaded with the file in a condition ready to move the pictures around.
Press Alt + F11 which will open the VBA screen
Copy the macro found on this page from
Sub ImageFlow()
to
End Sub
In the right-hand panel of the VBA screen press Ctrl+A, Ctrl+V to paste it in
In the VBA screen press F5 to run the macro
If you want to do a lot of these manipulations, save the macro in the Macro Library of Windows Word.
------------------------------------
Sub ImageFlow()
'
' this Macro goes through an entire Word document and
' changes the way text flows around each picture
' ("Tight" in this example but see below for choices)
'
Dim shpIn As InlineShape, shp As Shape
For Each shpIn In ActiveDocument.InlineShapes
If (shpIn.Type = wdInlineShapeLinkedPicture) Then
Set shp = shpIn.ConvertToShape
shp.WrapFormat.Type = wdWrapTight
End If
Next shpIn
For Each shp In ActiveDocument.Shapes
shp.WrapFormat.Type = wdWrapTight
Next shp
End Sub
----------------------------------------
Change wdWrapTight to any of the following:
wdWrapBehind
wdWrapFront
wdWrapInline
wdWrapNone
wdWrapSquare
wdWrapThrough
wdWrapTight
wdWrapTopBottom
Recently our ISP started requiring user sign on in order to send emails. PHP's mail function stopped working as a result.
Naturally, the ISP did not notify us of this change so we were quite surprised when many thousands of emails on our newsletter list were rejected (every one of them, in fact).
What error message was returned to us to notify us of what the problem was? Why this helpful note:
Mail sent by user nobody being discarded due to sender restrictions in WHM->Tweak Settings
Doesn't that just say it all?
I'm being snide, but our ISP is really quite good about keeping its software up to date and, aside from an occasional surprise like this, they are very reliable. Being up to date includes the automatic incorporation of the PEAR Mail facility, which we are now using.
PEAR's Mail system works quite well but two problems were very vexing until we stumbled our way to a solution:
How, exactly, do we sign on to the SMTP server?
How do we ensure that bounced emails (the bane of all email lists) get returned to us?
You might not think that the first question would be so hard but it actually took a good deal of trial and error to get it right. As for the second question, there is an awful lot of wrong information available out in Internet land (including but not limited to VERP and XVERP which I advise you to avoid).
With PEAR Mail you first set up a "factory" and then send emails, either singly or in a loop. We keep the user id, password, etc. in a file "above" the web server in hopes that will keep them a secret ... here's the code (it actually is in production and it does in fact work):
<?php
include('Mail.php');
# the email constants are contained in a file outside the web server
include("/level1/level2/level3/constants.php");
$headers = array (
    'From' => '"name"<addr@domain.com>',
    'Sender' => '"name"<addr@domain.com>',
    'Reply-To' => '"name"<addr@domain.com>',
    'Return-Path' => 'addr@domain.com',
    'Content-type' => 'text/html; charset=iso-8859-1',
    'X-Mailer' => 'PHP/' . phpversion(),
    'Date' => date("D, j M Y H:i:s O",time()),
    'Content-Language' => 'en-us',
    'MIME-Version' => '1.0'
);
// call the PEAR mail "factory"
$smtp = Mail::factory('smtp',
    array (
        'host' => EMAIL_HOST,
        'port' => EMAIL_PORT,
        'auth' => true,
        'username' => EMAIL_USERNAME,
        'password' => EMAIL_PASSWORD,
        'persist' => true,
        'debug' => false
    ), '-f addr@domain.com'
);
# to send emails:
#
# $headers['To'] = $to; # provide the "$to" variable, something like $to = '"name"<addr@domain.com>';
# # note that the first parameter of $smtp->send can be "decorated" this way or just a naked email address
# $headers['Subject'] = $subject; # provide the "$subject" variable
# $mail = $smtp->send($to, $headers, $contents_of_the_email);
# -------- ................................> except for 'To' and 'Subject',
# $headers is provided by this module but can be over-ridden
# if (PEAR::isError($mail))
# {
# echo "<p style='color:red;'>The email failed; debug information follows:<br />";
# echo $mail->getDebugInfo() . "<br />";
# echo $mail->getMessage() . "</p>";
# }
# else
# {
# echo "<p>email successfully sent</p>";
# }
?>
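For readers who are not on PHP, the same two answers (sign on to the SMTP server, and give it a bounce address) can be sketched with Python's standard library instead of PEAR. This is only an illustration under assumptions: the host, port, credentials, addresses and the names build_message and send_all are placeholders made up for this example, and it assumes the server accepts STARTTLS followed by a user/password login. The point it shows is that bounces go to the envelope sender given to the server at send time, not to whatever appears in the visible From header.

```python
import smtplib
from email.message import EmailMessage
from email.utils import formatdate

# Hypothetical bounce mailbox; bounces are delivered to the envelope sender.
BOUNCE_ADDRESS = "bounces@example.com"


def build_message(to_addr, subject, html_body,
                  from_addr='"name" <addr@example.com>'):
    """Build one HTML email; MIME-Version is added automatically."""
    msg = EmailMessage()
    msg["From"] = from_addr
    msg["To"] = to_addr
    msg["Subject"] = subject
    msg["Date"] = formatdate(localtime=True)
    msg.set_content(html_body, subtype="html", charset="iso-8859-1")
    return msg


def send_all(messages, host, port, username, password):
    """Sign on once, then send every message over the same connection."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()                  # assumption: the server offers STARTTLS
        smtp.login(username, password)   # the "user sign on" the ISP now requires
        for msg in messages:
            # from_addr here is the envelope sender, i.e. where bounces return
            smtp.send_message(msg, from_addr=BOUNCE_ADDRESS)
```

Calling send_all([build_message(...)], host, port, user, password) would send a single message; a newsletter run would pass the whole list so the login happens only once.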
Databricks intends to create a Finance Vertical position to support the Sales and SA teams when working with financial-industry organizations. This article attempts to describe the structure of the worldwide financial industry, who the major players are and what their needs might be in the context of Apache Spark and Databricks offerings.
Contents
1 Executive Summary
2 Introduction
3 Risk Mitigation
4 Opportunity Discovery
5 Finance-Industry Sponsored Kaggle Contests
6 Spark and Finance on YouTube
7 APPENDIX
1 Executive Summary
The opportunities in the finance sector lie on a wide spectrum: at one
end are the quant funds for whom large-scale analytics are the entire
business, at the other, are traditional depositories for many of whom a
daily batch cycle and a quarterly book closing have long sufficed.
Quite often both extremes exist in the same company.
For this entire spectrum the easy-to-use, streaming, multi-source, big-data analytics offered by Databricks can offer advantages, perhaps with quick adoption by the quants and slower adoption by the others. Early adoption may involve a lot of discovery, but a growing collection of proven use cases will ease later sales. (See footnote 1.)
• streaming will supplant batch
• predictive analytics will replace BI
• easy multi-sourcing can unite stove pipes
• pooling can dramatically reduce operational complexity and cost
In addition, in the larger companies, the pressure to comply with data-related regulations company-wide has become almost overwhelming, and nearly all are struggling with multitudes of incompatible systems that Spark might unite.
2 Introduction
The finance industry is vast, far too large and diverse to permit a comprehensive enumeration of all the functions performed or of the firms that perform them. The Economist Intelligence Unit [14] might be a good source to begin with for such a survey.
The Appendix of this report contains lists of the major financial organizations grouped by function starting on page 15.
The questions of interest to Databricks are (1) which finance firms are
most likely to benefit from the manipulation and analysis of large
datasets and (2) what are the types of manipulation and analysis of
interest?
The two main concerns for the finance industry are:
• Risk Mitigation
• Opportunity Discovery
Footnote 1: I wonder whether the entirely cloud-based solution offered by Databricks leaves a lot on the table, given the pervasiveness of proprietary datacenters in this world. IBM mainframes, at that.
3 Risk Mitigation
[7]
Simply put, risk mitigation means don’t lose money, don’t go out of business and don’t go to jail.
Risk Categories
1. Business Risk: Risks undertaken by the business itself to maximize shareholder value and profits. For example: the cost to launch a new product. Risk mitigation takes the form of competent management controls.
2. Exogenous Risk: Political upheaval, natural disaster, economic disruption. Insurance is the most common risk mitigation tool in these cases.
3. Financial Risk: Financial risk arises from volatility in equities, derivatives, currencies, interest rates etc. In the case of financial firms these risks are also Business Risks since finance is the business.
• Market Risk: Changes in prices, their magnitude, direction and volatility.
• Credit Risk: The effect of counter-party default or the repercussions of providing services to bad actors.
• Liquidity Risk: The inability to make timely payment. Margin calls often precipitate this when illiquid securities cannot be sold or collateralized.
• Operational Risk: Failures of judgment, integrity, controls, procedures or technology.
• Cyber Security: An aspect of Operational Risk that gains clarity at senior levels with every report of the losses incurred and chaos engendered by widespread sophisticated hacking.
Financial-firm financial-risk mitigation is a field of study unto itself. For example, there is a rigorous, multi-part Financial Risk Manager (FRM) Certification [5] created by the Global Association of Risk Professionals (GARP).
4. Regulatory Compliance: While perhaps not a risk per se, this is a huge concern to financial firms, particularly since the Financial Crisis of a decade ago and the rules promulgated in response.
For example, one of the main tenets of BCBS 239 [15] is that all ‘material risk data’ must be automatically aggregated and analyzed across the entire banking group on a near-real-time basis, even while facing severe economic stresses. Multitudes of incompatible systems are a huge barrier.
[11]
4 Opportunity Discovery
If Risk Mitigation is Operations, Opportunity Discovery is Research & Development.
A non-exhaustive list:
• Fundamental Analysis
The study of the financial characteristics of individual firms, seeking undiscovered value. Warren Buffett is the world’s most famous fundamental analyst.
• Macro
The study of economy-wide signals. George Soros’ famous short of the UK Pound is an example [12]. The ‘Big Short’ of 2007-2008 is another [22].
• Relative
The study of relative movements of securities. Long/Short hedge funds are an example.
• Technical Analysis
The study of trendlines.
• Quantitative Analysis
The intersection of big data and machine learning. Jim Simons’ Renaissance Technologies [16] is the most successful example I know of but there are many others; some are listed in the appendix beginning on page 17. Some Kaggle contests focused on this; see Section 5.
• Product Development
Swaps are an example of building a product to meet very specific customer needs. Even more sophisticated products are possible with analytical support using all available data.
• Customer Enhancement
Using machine learning to reduce customer churn; using predictive analytics for product-customer targeting; consistent customer support across multiple access channels; etc. In short, using Amazonian techniques in a banking environment to take on the characteristics of the fintechs.
• Cost Control
Route optimization for filling ATMs; redundant process identification; risk reduction not just as a regulatory requirement, but as a cost saver and a profit enhancer.
• Risk System Integration
The regulators are forcing the larger firms to create “living wills”, which has resulted in a much better understanding of the numerous piece parts. The Basel risk data requirements are now forcing a near-real-time integration of numerous disparate systems. This seems like fertile ground for innovation, both for compliance and to build upon the results.
5 Finance-Industry Sponsored Kaggle Contests
Over the past several years a number of financial firms have sponsored Kaggle contests. Someone at each of these firms thought the subject was worth paying for crowd-sourced analysis and was willing to go to the considerable trouble of setting up and monitoring a contest with thousands of participants lasting three months or more.
Two Sigma is a quant fund, listed in the appendix on pages 17 and 23.
The challenge was to predict daily price changes. (In this contest I
earned a Kaggle Silver Medal for coming in 37th out of 2,070
contestants. [9])
Opportunity Discovery
Improve credit risk models by predicting the probability of default on
consumer credit.
Risk Mitigation
Improve the quality of information within transaction data.
Risk Mitigation
Predict which customers will leave an insurance company in the next 12
months.
Risk Mitigation
Given a dataset of 2D dashboard camera images, State Farm is challenging Kagglers to classify each driver’s behavior. Are they driving attentively, wearing their seatbelt, or taking a selfie with their friends in the backseat?
Risk Mitigation
Santander (Spain-based bank) is challenging Kagglers to predict which
products their existing customers will use in the next month based on
their past behavior and that of similar customers.
Opportunity Discovery
Santander Bank is asking Kagglers to help them identify dissatisfied customers early in their relationship.
Risk Mitigation, Opportunity Discovery
Using terabytes of noisy, non-stationary data Winton Capital is looking
for data scientists who excel at finding the hidden signal in the
proverbial haystack, and who are excited by creating novel statistical
modeling and data mining techniques.
Opportunity Discovery
Using a customer's shopping history, can you predict what insurance policy they will end up choosing?
Opportunity Discovery
Claims management may require different levels of checks before a claim can be approved and payment can be made. With the new practices and behaviors generated by the digital economy, this process needs to adapt, with the help of data science, to meet the new needs and expectations of customers. Kagglers are challenged to predict the category of a claim based on features available early in the process.
Risk Mitigation, Opportunity Discovery
The life insurance application process is antiquated. Customers provide
extensive information to identify risk classification and
eligibility, including scheduling medical exams, a process that takes
an average of 30 days.
The result? People are turned off. That's why only 40% of U.S.
households own individual life insurance. Prudential wants to make it quicker and less labor intensive for new and existing customers to get a quote while maintaining privacy boundaries.
Opportunity Discovery
Predict a transformed count of hazards or pre-existing damages using a
dataset of property information. This will enable Liberty Mutual to more accurately identify high-risk homes that require an additional examination to confirm their insurability.
Risk Mitigation
Fire losses account for a significant portion of total property losses.
High severity and low frequency, fire losses are inherently volatile,
which makes modeling them difficult. In this challenge, your task is to
predict the transformed ratio of loss to the total insured value. This will
enable more accurate identification of each policyholder's risk exposure
and the ability to tailor the insurance coverage for
their specific operation.
Risk Mitigation
The Benchmark Bond Trade Price Challenge is a competition to predict the next price that a US corporate bond might trade at.
Opportunity Discovery
Determine whether a loan will default and the loss incurred. We are
building a bridge between traditional banking, where we are looking at
reducing the consumption of economic capital, to an asset-management
perspective, where we optimize on the risk to the financial investor.
Risk Mitigation
Develop models to predict the stock market’s short-term response following large trades. Contestants are asked to derive empirical models to predict the behavior of bid and ask prices following such “liquidity shocks”.
Modeling market resiliency will improve trading strategy evaluation methods by increasing the realism of backtesting simulations, which currently assume zero market resiliency.
Risk Mitigation, Opportunity Discovery
Bodily Injury Liability Insurance covers other people's bodily injury or death for which the insured is responsible. The goal of this competition is to predict Bodily Injury Liability Insurance claim payments based on the characteristics of the insured's vehicle.
Risk Mitigation
Allstate is currently developing automated methods of predicting the cost, and hence severity, of claims. Kagglers are invited to create an algorithm which accurately predicts claims severity.
Risk Mitigation
6 Spark and Finance on YouTube
• Apache Spark on IBM z Systems Demo for Finance
References to IMS, CICS, and VSAM make me think this is Spark on an IBM
mainframe. Considering the fact that IBM mainframes are still quite widely used, this might be worth understanding.
Opportunity Discovery, Risk Mitigation
• Using Spark to Analyze Activity and Performance in High Speed
Corvil: Irish data monitoring and analytics for financial data using
Spark. Non-intrusive low-latency electronic trading monitoring,
regulatory compliance through the use of streaming telemetry.
BlackRock analysis of mortgage data. Using Spark, Scala, and D3 to visualize a large loan-level mortgage dataset and extract distributions and cluster boundaries. Also uses K-Means to reveal similar borrower groups and the corresponding discriminant attributes.
MapR presentation: high-velocity stream processing post-Hadoop at NYSE; 20 megabytes per second; time windowing. One Kafka topic per stock, parallelized.
Risk Mitigation, Opportunity Discovery
• An Example Application for Processing Stock Market Trade
Prior to the financial collapse of 2007-2008, mortgage securitization was the hot thing. Many institutions and individuals got burned and a residual fear of securitization remains.
The result is that for jumbo and subprime mortgages, the originators are now holding many more of the loans. This reduces the systematic
risk but an unanticipated consequence is that Fannie Mae and Freddie Mac [3] are
now holding 50% of $11 trillion outstanding in the middle market.
Therefore the US government has undertaken a huge amount of default and
interest-rate risk.
Insurance Companies by Premium Income
[8]
Property/Casualty Insurance
State Farm Mutual Automobile Insurance | 62,189,311
Berkshire Hathaway Inc. | 33,300,439
Liberty Mutual | 32,217,215
Allstate Corp. | 30,875,771
Progressive Corp. | 23,951,690
Travelers Companies Inc. | 23,918,048
Chubb Ltd. | 20,786,847
Nationwide Mutual Group | 19,756,093
Farmers Insurance Group of Companies | 19,677,601
USAA Insurance Group | 18,273,675
Life Insurance/Annuities
MetLife Inc. | 95,110,802
Prudential Financial Inc. | 45,902,327
New York Life Insurance Group | 30,922,462
Principal Financial Group Inc. | 28,186,098
Massachusetts Mutual Life Insurance Co. | 23,458,883
American International Group | 22,463,202
Jackson National Life Group | 22,132,278
AXA | 21,920,627
AEGON | 21,068,180
Lincoln National Corp. | 19,441,555
Homeowners Insurance
State Farm Mutual Automobile Insurance | 17,516,715
Allstate Corp. | 7,926,984
Liberty Mutual | 5,993,803
Farmers Insurance Group of Companies | 5,284,511
USAA Insurance Group | 5,000,407
Travelers Companies Inc. | 3,305,427
Nationwide Mutual Group | 3,249,456
American Family Insurance Group | 2,609,366
Chubb Ltd. | 2,485,193
Erie Insurance Group | 1,471,544
Private Passenger Auto Insurance
State Farm Mutual Automobile Insurance | 39,194,660
Berkshire Hathaway Inc. | 25,531,762
Allstate Corp. | 20,813,858
Progressive Corp. | 19,634,834
USAA Insurance Group | 11,691,051
Liberty Mutual | 10,774,426
Farmers Insurance Group of Companies | 10,304,622
Nationwide Mutual Group | 7,640,558
American Family Insurance Group | 4,005,549
Travelers Companies Inc. | 3,896,786
Commercial Auto Insurance
Progressive Corp. | 2,625,929
Travelers Companies Inc. | 2,124,182
Nationwide Mutual Group | 1,735,614
Zurich Insurance Group | 1,624,621
Liberty Mutual | 1,604,461
Old Republic International Corp. | 1,123,042
Berkshire Hathaway Inc. | 951,775
American International Group (AIG) | 867,567
Auto-Owners Insurance Co. | 739,495
Chubb Ltd. | 695,210
Commercial Lines Insurance
Chubb Ltd. | 16,528,891
Travelers Companies Inc. | 16,463,566
Liberty Mutual | 15,056,251
American International Group (AIG) | 13,144,961
Zurich Insurance Group | 12,554,597
CNA Financial Corp. | 9,763,122
Nationwide Mutual Group | 8,335,275
Hartford Financial Services | 7,679,737
Berkshire Hathaway Inc. | 7,650,236
Tokio Marine Group | 6,256,196
Workers’ Compensation Insurance
Travelers Companies Inc. | 4,467,425
Hartford Financial Services | 3,324,361
AmTrust Financial Services | 2,972,901
Zurich Insurance Group | 2,851,695
Liberty Mutual | 2,481,479
Berkshire Hathaway Inc. | 2,479,354
State Insurance Fund Workers’ Comp (NY) | 2,437,325
Chubb Ltd. | 2,368,918
American International Group | 2,345,247
State Compensation Insurance Fund (CA) | 1,638,849
Global Asset Management Firms by Revenue
[18]
BlackRock | United States | 4,890
The Vanguard Group | United States | 3,149
UBS | Switzerland | 2,716
State Street Global Advisors | United States | 2,460
Fidelity Investments | United States | 2,025
Allianz | Germany | 1,949
J.P. Morgan Asset Management | United States | 1,760
BNY Mellon Investment Management | United States | 1,740
PIMCO | United States | 1,590
Credit Agricole Group | France | 1,527
Global Investment Banks by Revenue
[2]
JPMorgan | 3,361
Goldman Sachs | 2,858
Bank of America Merrill Lynch | 2,684
Morgan Stanley | 2,501
Citi | 2,378
Barclays | 1,884
Credit Suisse | 1,760
Deutsche Bank | 1,387
RBC Capital Markets | 994
UBS | 904
Wells Fargo Securities | 871
HSBC | 793
Jefferies LLC | 750
BNP Paribas | 619
Lazard | 565
BMO Capital Markets | 448
Nomura | 445
Mizuho | 435
Sumitomo Mitsui Financial Group | 413
Evercore Partners Inc | 407
Hedge Funds By Assets Under Management
[6]
OrgCRD | PrimaryBusinessName | May2017AUM
110814 | NOMURA ASSET MANAGEMENT CO., LTD. | 367.6
105129 | BRIDGEWATER ASSOCIATES, LP | 239.3
158117 | MILLENNIUM MANAGEMENT LLC | 207.6
158319 | SAMSUNG ASSET MANAGEMENT COMPANY, LTD. | 182.2
148826 | CITADEL ADVISORS LLC | 152.7
143161 | APOLLO CAPITAL MANAGEMENT, L.P. | 125
140074 | PICTET ASSET MANANGEMENT SA. | 122.8
110997 | NIKKO ASSET MANAGEMENT CO LTD | 120.6
282598 | VANGUARD ASSET MANAGEMENT, LIMITED | 120.2
111128 | THE CARLYLE GROUP | 101.9
106661 | RENAISSANCE TECHNOLOGIES LLC | 97
144533 | KOHLBERG KRAVIS ROBERTS | 90
168122 | ANNALY MANAGEMENT COMPANY | 87.9
152719 | ALPHADYNE ASSET MANAGEMENT PTE. LTD. | 84.6
133720 | PINE RIVER CAPITAL MANAGEMENT L.P. | 82.8
159732 | TPG GLOBAL ADVISORS, LLC | 79.5
138111 | BALYASNY ASSET MANAGEMENT L.P. | 75.1
144603 | EASTSPRING INVESTMENTS (SINGAPORE) LIMITED | 74.5
155587 | FIELD STREET CAPITAL MANAGEMENT, LLC | 63.3
107580 | BLACKSTONE ALTERNATIVE ASSET MANAGEMENT LP | 62.3
148823 | BLUECREST CAPITAL MANAGEMENT LIMITED | 62.2
142979 | BLACKSTONE REAL ESTATE ADVISORS L.P. | 60.1
160795 | APG ASSET MANAGEMENT US, INC | 59.3
130074 | ARES MANAGEMENT LLC | 58.4
136979 | BLACKSTONE MANAGEMENT PARTNERS L.L.C. | 57.4
161600 | AGNC MANAGEMENT, LLC | 56.9
129612 | FORTRESS INVESTMENT GROUP | 56.9
156601 | ELLIOTT MANAGEMENT CORPORATION | 56
160309 | ELEMENT CAPITAL MANAGEMENT LLC | 55.9
139345 | MACQUARIE FUNDS MANAGEMENT | 54.7
160188 | MOORE CAPITAL MANAGEMENT, LP | 53.8
107913 | OZ MANAGEMENT LP | 51.7
159738 | TPG CAPITAL ADVISORS, LLC | 51.6
137137 | TWO SIGMA INVESTMENTS, LP | 49.3
152254 | TWO SIGMA ADVISERS, LP | 48.7
110338 | MACKENZIE INVESTMENTS | 48.6
156078 | HUDSON AMERICAS L.P. | 48.4
160000 | LONE STAR NORTH AMERICA ACQUISITIONS, LLC | 48.1
152175 | CERBERUS CAPITAL MANAGEMENT, L.P. | 48
173355 | CANDRIAM LUXEMBOURG S.C.A. | 47.1
156934 | 3G CAPITAL PARTNERS LP | 46.3
143158 | APOLLO MANAGEMENT, L.P. | 46.2
157589 | CAPULA INVESTMENT US LP | 45.8
156945 | WARBURG PINCUS LLC | 45.7
132272 | VIKING GLOBAL INVESTORS LP | 43.4
160679 | ADAGE CAPITAL MANAGEMENT, L.P. | 42
146629 | KKR CREDIT ADVISORS (US) LLC | 41.5
159215 | ALPINVEST PARTNERS B.V. | 41.2
108679 | D. E. SHAW | 37
Largest private equity firms by PE capital raised
[17]
The Carlyle Group | Washington D.C. | $30,650.33
Kohlberg Kravis Roberts | New York City | $27,182.33
The Blackstone Group | New York City | $24,639.84
Apollo Global Management | New York City | $22,298.02
TPG | Fort Worth/San Francisco | $18,782.59
CVC Capital Partners | Luxembourg | $18,082.35
General Atlantic | New York City | $16,600.00
Ares Management | Los Angeles | $14,113.58
Clayton Dubilier & Rice | New York City | $13,505.00
Advent International | Boston | $13,228.09
EnCap Investments | Houston | $12,400.20
Goldman Sachs Principal Investment Area | New York City | $12,343.32
Warburg Pincus | New York City | $11,213.00
Silver Lake | Menlo Park | $10,986.40
Riverstone Holdings | New York City | $10,384.26
Oaktree Capital Management | Los Angeles | $10,147.28
Onex | Toronto | $10,097.21
Ardian (formerly AXA Private Equity) | Paris | $9,805.25
Lone Star Funds | Dallas | $9,731.81
Investment Banking Private Equity Groups
[19]
ABN AMRO | AAC Capital Partners
Barclays Capital | Equistone Partners Europe
BNP Paribas | PAI Partners
CIBC World Markets | Trimaran Capital Partners
Citigroup | Court Square; CVC; Welsh, Carson, Anderson & Stowe; Bruckmann, Rosser, S
Deutsche Bank | MidOcean Partners
Globus Capital Holdings | Globus Capital Banca
Goldman Sachs | Goldman Sachs Capital Partners
JPMorgan Chase | CCMP Capital; One Equity Partners
Lazard | Lazard Alternative Investments
Merrill Lynch | Merrill Lynch Global Private Equity
Morgan Stanley | Metalmark Capital; Morgan Stanley Capital Partners New York
National Westminster Bank | Bridgepoint Capital
Nomura Group | Terra Firma Capital Partners
UBS | UBS Capital; Affinity Equity Partners; Capvis; Lightyear Capital
Wells Fargo | Pamlico Capital
William Blair & Company | William Blair Capital Partners
Federal Reserve System
The St. Louis Fed is well known among economics geeks as a fantastic
source of data, analysis and commentary. [4] In fact, all the Fed banks
are avid consumers of data, analysis and risk-management metrics.
SQL is the language of relational databases. It has its roots in mathematical theory which is what distinguishes it from the prior database structures and it has become the universal way that programmers communicate with databases.
Philadelphia Reflections uses the MySQL variety of database but the SQL language is universal.
Below are two simple examples of the SQL used in Philadelphia Reflections ...
SELECT * FROM individual_reflections WHERE table_key = ?;
This example simply pulls up all the fields ("TITLE", "DESCRIPTION", etc.) of a single blog, whose number is supplied at run time in place of the "?".
SELECT title, description, table_key FROM topics
WHERE NOT center_order = 0
AND table_key IN
(SELECT topic_key FROM topics_blogs
WHERE blog_key=?);
This example is slightly more complex ... it selects a subset of fields (title, description, table_key) of all the TOPICS that are associated with a single BLOG, which are visible (WHERE NOT center_order = 0 ).
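To see both queries run, here is a small sketch in Python using its built-in SQLite module rather than the site's actual PHP and MySQL. The table and column names come from the queries above; everything else (the column types, the sample rows, the function names) is invented for illustration and is not the real Philadelphia Reflections schema. SQLite happens to use the same "?" placeholder, so the SQL text is unchanged.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with made-up rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE individual_reflections (
            table_key INTEGER PRIMARY KEY, title TEXT, description TEXT);
        CREATE TABLE topics (
            table_key INTEGER PRIMARY KEY, title TEXT, description TEXT,
            center_order INTEGER);
        CREATE TABLE topics_blogs (topic_key INTEGER, blog_key INTEGER);

        INSERT INTO individual_reflections VALUES (1, 'SQL', 'About SQL');
        INSERT INTO topics VALUES (10, 'Website Development', 'Visible topic', 1);
        INSERT INTO topics VALUES (11, 'Hidden Topic', 'center_order is 0', 0);
        INSERT INTO topics VALUES (12, 'Other Topic', 'Belongs to blog 2', 2);
        INSERT INTO topics_blogs VALUES (10, 1);
        INSERT INTO topics_blogs VALUES (11, 1);
        INSERT INTO topics_blogs VALUES (12, 2);
    """)
    return conn


def fetch_blog(conn, blog_key):
    """First example: every field of one blog, key bound at run time."""
    sql = "SELECT * FROM individual_reflections WHERE table_key = ?;"
    return conn.execute(sql, (blog_key,)).fetchone()


def visible_topics_for_blog(conn, blog_key):
    """Second example: visible topics linked to one blog via topics_blogs."""
    sql = """
        SELECT title, description, table_key FROM topics
        WHERE NOT center_order = 0
        AND table_key IN
            (SELECT topic_key FROM topics_blogs
             WHERE blog_key = ?);
    """
    return conn.execute(sql, (blog_key,)).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(fetch_blog(db, 1))
    print(visible_topics_for_blog(db, 1))
```

Binding the blog number through the placeholder, instead of pasting it into the SQL string, is also what keeps a visitor-supplied value from being interpreted as SQL.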
When can you start?
Posted by: Eipmwlod | Jan 25, 2012 3:55 PM
I have not been to this blog for a long time, however it
was a joy to find it again. It is such an important
topic and ignored by so many, even professionals! I
thank you for helping to make people more aware of these
issues. Just great stuff as per usual.For more
information go to <www.brandwebdirect.com>.
Posted by: Johnmaurice | Sep 26, 2011 6:40 AM
this post is fantastic %3
Posted by: Yutqlnvp | Feb 5, 2011 11:09 PM
Some pretty amazingly esoteric information here, mate!
Posted by: Late Nights | Mar 16, 2007 10:06 PM
Good work. Interesting posts, besides those spam...
Posted by: Oopa Jopa | Oct 11, 2006 6:01 PM
C'est trouis bien. Nice, i mean. Thanks!
Posted by: Junior Lee | Sep 26, 2006 11:07 AM
48 Blogs
Website Statistics Philadelphia Reflections' popularity has grown quite dramatically over two and a half years.
Floating Three-Column CSS Layout A popular web page layout is three columns with a header and footer. This is achieved on this web site with CSS using floating columns.
CSS Zen Garden Suggestions Here's a list of the designs I think are worth a look, illustrating the great power of CSS.
Regex URL Matching On this site we check for the existence of a URL whenever an entry is updated. A Regex (regular expression) string was the breakthrough.
Editing HTML with PHP scripts To provide a PHP script that allows a user to edit HTML requires a few tricks that are hard to hack through but are elegantly documented in the manual.
Static vs Dynamic URLs Implementing static URLs for a website driven by PHP and MySQL is as easy as a little regex and htaccess magic.
HTML Forms How to open a form in a new window when a radio button is clicked.
Geo Positioning With the advent of Google Earth, the tagging of websites, blogs and photographs with latitude and longitude information has taken a great leap forward.
Webpage Printing Webpage printing is supported on this site. It seems to work pretty well except for text flow-around for some browsers.
Font Families A survey of the most-commonly installed fonts found on Windows machines
Open a new window with XHTML Prior to XHTML, you could open a new window with a link by saying target="_blank". That's no longer allowed, but what can you do?
Captcha Captcha is the term for security codes the user must enter into a form
RSS, Atom, Syndication, etc. The world is full of XML and XML-like file formats for syndication-feed purposes. Why they must all be different, God alone can tell. But the reason there is lousy documentation is the evil work of Man.
Server-Side gzip Compression You can include code in a PHP program to compress an outbound web page. It works, but instructing the web server to do it to every page is much easier.
Stylesheet for ATOM feed Unsupported for the moment (in IE 7 and Firefox 3.0.8), we have equipped our ATOM and RSS feeds with XSL style sheets which include, among other features, sorting the entries in descending order by modified date.
SQL To Exclude A List Of Items How do you select everything from one table except for a list contained in another table?
MySQL server has gone away What to do when your MySQL connection is timing out for no apparently good reason, all of a sudden.
mysql_update_assoc Easily (and quote_smart-ly) update a record in a MySQL database table
How to detect an iPhone and other mobile devices iPhones are definitely the wave of the future and websites, blogs, etc. must adapt to retain their audience. Luckily, if a website was developed using CSS, it's a breeze.
An iPhone web app Philadelphia Reflections is now available on the iPhone as a web app
Valid XHTML YouTube embed code generator This free tool will create a valid XHTML embed code for any YouTube video. The code YouTube shows on the embed field is not valid XHTML! However, you can simply use this simple tool to make it Valid XHTML 1.0 Transitional.
SQL ... Structured Query Language
SQL (Structured Query Language) is the way that programmers communicate with relational databases such as the one behind Philadelphia Reflections.