<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jeremy Cook &#187; SQL</title>
	<atom:link href="http://jeremycook.ca/category/web-development/sql/feed/" rel="self" type="application/rss+xml" />
	<link>http://jeremycook.ca</link>
	<description>Random musings on web development and PHP</description>
	<lastBuildDate>Thu, 10 May 2012 02:27:21 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Some More PDO Weirdness</title>
		<link>http://jeremycook.ca/2010/11/22/some-more-pdo-weirdness/</link>
		<comments>http://jeremycook.ca/2010/11/22/some-more-pdo-weirdness/#comments</comments>
		<pubDate>Tue, 23 Nov 2010 01:26:54 +0000</pubDate>
		<dc:creator>Jeremy Cook</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[PDO]]></category>

		<guid isPermaLink="false">http://jeremycook.ca/?p=187</guid>
		<description><![CDATA[I&#8217;ve said before that I&#8217;m a great fan of PDO and use it wherever possible. That said there are some annoying quirks in it, one of which I encountered today. I&#8217;ll outline what I was trying to do, what I expected to happen and what actually happened. I&#8217;m also curious as to what anyone else [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve said before that I&#8217;m a great fan of PDO and use it wherever possible. That said there are some annoying quirks in it, one of which I encountered today. I&#8217;ll outline what I was trying to do, what I expected to happen and what actually happened. I&#8217;m also curious as to what anyone else thinks of this problem and whether it&#8217;s something worth reporting as a bug or if it&#8217;s a &#8216;feature&#8217;.</p>
<h2>What I Was Trying to Do&#8230;</h2>
<p>I&#8217;m currently adding a clearance item section to a website. This will aggregate items from over 35 stores across Canada and allow a user to search for items in a number of ways. As part of this I&#8217;ve created a business object to represent a clearance item. This mainly consists of protected properties with accessors and mutators for setting and getting the properties. Since the site is in English and French the accessors allow the properties to be formatted accordingly before being returned. I use this clearance item class when importing information into the database (items are imported from Excel worksheets, but that&#8217;s a topic for another post) and when returning results from searches. I use data mappers to handle the CRUD operations in the database. What I wanted to do was to find a quick and easy way to transform a row from a resultset into an instance of a clearance item object without using a full ORM solution. I thought I had found this with PDO.</p>
<h2>The &#8216;Solution&#8217;</h2>
<p>PDO allows you to set the fetch mode to fetch an object of a user defined class for each row returned from the database. You can either use PDOStatement::fetchObject() or PDOStatement::setFetchMode() to define this behaviour. It will take the names of the columns and assume that these are properties of the object, setting the appropriate values for these properties. In theory this sounds great but unfortunately it&#8217;s not as easy as it sounds.</p>
<h2>The Problem</h2>
<p>The problem in my case is that all properties of my object are declared as protected and I use mutators to set them. I anticipated this and provided a __set() method to provide the needed property overloading. I thought that what would happen is that PDO would attempt to set an undefined public property, triggering the __set() method where there is logic to call the appropriate mutator. Unfortunately it doesn&#8217;t work like that at all. According to comments in the PHP manual it seems that PDO uses something called reflection injection to set the properties inside the object regardless of their visibility. This meant that I was getting back an instance of my clearance item class with the protected properties set directly by PDO. This may not seem like too much of a problem, but my mutators include logic to transform the properties being set to the correct data type. For instance, columns containing integers or floating point numbers come back as strings and the mutators make sure they are set as numeric data types. Worst of all for me was the information on stores. I&#8217;m using MySQL&#8217;s GROUP_CONCAT() function to &#8216;implode&#8217; information about stores and the inventory in each into a string, meaning that I can fetch all of the information I need in a single query. The logic in the mutator then splits that string apart into an array of information on stores and the inventory. PDO was bypassing this completely and setting the protected stores property as the string that came back from GROUP_CONCAT().</p>
<h2>A Partial Solution</h2>
<p>Once again the PHP manual was my partial saviour. There is a way to fetch a row from a resultset into an existing instance of a class. In this case the visibility of properties is honoured and the __set() function is called, in turn triggering the correct mutators. The code I came up with looks like this:</p>
<pre class="brush: php; title: ; notranslate">

//The statement is prepared before this.
$item = new ClearanceItem($this-&gt;lang);
//Fetch each row into the existing $item instance so that __set() is honoured.
$st-&gt;setFetchMode(PDO::FETCH_INTO, $item);
$st-&gt;bindParam(':ItemId', $id, PDO::PARAM_INT);
$st-&gt;execute();
foreach ($st as $result) {
    //The query in this case only returns a single row.
    return $result;
}
</pre>
<p>While this works it&#8217;s a bit of a hack to say the least. It seems that PDOStatement::fetch() doesn&#8217;t work here to fetch the info into the object, necessitating the foreach loop and the immediate return. This is fine in this case but in other areas of the code I need to iterate over a resultset, getting a new instance of ClearanceItem each time. This wouldn&#8217;t work as the information is populated into the existing object bound to the statement. The best I could do is to clone the object each time through the loop and store the cloned copy.</p>
<p>There is another way out of this: return each row as an associative array and create a factory method to return instances of ClearanceItem from it. I had actually already coded this solution when I initially found the strange way PDO was returning instances of ClearanceItem. That solution works perfectly well, but it seems strange that it&#8217;s not possible to loop over a record set, getting a new instance of a class each time.</p>
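<p>A minimal sketch of that factory approach follows. The cut-down ClearanceItem class, its price and stores columns and the fromRow() method name are assumptions made purely for illustration; the real class has more properties and formats values by language.</p>
<pre class="brush: php; title: ; notranslate">
&lt;?php
//Hypothetical, cut-down stand-in for the ClearanceItem class described above.
class ClearanceItem
{
    protected $price;
    protected $stores = array();

    //Mutators cast or unpack the raw column values.
    public function setPrice($price)
    {
        $this-&gt;price = (float) $price;
    }

    public function setStores($stores)
    {
        //Assumes GROUP_CONCAT() used its default ',' separator.
        $this-&gt;stores = $stores === '' ? array() : explode(',', $stores);
    }

    public function getPrice()
    {
        return $this-&gt;price;
    }

    public function getStores()
    {
        return $this-&gt;stores;
    }

    //Factory: build a new instance from one associative-array row.
    public static function fromRow(array $row)
    {
        $item = new self();
        foreach ($row as $column =&gt; $value) {
            $mutator = 'set' . ucfirst($column);
            if (method_exists($item, $mutator)) {
                $item-&gt;$mutator($value);
            }
        }
        return $item;
    }
}

//A row shaped like the result of $st-&gt;fetch(PDO::FETCH_ASSOC).
$row = array('price' =&gt; '19.99', 'stores' =&gt; 'Toronto,Montreal');
$item = ClearanceItem::fromRow($row);
var_dump($item-&gt;getPrice(), $item-&gt;getStores());
</pre>
<p>Because every value passes through a mutator, the type conversion and the splitting of the GROUP_CONCAT() string happen exactly as they would for any other calling code, and each row yields its own object.</p>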
<h2>Conclusion</h2>
<p>It seems that it&#8217;s quite possible to return instances of a class from a PDOStatement providing that all of the properties are public or if you don&#8217;t mind PDO directly setting your private and protected properties. If you want to use mutators to control how your data is set and to make sure it&#8217;s of the correct type it seems that the options are far more limited and you may be out of luck. Of course there are other options, such as writing a factory method to create an object from an array or using a full blown ORM like Doctrine. In this instance, as I haven&#8217;t used Doctrine before and as time is of the essence I was hoping that PDO would give me what I need.</p>
<p>It seems very strange that PDO is allowed to directly set private and protected properties when all other code accessing instances of ClearanceItem has to use the accessors and mutators. It seems to me to drive a coach and horses through the concept of property visibility. Why should PDO (at least I&#8217;m assuming it&#8217;s just PDO) have the ability to bypass the visibility protections that OO code provides while everyone else has to work within them? Is there some good reason for doing this that I&#8217;m missing? I&#8217;d be really interested to hear what the possible advantages of this are.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeremycook.ca/2010/11/22/some-more-pdo-weirdness/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Creating 50,000 unique values</title>
		<link>http://jeremycook.ca/2010/07/11/creating-50000-unique-values/</link>
		<comments>http://jeremycook.ca/2010/07/11/creating-50000-unique-values/#comments</comments>
		<pubDate>Mon, 12 Jul 2010 01:57:25 +0000</pubDate>
		<dc:creator>Jeremy Cook</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://jeremycook.ca/?p=135</guid>
		<description><![CDATA[A client at work wants to run a promotion where a customer will receive a card with a unique 8 digit code on it when they buy something. They will then be able to visit the website to find out if they are a winner or to get a chance to enter a prize draw. [...]]]></description>
			<content:encoded><![CDATA[<p>A client at work wants to run a promotion where a customer will receive a card with a unique 8 digit code on it when they buy something. They will then be able to visit the website to find out if they are a winner or to get a chance to enter a prize draw. I have to put all of the code together to manage this and thought I would run a few experiments to try out a few ideas, the first of which was to generate the 50,000 unique codes needed for the competition. I didn&#8217;t think this would be a particularly difficult task (and in reality it wasn&#8217;t) but I hit a number of problems while doing this that revealed some interesting things about the variations of running PHP on different operating systems and issues with handling large datasets.</p>
<h2>Development environment</h2>
<p>My development environment is Windows 7 with Apache (compiled using VC9), PHP 5.3.2 and MySQL 5.1.47. For coding I use NuSphere&#8217;s PHPED. All of my timings were made using the built in profiler that comes with PHPED and the DBG PHP debugging extension. Why is this information important? Hopefully that will become clear later.</p>
<h2>My first attempt</h2>
<p>My first attempt used a while loop to iterate 50,000 times to create the values. On each iteration a code was generated and a prepared statement executed to insert the code into the database. The column holding the code has a unique index on it, causing an exception to be thrown if PHP generates a duplicate 8 digit code. This is then caught in the inner catch block, which causes the loop to go through another iteration. If the code is inserted successfully the counter is incremented and the loop carries on. Code for this is below:</p>
<pre class="brush: php; title: ; notranslate">

&lt;?php
//Up the execution time to 15 minutes as this takes ages to run
ini_set('max_execution_time', 900);
try {
    //Create a PDO connection to the database
    $db = new PDO('mysql:dbname=test;host=localhost', 'root', 'PASSWORD');
    //Set the PDO error mode to exceptions
    $db-&gt;setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    //Delete all entries from the test table
    $db-&gt;exec('TRUNCATE test');
    //Prepare the statement
    $st = $db-&gt;prepare('INSERT INTO test(code) VALUES(?)');
    //Bind the parameter to the $code variable.
    //When the statement is executed whatever value is in $code will be used as the value for the parameter
    $st-&gt;bindParam(1, $code);
    //Set the counter for the main loop
    $i = 0;
    //Possible characters for the code string
    $possible = &quot;0123456789bcdfghjkmnpqrstvwxyz&quot;;
    //Index of the last possible character (0 based)
    $len = strlen($possible) - 1;
    //Loop until 50,000 codes have been inserted
    while ($i &lt; 50000) {
        try {
            //Reset the $code variable
            $code = &quot;&quot;;
            //Create a counter for the inner loop
            $j = 0;
            //Add random characters to $code until it is 8 characters long
            do {
                //Get a random character
                $char = $possible[mt_rand(0, $len)];
                //We don't want this character if it's already in the code.
                //strpos() returns 0 for a match at the start, so compare against false explicitly.
                if (strpos($code, $char) === false) {
                    $code .= $char;
                    $j++;
                }
            } while ($j &lt; 8);
            //Execute the statement. The value of $code will be used as the parameter
            $st-&gt;execute();
            $i++;
        }
        catch (PDOException $e) {
            //If an exception is caught here it's probably a duplicate value in the code column. Continue to get a new value.
            continue;
        }
    }
}
catch (PDOException $e) {
    //Any other exceptions caught here.
    echo $e-&gt;getMessage();
}
?&gt;
</pre>
<p>The major problem with this was the time taken to execute it: after 15 minutes I still didn&#8217;t have 50,000 rows in the database but only something like 38,000! The code worked but was amazingly slow. After running the code through a profiler I found that 14 mins 54 seconds were spent on the line &#8216;$st-&gt;execute()&#8217; to insert the row. I was quite amazed by this as I didn&#8217;t expect the database to add quite such an overhead. The other important point to mention here is that I used the mt_rand() function. Why that&#8217;s important will become clear in a moment.</p>
<p>At this point I had two main questions:</p>
<ol>
<li>What&#8217;s the quickest way to create 50,000 unique 8 digit codes?</li>
<li>What&#8217;s the quickest way to insert these rows into the database?</li>
</ol>
<p>I should add that the need to find answers to both of these questions was somewhat academic and more to satisfy my own curiosity. This code would never be run on a live server but would be used on my development box to generate a database table. This could then be dumped into a SQL file and used to create the table on a live server. As a result performance was not the number one consideration in this case. I still wanted to run this in less than 15 minutes though!</p>
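<p>One possible answer to the first question, sketched under a few assumptions: build the codes entirely in memory with mt_rand() and let array keys weed out duplicates, so that the database is not consulted once per code. generateUniqueCodes() is a hypothetical name, and unlike the loop above it allows a character to repeat within a code.</p>
<pre class="brush: php; title: ; notranslate">
&lt;?php
//Hypothetical helper: returns $number distinct codes of $length characters.
function generateUniqueCodes($number, $length = 8, $possible = '0123456789bcdfghjkmnpqrstvwxyz')
{
    $max = strlen($possible) - 1;
    $codes = array();
    while (count($codes) &lt; $number) {
        $code = '';
        for ($j = 0; $j &lt; $length; $j++) {
            $code .= $possible[mt_rand(0, $max)];
        }
        //Using the code as an array key discards duplicates for free.
        //The 'c' prefix stops all-digit codes being cast to integer keys.
        $codes['c' . $code] = $code;
    }
    return array_values($codes);
}

$codes = generateUniqueCodes(50000);
echo count($codes), &quot;\n&quot;;
</pre>
<p>The resulting array can then be written to a text file, one code per line, or inserted in whatever way proves quickest.</p>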
<h2>The &#8216;Hamlet&#8217; solution</h2>
<p>At this point I turned to the <a href="http://www.open.ac.uk/">Open University</a> web development forums for help. I learnt <a href="http://www3.open.ac.uk/study/undergraduate/qualification/c39.htm">web development at the OU</a> and their forums are open to students, tutors and ex-students like myself. Some very clever people hang out there and I knew I would get some good suggestions. Michelle Hoyle and Keith Evetts, two of the tutors on the open source web development tools course (which focuses on PHP as a server side scripting language), engaged in a discussion with me about how to generate codes quickly. From the beginning I noticed that the ideas they were posting were not working for me. They were using PHP functions like str_shuffle(), array_rand() and rand() and producing 50,000 unique values quickly. I had independently tried str_shuffle() and array_rand() too but had given up on them as they generated huge numbers of duplicate codes after a while. At this point I couldn&#8217;t understand why Keith and Michelle could produce solutions using these functions that I couldn&#8217;t get to work.</p>
<p>Keith Evetts came up with an idea which seemed crazy at the time but which generates random values very well and quickly. His idea was to take a large piece of text, use the mcrypt extension to encrypt it, discard all non alpha-numeric characters from the encrypted text and use this to generate the 8 digit codes. He used the third act of Hamlet as the text (hence my calling this the Hamlet solution). This worked extremely well except for the fact that even the third act of Hamlet couldn&#8217;t be used to produce 50,000 codes in one pass. The solution was to loop over the code generation, encrypting the text using different initialisation vectors and keystrings, until 50,000 unique values had been produced. I also worked out at this time that the fastest way to insert this many values into a MySQL table was to use the &#8216;LOAD DATA INFILE&#8217; command. I used Keith&#8217;s function to generate the codes, writing these to a text file before using &#8216;LOAD DATA INFILE&#8217; to insert the records into the database. The PHP code ran in 1.9 seconds while the database insert took around 6 seconds. Here&#8217;s the PHP code:</p>
<pre class="brush: php; title: ; notranslate">

&lt;?php
 /* ---------  array function generate_codes ( int $number, string $plaintext , string $keystring, $length ) -----------
Keith Evetts 3 July 2010 license: LGPL 2.  This notice and author name must remain intact.
Args: number of codes to be generated,
 plaintext string from which to generate them, minimum 100 chars (optimal length is 4000 chars)
 keystring in plain text; minimum 12 chars
 length of code strings to be generated
Returns: enumerated array of  unique alphanumeric codes of length $length
Requires: PHP mcrypt lib with Rijndael 256 (v. 2.4 + of mcrypt)
-------------------------------------------------------------------------------------------------------------------------------------- */
 function generate_codes ( $number, $plaintext, $keystring, $length = 8 ) {
 switch (true) {
 case ( ! is_int($number) )                                                :
 case ( ! is_int($length) )                                                  :
 case ( ! is_string($plaintext)  || strlen($plaintext ) &lt; 100)   :
 case ( ! is_string($keystring) || strlen($keystring) &lt; 12 )    :
 throw new Exception (' function generate_codes called with incorrect params ');
 break;
 // default is proceed
 }
 // use the same text and randomly different keys to reach desired number of codes
 // for e.g. 50000 codes this will take several passes
 $unique_array = array();
 do {
 // get a new key
 $key =  substr ( sha1 ( str_shuffle ( $keystring )  ) , 0 , 32 );
 // get a new initialisation vector
 $iv = substr ( sha1 ( str_shuffle ( &quot;the slings and arrows of outrageous fortune&quot; )  ) , 0, 32 );
 $ciphertext = mcrypt_encrypt (   MCRYPT_RIJNDAEL_256,
 $key,
 $plaintext,
 MCRYPT_MODE_CBC,
 $iv
 );
 /* clean out non-alphanumeric chars at some cost to code redundancy */
 $ciphertext = preg_replace( '/[^2346789abdefhjkmnprtwxyz]/' ,  &quot;&quot; , strtolower($ciphertext) );
 $codearray = str_split ( $ciphertext, $length );
 // dump leftover element at end
 array_pop( $codearray );
 $size = sizeof($codearray);
 for ($i = 0; $i &lt; $size; $i++) $unique_array[] = $codearray[$i];
 /*somewhat amazingly, it is far quicker to enlarge the array by adding elements one at a time in a for loop, than to use array_merge() ! */
 } while ( sizeof ( $unique_array ) &lt;= ( $number  + 1 ) ) ;
 // now remove any duplicates at end of whole process
 $unique_array = array_unique ( $unique_array );
 return array_slice($unique_array, 0, $number);
 }
 try {
 $codes = generate_codes(50000, file_get_contents('plaintext_Hamlet_Act3.txt'), 'This is the keystring');
 }
 catch (Exception $e) {
 echo $e-&gt;getMessage();
 }
 $file = fopen('codes.txt', 'w');
 foreach($codes as $code) {
 fwrite($file, &quot;$code\r\n&quot;);
 }
 fclose($file);
?&gt;
</pre>
<p>This was clearly the winner for me on Windows but it still didn&#8217;t explain why PHP&#8217;s random functions wouldn&#8217;t work for me on Windows.</p>
<h2>&#8216;Random&#8217; functions on Windows: a theory</h2>
<p>At this point I was left wondering why the various &#8216;random&#8217; functions on Windows had performed so badly for me. I knew that Michelle Hoyle uses a Mac and that Keith Evetts&#8217; scripts executed without a problem on a web server. I began to think that the problem was PHP on Windows. To test this out I uploaded one of the scripts that generated huge numbers of duplicates for me to a live server running CentOS Linux and everything worked without any problems. So what&#8217;s going on? I have a theory and once again I need to thank Keith Evetts for pointing me towards it. It seems that the PHP function rand() is simply a wrapper for the operating system&#8217;s native random function (see <a href="http://cod.ifies.com/2008/05/php-rand01-on-windows-openssl-rand-on.html">here</a> for more details). The PHP manual states:</p>
<blockquote>
<blockquote><p><strong>Note</strong>:          On some platforms (such as Windows), <a href="http://ca.php.net/manual/en/function.getrandmax.php">getrandmax()</a> is only 32768.  If you require a range larger than 32768, specifying     <em><tt>min</tt></em> and <em><tt>max</tt></em> will allow     you to create a range larger than this, or consider using     <a href="http://ca.php.net/manual/en/function.mt-rand.php">mt_rand()</a> instead.</p></blockquote>
</blockquote>
<p>My theory is that functions such as str_shuffle() or array_rand() are also using the native operating system random function. It just happens that the function on the Linux/Unix platform is far better than the one available under Windows, which would explain why the scripts run so differently under different platforms. Normally this wouldn&#8217;t create any problems but when you&#8217;re dealing with very large datasets where randomness is a necessity it becomes a problem. This would also explain why my initial attempt had no problems generating unique values, as I was using the mt_rand() function. This is based on an algorithm known as the Mersenne Twister, which generates better random numbers and does not rely on the operating system&#8217;s random number generator. I don&#8217;t know C or the PHP source code well enough to confirm this, but this is my strong hunch. Is someone able to confirm this?</p>
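<p>One way to check this on a particular machine is to count how many distinct codes each approach yields over the same number of attempts. The sketch below is only a rough measuring aid: countDistinct() and the two closures are hypothetical names, and the figures will vary with the platform and PHP version.</p>
<pre class="brush: php; title: ; notranslate">
&lt;?php
//Hypothetical helper: call $generator $iterations times and count the distinct results.
function countDistinct($generator, $iterations)
{
    $seen = array();
    for ($i = 0; $i &lt; $iterations; $i++) {
        //The 'c' prefix stops all-digit codes being cast to integer keys.
        $seen['c' . call_user_func($generator)] = true;
    }
    return count($seen);
}

$possible = '0123456789bcdfghjkmnpqrstvwxyz';

//8 character code taken from a shuffled copy of the character set.
$withShuffle = function () use ($possible) {
    return substr(str_shuffle($possible), 0, 8);
};

//8 character code built one character at a time with mt_rand().
$withMtRand = function () use ($possible) {
    $code = '';
    for ($j = 0; $j &lt; 8; $j++) {
        $code .= $possible[mt_rand(0, strlen($possible) - 1)];
    }
    return $code;
};

printf(&quot;str_shuffle(): %d distinct of 50000\n&quot;, countDistinct($withShuffle, 50000));
printf(&quot;mt_rand():     %d distinct of 50000\n&quot;, countDistinct($withMtRand, 50000));
</pre>
<p>If the theory holds, the first figure should fall well short of 50,000 on an affected Windows install while the second stays close to it.</p>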
<h2>Conclusion</h2>
<p>As I said at the beginning, what should have been a fairly simple exercise turned into something a little more involved and ultimately informative. If you need to generate a large number of random values, these are the guidelines I would suggest following:</p>
<ul>
<li>If you&#8217;re going to be running the script on a Linux/Unix box or you&#8217;re developing on such a system go ahead and use whichever PHP functions you like. As the underlying OS random number generation is better than on Windows you won&#8217;t run into any problems and will almost certainly get better performance this way.</li>
<li>If you&#8217;re running on Windows your options are more limited. Here I would suggest that you use either mt_rand() or the &#8216;Hamlet&#8217; approach if you require more than 32,768 random values.</li>
</ul>
<p>I know that all programs rely on the OS they&#8217;re executing on but this discrepancy seems quite big to me. Is there any way that PHP&#8217;s &#8216;random&#8217; functions could be re-written to take advantage of the Mersenne Twister? With my limited knowledge of the PHP core code that would seem to me to offer the combination of good random generation and consistency across platforms. Of course, if it were that simple I&#8217;m sure someone else would already have done it.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeremycook.ca/2010/07/11/creating-50000-unique-values/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>A Quick Tip</title>
		<link>http://jeremycook.ca/2010/05/15/a-quick-tip/</link>
		<comments>http://jeremycook.ca/2010/05/15/a-quick-tip/#comments</comments>
		<pubDate>Sat, 15 May 2010 14:00:10 +0000</pubDate>
		<dc:creator>Jeremy Cook</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[PDO]]></category>

		<guid isPermaLink="false">http://jeremycook.ca/?p=84</guid>
		<description><![CDATA[I haven&#8217;t written anything here for ages due to illness, work and life getting in the way. I&#8217;ve got a longer post brewing that I&#8217;ll hopefully add in the next couple of days but for now here&#8217;s a quick tip that I hope someone will find useful. I recently had a situation where I wanted [...]]]></description>
			<content:encoded><![CDATA[<p>I haven&#8217;t written anything here for ages due to illness, work and life getting in the way. I&#8217;ve got a longer post brewing that I&#8217;ll hopefully add in the next couple of days but for now here&#8217;s a quick tip that I hope someone will find useful.</p>
<p>I recently had a situation where I wanted to use an array of values as bound parameters in a SQL IN clause. Easy enough to do, except the array was of variable length and I didn&#8217;t know how long it would be each time the script was run. Here&#8217;s the solution I came up with.</p>
<pre class="brush: php; title: ; notranslate">

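&lt;?php
try {
    //Variable length array, unknown length before runtime
    $array = array('some value', 'another', 'another');
    $db = new PDO(CONNSTR, USERNAME, PASS);
    //Throw exceptions on errors so that the catch block below can be reached
    $db-&gt;setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    //Build one ? placeholder per array element
    $sql = &quot;SELECT SomeColumn FROM table WHERE SomeOtherColumn IN (&quot; . implode(',', array_fill(0, count($array), '?')) . ') ORDER BY SomeColumn';
    $st = $db-&gt;prepare($sql);
    //The array values are bound to the placeholders in order
    $st-&gt;execute($array);
}
catch (PDOException $e) {}
?&gt;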
</pre>
<p>Quick and easy!</p>
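<p>One case worth guarding against is an empty array, which would produce IN () and a SQL syntax error. A small helper along these lines could cover it; inPlaceholders() and the SomeTable name are hypothetical and only here for illustration.</p>
<pre class="brush: php; title: ; notranslate">
&lt;?php
//Hypothetical helper: returns '?,?,?' with one placeholder per value.
function inPlaceholders(array $values)
{
    if (count($values) === 0) {
        throw new InvalidArgumentException('An IN clause needs at least one value');
    }
    return implode(',', array_fill(0, count($values), '?'));
}

$values = array('some value', 'another', 'another');
$sql = 'SELECT SomeColumn FROM SomeTable WHERE SomeOtherColumn IN (' . inPlaceholders($values) . ') ORDER BY SomeColumn';
//Prints: SELECT SomeColumn FROM SomeTable WHERE SomeOtherColumn IN (?,?,?) ORDER BY SomeColumn
echo $sql, &quot;\n&quot;;
</pre>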
]]></content:encoded>
			<wfw:commentRss>http://jeremycook.ca/2010/05/15/a-quick-tip/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Creating a Persistent Login Mechanism</title>
		<link>http://jeremycook.ca/2010/03/28/creating-a-persistent-login-mechanism/</link>
		<comments>http://jeremycook.ca/2010/03/28/creating-a-persistent-login-mechanism/#comments</comments>
		<pubDate>Sun, 28 Mar 2010 16:48:13 +0000</pubDate>
		<dc:creator>Jeremy Cook</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://jeremycook.ca/?p=73</guid>
		<description><![CDATA[How to create a secure persistent login mechanism in PHP, looking at some of the issues and pitfalls surrounding this.]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been working on a project recently where one of the requirements was to have a persistent login mechanism. I&#8217;m not a great fan of this kind of feature simply because it significantly degrades the security of the application. As the only mechanism whereby data can be persisted across sessions is cookies creating a persistent login means storing data in a cookie on the client. Since the cookie has to contain some sort of value to identify the user when they visit the site again there&#8217;s a significant risk that if the cookie is stolen an attacker can log into the application without knowing any user credentials at all. That said, I decided that the risks for this particular application were fairly minimal. The application is open to users by invitation only and all communication with the server is conducted using SSL. Because of this I was able to set the persistent login cookie to be only sent over an SSL connection. There is still a risk if the user chooses to login from a public computer, selecting the option to &#8216;remember me&#8217;, but unfortunately there is nothing I can do about this. It seems to be one of the paradoxes of software development is that making an application easier to use can adversely affect the security.</p>
<p>In creating the persistent login token I followed the principles outlined by Chris Shiflet in <a href="http://shiflett.org/books">&#8216;Essential PHP Security&#8217;</a>. This book is fantastic and would be my top recommendation for any web developer who is concerned about application security (which should be all of us!) to read. The basic principle is to create an identifier and a token if a user chooses to persist a login. These are then set in a cookie and recorded in the database along with a &#8216;timeout&#8217; value. Shiflet recommends that the maximum time a persistent login cookie should be valid for is seven days to minimise the risk of loss or exposure of the cookie. Since a new cookie is generated every time the user visits the site as long as they visit at least once every seven days they will never need to login again. Finally, if a user chooses to logout the persistent login cookie is deleted so that they really are logged out.</p>
<h2>Creating the Cookie</h2>
<p>My login form displays a checkbox to the user which, if selected, triggers the mechanism whereby a cookie is set. If the user can be logged in from the submitted username and password I call a method to create the necessary token and identifier and record the values in the database. The code to do this follows:</p>
<pre class="brush: php; title: ; notranslate">

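//Secondary identifier: a salted double hash of the username, so no login credentials end up in the cookie
$identifier = sha1(SALT . sha1($organisation . SALT));
//Random token, regenerated every time the cookie is set
$token = sha1(uniqid(mt_rand(), true));
$user-&gt;setPersistentLogin($identifier, $token, $organisation);
//Valid for seven days and sent over SSL only (the sixth argument is the secure flag)
setcookie('auth', &quot;$identifier:$token&quot;, time() + 60 * 60 * 24 * 7, '/', '' , true);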
</pre>
<p>The identifier is generated from the user&#8217;s username for the application, which is then double hashed along with a salt. By creating a secondary identifier in this way I ensure that no details that an attacker could use to log into the application through the login form are set in the cookie. A random token is then generated and hashed before the information is recorded in the database in the setPersistentLogin method of the user object. Finally the cookie holding the login information is set, with an expiry time of one week. My database table includes columns for the identifier and token and a datetime column for the expiry date. The SQL to record the information looks like this (my database access is done using PDO prepared statements):</p>
<pre class="brush: sql; title: ; notranslate">

UPDATE users SET Identifier = :identifier, Token = :token, Timeout = DATE_ADD(NOW(), INTERVAL 7 DAY) WHERE Organisation = :Organisation
</pre>
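<p>The token above is built from uniqid() and mt_rand(), neither of which is designed to be unpredictable. On PHP 7 or later, a possible alternative is to draw the token from random_bytes() and to compare tokens with hash_equals(). This is a sketch of that substitution and not the code the application uses; createPersistentToken() and tokensMatch() are hypothetical names.</p>
<pre class="brush: php; title: ; notranslate">
&lt;?php
//Hypothetical helper: 20 random bytes as 40 hex characters, the same length as a sha1() hash.
function createPersistentToken()
{
    return bin2hex(random_bytes(20));
}

//Hypothetical helper: comparison that takes the same time whether or not the strings match.
function tokensMatch($stored, $supplied)
{
    return hash_equals($stored, $supplied);
}

$token = createPersistentToken();
var_dump(strlen($token), tokensMatch($token, $token));
</pre>
<p>Keeping the token at 40 characters means it fits the same database column and cookie format as the sha1() based version.</p>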
<h2>Logging in Using the Cookie</h2>
<p>When a user requests the login page my controller for that page executes the following code:</p>
<pre class="brush: php; title: ; notranslate">

if ((isset($_POST['token']) &amp;&amp; $_POST['token'] === $_SESSION['token']) || isset($_COOKIE['auth'])) {
    if (isset($_POST['token']) &amp;&amp; $_POST['token'] === $_SESSION['token']) {
        $captcha = isset($_SESSION['captcha']) ? $_SESSION['captcha'] : false;
        $this-&gt;model-&gt;inputs = $_POST;
        $this-&gt;model-&gt;captcha = $captcha;
        $this-&gt;model-&gt;setContent();
    } else if (isset($_COOKIE['auth'])) {
        list($identifier, $token) = explode(':', $_COOKIE['auth']);
        $this-&gt;model-&gt;identifier = $identifier;
        $this-&gt;model-&gt;token = $token;
        $this-&gt;model-&gt;setContent();
    }
    if ($this-&gt;model-&gt;valid) {
        //Log the user in and redirect them.
    }
} else {
    //Call setContent with no values set.
    $this-&gt;model-&gt;setContent();
}
</pre>
<p>This code looks for the existence of either a form submission or a persistent login cookie. If either is set appropriate properties are set in the model for the page before the setContent() method is called. The part of the setContent() method handling the login looks like this:</p>
<pre class="brush: php; title: ; notranslate">
try {
    if (($this-&gt;identifier &amp;&amp; $this-&gt;token) || $this-&gt;inputs) {
        $user = new userAuth();
        if ($this-&gt;identifier &amp;&amp; $this-&gt;token) {
            //Attempt a login based on the persistent token values
            $result = $user-&gt;checkPersistentLogin($this-&gt;identifier, $this-&gt;token);
            //No exceptions thrown, successful login.
            $this-&gt;setPersistentLogin($result['Organisation'], $user);
        } else if (isset($this-&gt;inputs)) {
            if (! $this-&gt;validate($this-&gt;inputs)) {
                throw new InvalidArgumentException('Errors in form submission values');
            }
            //Attempt a login based on the form submitted
            $result = $user-&gt;checkLogIn($this-&gt;results['username'], $this-&gt;results['password']);
        }
        //No exceptions thrown, successful login.
        $this-&gt;organisation = $result['Organisation'];
        $this-&gt;contactName = $result['ContactName'];
        $this-&gt;email = $result['Email'];
        $this-&gt;questionsAnswered = $result['Questions Answered'] === 'True' ? true : false;
        if (isset($this-&gt;results['remember']) &amp;&amp; $this-&gt;results['remember'] === 'rememberMe') {
            $this-&gt;setPersistentLogin($this-&gt;organisation, $user);
        }
        $this-&gt;valid = true;
        return;
    }
}
catch (InvalidArgumentException $e) {
    /**
     * Invalid login attempt.
     * If inputs is set display an error message as there was an invalid form submission.
     * Invalid login from a token won't display this message but the user will see a login form
     */
    if ($this-&gt;inputs) {
        $this-&gt;content['error'] = 'Either your username or password is incorrect. Please try again.';
        $this-&gt;content['captcha'] = true;
        $this-&gt;content['username'] = is_array($this-&gt;results['username']) ? $this-&gt;results['username']['value'] : $this-&gt;results['username'];
        $this-&gt;content['password'] = is_array($this-&gt;results['password']) ? $this-&gt;results['password']['value'] : $this-&gt;results['password'];
    }
}
</pre>
<p>This code checks for either form values ($this-&gt;inputs) or values from a persistent login cookie ($this-&gt;identifier and $this-&gt;token). When the persistent login cookie values are present, the checkPersistentLogin() method is called. It throws an InvalidArgumentException if the user&#8217;s details cannot be verified, which is caught by the catch block at the end of the example above. If the user is verified, a result array is returned and a new persistent login cookie is set. The SQL to check the persistent login is as follows:</p>
<pre class="brush: sql; title: ; notranslate">

SELECT Organisation, ContactName, Email
FROM users
WHERE Identifier = :identifier
AND Token = :token
AND Timeout &gt; DATE_SUB(NOW(), INTERVAL 7 DAY)
</pre>
<h2>Logging a User Out</h2>
<p>When logging a user out it&#8217;s necessary to delete the persistent login cookie to make sure that they really are logged out. This is done in the following code:</p>
<pre class="brush: php; title: ; notranslate">

if (isset($_COOKIE['auth'])) {
 setcookie('auth', false, time() + 60 * 60 * 24 * 7, '/', '', true);
 }
</pre>
<p>The PHP manual states that:</p>
<blockquote><p>If the value argument is an empty string, or <strong><tt>FALSE</tt></strong>, and all other arguments match a previous call to setcookie, then the cookie with the specified name will be deleted from the remote client. This is internally achieved by setting value to &#8216;deleted&#8217; and expiration time to one year in past.</p></blockquote>
<p>By passing the same arguments to setcookie() but with the value set to false, this code makes sure that the persistent login cookie is deleted.</p>
<p>I&#8217;m fairly confident that this implementation of a persistent login is as secure as I can make it, but it can never be as secure as requiring a manual login from a user on each visit. In deciding that a persistent login mechanism was appropriate for this application I considered the user base and how they would likely be using the application. While a persistent login was appropriate for this application, it may well not be for others. I&#8217;d appreciate any comments or ideas anyone may have on how I&#8217;ve done this or on the benefits and risks of persistent login.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeremycook.ca/2010/03/28/creating-a-persistent-login-mechanism/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Handling Binary Data with PDO</title>
		<link>http://jeremycook.ca/2010/02/21/handling-binary-data-with-pdo/</link>
		<comments>http://jeremycook.ca/2010/02/21/handling-binary-data-with-pdo/#comments</comments>
		<pubDate>Sun, 21 Feb 2010 17:27:11 +0000</pubDate>
		<dc:creator>Jeremy Cook</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[Web Development]]></category>
		<category><![CDATA[PDO]]></category>

		<guid isPermaLink="false">http://jeremycook.ca/?p=64</guid>
		<description><![CDATA[This post looks at an issue with handling binary data from a database using PDO and a partial workaround for the problem.]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m a great fan of the <a href="http://www.php.net/pdo">PDO</a> database access library in PHP 5 and use it for all of my database work in PHP. I love its clean, object oriented syntax and great support for prepared statements. I also like the fact that it supports most of the common database engines. Although all of my dev work in PHP so far has been with MySQL, I like the fact that if I needed to use MS SQL Server, Oracle or any of the other big RDBMSs I could use the same PDO syntax to access them rather than learning a new database access library. However, there do seem to be some bugs in PDO according to what I&#8217;ve read on the web. While I haven&#8217;t encountered most of them and can&#8217;t comment on them, I&#8217;d like to write about one that I ran into the other day and how I worked around it.</p>
<p>I have a project that I&#8217;m working on where I&#8217;m storing some images in a database as binary data. PDO allows you to bind a file handle to a parameter in a prepared statement and when the statement is executed the contents of the file are slurped into the database. This works perfectly but the problem comes when getting the image out of the database again to display it. According to the <a href="http://www.php.net/manual/en/pdo.lobs.php">PHP manual</a> the following code should work:</p>
<pre class="brush: php; title: ; notranslate">

&lt;?php
$db = new PDO('odbc:SAMPLE', 'db2inst1', 'ibmdb2');
$stmt = $db-&gt;prepare(&quot;select contenttype, imagedata from images where id=?&quot;);
$stmt-&gt;execute(array($_GET['id']));
$stmt-&gt;bindColumn(1, $type, PDO::PARAM_STR, 256);
$stmt-&gt;bindColumn(2, $lob, PDO::PARAM_LOB);
$stmt-&gt;fetch(PDO::FETCH_BOUND);

header(&quot;Content-Type: $type&quot;);
fpassthru($lob);
?&gt;
</pre>
<p>Binding a column from a result set to a variable using PDO::PARAM_LOB is supposed to return a stream resource into the variable when PDO::fetch() is called. This stream can then be operated on using any PHP function that handles files. Unfortunately there&#8217;s a <a href="http://bugs.php.net/bug.php?id=40913">bug</a> which means that instead of returning a stream into $lob PDO returns a string containing the binary data. When this is then passed to fpassthru() an error is triggered. Fortunately there&#8217;s a simple fix for displaying the image: replace the call to fpassthru() with echo or print. Since the browser is expecting an image after the call to header() writing the binary data using echo or print has the same effect as calling fpassthru(). In my code I&#8217;ve added the following just in case this bug is fixed in a future release:</p>
<pre class="brush: php; title: ; notranslate">

if (is_string($lob)) {

echo $lob;

} else {

fpassthru($lob);

}
</pre>
<p>This neatly gets around the problem if you just want to send the binary data back to the browser to be displayed. Anything more, such as using file functions or image editing functions on the data, would need quite a few contortions in the code. The information from the database would probably need to be written to a temporary file to allow it to be operated on. This bug was first reported almost three years ago in PHP 5.2.6 and it&#8217;s still not fixed today in the most recent version, 5.3.1. It would be great if this bug were finally taken care of.</p>
<p><strong>Edit:</strong> Joshua Johnston has posted a comment below that explains how to convert a string of data into a stream using the data stream wrapper. I&#8217;ve tried it out and it works very well. I think it gives a cleaner solution to the problem and allows the data returned from the database to be manipulated with file functions.</p>
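<p>As a rough sketch of the data stream wrapper idea (my own illustration, not Joshua&#8217;s exact code), a string can be turned into a stream by base64 encoding it into a data: URI and opening that with fopen(). The function name stringToStream() and the sample bytes are made up for this example:</p>
<pre class="brush: php; title: ; notranslate">

&lt;?php
// Hypothetical helper: wrap a binary string in a read-only stream
// using PHP's data: stream wrapper (RFC 2397).
function stringToStream($data)
{
    // Base64 keeps arbitrary binary bytes intact inside the URI.
    $uri = 'data://application/octet-stream;base64,' . base64_encode($data);
    $stream = fopen($uri, 'rb');
    if ($stream === false) {
        throw new RuntimeException('Could not open a data stream');
    }
    return $stream;
}

// $lob stands in for the string PDO returned instead of a stream.
$lob = chr(0) . chr(255) . 'image bytes';
$stream = is_string($lob) ? stringToStream($lob) : $lob;

// The stream can now be read with the ordinary file functions.
$copy = stream_get_contents($stream);
fclose($stream);
</pre>
<p>With the value normalised to a stream like this, the original call to fpassthru() can stay in place whether or not the bug is ever fixed.</p>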
]]></content:encoded>
			<wfw:commentRss>http://jeremycook.ca/2010/02/21/handling-binary-data-with-pdo/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Location Aware Webpages</title>
		<link>http://jeremycook.ca/2010/02/13/location-aware-webpages/</link>
		<comments>http://jeremycook.ca/2010/02/13/location-aware-webpages/#comments</comments>
		<pubDate>Sat, 13 Feb 2010 20:55:55 +0000</pubDate>
		<dc:creator>Jeremy Cook</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[Web Development]]></category>
		<category><![CDATA[Web Services]]></category>

		<guid isPermaLink="false">http://jeremycook.ca/?p=54</guid>
		<description><![CDATA[This post discusses how I implemented a solution for a client which shows which of the client's stores a user visiting a website is geographically close to.]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m working on a site for a client at the moment who has around 35 store locations across Canada. The client was keen to have a &#8216;your local store&#8217; feature on the home page where the store closest to the visitor&#8217;s location was featured. I thought I&#8217;d write briefly about this, the solution I came up with to address this problem, and some of its limitations.</p>
<p>To display the local store information I needed to go through a number of distinct steps in the code.</p>
<ol>
<li>Determine the visitor&#8217;s location from their IP address.</li>
<li>If the visitor is in Canada, calculate the distance to their closest store.</li>
<li>If they&#8217;re within 100km of a store, display information on that store on the homepage.</li>
<li>If the user is not in Canada, is more than 100km away from a store or if there is any kind of error, display a generic message about store locations.</li>
</ol>
<h2>Getting the User&#8217;s Location</h2>
<p>Determining a visitor&#8217;s location from their IP address can be done through a number of GeoIP services, some available for free and some paid for. Ideally I wanted to use a PECL <a href="http://www.php.net/manual/en/book.geoip.php">PHP extension</a> to do the location lookup but this was not possible as I could not persuade my web host to install this. I fell back to using a free web service provided by <a href="http://ipinfodb.com/">ipinfodb</a> for the lookup. I found a class on PHPClasses that used this web service, which I substantially rewrote to serve my purposes. The code that performs the lookup in the class is below:</p>
<pre class="brush: php; title: ; notranslate">

$strAPIURL = $this-&gt;apiUrl . &quot;?ip=&quot; . urlencode ($this-&gt;remoteAddress);
 //Make a call to the api and fetch the result.
 $ch        = curl_init ($strAPIURL);
 curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
 $xmlResult = curl_exec ($ch);
 curl_close ($ch);

 if ($xmlResult &amp;&amp; strlen ($xmlResult) &gt; 2) {
 //Process the result
 $result = simplexml_load_string ($xmlResult);

 if ((string)$result-&gt;Status == 'OK') {
 $this-&gt;countryName   = (string)$result-&gt;CountryName;
 $this-&gt;countryCode   = (string)$result-&gt;CountryCode;
 $this-&gt;cityName      = (string)$result-&gt;City;
 $this-&gt;zipPostalCode = (string)$result-&gt;ZipPostalCode;
 $this-&gt;regionName    = (string)$result-&gt;RegionName;
 $this-&gt;regionCode    = (string)$result-&gt;RegionCode;
 $this-&gt;timezone      = (int)$result-&gt;Timezone;
 $this-&gt;gmtOffset     = (int)$result-&gt;Gmtoffset;
 $this-&gt;dstOffset     = (int)$result-&gt;Dstoffset;
 $this-&gt;lat           = (float)$result-&gt;Latitude;
 $this-&gt;long          = (float)$result-&gt;Longitude;
 //Set the ipSniffed flag to true
 $this-&gt;ipSniffed     = true;
 }
 }
</pre>
<p>Once the call to the web service has completed, the information returned is stored in public properties and a flag is set to show that a successful lookup was performed. My class uses filter_var() to make sure that it is passed a valid IP address that is not in a private range, throwing an InvalidArgumentException if not. Of course, if an exception is thrown I fall back to my default position of displaying a generic message about store locations.</p>
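<p>The validation step could look something like the sketch below. This is an illustration rather than the code from my class: the function name assertPublicIp() is made up, and the FILTER_FLAG_NO_RES_RANGE flag (which also rejects reserved ranges such as loopback addresses) is an extra assumption on top of the private range check described above.</p>
<pre class="brush: php; title: ; notranslate">

&lt;?php
// Hypothetical helper: accept only public IP addresses.
function assertPublicIp($ip)
{
    // filter_var() returns the address on success and false on failure.
    $valid = filter_var(
        $ip,
        FILTER_VALIDATE_IP,
        FILTER_FLAG_NO_PRIV_RANGE | FILTER_FLAG_NO_RES_RANGE
    );
    if ($valid === false) {
        throw new InvalidArgumentException('Not a public IP address');
    }
    return $valid;
}

try {
    // 192.168.x.x is a private range, so this throws.
    $ip = assertPublicIp('192.168.1.10');
} catch (InvalidArgumentException $e) {
    // Fall back to the generic store locations message.
    $ip = null;
}
</pre>
<p>In a real request the address would typically come from $_SERVER['REMOTE_ADDR'] rather than a hard-coded string.</p>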
<h2>Finding the Closest Store</h2>
<p>Once I have the user&#8217;s location from their IP address and I&#8217;ve determined that they&#8217;re visiting from Canada I proceed to calculate the distance to their closest store. This is done with a database query. Information about all of the stores is stored in a database table along with the latitude and longitude of each store. Using the latitude and longitude returned from the web service I can then calculate the user&#8217;s distance to the nearest store. This is done in the following code:</p>
<pre class="brush: php; title: ; notranslate">

protected function getClosestStore ($lat, $long) {
 $sql = &lt;&lt;&lt; _SQL_

SELECT StoreId, CONCAT_WS(', ', Address, City, CONCAT(Province, ' ', PostCode)) AS Address, Lat, Lon, ROUND(6371 * 2 * ASIN(SQRT(POWER(SIN((:lat - abs(lat)) * pi()/180 / 2), 2) +
COS(:lat * pi()/180 ) * COS(abs(lat) * pi()/180) *  POWER(SIN((:lon - lon) * pi()/180 / 2), 2) )),2)
AS Distance
FROM Stores
HAVING Distance &lt;= 100
ORDER BY Distance Limit 1
_SQL_;
 try {
 $db = db::getConn();
 $st = $db-&gt;prepare($sql);
 $st-&gt;execute(array(':lat' =&gt; $lat, ':lon' =&gt; $long));
 $this-&gt;storeInfo = $st-&gt;fetch(PDO::FETCH_ASSOC);
 }
 catch (PDOException $e) {
 error_log($e-&gt;getMessage());
 }
 }
</pre>
<p>The PHP in this code simply connects to the database (db::getConn() is a static method that returns a singleton instance of a PDO object), performs the query and stores the result in the storeInfo property. If no result is returned from the database the fetch() method will return false, which then tells me that a user is not within 100 km of a store. The real meat of this code is in the SQL query. It selects some information about each store and then uses some math to calculate the distance from the visitor to that store. This is done using the Haversine formula, which calculates the distance between two sets of latitude and longitude, taking into account the curvature of the earth. I&#8217;m not going to try to explain the math (mostly because I don&#8217;t fully understand it myself!) but there is a <a href="http://en.wikipedia.org/wiki/Haversine_formula">Wikipedia article</a> on the formula for anyone who is interested. The query then limits the results to stores that are within 100km, orders the results by distance to make sure the closest one is listed first and then returns the first result. If a result is returned I then record the StoreId of the closest store in a cookie, which is then used when the visitor visits the site again. This is to cut down on processing time for subsequent requests for the home page and means that I won&#8217;t need to hit the ipinfodb web service on every request for the index page of the site.</p>
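<p>For anyone who finds the formula easier to follow outside of SQL, here is a minimal PHP sketch of the same Haversine calculation. The function name haversineKm() is made up for this illustration. It uses the same mean earth radius of 6371 km as the query and takes all coordinates in degrees:</p>
<pre class="brush: php; title: ; notranslate">

&lt;?php
// Great-circle distance in kilometres between two points given in degrees.
function haversineKm($lat1, $lon1, $lat2, $lon2)
{
    $earthRadiusKm = 6371; // same mean radius as in the SQL query
    $dLat = deg2rad($lat2 - $lat1);
    $dLon = deg2rad($lon2 - $lon1);
    // Haversine of the central angle between the two points.
    $a = pow(sin($dLat / 2), 2)
        + cos(deg2rad($lat1)) * cos(deg2rad($lat2)) * pow(sin($dLon / 2), 2);
    // min() guards against rounding pushing the value just above 1.
    return round($earthRadiusKm * 2 * asin(min(1, sqrt($a))), 2);
}

// One degree of latitude along a meridian is roughly 111 km.
$distance = haversineKm(0, 0, 1, 0);
</pre>
<p>Doing the calculation in the database is still the better choice here, since it avoids pulling every store row into PHP just to sort them.</p>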
<h2>Limitations of this Solution</h2>
<p>There are three major limitations to this solution that I can see: the accuracy of GeoIP services, the database query and saving the local store information in a cookie.</p>
<p>GeoIP services claim about 80% accuracy when tracking the geographic location of an IP address down to the city level (accuracy is about 95% for finding the country a user is visiting from). For example, I am writing this in Guelph, Ontario but the IP address I am connected to the internet with resolves geographically to the nearby city of Kitchener. This is not a major problem for my application as the client only has 35 stores across Canada and I am just trying to find the closest one to a user. Given the small number of stores, chances are that I will hit on the closest one, even allowing for only 80% accuracy in GeoIP tracking. For other applications this could be a problem. The W3C has a draft <a href="http://dev.w3.org/geo/api/spec-source.html">Geolocation API</a> which some browsers (such as Firefox) are starting to implement. Using this (perhaps with some AJAX) could help to get a more accurate fix on some users&#8217; locations but would open up some privacy concerns. For this application I display links that allow a user to manually select their closest store if the GeoIP tracking is wrong, hopefully alleviating any problems caused by the 80% accuracy.</p>
<p>The SQL query as I&#8217;ve written it would have performance issues if a large number of locations were being stored. To find the closest store the query has to calculate the distance to every store held in the database, only then narrowing it down to the closest. This is not a problem currently as the client only has 35 locations, but if hundreds or thousands of locations were being stored the query would be extremely inefficient and take a long time to execute. In the course of my research I did read something about limiting the search to a radius (100 km in this case). This would involve several queries and probably a stored procedure to carry them out in. As I was dealing with a small number of locations I decided against this for simplicity. For another solution involving more locations I would probably go with this approach.</p>
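<p>One common way to cut down the number of rows before the distance calculation runs is a bounding box. This is a different technique from the multi-query approach I read about, and I haven&#8217;t used it on this project. The idea is to work out the range of latitudes and longitudes that surrounds the search radius, so that a simple WHERE clause on Lat and Lon can discard most stores first. A minimal sketch, where the function name boundingBox() and the array keys are made up for illustration:</p>
<pre class="brush: php; title: ; notranslate">

&lt;?php
// Hypothetical helper: latitude/longitude limits of a box drawn around
// a circle of $radiusKm centred on ($lat, $lon), all in degrees.
// Not valid near the poles or across the 180th meridian.
function boundingBox($lat, $lon, $radiusKm)
{
    $earthRadiusKm = 6371;
    $angle = $radiusKm / $earthRadiusKm; // angular radius in radians
    $latDelta = rad2deg($angle);
    // Lines of longitude converge towards the poles, so the box has to
    // span more degrees of longitude than of latitude.
    $lonDelta = rad2deg(asin(sin($angle) / cos(deg2rad($lat))));
    return array(
        'minLat' =&gt; $lat - $latDelta,
        'maxLat' =&gt; $lat + $latDelta,
        'minLon' =&gt; $lon - $lonDelta,
        'maxLon' =&gt; $lon + $lonDelta,
    );
}

// Box around a 100 km radius in southern Ontario.
$box = boundingBox(43.55, -80.25, 100);
</pre>
<p>The four values could then be bound to conditions such as Lat BETWEEN :minLat AND :maxLat. The Haversine calculation is still needed for the rows that pass, because the corners of the box lie further away than the radius.</p>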
<p>Saving the information on the user&#8217;s local store in a cookie makes the application more efficient but it potentially slightly degrades the user experience. In a scenario where a user travels and expects to see information on the closest store to where they currently are, this would not work. I made a judgement call here and decided that it was better to go with the more efficient approach for this client, but in other cases this may not be so. Of course, if I could install the PECL GeoIP extension I wouldn&#8217;t need to store the cookie. Looking up a user&#8217;s location using this extension would be no more complicated than calling some PHP functions. The GeoIP information would be stored locally as part of the extension. This would be quicker than creating a call to and processing a response from a web service. Unfortunately this was not an option for me on this occasion. This problem can be overcome by the user manually selecting their closest store, overriding the mechanism I programmed.</p>
<p>All in all I am fairly happy with the solution I arrived at for this problem. It enables me to tailor content on the page based on where a user is. My example is a fairly simple one but it would be possible to do far more sophisticated things using GeoIP, such as automatically displaying content in different languages depending on which country a visitor is coming from or displaying prices in different currencies. Because GeoIP is less than 100% accurate, though, mechanisms would always need to be provided to allow a user to override the conclusions reached by the code.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeremycook.ca/2010/02/13/location-aware-webpages/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

