Google Custom Search Engine

I recently published my first class on PHPClasses and thought I would write a brief post about it here. The class allows a developer to send a query to a Google Site Search custom search engine, using the XML API.

Google Site Search

Google Site Search can be used to create a customised search engine for your website. At the time of writing, pricing starts at $100 per year for a website of up to 1,000 pages, which allows 250,000 searches per year. The search engine can be queried in a number of ways, one of which is a REST web service. The service seems ideal for small to medium websites. Larger websites would probably benefit from using software such as Solr or Lucene to build a search index on the site’s own web server, which may well provide a more customisable solution. For sites where it’s not possible to create such an index, or where the owner doesn’t want to go to the effort, Google Site Search is a good fit.
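To give a rough idea of what a request to the web service looks like, here is a minimal sketch that builds a query URL. The endpoint, the `client` and `output` values, the `ie`/`oe` encoding parameters and the function name `buildSearchUrl` are assumptions made for illustration (and PHP 7 or later is assumed for the type declarations). Check them against Google's documentation before relying on them. This is not the code from the class itself.

```php
<?php
// Hypothetical helper: builds a request URL for the XML API.
// The endpoint and the fixed parameter values below are assumptions.
function buildSearchUrl(string $cseKey, string $query, array $opts = [], string $charset = 'iso-8859-1'): string
{
    // Required parameters are merged last so that they override
    // anything with the same key passed in $opts.
    $params = array_merge($opts, [
        'cx'     => $cseKey,       // ID of the custom search engine
        'client' => 'google-csbe',
        'output' => 'xml_no_dtd',  // ask for XML results
        'q'      => $query,        // the search text
        'ie'     => $charset,      // encoding of the query
        'oe'     => $charset,      // encoding of the results
    ]);

    // http_build_query() URL-encodes every key and value.
    return 'http://www.google.com/cse?' . http_build_query($params, '', '&');
}

echo buildSearchUrl('CSE Key Here', 'some search text', ['num' => 7]), PHP_EOL;
```

Any extra key => value options are simply added to the query string, so a single function can cover both simple and customised searches.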

The Google Custom Search Class

The class I’ve written can be downloaded from PHPClasses along with a file giving some examples of its use. The class can use cURL or the PHP HTTP stream wrapper to perform the web service request, with the constructor deciding which method to use based on what the server has installed.

The response from Google, consisting of search results and optional spelling suggestions, is processed using SimpleXML into arrays which are set as properties of the object. The search results XML contains the results as HTML, and the class allows a user to specify how this should be handled. The user can ask for the results to be made available as plain text (no HTML and no entities encoded), as plain text with all HTML special characters entity encoded, or with the HTML that Google returns left in. The user can specify which character encoding a search query uses (the default is ISO-8859-1), and results are returned in the same encoding.

The Google API also allows many parameters to be set in the request to the service (see http://www.google.com/cse/docs/resultsxml.html#wsRequestParameters for full documentation). The class allows an array of key => value pairs to be passed, which are then set as query string parameters, allowing the search to be further customised. Finally, any errors generated in the process cause a RuntimeException to be thrown, which has to be caught outside of the object. Below are some examples of using the class.


try {
    // A simple example, default parameters in constructor and no extra parameters set.
    $search = new GoogleCustomSearch('CSE Key Here');
    $search->query('some search text');
    // The results property contains the search results,
    // spellingSuggestions contains any spelling suggestions from Google.
    var_dump($search->results);
    var_dump($search->spellingSuggestions);

    // Example with results returned with HTML from Google left in and character encoding
    // set to UTF-16. Result text will also be encoded in UTF-16.
    // NOTE: When using a custom character encoding the search query must also be encoded
    // using this character set. The object defaults to character encoding in iso-8859-1
    // and assumes that queries passed to it are also encoded in this character set.
    $search = new GoogleCustomSearch('CSE Key Here', 'utf-16', GoogleCustomSearch::HTML);
    $search->query('some search text');
    var_dump($search->results);
    var_dump($search->spellingSuggestions);

    // Use the same object to perform an extra search, but with some different parameters.
    $search->charEncoding = 'iso-8859-1';
    // Strips HTML but leaves HTML special characters entity encoded.
    $search->processText = GoogleCustomSearch::ENTITIES_ENCODED;
    $search->query('another query');
    var_dump($search->results);
    var_dump($search->spellingSuggestions);

    // Example using an array to set extra, custom options in the query string used to
    // send a request to Google.
    // See http://www.google.com/cse/docs/resultsxml.html#WebSearch_Query_Parameter_Definitions
    // for a full list of query parameter options that may be passed in the request.
    $opts = array(
        'lr' => 'lang_fr',   // Request search results only for French language pages.
        'cr' => 'countryCA', // Request search results only for a particular country, in this case Canada.
        'num' => 7           // Limit the search to a maximum of 7 results.
    );
    // Create an object using default search options.
    $search = new GoogleCustomSearch('CSE Key Here');
    // Set the custom options for the search.
    // These can also be set by passing the $opts array as the final argument in the object constructor.
    $search->opts = $opts;
    $search->query('yet another query');
    var_dump($search->results);
    var_dump($search->spellingSuggestions);
} catch (RuntimeException $e) {
    // Handle any exceptions raised by the object here.
    echo $e->getMessage();
}
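
For anyone curious about what happens behind those calls, the transport choice and the three text-handling modes described earlier can be sketched roughly as follows. This is a simplified illustration rather than the code from the class. The function names, the `MODE_*` constants and the cut-down sample XML (a `GSP` root with `RES`, `R`, `U` and `T` elements) are assumptions, and PHP 7 or later is assumed for the type declarations.

```php
<?php
// Hypothetical mode names; the real class exposes its own constants.
const MODE_HTML     = 'html';     // leave Google's HTML in place
const MODE_PLAIN    = 'plain';    // no HTML, no entities
const MODE_ENTITIES = 'entities'; // no HTML, special characters entity encoded

// Decide how the HTTP request could be made on this server.
function pickTransport(): string
{
    if (function_exists('curl_init')) {
        return 'curl';
    }
    if (filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN)) {
        return 'stream';
    }
    throw new RuntimeException('No HTTP transport available');
}

// Apply one of the three text-handling modes to a title or snippet.
function formatText(string $raw, string $mode, string $charset = 'UTF-8'): string
{
    if ($mode === MODE_HTML) {
        return $raw;
    }
    // Remove tags first, then turn entities back into characters.
    $plain = html_entity_decode(strip_tags($raw), ENT_QUOTES, $charset);

    return $mode === MODE_ENTITIES
        ? htmlspecialchars($plain, ENT_QUOTES, $charset)
        : $plain;
}

// Turn the response XML into an array of results.
function parseResults(string $xml, string $mode): array
{
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        throw new RuntimeException('Could not parse response XML');
    }
    if (!isset($doc->RES->R)) {
        return []; // no results in the response
    }
    $results = [];
    foreach ($doc->RES->R as $r) {
        $results[] = [
            'url'   => (string) $r->U,
            'title' => formatText((string) $r->T, $mode),
        ];
    }
    return $results;
}

// A made-up, cut-down response used only to exercise the functions.
$sample = '<GSP><RES><R N="1"><U>http://example.com/</U>'
        . '<T>PHP &lt;b&gt;Classes&lt;/b&gt; &amp;amp; more</T></R></RES></GSP>';
print_r(parseResults($sample, MODE_PLAIN));
```

Keeping the text handling in one small function means the same mode can be applied to titles, snippets and spelling suggestions alike.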

I hope somebody finds the class useful. It’s still a version 1 release and I can see two major ways that it could be improved. Firstly, the handling of error responses from the API is a little basic. I need to do some more research into the error response codes that the API generates and use that to make the error messages set in any RuntimeException a little more specific. Secondly, the response XML contains much more information than I am currently processing. I could make much more of it available as properties of the object, and/or make the XML returned from the service available as a property of the object so that a user can process it in any way they wish. If anyone has any problems, comments or suggestions, post them below and I’ll get back to you.

5 thoughts on “Google Custom Search Engine”

  1. Hi Jeremy,

    You mention that the Google API key has a cost. Have you done anything with the Google AJAX Search (which appears to be free)?

    1. Hi Steve.

      No, I haven’t done anything with the AJAX API but would be interested in looking into it. Do you know of any resources that illustrate how to use it? Does the AJAX API search the main Google index? One of the advantages of the Custom Search Engine API is that Google builds a customised index of the site. Still, I believe in using the right tools for the job and for another client that may well be the AJAX API.

    1. Unfortunately no I don’t know. Is it working now? Was your account activated and had the site been crawled at the time? Let me know and if you’re still having problems I’ll give it some more thought.

  2. Please, I have two questions:
    1) How can I group the results of my search by the specific URL of specific pages?
    2) How can I use your class to display images as results of my search?

    Thanks
