
Thread: GEO Targeting Made EASY! (For PHP)

  1. #1
    Site Contributor KLB's Avatar
    Join Date
    Feb 2006
    Location
    Saco Maine
    Posts
    1,181

    GEO Targeting Made EASY! (For PHP)

    Okay, I've seen many threads from people wanting an easy way to implement geo targeting on their websites for things like Yahoo. Last night I implemented it on my site, and I thought I'd share my method with everyone else. The strengths of my method are that it does not require a database server (it works from a CSV file), it only requires adding one function to your site, AND it is FREE.

    Step 1) download a current IP "database" from http://software77.net/cgi-bin/ip-country/geo-ip.pl. There is a link to download the database part way down the right hand side of the page.

    Step 2) extract the CSV file to a folder on your website.

    Step 3) add the following function to your website scripts (I placed it in my "functions.inc" include). I should note that this function is a boiled down version of the script found at http://webnet77.com/scripts/geo-ip/index.html:
    Code:
    	function ipcountrycode($ip){
    		// Convert the dotted IP to an unsigned decimal string
    		$ip=sprintf("%u", ip2long($ip));
    
    		// Open the csv file for reading
    		$csvfilename="ip2country.csv"; // Change to proper filename and path for IP dataset.
    		$fp = fopen($csvfilename, "r");
    		if ($fp === false) {
    			return "";
    			}
    
    		// Search bounds are byte offsets into the file, not row numbers
    		$low = 0;
    		fseek($fp, 0, SEEK_END);
    		$high = ftell($fp);
    
    		$code = ""; // returned unchanged when no range matches
    		while ($low <= $high) {
    			$mid = (int) floor(($low + $high) / 2);
    			
    			// Seek to the midpoint of the current byte range
    			fseek($fp, $mid);
    			
    			// Discard the partial line we landed in
    			if($mid != 0){
    				fgets($fp);
    				}
    
    			// Read the next complete line
    			$ipdata=fgetcsv($fp,100);
    
    			if ($ipdata === false || $ip < $ipdata[0]){
    				// Ran past the end of the file, or the IP is below this range
    				$high = $mid - 1;
    				}
    			elseif(isset($ipdata[4]) && $ip <= $ipdata[1]){
    				// The IP falls inside this range
    				$code = $ipdata[4];
    				break;
    				}
    			else {
    				$low = $mid + 1;
    				}
    			}
    		fclose($fp);
    		return $code;
    		}
    Step 4) Call the above function from the top of the script and store the user's two-letter country code in a string that can be referenced where needed:
    Code:
    	$strCountryCode=ipcountrycode($_SERVER['REMOTE_ADDR']);
    The ipcountrycode function should really only be called once per page load, with the result stored in a string variable for use throughout the script, to reduce server overhead. To reduce server load further, the aforementioned function call can be replaced with the following code, which stores the country code in a cookie so that it only needs to be looked up once every seven days for users who allow cookies:
    Code:
    	// Returns User's country code
    	//+++++++++++++++++++++++++++++
    	// The cookie is user-controlled, so only accept exactly two capital letters
    	if(isset($_COOKIE['GeoLocation']) && preg_match('/^[A-Z]{2}$/', $_COOKIE['GeoLocation'])){ 
    		$strCountryCode=$_COOKIE['GeoLocation'];
    		}
    	else{
    		$strCountryCode=ipcountrycode($_SERVER['REMOTE_ADDR']);
    
    		//	Sets cookie for geo location.
    		$HostDomain=str_replace("http://","",$_SERVER['HTTP_HOST']);
    		$SetCookieExpire=time() +604800; //Set cookie for one week
    		setcookie ("GeoLocation", $strCountryCode, $SetCookieExpire, "/", $HostDomain, false) ;
    		}

    There you have it, a very simple and very clean way to get the country code of your users to do things like geo target ads. Oh make sure to periodically download updates to the IP CSV file so that your targeting remains as accurate as possible.

    ==EDIT==
    2007-03-20: Per rpanella's observations and critique, I replaced the function that read the entire file into memory with one that does a binary search on the file without reading all of it into memory. See post #14.

    2007-03-23: Fixed some errors that sometimes caused country codes not to be returned. See post #23.
    Last edited by KLB; 03-23-2007 at 09:57 AM.
    Ken Barbalace - EnvironmentalChemistry.com (Environmental Careers, Blog)
    InternetSAR.org: Volunteers Assisting Search and Rescue via the Internet
    My Firefox Theme Classic Compact: Based on Firefox's classic theme but uses much less window space

  2. #2
    Registered
    Join Date
    Jul 2005
    Posts
    33
    Thanks dude, such easy setup rocks. Just curious if it will be good enough for sites with some real traffic.

    Personally I've been using the geoip PHP extension for ages; it's as easy as calling geoip_record_by_name($_SERVER['REMOTE_ADDR']); to get everything. Bit geeky to set up tho'

  3. #3
    Site Contributor KLB's Avatar
    Join Date
    Feb 2006
    Location
    Saco Maine
    Posts
    1,181
    Well my site gets around 20,000 page views per day and so far today it has gotten over 10,000 according to Google AdSense and I'm not seeing any problems.

    Although the data table has some 150,000 rows, whoever created the original record search routine was sharp enough to use a bisection search, which allows the script to look at only a tiny fraction of the actual records.

    Think of it this way: the script takes the middle record of the dataset and decides if the IP address is higher or lower than that record. If it is lower, it takes the record at the halfway point of the lower subset of records and again makes this determination. This continues until it gets to the correct record. This is one of the most efficient methods of finding a matching record, and it means the script only ever needs to look at a very small fraction of the total records.

    Here is what practically happens on a dataset of 100,000 records:
    Check #1) 50,000 records eliminated. 50% remaining
    Check #2) 25,000 records eliminated. 25% remaining
    Check #3) 12,500 records eliminated. 12.5% remaining
    Check #4) 6,250 records eliminated. 6.25% remaining
    Check #5) 3,125 records eliminated. 3.13% remaining
    Check #6) 1,562 records eliminated. 1.56% remaining
    Check #7) 781 records eliminated. 0.78% remaining
    Check #8) 390 records eliminated. 0.39% remaining
    Check #9) 195 records eliminated. 0.20% remaining
    Check #10) 98 records eliminated. 0.10% remaining
    Check #11) 49 records eliminated. 0.05% remaining
    Check #12) 24 records eliminated. 0.02% remaining
    Check #13) 12 records eliminated. 0.01% remaining
    Check #14) 6 records eliminated. 0.006% remaining
    Check #15) 3 records eliminated. 0.003% remaining
    Check #16) 2 records eliminated. 0.002% remaining
    Check #17) 1 record eliminated; match made.

    As you can see it is very very efficient.
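
    The halving in the list above can be checked with a few lines of PHP. This is only a sketch of the arithmetic; the function name bisectionChecks is made up for illustration and is not part of the lookup script.

```php
<?php
// Count how many halvings it takes to narrow $records down to one candidate.
function bisectionChecks(int $records): int {
    $checks = 0;
    while ($records > 1) {
        $records = (int) ceil($records / 2); // worst case: the larger half remains
        $checks++;
    }
    return $checks;
}

echo bisectionChecks(100000), "\n"; // 17
echo bisectionChecks(150000), "\n"; // 18
```

    So even the full 150,000 row dataset needs only about 18 comparisons in the worst case.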
    Last edited by KLB; 03-19-2007 at 12:53 PM.

  4. #4
    Administrator Chris's Avatar
    Join Date
    Feb 2003
    Location
    East Lansing, MI USA
    Posts
    7,055
    Couldn't you just import the text file into MySQL and access it with one query?

    select location where ip = '{$_SERVER['REMOTE_ADDR']}';
    Chris Beasley - My Guide to Building a Successful Website

  5. #5
    Site Contributor KLB's Avatar
    Join Date
    Feb 2006
    Location
    Saco Maine
    Posts
    1,181
    I didn't want to import into MySQL for a couple of reasons. One of them was I wanted to be able to update the dataset by simply overwriting the data file without having to purge and reimport the data. Basically I wanted the ease of updating and I wanted to keep the processing load on my Webserver, not the database server. When I tested it on my own laptop it didn't seem to hit the processor all that hard. I'd actually expect as big a hit from MySQL given the 150,000 records involved.

    There isn't a record for every IP address, there are records for ranges and the IP addresses are stored as integers. So you would have to still convert 'REMOTE_ADDR' to an integer and then do a WHERE StartIP<=$REMOTE_ADDR AND EndIP>=$REMOTE_ADDR.
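
    For anyone who does go the MySQL route, here is a rough sketch of the conversion and the range query described above. StartIP and EndIP are the columns mentioned above; the table name ip2country, the column CountryCode, and both function names are assumptions made up for the example. It only builds the query and its parameters, so it can be handed to whatever database layer you use.

```php
<?php
// Convert a dotted IPv4 address to an unsigned decimal string.
// sprintf("%u") avoids negative values from ip2long() on 32-bit PHP.
function ipToUnsigned(string $ip): ?string {
    $long = ip2long($ip);
    return $long === false ? null : sprintf("%u", $long);
}

// Build a parameterised range query. "ip2country" and "CountryCode" are
// hypothetical names; StartIP and EndIP are the range columns.
function buildCountryQuery(string $ip): ?array {
    $n = ipToUnsigned($ip);
    if ($n === null) {
        return null; // not a valid IPv4 address
    }
    $sql = "SELECT CountryCode FROM ip2country "
         . "WHERE StartIP <= ? AND EndIP >= ? LIMIT 1";
    return array($sql, array($n, $n));
}

list($sql, $params) = buildCountryQuery("192.168.1.1");
echo $sql, "\n";
echo $params[0], "\n"; // 3232235777
```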

    As they say there are different ways to skin a cat and given my current web hosting setup, it is preferable to put the load on the web server than it is on the MySQL server.

  6. #6
    Registered
    Join Date
    Mar 2006
    Posts
    330
    What are you using it for Ken?

    To deliver a different page based on country or state, please explain?

  7. #7
    Site Contributor KLB's Avatar
    Join Date
    Feb 2006
    Location
    Saco Maine
    Posts
    1,181
    I'm using it mostly to geo target ads. For instance I'm turning off my career listings links for countries where they are not appropriate. Basically I figure I can save my users some bandwidth and time by not serving up ads that don't apply to them. This also reduces clutter from useless links for those users.

    Eventually I hope to pick up some country-specific banner ads for my top four countries. For instance, I could push out links to Canadian job listings to those in Canada instead of giving them links to U.S. job listings.

    For users of Yahoo's Publisher Network such geo targeting would allow them to turn off YPN ads to non-US users, which is necessary to stay in Yahoo's good graces.

    Basically the smarter one targets their ads, the more effective those ads become.
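
    As a rough illustration of the targeting described above, the country code can drive a simple decision about which ad blocks to render. The function name, the block names, and the country lists here are all made up for the example; adjust them to your own ads.

```php
<?php
// Decide which ad blocks to render for a visitor's two-letter country code.
// The block names and country rules are illustrative assumptions.
function adsForCountry(string $countryCode): array {
    $code = strtoupper($countryCode);
    $ads = array();
    if ($code === "US") {
        $ads[] = "ypn";      // YPN ads only for U.S. visitors
        $ads[] = "us_jobs";  // U.S. job listings
    } elseif ($code === "CA") {
        $ads[] = "ca_jobs";  // Canadian job listings for visitors in Canada
    }
    return $ads;             // empty array: serve no targeted ads
}

print_r(adsForCountry("ca"));
```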

  8. #8
    Registered
    Join Date
    Aug 2006
    Location
    Sacramento, CA
    Posts
    208
    What's the point of doing a binary search on it if you are already reading all the values into memory? The point of doing a binary search is so that you only have to look up (log n) values, but with this code you are still reading in the entire file, so you are not gaining any efficiency.

    I would recommend using Maxmind's free GeoIP database and their PHP class which would be much more efficient and extremely simple to implement: http://www.maxmind.com/app/php
    Last edited by rpanella; 03-17-2011 at 10:43 AM.

  9. #9
    Chronic Entrepreneur
    Join Date
    Nov 2003
    Location
    Tulsa, Oklahoma, USA
    Posts
    1,112
    The point is that you spend fewer cycles searching through the values once they are in memory. It's more efficient to load all sequential values into memory and then do a binary search for the proper value than it is to load all values into memory and then, say, do a sequential search for the proper value.

    I do agree with you about using Maxmind, though. It has the same benefits as KLB's (no database required, easy to implement, free, one file to replace to update the data). They release an updated IP file every month. It's pretty efficient, but if you're worried about using it for extremely high traffic applications it has a shared memory option you can invoke to cache the list in memory, and a mod_geoip apache module that you can install to make it run as a native binary (although doing it this way takes away the easy setup benefit).

  10. #10
    Site Contributor KLB's Avatar
    Join Date
    Feb 2006
    Location
    Saco Maine
    Posts
    1,181
    The disadvantage of GeoIP is that it is a module that must be installed, which isn't always an option on a shared hosting environment. I did at one point play with GeoIP and even have it installed on my laptop, but I did not like it. The method I've posted is the easiest I've found to implement and does not require one to "install" anything. It is also entirely free.

    Now granted, reading the entire file into memory isn't optimal, but the binary search does at least streamline the process.

    What would help is if one could do the binary search without loading the entire file into memory. The method above doesn't seem to cause problems with the 150,000 row dataset, but I wouldn't want to apply it to the IP dataset that was broken down to city level.

    By the way, I did update the code some to clear the buffer and close the handle, which on my laptop (which contains an offline version of my site) seemed to reduce processor load.

  11. #11
    Registered
    Join Date
    Aug 2006
    Location
    Sacramento, CA
    Posts
    208
    Quote Originally Posted by Westech View Post
    The point is that you spend fewer cycles searching through the values once they are in memory. It's more efficient to load all sequential values into memory and then do a binary search for the proper value than it is to load all values into memory and then, say, do a sequential search for the proper value.
    You have to load the entire file into memory first though, so it is still O(n) (the time to search grows linearly with the number of records). A true binary search, which is what is done by Maxmind's class or would be done with a MySQL index if you imported it into MySQL, only has to access about log2(n) records, so the time to search grows logarithmically. log2(150,000) is roughly 17, so that is about 17 accesses vs. all 150k with the code above.

    My point was that if you are reading the entire array into memory, you might as well check each value as you read it in and stop reading once you find it. Reading the whole thing sequentially into memory and then doing a binary search makes no sense.
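
    A minimal sketch of that "check as you read" idea, assuming the CSV is sorted by start IP and has the field order start, end, registry, assigned date, country code. The function name ipcountrycodeSequential and the sample rows are made up for illustration. It is still a linear scan in the worst case, but it never holds more than one row in memory and it stops as soon as the answer is known.

```php
<?php
// Scan the CSV from the top and stop at the first range that contains the IP.
// Assumed field order: start IP, end IP, registry, assigned date, country code, ...
function ipcountrycodeSequential(string $ip, string $csvfilename): string {
    $long = ip2long($ip);
    if ($long === false) {
        return "";                       // not a valid IPv4 address
    }
    $n = sprintf("%u", $long);           // unsigned decimal string
    $fp = fopen($csvfilename, "r");
    if ($fp === false) {
        return "";
    }
    $code = "";
    while (($row = fgetcsv($fp, 1000, ",", "\"", "\\")) !== false) {
        if (count($row) < 5 || !is_numeric($row[0])) {
            continue;                    // skip comment and blank lines
        }
        if ($n < $row[0]) {
            break;                       // file is sorted: nothing later can match
        }
        if ($n <= $row[1]) {
            $code = $row[4];
            break;
        }
    }
    fclose($fp);
    return $code;
}

// Two made-up sample rows (10.0.0.0/8 and 192.168.0.0/16) with fake codes.
$sampleFile = tempnam(sys_get_temp_dir(), "geo");
file_put_contents($sampleFile,
    "# sample data\n" .
    "\"167772160\",\"184549375\",\"iana\",\"0\",\"AA\",\"AAA\",\"Sample A\"\n" .
    "\"3232235520\",\"3232301055\",\"iana\",\"0\",\"BB\",\"BBB\",\"Sample B\"\n");

echo ipcountrycodeSequential("192.168.1.1", $sampleFile), "\n"; // BB
```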


    Quote Originally Posted by KLB View Post
    The disadvantage of GeoIP is that it is a module that must be installed, which isn't always an option on a shared hosting environment. I did at one point play with GeoIP and even have it installed on my laptop, but I did not like it. The method I've posted is the easiest I've found to implement and does not require one to "install" anything. It is also entirely free.
    The Maxmind database is also free, and installation only requires uploading their database file and a PHP file with the class, which you include in any script you need it in. Westech actually wrote an article on it on this site.
    Last edited by rpanella; 03-17-2011 at 10:44 AM.

  12. #12
    Site Contributor KLB's Avatar
    Join Date
    Feb 2006
    Location
    Saco Maine
    Posts
    1,181
    Quote Originally Posted by rpanella View Post
    The Maxmind database is also free, and installation only requires uploading their database file and a PHP file with the class, which you include in any script you need it in. Westech actually wrote an article on it on this site.
    Maxmind's code is a little complex and quite convoluted for my brain this late at night, but as I dig through the pure PHP include version of their code, it appears that it essentially opens up their file and then reads it into memory, like mine does. If it doesn't, then one should be able to do the binary search as part of the process of reading the file with a heck of a lot less code than GeoIP uses, as all we really need is to turn an IP address into a two-letter country code.

    I do not like putting really complicated code that I cannot easily follow into my site's scripts. If for no other reason, I want to be able to fix it if something breaks (e.g. a version change with PHP).

    I do follow your logic of not reading the entire file into memory, and maybe I can accomplish this with something along the lines of what I posted above and without all the clutter of GeoIP.

  13. #13
    Registered
    Join Date
    Aug 2006
    Location
    Sacramento, CA
    Posts
    208
    I have not gone through Maxmind's code line by line, but I know they use a binary format for their file so that each record is exactly the same size, and you can then calculate the position of any record in the file and read it in using fseek(). This means it only needs about 17 reads instead of loading the entire file into memory on each pageview.

    All you need to get the countrycode with Maxmind is the geoip.inc file and their database file. All the other files are only if you need cities, regions, and other more specific information.
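
    To show why same-size records make fseek() lookups easy, here is a small sketch using a made-up 10-byte record layout (start IP, end IP, two-letter code). This is NOT Maxmind's actual file format; the layout, the function names writeRecords and lookupFixedWidth, and the sample ranges are all assumptions for illustration. It also assumes 64-bit PHP, where ip2long() and unpack("N") return non-negative integers.

```php
<?php
// Hypothetical fixed-width layout (not MaxMind's real format): each record is
// 10 bytes: start IP (uint32, big-endian), end IP (uint32), 2-character code.
function writeRecords(string $path, array $ranges): void {
    $fp = fopen($path, "wb");
    foreach ($ranges as $r) {
        fwrite($fp, pack("NNa2", $r[0], $r[1], $r[2]));
    }
    fclose($fp);
}

function lookupFixedWidth(string $path, string $ip): string {
    $recordSize = 10;
    $n = ip2long($ip);                   // assumes 64-bit PHP (non-negative result)
    if ($n === false) {
        return "";
    }
    $fp = fopen($path, "rb");
    if ($fp === false) {
        return "";
    }
    $low = 0;
    $high = intdiv(fstat($fp)["size"], $recordSize) - 1; // index of last record
    $code = "";
    while ($low <= $high) {
        $mid = intdiv($low + $high, 2);
        fseek($fp, $mid * $recordSize);  // jump straight to record number $mid
        $rec = unpack("Nstart/Nend/a2code", fread($fp, $recordSize));
        if ($n < $rec["start"]) {
            $high = $mid - 1;
        } elseif ($n > $rec["end"]) {
            $low = $mid + 1;
        } else {
            $code = $rec["code"];
            break;
        }
    }
    fclose($fp);
    return $code;
}

// Made-up sample ranges: 10.0.0.0/8 and 192.168.0.0/16 with fake codes.
$path = tempnam(sys_get_temp_dir(), "geo");
writeRecords($path, [[167772160, 184549375, "AA"], [3232235520, 3232301055, "BB"]]);
echo lookupFixedWidth($path, "192.168.1.1"), "\n"; // BB
```

    Because every record starts at a multiple of the record size, the search works on record numbers rather than byte offsets, so there is no partial line to skip after each seek.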
    Last edited by rpanella; 03-17-2011 at 10:44 AM.

  14. #14
    Site Contributor KLB's Avatar
    Join Date
    Feb 2006
    Location
    Saco Maine
    Posts
    1,181
    Okay I didn't really figure out Maxmind's script (too much extraneous stuff), but when I went to PHP.net and looked up the function fseek() I found a function someone posted that I was able to merge with the function I originally posted above to create a binary search of the file without reading it into memory. Here it is for your inspection:

    Code:
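    	function ipcountrycode($ip){
    		// Convert to an unsigned decimal string (ip2long() can go negative on 32-bit PHP)
    		$ip=sprintf("%u", ip2long($ip));
    
    		// IP data set file name (modify to the correct filename)
    		$csvfilename="ip2country.csv";
    		// Open the csv file for reading
    		$fp = fopen($csvfilename, "r");
    		if ($fp === false) {
    			return "";
    			}
    
    		fseek($fp, 0, SEEK_END);
    		
    		// Bounds are byte offsets into the file, not row numbers
    		$low = 0;
    		$high = ftell($fp);
    		$code = ""; // returned unchanged when no range matches
    		while ($low <= $high) {
    			$mid = (int) floor(($low + $high) / 2);
    			
    			//Seek to half way through
    			fseek($fp, $mid);
    			
    			// Skip the partial line we landed in
    			if($mid != 0){
    				fgets($fp);
    				}
    
    			// Read the next complete line
    			$line=fgets($fp);
    			$ipdata = ($line === false) ? false : explode(",", str_replace("\"", "", $line));
    			if ($ipdata === false || $ip < $ipdata[0]){
    				// Past the end of the file, or the IP is below this range
    				$high = $mid - 1;
    				}
    			elseif(isset($ipdata[4]) && $ip <= $ipdata[1]){
    				$code = trim($ipdata[4]);
    				break;
    				}
    			else {
    				// Move past $mid; reusing $mid itself as a bound can loop forever
    				$low = $mid + 1;
    				}
    			}
    		fclose($fp);
    		return $code;
    		}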
    Last edited by KLB; 03-19-2007 at 11:44 PM.

  15. #15
    Site Contributor KLB's Avatar
    Join Date
    Feb 2006
    Location
    Saco Maine
    Posts
    1,181

    Important Notice!!!

    Okay I just got off the phone with my web hosting provider after they temporarily disabled my site this morning and learned some very interesting things from our experiments. Doing a binary search on the file instead of reading the entire file into memory and then doing a binary search in memory actually took much more processing power. Enough in fact for them to deem it necessary to disable my site.

    So, it would appear that I need to find a way to more efficiently read the file into memory and then crunch it there if I want to use a flat file.

    I'm going to post the original function that read the file into memory and see if any great brains here can help me make the script more efficient.

    Code:
    	function ipcountrycode($ip){
    
    		// Unsigned decimal string, so high IPs are not negative on 32-bit PHP
    		$ip=sprintf("%u", ip2long($ip));
    		// Open the csv file for reading
    		$csvfilename="ip2country.csv";
    		$handle = fopen($csvfilename, "r");
    		
    		// Load array with start ips
    		$array = array();
    		$row = 1;
    		while (($buffer = fgets($handle, 4096)) !== FALSE) {
    			// Start IP is the first field; strip the surrounding double quotes
    			$array[$row] = trim(substr($buffer, 0, strpos($buffer, ",")), "\"");
    			$row++;
    			}
    
    		// Locate the row with our ip using bisection
    		$row_lower = 0;
    		$row_upper = $row;
    		while (($row_upper - $row_lower) > 1) {
    			$row_midpt = (int) (($row_upper + $row_lower) / 2);
    			if ($ip >= $array[$row_midpt]) {
    				$row_lower = $row_midpt;
    				}
    			else {
    				$row_upper = $row_midpt;
    				}
    			}
    		// Read the row with our ip
    		rewind($handle);
    		$row = 1;
    		while ($row <= $row_lower) {
    			$buffer = fgets($handle, 4096);
    			$row++;
    			}
    		fclose($handle);
    		$buffer = str_replace("\"", "", $buffer);
    		$ipdata = explode(",", $buffer);
    		$buffer="";
    		return $ipdata[4];
    		}
