Glype proxy filtering using MySQL and PHP
I’m pretty new to proxy sites. I have two up on a low-budget VPS and have been playing around with logging my traffic. I haven’t heard of any big proxies getting in trouble for what their visitors browse, but there’s always that chance. Plus, I know I get a lot of traffic from schools, and I don’t want to be accused of making it easier for kids to look at porn by bypassing school filters. I’ve also had some ‘questionable’ traffic show up, and I want to block that as well.
So, here are a few bits of code I wrote to track traffic and ban certain domain names from loading. I tried to make this as easy as I could. It should work with any PHP proxy, not just Glype, but Glype is what I’m using. You’ll need a little knowledge of PHP and MySQL. I didn’t write this guide for idiots, so please, if you’re not certain of something, google it. I wrote it for lazy programmers.
First, create a new mysql database called proxy_log.
Then add this table:
CREATE TABLE `domains` (
  `domain_id` int(11) NOT NULL auto_increment,
  `domain` varchar(100) NOT NULL default '',
  `banned` int(2) default NULL,
  `timestamp` timestamp NOT NULL default CURRENT_TIMESTAMP,
  PRIMARY KEY (`domain_id`),
  KEY `domain` (`domain`),
  KEY `domain_2` (`domain`,`domain_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COMMENT='domains';
Here’s the logging code. Put it at the top of browse.php:
// Database settings. Change these to match your setup.
// Note: the mysql_* functions used here were removed in PHP 7.0; on newer PHP use mysqli or PDO.
$sm['db_hostname'] = 'localhost';
$sm['db_username'] = 'proxy_log';
$sm['db_password'] = 'password';
$sm['db_name'] = 'proxy_log';
@mysql_connect($sm['db_hostname'], $sm['db_username'], $sm['db_password']) or die('Unable to connect to MySQL: ' . mysql_error());
@mysql_select_db($sm['db_name']) or die('Unable to select database: ' . mysql_error());

// Returns the domain_id for $domain, inserting a new row the first time it is seen.
function log_domain($domain) {
    // The domain comes from the visitor's request, so escape it before using it in SQL.
    $safe = mysql_real_escape_string($domain);
    $sql = "SELECT domain_id, domain FROM `domains` WHERE domain = '$safe' LIMIT 1";
    $result = @mysql_query($sql);
    if (!$result) { die("Invalid query: " . mysql_error() . "\n$sql\n"); }
    $row = @mysql_fetch_array($result);
    if (isset($row['domain_id'])) {
        return $row['domain_id'];
    }
    $sql = "INSERT INTO domains (domain) VALUES ('$safe')";
    $result = @mysql_query($sql);
    $insert_id = @mysql_insert_id();
    return $insert_id;
}

// Entry point: stop the request if the domain is banned, otherwise record it.
function start_logging($domain = '') {
    banned_domain($domain);
    $domain_id = log_domain($domain);
}

// Sends a 403 and an error page if $domain is flagged as banned.
function banned_domain($domain = '') {
    $safe = mysql_real_escape_string($domain);
    $sql = "SELECT * FROM domains WHERE domain = '$safe' AND banned = 1 LIMIT 1";
    $result = @mysql_query($sql);
    if (!$result) { die("Invalid query: " . mysql_error() . "\n$sql\n"); }
    $row = @mysql_fetch_array($result);
    if (isset($row['domain']) && ($row['domain'] == $domain)) {
        header("HTTP/1.1 403 Forbidden");
        error('Sorry the domain '.$domain.' has been banned. Generally a domain is banned if it contains adult content. If you feel this has been an error on our part, please contact natefanaro@gmail.com. Thanks.');
        die();
    }
}
The entry point is start_logging(). It checks whether the domain is banned, then records it if it’s not.
There’s one last step to get logging going. Look for the ‘## Define URL parts’ comment and add start_logging($tmp[2]); right after define('urlDOMAIN',$tmp[2]); You might have to change the variable passed in if that has changed in a newer release of Glype.
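If $tmp no longer holds the domain in your Glype version, one option is to derive it from the full target URL yourself. The following is only a sketch: extract_domain() is a hypothetical helper, not part of Glype, and it assumes you have the requested URL available as a string.

```php
<?php
// Hypothetical helper (not part of Glype): pull the host name out of a URL
// so it can be passed to start_logging().
function extract_domain($url) {
    // parse_url() only recognises the host when a scheme is present, so add one if missing.
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        $url = 'http://' . $url;
    }
    $host = parse_url($url, PHP_URL_HOST);
    // parse_url() returns null or false when it cannot find a host.
    if (!is_string($host) || $host === '') {
        return '';
    }
    // Lowercase so "Example.com" and "example.com" are logged as one domain.
    return strtolower($host);
}
```

You would then call start_logging(extract_domain($url)), where $url is whatever variable holds the target address in your copy of the script.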
So now we have a simple database that records every new domain that is visited. If you set banned to 1 for a domain, any visit to that site gets caught by the banned_domain() function. You can change that die(); to an error message.
If you want an easy way to ban a bunch of sites, just run:
UPDATE `domains` SET banned = 1 WHERE domain LIKE '%foo%';
Use your own bad word instead of foo, since the Foo Fighters are a great band. I have a cron job set up that goes through a list and bans sites.
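As a rough illustration of how a list-driven ban might look (a sketch of the idea, not the actual cron script), the code below builds the LIKE pattern for each keyword and runs the UPDATE through a prepared statement. The names like_pattern() and ban_by_keywords() are hypothetical, and it assumes a PDO connection to the proxy_log database rather than the mysql_* functions used above.

```php
<?php
// Hypothetical helper: turn a keyword into a LIKE pattern that matches it
// anywhere in the domain. MySQL's LIKE wildcards (% and _) and its default
// escape character (\) are escaped so they are matched literally.
function like_pattern($keyword) {
    $escaped = strtr($keyword, array('\\' => '\\\\', '%' => '\\%', '_' => '\\_'));
    return '%' . $escaped . '%';
}

// Hypothetical helper: ban every logged domain that contains one of the keywords.
// Assumes $pdo is a PDO connection to the proxy_log database.
// Returns the total number of rows the database reports as affected.
function ban_by_keywords(PDO $pdo, array $keywords) {
    $stmt = $pdo->prepare('UPDATE domains SET banned = 1 WHERE domain LIKE ?');
    $banned = 0;
    foreach ($keywords as $keyword) {
        // An empty keyword would become '%%' and ban every domain, so skip it.
        if (trim($keyword) === '') {
            continue;
        }
        $stmt->execute(array(like_pattern($keyword)));
        $banned += $stmt->rowCount();
    }
    return $banned;
}
```

Escaping the wildcards matters because an underscore or percent sign in a keyword would otherwise match far more domains than intended.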
I think that’s all the info I’m going to post for now. This is part of a larger system that I have in place. It’s set up now to be fully automatic: it flags new domains based on keywords and bans visitors whose visits are a high percentage of those domains. I don’t think I’ll post all of it, but the auto flagging of domain names is what I’ll post next.
PS: If a domain is banned, the user is redirected to your front page. This is actually a great feature, as it can drive up your AdSense clicks. If someone gets blocked from your proxy and there’s an ad showing for another one, they might click on it.
PPS: You might want to comment out the MySQL error lines here once everything is set up. If something breaks, you don’t want to prevent users from continuing to use your proxy.
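An alternative to commenting out the error lines is to wrap the logging call so that a failure is written to the server’s error log instead of stopping the request. This is a minimal sketch under one assumption: the logging function throws an exception on failure rather than calling die(). try_logging() is a hypothetical name.

```php
<?php
// Hypothetical wrapper: run a logging callback, but never let a failure
// inside it stop the page from loading.
function try_logging($logger, $domain) {
    try {
        return call_user_func($logger, $domain);
    } catch (Exception $e) {
        // Record the problem for later and carry on serving the request.
        error_log('Proxy logging failed: ' . $e->getMessage());
        return null;
    }
}
```

Keep in mind that with this approach the ban check is skipped too whenever the database is down, so banned domains will load until it comes back.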