Dansguardian web content filtering HOWTO: install & configure on sme server
Author: Ray Mitchell - mitchellcpa_AT_yahoo.com.au
Release Date: 1 April 2006 v6 DRAFT
sme versions supported: 5.6, 6.0, 7.0 (earlier versions supported with some changes)
Contributors
The information contained in this HOWTO was primarily obtained from various forum posts, with some minor changes & tidying up as required. Thanks to those posters, particularly Shad Lords & Stephen Noble, and thanks to Abe Loveless for his testing and feedback. Stephen Noble has kindly provided rpms.
Information
To have a proper understanding of how Dansguardian works and the importance of certain configuration settings you should visit the Dansguardian web site at
http://www.dansguardian.org
and also see the detailed installation notes at
http://dansguardian.org/downloads/detailedinstallation2.4.html#further
Some of this information is of a generic nature and is NOT applicable to sme server installations. Where the two differ, follow the instructions in this HOWTO.
Installation instructions for sme v5.6 & 6.x
Download the rpms from
http://distro.ibiblio.org/pub/linux/distributions/smeserver//contribs/dungog/packages/smeserver/6.0/i386/dungog/RPMS/
into an empty folder
wget http://www.contribs.org/contribs/dungog/files/dansguardian/DansGuardian-2.6.1-3.RH72.i386.rpm
wget http://www.contribs.org/contribs/dungog/files/dansguardian/dungog-dansguardian-blacklists-0.1-9.noarch.rpm
Install the rpms
rpm -Uvh *.rpm
Installation instructions for sme v7.0
Download the rpms from http://distro.ibiblio.org/pub/linux/distributions/smeserver//contribs/dungog/packages/smeserver/7.0/i386/RPMS.dungog/
into an empty folder
wget http://distro.ibiblio.org/pub/linux/distributions/smeserver//contribs/dungog/packages/smeserver/7.0/i386/RPMS.dungog/dansguardian-2.8.0.6-1.2.el4.rf.i386.rpm
wget http://distro.ibiblio.org/pub/linux/distributions/smeserver//contribs/dungog/packages/smeserver/7.0/i386/RPMS.dungog/dungog-blacklists-0.1-12.noarch.rpm
wget http://distro.ibiblio.org/pub/linux/distributions/smeserver//contribs/dungog/packages/smeserver/7.0/i386/RPMS.dungog/smeserver-dansguardian-1.2-4.noarch.rpm
Install the rpms
rpm -Uvh *.rpm
Starting Dansguardian
You must initially start Dansguardian to enable web content filtering
/etc/init.d/dansguardian start
Restarting Dansguardian
You will need to restart Dansguardian after any config changes
/etc/init.d/dansguardian restart
Stopping Dansguardian
If you need to stop Dansguardian
/etc/init.d/dansguardian stop
Enabling the Dansguardian service at startup for sme5.6 & 6.x (not required for 7.0)
You need the following entry in the sme configuration database /home/e-smith/configuration
dansguardian=service|InitscriptOrder|92|status|enabled
To add this do
/sbin/e-smith/config set dansguardian service InitscriptOrder 92 status enabled
You will also need to add a link in /etc/rc.d/rc7.d
To add this do
ln -s /etc/rc.d/init.d/e-smith-service /etc/rc.d/rc7.d/S92dansguardian
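As an illustration only, the database entry shown above can be read as a key, a type, and then property/value pairs separated by | characters. The sketch below splits one such line so you can check the spelling of the property names, which I understand to be case sensitive. The function name show_db_entry is made up for this example and is not part of sme server.

```shell
# Hypothetical helper: print the parts of one configuration database line.
# Assumes the layout key=type|prop|value|prop|value as in the entry above.
show_db_entry() {
    printf '%s\n' "$1" | awk -F'|' '{
        # The first field holds "key=type"
        split($1, head, "=")
        printf "key=%s type=%s\n", head[1], head[2]
        # The remaining fields come in property/value pairs
        for (i = 2; i < NF; i += 2) printf "%s=%s\n", $i, $(i + 1)
    }'
}
```

For example
show_db_entry 'dansguardian=service|InitscriptOrder|92|status|enabled'
prints the key and type on one line, followed by one line per property.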
Enabling log rotation for sme5.6 & 6.x (not required for 7.0)
To enable the weekly rotation of dansguardian logs, do the following.
cd /etc/cron.weekly
pico dansguardian
Add the following lines
# logrotation script for dansguardian
exec /etc/dansguardian/logrotation
Ctrl o
To save
Ctrl x
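To exit
cron's run-parts normally skips files that are not executable, so also do
chmod 755 /etc/cron.weekly/dansguardian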
Using squidguard blocking rules
If you wish to make dansguardian use squidguard blocking rules & have them updated weekly then add the following to the /etc/cron.weekly/dansguardian file mentioned above. Put these lines above the exec line, because exec replaces the running script and nothing after it will run.
Please check that the location of the blacklists is still current. If necessary, search Google on "squidGuard blacklists" to find a current location.
cd /etc/dansguardian
rm -f blacklists.tar.gz
wget -qnv http://ftp.teledanmark.no/pub/www/proxy/squidGuard/contrib/blacklists.tar.gz -O blacklists.tar.gz
tar -zxf blacklists.tar.gz
chown -R root.root blacklists
chmod -R 640 blacklists
find blacklists -name new\* -exec rm {} \;
rm blacklists/README
chmod ug+x blacklists
chmod ug+x blacklists/*
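The clean-up and permission steps above can also be sketched as a small shell function. This is a variation, not the original method: it uses find to set directories to 750 and files to 640 at every depth instead of chmod -R 640 followed by chmod ug+x, and it leaves out the chown step. The function name tidy_blacklists is hypothetical.

```shell
# Hypothetical helper: tidy an unpacked blacklists tree.
# $1 is the path of the unpacked "blacklists" directory.
tidy_blacklists() {
    tb_dir=$1
    [ -d "$tb_dir" ] || { echo "not a directory: $tb_dir" >&2; return 1; }
    # Remove files whose names start with "new", and the top-level README
    find "$tb_dir" -type f -name 'new*' -exec rm -f {} \;
    rm -f "$tb_dir/README"
    # Directories become rwxr-x---, plain files become rw-r-----
    find "$tb_dir" -type d -exec chmod 750 {} \;
    find "$tb_dir" -type f -exec chmod 640 {} \;
}
```

It would be called after the tar step, for example
tidy_blacklists /etc/dansguardian/blacklists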
Modifying Dansguardian configuration for sme v5.6 & 6.x
pico /etc/dansguardian/dansguardian.conf
Make the required changes to suit your situation (see the settings listed below), then save and exit
Ctrl o
Ctrl x
You will initially need to change:
accessdeniedaddress = 'http://YOURSERVER.YOURDOMAIN/cgi-bin/dansguardian.pl'
for example to
accessdeniedaddress = 'http://www.mydomain.com/cgi-bin/dansguardian.pl'
You may also need to change (to suit adult level of protection)
naughtynesslimit = 50
to
naughtynesslimit = 160
Modifying Dansguardian configuration for sme v7.0
You need to modify two configuration files
/etc/dansguardian/dansguardian.conf
and
/etc/dansguardian/dansguardianf1.conf
pico /etc/dansguardian/dansguardian.conf
You will initially need to change:
accessdeniedaddress = 'http://YOURSERVER.YOURDOMAIN/cgi-bin/dansguardian.pl'
for example to
accessdeniedaddress = 'http://www.mydomain.com/cgi-bin/dansguardian.pl'
Make any other required changes to suit your situation
Ctrl o
Ctrl x
pico /etc/dansguardian/dansguardianf1.conf
You may initially need to change (to suit adult level of protection)
naughtynesslimit = 50
to
naughtynesslimit = 160
Make any other required changes to suit your situation
Ctrl o
Ctrl x
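If you prefer not to edit the files by hand, the same two changes could be scripted. The following is only a sketch under assumptions: it expects each setting to be on one line in the form name = value, as in the examples above, and the function name set_conf_value is made up for this HOWTO. Take a copy of the config file first.

```shell
# Hypothetical helper: replace the value of one "name = value" line.
# $1 = config file, $2 = setting name, $3 = new value.
# The value must not contain "|", "&" or backslashes, because it is
# passed straight into a sed replacement.
set_conf_value() {
    scv_file=$1
    scv_key=$2
    scv_value=$3
    scv_tmp="$scv_file.tmp.$$"
    # Write the edited copy to a temporary file, then copy it back so the
    # original file keeps its owner and permissions
    sed "s|^${scv_key}[[:space:]]*=.*|${scv_key} = ${scv_value}|" "$scv_file" > "$scv_tmp" &&
        cat "$scv_tmp" > "$scv_file"
    rm -f "$scv_tmp"
}
```

For example
set_conf_value /etc/dansguardian/dansguardianf1.conf naughtynesslimit 160
then restart Dansguardian as described elsewhere in this HOWTO.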
Modifying other Dansguardian configuration files for sme 5.6, 6.x & 7.0
You may also need to change other config files to suit your requirements:
These are located in /etc/dansguardian/..... eg
pico /etc/dansguardian/bannedextensionlist
make the required changes
then do
Ctrl O
to save your changes
then do
Ctrl X
to exit
Most users will need to change these 4 files as a minimum
exceptionsitelist
bannedsitelist
bannedurllist
bannedextensionlist
Here is the full list of config files. See the detailed information about these files at www.dansguardian.org or in the "Further customisation" section below. Some of the default settings in these files will prevent access to certain web sites and file types, so you should review all the files mentioned.
bannedextensionlist
bannediplist
bannedmimetypelist
bannedphraselist
bannedregexpurllist
bannedsitelist
bannedurllist
banneduserlist
contentregexplist
exceptioniplist
exceptionphraselist
exceptionsitelist
exceptionurllist
exceptionuserlist
pics
weightedphraselist
You will also want to tailor the html template for the error message displayed when Dansguardian blocks a site, see
/etc/dansguardian/languages/(languagename)/template.html
eg
/etc/dansguardian/languages/ukenglish/template.html
For older versions of dansguardian see
/etc/dansguardian/template.html
After making any changes to Dansguardian configuration settings you should restart Dansguardian for those changes to take effect
To restart do
/etc/init.d/dansguardian restart
Configuring the Proxy server settings
Dansguardian uses port 8080 for web proxy requests. If your browser does not use port 8080 then Dansguardian filtering will be bypassed and therefore ineffective.
Manually configuring your browser to use port 8080
Go to your workstation and open your browser
eg Internet Explorer or your browser
Change the settings for Connections to LAN
use the server IP 192.168.1.1 (or whatever yours is)
use a port of 8080 (instead of 3128)
Make sure you disable Auto detect as this will allow the browser to bypass Dansguardian
Users can easily change the setting in the browser to bypass Dansguardian filtering and gain access to blocked sites & inappropriate content. To overcome this possibility you need to change the sme server proxy port as follows.
Configuring your sme server to use Proxy port 8080
By default the proxy server is on port 3128
To change this setting to port 8080 permanently, change the default Transparent proxy port on sme server as follows
/sbin/e-smith/db configuration setprop squid TransparentPort 8080
/sbin/e-smith/signal-event post-upgrade
/sbin/e-smith/signal-event reboot
On sme 7.0 you only need to do
db configuration setprop squid TransparentPort 8080
signal-event post-upgrade
signal-event reboot
Then configure your browser to either automatically detect the port or to use port 8080
Additionally you may wish to prevent users configuring their browser to use port 3128 in order to circumvent Dansguardian, and thus allow unimpeded access to the Internet.
The following details are in DRAFT form only.
You will have to determine the correct custom template fragments to use & which template to expand yourself. There have been numerous posts at the contribs.org forums with solutions provided. These have not been documented at this stage but can be found by doing searches.
To block access to port 80 and 3128 and force users to use 8080
add the following and remove the transproxy lines from masq
The following applies to sme v5.6, 6.x & 7.0 which use iptables.
Earlier sme versions require a different fix as they use ipchains.
$OUT .= " /sbin/iptables --append Forward$AllowLocals -s $local -p tcp --destination-port 80 -j DROP\n";
$OUT .= " /sbin/iptables --append Forward$AllowLocals -d $local -p tcp --destination-port 80 -j DROP\n";
$OUT .= " /sbin/iptables --append Input$AllowLocals -s $local -p tcp --destination-port 80 -j DROP\n";
$OUT .= " /sbin/iptables --append Forward$AllowLocals -s $local -p tcp --destination-port 3128 -j DROP\n";
$OUT .= " /sbin/iptables --append Forward$AllowLocals -d $local -p tcp --destination-port 3128 -j DROP\n";
$OUT .= " /sbin/iptables --append Input$AllowLocals -s $local -p tcp --destination-port 3128 -j DROP\n";
Expand the template when changes have been made.
Testing access
Try browsing to the site of
www.sex.com
You should receive a message advising the site is blocked
Try browsing to other sites with inappropriate content or a site on your banned site list and you should also receive a site blocked message.
Remember that access to sites is controlled by settings in the config files mentioned above.
Further customisation
DansGuardian is highly configurable. The source code is provided so you have the ultimate in configurability, although most people will be content with modifying the configuration files.
After you have modified any configuration file, to apply the changes you will need to restart DansGuardian.
There is one main configuration file, several banned lists and an exception list. These are all explained below:
exceptionsitelist
This contains a list of domain endings. If one of them is found in the requested URL, DansGuardian will not filter the page. Note that you should not put the http:// or the www. at the beginning of the entries.
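As a small illustration of the note above, the sketch below turns a pasted address into the bare form expected in the site lists. It is an assumption on my part that any path after the host name should also be dropped, since these lists hold domains rather than URLs. The function name normalise_site_entry is hypothetical.

```shell
# Hypothetical helper: strip the scheme, a leading "www." and any path,
# leaving only the domain part of an address.
normalise_site_entry() {
    printf '%s\n' "$1" |
        sed -e 's|^[a-zA-Z]*://||' -e 's|^www\.||' -e 's|/.*$||'
}
```

For example
normalise_site_entry 'http://www.example.com/some/page.html'
gives example.com, which is the form to add to the list.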
exceptioniplist
This contains a list of client IPs that you want to bypass the filtering. For example, the network administrator's computer's IP.
exceptionuserlist
Usernames who will not be filtered (basic authentication or ident must be enabled).
exceptionphraselist
If any of the phrases listed here appear in a web page then the filtering is bypassed. Care should be taken when adding phrases to this file as they can easily stop many pages from being blocked. It would be better to put a negative value in the weightedphraselist.
exceptionurllist
URLs in here are for parts of sites that filtering should be switched off for.
bannediplist
IP addresses of client machines to disallow web access to. Only put IP addresses here, not host names.
bannedphraselist
This contains a list of banned phrases. The phrases must be enclosed between < and >. DansGuardian is supplied with an example list. Avoid phrases such as <sex>, as this will also block sites such as Middlesex University. The phrases can contain spaces. Use them to your advantage. This is the most useful part of DansGuardian and will catch more pages than PICS and URL filtering put together.
Combinations of phrases can also be used; if they are all found in a page, it is blocked. Exception phrases are no longer listed in this file - see exceptionphraselist.
banneduserlist
User names that will automatically be denied web access if basic proxy authentication is enabled.
bannedmimetypelist
This contains a list of banned MIME-types. If a URL request returns a MIME-type that is in this list, DansGuardian will block it. DansGuardian comes with some example MIME-types to deny. This is a good way of blocking inappropriate movies for example. It is obviously unwise to ban the MIME-types text/html or image/*.
bannedextensionlist
This contains a list of banned file extensions. If a URL ends in an extension that is in this list, DansGuardian will block it. DansGuardian comes with some example file extensions to deny. This is a good way of blocking kiddies from downloading those lovely screen savers and hacking tools. You are a fool if you ban the file extension .html, or .jpg etc.
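To make the matching rule concrete, here is a rough sketch of an "ends in a listed extension" check. It is not DansGuardian's own code. It assumes one extension per line with lines starting with # treated as comments, and it compares case sensitively. The function name url_has_banned_extension is hypothetical.

```shell
# Hypothetical helper: succeed (exit status 0) if the URL ends in any
# extension found in the list file. $1 = URL, $2 = list file.
url_has_banned_extension() {
    uhbe_url=$1
    uhbe_list=$2
    while read -r uhbe_ext; do
        # Skip blank lines and comment lines
        case $uhbe_ext in ''|'#'*) continue ;; esac
        # Match only when the URL ends with the extension
        case $uhbe_url in *"$uhbe_ext") return 0 ;; esac
    done < "$uhbe_list"
    return 1
}
```

This can be handy for checking a URL against your list before a change, for example
url_has_banned_extension 'http://example.com/tool.exe' /etc/dansguardian/bannedextensionlist && echo would match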
bannedregexpurllist
This contains a list of banned regular expression URLs. For more information on regular expressions, see http://www.opengroup.org/onlinepubs/7908799/xbd/re.html
Regular expressions are a very powerful pattern matching system. This file allows you to match URLs using this method.
bannedsitelist
This file contains a list of banned sites. Entering a domain name here bans the entire site. For banning specific parts of a site, see bannedurllist. Also, you can have a blanket ban all sites except those specifically excluded in exceptionsitelist. You can also block sites specified only as an IP address, and include a stock squidGuard blacklists collection. To enable these blacklists, download them from the extras section http://dansguardian.org/?page=extras
Simply put them somewhere appropriate, un-comment the squidGuard blacklists collection lines at the bottom of the bannedsitelist file, and check the paths are correct. For URL blacklists, edit the bannedurllist in a similar way.
bannedurllist
This allows you to block specific parts of a site rather than the whole site. To block an entire site, see bannedsitelist. To enable squidGuard blacklists for URLs, you will need to download the blacklists and edit the squidGuard blacklists collection section at the bottom (as for bannedsitelist above).
weightedphraselist
Each phrase is given a value either positive or negative and the values are added up. Phrases to do with good subjects will have negative values, and bad subjects will have positive values. Once the naughtyness limit is reached (within dansguardian.conf), the page is blocked. See the Naughtyness Limit description within the dansguardian.conf section below.
pics
This file allows you to finely tune the PICS filtering. Each PICS section comes with a description of the allowed settings and what they represent. The default settings with DansGuardian are set for youngish children, for example mild profanities and artistic nudity are allowed. PICS filtering can also be totally disabled / enabled using the enablePICS = on | off option.
For more detailed information on PICS ratings, see http://www.w3.org/PICS/
The ICRA section is fairly self-explanatory. A value of 0 means nothing of that category is allowed, whereas a value of 1 allows it. For example,
ICRAnudityartistic = 1
allows nude art. For more in-depth information see http://www.rsac.org/
RSAC is an older version of ICRA. The values here range from 0 meaning none allowed, through 2 (the default value), to 4, which allows wanton and gratuitous amounts of the given category. For more in-depth information see http://www.rsac.org/
evaluWEB rating uses a system similar to the British Film classification system:
0 = U (Universal, ie. suitable for even the youngest viewer)
1 = PG (Parental Guidance recommended)
2 = 18 (Only suitable for viewers aged 18 and over)
SafeSurf is similar to RSAC, but contains a larger range of categories, with values ranging from 0 = full filtering to 9 = wanton and gratuitous. For more in-depth information, see http://www.safesurf.com
Weburbia: see evaluWEB. For more in-depth information, see http://www.weburbia.com/safe/index.shtml
Vancouver Webpages is yet another ratings scheme. See http://vancouver-webpages.com/VWP1.0/ for more information.
dansguardian.conf
The only setting that is vital for you to configure in the dansguardian.conf file is the accessdeniedaddress setting. You should set this to the address (not the file path) of your Apache server with the perl access denied reporting script. For most people this will be the same server as squid and DansGuardian. If you really want you can change this address to a normal html static page on any server.
Reporting level
You can change the reporting level for when a page gets denied. It can say just 'Access Denied', or report why, or report why and what the denied phrase is. The last may be more useful for testing, but the middle option would be more useful in a school environment. Stealth mode logs what would be denied but doesn't do any blocking.
Logging level
This setting lets you configure the logging level. You can log nothing, just denied pages, all text-based requests, or all requests. HTTPS requests only get logged when the logging is set to 3 - all requests.
Log exception hits
Log if an exception (user, ip, URL, or phrase) is matched and so the page gets let through. This can be useful for diagnosing why a site gets through the filter.
Log file format
This setting alters the format of the DansGuardian log file. Please note option 3 (standard log format) is not yet implemented.
Network settings
These allow you to modify the IP address that DansGuardian is listening on, the port DansGuardian listens on, the IP address of the server running squid as well as the squid port. It is possible to configure the Access Denied reporting page here also.
List file locations
Here you can modify the location of the list files. Adjusting these locations is not recommended.
Naughtyness limit
This setting refers to the weighted phrase limit over which the page will be blocked. Each weighted phrase is given a value either positive or negative and the values added up. Phrases to do with good subjects will have negative values, and bad subjects will have positive values. See the weightedphraselist file for examples. As a rough guide, a value of 50 is for young children, 100 for older children, 160 for young adults.
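The adding-up described above can be shown with a toy example. This is not how DansGuardian is implemented and the input format here is NOT the real weightedphraselist format: the sketch assumes a made-up file with one "weight phrase" pair per line, and it counts each phrase at most once per page. The function name score_text is hypothetical.

```shell
# Hypothetical helper: add up the weights of all phrases found in a text file.
# $1 = text file to score, $2 = weights file with lines like "40 casino".
score_text() {
    st_text=$1
    st_weights=$2
    st_total=0
    while read -r st_weight st_phrase; do
        # Skip blank or incomplete lines
        [ -z "$st_phrase" ] && continue
        # Case-insensitive, fixed-string search; each phrase counts once
        if grep -qiF -- "$st_phrase" "$st_text"; then
            st_total=$((st_total + st_weight))
        fi
    done < "$st_weights"
    echo "$st_total"
}
```

With weights of 40 for casino, 30 for poker and -50 for gambling addiction help, a page mentioning casino and poker scores 70, which is over a limit of 50 but under a limit of 160. A page mentioning poker and gambling addiction help scores -20.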
Show weighted phrases found
If enabled then the phrases found that made up the total which exceeds the naughtyness limit will be logged and, if the reporting level is high enough, reported.
Reverse lookups for banned sites and URLs
If set to on, DansGuardian will look up the forward DNS for an IP URL address and search for both in the banned site and URL lists. This would prevent a user from simply entering the IP for a banned address. It will reduce searching speed somewhat so unless you have a local caching DNS server, leave it off and use the Blanket IP Block option in the bannedsitelist file instead.
List cache files
This will compare the date stamp of the list file with the date stamp of the cache file and will recreate as needed. If a bsl or bul .processed file exists, then that will be used instead. It will increase process start speed by 300%. On slow computers this will be significant. Fast computers do not need this option.
Maximum upload size
This is for blocking or limiting uploads, not for blocking forms without any file upload. The value is given in kilobytes after MIME encoding and header information.
Username identification methods
The proxyauth option is for when basic proxy authentication is used (obviously no good for transparent proxying). The ntlm option is for when the proxy supports the MS NTLM authentication. This only works with IE5.5 sp1 and later, and has not been implemented yet. The ident option causes DansGuardian to try to connect to an identd server on the computer originating the request.
Forwarded for
This option adds an X-Forwarded-For: <clientIP> to the HTTP request header. This may help solve some problem sites that need to know the source IP.
Maximum children
This sets the maximum number of processes to spawn to handle the incoming connections. This will prevent DoS attacks killing the server with too many spawned processes. On large sites you might want to double or triple this number.
Log connection handling errors
This option logs some debug info regarding fork()ing and accept()ing which can usually be ignored. These are logged by syslog. It is safe to leave this setting on or off.