set up SWISH-E index engine on Fedora Core 4
--------------------------------------------

+ download source code from <http://swish-e.org>, build and install it

+ under /var/www/html directory:

[root@raider html]# ls -al swish.cgi
-rwxr-xr-x 1 root root 108466 Sep 29 20:14 swish.cgi

[root@raider html]# cat .swishcgi.conf
return {
    title        => 'Search Localhost Documentation Archive',
    swish_binary => '/usr/local/bin/swish-e',
    swish_index  => '/var/www/html/index.swish-e',
}

[root@raider html]# cat swish.conf
IndexDir /var/www/html/docs
IndexDir /var/www/html/Webmail
IndexOnly .htm .html .doc .pdf
FileRules filename regex /threads.html/
FileRules filename regex /maillist.html/
## don't index thrd2.html, mail2.html...
FileRules filename regex /thrd[0-9]+.html/
FileRules filename regex /mail[0-9]+.html/
FileFilter .doc /usr/local/bin/catdoc "-s8859-1 -d8859-1 '%p'"
FileFilter .pdf /usr/bin/pdftotext "'%p' -"
DefaultContents HTML*
StoreDescription HTML* <body> 200000
MetaNames swishdocpath swishtitle
ReplaceRules remove /var/www/html/

[root@raider html]# swish-e -c swish.conf
Indexing Data Source: "File-System"
Indexing "/var/www/html/docs"
Indexing "/var/www/html/Webmail"
Removing very common words...
no words removed.
Writing main index...
Sorting words ...
Sorting 44,982 words alphabetically
Writing header ...
Writing index entries ...
  Writing word text: Complete
  Writing word hash: Complete
  Writing word data: Complete
44,982 unique words indexed.
5 properties sorted.
917 files indexed.  14,127,052 total bytes.  1,101,648 total words.
Elapsed time: 00:00:04 CPU time: 00:00:03
Indexing done!

[root@raider html]# ls -alt
total 7148
drwxr-xr-x 22 root root    4096 Sep 29 23:50 .
-rw-r--r--  1 root root 4817977 Sep 29 23:47 index.swish-e
-rw-r--r--  1 root root 2179584 Sep 29 23:47 index.swish-e.prop

+ add the following section in Apache configuration file:

<Directory "/var/www/html">
    ...
    <files swish.cgi>
        Options ExecCGI
        Allow From All
        SetHandler cgi-script
    </files>
    ...
</Directory>
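
+ optional: sanity-check the IndexOnly / FileRules lines from swish.conf before re-indexing. The Python sketch below is not part of SWISH-E. It only approximates the selection logic, assuming that IndexOnly is a plain suffix match and that the FileRules patterns are unanchored regular expressions applied to the file name alone. The helper name would_index and the sample file names are made up for illustration; SWISH-E's own regex engine may differ in details.

```python
import re

# Exclusion patterns copied from the FileRules lines in swish.conf.
EXCLUDE_PATTERNS = [
    r"threads.html",
    r"maillist.html",
    r"thrd[0-9]+.html",
    r"mail[0-9]+.html",
]

# Suffixes copied from the IndexOnly line in swish.conf.
INDEX_ONLY = (".htm", ".html", ".doc", ".pdf")


def would_index(filename: str) -> bool:
    """Approximate whether a file name passes IndexOnly and FileRules.

    Hypothetical helper: an assumption-laden model, not SWISH-E's code.
    """
    if not filename.endswith(INDEX_ONLY):
        return False
    # re.search is unanchored, so a pattern may match anywhere in the name.
    return not any(re.search(p, filename) for p in EXCLUDE_PATTERNS)


if __name__ == "__main__":
    samples = [
        "msg00012.html",   # ordinary archived message
        "threads.html",    # thread index page
        "thrd2.html",      # thread index continuation
        "webmail3.html",   # contains "mail3.html", so it is excluded too
        "guide.pdf",
        "notes.txt",       # suffix not listed in IndexOnly
    ]
    for name in samples:
        print(f"{name:15} {'index' if would_index(name) else 'skip'}")
```

Under these assumptions the patterns are not anchored and the "." is unescaped, so a name such as webmail3.html is skipped along with mail3.html. Anchoring the patterns (for example /^mail[0-9]+\.html$/) would narrow the match, if that is what is wanted.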