# $Id: WHATSNEW 4726 2012-01-13 21:44:57Z pro $ $URL: svn://svn.setun.net/search/trunk/WHATSNEW $

0.19.2: 
bugs fixed (dc: small filelists upload, ip crop fix)
dc: to add the cid field to the resource table, run
perl crawler.pl upgrade=4600
or disable it in config.pl:
delete $config{'sql'}{'table'}{ $config{'sql_tresource'} }{'cid'};

don't upload duplicate filelists (matched by CID)
$config{'use_dc_user_dupes'} = 1; # set to 1 to disable the check

?tth= is now an alias for ?tiger=

adc:// fixes

stats in the html footer are now hidden; to show them:
$config{'html_footer_stat_bef'} = $config{'html_footer_stat_aft'} = '';

nfs scan (beta; not under windows)

web: page modes disabled by default, but you can enable them as in 0.19.1:
$config{'modes'} = [qw(simple standard advanced)];
$config{'mode'} = 'simple';



0.19.1: 
utf bugfix
auto copy config.pl.dist -> config.pl


0.19.0: DATABASE REINSTALL RECOMMENDED
new lib dependency: lib::abs

upgrading: remove ./Net and ./web/Net

moving to utf8: db, crawler, web

$config{'cp_db'} = 'cp1251'; # use for old base

smb.conf: set 'display charset = utf-8'
or, for an old smb.conf:
$config{'fine'}{'file'}{'cp_res'} = 'koi8-r';

bugs fixed (dc, crawler params, ...)

adc:// hubs support

Net::DirectConnect removed from the source tree; install it from CPAN

dc bot can share files ( $config{fine}{dchub}{share} = ['/pub']; )
correct filelist parsing (br, hit fields)

can login to file:// with user:pass via smbclient

support full-text queries via sphinx ( http://sphinxsearch.com/ ) : 
$config{'use_sphinx'} = 1; #MUST REINSTALL DATABASE AFTER CHANGE
sample sphinx config in tools/sphinx.conf
index with: 
indexer --rotate filebase

db: new filebase.meta field for codepages, audio-video info

DC: new recommended module: HTML::Parser

0.18.3: 

bugs fixed (crawler after scan sql error, correct 0: name=0 path=0 ext=0, 
 mysql koi8, utf: cut ip, dead ranges, dcppp many bugs, ...)

pg: fulltext: 8.3 

new http scanner [in progress]

$config{'allow_cp_detect'} -> !$config{'no_cp_detect'}
$config{'lng_default'} -> $config{'lang_default'}

automatic codepage detection from HTTP_ACCEPT_CHARSET
$config{'codepage'} -> $config{'codepage_default'}
$config{'codepage'} can now be set to disable detection

dcppp module now is Net::DirectConnect (preparing for CPAN)

dc: magnet links now with xs= -> auto connect to specified hub

gpl2 -> gpl3

mod_rewrite ./123 == ./?size=123 ; ./xxx == ./?q=xxx ; ./a/b/c == ./?path=/a/b/c ; ./add == ./?show=add ...
$config{'allow_rewrite'} = 1; 
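the mappings above could be expressed with rewrite rules along these lines (a sketch only, assuming Apache mod_rewrite and index.cgi as the entry point; the distribution's own .htaccess is authoritative):

```apache
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
# ./123   -> ./?size=123
RewriteRule ^([0-9]+)$  index.cgi?size=$1  [L,QSA]
# ./add   -> ./?show=add
RewriteRule ^(add)$     index.cgi?show=$1  [L,QSA]
# ./a/b/c -> ./?path=/a/b/c
RewriteRule ^(.+/.+)$   index.cgi?path=/$1 [L,QSA]
# ./xxx   -> ./?q=xxx
RewriteRule ^(.+)$      index.cgi?q=$1     [L,QSA]
```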

$config{'cache_http'} = 14; # now measured in days until the cache expires

$config{'purge_time'} -> $config{'purge'} ;# ./tmp,var files older than this value ignored

$config{'purgedef'} -> $config{'purge_by'}

local path scan:
perl crawler.pl C:\pub
perl crawler.pl local://C:\pub
perl crawler.pl /pub --local_prot=ftp --local_host=10.20.30.40




0.18.2: DATABASE REINSTALL RECOMMENDED (stem) recommended mysqld var: ft_min_word_len = 1 

bugs fixed (xml, online, mod_perl, cosmetic, dc, language autoselect)

fastcgi support (FCGI perl module required) [but mod_perl 2.5x faster]

order_rev -> order_mode : now possible /index.cgi?...&order!=size

%out -> $config{out}
$config{'cl_ip* -> $config{'client_ip*

stemmer fixes
$config{'use_stem'} tested (fulltext + stem indexes, 'accurate' selector)
$config{'use_stem'} = 1; # latest version (4)
$config{'use_stem'} = 2; # 0.18.1 version

$config{'use_stem'} = 1; #NOW DEFAULT recommended mysqld var: ft_min_word_len = 1 
for old base use $config{'use_stem'} = 0;

crawler.pl can run from any dir (cd /tmp && /usr/local/www/search/crawler.pl stat)

dc: user ips are shown by default; set $config{'no_ip'} = 1; to disable

crawler.pl upload=10 - like proc, start 10 uploads


$config{'allow_flush'} -> !$config{'no_flush'}


0.18.1: freebsd ports fix

default dir www/prosearch -> www/search


0.18.0: test release. DATABASE REINSTALL RECOMMENDED

new sql lib with mysql, pgsql, sqlite support. BUGS MUST BE HERE 8)

mod_perl broken

insert filters (size, ext, ..., ANY!), see config.pl.dist
default xxx filter removed; uncomment it if you need it

many (not all) $config{'sql_*'} -> $config{'sql'}{'*'}
$config{'sql_query_check'} -> $config{'sql'}{'auto_check'}

bugs fixed (dc upload empty dir, web:proxy:dchub:fine, firefox ftp://user:pass@host link, web highlight russian words, ftp max_errors, dc negative scan time, UTF & codepages)

stemming support
# $config{'use_stem_only'} = 1; #fuzzy search #PREINSTALL # recommended mysqld var: ft_min_word_len = 1 

mod_perl: $ENV{'PROSEARCH_PATH'} now not used

default unix install path changed to /usr/local/www/prosearch/

crawler.pl file.txt   ==   crawler.pl file=file.txt
crawler.pl scan=host.ru   ==   crawler.pl host.ru


q= parsing quoted params with space :
 q=ext:"zzz ttt" -> ext=zzz ttt
 q=ext="zzz ttt" -> ext=zzz ttt
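roughly, the quoted-value extraction can be pictured like this (an illustrative sketch, not the actual parser; parse_q is a made-up name):

```perl
use strict;
use warnings;

# sketch: split q= into key/value params; quoted values may contain spaces
sub parse_q {
  my ($q) = @_;
  my %param;
  while ($q =~ s/(\w+)[:=]"([^"]*)"/ /) { $param{$1} = $2 }  # ext:"zzz ttt", ext="zzz ttt"
  while ($q =~ s/(\w+)[:=](\S+)/ /)     { $param{$1} = $2 }  # ext:zzz, ext=zzz
  $q =~ s/^\s+|\s+$//g;                                      # the rest is the free-text query
  $param{'q'} = $q if length $q;
  return %param;
}
```

so parse_q('avi ext:"zzz ttt"') would yield q=avi and ext=zzz ttt.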

dcbot: faster download

$config{'search_url'} ->  $config{'root_url'} 
$config{'search_name'} -> $config{'title'}
$config{'dcbot_upload'} -> !$config{'no_dcbot_upload'}
$config{'search_name'} -> $config{'title'} 
$config{'search_name_head'} -> $config{'search_name'}
$config{'allow_online'} -> !$config{'no_online'}

web SECURITY XSS(Cross-Site Scripting) fixed

# $config{'use_dc_only'} = 1;	# use for ONLY dc filebase(optimize,disable range-ftp-file). uncomment BEFORE install.

crawler params: --xxx__yyy=zzz  == $config{'xxx'}{'yyy'} = 'zzz';
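that is, '__' descends one hash level. a minimal sketch of the idea (illustrative only; apply_arg is a made-up name):

```perl
use strict;
use warnings;

our %config;

# sketch: --xxx__yyy=zzz  ->  $config{'xxx'}{'yyy'} = 'zzz'
sub apply_arg {
  my ($arg) = @_;
  return unless $arg =~ /^--(.+?)=(.*)$/s;
  my ($path, $value) = ($1, $2);
  my @keys = split /__/, $path;
  my $node = \%config;
  $node = ($node->{ shift @keys } ||= {}) while @keys > 1;   # walk/create nested hashes
  $node->{ $keys[0] } = $value;
}

apply_arg('--fine__dchub__Nick=searchbot');
# now $config{'fine'}{'dchub'}{'Nick'} is 'searchbot'
```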

$config{'on_interrupt'} ||= 1; # restart scan on next start if previous was interrupted 

config params $processor{'prot'}* -> $config{'scanner'}*

$config{'range_period'} -> $config{'fine'}{'range'}{'period'}

dc users with 0 share now ignored

crawler dcbot: $config{'no_dl'} -> $config{'no_dcbot_download'}

added 'enabled' column on resource, host, ranges. if you want to add the column to filebase too (don't forget to drop first):
#  $config{'sql'}{'table'}{ $config{'sql_tfile'} }{'added'} = pssql::row( 'added', 'array_insert' => 1, );    #'index' => 1,

 $param{'page'} -> $param{'show'}
 pre_query() -> $self->{'limit_calc'}
 $stat{'show_page'} -> $self->{'page'}
 $work{'on_page'} -> $self->{'limit'}
 $stat{'show_from'} ->  $self->{'limit_offset'}
 $config{'max_results'} -> $self->{'limit_max'}
 $config{'on_page'} -> $self->{'limit'}
 $stat{'founded_files'} -> $self->{'founded'}
 $stat{'dbirows'} -> $self->{'dbirows'}
 $stat{'maxpage'} -> $self->{'page_last'}

changed log logic

$config{'no_img'} -> $config{'no_player'}

unix: crawler: siginfo support (ctrl-t)


0.17.2: 

bugs fixed (web: cosmetic, dcbot: double GetNickList, get filelist from one
 user without /, microdc2 filelist parse, ipdig, autorepair, ftp: in ls: user or group = (?) (deleted user),
 purge, ..)

perltidy -b -i=2 -ce -l=128 -nbbc -sob -otr -sot *.pm *.pl *.cgi

old code comments deleted

internal changes: left join, flush tables, 

web: ftp://,file:// proxy download (client < search < (ftp|file) server)
$config{'allow_proxy'} ||= 0;
$config{'proxy_server_mask'}   ||= $config{'local_mask'};      
$config{'proxy_client_mask'}   ||= $config{'local_mask'};      
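the mask format is not spelled out here; assuming the masks are ip regexps like $config{'web_adder_mask'}, a restrictive setup might look like (values are hypothetical):

```perl
$config{'allow_proxy'}       = 1;            # turn the ftp/file proxy on
$config{'proxy_client_mask'} = '^10\.20\.';  # hypothetical: only local clients may download
$config{'proxy_server_mask'} = '^10\.20\.';  # hypothetical: only fetch from local servers
```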

config names changed:
$config{'allow_dcdl'} -> $config{'allow_proxy'}
$config{'dchub_param'} -> $config{'fine'}{'dchub'}
$config{'myip'}		->	$config{'fine'}{'dchub'}{'myip'}
$config{'myport'}	->	$config{'fine'}{'dchub'}{'myport'}
$config{'dcbot_name'}	->	$config{'fine'}{'dchub'}{'Nick'}
$config{'dchub'}{hubname} -> $config{'fine'}{hubname}
$config{'ftp_timeout'} -> $config{'timeout'}
$config{'allow_query_expansion'} -> $config{'fulltext_extra'} = ''; # 'IN BOOLEAN MODE' or 'IN NATURAL LANGUAGE MODE' or 'IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION' or 'WITH QUERY EXPANSION'
$config{'ftp_max_errors'} -> $config{'max_errors'}


README + FAQ + BUGS => doc/doc.html


0.17.1: big BUGFIX for 0.17.0:

bugs fixed (res size when scan, debug=on, dcbot bugs)

$processor{'prot'}{*}{'webadd'} =>  !$processor{'prot'}{*}{'no_webadd'}

mod_perl conf() cache = x2 faster
all configs MUST be like:
conf( sub { 
 $config{'xxx'} = 1;
 $config{'yyyyy'} = 'utf-8';
}); 
do '/usr/local/www/search/zzzzz.pl'; # outside of conf()
conf( sub {
 $config{'cccc'} = 555;
 $config{'nnnn'} = 'fff';
}); 

crawler: $skipfromip -> $static{'banned'}
banned now can contain ranges, ips and resources

web: all <a> now is sorted (via mylink)

new presets system in main query (&q=:preset):
 see $config{'preset'}
 list: search/?page=presets
 example:
  search/?q=:nero
  search/?q=:video
 you can combine:
  search/?q=:month%20:exec
 and now this is possible:
  http://search.setun.net/?q=%3Acd+%3Ayear+%3Asetun.net+size%3E%3D600m
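the built-in presets live in $config{'preset'} (see config.pl.dist). assuming each entry maps a name to a query fragment, a custom preset might be defined like (hypothetical entry):

```perl
# hypothetical: &q=:dvd would then expand to this query fragment
$config{'preset'}{'dvd'} = 'ext:iso|img size>=4000m';
```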


0.17.0: 
Database structure changed! you must reinstall base 
 perl crawler.pl drop install
or upgrade:
 perl crawler.pl upgrade

bugs fixed (DC lists filenames, mysql utf8, query stat first symbol, cp, js, ie, 
 windows correct closing crawler, proc start user err, one string, auto lang, ...)

http cp autodetect

ipdig now eats [prot://][pass:][user@]host[:port]
ipdig param names changed

similar queries

sql: sleep & retry on ANY sql errors except $config{'dbi_err_fatal'} and $config{'dbi_err_syntax'}

web config changes:
*_left ->               _bef
*_top->                 _bef
*_pre_left ->           _bef_bef
*_pre_top ->            _bef_bef
*_right ->              _aft
*_bottom ->             _aft
*_post_bottom ->        _aft_aft
*_post_right ->         _aft_aft
form-* ->               html_form-

mod_perl only: cached *.js cp trans -> client codepage

new table definition system, internal changes. BUGS MUST BE HERE 8)

auto install if database or tables do not exist - enabled by default
  $config{'allow_auto_install'} ||= 1;

  mode normal -> standart

web pinger: optimized

media, swf and images in the on-page player

per file voting enabled by default (need upgrade)

file type images (icons)

web: vote without a popup window

$config{'web_adder_mask'} - regexp of ips allowed to add resources - default: local ips
host_by_name_norm is now used in web add


0.16.1:

bugs fixed (.toscan names, web cosmetic, web dl stat bug, )

ipdig now eats dns names 

repair options
  $config{'rep_quick'} ||= 1; # repair option QUICK
  $config{'rep_ext'} ||= 0; # repair option EXTENDED
  $config{'rep_frm'} ||= 0; # repair option USE_FRM

web: add: auto detect user services

web: normal mode restored

cookie params filter

dc: $config{'myip'} now automatic


0.16.0: 

new style added (unfinished)

bugs fixed (desc param parse and xml output, cookies saving, english lang fixes,
            period change bug, various periods in table, dc fixes(can download files from dctc client),
            xml codepage, auto repair, mod_perls, sql errors  handler, port makefile)

mod_perl tested and enabled by default (in .htaccess if installed)

new feature !DEV BUGS!
to enable web -> dc downloading (without a dc client):
$config{'allow_dcdl'} = 1;

new ./tmp/* filenames, new ./var/*.toscan filenames. 
please delete old ./var/*.toscan

web: my last query, my last dl (via cookies)

nmap: -PE option added (true ping (ICMP echo request) packet.)

autorepair now enabled:
  $config{'allow_auto_repair'}  ||= 15; # or number 10-30 

you can enable better(?) dchub client pinging:
$config{'allow_ping_cl_port'} = 1;
new port column in the host table; you need to run: perl crawler.pl upgrade

crawler: file=-  - open STDOUT, also for hublist=-

removed old codepages names (win utf dos koi iso)

web: hide user or pass : $config{'hide_user'} $config{'hide_pass'}

web: codepage selector
$config{'allow_cp_select'} = 1;

utf8 now is utf-8

web in utf codepage fixes

$config{'filetype'} -> $config{'ext'}
$config{'where'} -> $config{'host'}
$config{'filetype'}{*}{'types'} -> $config{'ext'}{*}{'to'} 
$config{'where'}{*}{'host'} -> $config{'host'}{*}{'to'} 

query short names:
ext=:video -> ext=mpg|mpe|mpeg|...
 for all $config{'ext'}
hint:  ext=:archive|:cd
you can configure something like:
$config{'host'}{'servers'}{'to'} = '10.20.30.40|10.200.100.*';
and query
host=:servers
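an extension group works the same way via $config{'ext'} (the extension list here is a hypothetical example):

```perl
# query ext=:cdimage would expand to these extensions
$config{'ext'}{'cdimage'}{'to'} = 'iso|nrg|mdf|mds|cue|bin';
```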

q= parsing params:
 q=ext:zzz -> ext=zzz
 q=ext=zzz -> ext=zzz
 q=ext::video or q=ext=:video -> ext=:video -> ext=mpg|mpe|mpeg|...


configs rename:
 $config{'cp_web'} -> $config{'codepage'}       
 $config{'mode_default'} -> $config{'mode'} 
 $config{'advanced'} -> $config{'specify'} 


0.15.18: 

bugs fixed (xml, time::hires, DC [hub pass, ...], ...)

internal changes

crawler now saves DC user ip
web now pings by this ip

multi order

rsync:// protocol support


0.15.17: 

bugs fixed

if $config{'skip_hidden'} is defined (the default), .* filenames (.htaccess, ..) are skipped

new logging
debug -> log_dbg
sql_debug -> log_dmp

internal changes: cp_trans(..)
utf8 codepage now fully supported, other codepages renamed
ukr symbols added
new variables:
  $config{'cp_dc_list'} ||= 'utf8';
  $config{'cp_dc_hub'}  ||= 'cp1251';
  $config{'cp_config'}  ||= 'cp1251'; # this config
  $config{'cp_web'}     ||= 'cp1251'; # index.cgi output


any DBD::mysql connect params supported.
example: for mysql_socket: 
$config{'sql_mysql_socket'} = '/tmp/mysql.sock';

defaults reversed: now:
  $config{'scan_dead'}  ||= 1; # dead host/ranges = were scanned and have 0 bytes
  $config{'scan_online_res'}    ||= 0; # scan resources ONLY if they were pinged within the last $config{'online_minutes'}


automatic repair:
$config{'allow_auto_repair'} = 20;  

perl -w  warnings fixed.

dcbot: one user: perl crawler.pl dchub://hub/user

dcbot passive downloading support, ADCGet support

0.15.16: 

bugs fixed

dcbot in crawler tested and fixed

hostname resolving: deleted $config{'host_by_name'}
 added  $config{'nmap_nores'}   ||= ' -n ';
 to enable nmap resolv: $config{'nmap_nores'} = '';

mod_perl can now work with PerlResponseHandler ModPerl::Registry, but needs more fixes

web/.htaccess renamed to web/.htaccess.dist

ftp codepage autodetect (dev.. need testing)
$config{'allow_cp_detect'} = 1;


0.15.15: 

bugs fixed

changed distribution name from search.* to pro-search.*

sql table: ranges table changed, run: perl crawler.pl drop=ranges install
                                  or  perl crawler.pl upgrade
and reload ranges.

top 100 resources - goto page ..

crawler and index now print config.pl errors

web: output modes - normal, easy, lite, ...

web: new output system - big changes in config params

to run proc once and exit after all scans -
$config{'once'} = 1;
OR
perl crawler.pl proc=3 --once

web: This Page Is Valid XHTML 1.0 Strict!
but now with $config{'w3_org_valid'}=1;
web: Congratulations! This document validates as CSS! 
web: Congratulations! This is a valid RSS feed.

web: shortcut keys: ctrl + <- , ctrl + -> , ctrl + ^ = prev page, next page, focus the query string

New css system, css style switcher

Cookies!
 the params
  @{ $config{'user_param_save'} } = qw(show_how lang css advanced expert); 
 are now saved in cookies.

it is now possible to pass many parameters with one name
old: &q=one&q1=two    crawler.pl file=fileone file1=filetwo
now: &q=one&q=two     crawler.pl file=fileone file=filetwo

web: param: show_adv -> advanced

ftp://host/russian and non-standard ([]!@#$%^) symbols/now/possible/to/click/in/broken/ie

web: expert mode
 &expert=on
 %-)

regexp enabled
 $config{'allow_regex'} ||= 1; # allow regex queryes
 example:
 &host=re:10\.10\.[12].?\..+

now possible: 
 /search/?[size time prot host path name ext ...][!<>=~]=100
 example:
 /search/?size>=100

$config{'up_link'} now is $config{'custom_link'}, and must be full html:
 '<a href="../">..</a>'

$config{'web_max_time'} is now split into
 $config{'web_max_search_time'} ||= 60; 
 $config{'web_max_finish_time'} ||= 60; 


search in founded (search within results) and complex queries now work perfectly 8)

internal changes (new proc)
every action can now be run from proc

dcbot now in crawler 
 hublist load:
 perl crawler.pl hublist[=url or file]
 perl crawler.pl --noscan dchub://hub.com dchub://hub2.com 
 or load and scan:
 perl crawler.pl dchub://hub.com dchub://hub3.com 
 or you can add hubs to 'range' file and load it with
 perl crawler.pl --noscan file=range

0.15.14: 

bugs fixed (user:pass handling, ....)

$config{'allow_search_in_founded'} = 1; #now works!

http titles fixed

  $config{'res_list_min_size'} ||= 100; # if 1 - show http in top list


$config{'allow_topfiles'}       = 1; # top downloaded files
if you upgrade run  
 perl crawler.pl install 
to enable

internal sql query changes

$config{'web_max_time'} = 60; #force close after 60 seconds (buggy DB, heavy load)

$config{'show_query_ip'} = 0; # query last: mobile[172.16.1.17](21s) games[172.16.1.17](2m)  

web: add/rescan key to rescan resource by user request
  $config{'allow_web_rescan'} = 0;
  $config{'web_rescan_other'} = 0; # allow rescan any host
  $config{'web_rescan_min_time'}        = 60 * 60; 

crawler.pl file= param can be ftp://.. http://.. file://..

dcbot: locktokey now works!

dcbot: upload without ftp scans

mysql4: fixed wrong query for mrtg


0.15.13: 

dc: bot repaired.

dc: $config{'hublist'}  = '[ftp|file|http]://zz/z/z.xml[.bz2]';

web: tigers removed from desc, tiger can be defined in files.bbs

web: voting display changed


0.15.12: first dc test

download counting now is hidden (direct href's with hidden dynamic image)

dc hublist support
$config{'hublist'} = 'http://hub.lan/hublist.xml';

dc multihub support 
dc new config
new hub config:
$config{'dchub'}{'cool_hub_name'} = {
  'host' => 'dcpp.hub.ru',
  'port' => 4111, 
};
$config{'myip'}   = '10.20.30.40'; # must be set: needed for dc incoming connections
$config{'myport'} = '6778'; # incoming dc port
new dc table creation:
see config.pl.dist:
$config{'use_dc'} = 1;

new confdef:
  $config{'dcbot_sleep'} ||= 60 * 60 * 1; # sleep after all dchubs scan: 1 hour
  $config{'dcbot_upload'} = 1; #upload filelists to DB after downloading
  $config{'dcbot_once'} = 0; # scan once and exit


new action: deldead - delete all resources with 0 files

bugs fixed

0.15.11:

now we count skipped files in resource

votes system 
if you upgrade from previous version run
 perl crawler.pl upgrade
or reinstall tables:
 perl crawler.pl drop install

pinger now pings file:// resources

bugs fixed(http redirector, alt, ftp get)

Yahaa! now we have sf.net page: 
http://sourceforge.net/projects/pro-search/


0.15.10:

mysql4 and mysql3 bugs fixed
 $config{'sql_driver'} = 'mysql5';  # or mysql4 or mysql3


0.15.9: 

lots of bugs added, some bugs fixed

ftp crawling algo changed [need to teeeest]

web ftp add: supports user:pass@host:port/path url, many url separated by space

ftp user pass support ftp://user:pass@host:port/path

ftp passive mode auto detect [needs more testing]

ftp passive mode
for one host: $config{'fine'}{'host-name-or-ip'}{'passive'} = 1;
for all hosts: $config{'passive'} = 1;

0.15.8:

bugs fixed


0.15.7:

serious bugs fixed

dc: very alpha bot


0.15.6:

bugs fixed, new added 8(

new + - links in web results

new actions: one_range one_res  used by  proc

sigtrap's NEED TO TEST

help writing started. enable help : $config{'help_dev'} = 1;
help files: web/help_*.js

fixed crawler filters

$config{'more_examples'} now is $config{'example_bottom'}
$config{'example_bottom'} now is $config{'main_bottom'}

disabled $config{'mrtg_out'} - default output for mrtg to screen
added crawler.pl mrtg=online 
for online files and bytes


0.15.5:

bugs fixed

new $config{'range_period'} (was $config{'periodic_time'})

action killer for biiiiiig bases

hashes removed by default, for old bases use $config{'keep_hash'} = 1;


0.15.4:

+ $config{'allow_pinger'} = 1; # new online web pinger - enabled by default  

+ $config{'scan_online_res'} = 1; # scan resources ONLY if they were pinged within the last $config{'online_minutes'}

fixed broken web online

fixed redirector

in css all "_" in class name changed to "-"

ftp recurse mode now saves and loads the toscan list if a scan fails

...................................................

$config{'sql_mysql'} -> $config{'mysql'}

$config{'nmap_prog'} -> $config{'nmap'}

cvs -> svn

popular (download stat)

 $config{'sql_name'}  $config{'sql_user'} 

RSS web output [devel]

 $config{'sql'}{'  $config{'sql_
  $config{'sql'}{'mysql'}   $config{'sql_mysql'}
 $config params can now be passed on the crawler.pl command line:
perl crawler.pl conf-log_screen=1 conf-sql_mysql=/usr/local/mysql/bin/mysql 

auto detect of disabled ls -R (vsftpd)

samba support

Cvs started


