Scott Seago
2008-Jul-30 22:36 UTC
[Ovirt-devel] [PATCH] includes basic search functionality.
For this to work, you've got to have xapian-core and xapian-bindings-ruby installed. The latter is currently only available in updates-testing, but apevec is adding it to the ovirt repo for now. In addition, there's a new migration required here, so you'll need to run rake db:migrate.

Currently we do a full reindex when restarting mongrel; incremental index updates are not yet in place. Permission checking is also still missing. I've included basic fields in the search parameters -- we should look through and determine whether any additional attributes/associations need to be included.

The styling is _mostly_ consistent with tallen's mockups. Notable differences:

1) The inline search field needs css and possibly a new image.
2) I need "selected" and "unselected" images for the type pulldown -- currently I'm just using a text "X" for selected and a blank for unselected.
3) No "smart" pool column or pulldown, since that functionality is not yet implemented.
4) I've added a ranking column to the results.
5) The details pane works here just like it does for hosts, vms, etc. on other pages.
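As a rough illustration of how the details pane can find the right page for a selected search result: the patch gives each grid row an id of the form ClassName_id and maps the class name back to a controller and action. The sketch below mirrors that idea in plain Ruby. ROUTES, row_id and details_target are names made up for this example, not code from the patch, and the table is a trimmed-down assumption rather than the full MODELS hash.

```ruby
# Hypothetical, trimmed-down stand-in for the MODELS table in
# search_controller.rb (illustration only).
ROUTES = {
  "Host"         => { :controller => "host",     :action => "show" },
  "Vm"           => { :controller => "vm",       :action => "show" },
  "HardwarePool" => { :controller => "hardware", :action => "quick_summary" }
}

# Build a grid row id in the "<ClassName>_<id>" form used for search results.
def row_id(class_name, id)
  "#{class_name}_#{id}"
end

# Reverse the encoding and look up where the details pane should load from.
# Returns nil for an unknown class or a malformed id instead of raising.
def details_target(class_and_id)
  class_name, id = class_and_id.split("_", 2)
  route = ROUTES[class_name]
  return nil if route.nil? || id.nil?
  route.merge(:id => id.to_i)
end
```

Returning nil for an unrecognised class name is a choice made for this sketch; a controller would more likely respond with an error page or a redirect.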
Signed-off-by: Scott Seago <sseago at redhat.com>
---
 wui/conf/ovirt-mongrel-rails | 3 +-
 wui/ovirt-wui.spec | 6 +
 wui/scripts/ovirt-reindex-search | 4 +
 wui/scripts/ovirt-update-search | 4 +
 wui/src/app/controllers/search_controller.rb | 110 ++++
 wui/src/app/models/host.rb | 13 +
 wui/src/app/models/pool.rb | 8 +
 wui/src/app/models/storage_pool.rb | 5 +
 wui/src/app/models/vm.rb | 8 +
 wui/src/app/models/vm_resource_pool.rb | 2 +-
 wui/src/app/views/layouts/_header_redux.rhtml | 4 +-
 wui/src/app/views/layouts/_navigation_tabs.rhtml | 4 +
 wui/src/app/views/layouts/_tree.rhtml | 6 +-
 wui/src/app/views/search/_grid.rhtml | 40 ++
 wui/src/app/views/search/results.rhtml | 68 +++
 wui/src/db/migrate/011_create_acts_as_xapian.rb | 14 +
 wui/src/dutils/active_record_env.rb | 1 +
 wui/src/vendor/plugins/acts_as_xapian/.gitignore | 3 +
 wui/src/vendor/plugins/acts_as_xapian/LICENSE.txt | 21 +
 wui/src/vendor/plugins/acts_as_xapian/README.txt | 236 ++++++++
 .../acts_as_xapian/generators/acts_as_xapian/USAGE | 1 +
 .../acts_as_xapian/acts_as_xapian_generator.rb | 13 +
 .../acts_as_xapian/templates/migration.rb | 14 +
 wui/src/vendor/plugins/acts_as_xapian/init.rb | 9 +
 .../plugins/acts_as_xapian/lib/acts_as_xapian.rb | 629 ++++++++++++++++++++
 .../plugins/acts_as_xapian/tasks/xapian.rake | 43 ++
 26 files changed, 1262 insertions(+), 7 deletions(-)
 create mode 100755 wui/scripts/ovirt-reindex-search
 create mode 100755 wui/scripts/ovirt-update-search
 create mode 100644 wui/src/app/controllers/search_controller.rb
 create mode 100644 wui/src/app/views/search/_grid.rhtml
 create mode 100644 wui/src/app/views/search/results.rhtml
 create mode 100644 wui/src/db/migrate/011_create_acts_as_xapian.rb
 create mode 100644 wui/src/vendor/plugins/acts_as_xapian/.gitignore
 create mode 100644 wui/src/vendor/plugins/acts_as_xapian/LICENSE.txt
 create mode 100644 wui/src/vendor/plugins/acts_as_xapian/README.txt
 create mode 100644 wui/src/vendor/plugins/acts_as_xapian/generators/acts_as_xapian/USAGE
 create mode
100644 wui/src/vendor/plugins/acts_as_xapian/generators/acts_as_xapian/acts_as_xapian_generator.rb create mode 100644 wui/src/vendor/plugins/acts_as_xapian/generators/acts_as_xapian/templates/migration.rb create mode 100644 wui/src/vendor/plugins/acts_as_xapian/init.rb create mode 100644 wui/src/vendor/plugins/acts_as_xapian/lib/acts_as_xapian.rb create mode 100644 wui/src/vendor/plugins/acts_as_xapian/tasks/xapian.rake diff --git a/wui/conf/ovirt-mongrel-rails b/wui/conf/ovirt-mongrel-rails index 9c770ac..7cfaf2d 100755 --- a/wui/conf/ovirt-mongrel-rails +++ b/wui/conf/ovirt-mongrel-rails @@ -21,7 +21,7 @@ PREFIX="${PREFIX:-/ovirt}" MONGREL_PROG=mongrel_rails ADDR=127.0.0.1 - +REINDEX_PROG=/usr/sbin/ovirt-reindex-search RETVAL=0 . /etc/init.d/functions @@ -29,6 +29,7 @@ RETVAL=0 start() { echo -n "Starting ovirt-mongrel-rails: " + RAILS_ENV=$RAILS_ENVIRONMENT $REINDEX_PROG $MONGREL_PROG start -c $OVIRT_DIR -l $MONGREL_LOG -P $MONGREL_PID \ -a $ADDR -e $RAILS_ENVIRONMENT --user $USER --group $GROUP \ -d --prefix=$PREFIX diff --git a/wui/ovirt-wui.spec b/wui/ovirt-wui.spec index 9ed33ca..9dda52a 100644 --- a/wui/ovirt-wui.spec +++ b/wui/ovirt-wui.spec @@ -21,6 +21,8 @@ Requires: rubygem(krb5-auth) >= 0.6 Requires: ruby-gettext-package Requires: postgresql-server Requires: ruby-postgres +Requires: xapian-bindings-ruby +Requires: xapian-core Requires: pwgen Requires: httpd >= 2.0 Requires: mod_auth_kerb @@ -99,6 +101,8 @@ touch %{buildroot}%{_localstatedir}/log/%{name}/host-status.log %{__cp} -a %{pbuild}/scripts/ovirt-add-host %{buildroot}%{_bindir} %{__cp} -a %{pbuild}/scripts/ovirt-wui-install %{buildroot}%{_sbindir} +%{__cp} -a %{pbuild}/scripts/ovirt-reindex-search %{buildroot}%{_sbindir} +%{__cp} -a %{pbuild}/scripts/ovirt-update-search %{buildroot}%{_sbindir} %{__rm} -rf %{buildroot}%{app_root}/tmp %{__mkdir} %{buildroot}%{_localstatedir}/lib/%{name}/tmp %{__ln_s} %{_localstatedir}/lib/%{name}/tmp %{buildroot}%{app_root}/tmp @@ -158,6 +162,8 @@ fi %files 
%defattr(-,root,root,0755) %{_sbindir}/ovirt-wui-install +%{_sbindir}/ovirt-reindex-search +%{_sbindir}/ovirt-update-search %{_bindir}/ovirt-add-host %{_initrddir}/ovirt-host-browser %{_initrddir}/ovirt-host-status diff --git a/wui/scripts/ovirt-reindex-search b/wui/scripts/ovirt-reindex-search new file mode 100755 index 0000000..9fef717 --- /dev/null +++ b/wui/scripts/ovirt-reindex-search @@ -0,0 +1,4 @@ +#!/bin/bash +RAKEFILE=/usr/share/ovirt-wui/Rakefile +MODELS="Host Vm IscsiStoragePool NfsStoragePool HardwarePool VmResourcePool" +rake -f $RAKEFILE xapian:rebuild_index models="$MODELS" diff --git a/wui/scripts/ovirt-update-search b/wui/scripts/ovirt-update-search new file mode 100755 index 0000000..b6add27 --- /dev/null +++ b/wui/scripts/ovirt-update-search @@ -0,0 +1,4 @@ +#!/bin/bash +RAKEFILE=/usr/share/ovirt-wui/Rakefile +MODELS="Host Vm IscsiStoragePool NfsStoragePool HardwarePool VmResourcePool" +rake -f $RAKEFILE xapian:update_index models="$MODELS" diff --git a/wui/src/app/controllers/search_controller.rb b/wui/src/app/controllers/search_controller.rb new file mode 100644 index 0000000..ca4fbb1 --- /dev/null +++ b/wui/src/app/controllers/search_controller.rb @@ -0,0 +1,110 @@ +# +# Copyright (C) 2008 Red Hat, Inc. +# Written by Scott Seago <sseago at redhat.com> +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; version 2 of the License. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, +# MA 02110-1301, USA. 
A copy of the GNU General Public License is +# also available at http://www.gnu.org/copyleft/gpl.html. + +class SearchController < ApplicationController + + MODELS = {"HardwarePool" => {:controller => "hardware", + :show_action => "quick_summary"}, + "VmResourcePool" => {:controller => "resources", + :show_action => "quick_summary"}, + "Host" => {:controller => "host", + :show_action => "show"}, + "Vm" => {:controller => "vm", + :show_action => "show"}, + "IscsiStoragePool" => {:controller => "storage", + :show_action => "show"}, + "NfsStoragePool" => {:controller => "storage", + :show_action => "show"}} + + MULTI_TYPE_MODELS = {"StoragePool" => ["IscsiStoragePool", "NfsStoragePool"]} + + + def single_result + class_and_id = params[:class_and_id].split("_") + + + redirect_to :controller => MODELS[class_and_id[0]][:controller], + :action => MODELS[class_and_id[0]][:show_action], + :id => class_and_id[1] + end + + def results_internal + @terms = params[:terms] + @model_param = params[:model] + @model_param ||= "" + + if @model_param == "" + @models = MODELS.keys + else + @models = MULTI_TYPE_MODELS[@model_param] + @models ||= [@model_param] + end + @user = get_login_user + + @page = params[:page].to_i + @page ||= 1 + @per_page = params[:rp].to_i + @per_page ||= 20 + @offset = (@page-1)*@per_page + @results = ActsAsXapian::Search.new(@models, + @terms, + :offset => @offset, + :limit => @per_page, + :sort_by_prefix => nil, + :collapse_by_prefix => nil) + #FIXME filter on permissions + end + + def results + results_internal + @types = [["Hardware Pools", "HardwarePool"], + ["Virtual Machine Pools", "VmResourcePool"], + ["Hosts", "Host"], + ["VMs", "Vm"], + ["Storage Pools", "StoragePool", "break"], + ["Show All", ""]] + end + + def results_json + results_internal + + + json_hash = {} + json_hash[:page] = @page + json_hash[:total] = @results.matches_estimated + json_hash[:rows] = @results.results.collect do |result| + item_hash = {} + item = result[:model] + 
item_hash[:id] = item.class.name+"_"+item.id.to_s + item_hash[:cell] = ["display_name", "display_class"].collect do |attr| + if attr.is_a? Array + value = item + attr.each { |attr_item| value = value.send(attr_item)} + value + else + item.send(attr) + end + end + item_hash[:cell] << result[:percent] + item_hash + end + render :json => json_hash.to_json + + end +end diff --git a/wui/src/app/models/host.rb b/wui/src/app/models/host.rb index 80acaeb..33a5fa1 100644 --- a/wui/src/app/models/host.rb +++ b/wui/src/app/models/host.rb @@ -40,6 +40,12 @@ class Host < ActiveRecord::Base end end + acts_as_xapian :texts => [ :hostname, :uuid ], + :values => [ [ :created_at, 0, "created_at", :date ], + [ :updated_at, 1, "updated_at", :date ] ], + :terms => [ [ :hostname, 'H', "hostname" ] ] + + KVM_HYPERVISOR_TYPE = "KVM" HYPERVISOR_TYPES = [KVM_HYPERVISOR_TYPE] STATE_UNAVAILABLE = "unavailable" @@ -75,4 +81,11 @@ class Host < ActiveRecord::Base def cpu_speed "FIX ME!" end + + def display_name + hostname + end + def display_class + "Host" + end end diff --git a/wui/src/app/models/pool.rb b/wui/src/app/models/pool.rb index 4a21c4a..4aa240b 100644 --- a/wui/src/app/models/pool.rb +++ b/wui/src/app/models/pool.rb @@ -75,6 +75,8 @@ class Pool < ActiveRecord::Base end end + acts_as_xapian :texts => [ :name ] + # this method lists pools with direct permission grants, but does not # include implied permissions (i.e. 
subtrees) def self.list_for_user(user, privilege) @@ -225,6 +227,12 @@ class Pool < ActiveRecord::Base obj.send(method, *args) end + def display_name + name + end + def display_class + get_type_label + end protected def traverse_parents if id diff --git a/wui/src/app/models/storage_pool.rb b/wui/src/app/models/storage_pool.rb index a135047..39b6a08 100644 --- a/wui/src/app/models/storage_pool.rb +++ b/wui/src/app/models/storage_pool.rb @@ -32,6 +32,7 @@ class StoragePool < ActiveRecord::Base validates_presence_of :ip_addr, :hardware_pool_id + acts_as_xapian :texts => [ :ip_addr, :target, :export_path ] ISCSI = "iSCSI" NFS = "NFS" STORAGE_TYPES = { ISCSI => "Iscsi", @@ -55,4 +56,8 @@ class StoragePool < ActiveRecord::Base def get_type_label STORAGE_TYPES.invert[self.class.name.gsub("StoragePool", "")] end + def display_class + "Storage Pool" + end + end diff --git a/wui/src/app/models/vm.rb b/wui/src/app/models/vm.rb index b607886..34d5bf4 100644 --- a/wui/src/app/models/vm.rb +++ b/wui/src/app/models/vm.rb @@ -31,6 +31,8 @@ class Vm < ActiveRecord::Base validates_presence_of :uuid, :description, :num_vcpus_allocated, :memory_allocated_in_mb, :memory_allocated, :vnic_mac_addr + acts_as_xapian :texts => [ :uuid, :description, :vnic_mac_addr, :state ] + BOOT_DEV_HD = "hd" BOOT_DEV_NETWORK = "network" BOOT_DEV_CDROM = "cdrom" @@ -185,6 +187,12 @@ class Vm < ActiveRecord::Base (state == Vm::STATE_RUNNING ) and host and vnc_port end + def display_name + description + end + def display_class + "VM" + end protected def validate resources = vm_resource_pool.max_resources_for_vm(self) diff --git a/wui/src/app/models/vm_resource_pool.rb b/wui/src/app/models/vm_resource_pool.rb index de8bcb3..d3f7d43 100644 --- a/wui/src/app/models/vm_resource_pool.rb +++ b/wui/src/app/models/vm_resource_pool.rb @@ -20,7 +20,7 @@ class VmResourcePool < Pool def get_type_label - "Hardware Pool" + "Virtual Machine Pool" end def get_controller return 'resources' diff --git 
a/wui/src/app/views/layouts/_header_redux.rhtml b/wui/src/app/views/layouts/_header_redux.rhtml index 6dbf0d0..18014ab 100644 --- a/wui/src/app/views/layouts/_header_redux.rhtml +++ b/wui/src/app/views/layouts/_header_redux.rhtml @@ -2,8 +2,8 @@ <div class="header_info"> <div id="hi-username">Hi, <%= @user %></div> - <form id="search-form" action="globalSearch.jspa"> - <input id="textfield_effect" name="q" value="Search" onkeypress="" onfocus="if( this.value == this.defaultValue ) this.value='';" type="text"> + <form method="POST" id="search-form" action="<%= url_for :controller => "search", :action => 'results' %>"> + <input id="textfield_effect" name="terms" value="Search" onkeypress="" onfocus="if( this.value == this.defaultValue ) this.value='';" type="text"> <input id="searchbox-button" src="<%= image_path "icon_search.png"%>" title="Search" type="image"> | </form> <a id="help-link" href="#" ><%= image_tag "icon_help.png" %></a> <!-- FIXME wire link correctly --> diff --git a/wui/src/app/views/layouts/_navigation_tabs.rhtml b/wui/src/app/views/layouts/_navigation_tabs.rhtml index 0db49b6..771958c 100644 --- a/wui/src/app/views/layouts/_navigation_tabs.rhtml +++ b/wui/src/app/views/layouts/_navigation_tabs.rhtml @@ -28,4 +28,8 @@ <li id="nav_vmpool"> <%= link_to "Virtual Machines", {:action => 'show_vms', :id => @vm_resource_pool.id, :nolayout => :true}, :title => "content area" %></li> <li id="nav_access"> <%= link_to "User Access", {:action => 'show_users', :id => @vm_resource_pool.id, :nolayout => :true}, :title => "content area" %></li> </ul> +<% elsif controller.controller_name == "search" %> + <ul id="resources_nav_tabs" class="ui-tabs-nav"> + <li id="nav_search" class="ui-tabs-selected"><a href="#">Search Results</a></li> + </ul> <% end %> \ No newline at end of file diff --git a/wui/src/app/views/layouts/_tree.rhtml b/wui/src/app/views/layouts/_tree.rhtml index 182c1b1..8b996cc 100644 --- a/wui/src/app/views/layouts/_tree.rhtml +++ 
b/wui/src/app/views/layouts/_tree.rhtml @@ -19,9 +19,9 @@ hardware_url: "<%= url_for :controller =>'/hardware', :action => 'show' %>", resource_url: "<%= url_for :controller =>'/resources', :action => 'show' %>" } - $('#test-tree').everyTime(10000,function(){ - load(tree_reload, {}, this, this); - }) + //$('#test-tree').everyTime(10000,function(){ + // load(tree_reload, {}, this, this); + //}) }); </script> diff --git a/wui/src/app/views/search/_grid.rhtml b/wui/src/app/views/search/_grid.rhtml new file mode 100644 index 0000000..6de9074 --- /dev/null +++ b/wui/src/app/views/search/_grid.rhtml @@ -0,0 +1,40 @@ +<% per_page = 40 %> +<div id="<%= table_id %>_div"> +<%= '<form id="#{table_id}_form">' if checkboxes %> +<table id="<%= table_id %>" style="display:none"></table> +<%= '</form>' if checkboxes %> +</div> +<script type="text/javascript"> + $("#<%= table_id %>").flexigrid + ( + { + url: '<%= url_for :controller => "search", + :action => "results_json" %>', + params: [{name: "terms", value: '<%=terms%>'}, + {name: "model", value: '<%=model%>'}, + {name: "checkboxes", value: <%=checkboxes%>}], + dataType: 'json', + colModel : [ + <%= "{display: '', width : 20, align: 'left', process: #{table_id}checkbox}," if checkboxes %> + {display: 'Name', width : 200, align: 'left'}, + {display: 'Type', width : 120, align: 'left'}, + {display: 'Rank', width : 60, align: 'left'} + ], + //sortname: "hostname", + //sortorder: "asc", + usepager: true, + useRp: true, + rp: <%= per_page %>, + showTableToggleBtn: true, + onSelect: <%= on_select %>, + onDeselect: false, + onHover: false, + onUnhover: false + } + ); + function <%= table_id %>checkbox(celDiv) + { + $(celDiv).html('<input class="grid_checkbox" type="checkbox" name="grid_checkbox'+$(celDiv).html()+'" value="'+$(celDiv).html()+'"/>'); + } + +</script> diff --git a/wui/src/app/views/search/results.rhtml b/wui/src/app/views/search/results.rhtml new file mode 100644 index 0000000..a7641b7 --- /dev/null +++ 
b/wui/src/app/views/search/results.rhtml @@ -0,0 +1,68 @@ +<div id="toolbar_nav"> +<form method="POST" id="search-form" action="<%= url_for :controller => "search", :action => 'results' %>"> +<ul> + <li> + <input id="searchform-field" name="terms" value="<%=@terms%>" onkeypress="" type="text"> + <input id="searchform-button" src="<%= image_path "icon_search.png"%>" title="Search" type="image"> + <input id="searchform-model" type="hidden" name="model" value="<%=@model_param%>"> + </li> + <li> + <%= image_tag "icon_move.png", :style => "vertical-align:middle;" %> Actions <%= image_tag "icon_toolbar_arrow.gif", :style => "vertical-align:middle;" %> + <ul> + <% @types.each_index { |index| %> +<!-- for each button we need to submit current form with "model" set to @types[index][1] --!> + <li onclick="$('#searchform-model').val('<%=@types[index][1]%>'); $('#searchform-button').click();" + <% if (index == @types.length - 1) or @types[index].length == 3 %> + style="border-bottom: 1px solid #CCCCCC;" + <% end %> + > +<!-- < % = image_tag ... --> + <%= @model_param == @types[index][1] ? 
"X" : " " %> + <%=@types[index][0]%> + </li> + <% } %> + </ul> + </li> +</ul> +</form> +</div> + +<script type="text/javascript"> + function results_select(selected_rows) + { + var selected_ids = new Array() + for(i=0; i<selected_rows.length; i++) { + selected_ids[i] = selected_rows[i].id; + } + if (selected_ids.length == 1) + { + $('#results_selection').load('<%= url_for :controller => "search", :action => "single_result" %>', + { class_and_id: selected_ids[0].substring(3)}) + } + } +</script> + +<div class="panel_header"></div> + <% if @results.matches_estimated != 0 %> + <div class="data_section"> + <%= render :partial => "/search/grid", :locals => { :table_id => "search_grid", + :terms => @terms, + :model => @model_param, + :checkboxes => false, + :on_select => "results_select" } %> + </div> + <div class="selection_detail" id="results_selection"> + <div class="selection_left"> + <div>Select an item above.</div> + </div> + </div> +<% else %> + <div class="data_section"> + <div class="no-grid-items"> + <%= image_tag 'no-grid-items.png', :style => 'float: left;' %> + <div class="no-grid-items-text"> + No results found. 
<br/><br/> + </div> + </div> + </div> +<% end %> diff --git a/wui/src/db/migrate/011_create_acts_as_xapian.rb b/wui/src/db/migrate/011_create_acts_as_xapian.rb new file mode 100644 index 0000000..84a9dd7 --- /dev/null +++ b/wui/src/db/migrate/011_create_acts_as_xapian.rb @@ -0,0 +1,14 @@ +class CreateActsAsXapian < ActiveRecord::Migration + def self.up + create_table :acts_as_xapian_jobs do |t| + t.column :model, :string, :null => false + t.column :model_id, :integer, :null => false + t.column :action, :string, :null => false + end + add_index :acts_as_xapian_jobs, [:model, :model_id], :unique => true + end + def self.down + drop_table :acts_as_xapian_jobs + end +end + diff --git a/wui/src/dutils/active_record_env.rb b/wui/src/dutils/active_record_env.rb index 72feb89..9b7b416 100644 --- a/wui/src/dutils/active_record_env.rb +++ b/wui/src/dutils/active_record_env.rb @@ -36,6 +36,7 @@ require 'erb' OVIRT_DIR = "/usr/share/ovirt-wui" require "#{OVIRT_DIR}/vendor/plugins/betternestedset/init.rb" +require "#{OVIRT_DIR}/vendor/plugins/acts_as_xapian/lib/acts_as_xapian" def database_connect $dbconfig = YAML::load(ERB.new(IO.read("#{OVIRT_DIR}/config/database.yml")).result) diff --git a/wui/src/vendor/plugins/acts_as_xapian/.gitignore b/wui/src/vendor/plugins/acts_as_xapian/.gitignore new file mode 100644 index 0000000..3fad9cc --- /dev/null +++ b/wui/src/vendor/plugins/acts_as_xapian/.gitignore @@ -0,0 +1,3 @@ +xapiandbs +CVS +*.swp diff --git a/wui/src/vendor/plugins/acts_as_xapian/LICENSE.txt b/wui/src/vendor/plugins/acts_as_xapian/LICENSE.txt new file mode 100644 index 0000000..72d93c4 --- /dev/null +++ b/wui/src/vendor/plugins/acts_as_xapian/LICENSE.txt @@ -0,0 +1,21 @@ +acts_as_xapian is released under the MIT License. + +Copyright (c) 2008 UK Citizens Online Democracy. 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy +of the acts_as_xapian software and associated documentation files (the +"Software"), to deal in the Software without restriction, including without +limitation the rights to use, copy, modify, merge, publish, distribute, +sublicense, and/or sell copies of the Software, and to permit persons to whom +the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. diff --git a/wui/src/vendor/plugins/acts_as_xapian/README.txt b/wui/src/vendor/plugins/acts_as_xapian/README.txt new file mode 100644 index 0000000..239289e --- /dev/null +++ b/wui/src/vendor/plugins/acts_as_xapian/README.txt @@ -0,0 +1,236 @@ +Do patch this file if there is documentation missing / wrong. It's called +README.txt and is in git, using Textile formatting. The wiki page is just +copied from the README.txt file. + +Contents +=======+ +* a. Introduction to acts_as_xapian +* b. Installation +* c. Comparison to acts_as_solr (as on 24 April 2008) +* d. Documentation - indexing +* e. Documentation - querying +* f. Support + + +a. Introduction to acts_as_xapian +================================+ +"Xapian":http://www.xapian.org is a full text search engine library which has +Ruby bindings. acts_as_xapian adds support for it to Rails. 
It is an +alternative to acts_as_solr, acts_as_ferret, Ultrasphinx, acts_as_indexed, +acts_as_searchable or acts_as_tsearch. + +acts_as_xapian is deployed in production on these websites. +* "WhatDoTheyKnow":http://www.whatdotheyknow.com +* "MindBites":http://www.mindbites.com + +The section "c. Comparison to acts_as_solr" below will give you an idea of +acts_as_xapian's features. + +acts_as_xapian was started by Francis Irving in May 2008 for search and email +alerts in WhatDoTheyKnow, and so was supported by "mySociety":http://www.mysociety.org +and initially paid for by the "JRSST Charitable Trust":http://www.jrrt.org.uk/jrsstct.htm + + +b. Installation +==============+ +Retrieve the plugin directly from the git version control system by running +this command within your Rails app. + + git clone git://github.com/frabcus/acts_as_xapian.git vendor/plugins/acts_as_xapian + +Xapian 1.0.5 and associated Ruby bindings are also required. + +Debian or Ubuntu - install the packages libxapian15 and libxapian-ruby1.8. + +Mac OSX - follow the instructions for installing from source on +the "Installing Xapian":http://xapian.org/docs/install.html page - you need the +Xapian library and bindings (you don't need Omega). + +There is no Ruby Gem for Xapian, it would be great if you could make one! + + +c. Comparison to acts_as_solr (as on 24 April 2008) +============================+ +* Offline indexing only mode - which is a minus if you want changes +immediately reflected in the search index, and a plus if you were going to +have to implement your own offline indexing anyway. + +* Collapsing - the equivalent of SQL's "group by". You can specify a field +to collapse on, and only the most relevant result from each value of that +field is returned. Along with a count of how many there are in total. +acts_as_solr doesn't have this. + +* No highlighting - Xapian can't return you text highlighted with a search +query. 
You can try and make do with TextHelper::highlight (combined with +words_to_highlight below). I found the highlighting in acts_as_solr didn't +really understand the query anyway. + +* Date range searching - this exists in acts_as_solr, but I found it +wasn't documented well enough, and was hard to get working. + +* Spelling correction - "did you mean?" built in and just works. + +* Similar documents - acts_as_xapian has a simple command to find other models +that are like a specified model. + +* Multiple models - acts_as_xapian searches multiple types of model if you +like, returning them mixed up together by relevancy. This is like +multi_solr_search, only it is the default mode of operation and is properly +supported. + +* No daemons - However, if you have more than one web server, you'll need to +work out how to use "Xapian's remote backend":http://xapian.org/docs/remote.html. + +* One layer - full-powered Xapian is called directly from the Ruby, without +Solr getting in the way whenever you want to use a new feature from Lucene. + +* No Java - an advantage if you're more used to working in the rest of the +open source world. acts_as_xapian, it's pure Ruby and C++. + +* Xapian's awesome email list - the kids over at +"xapian-discuss":http://lists.xapian.org/mailman/listinfo/xapian-discuss +are super helpful. Useful if you need to extend and improve acts_as_xapian. The +Ruby bindings are mature and well maintained as part of Xapian. + + +d. Documentation - indexing +==========================+ +Xapian is an *offline indexing* search library - only one process can have the +Xapian database open for writing at once, and others that try meanwhile are +unceremoniously kicked out. For this reason, acts_as_xapian does not support +immediate writing to the database when your models change. + +Instead, there is a ActsAsXapianJob model which stores which models need +updating or deleting in the search index. 
A rake task 'xapian:update_index' +then performs the updates since last change. You can run it on a cron job, or +similar. + +Here's how to add indexing to your Rails app: + +1. Put acts_as_xapian in your models that need search indexing. e.g. + + acts_as_xapian :texts => [ :name, :short_name ], + :values => [ [ :created_at, 0, "created_at", :date ] ], + :terms => [ [ :variety, 'V', "variety" ] ] + +Options must include: + +* :texts, an array of fields for indexing with full text search. +e.g. :texts => [ :title, :body ] + +* :values, things which have a range of values for sorting, or for collapsing. +Specify an array quadruple of [ field, identifier, prefix, type ] where +** identifier is an arbitary numeric identifier for use in the Xapian database +** prefix is the part to use in search queries that goes before the : +** type can be any of :string, :number or :date + +e.g. :values => [ [ :created_at, 0, "created_at", :date ], +[ :size, 1, "size", :string ] ] + +* :terms, things which come with a prefix (before a :) in search queries. +Specify an array triple of [ field, char, prefix ] where +** char is an arbitary single upper case char used in the Xapian database, just +pick any single uppercase character, but use a different one for each prefix. +** prefix is the part to use in search queries that goes before the : +For example, if you were making Google and indexing to be able to later do a +query like "site:www.whatdotheyknow.com", then the prefix would be "site". + +e.g. :terms => [ [ :variety, 'V', "variety" ] ] + +A 'field' is a symbol referring to either an attribute or a function which +returns the text, date or number to index. Both 'identifier' and 'char' must be +the same for the same prefix in different models. + +Alternatively, +* :instead_index, a field which refers to another model that should be reindexed + instead of this one. 
+ +Options may include: +* :eager_load, added as an :include clause when looking up search results in +database +* :if, either an attribute or a function which if returns false means the +object isn't indexed + +2. Generate a database migration to create the ActsAsXapianJob model: + + script/generate acts_as_xapian + rake db:migrate + +3. Call 'rake xapian:rebuild_index models="ModelName1 ModelName2"' to build the index +the first time (you must specify all your indexed models). It's put in a +development/test/production dir in acts_as_xapian/xapiandbs. + +4. Then from a cron job or a daemon, or by hand regularly!, call 'rake xapian:update_index' + + +e. Documentation - querying +==========================+ +Testing indexing +---------------- + +If you just want to test indexing is working, you'll find this rake task +useful (it has more options, see tasks/xapian.rake) + + rake xapian:query models="PublicBody User" query="moo" + +Performing a query +------------------ + +To perform a query from code call ActsAsXapian::Search.new. This takes in turn: +* model_classes - list of models to search, e.g. [PublicBody, InfoRequestEvent] +* query_string - Google like syntax, see below + +And then a hash of options: +* :offset - Offset of first result (default 0) +* :limit - Number of results per page (default -1, all) +* :sort_by_prefix - Optionally, prefix of value to sort by, otherwise sort by relevance +* :sort_by_ascending - Default true, set to false for descending sort +* :collapse_by_prefix - Optionally, prefix of value to collapse by (i.e. only return most relevant result from group) + +Google like query syntax is as described in + "Xapian::QueryParser Syntax":http://www.xapian.org/docs/queryparser.html +Queries can include prefix:value parts, according to what you indexed in the +acts_as_xapian part above. 
You can also say things like model:InfoRequestEvent +to constrain by model in more complex ways than the :model parameter, or +modelid:InfoRequestEvent-100 to only find one specific object. + +Returns an ActsAsXapian::Search object. Useful methods are: +* description - a techy one, to check how the query has been parsed +* matches_estimated - a guesstimate at the total number of hits +* spelling_correction - the corrected query string if there is a correction, otherwise nil +* words_to_highlight - list of words for you to highlight, perhaps with TextHelper::highlight +* results - an array of hashes each containing: +** :model - your Rails model, this is what you most want! +** :weight - relevancy measure +** :percent - the weight as a %, 0 meaning the item did not match the query at all +** :collapse_count - number of results with the same prefix, if you specified collapse_by_prefix + +Finding similar models +---------------------- + +To find models that are similar to a given set of models call ActsAsXapian::Similar.new. This takes: +* model_classes - list of model classes to return models from within +* models - list of models that you want to find related ones to + +Returns an ActsAsXapian::Similar object. Has all methods from ActsAsXapian::Search above, except +for words_to_highlight. In addition has: +* important_terms - the terms extracted from the input models, that were used to search for output +You need the results methods to get the similar models. + + +f. 
Support +=========+ +Please ask any questions on the +"acts_as_xapian Google Group":http://groups.google.com/group/acts_as_xapian + +The official home page and repository for acts_as_xapian are the +"acts_as_xapian github page":http://github.com/frabcus/acts_as_xapian/wikis + +For more details about anything, see source code in lib/acts_as_xapian.rb diff --git a/wui/src/vendor/plugins/acts_as_xapian/generators/acts_as_xapian/USAGE b/wui/src/vendor/plugins/acts_as_xapian/generators/acts_as_xapian/USAGE new file mode 100644 index 0000000..2d027c4 --- /dev/null +++ b/wui/src/vendor/plugins/acts_as_xapian/generators/acts_as_xapian/USAGE @@ -0,0 +1 @@ +./script/generate acts_as_xapian diff --git a/wui/src/vendor/plugins/acts_as_xapian/generators/acts_as_xapian/acts_as_xapian_generator.rb b/wui/src/vendor/plugins/acts_as_xapian/generators/acts_as_xapian/acts_as_xapian_generator.rb new file mode 100644 index 0000000..5ac587d --- /dev/null +++ b/wui/src/vendor/plugins/acts_as_xapian/generators/acts_as_xapian/acts_as_xapian_generator.rb @@ -0,0 +1,13 @@ +class ActsAsXapianGenerator < Rails::Generator::Base + def manifest + record do |m| + m.migration_template 'migration.rb', 'db/migrate', + :migration_file_name => "create_acts_as_xapian" + end + end + + protected + def banner + "Usage: #{$0} acts_as_xapian" + end +end diff --git a/wui/src/vendor/plugins/acts_as_xapian/generators/acts_as_xapian/templates/migration.rb b/wui/src/vendor/plugins/acts_as_xapian/generators/acts_as_xapian/templates/migration.rb new file mode 100644 index 0000000..84a9dd7 --- /dev/null +++ b/wui/src/vendor/plugins/acts_as_xapian/generators/acts_as_xapian/templates/migration.rb @@ -0,0 +1,14 @@ +class CreateActsAsXapian < ActiveRecord::Migration + def self.up + create_table :acts_as_xapian_jobs do |t| + t.column :model, :string, :null => false + t.column :model_id, :integer, :null => false + t.column :action, :string, :null => false + end + add_index :acts_as_xapian_jobs, [:model, :model_id], :unique 
=> true + end + def self.down + drop_table :acts_as_xapian_jobs + end +end + diff --git a/wui/src/vendor/plugins/acts_as_xapian/init.rb b/wui/src/vendor/plugins/acts_as_xapian/init.rb new file mode 100644 index 0000000..336bee1 --- /dev/null +++ b/wui/src/vendor/plugins/acts_as_xapian/init.rb @@ -0,0 +1,9 @@ +# acts_as_xapian/init.rb: +# +# Copyright (c) 2008 UK Citizens Online Democracy. All rights reserved. +# Email: francis at mysociety.org; WWW: http://www.mysociety.org/ +# +# $Id: init.rb,v 1.1 2008/04/23 13:33:50 francis Exp $ + +require 'acts_as_xapian' + diff --git a/wui/src/vendor/plugins/acts_as_xapian/lib/acts_as_xapian.rb b/wui/src/vendor/plugins/acts_as_xapian/lib/acts_as_xapian.rb new file mode 100644 index 0000000..74eb4bf --- /dev/null +++ b/wui/src/vendor/plugins/acts_as_xapian/lib/acts_as_xapian.rb @@ -0,0 +1,629 @@ +# acts_as_xapian/lib/acts_as_xapian.rb: +# Xapian full text search in Ruby on Rails. +# +# Copyright (c) 2008 UK Citizens Online Democracy. All rights reserved. +# Email: francis at mysociety.org; WWW: http://www.mysociety.org/ +# +# Documentation +# ============ +# +# See ../README.txt for documentation. Please update that file if you edit +# code. + +# Make it so if Xapian isn't installed, the Rails app doesn't fail completely, +# just when somebody does a search. +begin + require 'xapian' + $acts_as_xapian_bindings_available = true +rescue LoadError + STDERR.puts "acts_as_xapian: No Ruby bindings for Xapian installed" + $acts_as_xapian_bindings_available = false +end + +module ActsAsXapian + ###################################################################### + # Module level variables + # XXX must be some kind of cattr_accessor that can do this better + def ActsAsXapian.bindings_available + $acts_as_xapian_bindings_available + end + class NoXapianRubyBindingsError < StandardError + end + + def ActsAsXapian.db_path + @@db_path + end + # XXX global class intializers here get loaded more than once, don't know why. Protect them. 
+ if not $acts_as_xapian_class_var_init + @@db = nil + @@writable_db = nil + @@writable_suffix = nil + @@init_values = [] + $acts_as_xapian_class_var_init = true + end + def ActsAsXapian.db + @@db + end + def ActsAsXapian.writable_db + @@writable_db + end + def ActsAsXapian.stemmer + @@stemmer + end + def ActsAsXapian.term_generator + @@term_generator + end + def ActsAsXapian.enquire + @@enquire + end + def ActsAsXapian.query_parser + @@query_parser + end + def ActsAsXapian.values_by_prefix + @@values_by_prefix + end + + ###################################################################### + # Initialisation + def ActsAsXapian.init(classname = nil, options = nil) + if not classname.nil? + # store class and options for use later, when we open the db in readable_init + @@init_values.push([classname,options]) + end + + # make the directory for the xapian databases to go in + db_parent_path = File.join(File.dirname(__FILE__), '../xapiandbs/') + Dir.mkdir(db_parent_path) unless File.exists?(db_parent_path) + raise "Set RAILS_ENV, so acts_as_xapian can find the right Xapian database" if not ENV['RAILS_ENV'] + @@db_path = File.join(db_parent_path, ENV['RAILS_ENV']) + + # make some things that don't depend on the db + # XXX this gets made once for each acts_as_xapian. Oh well. + @@stemmer = Xapian::Stem.new('english') + end + + # Opens / reopens the db for reading + # XXX we perhaps don't need to rebuild database and enquire and queryparser - + # but db.reopen wasn't enough by itself, so just do everything it's easier. + def ActsAsXapian.readable_init + raise NoXapianRubyBindingsError.new("Xapian Ruby bindings not installed") unless ActsAsXapian.bindings_available + + # basic Xapian objects + begin + @@db = Xapian::Database.new(@@db_path) + @@enquire = Xapian::Enquire.new(@@db) + rescue IOError + raise "Xapian database not opened; have you built it with scripts/rebuild-xapian-index ?" 
+ end + + init_query_parser + end + + # Make a new query parser + def ActsAsXapian.init_query_parser + # for queries + @@query_parser = Xapian::QueryParser.new + @@query_parser.stemmer = @@stemmer + @@query_parser.stemming_strategy = Xapian::QueryParser::STEM_SOME + @@query_parser.database = @@db + @@query_parser.default_op = Xapian::Query::OP_AND + + @@terms_by_capital = {} + @@values_by_number = {} + @@values_by_prefix = {} + @@value_ranges_store = [] + + for init_value_pair in @@init_values + classname = init_value_pair[0] + options = init_value_pair[1] + + # go through the various field types, and tell query parser about them, + # and error check them - i.e. check for consistency between models + @@query_parser.add_boolean_prefix("model", "M") + @@query_parser.add_boolean_prefix("modelid", "I") + if options[:terms] + for term in options[:terms] + raise "Use a single capital letter for term code" if not term[1].match(/^[A-Z]$/) + raise "M and I are reserved for use as the model/id term" if term[1] == "M" or term[1] == "I" + raise "model and modelid are reserved for use as the model/id prefixes" if term[2] == "model" or term[2] == "modelid" + raise "Z is reserved for stemming terms" if term[1] == "Z" + raise "Already have code '" + term[1] + "' in another model but with different prefix '" + @@terms_by_capital[term[1]] + "'" if @@terms_by_capital.include?(term[1]) && @@terms_by_capital[term[1]] != term[2] + @@terms_by_capital[term[1]] = term[2] + @@query_parser.add_boolean_prefix(term[2], term[1]) + end + end + if options[:values] + for value in options[:values] + raise "Value index '"+value[1].to_s+"' must be an integer, is " + value[1].class.to_s if value[1].class != 1.class + raise "Already have value index '" + value[1].to_s + "' in another model but with different prefix '" + @@values_by_number[value[1]].to_s + "'" if @@values_by_number.include?(value[1]) && @@values_by_number[value[1]] != value[2] + + # date types are special, mark them so the first model 
they're seen for + if !@@values_by_number.include?(value[1]) + if value[3] == :date + value_range = Xapian::DateValueRangeProcessor.new(value[1]) + elsif value[3] == :string + value_range = Xapian::StringValueRangeProcessor.new(value[1]) + elsif value[3] == :number + value_range = Xapian::NumberValueRangeProcessor.new(value[1]) + else + raise "Unknown value type '" + value[3].to_s + "'" + end + + @@query_parser.add_valuerangeprocessor(value_range) + + # stop it being garbage collected, as + # add_valuerangeprocessor ref is outside Ruby's GC + @@value_ranges_store.push(value_range) + end + + @@values_by_number[value[1]] = value[2] + @@values_by_prefix[value[2]] = value[1] + end + end + end + end + + def ActsAsXapian.writable_init(suffix = "") + raise NoXapianRubyBindingsError.new("Xapian Ruby bindings not installed") unless ActsAsXapian.bindings_available + + # XXX called so db_path is made, shouldn't really be calling .init here + # as will make it remake stemmer etc. excessively often. + ActsAsXapian.init + + new_path = @@db_path + suffix + raise "writable_suffix/suffix inconsistency" if @@writable_suffix && @@writable_suffix != suffix + if @@writable_db.nil? + # for indexing + @@writable_db = Xapian::WritableDatabase.new(new_path, Xapian::DB_CREATE_OR_OPEN) + @@term_generator = Xapian::TermGenerator.new() + @@term_generator.set_flags(Xapian::TermGenerator::FLAG_SPELLING, 0) + @@term_generator.database = @@writable_db + @@term_generator.stemmer = @@stemmer + @@writable_suffix = suffix + end + end + + ###################################################################### + # Search with a query or for similar models + + # Base class for Search and Similar below + class QueryBase + attr_accessor :offset + attr_accessor :limit + attr_accessor :query + attr_accessor :matches + attr_accessor :query_models + + def initialize_db + ActsAsXapian.readable_init + if ActsAsXapian.db.nil? 
+ raise "ActsAsXapian not initialized" + end + end + + # Set self.query before calling this + def initialize_query(options) + #raise options.to_yaml + offset = options[:offset] || 0; offset = offset.to_i + limit = options[:limit] || -1; limit = limit.to_i # -1 means all matches? + sort_by_prefix = options[:sort_by_prefix] || nil + sort_by_ascending = options[:sort_by_ascending] || true + collapse_by_prefix = options[:collapse_by_prefix] || nil + + ActsAsXapian.enquire.query = self.query + + if sort_by_prefix.nil? + ActsAsXapian.enquire.sort_by_relevance! + else + value = ActsAsXapian.values_by_prefix[sort_by_prefix] + raise "couldn't find prefix '" + sort_by_prefix + "'" if value.nil? + ActsAsXapian.enquire.sort_by_value_then_relevance!(value, sort_by_ascending) + end + if collapse_by_prefix.nil? + ActsAsXapian.enquire.collapse_key = Xapian.BAD_VALUENO + else + value = ActsAsXapian.values_by_prefix[collapse_by_prefix] + raise "couldn't find prefix '" + collapse_by_prefix + "'" if value.nil? + ActsAsXapian.enquire.collapse_key = value + end + + self.matches = ActsAsXapian.enquire.mset(offset, limit, 100) + @cached_results = nil + end + + # Return a description of the query + def description + self.query.description + end + + # Estimate total number of results + def matches_estimated + self.matches.matches_estimated + end + + # Return query string with spelling correction + def spelling_correction + correction = ActsAsXapian.query_parser.get_corrected_query_string + if correction.empty? + return nil + end + return correction + end + + # Return array of models found + def results + # If they've already pulled out the results, just return them. + if not @cached_results.nil? 
+ return @cached_results + end + + # Pull out all the results + docs = [] + iter = self.matches._begin + while not iter.equals(self.matches._end) + docs.push({:data => iter.document.data, + :percent => iter.percent, + :weight => iter.weight, + :collapse_count => iter.collapse_count}) + iter.next + end + + # Look up without too many SQL queries + lhash = {} + lhash.default = [] + for doc in docs + k = doc[:data].split('-') + lhash[k[0]] = lhash[k[0]] + [k[1]] + end + # for each class, look up all ids + chash = {} + for cls, ids in lhash + conditions = [ "#{cls.constantize.table_name}.id in (?)", ids ] + found = cls.constantize.find(:all, :conditions => conditions, :include => cls.constantize.xapian_options[:eager_load]) + for f in found + chash[[cls, f.id]] = f + end + end + # now get them in right order again + results = [] + docs.each{|doc| k = doc[:data].split('-'); results << { :model => chash[[k[0], k[1].to_i]], + :percent => doc[:percent], :weight => doc[:weight], :collapse_count => doc[:collapse_count] } } + @cached_results = results + return results + end + end + + # Search for a query string, returns an array of hashes in result order. + # Each hash contains the actual Rails object in :model, and other detail + # about relevancy etc. in other keys. + class Search < QueryBase + attr_accessor :query_string + + # Note that model_classes is not only sometimes useful here - it's + # essential to make sure the classes have been loaded, and thus + # acts_as_xapian called on them, so we know the fields for the query + # parser. + + # model_classes - model classes to search within, e.g. [PublicBody, + # User]. Can take a single model class, or you can express the model + # class names in strings if you like. 
+ # query_string - user inputed query string, with syntax much like Google Search + def initialize(model_classes, query_string, options = {}) + # Check parameters, convert to actual array of model classes + new_model_classes = [] + model_classes = [model_classes] if model_classes.class != Array + for model_class in model_classes: + raise "pass in the model class itself, or a string containing its name" if model_class.class != Class && model_class.class != String + model_class = model_class.constantize if model_class.class == String + new_model_classes.push(model_class) + end + model_classes = new_model_classes + + # Set things up + self.initialize_db + + # Case of a string, searching for a Google-like syntax query + self.query_string = query_string + + # Construct query which only finds things from specified models + model_query = Xapian::Query.new(Xapian::Query::OP_OR, model_classes.map{|mc| "M" + mc.to_s}) + user_query = ActsAsXapian.query_parser.parse_query(self.query_string, + Xapian::QueryParser::FLAG_BOOLEAN | Xapian::QueryParser::FLAG_PHRASE | + Xapian::QueryParser::FLAG_LOVEHATE | Xapian::QueryParser::FLAG_WILDCARD | + Xapian::QueryParser::FLAG_SPELLING_CORRECTION) + self.query = Xapian::Query.new(Xapian::Query::OP_AND, model_query, user_query) + + # Call base class constructor + self.initialize_query(options) + end + + # Return just normal words in the query i.e. Not operators, ones in + # date ranges or similar. Use this for cheap highlighting with + # TextHelper::highlight, and excerpt. + def words_to_highlight + query_nopunc = self.query_string.gsub(/[^a-z0-9:\.\/_]/i, " ") + query_nopunc = query_nopunc.gsub(/\s+/, " ") + words = query_nopunc.split(" ") + # Remove anything with a :, . or / in it + words = words.find_all {|o| !o.match(/(:|\.|\/)/) } + words = words.find_all {|o| !o.match(/^(AND|NOT|OR|XOR)$/) } + return words + end + + end + + # Search for models which contain theimportant terms taken from a specified + # list of models. i.e. 
Use to find documents similar to one (or more) + # documents, or use to refine searches. + class Similar < QueryBase + attr_accessor :query_models + attr_accessor :important_terms + + # model_classes - model classes to search within, e.g. [PublicBody, User] + # query_models - list of models you want to find things similar to + def initialize(model_classes, query_models, options = {}) + self.initialize_db + + # Case of an array, searching for models similar to those models in the array + self.query_models = query_models + + # Find the documents by their unique term + input_models_query = Xapian::Query.new(Xapian::Query::OP_OR, query_models.map{|m| "I" + m.xapian_document_term}) + ActsAsXapian.enquire.query = input_models_query + matches = ActsAsXapian.enquire.mset(0, 100, 100) # XXX so this whole method will only work with 100 docs + + # Get set of relevant terms for those documents + selection = Xapian::RSet.new() + iter = matches._begin + while not iter.equals(matches._end) + selection.add_document(iter) + iter.next + end + + # Bit weird that the function to make esets is part of the enquire + # object. This explains what exactly it does, which is to exclude + # terms in the existing query. 
+ # http://thread.gmane.org/gmane.comp.search.xapian.general/3673/focus=3681 + eset = ActsAsXapian.enquire.eset(40, selection) + + # Do main search for them + self.important_terms = [] + iter = eset._begin + while not iter.equals(eset._end) + self.important_terms.push(iter.term) + iter.next + end + similar_query = Xapian::Query.new(Xapian::Query::OP_OR, self.important_terms) + # Exclude original + combined_query = Xapian::Query.new(Xapian::Query::OP_AND_NOT, similar_query, input_models_query) + + # Restrain to model classes + model_query = Xapian::Query.new(Xapian::Query::OP_OR, model_classes.map{|mc| "M" + mc.to_s}) + self.query = Xapian::Query.new(Xapian::Query::OP_AND, model_query, combined_query) + + # Call base class constructor + self.initialize_query(options) + end + end + + ###################################################################### + # Index + + # Offline indexing job queue model, create with migration made + # using "script/generate acts_as_xapian" as described in ../README.txt + class ActsAsXapianJob < ActiveRecord::Base + end + + # Update index with any changes needed, call this offline. Only call it + # from a script that exits - otherwise Xapian's writable database won't + # flush your changes. Specifying flush will reduce performance, but + # make sure that each index update is definitely saved to disk before + # logging in the database that it has been. + def ActsAsXapian.update_index(flush = false, verbose = false) + ActsAsXapian.writable_init + + ids_to_refresh = ActsAsXapianJob.find(:all).map() { |i| i.id } + for id in ids_to_refresh + begin + ActiveRecord::Base.transaction do + job = ActsAsXapianJob.find(id, :lock =>true) + STDOUT.puts("ActsAsXapian.update_index #{job.action} #{job.model} #{job.model_id.to_s}") if verbose + if job.action == 'update' + # XXX Index functions may reference other models, so we could eager load here too? 
+ model = job.model.constantize.find(job.model_id) # :include => cls.constantize.xapian_options[:include] + model.xapian_index + elsif job.action == 'destroy' + # Make dummy model with right id, just for destruction + model = job.model.constantize.new + model.id = job.model_id + model.xapian_destroy + else + raise "unknown ActsAsXapianJob action '" + job.action + "'" + end + job.destroy + + if flush + ActsAsXapian.writable_db.flush + end + end + rescue => detail + # print any error, and carry on so other things are indexed + # XXX If item is later deleted, this should give up, and it + # won't. It will keep trying (assuming update_index called from + # regular cron job) and mayhap cause trouble. + STDERR.puts(detail.backtrace.join("\n") + "\nFAILED ActsAsXapian.update_index job #{id} #{$!}") + end + end + end + + # You must specify *all* the models here, this totally rebuilds the Xapian database. + # You'll want any readers to reopen the database after this. + def ActsAsXapian.rebuild_index(model_classes, verbose = false) + raise "when rebuilding all, please call as first and only thing done in process / task" if not ActsAsXapian.writable_db.nil? + + # Delete any existing .new database, and open a new one + new_path = ActsAsXapian.db_path + ".new" + if File.exist?(new_path) + raise "found existing " + new_path + " which is not Xapian flint database, please delete for me" if not File.exist?(File.join(new_path, "iamflint")) + FileUtils.rm_r(new_path) + end + ActsAsXapian.writable_init(".new") + + # Index everything + # XXX not a good place to do this destroy, as unindexed list is lost if + # process is aborted and old database carries on being used. Perhaps do in + # transaction and commit after rename below? Not sure if thenlocking is then bad + # for live website running at same time. 
+ ActsAsXapianJob.destroy_all + for model_class in model_classes + models = model_class.find(:all) + for model in models + STDOUT.puts("ActsAsXapian.rebuild_index #{model_class} #{model.id}") if verbose + model.xapian_index + end + end + ActsAsXapian.writable_db.flush + + # Rename into place + old_path = ActsAsXapian.db_path + temp_path = ActsAsXapian.db_path + ".tmp" + if File.exist?(temp_path) + raise "temporary database found " + temp_path + " which is not Xapian flint database, please delete for me" if not File.exist?(File.join(temp_path, "iamflint")) + FileUtils.rm_r(temp_path) + end + if File.exist?(old_path) + FileUtils.mv old_path, temp_path + end + FileUtils.mv new_path, old_path + + # Delete old database + if File.exist?(temp_path) + raise "old database now at " + temp_path + " is not Xapian flint database, please delete for me" if not File.exist?(File.join(temp_path, "iamflint")) + FileUtils.rm_r(temp_path) + end + + # You'll want to restart your FastCGI or Mongrel processes after this, + # so they get the new db + end + + ###################################################################### + # Instance methods that get injected into your model. + + module InstanceMethods + # Used internally + def xapian_document_term + self.class.to_s + "-" + self.id.to_s + end + + # Extract value of a field from the model + def xapian_value(field, type = nil) + value = self[field] || self.send(field.to_sym) + if type == :date + value.utc.strftime("%Y%m%d") + elsif type == :boolean + value ? 
true : false + else + value.to_s + end + end + + # Store record in the Xapian database + def xapian_index + # if we have a conditional function for indexing, call it and destory object if failed + if self.class.xapian_options.include?(:if) + if_value = xapian_value(self.class.xapian_options[:if], :boolean) + if not if_value + self.xapian_destroy + return + end + end + + # otherwise (re)write the Xapian record for the object + doc = Xapian::Document.new + ActsAsXapian.term_generator.document = doc + + doc.data = self.xapian_document_term + + doc.add_term("M" + self.class.to_s) + doc.add_term("I" + doc.data) + if self.xapian_options[:terms] + for term in self.xapian_options[:terms] + doc.add_term(term[1] + xapian_value(term[0])) + end + end + if self.xapian_options[:values] + for value in self.xapian_options[:values] + doc.add_value(value[1], xapian_value(value[0], value[3])) + end + end + if self.xapian_options[:texts] + for text in self.xapian_options[:texts] + ActsAsXapian.term_generator.increase_termpos # stop phrases spanning different text fields + ActsAsXapian.term_generator.index_text(xapian_value(text)) + end + end + + ActsAsXapian.writable_db.replace_document("I" + doc.data, doc) + end + + # Delete record from the Xapian database + def xapian_destroy + ActsAsXapian.writable_db.delete_document("I" + self.xapian_document_term) + end + + # Used to mark changes needed by batch indexer + def xapian_mark_needs_index + model = self.class.to_s + model_id = self.id + ActiveRecord::Base.transaction do + found = ActsAsXapianJob.delete_all([ "model = ? and model_id = ?", model, model_id]) + job = ActsAsXapianJob.new + job.model = model + job.model_id = model_id + job.action = 'update' + job.save! + end + end + def xapian_mark_needs_destroy + model = self.class.to_s + model_id = self.id + ActiveRecord::Base.transaction do + found = ActsAsXapianJob.delete_all([ "model = ? 
and model_id = ?", model, model_id]) + job = ActsAsXapianJob.new + job.model = model + job.model_id = model_id + job.action = 'destroy' + job.save! + end + end + end + + ###################################################################### + # Main entry point, add acts_as_xapian to your model. + + module ActsMethods + # See top of this file for docs + def acts_as_xapian(options) + # Give error only on queries if bindings not available + if not ActsAsXapian.bindings_available + return + end + + include InstanceMethods + + cattr_accessor :xapian_options + self.xapian_options = options + + ActsAsXapian.init(self.class.to_s, options) + + after_save :xapian_mark_needs_index + after_destroy :xapian_mark_needs_destroy + end + end + +end + +# Reopen ActiveRecord and include the acts_as_xapian method +ActiveRecord::Base.extend ActsAsXapian::ActsMethods + + diff --git a/wui/src/vendor/plugins/acts_as_xapian/tasks/xapian.rake b/wui/src/vendor/plugins/acts_as_xapian/tasks/xapian.rake new file mode 100644 index 0000000..682b138 --- /dev/null +++ b/wui/src/vendor/plugins/acts_as_xapian/tasks/xapian.rake @@ -0,0 +1,43 @@ +require 'rubygems' +require 'rake' +require 'rake/testtask' +require 'activerecord' +require File.dirname(__FILE__) + '/../lib/acts_as_xapian.rb' + +namespace :xapian do + # Parameters - specify "flush=true" to save changes to the Xapian database + # after each model that is updated. This is safer, but slower. Specify + # "verbose=true" to print model name as it is run. + desc 'Updates Xapian search index with changes to models since last call' + task (:update_index => :environment) do + ActsAsXapian.update_index(ENV['flush'] ? true : false, ENV['verbose'] ? true : false) + end + + # Parameters - specify 'models="PublicBody User"' to say which models + # you index with Xapian. 
+ # This totally rebuilds the database, so you will want to restart any + # web server afterwards to make sure it gets the changes, rather than + # still pointing to the old deleted database. Specify "verbose=true" to + # print model name as it is run. + desc 'Completely rebuilds Xapian search index (must specify all models)' + task (:rebuild_index => :environment) do + raise "specify ALL your models with models=\"ModelName1 ModelName2\" as parameter" if ENV['models'].nil? + ActsAsXapian.rebuild_index(ENV['models'].split(" ").map{|m| m.constantize}, ENV['verbose'] ? true : false) + end + + # Parameters - are models, query, offset, limit, sort_by_prefix, + # collapse_by_prefix + desc 'Run a query, return YAML of results' + task (:query => :environment) do + raise "specify models=\"ModelName1 ModelName2\" as parameter" if ENV['models'].nil? + raise "specify query=\"your terms\" as parameter" if ENV['query'].nil? + s = ActsAsXapian::Search.new(ENV['models'].split(" ").map{|m| m.constantize}, + ENV['query'], + :offset => (ENV['offset'] || 0), :limit => (ENV['limit'] || 10), + :sort_by_prefix => (ENV['sort_by_prefix'] || nil), + :collapse_by_prefix => (ENV['collapse_by_prefix'] || nil) + ) + STDOUT.puts(s.results.to_yaml) + end +end + -- 1.5.5.1
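For anyone reviewing what `words_to_highlight` in the patch above yields for a given query, here is a minimal standalone sketch. It restates the method body from lib/acts_as_xapian.rb as a free function so it runs without Xapian or Rails; the name `highlight_words` and the sample query are assumptions made for illustration and are not part of the plugin.

```ruby
# Standalone restatement of Search#words_to_highlight from the patch.
# highlight_words is a hypothetical name used only for this sketch.
def highlight_words(query_string)
  # Replace everything except letters, digits, ':', '.', '/' and '_' with spaces
  query_nopunc = query_string.gsub(/[^a-z0-9:\.\/_]/i, " ")
  query_nopunc = query_nopunc.gsub(/\s+/, " ")
  words = query_nopunc.split(" ")
  # Drop prefixed terms, value ranges and paths (anything with ':', '.' or '/')
  words = words.find_all { |w| !w.match(/(:|\.|\/)/) }
  # Drop the boolean operators understood by the query parser
  words.find_all { |w| !w.match(/^(AND|NOT|OR|XOR)$/) }
end

p highlight_words('hostname:node1 storage AND "iscsi pool"')
```

With this input the prefixed term `hostname:node1` and the operator `AND` are dropped, leaving `["storage", "iscsi", "pool"]` for use with a highlighting helper.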
Jason Guiditta
2008-Aug-01 18:02 UTC
[Ovirt-devel] [PATCH] includes basic search functionality.
Ok, couple bits of feedback inline, but works well overall for me.

On Wed, 2008-07-30 at 18:36 -0400, Scott Seago wrote:
> +
> +  def results_json
> +    results_internal
> +
> +
> +    json_hash = {}
> +    json_hash[:page] = @page
> +    json_hash[:total] = @results.matches_estimated
> +    json_hash[:rows] = @results.results.collect do |result|
> +      item_hash = {}
> +      item = result[:model]
> +      item_hash[:id] = item.class.name+"_"+item.id.to_s
> +      item_hash[:cell] = ["display_name", "display_class"].collect do |attr|
> +        if attr.is_a? Array
> +          value = item
> +          attr.each { |attr_item| value = value.send(attr_item)}
> +          value
> +        else
> +          item.send(attr)
> +        end
> +      end
> +      item_hash[:cell] << result[:percent]
> +      item_hash
> +    end
> +    render :json => json_hash.to_json
> +
> +  end
> +end

We are starting to have a lot of these *_json methods. I wonder if we should instead drop the '_json' part so if we want to change the return type we could just have a param passed in (maybe json is the default)? Just feels like we are being a bit implementation-specific. If everyone agrees with this idea, maybe rename this one, and we can start cleaning up the others as we go.

> diff --git a/wui/src/app/models/host.rb b/wui/src/app/models/host.rb
> index 80acaeb..33a5fa1 100644
> --- a/wui/src/app/models/host.rb
> +++ b/wui/src/app/models/host.rb
> @@ -40,6 +40,12 @@ class Host < ActiveRecord::Base
>      end
>    end
>
> +  acts_as_xapian :texts => [ :hostname, :uuid ],
> +                 :values => [ [ :created_at, 0, "created_at", :date ],
> +                              [ :updated_at, 1, "updated_at", :date ] ],
> +                 :terms => [ [ :hostname, 'H', "hostname" ] ]
> +
> +

I know they are all 'QEMU' now, but if we are going to have different hypervisor types, perhaps that would be a useful thing to search on as well. I could kind of see the 'arch' field also, though that is arguably less useful.

>    KVM_HYPERVISOR_TYPE = "KVM"
>    HYPERVISOR_TYPES = [KVM_HYPERVISOR_TYPE]
>    STATE_UNAVAILABLE = "unavailable"
> @@ -75,4 +81,11 @@ class Host < ActiveRecord::Base
>    def cpu_speed
>      "FIX ME!"
>    end
> +
> +  def display_name
> +    hostname
> +  end
> +  def display_class
> +    "Host"
> +  end
>  end

> diff --git a/wui/src/app/models/storage_pool.rb b/wui/src/app/models/storage_pool.rb
> index a135047..39b6a08 100644
> --- a/wui/src/app/models/storage_pool.rb
> +++ b/wui/src/app/models/storage_pool.rb
> @@ -32,6 +32,7 @@ class StoragePool < ActiveRecord::Base
>
>    validates_presence_of :ip_addr, :hardware_pool_id
>
> +  acts_as_xapian :texts => [ :ip_addr, :target, :export_path ]

Would 'type' perhaps be useful?

>    ISCSI = "iSCSI"
>    NFS = "NFS"
>    STORAGE_TYPES = { ISCSI => "Iscsi",
> @@ -55,4 +56,8 @@ class StoragePool < ActiveRecord::Base
>    def get_type_label
>      STORAGE_TYPES.invert[self.class.name.gsub("StoragePool", "")]
>    end
> +  def display_class
> +    "Storage Pool"
> +  end
> +
>  end
> diff --git a/wui/src/app/views/search/_grid.rhtml b/wui/src/app/views/search/_grid.rhtml
> new file mode 100644
> index 0000000..6de9074
> --- /dev/null
> +++ b/wui/src/app/views/search/_grid.rhtml
> @@ -0,0 +1,40 @@
> +<% per_page = 40 %>
> +<div id="<%= table_id %>_div">
> +<%= '<form id="#{table_id}_form">' if checkboxes %>
> +<table id="<%= table_id %>" style="display:none"></table>
> +<%= '</form>' if checkboxes %>
> +</div>
> +<script type="text/javascript">
> +  $("#<%= table_id %>").flexigrid
> +  (
> +    {
> +    url: '<%= url_for :controller => "search",
> +                      :action => "results_json" %>',
> +    params: [{name: "terms", value: '<%=terms%>'},
> +             {name: "model", value: '<%=model%>'},
> +             {name: "checkboxes", value: <%=checkboxes%>}],
> +    dataType: 'json',
> +    colModel : [
> +      <%= "{display: '', width : 20, align: 'left', process: #{table_id}checkbox}," if checkboxes %>
> +      {display: 'Name', width : 200, align: 'left'},
> +      {display: 'Type', width : 120, align: 'left'},
> +      {display: 'Rank', width : 60, align: 'left'}
> +    ],

As I mentioned in irc, I think '% Match' would be clearer than 'Rank' (especially in the case of 1 result).

So aside from that (mostly opinion-based) stuff, ACK, works fine for me.

-j
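
A minimal sketch of the suggestion above about dropping the '_json' suffix and choosing the return type from a parameter, with JSON as the default. This is plain Ruby so it runs outside Rails; `build_results_payload`, `serialize_results` and the "yaml" alternative are hypothetical names and choices made for illustration, not code from the patch. In the controller the same split could presumably be expressed with Rails' own format handling instead.

```ruby
require 'json'
require 'yaml'

# Hypothetical helper: build the grid payload once, independent of output format.
def build_results_payload(page, total, rows)
  { "page" => page, "total" => total, "rows" => rows }
end

# Hypothetical dispatcher: serialise the payload according to a request
# parameter, falling back to JSON when none is given.
def serialize_results(payload, format = nil)
  case (format || "json").to_s
  when "json" then payload.to_json
  when "yaml" then payload.to_yaml
  else raise ArgumentError, "unsupported format '#{format}'"
  end
end

payload = build_results_payload(1, 1, [{ "id" => "Host_1", "cell" => ["node1", "Host", 100] }])
puts serialize_results(payload)          # JSON, since no format was requested
puts serialize_results(payload, "yaml")  # same data, different representation
```

The point of the split is that the action name no longer encodes the wire format: the payload is built in one place and the format is a late decision.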