{"id":58,"date":"2004-09-01T09:25:27","date_gmt":"2004-09-01T14:25:27","guid":{"rendered":"http:\/\/blog.uvm.edu\/hag\/2004\/09\/01\/swish-e-open-xml-indexing\/"},"modified":"2004-09-01T09:25:27","modified_gmt":"2004-09-01T14:25:27","slug":"swish-e-open-xml-indexing","status":"publish","type":"post","link":"https:\/\/blog.uvm.edu\/hag\/2004\/09\/01\/swish-e-open-xml-indexing\/","title":{"rendered":"Swish-e open xml indexing"},"content":{"rendered":"<p>http:\/\/swish-e.org\/<br \/>\nSwish-e is an open source indexer\/search engine. It excels at indexing<br \/>\n(X)HTML files, but indexes plain text and XML files almost as easily.<br \/>\nIt comes with C, PHP, and Perl APIs, and it runs under (over?) Unix as<br \/>\nwell as Windows operating systems.<br \/>\nI am\/will be using swish-e as the underlying indexer for searches<br \/>\nagainst TEI documents. Specifically, I have been marking sets of<br \/>\nliterature up in TEI. I then convert the sets into a number of formats<br \/>\nsuch as plain text, XHTML, PDF, various Palm flavors, etc. I then use<br \/>\nswish-e to index the XHTML because swish-e makes it easy to pull<br \/>\nout the meta tags of HTML head elements and make them field searchable,<br \/>\nwhile the body of the text remains free-text searchable. I could<br \/>\nhave almost as easily indexed the raw TEI files, but then I would have to<br \/>\ndeal with transforming the XML before it gets to the browser. (&#8220;I know.<br \/>\nThere are many ways to do that.&#8221;) See:<br \/>\nhttp:\/\/infomotions.com\/alex2\/<br \/>\nI have also been fiddling with Plucene, a Perl port of Lucene, a<br \/>\nJava-based indexer\/search engine library:<br \/>\nhttp:\/\/search.cpan.org\/dist\/Plucene\/<br \/>\nUnlike swish-e, Lucene\/Plucene are libraries. Swish-e is an<br \/>\nindexer\/search engine binary as well as a library.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>http:\/\/swish-e.org\/ Swish-e is an open source indexer\/search engine. 
It excels at indexing (X)HTML files, but indexes plain text and XML files almost as easily. It comes with C, PHP, and Perl APIs, and it runs under (over?) Unix as well &hellip; <a href=\"https:\/\/blog.uvm.edu\/hag\/2004\/09\/01\/swish-e-open-xml-indexing\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":6,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[150],"tags":[],"class_list":["post-58","post","type-post","status-publish","format-standard","hentry","category-techow"],"_links":{"self":[{"href":"https:\/\/blog.uvm.edu\/hag\/wp-json\/wp\/v2\/posts\/58","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.uvm.edu\/hag\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.uvm.edu\/hag\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.uvm.edu\/hag\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.uvm.edu\/hag\/wp-json\/wp\/v2\/comments?post=58"}],"version-history":[{"count":0,"href":"https:\/\/blog.uvm.edu\/hag\/wp-json\/wp\/v2\/posts\/58\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.uvm.edu\/hag\/wp-json\/wp\/v2\/media?parent=58"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.uvm.edu\/hag\/wp-json\/wp\/v2\/categories?post=58"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.uvm.edu\/hag\/wp-json\/wp\/v2\/tags?post=58"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}