Solr

Solr (pronounced "solar") is an open source enterprise search platform built on Apache Lucene.

Reference site: https://lucene.apache.org/solr/

cd xxx/solr-8.3.0/bin
./solr start
./solr stop

The application is available on port 8983: http://localhost:8983/solr/#/

The Solr server can be started in two modes, Standalone or SolrCloud. In Standalone mode the configuration is called a Core, while in SolrCloud mode it is called a Collection.

Let's proceed with Standalone mode.

Create the folder basic_configs in xxx/solr-8.3.0/server/solr/configsets/ and copy the conf folder of _default into it. Then run ./solr create -c jcg -d basic_configs. This creates the folder jcg in xxx/solr-8.3.0/server/solr, and the new configuration shows up in the admin UI (the same thing can also be done from the web interface).
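To confirm from a script that the core was created, one option is the CoreAdmin STATUS call. The sketch below only builds the URL and interprets a response body; the helper names are made up for illustration, and the sample dictionaries in the usage note are hand-written, not captured from a server.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # host and port from the notes above


def core_status_url(core: str, base: str = SOLR_BASE) -> str:
    """Build the CoreAdmin STATUS URL for a single core (hypothetical helper)."""
    query = urlencode({"action": "STATUS", "core": core, "wt": "json"})
    return f"{base}/admin/cores?{query}"


def core_exists(status_response: dict, core: str) -> bool:
    """Assumption: a core that is not loaded is reported with an empty entry."""
    return bool(status_response.get("status", {}).get(core))
```

For example, core_status_url("jcg") gives the URL to open in a browser, and core_exists({"status": {"jcg": {"name": "jcg"}}}, "jcg") evaluates to True.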

To load a specific kind of CSV file, edit the managed-schema file in the folder xxx/solr-8.3.0/server/solr/configsets/basic_configs/conf and insert the following below <uniqueKey>id</uniqueKey>:

<!-- Fields added for books.csv load-->
<field name="cat" type="text_general" indexed="true" stored="true"/>
<field name="name" type="text_general" indexed="true" stored="true"/>
<field name="price" type="pdouble" indexed="true" stored="true"/>
<field name="inStock" type="boolean" indexed="true" stored="true"/>
<field name="author" type="text_general" indexed="true" stored="true"/>
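As an alternative to editing the file by hand, the same five fields could probably be added through Solr's Schema API, which accepts an "add-field" command as JSON. This is a different technique from the one used in these notes. The sketch below only builds the request body; the function name is hypothetical, and the target URL mentioned in the comment assumes the jcg core created earlier.

```python
import json

# Field names and types copied from the managed-schema snippet above.
BOOK_FIELDS = [
    ("cat", "text_general"),
    ("name", "text_general"),
    ("price", "pdouble"),
    ("inStock", "boolean"),
    ("author", "text_general"),
]


def add_field_payload(fields):
    """Build the JSON body for a Schema API "add-field" command (hypothetical helper)."""
    return {
        "add-field": [
            {"name": name, "type": field_type, "indexed": True, "stored": True}
            for name, field_type in fields
        ]
    }


# The body would be POSTed with Content-Type: application/json to
# http://localhost:8983/solr/jcg/schema (assumed URL for the jcg core).
body = json.dumps(add_field_payload(BOOK_FIELDS), indent=2)
```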

The data we are going to index, taken from books.csv:

id,cat,name,price,inStock,author,series_t,sequence_i,genre_s
0553573403,book,A Game of Thrones,7.99,true,George R.R. Martin,"A Song of Ice and Fire",1,fantasy
0553579908,book,A Clash of Kings,7.99,true,George R.R. Martin,"A Song of Ice and Fire",2,fantasy  (***)
055357342X,book,A Storm of Swords,7.99,true,George R.R. Martin,"A Song of Ice and Fire",3,fantasy
...
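Only five fields were declared in the schema, yet the CSV header has nine columns. The id column is the unique key, and the columns ending in _t, _i and _s most likely match the dynamic field patterns that ship with the _default configset (an assumption, since the conf folder was copied from it). A small sketch that separates the two groups, using the first two rows shown above:

```python
import csv
import io

# First rows of books.csv as listed above.
SAMPLE = '''id,cat,name,price,inStock,author,series_t,sequence_i,genre_s
0553573403,book,A Game of Thrones,7.99,true,George R.R. Martin,"A Song of Ice and Fire",1,fantasy
0553579908,book,A Clash of Kings,7.99,true,George R.R. Martin,"A Song of Ice and Fire",2,fantasy
'''

# Suffixes assumed to be covered by dynamic fields in the copied configuration.
DYNAMIC_SUFFIXES = ("_t", "_i", "_s")


def split_columns(header):
    """Return (explicitly named columns, columns relying on a dynamic field suffix)."""
    explicit = [c for c in header if not c.endswith(DYNAMIC_SUFFIXES)]
    dynamic = [c for c in header if c.endswith(DYNAMIC_SUFFIXES)]
    return explicit, dynamic


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
explicit, dynamic = split_columns(list(rows[0].keys()))
```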

Restart the server (stop, then start). In the folder xxx/solr-8.3.0/example/exampledocs there are sample files and a Java application (post.jar) for sending documents with a POST.

java -Dtype=text/csv -Durl=http://localhost:8983/solr/jcg/update -jar post.jar books.csv
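The post.jar call above boils down to an HTTP POST of the CSV body to the core's update handler. A minimal Python sketch of the same request follows; the function names are hypothetical, and adding commit=true to the URL is an assumption made here so that the documents become searchable right away.

```python
import urllib.request

SOLR_BASE = "http://localhost:8983/solr"  # host and port from the notes above


def build_csv_update(csv_bytes: bytes, core: str = "jcg",
                     base: str = SOLR_BASE) -> urllib.request.Request:
    """Prepare (but do not send) a POST of CSV data to the update handler."""
    url = f"{base}/{core}/update?commit=true"
    return urllib.request.Request(
        url,
        data=csv_bytes,
        headers={"Content-Type": "text/csv"},  # same type as -Dtype=text/csv
        method="POST",
    )


def post_csv(path: str) -> int:
    """Send a CSV file to Solr and return the HTTP status code."""
    with open(path, "rb") as handle:
        request = build_csv_update(handle.read())
    with urllib.request.urlopen(request) as response:
        return response.status
```

With the server running, post_csv("books.csv") would send the file from the exampledocs folder.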

From a browser we can query the indexed data: http://localhost:8983/solr/jcg/select?q=name:"A Clash of Kings"

{
  "responseHeader":{
    "status":0,
    "QTime":65,
    "params":{
      "q":"name:\"A Clash of Kings\""}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"0553579908",
        "cat":["book"],
        "name":["A Clash of Kings"],
        "price":[7.99],
        "inStock":[true],
        "author":["George R.R. Martin"],
        "series_t":"A Song of Ice and Fire",
        "sequence_i":2,
        "genre_s":"fantasy",
        "_version_":1649988710000754688}]
  }}
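When the same query is sent from a script rather than a browser, the quotes, colon and spaces in q have to be URL-encoded, and fields such as name come back as lists. A sketch of both steps, with hypothetical helper names and a trimmed copy of the response above as input:

```python
import json
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # host and port from the notes above


def select_url(core: str, query: str, base: str = SOLR_BASE) -> str:
    """Build a select URL with the q parameter properly encoded."""
    return f"{base}/{core}/select?" + urlencode({"q": query})


# Trimmed copy of the response shown above (some fields omitted).
RESPONSE = '''
{
  "responseHeader": {"status": 0, "QTime": 65},
  "response": {"numFound": 1, "start": 0, "docs": [
    {"id": "0553579908", "name": ["A Clash of Kings"], "price": [7.99], "sequence_i": 2}
  ]}
}
'''


def titles(response_text: str):
    """Return (numFound, [(id, first name value), ...]) from a select response."""
    body = json.loads(response_text)["response"]
    return body["numFound"], [(doc["id"], doc["name"][0]) for doc in body["docs"]]
```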

Solr ships with a set of examples; let's start the server with the techproducts configuration using ./solr -e techproducts.

http://localhost:8983/solr/techproducts/select?q=video
http://localhost:8983/solr/techproducts/select?q=video&fl=id,name,price
http://localhost:8983/solr/techproducts/select?q=name:video
http://localhost:8983/solr/techproducts/select?q=cat:electronics&fl=id,name,price
http://localhost:8983/solr/techproducts/select?q=price:[0 TO 400]&fl=id,name,price&facet=true&facet.field=cat
http://localhost:8983/solr/techproducts/select?q=price:[0 TO 4000]&fl=id,name,price&facet=true&facet.field=cat&facet.range=price&f.price.facet.range.start=0.0&f.price.facet.range.end=1000.0&f.price.facet.range.gap=100
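The URLs above work when pasted into a browser, which encodes the spaces and brackets on its own. From code it is safer to build them from a parameter list. A sketch of the last (range facet) query, where the helper name is hypothetical:

```python
from urllib.parse import urlencode

SELECT = "http://localhost:8983/solr/techproducts/select"  # URL from the examples above


def techproducts_select(params) -> str:
    """Build an encoded select URL from a list of (name, value) pairs."""
    return SELECT + "?" + urlencode(params)


# Same parameters as the last example URL: a field facet on cat plus a
# range facet on price from 0 to 1000 in steps of 100.
range_facet_url = techproducts_select([
    ("q", "price:[0 TO 4000]"),
    ("fl", "id,name,price"),
    ("facet", "true"),
    ("facet.field", "cat"),
    ("facet.range", "price"),
    ("f.price.facet.range.start", "0.0"),
    ("f.price.facet.range.end", "1000.0"),
    ("f.price.facet.range.gap", "100"),
])
```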

The query parameters are:

  • qt: Request handler for the request. The standard handler is used if not specified.
  • q: The query string, i.e. the main search expression.
  • fq: Used to specify filter queries.
  • sort: Used to sort the results in ascending or descending order.
  • start, rows: start specifies the starting offset into the result set (default 0); rows specifies the number of records to return (default 10).
  • fl: Used to return selective fields.
  • wt: Specifies the response format. Default is JSON since Solr 7.0 (XML in earlier versions).
  • indent: Setting to true makes the response more readable.
  • debugQuery: Setting the parameter to true gives the debugging information as part of response.
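  • dismax: Value of the defType parameter that selects the DisMax parser (defType=dismax).
  • edismax: Value of the defType parameter that selects the Extended DisMax parser (defType=edismax).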
  • facet: Setting to true enables the faceting.
  • spatial: Used for geospatial searches.
  • spellcheck: Setting to true enables spelling suggestions, which helps in finding similar terms.
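Several of these parameters are usually combined in a single request. The sketch below assembles q, fq, fl, sort, start, rows, wt and indent for a given page of results; the function name and its zero-based page convention are assumptions made for illustration.

```python
from urllib.parse import urlencode


def page_params(query, page, page_size=10, fields=None, sort=None, filters=()):
    """Return select parameters for one zero-based page of results."""
    params = [("q", query)]
    params += [("fq", f) for f in filters]       # one fq entry per filter query
    if fields:
        params.append(("fl", ",".join(fields)))  # restrict the returned fields
    if sort:
        params.append(("sort", sort))            # e.g. "price asc"
    params += [
        ("start", page * page_size),             # offset of the first result
        ("rows", page_size),                     # number of results per page
        ("wt", "json"),
        ("indent", "true"),
    ]
    return params


# Third page (index 2) of five results per page.
third_page = urlencode(page_params("video", page=2, page_size=5,
                                   fields=["id", "name"], sort="price asc",
                                   filters=["inStock:true"]))
```

The resulting string can be appended after select? in any of the URLs shown earlier.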