Solr is a popular, fast, and reliable open source enterprise search platform from the Apache Lucene project. Among its many features, we especially like its powerful full-text search, hit highlighting, faceted search, and near real-time indexing. Solr powers the search and navigation features of many of the world's largest internet sites. Written in Java, Solr uses the Lucene Java search library for full-text indexing and search, and its REST-like HTTP/XML and JSON APIs make it easy to use from virtually any programming language, including R.
We invested a significant amount of time integrating our R-based data-management platform with Solr through its HTTP/JSON REST interface. This integration lets us index millions of data sets in Solr in near real time as R processes them. It took us a few days to stabilize and optimize this approach, and we are proud to share it, along with the source code, with you. The full source code can be downloaded from datadolph.in's git repository.
The script has R functions for:
- querying Solr and returning matching docs
- posting a document to Solr (taking a list and converting it to JSON before posting it)
- deleting all indexes, the indexes for a certain document type, or the indexes for a certain category within a document type
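The functions below call three small helpers, getQueryURL(), getUpdateURL() and getCommitURL(), which are not shown in this post. A minimal sketch of what they might look like follows. The base URL, the core name collection1 and the wt=json parameter are assumptions for a local test server, so adapt them to your own setup.

```r
# Hypothetical helpers: the base URL and core name are assumptions.
solr_base_url <- "http://localhost:8983/solr/collection1"

# querySolr() appends "<field>:<text>" directly, so this URL must end with "q="
getQueryURL <- function() {
  paste(solr_base_url, "/select?wt=json&q=", sep = "")
}

# JSON update endpoint used by the delete and post functions
getUpdateURL <- function() {
  paste(solr_base_url, "/update", sep = "")
}

# update endpoint with an explicit commit
getCommitURL <- function() {
  paste(solr_base_url, "/update?commit=true", sep = "")
}

print(getQueryURL())
print(getUpdateURL())
print(getCommitURL())
```

Keeping the URLs behind functions means the rest of the script never hard-codes a server address.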
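# RCurl provides getURL() and postForm(); fromJSON()/toJSON() come from a JSON
# package. RJSONIO is an assumption here, the post does not name the package.
library(RCurl)
library(RJSONIO)
# query a field for the text and return docs
querySolr <- function(queryText, queryfield="all") {
  response <- fromJSON(getURL(paste(getQueryURL(), queryfield, ":", queryText, sep="")))
  # a status of 0 means the query succeeded
  if(!response$responseHeader$status)
    return(response$response$docs)
}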
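To make the status check in querySolr() concrete, here is a small sketch that applies the same logic to a hand-built list shaped like a parsed Solr JSON response. The sample documents and the extractDocs name are made up for illustration, and no server is contacted.

```r
# Hypothetical stand-in for fromJSON(getURL(...)): a parsed Solr response
mock_response <- list(
  responseHeader = list(status = 0, QTime = 3),
  response = list(
    numFound = 2,
    start = 0,
    docs = list(
      list(id = "doc-1", type = "sports", category = "basketball"),
      list(id = "doc-2", type = "sports", category = "tennis")
    )
  )
)

# Same test as querySolr(): status 0 means success, and !0 is TRUE in R
extractDocs <- function(response) {
  if (!response$responseHeader$status)
    return(response$response$docs)
  NULL  # non-zero status: return NULL explicitly
}

docs <- extractDocs(mock_response)
print(length(docs))                     # number of matching docs
print(sapply(docs, function(d) d$id))   # their ids
```

Callers should therefore be prepared for a NULL result whenever Solr reports a non-zero status.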
# delete all indexes from the Solr server
deleteAllIndexes <- function() {
  response <- postForm(getUpdateURL(),
    .opts = list(postfields = '{"delete": {"query":"*:*"}}',
                 httpheader = c('Content-Type' = 'application/json',
                                Accept = 'application/json'),
                 ssl.verifypeer = FALSE
    )
  ) # end of postForm
  return(fromJSON(response)$responseHeader[1])
}
# delete all indexes for a document type from the Solr server
# in this example: type = sports
deleteSportsIndexes <- function() {
  response <- postForm(getUpdateURL(),
    .opts = list(postfields = '{"delete": {"query":"type:sports"}}',
                 httpheader = c('Content-Type' = 'application/json',
                                Accept = 'application/json'),
                 ssl.verifypeer = FALSE
    )
  ) # end of postForm
  return(fromJSON(response)$responseHeader[1])
}
# delete indexes for one category within the sports type from the Solr server
# in this example: type = sports and category = basketball
deleteSportsIndexesForCat <- function(category) {
  response <- postForm(getUpdateURL(),
    .opts = list(postfields =
                   paste('{"delete": {"query":"type:sports AND category:', category, '"}}', sep=""),
                 httpheader = c('Content-Type' = 'application/json',
                                Accept = 'application/json'),
                 ssl.verifypeer = FALSE
    )
  ) # end of postForm
  return(fromJSON(response)$responseHeader[1])
}
#deleteSportsIndexesForCat("basketball")
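The three delete functions differ only in the query they send, so the request body could come from a single helper. The sketch below shows one possible way to do that. buildDeletePayload is a hypothetical name that is not part of the original script; it only builds the JSON string and does not contact Solr or escape special characters in its arguments.

```r
# Hypothetical helper: build the JSON body for a Solr delete-by-query
buildDeletePayload <- function(type = NULL, category = NULL) {
  # collect one clause per filter that was actually supplied
  clauses <- c(
    if (!is.null(type)) paste("type:", type, sep = ""),
    if (!is.null(category)) paste("category:", category, sep = "")
  )
  # no filters at all: match every document
  query <- if (length(clauses) == 0) "*:*" else paste(clauses, collapse = " AND ")
  paste('{"delete": {"query":"', query, '"}}', sep = "")
}

cat(buildDeletePayload(), "\n")
cat(buildDeletePayload("sports"), "\n")
cat(buildDeletePayload("sports", "basketball"), "\n")
```

The resulting string could be passed as postfields in the same postForm() call the functions above use.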
# Post a new document to Solr
postDoc <- function(doc) {
  solr_update_url <- getUpdateURL()
  # wrap the document in a list so it is sent as a JSON array containing one document
  jsonst <- toJSON(list(doc))
  response <- postForm(solr_update_url,
    .opts = list(postfields = jsonst,
                 httpheader = c('Content-Type' = 'application/json',
                                Accept = 'application/json'),
                 ssl.verifypeer = FALSE
    )) # end of postForm
  return(fromJSON(response)$responseHeader[1])
  ########## Commit - only if it doesn't work the other way ###############
  #return(fromJSON(getURL(getCommitURL())))
}
Happy Coding!
Excellent post, this has been extremely useful to me. I work with a lot of Russian-language texts, and to make this work with UTF-8 characters you will want this as the first line in querySolr():
response <- fromJSON(getURL(paste(getQueryURL(), queryfield, ":", curlEscape(queryText), sep="")))
Just thought it might save you or someone else a headache!
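As an illustration of the encoding issue described in this comment, the sketch below percent-encodes a Cyrillic query term before it is pasted into the query URL. It uses base R's URLencode() as a stand-in for RCurl's curlEscape(), so it runs without extra packages. The URL prefix is a made-up example.

```r
# Hypothetical query URL prefix; only the encoding step matters here
query_url <- "http://localhost:8983/solr/collection1/select?wt=json&q="
query_text <- "\u043c\u0438\u0440"   # a Russian word written with Unicode escapes

# reserved = TRUE also escapes characters such as ":" and "&" inside the term
encoded <- URLencode(enc2utf8(query_text), reserved = TRUE)
full_url <- paste(query_url, "all", ":", encoded, sep = "")
print(full_url)
```

After encoding, the URL contains only ASCII characters, which is what getURL() needs to send the request reliably.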
Thanks, this is a good insight and very useful! We have faced this issue in other places too.
Fantastic.