Data Analytics

Category: natural language processing

Lightweight Text Clustering with Solr

Post author By wpu
Post date December 26, 2019

Clustering is one of the most common unsupervised machine learning tasks. Solr ships with a clustering module built on the algorithms provided by Carrot². Carrot² offers four algorithms, Lingo, STC, kMeans, and Lingo3G, each mapped to a clustering engine. The first three are open source, whereas the last is commercial. With this approach, clustering takes place in memory. Other frameworks, such as Mahout, can be used to do the clustering “off-line.”
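As a rough illustration of how a clustering engine is selected per request, the sketch below only builds the request URL and does not contact a server. It assumes a Solr 8.x-style setup in which a `/clustering` request handler is configured and engines are registered under the names `lingo`, `stc`, and `kmeans` in `solrconfig.xml`; the host, the collection name, and the helper `clustering_url` are hypothetical.

```python
from urllib.parse import urlencode

# Assumed engine names; they must match the engines declared in solrconfig.xml.
ENGINES = ("lingo", "stc", "kmeans")


def clustering_url(base_url, collection, query, engine="lingo", rows=100):
    """Build a URL for an (assumed) /clustering request handler."""
    if engine not in ENGINES:
        raise ValueError(f"unknown engine: {engine}")
    params = {
        "q": query,
        "rows": rows,                 # clustering runs in memory on the returned rows
        "clustering": "true",         # switch the clustering component on
        "clustering.engine": engine,  # pick one of the configured engines
        "wt": "json",
    }
    return f"{base_url}/{collection}/clustering?{urlencode(params)}"


# Hypothetical host and collection, for illustration only.
url = clustering_url("http://localhost:8983/solr", "techproducts", "*:*", engine="stc")
print(url)
```

Because the documents are clustered in memory at query time, the `rows` value bounds how many documents take part in each clustering run.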

Tags apache solr, banana, carrot2, text clustering, visualization

data science, machine learning, natural language processing, novice

Document Classification With Solr Streaming Expressions

Post author By wpu
Post date November 6, 2019

Classification is one of the most popular tasks in natural language processing and machine learning. Solr ships with features, part of its Streaming Expressions, that allow building and deploying statistical classification models out of the box. With adequate preprocessing and indexing tweaks, these features can be used to classify documents quickly and with high accuracy. This post illustrates how Solr streaming expressions and Zeppelin notebooks can be used to build a document classifier.
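To give a sense of what such expressions look like, here is a minimal sketch that assembles `train` and `classify` streaming expressions as strings, following the general shape shown in the Solr reference guide. Nothing is sent to Solr. The collection names, field names, model name, and the helpers `train_expr` and `classify_expr` are hypothetical, and the parameter values are placeholders rather than the settings used in the post.

```python
def train_expr(collection, field, outcome, model_name,
               num_terms=250, max_iterations=100):
    """Build a train() expression that wraps a features() expression."""
    # features() selects the most informative terms for the outcome field.
    features = (
        f'features({collection}, q="*:*", featureSet="{model_name}_features", '
        f'field="{field}", outcome="{outcome}", numTerms={num_terms})'
    )
    # train() fits a logistic regression text classifier on those terms.
    return (
        f'train({collection}, {features}, q="*:*", name="{model_name}", '
        f'field="{field}", outcome="{outcome}", maxIterations={max_iterations})'
    )


def classify_expr(model_collection, model_name, docs_collection, field):
    """Build a classify() expression that scores documents from a search()."""
    model = f'model({model_collection}, id="{model_name}", cacheMillis=5000)'
    search = (
        f'search({docs_collection}, q="*:*", fl="id,{field}", sort="id asc")'
    )
    return f'classify({model}, {search}, field="{field}")'


# Hypothetical collections and fields, for illustration only.
train = train_expr("docs", "body_t", "label_i", "model1")
classify = classify_expr("models", "model1", "unlabeled", "body_t")
print(train)
print(classify)
```

In practice the output of `train` would typically be written to a model collection (for example by wrapping it in `update` and `commit`) before `classify` can read it.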

Tags classification, natural language processing, solr, streaming expressions, text classification, zeppelin
