Category: natural language processing

Asking Solr Questions in Natural Language

Post author By aadel
Post date July 31, 2024

With recent advances in AI/ML, many tasks that were once out of reach have become feasible. One of these is asking a computer questions in natural language and getting accurate, reasonable answers. This is made possible today by large language models, which are notable for their ability to perform general-purpose language generation and other natural language processing tasks.¹

Tags apache solr, natural language, questions, search, solr, sql

natural language processing novice

Lightweight Text Clustering with Solr

Post author By aadel
Post date December 26, 2019

Clustering is one of the most common unsupervised machine learning tasks. Solr ships with a clustering module based on the algorithms built into Carrot². Carrot² comes with four algorithms, Lingo, STC, kMeans, and Lingo3D, each mapped to a clustering engine. The first three are open source, whereas the last is commercial. With this approach, clustering takes place in memory. Other frameworks, such as Mahout, can be used to do the clustering “off-line.”

Tags apache solr, banana, carrot2, text clustering, visualization

data science machine learning natural language processing novice

Document Classification With Solr Streaming Expressions

Post author By aadel
Post date November 6, 2019

Classification is one of the most popular tasks in natural language processing and machine learning. Solr ships with a set of features, part of Streaming Expressions, that allow building and deploying statistical classification models out of the box. With adequate preprocessing and indexing tweaks, these features can be used to classify documents quickly and with high accuracy. This post illustrates how Solr streaming expressions and Zeppelin notebooks can be used to build a document classifier.

Tags classification, natural language processing, solr, streaming expressions, text classification, zeppelin
