ReaderBench - Semantic Models and Topic Mining
- Creator: UPB
- Publisher: Rage project
- Owner: Dascalu Mihai
Extracts the keywords of a text together with their relevance scores and semantic links between them.
Extracts keywords and topics of a text, together with the corresponding relevance scores and semantic links between them.
This component represents a core constituent within all ReaderBench modules in terms of discourse analysis and text mining.
Given an input text, this component returns the list of concepts, their relevance and the links between them.
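The shape of that output can be pictured with a small data structure. The following is only a sketch: the class, field and method names (TopicMiningResult, Concept, Link, topConcepts, linksOf) are hypothetical and chosen for illustration, and they are not taken from the ReaderBench API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical result container: all names here are illustrative
// assumptions, not the actual ReaderBench API.
final class TopicMiningResult {

    /** A concept (keyword) together with its relevance score. */
    static final class Concept {
        final String lemma;
        final double relevance;

        Concept(String lemma, double relevance) {
            this.lemma = lemma;
            this.relevance = relevance;
        }
    }

    /** An undirected semantic link between two concepts. */
    static final class Link {
        final String source;
        final String target;
        final double similarity;

        Link(String source, String target, double similarity) {
            this.source = source;
            this.target = target;
            this.similarity = similarity;
        }
    }

    private final List<Concept> concepts = new ArrayList<>();
    private final List<Link> links = new ArrayList<>();

    void addConcept(String lemma, double relevance) {
        concepts.add(new Concept(lemma, relevance));
    }

    void addLink(String source, String target, double similarity) {
        links.add(new Link(source, target, similarity));
    }

    /** Returns up to k concepts, highest relevance first. */
    List<Concept> topConcepts(int k) {
        List<Concept> sorted = new ArrayList<>(concepts);
        sorted.sort(Comparator.comparingDouble((Concept c) -> c.relevance).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
    }

    /** Returns every link in which the given concept takes part. */
    List<Link> linksOf(String lemma) {
        List<Link> result = new ArrayList<>();
        for (Link link : links) {
            if (link.source.equals(lemma) || link.target.equals(lemma)) {
                result.add(link);
            }
        }
        return result;
    }
}
```

A caller would typically rank the concepts by relevance and then follow the links of the top-ranked ones to draw a concept map.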
The component is available in English and French. Dutch and Romanian will be available soon.
ReaderBench introduced a generalized model for assessment based on the cohesion graph, applicable to both plain essay- or story-like texts and CSCL conversations, in particular chats, forum discussion threads or blog communities.
Text cohesion, viewed as the lexical, grammatical and semantic relationships that link textual units together, is defined within our implemented model in terms of semantic similarity. This similarity is measured through semantic distances in lexicalized ontologies (e.g., WordNet), Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA).
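As an illustration of a vector-space similarity of the kind used with LSA, the sketch below computes the cosine similarity between two textual units, each represented as the centroid of its word vectors. This is a generic technique shown under stated assumptions, not the ReaderBench implementation: the class name SemanticSimilarity, the centroid representation and the choice of cosine are all assumptions, and the source does not say how the WordNet, LSA and LDA scores are computed or combined.

```java
import java.util.List;

// Generic vector-space similarity sketch; not the ReaderBench implementation.
final class SemanticSimilarity {

    private SemanticSimilarity() {
    }

    /** Averages word vectors into one vector for a textual unit (assumed representation). */
    static double[] centroid(List<double[]> wordVectors, int dimension) {
        double[] result = new double[dimension];
        for (double[] vector : wordVectors) {
            if (vector.length != dimension) {
                throw new IllegalArgumentException("Unexpected vector dimension");
            }
            for (int i = 0; i < dimension; i++) {
                result[i] += vector[i];
            }
        }
        if (!wordVectors.isEmpty()) {
            for (int i = 0; i < dimension; i++) {
                result[i] /= wordVectors.size();
            }
        }
        return result;
    }

    /** Cosine similarity of two equal-length vectors; returns 0 if either has zero norm. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

Vectors pointing in the same direction score close to 1 and orthogonal vectors score 0, so a higher value indicates a stronger cohesive link between the two units.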
Additionally, specific natural language processing techniques are applied to reduce noise and to improve the system’s accuracy: tokenizing, splitting, part-of-speech tagging, parsing, stop-word elimination, dictionary-only word selection, stemming, lemmatizing, named entity recognition and co-reference resolution.
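Three of the simpler steps in this list (tokenizing, stop-word elimination and dictionary-only word selection) can be sketched in plain Java. This is a minimal illustration only: the class name PreprocessingSketch, the regular expression and the tiny stop list are assumptions, and the remaining steps (tagging, parsing, lemmatizing, named entity recognition, co-reference resolution) require a full NLP toolkit and are not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Minimal noise-reduction sketch; not the ReaderBench pipeline.
final class PreprocessingSketch {

    // Tiny illustrative stop list (assumption); a real pipeline would load
    // a complete, language-specific list.
    private static final Set<String> STOP_WORDS = new HashSet<>(
            Arrays.asList("the", "a", "an", "of", "and", "to", "in", "is", "are"));

    private PreprocessingSketch() {
    }

    /** Lowercases the text and splits it on runs of non-letter, non-digit characters. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Drops tokens that appear in the stop list. */
    static List<String> removeStopWords(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!STOP_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    /** Keeps only tokens found in the supplied dictionary. */
    static List<String> keepDictionaryWords(List<String> tokens, Set<String> dictionary) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (dictionary.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

Filtering in this order keeps the vocabulary small before any similarity computation, which is where most of the noise reduction comes from.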
Moreover, we have developed a topic mining module that integrates the previously defined semantic models (available for English, French and Italian).
Support levels: The component is available "as is", without warranties or conditions of any kind. Reported bugs will be fixed. Support will continue for new versions of the OS and game engines. New features will be added according to the developer's roadmap. New features can also be added upon request (requires a service contract).
The ReaderBench framework can either be cloned from our GitLab repository or simply used as a deployment library.
The repository contains three projects:
- The ReaderBench Core
- The ReaderBench Desktop Client
- The ReaderBench API
The ReaderBench Core exposes the Natural Language Processing functionalities and operations performed by ReaderBench. You may either clone this project and explore its contents, or use it as a Maven dependency retrieved from our Artifactory server.
The ReaderBench Desktop Client can be used to test ReaderBench functionalities through a Java Swing interface. This project uses the ReaderBench Core, so it can serve as a guide to integrating ReaderBench into your own projects.
The ReaderBench API can be used to explore how the ReaderBench Application Programming Interface works. Like the ReaderBench Desktop Client, it shows how to integrate the ReaderBench Core into a project.
English, French
https://git.readerbench.com/ReaderBench/ReaderBench.git
ReaderBench-Semantic-Models-and-Topic-Mining.zip
keywords extraction
topic mining
semantic models
topics
- http://readerbench.com/docs/semantic-models/manual
- https://git.readerbench.com/ReaderBench/ReaderBench/blob/v3.0/README.md
- https://git.readerbench.com/ReaderBench/ReaderBench/wikis/home
- http://readerbench.com/docs/api
- http://readerbench.com/docs/semantic-models/sdd
- https://git.readerbench.com/ReaderBench/ReaderBench/wikis/how-to/how-to-install-and-run-readerbench
- https://owncloud.readerbench.com/index.php/s/w33mnCcpH1Bp1zs/download?path=/&files=README.txt
Other
Other
Java
3.0
Stable version after major project split.
Completed
https://git.readerbench.com/ReaderBench/ReaderBench/tags/v2.3.5
Apache 2.0 (Apache License 2.0)