Unitex/GramLab
Lexicon-based Corpus Processing Suite
The automata-oriented technology of the Unitex/GramLab Natural Language Processing engine handles electronic resources such as electronic dictionaries and grammars and applies them to text for fast processing and analysis.
Language resources are the electronic dictionaries and grammars that power Unitex analysis of textual data. Resources for more than 22 languages are currently distributed out of the box with Unitex/GramLab.
The Visual Integrated Development Environment of Unitex/GramLab lets users easily design language resources and apply them to text files. In addition, a project-oriented perspective makes it possible to run a project with a single click.
Unitex/GramLab is freely distributed under the terms of the GNU Lesser General Public License (LGPL). This means that anyone can redistribute Unitex freely within the terms of the LGPL. It also means that you have access to the source code of all the Unitex programs, which is included in the zip file you download. The LGPL is more permissive than the GPL because it allows you to reuse Unitex/GramLab's own code in non-free software.
The Unitex/GramLab Core NLP Engine is written in C++, and the Visual IDE is written in Java. This makes it possible to develop Unitex-based applications on any system that supports Java 1.7, compile them with any standard-compliant C++ compiler, and run them on your favorite platform: Windows, Linux, macOS, and several others.
Unitex/GramLab conforms to the Unicode 3.0 standard, which lets users handle virtually all the characters of all languages, including Asian languages. The Unitex programs have been designed to work with all writing systems, so Asian languages pose no difficulty despite their particular spacing conventions.
Unitex/GramLab works with electronic dictionaries built by the members of the RELEX network, an international network of laboratories specializing in computational linguistics that was created by Maurice Gross and his LADL team. Members of the RELEX network have built and are building exhaustive dictionaries for many of the LGPLLR-licensed resources distributed with Unitex/GramLab.
Local grammars are a powerful formalism for describing syntactic or semantic rules. They consist of finite-state automata coupled with electronic dictionaries to perform automatic analysis of textual data. Unitex/GramLab features a rich visual IDE that lets users easily design, test, debug, maintain, and apply local grammars to a text.
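To make the idea of a finite-state automaton coupled with a dictionary more concrete, here is a minimal sketch in Python. It is not Unitex code and does not use any Unitex API or file format: the lexicon entries, the grammatical codes (DET, A, N, V), the transition table, and the function names match_at and find_matches are all hypothetical, chosen only for illustration. The toy grammar recognizes a determiner, followed by any number of adjectives, followed by a noun.

```python
# Toy dictionary: surface form -> set of grammatical codes.
# Entries and codes are hypothetical, not taken from a Unitex resource.
LEXICON = {
    "the": {"DET"},
    "a": {"DET"},
    "red": {"A"},
    "big": {"A"},
    "car": {"N"},
    "house": {"N"},
    "runs": {"V"},
}

# Toy automaton for DET A* N: state -> list of (code, next_state).
TRANSITIONS = {
    0: [("DET", 1)],
    1: [("A", 1), ("N", 2)],
    2: [],
}
FINAL_STATES = {2}


def match_at(tokens, start):
    """Return the end index of the longest match starting at `start`, or None."""
    # Each active configuration is a (state, position) pair.
    frontier = {(0, start)}
    best_end = None
    while frontier:
        next_frontier = set()
        for state, pos in frontier:
            if state in FINAL_STATES:
                best_end = pos if best_end is None else max(best_end, pos)
            if pos >= len(tokens):
                continue
            # Dictionary lookup: which codes can this token carry?
            codes = LEXICON.get(tokens[pos].lower(), set())
            for code, target in TRANSITIONS[state]:
                if code in codes:
                    next_frontier.add((target, pos + 1))
        frontier = next_frontier
    return best_end


def find_matches(text):
    """Scan `text` left to right and collect non-overlapping longest matches."""
    tokens = text.split()
    matches = []
    i = 0
    while i < len(tokens):
        end = match_at(tokens, i)
        if end is None:
            i += 1
        else:
            matches.append(" ".join(tokens[i:end]))
            i = end
    return matches
```

Calling find_matches("The big red car runs past a house") with this toy lexicon yields the two noun phrases "The big red car" and "a house". The transitions test dictionary codes rather than literal words, so extending the lexicon extends what the same small automaton can recognize.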