Unitex/GramLab is an open source, cross-platform, multilingual, lexicon- and grammar-based corpus processing suite.
Unitex is the C++ Natural Language Processing (NLP) engine of Unitex/GramLab. It is distributed under the terms of the GNU Lesser General Public License version 2.1 (LGPLv2.1) and contains only a few third-party code dependencies (LibYAML, Pstdin, TRE, WinGetOpt) licensed under more-permissive licenses.
GramLab is the project-oriented integrated development environment (IDE) of Unitex/GramLab. There is also a Classic IDE (Unitex.jar) that we are currently integrating with GramLab. Both are distributed under the terms of the GNU Lesser General Public License version 2.1 (LGPLv2.1) and contain only a few third-party dependencies (XAlign, Xerces2-j) licensed under equal or more-permissive licenses.
Language resources released with Unitex/GramLab are distributed under the terms of the Lesser General Public License For Linguistic Resources (LGPLLR). For authors and more information on these language resources, see here.
The User’s Manual (in PDF format) is available in English and French (more translations are welcome). You can view and print it with Evince, downloadable here. The latest online version of the User’s Manual is accessible here.
Support questions can be posted in the community support forum. You are welcome to ask to join at any time by following this link. Please feel free to submit any suggestions or requests for new features too. Some general advice about asking technical support questions can be found here.
See the Bug Reporting Guide for information on how to report bugs.
Unitex/GramLab project decision-making is based on a community meritocratic process. Anyone with an interest in it can join the community, contribute to the project design and participate in decisions. The Unitex/GramLab Governance Model describes how this participation takes place and how to set about earning merit within the project community.
Unitex/GramLab is spelled with capital "U", "G", and "L", and with everything else in lower case. Except for the forward slash, do not put a space or any other character between the words. When the forward slash is not allowed, you can simply write “UnitexGramLab”.
It's common to refer to the Unitex/GramLab Core as "Unitex", and to the Unitex Project-oriented IDE as "GramLab". If you are referring to the distribution suite (Core, IDE, Linguistic Resources and other bundled tools), always use "Unitex/GramLab".
- Main website: http://unitexgramlab.org
- Binary releases: http://releases.unitexgramlab.org
- User's manual: http://releases.unitexgramlab.org/latest-stable/man
- Bug reporting: http://unitexgramlab.org/how-to-report-a-bug
- User's forum: http://forum.unitexgramlab.org
- Developers list: unitex-devel at univ-mlv.fr
- Code hosting: http://code.unitexgramlab.org
- Your contribution: Contribution rules
- Governance: http://governance.unitexgramlab.org
How to start?
Thank you for your interest in contributing to Unitex/GramLab development! You could start by downloading a binary release and getting familiar with Unitex/GramLab. The User's Manual is available here.
Unitex/GramLab source code is hosted on https://github.com/UnitexGramLab. An overview of the C++ Core code (v3.0) is reachable here. For an overview of the Java IDE (v3.0) you could check this presentation. There are also some contribution rules here.
To start hacking on the code, check out the sources with git:
git clone https://github.com/UnitexGramLab/unitex-core.git
To compile under Linux, use:
make DEBUG=yes UNITEXTOOLLOGGERONLY=yes
make ADDITIONAL_CFLAG+=-DUNITEX_PREVENT_USING_WINRT_API DEBUG=yes UNITEXTOOLLOGGERONLY=yes
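The clone and Linux build steps above can be combined into one small script. This is only a sketch, not an official build script: it assumes the Makefile sits in a `build` subdirectory of the checkout (override `BUILD_DIR` if the layout differs), and the `run` helper and `DRY_RUN` switch are hypothetical names introduced here so the commands can be previewed before anything is executed.

```shell
#!/bin/sh
# Sketch: fetch the Unitex core sources and run the Linux debug build.
# ASSUMPTION: the Makefile lives in "<checkout>/build"; set BUILD_DIR otherwise.
set -eu

REPO_URL="https://github.com/UnitexGramLab/unitex-core.git"
SRC_DIR="${SRC_DIR:-unitex-core}"
BUILD_DIR="${BUILD_DIR:-$SRC_DIR/build}"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = only print, 0 = execute

run() {
  # Print the command; execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Clone only if the checkout does not exist yet.
if [ ! -d "$SRC_DIR" ]; then
  run git clone "$REPO_URL" "$SRC_DIR"
fi

# Same flags as the first Linux build line above.
run make -C "$BUILD_DIR" DEBUG=yes UNITEXTOOLLOGGERONLY=yes
```

Run it as-is to preview the commands, then with `DRY_RUN=0` to actually clone and build.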
Java GramLab IDE
git clone https://github.com/UnitexGramLab/gramlab-ide
To compile use:
git clone https://github.com/UnitexGramLab/lingua
Where to start?
All contributions are welcome. If you are a newcomer and want to help with the Unitex/GramLab codebase, look at the GitHub issues labeled good first issue.
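As a convenience, the label search can be turned into direct links. The sketch below prints a GitHub issue-search URL for each repository named in this document; whether a given repository actually has open issues under that label is not something the script checks, and `issue_url` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print GitHub search URLs for open "good first issue" tickets.
set -eu

ORG="UnitexGramLab"
# Repositories mentioned in this document.
REPOS="unitex-core gramlab-ide lingua"
# URL-encoded form of: is:issue is:open label:"good first issue"
QUERY='is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22'

issue_url() {
  # $1 = repository name
  printf 'https://github.com/%s/%s/issues?q=%s\n' "$ORG" "$1" "$QUERY"
}

for repo in $REPOS; do
  issue_url "$repo"
done
```

Open any of the printed URLs in a browser to see the matching issues for that repository.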