------------------------------------------------------------------------ r4253 | martinec | 2016-01-14 19:07:05 +0100 (Thu, 14 Jan 2016) | 1 line fix bug when a graph dictionary produces an empty morpho.dic ------------------------------------------------------------------------ r4252 | martinec | 2016-01-13 20:13:03 +0100 (Wed, 13 Jan 2016) | 1 line enhance: version information is now centralized in Version.h ------------------------------------------------------------------------ r4251 | martinec | 2016-01-13 17:22:53 +0100 (Wed, 13 Jan 2016) | 1 line minor: update disclaimer ------------------------------------------------------------------------ r4239 | gvollant | 2016-01-02 10:19:22 +0100 (Sat, 02 Jan 2016) | 1 line like revision 1220 and 2059, 2691, 3390, 3539 and 3753, Updating copyright to 2016 ------------------------------------------------------------------------ r4225 | martinec | 2015-12-08 14:18:38 +0100 (Tue, 08 Dec 2015) | 1 line minor enhance: add build information ------------------------------------------------------------------------ r4220 | martinec | 2015-12-07 17:34:55 +0100 (Mon, 07 Dec 2015) | 1 line minor refactor: update variable name ------------------------------------------------------------------------ r4219 | martinec | 2015-12-04 02:55:40 +0100 (Fri, 04 Dec 2015) | 1 line minor update: harmonize version variable names with build system ------------------------------------------------------------------------ r4215 | martinec | 2015-12-04 01:27:50 +0100 (Fri, 04 Dec 2015) | 1 line minor enhance: add version release definition ------------------------------------------------------------------------ r4212 | martinec | 2015-12-03 21:43:31 +0100 (Thu, 03 Dec 2015) | 1 line rename LICENSE.md to LICENSE ------------------------------------------------------------------------ r4200 | martinec | 2015-12-03 02:54:47 +0100 (Thu, 03 Dec 2015) | 1 line minor refactor: use only @ character to surround template variables ------------------------------------------------------------------------ r4193 | martinec | 2015-12-02 21:12:23 +0100 (Wed, 02 Dec 2015) | 1 line minor disclaimer refactor ------------------------------------------------------------------------ r4191 | martinec | 2015-11-30 19:24:31 +0100 (Mon, 30 Nov 2015) | 1 line add a templated-disclaimer about the Unitex/GramLab distribution ------------------------------------------------------------------------ r4190 | martinec | 2015-11-30 18:17:27 +0100 (Mon, 30 Nov 2015) | 1 line minor: update Unitex description ------------------------------------------------------------------------ r4188 | gupta | 2015-11-30 15:46:39 +0100 (Mon, 30 Nov 2015) | 1 line Generic graph bug fix: escape slash before calling grf2fst2 ------------------------------------------------------------------------ r4187 | martinec | 2015-11-30 15:16:04 +0100 (Mon, 30 Nov 2015) | 12 lines update stdint from v0.1.14 to v0.1.15 This version add the following next macros PRINTF_UINT8_DEC_WIDTH PRINTF_UINTMAX_DEC_WIDTH PRINTF_UINT64_DEC_WIDTH PRINTF_UINT32_DEC_WIDTH PRINTF_UINT16_DEC_WIDTH PRINTF_UINT8_DEC_WIDTH ------------------------------------------------------------------------ r4182 | martinec | 2015-11-27 17:29:28 +0100 (Fri, 27 Nov 2015) | 1 line enhance: add a main license for the Unitex core engine ------------------------------------------------------------------------ r4181 | martinec | 2015-11-27 17:29:18 +0100 (Fri, 27 Nov 2015) | 1 line add missing copyright notices ------------------------------------------------------------------------ r4179 | martinec | 2015-11-27 00:41:23 +0100 (Fri, 27 Nov 2015) | 1 line minor: harmonize variable names with build system ------------------------------------------------------------------------ r4169 | gvollant | 2015-11-24 09:09:42 +0100 (Tue, 24 Nov 2015) | 1 line UnPreprocess minor modification ------------------------------------------------------------------------ r4168 | martinec | 2015-11-23 19:12:44 +0100 (Mon, 23 Nov 2015) | 1 line minor: harmonize variable names with build system ------------------------------------------------------------------------ r4166 | gvollant | 2015-11-23 10:18:01 +0100 (Mon, 23 Nov 2015) | 1 line fix UnPreprocess overlap ------------------------------------------------------------------------ r4163 | gvollant | 2015-11-22 14:17:39 +0100 (Sun, 22 Nov 2015) | 1 line fix leak in UnPreprocess ------------------------------------------------------------------------ r4162 | paumier | 2015-11-22 07:29:27 +0100 (Sun, 22 Nov 2015) | 1 line In Fst2List: stopping when reaching the maximum number of lines specified by the user is not an error ------------------------------------------------------------------------ r4161 | gvollant | 2015-11-21 22:52:35 +0100 (Sat, 21 Nov 2015) | 1 line Introduce UnPreprocess ------------------------------------------------------------------------ r4160 | gvollant | 2015-11-21 20:44:21 +0100 (Sat, 21 Nov 2015) | 1 line unitex long name parameters are tolerant for mismatch - and _ (like char_by_char vs char-by-char, --only_verify_arguments vs --only-verify-arguments), because there was no coherency ------------------------------------------------------------------------ r4159 | gvollant | 2015-11-21 20:41:42 +0100 (Sat, 21 Nov 2015) | 2 lines fix bug in u_fget_unichars_raw which minor buffer size by 1 using UTF8. With this fix, the function has same hehavior in UTF16 and UTF8 ------------------------------------------------------------------------ r4158 | gvollant | 2015-11-21 20:40:10 +0100 (Sat, 21 Nov 2015) | 1 line as discussed in developer mailing list, revert r4154 "Allow output for simple subgraphs." ------------------------------------------------------------------------ r4154 | gupta | 2015-11-19 14:37:44 +0100 (Thu, 19 Nov 2015) | 1 line Allow output for simple subgraphs. ------------------------------------------------------------------------ r4152 | gvollant | 2015-11-17 14:10:27 +0100 (Tue, 17 Nov 2015) | 1 line error in cassys_tokenize did not abort tokenize ------------------------------------------------------------------------ r4145 | martinec | 2015-11-09 20:01:42 +0100 (Mon, 09 Nov 2015) | 1 line add a header regrouping Unitex release information ------------------------------------------------------------------------ r4142 | gupta | 2015-11-05 16:11:27 +0100 (Thu, 05 Nov 2015) | 1 line generic graph bug fix ------------------------------------------------------------------------ r4141 | gvollant | 2015-11-05 14:55:57 +0100 (Thu, 05 Nov 2015) | 1 line minor stack-size modification in UnitexTool ------------------------------------------------------------------------ r4140 | gvollant | 2015-11-05 14:51:33 +0100 (Thu, 05 Nov 2015) | 1 line change in Locate --le*_tolerant behavior ------------------------------------------------------------------------ r4139 | gvollant | 2015-11-05 10:52:02 +0100 (Thu, 05 Nov 2015) | 1 line Default value on somes Locate parameters (change really for stop token) ------------------------------------------------------------------------ r4138 | gvollant | 2015-11-05 10:51:22 +0100 (Thu, 05 Nov 2015) | 1 line prepare for default value on somes Locate parameters ------------------------------------------------------------------------ r4136 | gupta | 2015-11-04 15:54:38 +0100 (Wed, 04 Nov 2015) | 1 line start = pos2 for META_PRE and META_UPPER ------------------------------------------------------------------------ r4135 | gvollant | 2015-11-04 13:35:11 +0100 (Wed, 04 Nov 2015) | 1 line uses SemVer version info copyright info ------------------------------------------------------------------------ r4134 | gvollant | 2015-11-04 13:23:15 +0100 (Wed, 04 Nov 2015) | 1 line uses SemVer version info copyright info ------------------------------------------------------------------------ r4133 | gvollant | 2015-11-04 11:12:34 +0100 (Wed, 04 Nov 2015) | 1 line enhance VersionInfo - forgotten file in commit ------------------------------------------------------------------------ r4132 | gvollant | 2015-11-04 10:41:15 +0100 (Wed, 04 Nov 2015) | 1 line enhance VersionInfo ------------------------------------------------------------------------ r4131 | gvollant | 2015-11-04 10:19:37 +0100 (Wed, 04 Nov 2015) | 1 line enhance VersionInfo ------------------------------------------------------------------------ r4130 | gvollant | 2015-11-04 10:19:19 +0100 (Wed, 04 Nov 2015) | 1 line fix memory leak in cassys ------------------------------------------------------------------------ r4129 | martinec | 2015-11-03 15:06:11 +0100 (Tue, 03 Nov 2015) | 1 line minor: command line and comments refactor ------------------------------------------------------------------------ r4128 | gupta | 2015-11-03 11:09:37 +0100 (Tue, 03 Nov 2015) | 1 line Generic graph searchs current _snt folder for tok_by_alph.txt ------------------------------------------------------------------------ r4127 | martinec | 2015-11-02 18:34:57 +0100 (Mon, 02 Nov 2015) | 1 line minor fix: remove debugging code ------------------------------------------------------------------------ r4126 | martinec | 2015-11-02 18:07:01 +0100 (Mon, 02 Nov 2015) | 74 lines Fixed bugs: - Report wrong line error number when compressing multiple files - Report wrong count of entries when a file has comments - Avoid to compress more than once a file, i.e. Compress foo.dic foo.dic -o foo.bin - Avoid to compress empty dictionaries (without any entry) - Avoid to build a .inf file when a file doesn't exists - Under Linux forbid to open/compress directories - Prevent memory corruption when encounter an entry like 'foo,.bar\' Enhances: - Always report the line number when an error occurs - Report the number of files processed - Report the number of errors found (if any) - Report the number of entries processed (without comments or errors) Command line: - Command line usage rewritten using docopt format http://docopt.org/ - New command line parameter --output_type=TYPE, type could be 'bin1' or 'bin2' - New command line parameter --version, to print the Compress version according to the "Standards for Command Line Interfaces" https://goo.gl/7UgLC8 - Exposing --input_encoding and --output_encoding parameters - Print a message informing the deprecation of --v1, --v2 and --bin2 in favor of --output_type=bin1 and output_type=bin2. For this release, no warnings or errors are printed about using deprecated options New functions: - build_tree_from_dictionary() Builds a tree representation of a DELAF dictionary - build_tree_from_dictionary_list() Builds a tree representation of a list of DELAF dictionaries - create_and_save_inf() Creates an inflectional information file - minimize_and_save_tree_as_bin_classic() Minimizes and save a classic DELAF dictionary into a file - minimize_and_save_tree_as_bin_two() Minimizes and save a bin2 DELAF dictionary into a file Experimental features: - Include support to build DELAF dictionaries that includes other DELAF files, i.e. a meta-dictionary. N.dic house,.N:s church,.N+Conc:s park,.N:s A.dic red,.A blue,.A white,.A dela.dic //! N.dic //! A.dic Here, dela.dic is a meta-dictionary that includes (//!) both N.dic and A.dic. Relative paths are supported (//! foo/N.dic) as well as a mixed of entries, comments and includes in the same file. Include recursions (e.g. A includes B, B includes A; or A includes B, B includes C, C includes A) are detected and avoided. Unfixed bugs: - Unitex doesn't support open files containing unicode characters in their filename, i.e. Under Windows, Compress ?\232?\191?\144?\230?\176?\148.dic will not work. In the case of POSIX systems fopen() accepts UTF8 filenames. ------------------------------------------------------------------------ r4125 | martinec | 2015-11-02 18:04:53 +0100 (Mon, 02 Nov 2015) | 1 line minor enhance: is_in_list() and equal() functions accept a custom comparator ------------------------------------------------------------------------ r4124 | martinec | 2015-11-02 15:16:29 +0100 (Mon, 02 Nov 2015) | 27 lines add UnitexFileType and portable filename handling and test functions This commit add a new type named UnitexFileType and the function get_file_type() to identify specific file types: abstract, regular, directory, etc. It includes also some new filename handling and test functions: - to_unix_path_separators() Converts a filename according to the Unix path separator rules - to_windows_path_separators() Converts a filename according to the Windows path separator rules - to_native_path_separators Converts a filename according to the native system path separator rules - is_directory() Checks if the givenilename corresponds to a directory - is_regular_file() Checks if the given filename corresponds to a regular file - is_abstract_file() Checks if the given filename corresponds to a Unitex abstract file - is_unknown_file() Checks if the given filename corresponds to a unknown file type - get_real_path() Derives from the pathname pointed to by filename, an absolute pathname - file_exists() Checks whether a file or directory exists ------------------------------------------------------------------------ r4122 | martinec | 2015-11-02 13:05:02 +0100 (Mon, 02 Nov 2015) | 1 line add a canonical unitex version string ------------------------------------------------------------------------ r4121 | gvollant | 2015-11-01 17:03:52 +0100 (Sun, 01 Nov 2015) | 1 line add --stack-size=N as first possible option in UnitexTool ------------------------------------------------------------------------ r4120 | gvollant | 2015-11-01 16:01:27 +0100 (Sun, 01 Nov 2015) | 1 line fix in option stack-size ------------------------------------------------------------------------ r4119 | gvollant | 2015-11-01 13:09:08 +0100 (Sun, 01 Nov 2015) | 1 line add option stack-size in RunLog and BatchRunScript ------------------------------------------------------------------------ r4118 | gvollant | 2015-10-28 12:39:47 +0100 (Wed, 28 Oct 2015) | 1 line minor : suppress a warning with old GCC 3.4 and -pedantic ------------------------------------------------------------------------ r4117 | martinec | 2015-10-28 09:09:36 +0100 (Wed, 28 Oct 2015) | 1 line minor fix: avoid a memory leak when function fails ------------------------------------------------------------------------ r4116 | martinec | 2015-10-28 06:42:23 +0100 (Wed, 28 Oct 2015) | 1 line minor: prevent unused parameter compiler warning in non-Windows systems ------------------------------------------------------------------------ r4115 | martinec | 2015-10-28 05:43:03 +0100 (Wed, 28 Oct 2015) | 1 line minor: remove comma at end of enumerator list ------------------------------------------------------------------------ r4114 | martinec | 2015-10-27 20:13:51 +0100 (Tue, 27 Oct 2015) | 1 line minor fix: rename MAX_CLASS_NAME to MAX_DIC_CLASS_NAME to avoid clashing with windows.h macro ------------------------------------------------------------------------ r4113 | gvollant | 2015-10-27 09:13:03 +0100 (Tue, 27 Oct 2015) | 1 line Locate display stop token info as error (STDERR and not STDOUT), and give grammar name ------------------------------------------------------------------------ r4112 | martinec | 2015-10-25 18:16:53 +0100 (Sun, 25 Oct 2015) | 1 line minor update copyright year using update_copyright_year.sh ------------------------------------------------------------------------ r4110 | gvollant | 2015-10-22 22:31:25 +0200 (Thu, 22 Oct 2015) | 1 line fix potential warning ------------------------------------------------------------------------ r4109 | gvollant | 2015-10-22 14:14:36 +0200 (Thu, 22 Oct 2015) | 1 line minor warning fix ------------------------------------------------------------------------ r4107 | gvollant | 2015-10-19 17:38:22 +0200 (Mon, 19 Oct 2015) | 1 line minor error handling enhancemend in code from miniz/tinfl.c ------------------------------------------------------------------------ r4106 | martinec | 2015-10-19 02:36:53 +0200 (Mon, 19 Oct 2015) | 1 line add support to use const char* values with ustring lists ------------------------------------------------------------------------ r4105 | martinec | 2015-10-19 02:34:01 +0200 (Mon, 19 Oct 2015) | 1 line minor: add missing cast ------------------------------------------------------------------------ r4104 | martinec | 2015-10-19 01:55:34 +0200 (Mon, 19 Oct 2015) | 1 line add a miscellaneous script to update copyright year ------------------------------------------------------------------------ r4103 | gvollant | 2015-10-19 00:13:10 +0200 (Mon, 19 Oct 2015) | 1 line add code from miniz/tinfl.c (very compact inflater) in FileUnpack to have a support of normal zip lingpackage ------------------------------------------------------------------------ r4102 | martinec | 2015-10-17 01:04:27 +0200 (Sat, 17 Oct 2015) | 1 line add a function to dumps the first n values of a given hash of strings ------------------------------------------------------------------------ r4094 | martinec | 2015-10-13 11:56:46 +0200 (Tue, 13 Oct 2015) | 1 line fix bug with output variables concatenation in morphological mode ------------------------------------------------------------------------ r4093 | martinec | 2015-10-12 18:19:51 +0200 (Mon, 12 Oct 2015) | 1 line fix bug introduced with r4088 ------------------------------------------------------------------------ r4092 | gvollant | 2015-10-12 15:57:06 +0200 (Mon, 12 Oct 2015) | 1 line save stack in BatchRunScript ------------------------------------------------------------------------ r4091 | martinec | 2015-10-10 21:02:06 +0200 (Sat, 10 Oct 2015) | 1 line minor: add missing cast ------------------------------------------------------------------------ r4090 | martinec | 2015-10-10 21:01:11 +0200 (Sat, 10 Oct 2015) | 1 line refactor comment ------------------------------------------------------------------------ r4089 | gvollant | 2015-10-10 08:17:35 +0200 (Sat, 10 Oct 2015) | 1 line fix minor compilation problem introduced 4088 ------------------------------------------------------------------------ r4088 | gvollant | 2015-10-09 23:04:08 +0200 (Fri, 09 Oct 2015) | 1 line minor cast rewrite on u_to_char_n ------------------------------------------------------------------------ r4087 | martinec | 2015-10-09 19:14:31 +0200 (Fri, 09 Oct 2015) | 1 line add u_to_char_n function ------------------------------------------------------------------------ r4086 | gvollant | 2015-10-07 17:36:22 +0200 (Wed, 07 Oct 2015) | 1 line protect DumpOffset denormalization against infinite loop ------------------------------------------------------------------------ r4085 | gvollant | 2015-10-07 14:26:12 +0200 (Wed, 07 Oct 2015) | 1 line optimize cassys with insert_at_end_of_list_with_latest ------------------------------------------------------------------------ r4084 | gvollant | 2015-10-07 14:25:51 +0200 (Wed, 07 Oct 2015) | 1 line add insert_at_end_of_list_with_latest function in List_ustring ------------------------------------------------------------------------ r4083 | gvollant | 2015-10-07 11:39:19 +0200 (Wed, 07 Oct 2015) | 1 line remove recursive loop ------------------------------------------------------------------------ r4082 | gvollant | 2015-10-07 11:18:05 +0200 (Wed, 07 Oct 2015) | 1 line protect Concord again fatal_error with bad token file ------------------------------------------------------------------------ r4081 | martinec | 2015-10-06 19:44:15 +0200 (Tue, 06 Oct 2015) | 1 line add a Unitex implementation of realpath ------------------------------------------------------------------------ r4078 | gvollant | 2015-10-03 08:42:48 +0200 (Sat, 03 Oct 2015) | 1 line replace tabs by space (or invert) to coherency inside file. No real code change ------------------------------------------------------------------------ r4077 | gvollant | 2015-10-03 00:16:40 +0200 (Sat, 03 Oct 2015) | 1 line misc warning fix ------------------------------------------------------------------------ r4076 | gvollant | 2015-10-03 00:15:15 +0200 (Sat, 03 Oct 2015) | 1 line remove warning: enumeration value not handled in switch ------------------------------------------------------------------------ r4075 | gvollant | 2015-10-02 23:59:54 +0200 (Fri, 02 Oct 2015) | 1 line fix in Fst2TxtAsRoutine.cpp ------------------------------------------------------------------------ r4074 | gupta | 2015-10-02 14:26:58 +0200 (Fri, 02 Oct 2015) | 1 line Update u_len_possible_match ------------------------------------------------------------------------ r4073 | gupta | 2015-10-02 14:23:56 +0200 (Fri, 02 Oct 2015) | 1 line Fix r4072 ------------------------------------------------------------------------ r4072 | gupta | 2015-10-02 11:24:58 +0200 (Fri, 02 Oct 2015) | 1 line Adding new lexical masks: LETTER, WORD, UPPER, LOWER and FIRST ------------------------------------------------------------------------ r4069 | gupta | 2015-10-01 15:50:58 +0200 (Thu, 01 Oct 2015) | 1 line Denormalize bug fix (error at tag boundry) ------------------------------------------------------------------------ r4068 | gvollant | 2015-09-30 10:14:01 +0200 (Wed, 30 Sep 2015) | 1 line BatchRunScript can display progress info ------------------------------------------------------------------------ r4067 | martinec | 2015-09-29 14:34:42 +0200 (Tue, 29 Sep 2015) | 1 line refactor: split disclaimer files to support r4062 ------------------------------------------------------------------------ r4065 | martinec | 2015-09-28 12:10:38 +0200 (Mon, 28 Sep 2015) | 1 line minor: fix download page urls and spell information ------------------------------------------------------------------------ r4064 | gvollant | 2015-09-27 23:26:40 +0200 (Sun, 27 Sep 2015) | 1 line UniRunScript can display final report ------------------------------------------------------------------------ r4063 | gvollant | 2015-09-27 22:26:01 +0200 (Sun, 27 Sep 2015) | 1 line Optimize tags analysis ------------------------------------------------------------------------ r4060 | martinec | 2015-09-25 16:58:13 +0200 (Fri, 25 Sep 2015) | 1 line refactor: word-wrap line-breaks ------------------------------------------------------------------------ r4059 | gupta | 2015-09-25 15:42:51 +0200 (Fri, 25 Sep 2015) | 1 line Introducing a new meta tag: . This is will only work in the morphological mode. This tag will match a letter (i.e. it behaves the way the existing tag does in the morphological mode). ------------------------------------------------------------------------ r4058 | martinec | 2015-09-24 23:16:29 +0200 (Thu, 24 Sep 2015) | 1 line update LGPLLR link ------------------------------------------------------------------------ r4057 | martinec | 2015-09-24 23:12:12 +0200 (Thu, 24 Sep 2015) | 1 line enhance README information ------------------------------------------------------------------------ r4056 | gvollant | 2015-09-24 22:10:35 +0200 (Thu, 24 Sep 2015) | 1 line minor warning fix ------------------------------------------------------------------------ r4055 | gvollant | 2015-09-24 18:09:57 +0200 (Thu, 24 Sep 2015) | 1 line warning Fix ------------------------------------------------------------------------