corrlex & corrlexalgos modules).
Data Structures | |
| struct | SolutionPart |
| Lexical correction solution part. | |
| struct | Solution |
| Lexical correction solution. | |
| struct | SolutionSet |
| Lexical correction solution set. | |
Defines | |
| #define | SOLUTION_SET_ALLOC_INCREMENT 30 |
| Number of added elements during a SolutionSet (re)allocation. | |
| #define | DEFAULT_CORRECTION WEIGHTED_CORRECTION |
| Default correction mode. | |
| #define | COST_RANGE 0.5 |
| Cost range of the solutions to include during a spellCorrectFlat call when the max_solutions parameter is set to 0. | |
| #define | COST_FORMAT "%4.2f" |
| Format specification to provide to printf-family functions when printing a Weight value. | |
Typedefs | |
| typedef double | Weight |
| Cost of a lexical transformation. | |
| typedef GSList | PositionList |
| Singly-linked list of string indices. | |
Enumerations | |
| enum | CorrectionMode { BASIC_CORRECTION = 0, WEIGHTED_CORRECTION, SPLITED_CORRECTION } |
| Lexical correction mode. | |
Functions | |
| void | solutionSetEnlarge (SolutionSet **solution_set, size_t *current_size, size_t *allocated_size) |
| void | parsingChartGetMaxLexemes (ParsingChart *chart, const char *head, StringArray *words) |
| void | getCorrection (const Lexicon *lexicon, const LexicalEntryIndex entry_index, const Weight cost, const gboolean get_word, GString *output) |
| void | positionListAdd (PositionList **position_list, const short int position) |
| ParsingChart * | spellCorrectChart (const char *input_string, const Lexicon *lexicon, const Weight max_cost, const Weight mark_cost, const Weight capital_cost, const Weight blank_cost) |
| ParsingChart * | lexematize (const char *input_string, const Lexicon *lexicon, int(*is_space)(int), int(*is_never_delimiter)(int), int(*is_glueable)(int)) |
| void | solutionSetFree (SolutionSet *solutions_set) |
| void | spellCorrectFlat (const char *input_string, LexicalAccessTable *lexical_access_table, const int max_solutions, const CorrectionMode mode, const Weight max_cost, const Weight mark_cost, const Weight capital_cost, const Weight blank_cost, SolutionSet *solutions_set) |
| void | solutionGetString (const LexicalAssocMem *lam, const SolutionSet *solution_set, size_t index, const char *delimiter, GString *output) |
SlpTK Library 0.6.0
<lexicalcorrection.h> Antonin Merçay (revision on 14.12.2004)
| #define DEFAULT_CORRECTION WEIGHTED_CORRECTION |
| enum CorrectionMode |
Lexical correction mode.
Specifies the lexical correction mode to apply during a spellCorrectFlat call.
| BASIC_CORRECTION | Lexical correction that uses only insertion, deletion and substitution operations. All operations have a unit cost. |
| WEIGHTED_CORRECTION | Lexical correction similar to BASIC_CORRECTION, but that also takes into account accenting, capital/small letter conversion and blank character insertion/deletion operations. Each of these three operations can have its own cost, which need not be a whole number. |
| SPLITED_CORRECTION | Lexical correction similar to WEIGHTED_CORRECTION, but where the insertion/deletion of blank characters can occur between words, i.e. the correction result may consist of a sequence of several words. |
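The difference between BASIC_CORRECTION and WEIGHTED_CORRECTION can be illustrated with a small dynamic-programming edit cost. This is only a sketch under stated assumptions, not the library's algorithm: `sketch_edit_cost` and its helpers are hypothetical names, accent handling is left out, and SPLITED_CORRECTION is treated like WEIGHTED_CORRECTION because splitting into several words is not modelled.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef double Weight; /* same typedef as in the header */

/* Local copy of the enum so that the sketch compiles without the library. */
typedef enum {
    BASIC_CORRECTION = 0,
    WEIGHTED_CORRECTION,
    SPLITED_CORRECTION
} CorrectionMode;

static Weight min3(Weight a, Weight b, Weight c)
{
    Weight m = a < b ? a : b;
    return m < c ? m : c;
}

/* Hypothetical helper: cost of replacing character a by character b. */
static Weight subst_cost(char a, char b, CorrectionMode mode, Weight capital_cost)
{
    if (a == b)
        return 0.0;
    if (mode != BASIC_CORRECTION &&
        tolower((unsigned char)a) == tolower((unsigned char)b))
        return capital_cost; /* only the letter case differs */
    return 1.0;
}

/* Hypothetical helper: cost of inserting or deleting character c. */
static Weight indel_cost(char c, CorrectionMode mode, Weight blank_cost)
{
    if (mode != BASIC_CORRECTION && isspace((unsigned char)c))
        return blank_cost;
    return 1.0;
}

/* Edit cost from `from` to `to`; returns -1.0 if memory allocation fails. */
Weight sketch_edit_cost(const char *from, const char *to, CorrectionMode mode,
                        Weight capital_cost, Weight blank_cost)
{
    size_t n = strlen(from), m = strlen(to), i, j;
    Weight *prev = malloc((m + 1) * sizeof *prev);
    Weight *cur = malloc((m + 1) * sizeof *cur);
    Weight *tmp, result;

    if (prev == NULL || cur == NULL) {
        free(prev);
        free(cur);
        return -1.0;
    }

    /* Row 0: build each prefix of `to` by insertions only. */
    prev[0] = 0.0;
    for (j = 1; j <= m; ++j)
        prev[j] = prev[j - 1] + indel_cost(to[j - 1], mode, blank_cost);

    for (i = 1; i <= n; ++i) {
        cur[0] = prev[0] + indel_cost(from[i - 1], mode, blank_cost);
        for (j = 1; j <= m; ++j) {
            cur[j] = min3(
                prev[j] + indel_cost(from[i - 1], mode, blank_cost),  /* deletion */
                cur[j - 1] + indel_cost(to[j - 1], mode, blank_cost), /* insertion */
                prev[j - 1] + subst_cost(from[i - 1], to[j - 1], mode, capital_cost));
        }
        tmp = prev; /* the finished row becomes the previous row */
        prev = cur;
        cur = tmp;
    }

    result = prev[m];
    free(prev);
    free(cur);
    return result;
}
```

With these assumptions, `sketch_edit_cost("Paris", "paris", BASIC_CORRECTION, 0.25, 0.5)` charges a full substitution, while the same call with WEIGHTED_CORRECTION charges only `capital_cost`.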
| void getCorrection | ( | const Lexicon * | lexicon, | |
| const LexicalEntryIndex | entry_index, | |||
| const Weight | cost, | |||
| const gboolean | get_word, | |||
| GString * | output | |||
| ) |
Dump a lexical correction in an output string buffer
| [in] | lexicon | The reference vocabulary lexicon |
| [in] | entry_index | The index of the corrected word in the lexicon |
| [in] | cost | The cost of the required lexical correction (0 to avoid cost printing) |
| [in] | get_word | Set if the written form of the corrected word must be extracted from the vocabulary lexicon |
| output | The string buffer where to append the correction |
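As a rough illustration of the "0 to avoid cost printing" convention combined with COST_FORMAT, the sketch below formats a word and its cost into a plain character buffer. The helper name `sketch_format_correction` and the `word (cost)` layout are assumptions made for the example; the actual output layout of getCorrection is not documented here, and the real function appends to a GString instead.

```c
#include <stdio.h>

typedef double Weight;      /* same typedef as in the header */
#define COST_FORMAT "%4.2f" /* same definition as in the header */

/*
 * Hypothetical helper: writes "word" when cost is 0, and "word (cost)"
 * otherwise. Returns the value of snprintf, i.e. the number of characters
 * the full text requires (excluding the terminating null character).
 */
int sketch_format_correction(char *buf, size_t size, const char *word, Weight cost)
{
    if (cost == 0.0)
        return snprintf(buf, size, "%s", word);
    return snprintf(buf, size, "%s (" COST_FORMAT ")", word, cost);
}
```

Because COST_FORMAT is a string literal, it can be concatenated with the surrounding format string at compile time, as done above.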
affiche_correction
| ParsingChart * lexematize | ( | const char * | input_string, | |
| const Lexicon * | lexicon, | |||
| int(*)(int) | is_space, | |||
| int(*)(int) | is_never_delimiter, | |||
| int(*)(int) | is_glueable | |||
| ) |
Lexematization algorithm (in other words, lexical correction with null cost) that cuts up an input string into lexical tokens.
| [in] | input_string | The input string to lexematize |
| [in] | lexicon | The reference lexicon containing the reference lexemes |
| [in] | is_space | The blank character classification routine |
| [in] | is_never_delimiter | The classification routine for characters that are never delimiters |
| [in] | is_glueable | The glueable character classification routine |
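The three classification routines share the `int(*)(int)` signature of the `<ctype.h>` functions, so a standard function such as `isspace` can be passed directly, or a custom classifier can be supplied. The sketch below only illustrates that callback shape with a toy token counter; `sketch_is_space` and `sketch_count_tokens` are hypothetical names, and the code does not reproduce what lexematize does with a lexicon.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical classifier: treats '_' as a blank in addition to isspace(). */
int sketch_is_space(int c)
{
    return c == '_' || isspace(c);
}

/*
 * Toy example of consuming such a callback: counts the maximal runs of
 * characters that the classifier does not consider blank.
 */
size_t sketch_count_tokens(const char *s, int (*is_space)(int))
{
    size_t count = 0;
    int in_token = 0;

    for (; *s != '\0'; ++s) {
        /* Cast to unsigned char, as required by the <ctype.h> functions. */
        if (is_space((unsigned char)*s)) {
            in_token = 0;
        } else if (!in_token) {
            in_token = 1;
            ++count;
        }
    }
    return count;
}
```

For the input `"the cat_sat"`, passing `isspace` yields two tokens, whereas passing `sketch_is_space` yields three.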
Correction_Zero & Lexematise
| void parsingChartGetMaxLexemes | ( | ParsingChart * | chart, | |
| const char * | head, | |||
| StringArray * | words | |||
| ) |
Extract, from left to right, the sequence of lexemes that covers a sentence processed by lexematize. The lexemes are output in a StringArray in which each unknown word is prefixed by the provided head parameter.
| [in] | chart | The considered parsing chart |
| [in] | head | The prefix to insert before unknown words |
| [out] | words | The array where to output the solution |
Solution_Max_Treillis (from Christophe de Benoit's project)
| void positionListAdd | ( | PositionList ** | position_list, | |
| const short int | position | |||
| ) |
Add a value to a position list sorted in ascending order
| position_list | The position list where to add | |
| position | The value to add to the list |
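Insertion into a list kept in ascending order can be sketched as follows. To stay self-contained, the example uses a hypothetical stand-in node type (`SketchNode`) rather than GLib's GSList; with GLib itself, `g_slist_insert_sorted` provides comparable behaviour. Whether the library keeps or rejects duplicate positions is not stated, so the sketch simply keeps them.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a PositionList node (the library uses GSList). */
typedef struct SketchNode {
    short int position;
    struct SketchNode *next;
} SketchNode;

/*
 * Inserts `position` so that the list stays sorted in ascending order.
 * Returns 0 on success and -1 if memory allocation fails.
 */
int sketch_position_list_add(SketchNode **list, short int position)
{
    SketchNode *node = malloc(sizeof *node);

    if (node == NULL)
        return -1;
    node->position = position;

    /* Advance to the first link whose target is not smaller than position. */
    while (*list != NULL && (*list)->position < position)
        list = &(*list)->next;

    node->next = *list;
    *list = node;
    return 0;
}

/* Releases every node of the list. */
void sketch_position_list_free(SketchNode *list)
{
    while (list != NULL) {
        SketchNode *next = list->next;
        free(list);
        list = next;
    }
}
```

Taking a pointer to the link being updated, in the same way as the `PositionList **` parameter, avoids a special case for insertion at the head of the list.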
ajoute_liste_pos_chaine
| void solutionGetString | ( | const LexicalAssocMem * | lam, | |
| const SolutionSet * | solution_set, | |||
| size_t | index, | |||
| const char * | delimiter, | |||
| GString * | output | |||
| ) |
Convert a lexical correction solution into its equivalent string representation
| [in] | lam | The LexicalAssocMem that contains the strings to convert to. |
| [in] | solution_set | The solution set that contains the solutions to convert from. |
| [in] | index | The index of the solution inside solution_set |
| [in] | delimiter | The string to insert between the words of the solution. If NULL is specified, a single space is used. |
| [out] | output | The string where to output the result |
Solution_Vers_String
| void solutionSetEnlarge | ( | SolutionSet ** | solution_set, | |
| size_t * | current_size, | |||
| size_t * | allocated_size | |||
| ) |
Enlarge a solution set by one element
| solution_set | The solution set to enlarge | |
| current_size | The number of elements currently used (incremented after function completion) | |
| allocated_size | The number of elements currently allocated (may be increased after function completion) |
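The relationship between current_size, allocated_size and SOLUTION_SET_ALLOC_INCREMENT can be sketched as a chunked `realloc`. This is one plausible reading of the parameter descriptions, not the library's implementation: `SketchSolution` is a hypothetical stand-in element type, and, unlike the documented `void` function, the sketch returns a status code so that an allocation failure can be reported.

```c
#include <stdlib.h>

#define SOLUTION_SET_ALLOC_INCREMENT 30 /* same value as in the header */

/* Hypothetical stand-in for one element of a solution set. */
typedef struct {
    double cost;
} SketchSolution;

/*
 * Makes room for one more element. The storage only grows, by
 * SOLUTION_SET_ALLOC_INCREMENT elements, when every allocated element is
 * already in use. Returns 0 on success and -1 if reallocation fails, in
 * which case the set and both sizes are left unchanged.
 */
int sketch_enlarge(SketchSolution **set, size_t *current_size, size_t *allocated_size)
{
    if (*current_size == *allocated_size) {
        size_t new_allocated = *allocated_size + SOLUTION_SET_ALLOC_INCREMENT;
        SketchSolution *grown = realloc(*set, new_allocated * sizeof **set);

        if (grown == NULL)
            return -1;
        *set = grown;
        *allocated_size = new_allocated;
    }
    ++*current_size;
    return 0;
}
```

Growing in fixed increments keeps the number of reallocations low when solutions are appended one at a time, at the price of up to 29 unused elements.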
augmente_ens_sol
| void solutionSetFree | ( | SolutionSet * | solutions_set | ) |
Free the memory allocated to a solution set
| solutions_set | The solution set to free |
Libere_Ensemble_Solutions
| ParsingChart ** spellCorrectChart | ( | const char * | input_string, | |
| const Lexicon * | lexicon, | |||
| const Weight | max_cost, | |||
| const Weight | mark_cost, | |||
| const Weight | capital_cost, | |||
| const Weight | blank_cost | |||
| ) |
Correct a string using the words stored in a lexicon up to a given lexical transformation cost. The operation returns a lattice (stored in a parsing chart) that contains all the word sequences found.
| [in] | input_string | The input string to lexematize |
| [in] | lexicon | The reference lexicon containing the reference lexemes |
| [in] | max_cost | The maximal allowed correction cost between the input and a solution |
| [in] | mark_cost | The cost of an accent addition/removal transformation |
| [in] | capital_cost | The cost of a capital/small letter transformation |
| [in] | blank_cost | The cost of a blank character insertion/deletion transformation |
Correction_Treillis
| void spellCorrectFlat | ( | const char * | input_string, | |
| LexicalAccessTable * | lexical_access_table, | |||
| const int | max_solutions, | |||
| const CorrectionMode | mode, | |||
| const Weight | max_cost, | |||
| const Weight | mark_cost, | |||
| const Weight | capital_cost, | |||
| const Weight | blank_cost, | |||
| SolutionSet * | solutions_set | |||
| ) |
Correct a string using the words stored in a lexical access table up to a given lexical transformation cost. The operation returns a solution set where each (flat) solution is a sequence of recognized words.
| [in] | input_string | The string to correct |
| [in] | lexical_access_table | The lexical memory containing the recognized words |
| [in] | max_solutions | The maximal number of solutions to output:
|
| [in] | mode | The correction mode used |
| [in] | max_cost | The maximal allowed correction cost between the input and a solution |
| [in] | mark_cost | The cost of an accent addition/removal transformation |
| [in] | capital_cost | The cost of a capital/small letter transformation |
| [in] | blank_cost | The cost of a blank character insertion/deletion transformation |
| [out] | solutions_set | The set of solutions found |
Correction_Lexico