Token - Tokenization module
[Lexical tools level]

Tokenization algorithm and related data structures (former token module). More...

Data Structures

struct  Token
 Token. More...
struct  Tokenization
 Tokenization. More...

Typedefs

typedef GSList TokenList
 List of tokens.

Functions

Tokenization tokenize (const char *text, const gboolean text_follows, int(*is_delimiter)(int), int(*is_never_delimiter)(int), int(*is_glueable)(int))
void tokenizationFree (Tokenization *tokenization)
void tokenizationDump (Tokenization *tokenization, const char *delimiter, int(*print)(const char *,...))
void tokenDump (const Token *token, const char *delimiter, int(*print)(const char *,...))
gboolean tokenMerge (TokenList *token_list1, TokenList *token_list2)
void tokenGetString (const Token *token, GString *output)

Detailed Description

Tokenization algorithm and related data structures (former token module).

SlpTK Library 0.6.0

Required header
<token.h>
Author:
Jean-Cédric Chappelier (creation on 13.02.1997)

Antonin Merçay (revision on 15.12.2004)

Date:
2 March 2005
Version:
0.6.0

Typedef Documentation

typedef GSList TokenList

List of tokens.

Singly-linked list of Token


Function Documentation

void tokenDump ( const Token *  token,
const char *  delimiter,
int(*)(const char *,...)  print 
)

Dump a token

Parameters:
[in] token The token to dump
[in] delimiter The string to dump at the end of the token
[in] print The printing function used to dump
Former function(s):
Affiche_Token

void tokenGetString ( const Token *  token,
GString *  output 
)

Convert a token into its string representation

Parameters:
token The token to convert
output The string receiving the result
Former function(s):
Token2String

void tokenizationDump ( Tokenization *  tokenization,
const char *  delimiter,
int(*)(const char *,...)  print 
)

Dump a tokenization

Parameters:
[in] tokenization The tokenization to dump
[in] delimiter The string to dump at the end of each token
[in] print The printing function used to dump
Former function(s):
Affiche_Tokenisation

void tokenizationFree ( Tokenization *  tokenization )

Free the memory allocated to a tokenization

Parameters:
[in] tokenization The tokenization to free
Former function(s):
Libere_Tokenisation

Tokenization tokenize ( const char *  text,
const gboolean  text_follows,
int(*)(int)  is_delimiter,
int(*)(int)  is_never_delimiter,
int(*)(int)  is_glueable 
)

Tokenize an input string using the given character classification routines.

Parameters:
[in] text The input string to tokenize
[in] text_follows A flag indicating whether more text continuing the current input may follow. If text_follows is set and the tokenization of text ends in the middle of a word, the last token is discarded, since it may be completed by the next chunk of text.
[in] is_delimiter The routine recognizing the delimiter characters
[in] is_never_delimiter The routine recognizing the characters that are never delimiters
[in] is_glueable The routine recognizing the glueable characters
Returns:
The resulting tokenization
See also:
tokenizationFree()
Former function(s):
Tokenise

gboolean tokenMerge ( TokenList *  token_list1,
TokenList *  token_list2 
)

Merge two tokens

Parameters:
token_list1 The token list element of the first token to merge
token_list2 The token list element of the second token to merge
Returns:
TRUE if the merge could be performed, FALSE otherwise (i.e. the tokens are not joinable)
Former function(s):
Join_Token


Generated on Thu Mar 22 17:46:31 2007 for SlpTk by  doxygen 1.4.7