Token - Tokenization module
[Lexical tools level]

Tokenization algorithm and related data structures (former token module). More...

Data Structures

struct  Token
 Token. More...
struct  Tokenization
 Tokenization. More...

Typedefs

typedef GSList TokenList
 List of tokens.

Functions

Tokenization tokenize (const char *text, const gboolean text_follows, int(*is_delimiter)(int), int(*is_never_delimiter)(int), int(*is_glueable)(int))
void tokenizationFree (Tokenization *tokenization)
void tokenizationDump (Tokenization *tokenization, const char *delimiter, int(*print)(const char *,...))
void tokenDump (const Token *token, const char *delimiter, int(*print)(const char *,...))
gboolean tokenMerge (TokenList *token_list1, TokenList *token_list2)
void tokenGetString (const Token *token, GString *output)

Detailed Description

Tokenization algorithm and related data structures (former token module).

SlpTK Library 0.6.0

Required header
<token.h>
Author:
Jean-Cédric Chappelier (creation on 13.02.1997)

Antonin Merçay (revision on 15.12.2004)

Date:
2 March 2005
Version:
0.6.0

Typedef Documentation

typedef GSList TokenList

List of tokens.

Singly-linked list of Token


Function Documentation

void tokenDump ( const Token *  token,
const char *  delimiter,
int(*)(const char *,...)  print 
)

Dump a token

Parameters:
[in] token The token to dump
[in] delimiter The string to dump at the end of the token
[in] print The printing function used to dump
Former function(s):
Affiche_Token

void tokenGetString ( const Token *  token,
GString *  output 
)

Convert a token into its string representation

Parameters:
token The token to convert
output The string receiving the result
Former function(s):
Token2String

void tokenizationDump ( Tokenization *  tokenization,
const char *  delimiter,
int(*)(const char *,...)  print 
)

Dump a tokenization

Parameters:
[in] tokenization The tokenization to dump
[in] delimiter The string to dump at the end of each token
[in] print The printing function used to dump
Former function(s):
Affiche_Tokenisation

void tokenizationFree ( Tokenization *  tokenization )

Free the memory allocated to a tokenization

Parameters:
[in] tokenization The tokenization to free
Former function(s):
Libere_Tokenisation

Tokenization tokenize ( const char *  text,
const gboolean  text_follows,
int(*)(int)  is_delimiter,
int(*)(int)  is_never_delimiter,
int(*)(int)  is_glueable 
)

Tokenize an input string using the given character classification routines.

Parameters:
[in] text The input string to tokenize
[in] text_follows A flag indicating whether more text continuing the current input may follow. If text_follows is set and the tokenization of text ends in the middle of a word, the last token is discarded, since it may be completed by the next chunk of text.
[in] is_delimiter The routine recognizing the delimiter characters
[in] is_never_delimiter The routine recognizing the characters that are never delimiters
[in] is_glueable The routine recognizing the glueable characters
Returns:
The resulting tokenization
See also:
tokenizationFree()
Former function(s):
Tokenise

gboolean tokenMerge ( TokenList *  token_list1,
TokenList *  token_list2 
)

Merge two tokens

Parameters:
token_list1 The token list element of the first token to merge
token_list2 The token list element of the second token to merge
Returns:
TRUE if the merge could be performed, FALSE otherwise (i.e. the tokens are not joinable)
Former function(s):
Join_Token


Generated on Thu Mar 22 17:46:31 2007 for SlpTk by  doxygen 1.4.7