ESLA

Embeddable Scripting LAnguage

frantz@pangea.stanford.edu

Stanford University, Rock Fracture Project research group

© 2003

Tokenizer Class Reference


Detailed Description

A utility class that detects tokens inside a string. It allows special treatment to be defined for token detection, such as end-of-line characters, separators, and so on.

Definition at line 46 of file tokenizer.h.
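The tokenizer.h implementation is not reproduced in this reference, so the following is only a minimal sketch of the general idea of separator-based tokenizing with an option for empty tokens. The function split_tokens and its parameters are hypothetical and are not part of ESLA; the real class may treat separators, ignored characters, end-of-line characters and terminals differently.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of ESLA: splits `str` on any character
// found in `seps`. When `allow_empty` is false, empty tokens (produced by
// adjacent separators or by separators at either end) are dropped.
std::vector<std::string> split_tokens(const std::string& str,
                                      const std::string& seps,
                                      bool allow_empty) {
    std::vector<std::string> tokens;
    std::string current;
    for (char c : str) {
        if (seps.find(c) != std::string::npos) {
            // Separator reached: close the current token.
            if (allow_empty || !current.empty()) tokens.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    // Flush whatever follows the last separator.
    if (allow_empty || !current.empty()) tokens.push_back(current);
    return tokens;
}
```

In this sketch, splitting "a,b,,c" on "," yields three tokens when empty tokens are disallowed and four when they are allowed, the third one being empty.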

Public Types

typedef std::vector< std::string > ListTokens
enum  { max_token_length = 1000 }

Public Methods

 Tokenizer ()
 Tokenizer (const std::string &seps, bool allowEmpToks, const std::string &ignore, const std::string &endL, const std::string &terminals, bool useSeps)
 Tokenizer (const Tokenizer &tokenizer)
Tokenizer & operator= (const Tokenizer &tokenizer)
ListTokens tokenize (const std::string &str, const bool tolowercase=false)
void separators (const std::string &str)
const std::string & separators () const
void use_separators (bool flag)
bool use_separators () const
void allow_empty_tokens (bool flag)
bool allow_empty_tokens () const
void ignore (const std::string &str)
const std::string & ignore () const
void endline (const std::string &str)
const std::string & endline () const
void terminals (const std::string &str)
const std::string & terminals () const
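Most of the public methods above come in pairs that share one name: an overload taking an argument sets the option, and a const overload without arguments returns it. The sketch below mirrors only that naming pattern. The class MiniConfig, its members and its default value are hypothetical and say nothing about how Tokenizer stores its settings.

```cpp
#include <string>

// Hypothetical miniature class; it copies only the accessor naming
// convention of the list above, not the behaviour of Tokenizer.
class MiniConfig {
public:
    MiniConfig() : use_seps_(true) {}  // default chosen for the sketch only

    void separators(const std::string& str) { seps_ = str; }   // setter
    const std::string& separators() const { return seps_; }    // getter

    void use_separators(bool flag) { use_seps_ = flag; }       // setter
    bool use_separators() const { return use_seps_; }          // getter

private:
    std::string seps_;
    bool use_seps_;
};
```

The compiler selects the setter or the getter from the argument list, so a call such as c.separators(" ,;") stores the string and c.separators() reads it back.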

Protected Methods

void add_token (std::vector< std::string > &tokens, char *token, int &index, const bool tolowercase)
void tokenize (const std::string &str, std::vector< std::string > &tokens, const bool tolowercase)
std::string lower_case (const std::string &s)


Member Typedef Documentation

typedef std::vector< std::string> Tokenizer::ListTokens
 

Definition at line 48 of file tokenizer.h.


Member Enumeration Documentation

anonymous enum
 

Enumeration values:
max_token_length 

Definition at line 49 of file tokenizer.h.
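An anonymous enum of this kind gives a class a compile-time integer constant, which can be used as an array bound. The reference does not say how max_token_length is used; the char * and int & parameters of add_token suggest a fixed character buffer, but that is an assumption. The sketch below shows the idiom only, and the struct TokenBuffer with its members is hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical illustration; none of these names come from tokenizer.h
// except the enumerator itself.
struct TokenBuffer {
    enum { max_token_length = 1000 };  // compile-time constant

    char data[max_token_length + 1];   // +1 leaves room for the final '\0'
    int index;                         // number of characters stored

    TokenBuffer() : index(0) { data[0] = '\0'; }

    // Appends one character; returns false when the buffer is full.
    bool push(char c) {
        if (index >= max_token_length) return false;
        data[index++] = c;
        data[index] = '\0';
        return true;
    }

    // Copies the accumulated characters into `tokens` and resets the buffer.
    void flush(std::vector<std::string>& tokens) {
        tokens.push_back(
            std::string(data, static_cast<std::string::size_type>(index)));
        index = 0;
        data[0] = '\0';
    }
};
```

With this layout a token longer than max_token_length characters cannot be stored, and the caller has to decide whether to truncate it or report an error.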


Constructor & Destructor Documentation

Tokenizer::Tokenizer ()
 

Tokenizer::Tokenizer (const std::string &seps,
    bool allowEmpToks,
    const std::string &ignore,
    const std::string &endL,
    const std::string &terminals,
    bool useSeps)
 

Tokenizer::Tokenizer (const Tokenizer &tokenizer)
 


Member Function Documentation

void Tokenizer::add_token (std::vector< std::string > &tokens,
    char *token,
    int &index,
    const bool tolowercase) [protected]
 

bool Tokenizer::allow_empty_tokens () const
 

Definition at line 118 of file tokenizer.h.

void Tokenizer::allow_empty_tokens (bool flag)
 

Definition at line 122 of file tokenizer.h.

const std::string & Tokenizer::endline () const
 

Definition at line 138 of file tokenizer.h.

void Tokenizer::endline (const std::string &str)
 

Definition at line 134 of file tokenizer.h.

const std::string & Tokenizer::ignore () const
 

Definition at line 130 of file tokenizer.h.

void Tokenizer::ignore (const std::string &str)
 

Definition at line 126 of file tokenizer.h.

std::string Tokenizer::lower_case (const std::string &s) [protected]
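
The body of lower_case is not shown in this reference. Judging from its name and signature alone, it presumably returns a lowercase copy of its argument; a minimal stand-in under that assumption could look as follows. It is written as a free function for illustration, is not the ESLA implementation, and handles only single-byte characters in the current C locale.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical stand-in, not taken from tokenizer.h: returns a copy of `s`
// with every character converted by std::tolower.
std::string lower_case(const std::string& s) {
    std::string result(s);
    std::transform(result.begin(), result.end(), result.begin(),
                   [](unsigned char c) {
                       // Going through unsigned char avoids undefined
                       // behaviour for negative char values.
                       return static_cast<char>(std::tolower(c));
                   });
    return result;
}
```

Returning a copy leaves the input untouched, which fits the const reference parameter and the by-value return type in the signature above.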
 

Tokenizer & Tokenizer::operator= (const Tokenizer &tokenizer)
 

const std::string & Tokenizer::separators () const
 

Definition at line 106 of file tokenizer.h.

void Tokenizer::separators (const std::string &str)
 

Definition at line 102 of file tokenizer.h.

const std::string & Tokenizer::terminals () const
 

Definition at line 146 of file tokenizer.h.

void Tokenizer::terminals (const std::string &str)
 

Definition at line 142 of file tokenizer.h.

void Tokenizer::tokenize (const std::string &str,
    std::vector< std::string > &tokens,
    const bool tolowercase) [protected]
 

ListTokens Tokenizer::tokenize (const std::string &str,
    const bool tolowercase = false)
 

bool Tokenizer::use_separators () const
 

Definition at line 110 of file tokenizer.h.

void Tokenizer::use_separators (bool flag)
 

Definition at line 114 of file tokenizer.h.


Generated on Wed May 14 11:42:34 2003 for Esla-lib by doxygen 1.3-rc1