This process typically uses efficient regular-expression-based techniques to identify Lexemes in the input string and map each lexeme to its corresponding token.
A lexeme is a substring that occurs in the string representation of the input program. Lexical analysis maps this substring to a token.
A token is an atomic element in the input program that corresponds to a Terminal Symbol in a Grammar.
For example, a lexical analyser for Java might read the following Java fragment:
```java
int x = (42 + 23); // Lexing usually removes/ignores comments
```
and identify the following lexemes and corresponding tokens:
| Lexeme | Token |
|---|---|
| "int" | INT |
| "x" | IDENTIFIER |
| "=" | EQ |
| "(" | LPAREN |
| "42" | INTEGER_LITERAL |
| "+" | PLUS |
| "23" | INTEGER_LITERAL |
| ")" | RPAREN |
| ";" | SEMICOLON |
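A minimal sketch of such a regex-based lexer is shown below. The `ToyLexer` class, the token names, and the use of one named capture group per token kind are illustrative assumptions rather than the API of any real compiler; also, Java regex group names may only contain letters and digits, so INTEGER_LITERAL is spelled INTEGERLITERAL inside the pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ToyLexer {
    // One named group per token kind; alternation order matters
    // (whitespace/comments first, keywords before identifiers).
    private static final Pattern TOKEN_PATTERN = Pattern.compile(
          "(?<WS>\\s+)"
        + "|(?<COMMENT>//[^\\n]*)"
        + "|(?<INT>\\bint\\b)"
        + "|(?<INTEGERLITERAL>\\d+)"
        + "|(?<IDENTIFIER>[A-Za-z_][A-Za-z0-9_]*)"
        + "|(?<EQ>=)"
        + "|(?<LPAREN>\\()"
        + "|(?<RPAREN>\\))"
        + "|(?<PLUS>\\+)"
        + "|(?<SEMICOLON>;)");

    private static final String[] KINDS = {
        "INT", "INTEGERLITERAL", "IDENTIFIER", "EQ",
        "LPAREN", "RPAREN", "PLUS", "SEMICOLON"
    };

    public static void main(String[] args) {
        String input = "int x = (42 + 23); // Lexing usually removes/ignores comments";
        Matcher m = TOKEN_PATTERN.matcher(input);
        while (m.find()) {
            // Whitespace and comments are recognised but discarded,
            // not handed on to the parser.
            if (m.group("WS") != null || m.group("COMMENT") != null) continue;
            for (String kind : KINDS) {
                if (m.group(kind) != null) {
                    System.out.printf("%-6s -> %s%n",
                            "\"" + m.group(kind) + "\"", kind);
                }
            }
        }
    }
}
```

Run on the fragment above, this prints each lexeme alongside its token, reproducing the table.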
In a compiler frontend, the lexical analyser would then pass these tokens to the Parser.
Tokens often keep a copy of (or a pointer to) their corresponding lexemes for later passes; for instance,
the IDENTIFIER token above would carry a reference to the string "x", and the two
INTEGER_LITERAL tokens would carry references to "42" and "23".
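One common way to represent this pairing is a small value type that stores both the token kind and the lexeme. The sketch below assumes hypothetical `TokenKind` and `Token` names (and Java 16+ for records):

```java
// Hypothetical token representation: the kind drives parsing decisions,
// while the retained lexeme lets later passes recover names and literal values.
enum TokenKind { INT, IDENTIFIER, EQ, LPAREN, INTEGER_LITERAL, PLUS, RPAREN, SEMICOLON }

record Token(TokenKind kind, String lexeme) {}

// The fragment above would yield, among others:
//   new Token(TokenKind.IDENTIFIER, "x")
//   new Token(TokenKind.INTEGER_LITERAL, "42")
//   new Token(TokenKind.INTEGER_LITERAL, "23")
```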