Forth是一种基于栈的编程语言,以其简洁性和高效性而闻名。它采用逆波兰表示法(后缀表示法),所有操作都通过操作栈来完成。在 Exercism 的 “forth” 练习中,我们需要实现一个简化版的Forth解释器,支持基本的算术运算、栈操作和用户自定义词汇。这不仅能帮助我们掌握解释器设计,还能深入学习Rust中的错误处理、数据结构和解析技术。
什么是Forth?
Forth是一种基于栈的编程语言,具有以下特点:
- 基于栈:所有操作都通过操作栈完成
- 逆波兰表示法:操作符后置于操作数(如:1 2 +)
- 交互式:支持即时执行命令
- 可扩展:允许用户定义新词汇(函数)
例如:
1 2 + // 将1和2压入栈,然后执行加法操作
3 4 - // 将3和4压入栈,然后执行减法操作,结果为-1
让我们先看看练习提供的结构和函数签名:
pub type Value = i32;
pub type Result = std::result::Result<(), Error>;pub struct Forth;#[derive(Debug, PartialEq)]pub enum Error {DivisionByZero,StackUnderflow,UnknownWord,InvalidWord,}impl Forth {pub fn new() -> Forth {unimplemented!()}pub fn stack(&self) -> &[Value] {unimplemented!()}pub fn eval(&mut self, input: &str) -> Result {unimplemented!("result of evaluating '{}'", input)}}
我们需要实现一个完整的Forth解释器,支持基本操作和用户自定义词汇。
设计分析
1. 核心组件
- 栈:存储数据值
- 词汇表:存储内置和用户定义的词汇
- 解析器:解析输入字符串
- 执行器:执行解析后的命令
2. 错误处理
需要处理以下错误情况:
- DivisionByZero:除零错误
- StackUnderflow:栈下溢错误
- UnknownWord:未知词汇错误
- InvalidWord:无效词汇错误
完整实现
1. 基础结构实现
use std::collections::HashMap;
pub type Value = i32;
pub type Result = std::result::Result<(), Error>;#[derive(Debug, PartialEq)]pub enum Error {DivisionByZero,StackUnderflow,UnknownWord,InvalidWord,}pub struct Forth {stack: Vec<Value>,words: HashMap<String, Vec<String>>,}impl Forth {pub fn new() -> Forth {Forth {stack: Vec::new(),words: HashMap::new(),}}pub fn stack(&self) -> &[Value] {&self.stack}pub fn eval(&mut self, input: &str) -> Result {let tokens: Vec<&str> = input.split_whitespace().collect();self.eval_tokens(&tokens)}fn eval_tokens(&mut self, tokens: &[&str]) -> Result {let mut i = 0;while i < tokens.len() {let token = tokens[i];// 检查是否是定义词汇的开始if token.eq_ignore_ascii_case(":") {i = self.define_word(&tokens[i+1..])?;continue;}// 检查是否是数字if let Ok(num) = token.parse::<Value>() {self.stack.push(num);i += 1;continue;}// 检查是否是内置命令if self.eval_builtin(token)? {i += 1;continue;}// 检查是否是用户定义的词汇if let Some(word_def) = self.words.get(&token.to_ascii_lowercase()) {let word_tokens: Vec<&str> = word_def.iter().map(|s| s.as_str()).collect();self.eval_tokens(&word_tokens)?;i += 1;continue;}// 未知词汇return Err(Error::UnknownWord);}Ok(())}fn define_word(&mut self, tokens: &[&str]) -> Result<usize> {if tokens.is_empty() {return Err(Error::InvalidWord);}// 获取词汇名称let word_name = tokens[0];// 检查名称是否是数字if word_name.parse::<Value>().is_ok() {return Err(Error::InvalidWord);}// 查找分号let semicolon_pos = tokens.iter().position(|&t| t == ";").ok_or(Error::InvalidWord)?;// 获取定义内容let definition: Vec<String> = tokens[1..semicolon_pos].iter().map(|s| s.to_string()).collect();// 存储定义self.words.insert(word_name.to_ascii_lowercase(), definition);// 返回下一个位置Ok(semicolon_pos + 1)}fn eval_builtin(&mut self, token: &str) -> std::result::Result<bool, Error> {match token.to_ascii_lowercase().as_str() {"+" => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = self.stack.pop().unwrap();let a = self.stack.pop().unwrap();self.stack.push(a + b);Ok(true)}"-" => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = self.stack.pop().unwrap();let a = self.stack.pop().unwrap();self.stack.push(a - b);Ok(true)}"*" => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = self.stack.pop().unwrap();let a = self.stack.pop().unwrap();self.stack.push(a * b);Ok(true)}"/" => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = self.stack.pop().unwrap();let a = self.stack.pop().unwrap();if b == 0 {return Err(Error::DivisionByZero);}self.stack.push(a / b);Ok(true)}"dup" => {if self.stack.is_empty() {return Err(Error::StackUnderflow);}let val = self.stack[self.stack.len() - 1];self.stack.push(val);Ok(true)}"drop" => {if self.stack.is_empty() {return Err(Error::StackUnderflow);}self.stack.pop();Ok(true)}"swap" => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let len = self.stack.len();self.stack.swap(len - 1, len - 2);Ok(true)}"over" => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let val = self.stack[self.stack.len() - 2];self.stack.push(val);Ok(true)}_ => Ok(false), // 不是内置命令}}}
测试用例分析
通过查看测试用例,我们可以更好地理解需求:
#[test]
fn no_input_no_stack() {
assert_eq!(Vec::<Value>::new(), Forth::new().stack());}
新创建的Forth实例应该有空栈。
#[test]
fn numbers_just_get_pushed_onto_the_stack() {
let mut f = Forth::new();
assert!(f.eval("1 2 3 4 5").is_ok());
assert_eq!(vec![1, 2, 3, 4, 5], f.stack());
}
数字应该被压入栈中。
#[test]
fn can_add_two_numbers() {
let mut f = Forth::new();
assert!(f.eval("1 2 +").is_ok());
assert_eq!(vec![3], f.stack());
}
加法运算应该正确执行。
#[test]
fn addition_error() {
let mut f = Forth::new();
assert_eq!(Err(Error::StackUnderflow), f.eval("1 +"));
assert_eq!(Err(Error::StackUnderflow), f.eval("+"));
}
栈下溢时应该返回错误。
#[test]
fn errors_if_dividing_by_zero() {
let mut f = Forth::new();
assert_eq!(Err(Error::DivisionByZero), f.eval("4 0 /"));
}
除零时应该返回错误。
#[test]
fn can_consist_of_built_in_words() {
let mut f = Forth::new();
assert!(f.eval(": dup-twice dup dup ;").is_ok());
assert!(f.eval("1 dup-twice").is_ok());
assert_eq!(vec![1, 1, 1], f.stack());
}
应该支持用户自定义词汇。
#[test]
fn can_use_different_words_with_the_same_name() {
let mut f = Forth::new();
assert!(f.eval(": foo 5 ;").is_ok());
assert!(f.eval(": bar foo ;").is_ok());
assert!(f.eval(": foo 6 ;").is_ok());
assert!(f.eval("bar foo").is_ok());
assert_eq!(vec![5, 6], f.stack());
}
应该支持词汇的重新定义。
#[test]
fn defining_a_number() {
let mut f = Forth::new();
assert_eq!(Err(Error::InvalidWord), f.eval(": 1 2 ;"));
}
不应该允许用数字作为词汇名称。
性能优化版本
考虑性能的优化实现:
use std::collections::HashMap;
pub type Value = i32;
pub type Result = std::result::Result<(), Error>;#[derive(Debug, PartialEq)]pub enum Error {DivisionByZero,StackUnderflow,UnknownWord,InvalidWord,}pub struct Forth {stack: Vec<Value>,words: HashMap<String, Vec<String>>,}impl Forth {pub fn new() -> Forth {Forth {stack: Vec::with_capacity(32), // 预分配容量words: HashMap::new(),}}pub fn stack(&self) -> &[Value] {&self.stack}pub fn eval(&mut self, input: &str) -> Result {// 预分配tokens向量let mut tokens = Vec::with_capacity(input.len() / 2);for token in input.split_whitespace() {tokens.push(token);}self.eval_tokens(&tokens)}fn eval_tokens(&mut self, tokens: &[&str]) -> Result {let mut i = 0;while i < tokens.len() {let token = tokens[i];// 检查是否是定义词汇的开始if token.eq_ignore_ascii_case(":") {i = self.define_word(&tokens[i+1..])?;continue;}// 检查是否是数字if let Ok(num) = token.parse::<Value>() {self.stack.push(num);i += 1;continue;}// 检查是否是内置命令match self.eval_builtin(token) {Ok(true) => {i += 1;continue;}Ok(false) => {// 继续检查用户定义的词汇}Err(e) => return Err(e),}// 检查是否是用户定义的词汇if let Some(word_def) = self.words.get(&token.to_ascii_lowercase()) {// 为了避免递归调用导致栈溢出,我们直接展开定义let mut expanded_tokens = Vec::with_capacity(word_def.len());for word in word_def {expanded_tokens.push(word.as_str());}self.eval_tokens(&expanded_tokens)?;i += 1;continue;}// 未知词汇return Err(Error::UnknownWord);}Ok(())}fn define_word(&mut self, tokens: &[&str]) -> Result<usize> {if tokens.is_empty() {return Err(Error::InvalidWord);}// 获取词汇名称let word_name = tokens[0];// 检查名称是否是数字if word_name.parse::<Value>().is_ok() {return Err(Error::InvalidWord);}// 查找分号let semicolon_pos = tokens.iter().position(|&t| t == ";").ok_or(Error::InvalidWord)?;// 获取定义内容let definition: Vec<String> = tokens[1..semicolon_pos].iter().map(|s| s.to_string()).collect();// 存储定义self.words.insert(word_name.to_ascii_lowercase(), definition);// 返回下一个位置Ok(semicolon_pos + 1)}fn eval_builtin(&mut self, token: &str) -> std::result::Result<bool, Error> {match token.to_ascii_lowercase().as_str() {"+" => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = self.stack.pop().unwrap();let a = self.stack.pop().unwrap();self.stack.push(a + b);Ok(true)}"-" => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = self.stack.pop().unwrap();let a = self.stack.pop().unwrap();self.stack.push(a - b);Ok(true)}"*" => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = self.stack.pop().unwrap();let a = self.stack.pop().unwrap();self.stack.push(a * b);Ok(true)}"/" => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = self.stack.pop().unwrap();let a = self.stack.pop().unwrap();if b == 0 {return Err(Error::DivisionByZero);}self.stack.push(a / b);Ok(true)}"dup" => {if self.stack.is_empty() {return Err(Error::StackUnderflow);}let val = self.stack[self.stack.len() - 1];self.stack.push(val);Ok(true)}"drop" => {if self.stack.is_empty() {return Err(Error::StackUnderflow);}self.stack.pop();Ok(true)}"swap" => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let len = self.stack.len();self.stack.swap(len - 1, len - 2);Ok(true)}"over" => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let val = self.stack[self.stack.len() - 2];self.stack.push(val);Ok(true)}_ => Ok(false), // 不是内置命令}}}
内存优化版本(防止分配攻击)
考虑内存使用的优化实现:
use std::collections::HashMap;
pub type Value = i32;
pub type Result = std::result::Result<(), Error>;#[derive(Debug, PartialEq)]pub enum Error {DivisionByZero,StackUnderflow,UnknownWord,InvalidWord,}pub struct Forth {stack: Vec<Value>,words: HashMap<String, WordDefinition>,}// 使用枚举来存储词汇定义,避免在定义时展开#[derive(Debug, Clone)]enum WordDefinition {BuiltIn(BuiltInWord),Custom(Vec<Token>),}#[derive(Debug, Clone)]enum Token {Number(Value),Word(String),}#[derive(Debug, Clone)]enum BuiltInWord {Add,Subtract,Multiply,Divide,Dup,Drop,Swap,Over,}impl Forth {pub fn new() -> Forth {let mut forth = Forth {stack: Vec::with_capacity(32),words: HashMap::new(),};// 初始化内置词汇forth.init_builtins();forth}pub fn stack(&self) -> &[Value] {&self.stack}pub fn eval(&mut self, input: &str) -> Result {let tokens = self.tokenize(input)?;self.eval_tokens(&tokens)}fn init_builtins(&mut self) {self.words.insert("+".to_string(), WordDefinition::BuiltIn(BuiltInWord::Add));self.words.insert("-".to_string(), WordDefinition::BuiltIn(BuiltInWord::Subtract));self.words.insert("*".to_string(), WordDefinition::BuiltIn(BuiltInWord::Multiply));self.words.insert("/".to_string(), WordDefinition::BuiltIn(BuiltInWord::Divide));self.words.insert("dup".to_string(), WordDefinition::BuiltIn(BuiltInWord::Dup));self.words.insert("drop".to_string(), WordDefinition::BuiltIn(BuiltInWord::Drop));self.words.insert("swap".to_string(), WordDefinition::BuiltIn(BuiltInWord::Swap));self.words.insert("over".to_string(), WordDefinition::BuiltIn(BuiltInWord::Over));}fn tokenize(&self, input: &str) -> Result<Vec<Token>> {let mut tokens = Vec::new();for word in input.split_whitespace() {if let Ok(num) = word.parse::<Value>() {tokens.push(Token::Number(num));} else {tokens.push(Token::Word(word.to_ascii_lowercase()));}}Ok(tokens)}fn eval_tokens(&mut self, tokens: &[Token]) -> Result {for token in tokens {match token {Token::Number(value) => {self.stack.push(*value);}Token::Word(word) => {if word == ":" {// 这里应该处理定义,但在tokenize阶段已经处理了// 实际实现中需要更复杂的解析逻辑return Err(Error::InvalidWord);}self.eval_word(word)?;}}}Ok(())}fn eval_word(&mut self, word: &str) -> Result {// 处理定义命令if word == ":" {return Err(Error::InvalidWord); // 定义应该在更高层处理}// 查找词汇定义match self.words.get(word) {Some(WordDefinition::BuiltIn(builtin)) => {self.eval_builtin(builtin.clone())}Some(WordDefinition::Custom(tokens)) => {// 递归执行自定义词汇for token in tokens {match token {Token::Number(value) => {self.stack.push(*value);}Token::Word(word) => {self.eval_word(word)?;}}}Ok(())}None => Err(Error::UnknownWord),}}fn define_word(&mut self, tokens: &[&str]) -> Result<usize> {if tokens.is_empty() {return Err(Error::InvalidWord);}// 获取词汇名称let word_name = tokens[0];// 检查名称是否是数字if word_name.parse::<Value>().is_ok() {return Err(Error::InvalidWord);}// 查找分号let semicolon_pos = tokens.iter().position(|&t| t == ";").ok_or(Error::InvalidWord)?;// 获取定义内容并转换为Tokenlet mut definition = Vec::new();for token in &tokens[1..semicolon_pos] {if let Ok(num) = token.parse::<Value>() {definition.push(Token::Number(num));} else {definition.push(Token::Word(token.to_ascii_lowercase()));}}// 存储定义self.words.insert(word_name.to_ascii_lowercase(), WordDefinition::Custom(definition));// 返回下一个位置Ok(semicolon_pos + 1)}fn eval_builtin(&mut self, builtin: BuiltInWord) -> Result {match builtin {BuiltInWord::Add => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = self.stack.pop().unwrap();let a = self.stack.pop().unwrap();self.stack.push(a + b);Ok(())}BuiltInWord::Subtract => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = self.stack.pop().unwrap();let a = self.stack.pop().unwrap();self.stack.push(a - b);Ok(())}BuiltInWord::Multiply => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = self.stack.pop().unwrap();let a = self.stack.pop().unwrap();self.stack.push(a * b);Ok(())}BuiltInWord::Divide => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = self.stack.pop().unwrap();let a = self.stack.pop().unwrap();if b == 0 {return Err(Error::DivisionByZero);}self.stack.push(a / b);Ok(())}BuiltInWord::Dup => {if self.stack.is_empty() {return Err(Error::StackUnderflow);}let val = self.stack[self.stack.len() - 1];self.stack.push(val);Ok(())}BuiltInWord::Drop => {if self.stack.is_empty() {return Err(Error::StackUnderflow);}self.stack.pop();Ok(())}BuiltInWord::Swap => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let len = self.stack.len();self.stack.swap(len - 1, len - 2);Ok(())}BuiltInWord::Over => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let val = self.stack[self.stack.len() - 2];self.stack.push(val);Ok(())}}}}
错误处理和边界情况
考虑更多边界情况的实现:
use std::collections::HashMap;
pub type Value = i32;
pub type Result = std::result::Result<(), Error>;#[derive(Debug, PartialEq)]pub enum Error {DivisionByZero,StackUnderflow,UnknownWord,InvalidWord,}pub struct Forth {stack: Vec<Value>,words: HashMap<String, Vec<ParsedToken>>,state: ParserState,}#[derive(Debug, Clone)]enum ParsedToken {Number(Value),Word(String),}#[derive(Debug)]enum ParserState {Normal,Defining { name: String, body: Vec<ParsedToken> },}impl Forth {pub fn new() -> Forth {Forth {stack: Vec::with_capacity(32),words: HashMap::new(),state: ParserState::Normal,}}pub fn stack(&self) -> &[Value] {&self.stack}pub fn eval(&mut self, input: &str) -> Result {// 重置状态self.state = ParserState::Normal;let tokens = self.tokenize(input)?;self.eval_tokens(&tokens)}fn tokenize(&self, input: &str) -> Result<Vec<ParsedToken>> {let mut tokens = Vec::new();for word in input.split_whitespace() {if let Ok(num) = word.parse::<Value>() {tokens.push(ParsedToken::Number(num));} else {tokens.push(ParsedToken::Word(word.to_string()));}}Ok(tokens)}fn eval_tokens(&mut self, tokens: &[ParsedToken]) -> Result {for token in tokens {match &self.state {ParserState::Normal => {match token {ParsedToken::Number(value) => {self.stack.push(*value);}ParsedToken::Word(word) => {if word.eq_ignore_ascii_case(":") {self.state = ParserState::Defining { name: String::new(), body: Vec::new() };continue;}self.eval_word(word)?;}}}ParserState::Defining { name, body } => {if name.is_empty() {// 这是定义的名称if let Ok(_) = word.parse::<Value>() {return Err(Error::InvalidWord);}self.state = ParserState::Defining {name: word.clone(),body: Vec::new()};} else if word == ";" {// 定义结束let definition = body.clone();self.words.insert(name.to_ascii_lowercase(), definition);self.state = ParserState::Normal;} else {// 添加到定义体中let mut new_body = body.clone();match token {ParsedToken::Number(value) => {new_body.push(ParsedToken::Number(*value));}ParsedToken::Word(word) => {new_body.push(ParsedToken::Word(word.clone()));}}self.state = ParserState::Defining {name: name.clone(),body: new_body};}}}}// 检查是否仍在定义状态if matches!(self.state, ParserState::Defining { .. }) {return Err(Error::InvalidWord);}Ok(())}fn eval_word(&mut self, word: &str) -> Result {// 查找词汇定义if let Some(definition) = self.words.get(&word.to_ascii_lowercase()) {// 执行自定义词汇for token in definition {match token {ParsedToken::Number(value) => {self.stack.push(*value);}ParsedToken::Word(word) => {self.eval_builtin(word)?;}}}return Ok(());}// 执行内置命令self.eval_builtin(word)}fn eval_builtin(&mut self, word: &str) -> Result {match word.to_ascii_lowercase().as_str() {"+" => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = self.stack.pop().unwrap();let a = self.stack.pop().unwrap();self.stack.push(a + b);Ok(())}"-" => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = self.stack.pop().unwrap();let a = self.stack.pop().unwrap();self.stack.push(a - b);Ok(())}"*" => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = self.stack.pop().unwrap();let a = self.stack.pop().unwrap();self.stack.push(a * b);Ok(())}"/" => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = self.stack.pop().unwrap();let a = self.stack.pop().unwrap();if b == 0 {return Err(Error::DivisionByZero);}self.stack.push(a / b);Ok(())}"dup" => {if self.stack.is_empty() {return Err(Error::StackUnderflow);}let val = self.stack[self.stack.len() - 1];self.stack.push(val);Ok(())}"drop" => {if self.stack.is_empty() {return Err(Error::StackUnderflow);}self.stack.pop();Ok(())}"swap" => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let len = self.stack.len();self.stack.swap(len - 1, len - 2);Ok(())}"over" => {if self.stack.len() < 2 {return Err(Error::StackUnderflow);}let val = self.stack[self.stack.len() - 2];self.stack.push(val);Ok(())}_ => Err(Error::UnknownWord),}}}
扩展功能
基于基础实现,我们可以添加更多功能:
use std::collections::HashMap;
pub type Value = i32;
pub type Result = std::result::Result<(), Error>;#[derive(Debug, PartialEq)]pub enum Error {DivisionByZero,StackUnderflow,UnknownWord,InvalidWord,}pub struct Forth {stack: Vec<Value>,words: HashMap<String, WordDefinition>,debug_mode: bool,}#[derive(Debug, Clone)]enum WordDefinition {BuiltIn(fn(&mut Forth) -> Result),Custom(Vec<String>),}impl Forth {pub fn new() -> Forth {let mut forth = Forth {stack: Vec::with_capacity(32),words: HashMap::new(),debug_mode: false,};// 初始化内置词汇forth.init_builtins();forth}pub fn stack(&self) -> &[Value] {&self.stack}pub fn eval(&mut self, input: &str) -> Result {let tokens: Vec<&str> = input.split_whitespace().collect();self.eval_tokens(&tokens)}pub fn set_debug_mode(&mut self, enabled: bool) {self.debug_mode = enabled;}pub fn clear(&mut self) {self.stack.clear();// 保留用户定义的词汇}fn init_builtins(&mut self) {self.words.insert("+".to_string(), WordDefinition::BuiltIn(add));self.words.insert("-".to_string(), WordDefinition::BuiltIn(subtract));self.words.insert("*".to_string(), WordDefinition::BuiltIn(multiply));self.words.insert("/".to_string(), WordDefinition::BuiltIn(divide));self.words.insert("dup".to_string(), WordDefinition::BuiltIn(dup));self.words.insert("drop".to_string(), WordDefinition::BuiltIn(drop));self.words.insert("swap".to_string(), WordDefinition::BuiltIn(swap));self.words.insert("over".to_string(), WordDefinition::BuiltIn(over));self.words.insert(".".to_string(), WordDefinition::BuiltIn(print_top));self.words.insert(".s".to_string(), WordDefinition::BuiltIn(print_stack));}fn eval_tokens(&mut self, tokens: &[&str]) -> Result {let mut i = 0;while i < tokens.len() {let token = tokens[i];if self.debug_mode {println!("Executing: {}", token);println!("Stack: {:?}", self.stack);}// 检查是否是定义词汇的开始if token.eq_ignore_ascii_case(":") {i = self.define_word(&tokens[i+1..])?;continue;}// 检查是否是数字if let Ok(num) = token.parse::<Value>() {self.stack.push(num);i += 1;continue;}// 查找并执行词汇match self.words.get(&token.to_ascii_lowercase()) {Some(WordDefinition::BuiltIn(func)) => {func(self)?;i += 1;continue;}Some(WordDefinition::Custom(definition)) => {let def_tokens: Vec<&str> = definition.iter().map(|s| s.as_str()).collect();self.eval_tokens(&def_tokens)?;i += 1;continue;}None => return Err(Error::UnknownWord),}}Ok(())}fn define_word(&mut self, tokens: &[&str]) -> Result<usize> {if tokens.is_empty() {return Err(Error::InvalidWord);}// 获取词汇名称let word_name = tokens[0];// 检查名称是否是数字if word_name.parse::<Value>().is_ok() {return Err(Error::InvalidWord);}// 查找分号let semicolon_pos = tokens.iter().position(|&t| t == ";").ok_or(Error::InvalidWord)?;// 获取定义内容let definition: Vec<String> = tokens[1..semicolon_pos].iter().map(|s| s.to_string()).collect();// 存储定义self.words.insert(word_name.to_ascii_lowercase(), WordDefinition::Custom(definition));// 返回下一个位置Ok(semicolon_pos + 1)}}// 内置函数实现fn add(forth: &mut Forth) -> Result {if forth.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = forth.stack.pop().unwrap();let a = forth.stack.pop().unwrap();forth.stack.push(a + b);Ok(())}fn subtract(forth: &mut Forth) -> Result {if forth.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = forth.stack.pop().unwrap();let a = forth.stack.pop().unwrap();forth.stack.push(a - b);Ok(())}fn multiply(forth: &mut Forth) -> Result {if forth.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = forth.stack.pop().unwrap();let a = forth.stack.pop().unwrap();forth.stack.push(a * b);Ok(())}fn divide(forth: &mut Forth) -> Result {if forth.stack.len() < 2 {return Err(Error::StackUnderflow);}let b = forth.stack.pop().unwrap();let a = forth.stack.pop().unwrap();if b == 0 {return Err(Error::DivisionByZero);}forth.stack.push(a / b);Ok(())}fn dup(forth: &mut Forth) -> Result {if forth.stack.is_empty() {return Err(Error::StackUnderflow);}let val = forth.stack[forth.stack.len() - 1];forth.stack.push(val);Ok(())}fn drop(forth: &mut Forth) -> Result {if forth.stack.is_empty() {return Err(Error::StackUnderflow);}forth.stack.pop();Ok(())}fn swap(forth: &mut Forth) -> Result {if forth.stack.len() < 2 {return Err(Error::StackUnderflow);}let len = forth.stack.len();forth.stack.swap(len - 1, len - 2);Ok(())}fn over(forth: &mut Forth) -> Result {if forth.stack.len() < 2 {return Err(Error::StackUnderflow);}let val = forth.stack[forth.stack.len() - 2];forth.stack.push(val);Ok(())}fn print_top(forth: &mut Forth) -> Result {if forth.stack.is_empty() {return Err(Error::StackUnderflow);}let val = forth.stack[forth.stack.len() - 1];print!("{} ", val);std::io::stdout().flush().unwrap();Ok(())}fn print_stack(forth: &mut Forth) -> Result {print!("Stack: ");for val in &forth.stack {print!("{} ", val);}println!();std::io::stdout().flush().unwrap();Ok(())}use std::io::Write;
实际应用场景
Forth在实际开发中有以下应用:
- 嵌入式系统:由于其小巧和高效,常用于嵌入式系统
- 实时系统:Forth的交互性和高效性使其适用于实时系统
- 教育工具:用于教授计算机科学和编程概念
- 系统编程:早期的操作系统和引导程序使用Forth
- 硬件控制:用于控制硬件设备和机器人
- 脚本语言:作为应用程序的扩展语言
- 游戏开发:某些游戏使用Forth作为脚本语言
算法复杂度分析
时间复杂度:
- 数字压栈:O(1)
- 基本运算:O(1)
- 用户定义词汇:O(n),其中n是词汇定义的长度
- 整体执行:O(m),其中m是所有tokens的数量
空间复杂度:
- 栈空间:O(n),其中n是栈中元素数量
- 词汇表:O(w×d),其中w是词汇数量,d是平均定义长度
与其他实现方式的比较
// 使用宏实现的简化版本
macro_rules! forth {
($($token:tt)*) => {
{
let mut stack = Vec::new();
forth_impl!(stack, $($token)*);
stack
}
};
}
macro_rules! forth_impl {
// 数字
($stack:expr, $num:expr $($rest:tt)*) => {
$stack.push($num);
forth_impl!($stack, $($rest)*);
};
// 加法
($stack:expr, + $($rest:tt)*) => {
let b = $stack.pop().unwrap();
let a = $stack.pop().unwrap();
$stack.push(a + b);
forth_impl!($stack, $($rest)*);
};
// 结束
($stack:expr,) => {};
}
// 使用外部库实现
use nom::{IResult, branch::alt, combinator::map, character::complete::{digit1, space1}, multi::many0};#[derive(Debug, Clone)]
enum ForthToken {
Number(i32),
Word(String),
}
fn parse_number(input: &str) -> IResult<&str, ForthToken> {
map(digit1, |s: &str| {
ForthToken::Number(s.parse().unwrap())
})(input)
}
fn parse_word(input: &str) -> IResult<&str, ForthToken> {
map(nom::character::complete::alpha1, |s: &str| {
ForthToken::Word(s.to_string())
})(input)
}
fn parse_token(input: &str) -> IResult<&str, ForthToken> {
alt((parse_number, parse_word))(input)
}
// 函数式风格实现
use std::collections::HashMap;
pub struct ForthFunctional {
stack: Vec<i32>,definitions: HashMap<String, Box<dyn Fn(&mut Vec<i32>) -> Result<(), Error>>>,}impl ForthFunctional {pub fn new() -> Self {let mut forth = ForthFunctional {stack: Vec::new(),definitions: HashMap::new(),};forth.definitions.insert("+".to_string(), Box::new(|stack: &mut Vec<i32>| {if stack.len() < 2 {return Err(Error::StackUnderflow);}let b = stack.pop().unwrap();let a = stack.pop().unwrap();stack.push(a + b);Ok(())}));forth}pub fn eval(&mut self, input: &str) -> Result<()> {for token in input.split_whitespace() {if let Ok(num) = token.parse::<i32>() {self.stack.push(num);} else if let Some(func) = self.definitions.get(token) {func(&mut self.stack)?;} else {return Err(Error::UnknownWord);}}Ok(())}}
总结
通过 forth 练习,我们学到了:
- 解释器设计:掌握了构建简单解释器的基本原理
- 栈操作:理解了基于栈的语言如何工作
- 词法分析:学会了如何解析输入文本
- 错误处理:熟练处理各种边界情况和错误
- 数据结构:了解了如何使用HashMap存储词汇定义
- 内存管理:学会了如何避免内存分配攻击
- 递归处理:理解了如何处理递归定义
这些技能在实际开发中非常有用,特别是在构建DSL(领域特定语言)、脚本引擎和配置处理器时。Forth虽然是一个古老的语言,但其设计思想在现代编程中仍然具有重要价值。通过这个练习,我们不仅实现了Forth解释器,还深入理解了解释器的工作原理。
通过这个练习,我们也看到了Rust在构建安全且高效的语言解释器方面的强大能力。这种结合了安全性和性能的语言特性正是Rust的魅力所在。