如何在Rust中重写代码

上一篇文章中,我们讨论了如何避免在不需要时在Rust中重写库。但是,当您真正需要它时呢?



在大多数语言中,您将不得不从头开始重写整个库,并且第一个结果将在项目结束时出现。这些端口往往非常昂贵且容易出错,并且通常会中途失败。乔尔•斯波斯基(Joel Spolsky)的解释比我在他的文章中所做的解释更好,为何项目的完全返工是个坏主意



但是,Rust在这类事情上有一个杀手twist。它可以调用C代码而不会产生开销(即C#中P / Invoke环境),并且公开了可以在C中使用的功能,就像在C中的其他任何功能一样。这为另一种方法打开了大门:将



库一次移植到Rust一个功能。



注意

本文中的代码在GitHub上可用随时停止借用代码或灵感。



如果您发现文章有帮助或发现错误,请在博客错误跟踪器中告诉我


入门



在执行任何操作之前,您需要创建一个新项目。我有一个模板,用于安装CI和cargo-generate的许可证



$ cargo generate --git https://github.com/Michael-F-Bryan/github-template --name tinyvm-rs
$ cd tinyvm-rs && tree
tree -I 'vendor|target'
.
├── Cargo.toml
├── LICENSE_APACHE.md
├── LICENSE_MIT.md
├── README.md
├── .travis.yml
└── src
    └── lib.rs

1 directory, 6 files


我们的第一个真正的挑战是构建我们要移植的库并对其进行一些计算。



在这种情况下,我们要移植jakogut / tinyvm



TinyVM是使用纯ANSI C编写的小型,快速,轻量级虚拟机。


为了使将来更容易引用它,让我们将存储库作为子模块添加到我们的项目中。



$ git submodule add https://github.com/jakogut/tinyvm vendor/tinyvm


现在让我们看一下源代码。对于初学者,请参见README.md组装说明。



TinyVM是最小的虚拟机。低内存使用量,少量代码和少量二进制代码。



使用make和GCC在类似UNIX的系统上完成构建。



没有外部依赖性,请保留标准的C库,



使用make或make rebuild进行构建。



要构建调试版本,请在make之后添加DEBUG = yes。要构建启用了性能分析的二进制文件,请在make之后添加PROFILE = yes。



您可以通过gmail.com(joseph.kogut)与我联系
(添加了重点)



好吧,让我们看一下目录tinyvm,看看构建是否可行



$ cd vendor/tinyvm
$ make
clang -Wall -pipe -Iinclude/ -std=gnu11 -Werror -pedantic -pedantic-errors -O3 -c libtvm/tvm_program.c -o libtvm/tvm_program.o
clang -Wall -pipe -Iinclude/ -std=gnu11 -Werror -pedantic -pedantic-errors -O3 -c libtvm/tvm_lexer.c -o libtvm/tvm_lexer.o
clang -Wall -pipe -Iinclude/ -std=gnu11 -Werror -pedantic -pedantic-errors -O3 -c libtvm/tvm.c -o libtvm/tvm.o
clang -Wall -pipe -Iinclude/ -std=gnu11 -Werror -pedantic -pedantic-errors -O3 -c libtvm/tvm_htab.c -o libtvm/tvm_htab.o
clang -Wall -pipe -Iinclude/ -std=gnu11 -Werror -pedantic -pedantic-errors -O3 -c libtvm/tvm_memory.c -o libtvm/tvm_memory.o
clang -Wall -pipe -Iinclude/ -std=gnu11 -Werror -pedantic -pedantic-errors -O3 -c libtvm/tvm_preprocessor.c -o libtvm/tvm_preprocessor.o
clang -Wall -pipe -Iinclude/ -std=gnu11 -Werror -pedantic -pedantic-errors -O3 -c libtvm/tvm_parser.c -o libtvm/tvm_parser.o
clang -Wall -pipe -Iinclude/ -std=gnu11 -Werror -pedantic -pedantic-errors -O3 -c libtvm/tvm_file.c -o libtvm/tvm_file.o
ar rcs lib/libtvm.a libtvm/tvm_program.o libtvm/tvm_lexer.o libtvm/tvm.o libtvm/tvm_htab.o libtvm/tvm_memory.o libtvm/tvm_preprocessor.o libtvm/tvm_parser.o libtvm/tvm_file.o
clang src/tvmi.c -ltvm -Wall -pipe -Iinclude/ -std=gnu11 -Werror -pedantic -pedantic-errors -O3 -Llib/ -o bin/tvmi
clang -Wall -pipe -Iinclude/ -std=gnu11 -Werror -pedantic -pedantic-errors -O3 -c tdb/main.c -o tdb/main.o
clang -Wall -pipe -Iinclude/ -std=gnu11 -Werror -pedantic -pedantic-errors -O3 -c tdb/tdb.c -o tdb/tdb.o
clang tdb/main.o tdb/tdb.o -ltvm -Wall -pipe -Iinclude/ -std=gnu11 -Werror -pedantic -pedantic-errors -O3 -Llib/ -o bin/tdb


当开箱*-dev即用地编译C库而不必安装随机包或在构建系统中弄弄时,我真的很喜欢它



不幸的是,该库不包含任何测试,因此我们不能(立即)确保正确地翻译了各个功能,但是它确实包含一个示例解释器,可用于探索高级功能。



因此,我们知道可以从命令行轻松构建它。现在,我们需要确保我们的板条箱tinyvm能够以编程方式组装所有组件。



这是生成脚本的地方。我们的策略是让Rust板条箱使用构建脚本build.rs和板条箱cc来调用与我们的调用等效的命令make... 从那里,我们可以libtvm像其他任何本机库一样从Rust连接到



您将需要将板条箱添加cc为依赖项。



$ cargo add --build cc
    Updating 'https://github.com/rust-lang/crates.io-index' index
      Adding cc v1.0.47 to build-dependencies


并且确保build.rs从源代码进行编译libtvm



// build.rs

use cc::Build;
use std::path::Path;

fn main() {
    let tinyvm = Path::new("vendor/tinyvm");
    let include = tinyvm.join("include");
    let src = tinyvm.join("libtvm");

    Build::new()
        .warnings(false)
        .file(src.join("tvm_file.c"))
        .file(src.join("tvm_htab.c"))
        .file(src.join("tvm_lexer.c"))
        .file(src.join("tvm_memory.c"))
        .file(src.join("tvm_parser.c"))
        .file(src.join("tvm_preprocessor.c"))
        .file(src.join("tvm_program.c"))
        .file(src.join("tvm.c"))
        .include(&include)
        .compile("tvm");
}


注意

如果您查看了crate文档cc,则可能会注意到一个Build::files()接受路径迭代器的方法。我们可以以编程方式发现的所有文件*.c里面vendor/tinyvm/libtvm,但因为我们是在一个时间代码移植一个功能,它更容易去除各个呼叫.file(),我们的端口。


我们还需要一种方法来告诉Rust可以从中调用什么函数libtvm通常,这是通过在extern块中编写每个函数的定义来完成的,但是幸运的是,有一个名为bindgen的工具可以读取C样式的头文件并为我们生成定义。



让我们从生成绑定vendor/tinyvm/include/tvm/tvm.h



$ cargo install bindgen
$ bindgen vendor/tinyvm/include/tvm/tvm.h -o src/ffi.rs
$ wc --lines src/ffi.rs
992 src/ffi.rs


您将需要向我们的板条箱中添加模块ffi



// src/lib.rs

#[allow(non_camel_case_types, non_snake_case)]
pub mod ffi;


在其中的目录src/tinyvm,我们找到了解释器的源代码tinyvm



// vendor/tinyvm/src/tvmi.c

#include <stdlib.h>
#include <stdio.h>

#include <tvm/tvm.h>

int main(int argc, char **argv)
{
	struct tvm_ctx *vm = tvm_vm_create();

	if (vm != NULL && tvm_vm_interpret(vm, argv[1]) == 0)
		tvm_vm_run(vm);

	tvm_vm_destroy(vm);

	return 0;
}


这非常简单。考虑到我们将使用此解释器作为示例之一,这非常好。



现在,让我们将其直接转换为Rust并将其放在目录中examples/



// examples/tvmi.rs

use std::{env, ffi::CString};
use tinyvm::ffi;

fn main() {
    let filename = CString::new(env::args().nth(1).unwrap()).unwrap();
    // cast away the `const` because that's what libtvm expects
    let filename = filename.as_ptr() as *mut _;

    unsafe {
        let vm = ffi::tvm_vm_create();

        if !vm.is_null() && ffi::tvm_vm_interpret(vm, filename) == 0 {
            ffi::tvm_vm_run(vm);
        }

        ffi::tvm_vm_destroy(vm);
    }
}


作为检查,我们还可以启动虚拟机并确保一切正常。



$ cargo run --example tvmi -- vendor/tinyvm/programs/tinyvm/fact.vm
    Finished dev [unoptimized + debuginfo] target(s) in 0.02s
     Running `target/debug/examples/tvmi vendor/tinyvm/programs/tinyvm/fact.vm`
1
2
6
24
120
720
5040
40320
362880
3628800


类!



低挂水果



当您从这样的事情开始时,很容易潜入最重要的功能并首先进行迁移。尝试抵制这种冲动。您可以轻易地咬下去,而不是咀嚼而已,最终浪费时间或使自己士气低落并使您放弃。



相反,让我们搜索最简单的。



$ ls libtvm
tvm.c  tvm_file.c  tvm_htab.c  tvm_lexer.c  tvm_memory.c  tvm_parser.c
tvm_preprocessor.c  tvm_program.c


该文件tvm_htab.看起来很有希望。我很确定它htab代表“哈希表”,并且Rust标准库已经包含了高质量的实现。我们应该能够足够容易地更改它。



让我们看一下头文件tvm_htab.h并检查我们要处理的内容。



// vendor/tinyvm/include/tvm/tvm_htab.h

#ifndef TVM_HTAB_H_
#define TVM_HTAB_H_

#define KEY_LENGTH 64
#define HTAB_SIZE 4096

struct tvm_htab_node {
	char *key;
	int value;
	void *valptr;
	struct tvm_htab_node *next;
};

struct tvm_htab_ctx {
	unsigned int num_nodes;
	unsigned int size;
	struct tvm_htab_node **nodes;
};

struct tvm_htab_ctx *tvm_htab_create();
void tvm_htab_destroy(struct tvm_htab_ctx *htab);

int tvm_htab_add(struct tvm_htab_ctx *htab, const char *key, int value);
int tvm_htab_add_ref(struct tvm_htab_ctx *htab,
	const char *key, const void *valptr, int len);
int tvm_htab_find(struct tvm_htab_ctx *htab, const char *key);
char *tvm_htab_find_ref(struct tvm_htab_ctx *htab, const char *key);

#endif


看起来很容易实现。唯一的问题是定义tvm_htab_ctxtvm_htab_node包含在头文件中,这意味着某些代码可以直接访问哈希表的内部,而不通过发布的接口。



我们可以通过临时移入结构定义来检查是否有任何东西可以访问哈希表的内部,tvm_htab.c并查看所有内容是否仍在编译。



diff --git a/include/tvm/tvm_htab.h b/include/tvm/tvm_htab.h
index 9feb7a9..e7346b7 100644
--- a/include/tvm/tvm_htab.h
+++ b/include/tvm/tvm_htab.h
@@ -4,18 +4,8 @@
 #define KEY_LENGTH 64
 #define HTAB_SIZE 4096

-struct tvm_htab_node {
-       char *key;
-       int value;
-       void *valptr;
-       struct tvm_htab_node *next;
-};
-
-struct tvm_htab_ctx {
-       unsigned int num_nodes;
-       unsigned int size;
-       struct tvm_htab_node **nodes;
-};
+struct tvm_htab_node;
+struct tvm_htab_ctx;

 struct tvm_htab_ctx *tvm_htab_create();
 void tvm_htab_destroy(struct tvm_htab_ctx *htab);


然后再次运行make



$ make
make
clang -Wall -pipe -Iinclude/ -std=gnu11 -Werror -pedantic -pedantic-errors -O3 -c libtvm/tvm_htab.c -o libtvm/tvm_htab.o
ar rcs lib/libtvm.a libtvm/tvm_program.o libtvm/tvm_lexer.o libtvm/tvm.o libtvm/tvm_htab.o libtvm/tvm_memory.o libtvm/tvm_preprocessor.o libtvm/tvm_parser.o libtvm/tvm_file.o
clang src/tvmi.c -ltvm -Wall -pipe -Iinclude/ -std=gnu11 -Werror -pedantic -pedantic-errors -O3 -Llib/ -o bin/tvmi
clang tdb/main.o tdb/tdb.o -ltvm -Wall -pipe -Iinclude/ -std=gnu11 -Werror -pedantic -pedantic-errors -O3 -Llib/ -o bin/tdb


看来一切仍在进行,现在我们开始第二阶段。我们创建了在引擎盖下使用的一组相同的功能HashMap<K, V>



将自己限制在一个最低限度的存根中,我们得到:



// src/htab.rs

use std::{
    collections::HashMap,
    ffi::CString,
    os::raw::{c_char, c_int, c_void},
};

#[derive(Debug, Default, Clone, PartialEq)]
pub struct HashTable(pub(crate) HashMap<CString, Item>);

#[derive(Debug, Clone, PartialEq)]
pub(crate) struct Item {
    // not sure what to put here yet
}

#[no_mangle]
pub unsafe extern "C" fn tvm_htab_create() -> *mut HashTable {
    unimplemented!()
}

#[no_mangle]
pub unsafe extern "C" fn tvm_htab_destroy(htab: *mut HashTable) {
    unimplemented!()
}

#[no_mangle]
pub unsafe extern "C" fn tvm_htab_add(
    htab: *mut HashTable,
    key: *const c_char,
    value: c_int,
) -> c_int {
    unimplemented!()
}

#[no_mangle]
pub unsafe extern "C" fn tvm_htab_add_ref(
    htab: *mut HashTable,
    key: *const c_char,
    value_ptr: *mut c_void,
    length: c_int,
) -> c_int {
    unimplemented!()
}

#[no_mangle]
pub unsafe extern "C" fn tvm_htab_find(
    htab: *mut HashTable,
    key: *const c_char,
) -> c_int {
    unimplemented!()
}

#[no_mangle]
pub unsafe extern "C" fn tvm_htab_find_ref(
    htab: *mut HashTable,
    key: *const c_char,
) -> *mut c_char {
    unimplemented!()
}


您还需要声明模块htab并从中重新导出其功能lib.rs



// src/lib.rs

mod htab;
pub use htab::*;


现在,我们需要确保原始文件tvm_htab.c没有被编译或链接到最终库,否则链接器将遇到一堆重复的符号错误。



错误墙重复符号
error: linking with `/usr/bin/clang` failed: exit code: 1
  |
  = note: "/usr/bin/clang" "-Wl,--as-needed" "-Wl,-z,noexecstack" "-m64" "-L" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.17q5thi94e1eoj5i.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.19e8sqirbm56nu8g.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.1g6ljku8dwzpfvhi.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.1h5e5mxmiptpb7iz.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.1herotdop66zv9ot.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.1qbfxpvgd885u6o.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.21psdg8ni4vgdrzk.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.2albhpxlxxvc0ccu.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.2btm2dc9rhjhhna1.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.2kct5ftnkrqqr0mf.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.2lwgg3uosup4mkh0.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.2xduj46e9sw5vuan.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.35h8y7f23ua1qnz0.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.3cgfdtku63ltd8oc.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.3ot768hzkzzy7r76.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.3u2xnetcch8f2o02.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.4ldrdjvfzk58myrv.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.4omnum6bdjqsrq8b.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.4s8ch4ccmewulj22.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.4syl3x2rb8328h8x.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.532awiysf0h9r50f.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.5b2qwmmtc5pvnbh.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.dfjs079cp9si4o5.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.qxp6yb2gjpj0v6n.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.xz7ld20yvprst1r.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.z35ukhvchmmby1c.rcgu.o" "-o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.1d7wvlwdjap8p3g4.rcgu.o" "-Wl,--gc-sections" "-pie" "-Wl,-zrelro" "-Wl,-znow" "-nodefaultlibs" "-L" "/home/michael/Documents/tinyvm-rs/target/debug/deps" "-L" "/home/michael/Documents/tinyvm-rs/target/debug/build/tinyvm-3f1a2766f78b5580/out" "-L" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-Wl,-Bstatic" "-Wl,--whole-archive" "-ltvm" "-Wl,--no-whole-archive" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libtest-a39a3e9a77b17f55.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libterm-97a69cd310ff0925.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libgetopts-66a42b1d94e3e6f9.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libunicode_width-dd7761d848144e0d.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_std-f722acdb78755ba0.rlib" "-Wl,--start-group" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-974c3c08f6def4b3.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libpanic_unwind-eb49676f33a2c8a6.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libhashbrown-7ae0446feecc60f2.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_alloc-2de299b65d7f5721.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libbacktrace-64514775bc06309a.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libbacktrace_sys-1ed8aa185c63b9a5.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_demangle-a839df87f563fba5.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libunwind-8e726bdc2018d836.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcfg_if-5285f42cbadf207d.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liblibc-b0362d20f8aa58fa.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-f3dd7051708453a4.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_core-83744846c43307ce.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore-d5565a3a0f4cfe21.rlib" "-Wl,--end-group" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcompiler_builtins-ea790e85415e3bbf.rlib" "-Wl,-Bdynamic" "-ldl" "-lrt" "-lpthread" "-lgcc_s" "-lc" "-lm" "-lrt" "-lpthread" "-lutil" "-lutil" "-fuse-ld=lld"
  = note: ld.lld: error: duplicate symbol: tvm_htab_create
          >>> defined at htab.rs:14 (src/htab.rs:14)
          >>>            /home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.5b2qwmmtc5pvnbh.rcgu.o:(tvm_htab_create)
          >>> defined at tvm_htab.c:23 (vendor/tinyvm/libtvm/tvm_htab.c:23)
          >>>            tvm_htab.o:(.text.tvm_htab_create+0x0) in archive /home/michael/Documents/tinyvm-rs/target/debug/build/tinyvm-3f1a2766f78b5580/out/libtvm.a

          ld.lld: error: duplicate symbol: tvm_htab_destroy
          >>> defined at htab.rs:17 (src/htab.rs:17)
          >>>            /home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.5b2qwmmtc5pvnbh.rcgu.o:(tvm_htab_destroy)
          >>> defined at tvm_htab.c:35 (vendor/tinyvm/libtvm/tvm_htab.c:35)
          >>>            tvm_htab.o:(.text.tvm_htab_destroy+0x0) in archive /home/michael/Documents/tinyvm-rs/target/debug/build/tinyvm-3f1a2766f78b5580/out/libtvm.a

          ld.lld: error: duplicate symbol: tvm_htab_add_ref
          >>> defined at htab.rs:29 (src/htab.rs:29)
          >>>            /home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.5b2qwmmtc5pvnbh.rcgu.o:(tvm_htab_add_ref)
          >>> defined at tvm_htab.c:160 (vendor/tinyvm/libtvm/tvm_htab.c:160)
          >>>            tvm_htab.o:(.text.tvm_htab_add_ref+0x0) in archive /home/michael/Documents/tinyvm-rs/target/debug/build/tinyvm-3f1a2766f78b5580/out/libtvm.a

          ld.lld: error: duplicate symbol: tvm_htab_add
          >>> defined at htab.rs:20 (src/htab.rs:20)
          >>>            /home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.5b2qwmmtc5pvnbh.rcgu.o:(tvm_htab_add)
          >>> defined at tvm_htab.c:147 (vendor/tinyvm/libtvm/tvm_htab.c:147)
          >>>            tvm_htab.o:(.text.tvm_htab_add+0x0) in archive /home/michael/Documents/tinyvm-rs/target/debug/build/tinyvm-3f1a2766f78b5580/out/libtvm.a

          ld.lld: error: duplicate symbol: tvm_htab_find
          >>> defined at htab.rs:39 (src/htab.rs:39)
          >>>            /home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.5b2qwmmtc5pvnbh.rcgu.o:(tvm_htab_find)
          >>> defined at tvm_htab.c:189 (vendor/tinyvm/libtvm/tvm_htab.c:189)
          >>>            tvm_htab.o:(.text.tvm_htab_find+0x0) in archive /home/michael/Documents/tinyvm-rs/target/debug/build/tinyvm-3f1a2766f78b5580/out/libtvm.a

          ld.lld: error: duplicate symbol: tvm_htab_find_ref
          >>> defined at htab.rs:47 (src/htab.rs:47)
          >>>            /home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-599d57f523fdb1a4.5b2qwmmtc5pvnbh.rcgu.o:(tvm_htab_find_ref)
          >>> defined at tvm_htab.c:199 (vendor/tinyvm/libtvm/tvm_htab.c:199)
          >>>            tvm_htab.o:(.text.tvm_htab_find_ref+0x0) in archive /home/michael/Documents/tinyvm-rs/target/debug/build/tinyvm-3f1a2766f78b5580/out/libtvm.a
          clang: error: linker command failed with exit code 1 (use -v to see invocation)


error: aborting due to previous error

error: could not compile `tinyvm`.


修复实际上非常简单。



diff --git a/build.rs b/build.rs
index 6f274c8..af9d467 100644
--- a/build.rs
+++ b/build.rs
@@ -9,7 +9,6 @@ fn main() {
     Build::new()
         .warnings(false)
         .file(src.join("tvm_file.c"))
-        .file(src.join("tvm_htab.c"))
         .file(src.join("tvm_lexer.c"))
         .file(src.join("tvm_memory.c"))
         .file(src.join("tvm_parser.c"))


tvmi再次 尝试运行该示例失败,就像您希望从完整的程序中获得的那样unimplemented!()



$ cargo run --example tvmi -- vendor/tinyvm/programs/tinyvm/fact.vm
    Finished dev [unoptimized + debuginfo] target(s) in 0.02s
     Running `target/debug/examples/tvmi vendor/tinyvm/programs/tinyvm/fact.vm`
thread 'main' panicked at 'not yet implemented', src/htab.rs:14:57
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.


为新类型添加FFI支持时,最简单的起点是构造函数和析构函数。



信息

C代码只能通过指针访问哈希表,因此我们需要在堆上分配它们之一,然后将该堆分配的对象的所有权转让给调用者。


// src/htab.rs

#[no_mangle]
pub unsafe extern "C" fn tvm_htab_create() -> *mut HashTable {
    let hashtable = Box::new(HashTable::default());
    Box::into_raw(hashtable)
}

#[no_mangle]
pub unsafe extern "C" fn tvm_htab_destroy(htab: *mut HashTable) {
    if htab.is_null() {
        // nothing to free
        return;
    }

    let hashtable = Box::from_raw(htab);
    // explicitly destroy the hashtable
    drop(hashtable);
}


警告

重要的是,呼叫者HashTable只能使用该函数杀死tvm_htab_destroy ()



如果他们不这样做,而是尝试free()直接致电,我们几乎肯定会遇到麻烦。充其量,它会导致大量内存泄漏,但也很有可能我们Box在锈不使用相同的一堆malloc()free (),这意味着锈对象从C的释放可能会损坏桩留在一个破碎的状态。


将项目添加到哈希图中几乎很容易实现。



// src/hmap.rs

#[derive(Debug, Clone, PartialEq)]
pub(crate) struct Item {
    /// An integer value.
    value: c_int,
    /// An opaque value used with [`tvm_htab_add_ref()`].
    ///
    /// # Safety
    ///
    /// Storing the contents of a `void *` in a `Vec<u8>` *would* normally
    /// result in alignment issues, but we've got access to the `libtvm` source
    /// code and know it will only ever store `char *` strings.
    opaque_value: Vec<u8>,
}

impl Item {
    pub(crate) fn integer(value: c_int) -> Item {
        Item {
            value,
            opaque_value: Vec::new(),
        }
    }

    pub(crate) fn opaque<V>(opaque_value: V) -> Item
    where
        V: Into<Vec<u8>>,
    {
        Item {
            value: 0,
            opaque_value: opaque_value.into(),
        }
    }

    pub(crate) fn from_void(pointer: *mut c_void, length: c_int) -> Item {
        // we need to create an owned copy of the value
        let opaque_value = if pointer.is_null() {
            Vec::new()
        } else {
            unsafe {
                std::slice::from_raw_parts(pointer as *mut u8, length as usize)
                    .to_owned()
            }
        };

        Item::opaque(opaque_value)
    }
}

#[no_mangle]
pub unsafe extern "C" fn tvm_htab_add(
    htab: *mut HashTable,
    key: *const c_char,
    value: c_int,
) -> c_int {
    let hashtable = &mut *htab;
    let key = CStr::from_ptr(key).to_owned();

    hashtable.0.insert(key, Item::integer(value));

    // the only time insertion can fail is if allocation fails. In that case
    // we'll abort the process anyway, so if this function returns we can
    // assume it was successful (0 = success).
    0
}

#[no_mangle]
pub unsafe extern "C" fn tvm_htab_add_ref(
    htab: *mut HashTable,
    key: *const c_char,
    value_ptr: *mut c_void,
    length: c_int,
) -> c_int {
    let hashtable = &mut *htab;
    let key = CStr::from_ptr(key).to_owned();

    hashtable.0.insert(key, Item::from_void(value_ptr, length));

    0
}




, CString, String, -, *const c_char , String Rust , UTF-8.



, CStr &str, String , ASCII, , unwrap(), CString.


*_find()可以将 两个函数直接委派给内部函数HashMap<CString, Item>



唯一要注意的地方是确保在找不到元素时返回正确的值。在这种情况下,看一下tvm_htab.c我们可以看到tvm_htab_find()return−1tvm_htab_find_ref()return NULL



// src/hmap.rs

#[no_mangle]
pub unsafe extern "C" fn tvm_htab_find(
    htab: *mut HashTable,
    key: *const c_char,
) -> c_int {
    let hashtable = &mut *htab;
    let key = CStr::from_ptr(key);

    match hashtable.get(key) {
        Some(item) => item.value,
        None => -1,
    }
}

#[no_mangle]
pub unsafe extern "C" fn tvm_htab_find_ref(
    htab: *mut HashTable,
    key: *const c_char,
) -> *mut c_char {
    let hashtable = &mut *htab;
    let key = CStr::from_ptr(key);

    match hashtable.0.get(key) {
        Some(item) => item.value_ptr as *mut c_char,
        None => ptr::null_mut(),
    }
}


现在,我们已经实际实现了存根功能,一切都应该再次正常工作。



测试此问题的最简单方法是运行我们的示例。



cargo run --example tvmi -- vendor/tinyvm/programs/tinyvm/fact.vm
    Finished dev [unoptimized + debuginfo] target(s) in 0.02s
     Running `target/debug/examples/tvmi vendor/tinyvm/programs/tinyvm/fact.vm`
1
2
6
24
120
720
5040
40320
362880
3628800


再次检查,我们可以运行它valgrind以确保没有内存泄漏或任何棘手的指针。



$ valgrind target/debug/examples/tvmi vendor/tinyvm/programs/tinyvm/fact.vm
==1492== Memcheck, a memory error detector
==1492== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1492== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==1492== Command: target/debug/examples/tvmi vendor/tinyvm/programs/tinyvm/fact.vm
==1492==
1
2
6
24
120
720
5040
40320
362880
3628800
==1492==
==1492== HEAP SUMMARY:
==1492==     in use at exit: 0 bytes in 0 blocks
==1492==   total heap usage: 270 allocs, 270 frees, 67,129,392 bytes allocated
==1492==
==1492== All heap blocks were freed -- no leaks are possible
==1492==
==1492== For lists of detected and suppressed errors, rerun with: -s
==1492== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)


成功!



数据预处理实现



虚拟机tinyvm使用类似于传统Intel x86汇编器简化形式的汇编器。解析tinyvm的汇编器的第一步是运行预处理器,该预处理器解释语句%include filename和语句%define identifier value



使用&strRust中的类型更容易进行这种文本操作,因此让我们看一下我们的箱子需要实现的接口。



// vendor/tinyvm/include/tvm/tvm_preprocessor.h

#ifndef TVM_PREPROCESSOR_H_
#define TVM_PREPROCESSOR_H_

#include "tvm_htab.h"

int tvm_preprocess(char **src, int *src_len, struct tvm_htab_ctx *defines);

#endif


同时使用char **这两个int *变量srcsrc_len乍一看似乎有些奇怪,但是如果用Rust编写等效变量最终将得到如下结果:



fn tvm_preprocess(
    src: String,
    defines: &mut HashTable,
) -> Result<String, PreprocessorError> {
    ...
}


C代码仅使用输出参数来替换字符串src,因为它无法返回换行符或错误代码。



在进行其他操作之前,您需要为编写测试tvm_preprocess()这样,我们可以确保Rust函数在功能上与原始函数相同。



我们与文件系统进行交互,因此我们需要拉出tempfile板条箱



$ cargo add --dev tempfile
    Updating 'https://github.com/rust-lang/crates.io-index' index
      Adding tempfile v3.1.0 to dev-dependencies


我们还将需要一个板条箱,libc因为我们将传递libtvm可能需要释放的线。



cargo add libc
    Updating 'https://github.com/rust-lang/crates.io-index' index
      Adding libc v0.2.66 to dev-dependencies


查看源代码,我们可以看到该函数tvm_preprocess()将继续允许%include%define只要它们为no。



首先,让我们创建一个测试以确保预处理程序能够处理%define我们知道这段代码已经可以工作了(毕竟这是代码tinyvm),所以应该不会感到惊讶。



// src/preprocessing.rs

#[cfg(test)]
mod tests {
    use crate::ffi;
    use std::{
        ffi::{CStr, CString},
        io::Write,
        os::raw::c_int,
    };

    #[test]
    fn find_all_defines() {
        let src = "%define true 1\nsome random text\n%define FOO_BAR -42\n";
        let original_length = src.len();
        let src = CString::new(src).unwrap();

        unsafe {
            // get a copy of `src` that was allocated using C's malloc
            let mut src = libc::strdup(src.as_ptr());
            let mut len = original_length as c_int;
            let defines = ffi::tvm_htab_create();

            let ret = ffi::tvm_preprocess(&mut src, &mut len, defines);

            // preprocessing should have been successful
            assert_eq!(ret, 0);

            // make sure the define lines were removed
            let preprocessed = CStr::from_ptr(src).to_bytes();
            let preprocessed =
                std::str::from_utf8(&preprocessed[..len as usize]).unwrap();
            assert_eq!(preprocessed, "\nsome random text\n\n");

            // make sure the "true" and "FOO_BAR" defines were set
            let true_define =
                ffi::tvm_htab_find_ref(defines, b"true\0".as_ptr().cast());
            let got = CStr::from_ptr(true_define).to_str().unwrap();
            assert_eq!(got, "1");
            let foo_bar =
                ffi::tvm_htab_find_ref(defines, b"FOO_BAR\0".as_ptr().cast());
            let got = CStr::from_ptr(foo_bar).to_str().unwrap();
            assert_eq!(got, "-42");

            // clean up our hashtable and copied source text
            ffi::tvm_htab_destroy(defines);
            libc::free(src.cast());
        }
    }
}


45行比我通常在测试中喜欢的要多得多,但是要在C行之间来回转换需要大量额外的代码。



我们还需要检查,包括另外一个文件。



// src/preprocessing.rs

#[cfg(test)]
mod tests {
    ...

    #[test]
    fn include_another_file() {
        const TOP_LEVEL: &str = "first line\n%include nested\nlast line\n";
        const NESTED: &str = "nested\n";

        // the preprocessor imports files from the filesystem, so we need to
        // copy NESTED to a temporary location
        let mut nested = NamedTempFile::new().unwrap();
        nested.write_all(NESTED.as_bytes()).unwrap();
        let nested_filename = nested.path().display().to_string();

        // substitute the full path to the "nested" file
        let top_level_src = TOP_LEVEL.replace("nested", &nested_filename);
        std::fs::write(&nested, NESTED).unwrap();

        unsafe {
            let top_level_src = CString::new(top_level_src).unwrap();
            // create a copy of the top_level_src which can be freed by C
            let mut src = libc::strdup(top_level_src.as_ptr());
            let mut len = libc::strlen(src) as c_int;
            let defines = ffi::tvm_htab_create();

            // after all that setup code we can *finally* call the preprocessor
            let ret = ffi::tvm_preprocess(&mut src, &mut len, defines);

            assert_eq!(ret, 0);

            // make sure the define and import lines were removed
            let preprocessed = CStr::from_ptr(src).to_bytes();
            let got =
                std::str::from_utf8(&preprocessed[..len as usize]).unwrap();

            // after preprocessing, all include and define lines should have
            // been removed
            assert_eq!(got, "first line\nnested\nlast line\n");

            ffi::tvm_htab_destroy(defines);
            libc::free(src.cast());
        }
    }


注意

作为一个旁注,该测试最初是为了将所有内容嵌套在三层之内(例如,top_level.vm包括nested.vm,其中包括true_nested.vm)以确保它处理多个级别%include,但是在撰写本文时,测试继续存在段错误。



然后我尝试运行原始的C二进制文件tvmi...



$ cd vendor/tinyvm/
$ cat top_level.vm
  %include nested
$ cat nested.vm
  %include really_nested
$ cat really_nested.vm
  Hello World
$ ./bin/tvmi top_level.vm
  [1]    10607 segmentation fault (core dumped)  ./bin/tvmi top_level.vm


原来,当您有多层时,原始的tinyvm由于某种原因而崩溃include...


现在我们进行了一些测试,因此可以开始实施tvm_preprocess()

首先,您需要确定错误的类型。



// src/preprocessing.rs

#[derive(Debug)]
pub enum PreprocessingError {
    FailedInclude {
        name: String,
        inner: IoError,
    },
    DuplicateDefine {
        name: String,
        original_value: String,
        new_value: String,
    },
    EmptyDefine,
    DefineWithoutValue(String),
}


查看process_includes()和process_derives()函数,似乎它们扫描字符串以查找特定指令,然后将该字符串替换为其他内容(文件内容,或者如果应删除该行则为空)。



我们需要能够将此逻辑提取到帮助程序中,并避免不必要的重复。



// src/preprocessing.rs

/// Scan through the input string looking for a line starting with some
/// directive, using a callback to figure out what to replace the directive line
/// with.
fn process_line_starting_with_directive<F>(
    mut src: String,
    directive: &str,
    mut replace_line: F,
) -> Result<(String, usize), PreprocessingError>
where
    F: FnMut(&str) -> Result<String, PreprocessingError>,
{
    // try to find the first instance of the directive
    let directive_delimiter = match src.find(directive) {
        Some(ix) => ix,
        None => return Ok((src, 0)),
    };

    // calculate the span from the directive to the end of the line
    let end_ix = src[directive_delimiter..]
        .find('\n')
        .map(|ix| ix + directive_delimiter)
        .unwrap_or(src.len());

    // the rest of the line after the directive
    let directive_line =
        src[directive_delimiter + directive.len()..end_ix].trim();

    // use the callback to figure out what we should replace the line with
    let replacement = replace_line(directive_line)?;

    // remove the original line
    let _ = src.drain(directive_delimiter..end_ix);
    // then insert our replacement
    src.insert_str(directive_delimiter, &replacement);

    Ok((src, 1))
}


现在我们有了一个助手process_line_starting_with_directive(),因此我们可以实现一个解析器%include



// src/preprocessing.rs

fn process_includes(
    src: String,
) -> Result<(String, usize), PreprocessingError> {
    const TOK_INCLUDE: &str = "%include";

    process_line_starting_with_directive(src, TOK_INCLUDE, |line| {
        std::fs::read_to_string(line).map_err(|e| {
            PreprocessingError::FailedInclude {
                name: line.to_string(),
                inner: e,
            }
        })
    })
}


不幸的是,%define解析器要复杂一些。



// src/preprocessing.rs

n process_defines(
    src: String,
    defines: &mut HashTable,
) -> Result<(String, usize), PreprocessingError> {
    const TOK_DEFINE: &str = "%define";

    process_line_starting_with_directive(src, TOK_DEFINE, |line| {
        parse_define(line, defines)?;
        Ok(String::new())
    })
}

fn parse_define(
    line: &str,
    defines: &mut HashTable,
) -> Result<(), PreprocessingError> {
    if line.is_empty() {
        return Err(PreprocessingError::EmptyDefine);
    }

    // The syntax is "%define key value", so after removing the leading
    // "%define" everything after the next space is the value
    let first_space = line.find(' ').ok_or_else(|| {
        PreprocessingError::DefineWithoutValue(line.to_string())
    })?;

    // split the rest of the line into key and value
    let (key, value) = line.split_at(first_space);
    let value = value.trim();

    match defines.0.entry(
        CString::new(key).expect("The text shouldn't contain null bytes"),
    ) {
        // the happy case, this symbol hasn't been defined before so we can just
        // insert it.
        Entry::Vacant(vacant) => {
            vacant.insert(Item::opaque(value));
        },
        // looks like this key has already been defined, report an error
        Entry::Occupied(occupied) => {
            return Err(PreprocessingError::DuplicateDefine {
                name: key.to_string(),
                original_value: occupied
                    .get()
                    .opaque_value_str()
                    .unwrap_or("<invalid>")
                    .to_string(),
                new_value: value.to_string(),
            });
        },
    }

    Ok(())
}


要访问哈希表中的文本,我们需要为元素提供Item一些辅助方法:



// src/htab.rs

impl Item {
    ...

    pub(crate) fn opaque_value(&self) -> &[u8] { &self.opaque_value }

    pub(crate) fn opaque_value_str(&self) -> Option<&str> {
        std::str::from_utf8(self.opaque_value()).ok()
    }
}


这时最好添加更多测试。



// src/preprocessing.rs

#[cfg(test)]
mod tests {
    ...

    #[test]
    fn empty_string() {
        let src = String::from("");
        let mut hashtable = HashTable::default();

        let (got, replacements) = process_defines(src, &mut hashtable).unwrap();

        assert!(got.is_empty());
        assert_eq!(replacements, 0);
        assert!(hashtable.0.is_empty());
    }

    #[test]
    fn false_percent() {
        let src = String::from("this string contains a % symbol");
        let mut hashtable = HashTable::default();

        let (got, replacements) =
            process_defines(src.clone(), &mut hashtable).unwrap();

        assert_eq!(got, src);
        assert_eq!(replacements, 0);
        assert!(hashtable.0.is_empty());
    }

    #[test]
    fn define_without_key_and_value() {
        let src = String::from("%define\n");
        let mut hashtable = HashTable::default();

        let err = process_defines(src.clone(), &mut hashtable).unwrap_err();

        match err {
            PreprocessingError::EmptyDefine => {},
            other => panic!("Expected EmptyDefine, found {:?}", other),
        }
    }

    #[test]
    fn define_without_value() {
        let src = String::from("%define key\n");
        let mut hashtable = HashTable::default();

        let err = process_defines(src.clone(), &mut hashtable).unwrap_err();

        match err {
            PreprocessingError::DefineWithoutValue(key) => {
                assert_eq!(key, "key")
            },
            other => panic!("Expected DefineWithoutValue, found {:?}", other),
        }
    }

    #[test]
    fn valid_define() {
        let src = String::from("%define key value\n");
        let mut hashtable = HashTable::default();

        let (got, num_defines) = process_defines(src.clone(), &mut hashtable).unwrap();

        assert_eq!(got, "\n");
        assert_eq!(num_defines, 1);
        assert_eq!(hashtable.0.len(), 1);
        let key = CString::new("key").unwrap();
        let item = hashtable.0.get(&key).unwrap();
        assert_eq!(item.opaque_value_str().unwrap(), "value");
    }
}


至此,我们已经重现了大多数预处理逻辑,所以现在我们只需要一个函数,该函数将继续扩展运算符%include和过程,%define直到没有更多的为止。



// src/preprocessing.rs

pub fn preprocess(
    src: String,
    defines: &mut HashTable,
) -> Result<String, PreprocessingError> {
    let mut src = src;

    loop {
        let (modified, num_includes) = process_includes(src)?;
        let (modified, num_defines) = process_defines(modified, defines)?;

        if num_includes + num_defines == 0 {
            return Ok(modified);
        }

        src = modified;
    }
}


当然,此功能preprocess()仅适用于Rust。我们需要创建extern "C" fn一个将参数从C类型转换为Rust可以处理的东西,然后再转换回C的参数。



// src/preprocessing.rs

#[no_mangle]
pub unsafe extern "C" fn tvm_preprocess(
    src: *mut *mut c_char,
    src_len: *mut c_int,
    defines: *mut tvm_htab_ctx,
) -> c_int {
    if src.is_null() || src_len.is_null() || defines.is_null() {
        return -1;
    }

    // Safety: This assumes the tvm_htab_ctx is actually our ported HashTable
    let defines = &mut *(defines as *mut HashTable);

    // convert the input string to an owned Rust string so it can be
    // preprocessed
    let rust_src = match CStr::from_ptr(*src).to_str() {
        Ok(s) => s.to_string(),
        // just error out if it's not valid UTF-8
        Err(_) => return -1,
    };

    match preprocess(rust_src, defines) {
        Ok(s) => {
            let preprocessed = CString::new(s).unwrap();
            // create a copy of the preprocessed string that can be free'd by C
            // and use the output arguments to pass it to the caller
            *src = libc::strdup(preprocessed.as_ptr());
            // the original C implementation didn't add a null terminator to the
            // preprocessed string, so we're required to set the length as well.
            *src_len = libc::strlen(*src) as c_int;

            // returning 0 indicates success
            0
        },
        // tell the caller "an error occurred"
        Err(_) => -1,
    }
}


提示您

可能已经注意到,我们的tvm_preprocess()函数没有任何预处理逻辑,更像是用于转换参数和返回值并确保正确的错误传播的适配器。



这不是巧合。



对FFI进行编码的秘诀是编写尽可能少的内容,并避免使用巧妙的技巧。与大多数Rust代码不同,此类互操作函数中的错误可能导致逻辑和内存错误。



在函数周围创建一个薄包装器preprocess()也使下一个任务更加容易:当大多数代码库都是用Rust编写时,我们可以删除包装器并preprocess()直接调用


现在已经tvm_preprocess()定义了函数,我们应该准备好了。



 Compiling tinyvm v0.1.0 (/home/michael/Documents/tinyvm-rs)
error: linking with `/usr/bin/clang` failed: exit code: 1
  |
  = note: "/usr/bin/clang" "-Wl,--as-needed" "-Wl,-z,noexecstack" "-m64" "-L" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.13h6j6k0dzqf6zi2.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.13l2b4uvr7p3ht4k.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.14bdbjhozo3id49g.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.14fw2gyd6mrq5730.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.19xc7n0bb25uaxgk.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.1duzy573vjvyihco.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.1e0yejy24qufh7ie.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.1k4xuir9ezt4vkzp.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.1mqdnrarww1zjlt.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.1ubflbxzxkx7grpn.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.1vtvcpzzusyku3mk.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.1wal3ebwyfg4qllf.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.235k75fk09i43ba3.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.253rt7mnjcp3n8ex.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.27phuscrye2lmkyq.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2bwv51h7gucjizh0.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2ghuai4hs88aroml.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2gqnd9h4nmhvgxbn.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2hjvtf620gtog0qz.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2hq7kc2w3vix8i5q.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2ibwag4iedx494ft.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2jdt9fes53g5mxlp.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2kv4bwega1wfr8z6.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2lja418hz58xlryz.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2o0foimqe73p8ujt.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2ouuhyii88vg8tqs.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2tnynvvdxge4sv9a.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2u1hzhj3v0d8kn4s.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2v1ii2legejcp3ir.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2vkkoofkb7zs04v1.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2w5mgql1gpr1f9uz.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2wdyioq7lxh9uxu7.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2wokgurbjsmgz12r.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.2wwcrmvusj07mx2n.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.310mxv7piqfbf4tr.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.3352aele91geo33m.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.36f4wrjtv0x5y00b.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.38f6o2m900r5q63j.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.3b67z5wg30f9te4l.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.3gyajmii4500y81t.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.3ovwslgcz03sp0ov.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.3vwhwp967j90qfpp.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.41ox17npnikcezii.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.4472ut4qn508rg19.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.4bbcvjauqmyr7tjc.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.4c9lrc1xbvaru030.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.4fzwdkjhjtwv5uik.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.4gy2dy14zw2o60sh.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.4i8qxpi0pmwn8d2e.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.4isstj7ytb9d9yep.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.4isz4o5d1flv8pme.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.4lnnaom9zd4u3xmv.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.4mgvbbhn4jewmy60.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.4q7wf9d53jp9j6y6.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.4qimnegzmsif2zbr.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.4scm7492lh4yspgt.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.4ten9b8okg10ap4i.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.4vrj7dhlet4j6oe.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.4wtf4i2ggbrvqt63.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.4zsqxnhj8yusiplh.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.50o8i1bmvqwd5eg7.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.50urmck1r52hucuw.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.51w3uc6agh3gynn3.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.55o6ad6nlq4o2zyt.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.57gih8p2bu1jbo0l.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.57rpuf5wpgkfmf1z.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.5920w55mlosqy9aj.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.5c1ra5cheein740g.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.5cuuq0m7tzehyrti.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.5e85z18y46lhofte.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.6yu7c01lw47met2.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.cn69np51jgriev2.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.d224rq9cs4mbv0q.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.e0vaqgnhc25c4ox.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.edm0ce3nfzegp4d.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.elxjhifv4wlzkc2.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.ifqyaukx6gnbb0a.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.kr8s9rcy6ux2d02.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.ley637x8c2etn66.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.njyqsm0frvb1j4d.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.r9ttxk3s5kacz9k.rcgu.o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.xrorvssabbgfjqz.rcgu.o" "-o" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88" "/home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.1iplfu0pt8fy07e4.rcgu.o" "-Wl,--gc-sections" "-pie" "-Wl,-zrelro" "-Wl,-znow" "-nodefaultlibs" "-L" "/home/michael/Documents/tinyvm-rs/target/debug/deps" "-L" "/home/michael/Documents/tinyvm-rs/target/debug/build/tinyvm-3f1a2766f78b5580/out" "-L" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-Wl,-Bstatic" "-Wl,--whole-archive" "-ltvm" "-Wl,--no-whole-archive" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libtest-a39a3e9a77b17f55.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libterm-97a69cd310ff0925.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libgetopts-66a42b1d94e3e6f9.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libunicode_width-dd7761d848144e0d.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_std-f722acdb78755ba0.rlib" "/home/michael/Documents/tinyvm-rs/target/debug/deps/libtempfile-b08849d192e5c2e1.rlib"
 "/home/michael/Documents/tinyvm-rs/target/debug/deps/librand-c85ceffb304c7385.rlib" "/home/michael/Documents/tinyvm-rs/target/debug/deps/librand_chacha-4e4839e3036afe89.rlib" "/home/michael/Documents/tinyvm-rs/target/debug/deps/libc2_chacha-7555b62a53de8bdf.rlib" "/home/michael/Documents/tinyvm-rs/target/debug/deps/libppv_lite86-0097c0f425957d6e.rlib" "/home/michael/Documents/tinyvm-rs/target/debug/deps/librand_core-de2208c863d15e9b.rlib" "/home/michael/Documents/tinyvm-rs/target/debug/deps/libgetrandom-c696cd809d660e17.rlib" "/home/michael/Documents/tinyvm-rs/target/debug/deps/liblibc-d52d0b97a33a5f02.rlib" "/home/michael/Documents/tinyvm-rs/target/debug/deps/libremove_dir_all-4035fb46dbd6fb92.rlib" "/home/michael/Documents/tinyvm-rs/target/debug/deps/libcfg_if-6adeb646d05b676c.rlib" "-Wl,--start-group" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-974c3c08f6def4b3.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libpanic_unwind-eb49676f33a2c8a6.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libhashbrown-7ae0446feecc60f2.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_alloc-2de299b65d7f5721.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libbacktrace-64514775bc06309a.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libbacktrace_sys-1ed8aa185c63b9a5.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_demangle-a839df87f563fba5.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libunwind-8e726bdc2018d836.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcfg_if-5285f42cbadf207d.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liblibc-b0362d20f8aa58fa.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-f3dd7051708453a4.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_core-83744846c43307ce.rlib" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore-d5565a3a0f4cfe21.rlib" "-Wl,--end-group" "/home/michael/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcompiler_builtins-ea790e85415e3bbf.rlib" "-Wl,-Bdynamic" "-lutil" "-lutil" "-ldl" "-lrt" "-lpthread" "-lgcc_s" "-lc" "-lm" "-lrt" "-lpthread" "-lutil" "-lutil" "-fuse-ld=lld"
  = note: ld.lld: error: duplicate symbol: tvm_preprocess
          >>> defined at preprocessing.rs:13 (src/preprocessing.rs:13)
          >>>            /home/michael/Documents/tinyvm-rs/target/debug/deps/tinyvm-8eca24ff9a1cde88.4mgvbbhn4jewmy60.rcgu.o:(tvm_preprocess)
          >>> defined at tvm_preprocessor.c:135 (vendor/tinyvm/libtvm/tvm_preprocessor.c:135)
          >>>            tvm_preprocessor.o:(.text.tvm_preprocess+0x0) in archive /home/michael/Documents/tinyvm-rs/target/debug/build/tinyvm-3f1a2766f78b5580/out/libtvm.a
          clang: error: linker command failed with exit code 1 (use -v to see invocation)


error: aborting due to previous error

error: could not compile `tinyvm`.

To learn more, run the command again with --verbose.


糟糕,链接器抱怨和preprocessing.rs,并tvm_preprocessor.c定义了一个功能tvm_preprocess()看来我们忘了tvm_preprocessor.c从组装件中取出...



diff --git a/build.rs b/build.rs
index 0ed012c..42b8fa0 100644
--- a/build.rs
+++ b/build.rs
@@ -14,6 +14,7 @@ fn main() {
         .file(src.join("tvm_memory.c"))
         .file(src.join("tvm_parser.c"))
         .file(src.join("tvm_program.c"))
-        .file(src.join("tvm_preprocessor.c"))
         .file(src.join("tvm.c"))
         .include(&include)
         .compile("tvm");
(END)


让我们再试一次。



cargo run --example tvmi -- vendor/tinyvm/programs/tinyvm/fact.vm
    Finished dev [unoptimized + debuginfo] target(s) in 0.02s
     Running `target/debug/examples/tvmi vendor/tinyvm/programs/tinyvm/fact.vm`
1
2
6
24
120
720
5040
40320
362880
3628800


好多了!



还记得最后一个例子,您tvmi陷入了三个层次的代码深度?作为一个很好的副作用,在将代码移植到Rust之后,嵌套层就可以了



注意您

可能还注意到,该函数preprocess()未使用中的任何哈希表函数tvm_htab.h将模块移植到Rust后,我们直接使用Rust类型。



这就是这个过程的美。将某些内容移植到Rust后,您可以将其直接应用以使用类型/函数-并立即从错误处理和人体工程学中受益。


结论



祝贺您,如果您仍在阅读本文,我们刚刚将两个模块从移植tinyvm到Rust。



不幸的是,本文已经很长了。但我希望到现在为止您已经了解了大局。



  1. 浏览应用程序头文件并找到一个简单的功能/模块

  2. 编写一些测试以了解现有功能应如何工作

  3. 在Rust中编写等效的函数,并确保它们通过相同的测试

  4. 创建一个薄垫片,以相同的C接口导出Rust函数,并记住从程序集中删除原始函数/模块,以便链接器使用Rust代码代替C

  5. 转到步骤1


关于此方法的最好之处是,您逐渐改进了代码库,应用程序保持运行状态,并且没有从头到尾重写所有内容。



这就像在动态地改变方向盘。





将应用程序从C移植到Rust的首选方法



也可以看看:






All Articles