Help understanding difference in c to wasm compiler output(s)
I started experimenting this morning with this tool I found https://wasdk.github.io/WasmFiddle/?q0u7d

The C code there builds to a 171-byte buffer, which as far as I can tell is the closest thing that tool outputs to emcc's wasm output, and which when run doesn't require anything but an empty object in the WebAssembly.instantiate "importObject" argument.

emcc, however, for the same C code, run with "emcc primeGen.c -s WASM=1 -o primeGen.html", produces a 24481-byte file and requires a crapload of new stuff in the instantiate options* that wasn't (obviously) required previously. It could just be hidden, but the "wasmImports" that WasmFiddle produces (select it at the bottom left after a build) is an empty object, and it is still empty when logged through JS and used...

So my question is: where is the bulk of emcc's .wasm output coming from, why is it creating all these extra import requirements, and how can I get rid of them if I just want to do basic things (if that's even possible or wise)?

*so far, by trial and error, I've built up this importsObject based on chrome console errors about things that must be numbers, callables, memory objects, etc... and I'm sure there's plenty more if I kept bruteforcing it:

var wasmImports = {
  env: {
    DYNAMICTOPPTR: 1000,
    tempDoublePtr: 7,
    ABORT: 2,
    STACKTOP: 8,
    STACK_MAX: 8,
    enlargeMemory: x => 0,
    getTotalMemory: x => 0,
    abortOnCannotGrowMemory: x => 0,
    abortStackOverflow: x => 0,
    nullFunc_ii: x => 0,
    nullFunc_iiii: x => 0,
    _lock: x => 0,
    _syscall6: x => 0,
    _setErrNo: x => 0,
    _abort: x => 0,
    _syscall140: x => 0,
    _emscripten_memcpy_big: x => 0,
    _syscall54: x => 0,
    _unlock: x => 0,
    __syscall146: x => 0,
    memory: new WebAssembly.Memory({initial: 256, maximum: 256})
  },
  global: { NaN: NaN, Infinity: Infinity }
};
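
Rather than discovering the required imports one console error at a time, a compiled module can be asked for its import list directly. The sketch below uses the standard WebAssembly.Module.imports() call. The byte array is a hypothetical, hand-assembled stand-in (not emcc output) that declares just two imports, and listImports is an illustrative helper name.

```javascript
// Hypothetical stand-in for a compiler-produced binary: a hand-assembled
// module whose only content is an import section with two entries.
const moduleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic, version 1
  0x02, 0x21, 0x02,                               // import section, 33 bytes, 2 entries
  0x03, 0x65, 0x6e, 0x76,                         // "env"
  0x0a, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79,
        0x42, 0x61, 0x73, 0x65,                   // "memoryBase"
  0x03, 0x7f, 0x00,                               // kind: global, i32, immutable
  0x03, 0x65, 0x6e, 0x76,                         // "env"
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79,       // "memory"
  0x02, 0x00, 0x01,                               // kind: memory, no maximum, minimum 1 page
]);

// Compile the bytes and describe every import the module declares.
function listImports(bytes) {
  const compiled = new WebAssembly.Module(bytes);
  return WebAssembly.Module.imports(compiled).map(
    (entry) => `${entry.module}.${entry.name}: ${entry.kind}`
  );
}

console.log(listImports(moduleBytes));
```

The synchronous WebAssembly.Module constructor is used here only to keep the sketch short; for a binary the size of emcc's output, WebAssembly.compile() would be the safer choice in a browser.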



Re: Help understanding difference in c to wasm compiler output(s) Reply #1 on: October 04, 2017, 05:26:23 am
Just for fun, here's the decompiled WASM binary so you can see where all the bytes are going:

http://i.imgur.com/jOMRDcI.png

Try it out here
https://pierrerossouw.github.io/dwasm/



Re: Help understanding difference in c to wasm compiler output(s) Reply #2 on: October 04, 2017, 12:38:23 pm
that's rad, thanks.



Re: Help understanding difference in c to wasm compiler output(s) Reply #3 on: October 04, 2017, 07:50:23 pm
I was talking to someone about compiling C to Wasm yesterday and we discovered that the -O3 optimised compiler output from Clang is pretty good, while the unoptimised output is so full of crap that it won't even run in the browsers at all.
Could you try adding -O3 to your emcc command and see what happens?



Re: Help understanding difference in c to wasm compiler output(s) Reply #4 on: October 05, 2017, 03:02:23 am
that makes a lot of sense, I'll give it a shot



Re: Help understanding difference in c to wasm compiler output(s) Reply #5 on: October 05, 2017, 10:14:23 am
Leaving out -O3 seems to only double the expected output size, while I'm otherwise seeing about 143 times the expected size (24481 bytes vs. 171)... I'm new here, so I'm sure there are reasons and years-long git threads as to why, but it doesn't seem to fit wasm's stated size and load-time efficiency mission as it stands now... (though it looks like building as a side module gets you down to only a little cruft, so... progress)



Re: Help understanding difference in c to wasm compiler output(s) Reply #6 on: October 05, 2017, 05:26:23 pm
Could you please share those wasm files with me? I'm working on a disassembler / optimiser and these examples help me a lot!



Re: Help understanding difference in c to wasm compiler output(s) Reply #7 on: October 06, 2017, 12:38:23 am
Just go over to the linked WasmFiddle page, select settings, and build with any optimization level you like. As for the emcc one with all that cruft, I don't know what is inside it, and as this is a work computer I'd rather not share it. But if you install emsdk or emscripten, llvm, etc. (cloning from their GitHub pages seemed to work best for me, especially since you need to install from the incoming branch), you should get similar results with the emcc command above.



Re: Help understanding difference in c to wasm compiler output(s) Reply #8 on: October 06, 2017, 07:50:23 am
Yea I've got the wasmFiddle outputs. -O3 works well, there's only a small number of useless bits in there. It seems acceptable to me. This version is 172 bytes.
-O0 doesn't work at all. Attempting to run the resulting wasm just gives you 'VM121-0 wasm-f1084df6-0:9 Uncaught RuntimeError: memory access out of bounds at <WASM UNNAMED> (<WASM> +15)' and this is purely due to buggy output, not missing JS imports or whatever. However, even this version is only 227 bytes in size.

I'm interested in seeing the huge 24481 bytes generated by emcc but last time I tried I was not able to get emcc working on my Windows machine. I'll try on a mac and see what happens.
For comparison the hand-compiled version I posted before is 140 bytes which is pretty close to the minimum possible for this piece of C code.
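
To give a feel for how little a self-contained module needs, here is a minimal sketch of a hand-assembled binary that runs with an empty import object. It is not the prime generator from this thread: it is a hypothetical 41-byte module exporting a single add function, and instantiateWithoutImports is an illustrative helper name.

```javascript
// Hypothetical hand-assembled module: exports add(a, b) and imports nothing.
const addModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // "\0asm" magic, version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

// Compile and instantiate with an empty import object.
function instantiateWithoutImports(bytes) {
  const compiled = new WebAssembly.Module(bytes);
  return new WebAssembly.Instance(compiled, {});
}

const { add } = instantiateWithoutImports(addModuleBytes).exports;
console.log(add(2, 3)); // 5
```

Every byte here is either a section header, a signature, an export name, or an instruction, which is roughly what the 140 to 172 byte versions of the prime generator consist of as well.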



Re: Help understanding difference in c to wasm compiler output(s) Reply #9 on: October 06, 2017, 03:02:23 pm
By default emscripten includes a lot of extra JavaScript glue to provide a POSIX-like environment, as well as various helper functions to interact with the module.

When you compile and use an html output, you're not meant to use the .wasm file directly; as you've found, there are many import functions that are provided by the emscripten glue. In this case you should be able to just load the .html file and that will run your module's main function.

If you want to generate a module that you can call into, you can specify a .js extension (e.g. -o primeGen.js) to generate a JavaScript wrapper that will instantiate the WebAssembly module for you.

If you just want to compile one function, it's probably easiest to use the -s SIDE_MODULE=1 flag:
Code:
emcc -O3 -s WASM=1 -s SIDE_MODULE=1 prime.c -o prime.wasm
This gives me a 316 byte wasm file as output. The module requires 4 imports:
Code:
(import "env" "memoryBase" (global $g0 i32))
(import "env" "memory" (memory $M0 256))
(import "env" "table" (table $T0 0 anyfunc))
(import "env" "tableBase" (global $g1 i32))
In this example, none of them are required, so they can be removed. It also generates a few extra functions:
Code:
(func $f1 (export "runPostSets") (type $t1) ...)
(func $f2 (export "__post_instantiate") (type $t1) ...)
But these are not needed for this simple example either.
You can read more about side modules here: https://github.com/kripken/emscripten/wiki/WebAssembly-Standalone



Re: Help understanding difference in c to wasm compiler output(s) Reply #10 on: October 06, 2017, 10:14:23 pm
How would one go about compiling into only what is required (ideally without hand-editing the wast or wasm output, which you seem to be suggesting, since otherwise I don't know how one would remove those imports)? I really just want a wasm file with the wasm binary (hex) equivalent to what I wrote in C. I want to build the JS loading code myself so I can see what is happening in what order.
Even so, I get why memory might be required (though basic examples work without explicitly addressing memory), but I don't know what the table is about... I'm guessing this is common knowledge to C/C++ devs, but I haven't seen it spelled out anywhere.
For that matter, the wasm buffer provided by WasmFiddle, which seems to be the meat of the code, is different from what wast2wasm provides even if you strip out the hex line numbering and leading zeros, and I'd like to understand that too since you seem to have a pretty good grasp on this.



Re: Help understanding difference in c to wasm compiler output(s) Reply #11 on: October 07, 2017, 05:26:23 am
AFAIK the SIDE_MODULE approach gets you the closest. I mentioned removing the imports only because it was possible, but the simplest thing to do is to provide the necessary imports, even if they aren't used:
Code:
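let moduleBytes = ...;  // This is an ArrayBuffer of the WebAssembly module.
// The module declares (memory $M0 256), so the imported memory needs at least 256 pages.
let memory = new WebAssembly.Memory({initial: 256});
// A table descriptor must name its element type; "anyfunc" matches the module's table import.
let table = new WebAssembly.Table({initial: 0, element: "anyfunc"});
let imports = {
  env : {
    memoryBase: 0,
    tableBase: 0,
    memory: memory,
    table: table,
  }
};
// instantiate() returns a Promise; for an ArrayBuffer it resolves to {module, instance}.
let instance = WebAssembly.instantiate(moduleBytes, imports).then(...);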
Quote: "but I don't know what the table is about... I'm guessing this is common knowledge to c/c++ devs, but I haven't seen it spelled out anywhere"
No, a Table is a WebAssembly-specific concept. It is currently used to hold the function references that indirect calls (such as calls through C function pointers) go through. In the future it may be used for other features as well. You can read the spec for it here: http://webassembly.github.io/spec/syntax/modules.html#tables. There's also a blog post about it here: https://hacks.mozilla.org/2017/07/webassembly-table-imports-what-are-they/
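
As a rough illustration of what a table holds, the sketch below creates a WebAssembly.Table from JavaScript, stores an exported wasm function in slot 0, and calls it back through the table. The module is a hypothetical hand-assembled add function, not output from emcc, and the names tableDemoBytes and callThroughTable are made up for this example. Inside a module, call_indirect does the equivalent lookup by index.

```javascript
// Hypothetical hand-assembled module exporting add(a, b); it imports nothing.
const tableDemoBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // "\0asm" magic, version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

const wasmAdd = new WebAssembly.Instance(
  new WebAssembly.Module(tableDemoBytes),
  {}
).exports.add;

// A table is an array of function references that lives outside linear memory.
// "anyfunc" is the element type used by the import shown earlier in the thread.
const table = new WebAssembly.Table({ initial: 1, element: "anyfunc" });

// Slots start out empty (null); only wasm functions (or null) may be stored.
table.set(0, wasmAdd);

// Look a function up by index and call it, much like a function pointer.
function callThroughTable(index, a, b) {
  return table.get(index)(a, b);
}

console.log(callThroughTable(0, 2, 3)); // 5
```

Because the table stores references rather than addresses in linear memory, a module cannot read or overwrite the functions it points to, which is one reason function pointers are routed through it.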