hexer303 20 hours ago [-]
I just finished a similar project for fun and education.
It was a 20-year-old codebase from my old game in win32 and DirectX 9.
I first ported it to native and also switched to bgfx for rendering. This was the bulk of the work - converting all of the old DirectX fixed function pipeline code to shaders. Luckily all modern shaders can simulate all of the old fixed-function DX pipeline features with little effort. Including the coordinate system. Loading DDS textures didn't present a major challenge either.
Had similar native asset loading as yours - no deserializer. It loaded an entire asset file into a preallocated memory block, used packed structures and converted file offsets to pointers after loading. I had to convert it to 64bit for native first.
The most surprising thing: I had no idea WASM is 32bit until I read your article! Once I ported to 64bit, I then ported to WASM and I didn't even encounter any arch related bugs. In hindsight I guess it's because most of the original code was 32bit and the asset file format is still 32bit format. When I ported to 64bit I used a deserializer, so I guess that's why it all worked out in the end.
For native audio I ended up using SoLoud library, but for emscripten I #ifdef'd it out to use inline JS instead. I figured there is no point in having all that extra audio library code compiling to WASM when modern browsers natively support playing audio, oggvorbis, etc. It worked out ok, but there's still a minor bug where the music doesn't loop perfectly. You can hear a split second gap between end/start. I haven't looked deeply into it yet.
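A minimal sketch of that kind of split, not the commenter's actual code. `music_play_looped` and its return convention are hypothetical; `EM_ASM` and `UTF8ToString` are Emscripten facilities, and the native branch is left as a stub where the audio library call would go.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#endif

/* Hypothetical wrapper: start looping music from `url`.
   Returns 1 when the browser backend was used, 0 for the native stub. */
static int music_play_looped(const char *url)
{
    assert(url != NULL);
#ifdef __EMSCRIPTEN__
    /* Inline JS: let the browser decode and play the file itself.
       play() can be rejected until the user has interacted with the page. */
    EM_ASM({
        var audio = new Audio(UTF8ToString($0));
        audio.loop = true;
        audio.play();
    }, url);
    return 1;
#else
    /* Native build: hand `url` to the audio library here (omitted). */
    (void)url;
    return 0;
#endif
}
```

With this shape no audio decoder gets compiled to WASM at all; the cost is that looping behaviour is whatever the browser's audio element provides.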
Originally when we wrote the game we had banned ourselves from using C++ Exception handling and RTTI. The decision likely paid off as it makes the generated binary smaller and faster. Although I haven't had time to measure. Supposedly C++ exceptions introduce a much heavier overhead in Emscripten.
> I had no idea WASM is 32bit until I read your article!
WASM(32) is a hybrid 32/64 bit architecture. The address range (and thus pointer size) is 32 bits, but it has native 64-bit integers. E.g. it's similar to the Linux x32 ABI.
There is also a 'true' 64-bit wasm, but that's still too recent to be used in real-world code:
(but wasm64 doesn't really make sense unless you really need an address space greater than 32 bits, because the downside is slower performance)
jltsiren 17 hours ago [-]
> (but wasm64 doesn't really make sense unless you really need an address space greater than 32 bits, because the downside is slower performance)
Or unless you need to use integer types that depend on pointer size (such as size_t or usize), but your integers are too large to fit in 32 bits. That's a pretty common occurrence in bioinformatics. I've been waiting for years for Wasm to become usable, but it looks like Apple is still holding it back.
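A small sketch of what that means in practice, assuming a hypothetical helper `to_size_t`: on wasm32 `size_t` is 32 bits wide, so a 64-bit count has to be range-checked before it can be used as a size or an index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert a 64-bit count to size_t only if it fits.
   On wasm32 SIZE_MAX is 2^32 - 1, so larger values are rejected there;
   on a 64-bit target every uint64_t value fits. */
static bool to_size_t(uint64_t value, size_t *out)
{
    assert(out != NULL);
    if (value > (uint64_t)SIZE_MAX) {
        return false;   /* a plain cast would silently truncate */
    }
    *out = (size_t)value;
    return true;
}
```

The 64-bit arithmetic itself is native on wasm32; only values that must pass through pointer-sized types hit the limit.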
senfiaj 15 hours ago [-]
Actually, WASM32 usually runs in a 64-bit browser, so it can utilize most of the 64-bit instructions and general purpose registers. The problem is it's much harder to ensure that 64-bit pointers don't access browser critical data structures such as stack or heap contents. At least for WASM32, the pointer arithmetic is easily forced to be 32-bit in natively running JIT-ed code without dramatic performance losses. So, it's guaranteed that the WASM32 module won't access anything further than 4 or 8GiB (4GiB + 4GiB, when doing more complicated effective memory addressing instructions), thus, it's easy to cage a 4GiB memory block and surround it with guard buffers of comparable size inside a 64-bit virtual address space. With WASM64 you can't do this optimization because 64-bit pointers can easily reach the browser's own memory, so you have to add a lot of runtime bounds checks wherever the compiler's static checks can't guarantee landing inside the safe address range.
diath 23 hours ago [-]
With regards to 1), do not write/read structs directly to/from files. Instead write a proper serializer/deserializer. Without it, you may encounter another breakage soon when a different compiler/compiler options insert different struct padding bytes, which will then once again make your data non-portable, and a maliciously crafted save file with no length/size field validation on the deserializer level can lead to a variety of memory bugs.
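A minimal sketch of such a serializer, under stated assumptions: the `SaveGame` record, its field names and the byte layout are all hypothetical. Each field is written explicitly in little-endian order, so struct padding never reaches the file, and the reader validates the stored length before copying.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SAVE_NAME_MAX 16u   /* hypothetical limit for this sketch */

/* Hypothetical in-memory record; its padding is never written out. */
typedef struct {
    uint32_t level;
    uint16_t health;
    char     name[SAVE_NAME_MAX + 1];   /* always NUL-terminated */
} SaveGame;

static void put_u16(uint8_t *p, uint16_t v) { p[0] = (uint8_t)v; p[1] = (uint8_t)(v >> 8); }
static void put_u32(uint8_t *p, uint32_t v) { put_u16(p, (uint16_t)v); put_u16(p + 2, (uint16_t)(v >> 16)); }
static uint16_t get_u16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t get_u32(const uint8_t *p) { return (uint32_t)get_u16(p) | ((uint32_t)get_u16(p + 2) << 16); }

/* File layout: u32 level, u16 health, u16 name_len, then name bytes.
   Returns the number of bytes written, or 0 if `cap` is too small. */
static size_t save_write(const SaveGame *s, uint8_t *buf, size_t cap)
{
    assert(s != NULL && buf != NULL);
    size_t len = strlen(s->name);
    size_t need = 8 + len;
    if (need > cap) return 0;
    put_u32(buf, s->level);
    put_u16(buf + 4, s->health);
    put_u16(buf + 6, (uint16_t)len);
    memcpy(buf + 8, s->name, len);
    return need;
}

/* Rejects truncated input and any name length that exceeds either the
   record's capacity or the bytes actually present. */
static bool save_read(SaveGame *s, const uint8_t *buf, size_t size)
{
    assert(s != NULL && buf != NULL);
    if (size < 8) return false;
    uint16_t len = get_u16(buf + 6);
    if (len > SAVE_NAME_MAX || (size_t)len > size - 8) return false;
    s->level = get_u32(buf);
    s->health = get_u16(buf + 4);
    memcpy(s->name, buf + 8, len);
    s->name[len] = '\0';
    return true;
}
```

The same file then reads back identically on wasm32, x86-64 or a big-endian host, at the cost of one explicit line per field.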
jstimpfle 23 hours ago [-]
struct layout is well specified, it should be possible to avoid any padding issues by just aligning and by padding (with dummy members) correctly. The problem in practice is mostly integer representation (big-endian vs little-endian).
leni536 23 hours ago [-]
Specified by whom? Not the C standard for sure. It is indeed specified by individual ABIs, and ABIs don't tend to do anything too weird, but that's another question.
jstimpfle 22 hours ago [-]
looks like I was wrong, but here is the de-facto standard I was relying on over the years ;-). Not that I've memcpied many structs to file directly btw. http://www.catb.org/esr/structure-packing/
jcranmer 20 hours ago [-]
The general struct layout algorithm is that you lay out the first member at the address of the struct (this is guaranteed by C), and then subsequent fields in order (also guaranteed by C). What isn't guaranteed is how fields get their alignment, in particular shenanigans you can do with allocating fields in the padding of their prior field, and bitfields in general are horribly underspecified.
In practice, C doesn't do any padding shenanigans, but C++ does (but only for non-POD structs, and then you discover there's several slightly different definitions that mean basically "POD", so have fun predicting which one is the one that actually matters for your use case).
RossBencina 15 hours ago [-]
If you sort your fields by size or manually pad them with natural alignment, and use #pragma pack or equivalent non-standard directives that gets you most of the way there. But yes, avoid bitfields.
C++ "standard layout type" is the modern equivalent of "POD" I think.
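A hedged sketch of the "sort by size and pad by hand" approach, with the layout pinned down by compile-time checks instead of trusted. `AssetRecord` and its fields are hypothetical; byte order is still not addressed here.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical on-disk record: fixed-width fields, largest first,
   with the tail padding spelled out as a dummy member. */
typedef struct {
    uint64_t timestamp;   /* offset 0  */
    uint32_t asset_id;    /* offset 8  */
    uint16_t width;       /* offset 12 */
    uint8_t  kind;        /* offset 14, fixed width instead of enum or bool */
    uint8_t  reserved;    /* offset 15, explicit padding, written as 0 */
} AssetRecord;

/* If a compiler or ABI lays the struct out differently, the build fails
   instead of silently producing an incompatible file. */
_Static_assert(sizeof(AssetRecord) == 16, "unexpected padding in AssetRecord");
_Static_assert(offsetof(AssetRecord, asset_id) == 8, "asset_id moved");
_Static_assert(offsetof(AssetRecord, width) == 12, "width moved");
_Static_assert(offsetof(AssetRecord, kind) == 14, "kind moved");
```

Because every member is naturally aligned and the total is a multiple of the largest alignment, no `#pragma pack` is needed in this particular case.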
flohofwoe 17 hours ago [-]
> struct layout is well specified
Technically that's not true at least for booleans and enums, the C standard doesn't define specific sizes for those (bools are commonly 1 byte though, but for enums at least MSVC likes to disagree with Clang and GCC).
Using a direct struct memory layout for persistency and then expecting it to work across compilers, CPUs and ABIs is almost guaranteed to cause problems.
DanielHB 22 hours ago [-]
If you modify or even just move fields around the struct that also changes the way they are serialized...
You really need a serializer for this sort of thing because it can also include forwards compatibility of your data structures.
jstimpfle 22 hours ago [-]
sure, if you change the struct, it will now be different.
edflsafoiewq 17 hours ago [-]
It's typical to only append fields when you do this.
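One way the append-only convention is often made to work is a leading size field, sketched below. The `Settings` record, its defaults and `settings_read` are hypothetical, and the sketch assumes writer and reader share the same endianness and that every field is a `uint32_t`, so there is no padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record. Version 1 ended after `volume`; version 2
   appended `difficulty`. Fields are never reordered or removed. */
typedef struct {
    uint32_t size;        /* bytes the writer stored, including this field */
    uint32_t volume;      /* v1 */
    uint32_t difficulty;  /* v2, appended */
} Settings;

/* Reads a record written by any version. Fields the writer did not know
   keep their defaults; trailing bytes this reader does not know are ignored.
   Returns 0 on success, -1 on a malformed record. */
static int settings_read(Settings *out, const void *buf, size_t avail)
{
    assert(out != NULL && buf != NULL);
    uint32_t stored;
    if (avail < sizeof stored) return -1;
    memcpy(&stored, buf, sizeof stored);
    if (stored < sizeof stored || stored > avail || stored % 4u != 0) return -1;

    Settings tmp = { 0u, 50u, 1u };                 /* defaults */
    size_t n = stored < sizeof tmp ? stored : sizeof tmp;
    memcpy(&tmp, buf, n);                           /* copy only known bytes */
    tmp.size = (uint32_t)sizeof tmp;
    *out = tmp;
    return 0;
}
```

An old file read by new code gets default values for the appended fields, and a newer file read by old code simply has its extra tail skipped.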
arcadialeak 1 days ago [-]
I love how WASM is the thing that finally blurred the line between Web and Native programming, formerly two realms isolated from each other for a long time. This both develops better awareness of how the code is executed by the hardware, which JavaScript devs often lack, and also brings skilled folks from the Native platforms who seem to be not so against WASM as they were against JavaScript (and all other parts of the Web, really). Maybe this will bear fruit in that people will make more Native user interfaces again.
pjmlp 24 hours ago [-]
ActiveX, Alchemy, PNaCL,...
genxy 23 hours ago [-]
JVM, Z-Machine, P-Code.
yjftsjthsd-h 21 hours ago [-]
I remember Java applets; when did z-machine and p-code make it to the browser?
DonHopkins 21 hours ago [-]
Everything made it to the browser through emulation.
Saying ActiveX here shows a fundamental misunderstanding of what both WASM and ActiveX are.
Is ActiveX platform independent? No, it's exclusive to Windows. Is it sandboxed? Nope, just digital signing and prayer. Does it implement a virtual machine? Nope. Compromises out the wazoo on efficiency, data orientation, or predictable performance? You betcha. ActiveX is closer to a DOM sandbox escape exploit than a real piece of engineering. Why do we need WASM when we've had GET since 1990?
Don't confuse the map for the territory, implementation details matter, just labeling something "Mars Colonial Transporter" doesn't mean it actually flew to mars.
pjmlp 8 hours ago [-]
What is a fundamental misunderstanding is selling WebAssembly as the first time bleeding Web and native code has been achieved.
All those "look Python on the browser!" were already done by ActiveState with Perl, Python and Tcl.
frollogaston 20 hours ago [-]
Wasm still doesn't let you make native user interfaces, the UI is in the web browser. You can put native UI components into a React Native or Electron app though.
flohofwoe 17 hours ago [-]
> Wasm still doesn't let you make native user interfaces
That's currently only not possible because nobody wants to do the work to create something like wasi-gfx (https://wasi-gfx.dev/), but for native UI frameworks instead of 3D APIs.
The inconvenient truth is that even "native" cross-platform applications hardly ever go through the trouble to target the platform-native UI framework (and instead they go through non-native frameworks like Qt or a webview wrapper).
tracker1 16 hours ago [-]
For that matter, there's so much diversity in actual rendering anymore, very few apps on any platform are really native feeling. Especially with a few electron apps in the mix.
Would be cool to get some standardization on at least a few APIs for default fonts, light/dark mode, background and accent colors, etc... so that apps are a little less alien in practice. I'm really not even the idea of Tauri or similar to use a native browser engine, but better skinning APIs so you can get something like Material, but tuned to better match the desktop you're on.
For that matter, a wasi component package would be nice as well. Harder for accessibility though.
frollogaston 14 hours ago [-]
I meant, in-browser wasm can't do native things like creating a blank non-browser window on my Mac like a Swift app could, no matter what libraries it uses.
gspr 24 hours ago [-]
I wanted to love it. As someone who hasn't done any web stuff since I was a child, I thought it'd be amazing for it to be "just another platform".
I'm a bit disappointed though:
* There's still no way to do DOM manipulation. So then it's tempting to just grab a canvas and draw everything yourself, which of course wreaks havoc on things like accessibility. I'm no fan of the web, but at least it comes with a somewhat agreed-upon way to display graphical stuff – it's a bit of a shame if we're all gonna just treat it like a surface for pixels.
* WASI still leaves something to be desired. Why can't I have raw sockets and file access and stuff, in a POSIX-like way? I understand that sandboxing is important, so this can all be on a per-request-basis, but still. This "just another platform" is still too far from just that.
* The amount of JS glue needed to actually load WASM stuff in the browser is annoying. The idea of needing a bunch of magic "bundlers" is sad.
flohofwoe 17 hours ago [-]
Using WASM to make cross-platform code run in browsers isn't any more weird/esoteric than targeting Android via the Android NDK. The least painful way is to do some things in the 'native' platform language (e.g. Javascript or Java/Kotlin). Emscripten's FFI features for calling out into Javascript code snippets (even JS snippets that are directly embedded in C/C++ source files) is actually really nice, much better than any other FFI solution I've seen so far (and light years ahead of anything offered by the Android NDK).
In the end the web is just another platform, but a platform that is quite a bit different from the UNIX/Windows duopoly we're used to.
samiv 24 hours ago [-]
You can call JS in which you can manipulate the DOM.
Of course architecturally (also regarding your file access) it's better to use the wasm for logic as much as possible where the web (HTML/JS) provides the UI and IO, data flows into wasm for work and results flow back to the web.
This also has the benefit that you can keep your original C/C++ source code much more platform agnostic which helps reusability and testing.
frollogaston 20 hours ago [-]
Trying wasm is still on my todo list, but this sounds like how I'd expect it to work
gspr 24 hours ago [-]
> You can call JS in which you can manipulate the DOM.
Well sure. But for me, the promise of WASM was to make the browser "just another platform". Now it's "this special platform where you have to access some of the most important functionality through FFI interop with a very high-level, very opinionated language".
> Of course architecturally (also regarding your file access) it's better to use the wasm for logic as much as possible where the web (HTML/JS) provides the UI and IO, data flows into wasm for work and results flow back to the web.
OK, but like, I wanted the browser to be "just another platform". I don't want to use JS, and I consider HTML orthogonal to my logic. I realize that's not where we're at, but that's what I dreamt of. Hence my disappointment. Which is OK, I don't matter :)
> This also has the benefit that you can keep your original C/C++ source code much more platform agnostic which helps reusability and testing.
It feels the opposite to me.
jayd16 21 hours ago [-]
Hmm well I guess I don't quite get what counts as "just another platform." Surely every platform is going to have the native APIs that you need to abstract over. Why is WASM different?
Is it just a matter of WASM being too new to have full featured wrappers and APIs for your language of choice?
frollogaston 16 hours ago [-]
Yeah it's like how if you want a cross-platform UI in C, you kinda need to use something like SDL that abstracts away the platform specifics. Even then might be hard for everything to work.
Web is "just another platform" with its own specifics, and the advantage is multiple OSes can run that platform pretty much the same way.
trumpdong 23 hours ago [-]
JS in the web context is what C or assembly is in the native context: something you have to use, because it's the foundation the platform is built on, and every language needs a way to interact with it, and good languages support it inline when you need it.
postalrat 23 hours ago [-]
If enough people adopt identical or similar js glue then they can use that for a new standard. If people don't care about a standard interface then why bother creating a new standard? Look what happened with jquery selectors and ajax. People loved it and it became the new standard built into browsers.
tracker1 16 hours ago [-]
FWIW, the various Rust react-like libraries (Yew, Dioxus, Leptos) are all reasonably fast for many/most applications. Even if the DOM goes through a JS interpreted layer.
Something akin to raw sockets over a host interface (or WSS bridge) could be cool... similar for sandboxed FS access, which browsers are starting to improve upon.
Yes, fully WASI/WASM would be nicer than some of the JS glue... but it's still useful all the same.
muvlon 23 hours ago [-]
> WASI still leaves something to be desired. Why can't I have raw sockets and file access and stuff, in a POSIX-like way?
FWIW, that's exactly what they shipped first, with WASI preview 1 (wasip1). You can still use this today, and all runtimes with any level of WASI support will be able to run it.
phickey 18 hours ago [-]
Wasip1 did not specify sockets. Some implementations have made non-standard additions to add them, but sockets were not added to the standard until wasip2.
Notably, listen and connect are missing. But sockets themselves were in there.
trumpdong 23 hours ago [-]
There's no way to draw on a canvas in WASM either. You just decided to write JS wrapper functions for that. But you didn't write wrapper functions for DOM manipulation.
gspr 23 hours ago [-]
You're right. But at least the JS wrapper for the canvas is just used for setting up the shared memory, if I remember correctly?
At any rate: this doubly makes my point.
thewavelength 1 days ago [-]
Why is a relatively new technology like WASM being limited to 32-bit pointers? Why repeat the same mistake again?
> Web is 32-bit. Your 64-bit structs will break.
This was the root cause of most of my bugs. WASM is 32-bit address space, pointers are 4 bytes not 8.
whizzter 1 days ago [-]
1: Letting your code break on pointer size changes is quite a bad sign imho (it suggests many other things are probably done with aliasing, etc., with a high risk of breaking due to undefined behaviour once gcc/clang gets around to exploiting it for an optimization).
2: iirc WASM was initially designed to be shimmable via Asm.JS to force laggards (Apple, Google) to implement it. Asm.JS in turn relied on specific rules in JS to get reliable 32bit arithmetic (which was impossible for 64bit).
Wasm64 is implemented and works in Chrome and Firefox. Apple is lagging again with Safari.
thewavelength 24 hours ago [-]
Thanks!
1: True, although it also limits the addressable memory and the typical 4GB limit seems less these days. I’m thinking of large apps like Figma running in the browser.
2: Will existing 32-bit WASM binaries break on WASM64 engines or does the binary have a flag for compatibility?
whizzter 22 hours ago [-]
1: Something like Figma could probably offload some of the memory pressure to GPU textures. (But they'd probably run into safety browser limits before that).
2: Most runtimes are 64bit already. A runtime detecting a wasm32 binary will just continue to generate code with the current JIT compiler, whilst WASM64 will require another JIT (and perhaps another memory system, since WASM32 runtimes are often based on "hacks" where 4gb of address space is reserved but not given real memory so that the JIT compiler gets an easier job without security implications).
dathinab 16 hours ago [-]
32bit WASM doesn't have a strict 4GiB limit (if the runtime and OS it runs on is 64bit, which is normally the case)
the thing is in WASM "memory" is more or less a resizable ArrayBuffer
and while each has an effective 4GiB limit, wasm does allow passing more than one such buffer to any specific wasm "execution/thread"(1). You can then reference them in load/store instructions to load/store from "memories" other than the default one
As general purpose languages tend not to model that, this isn't that easy to take advantage of, but it is still useful for all kinds of "tricks", like (non exhaustive):
- working around 4GiB size limit
- persistent memory between otherwise clean restarts and/or software updates (like what you can get from systemds file descriptor store and other means)
- easier handling of pre-populated memory (think large perfect hashmaps, trie, or similar)
- memory isolation, WASM memory can be shared, but for security and fault tolerance reasons it is often preferable if different workers have their own memory array as well as an additional shared memory array.
- This also allows stuff like security proxies where A->B have a shared memory IPC mechanism and B->C have that too, but A->C can't directly communicate at all. Not that relevant in the browser and more for server side WASM usage.
- and more
Anyway IMHO the main point for WASM64 is more the convenience benefits than the 32bit memory limitations. Porting is easier, since most software is 64bit today and it's what people are used to. There are a lot of ways where overflows can happen with 32bit but are practically impossible for 64bit. E.g. overflowing 0u64 with +=1 at 6e9 ops/s takes decades, but for 0u32 it's <1s. Stuff like that means you need far more sanity&safety checks in 32bit and it's easier to mess up edge cases.
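The overflow figures can be checked with a few lines of arithmetic. `seconds_until_wrap` is a hypothetical helper; the 6e9 increments per second rate is the one quoted in the comment.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until a counter that starts at 0 wraps around when it is
   incremented `ops_per_second` times per second. `max_value` is the
   largest value the counter can hold (UINT32_MAX or UINT64_MAX). */
static double seconds_until_wrap(double max_value, double ops_per_second)
{
    assert(ops_per_second > 0.0);
    return (max_value + 1.0) / ops_per_second;
}
```

At 6e9 increments per second a 32-bit counter wraps after 2^32 / 6e9, about 0.72 seconds, while a 64-bit counter needs 2^64 / 6e9, about 3.07e9 seconds, or roughly 97 years.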
koolala 24 hours ago [-]
what would make it break? i think the program just calls a 64 bit wasm memory function if it uses the capability
PhilipRoman 24 hours ago [-]
I believe 32-bit was chosen partially due to implementation efficiency reasons. It makes sense because you can allocate a 4GB mapping, so there is no need for a second software virtual memory layer. Also perhaps they internally require tagged pointers, which are much cheaper, especially if aligned, if the pointer is only 32 bits
Findecanor 23 hours ago [-]
WASM has a (pointer + i32) address mode, and the effective address is 33 bits.
So WASM implementations use 8GB mappings ...
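The 33-bit figure follows from adding two unsigned 32-bit values without wrapping. A small sketch, where `effective_address` and `WASM32_RESERVATION` are illustrative names rather than anything from an engine:

```c
#include <assert.h>
#include <stdint.h>

#define WASM32_RESERVATION (UINT64_C(8) << 30)   /* 8 GiB */

/* Effective address of a wasm32 load/store: a 32-bit base from the
   operand stack plus a 32-bit constant offset from the instruction,
   added without wrapping. */
static uint64_t effective_address(uint32_t base, uint32_t offset)
{
    return (uint64_t)base + (uint64_t)offset;
}

/* Largest possible sum is 2 * (2^32 - 1) = 2^33 - 2, just under 8 GiB. */
_Static_assert((uint64_t)UINT32_MAX + UINT32_MAX < WASM32_RESERVATION,
               "every base + offset lands inside an 8 GiB reservation");
```

Since no wasm32 access can start outside that range, unmapped pages inside the reservation can trap out-of-bounds accesses without an explicit compare in the generated code.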
flohofwoe 17 hours ago [-]
Because 64-bit WASM can be quite a bit slower than 32-bit WASM:
TL;DR: wasm64 requires explicit heap bounds checks, while in wasm32 the memory mapping hardware does it for free.
E.g. quote:
"The only reason to use Memory64 is if you actually need more than 4GB of memory.
Memory64 won’t make your code faster or more “modern”. 64-bit pointers in WebAssembly simply allow you to address more memory, at the cost of slower loads and stores."
ape4 22 hours ago [-]
64 bit was added in WebAssembly 2.0 (finished in 2022 according to Wikipedia). I know that doesn't answer why it wasn't there in the first place.
koolala 24 hours ago [-]
32 is better for a lot of things like simd. the strength of it is wasm can do both types now and js can't unfortunately. a number in js is strictly 64.
senfiaj 18 hours ago [-]
Not sure about SIMD. If you mean WASM, the main advantage of WASM32 over WASM64 is execution speed if it runs on a 64-bit runtime. This is because pointer accesses are simply 64-bit base pointer + 32/33-bit offset (the 32-bit pointer value in WASM program + some 32-bit optional offset). Since the offset in the memory access is already trimmed to 32/33-bit (in a 32-bit half of the register) at machine instruction level there is no possibility to escape the 8GB virtual memory cage that Chrome allocates, thus no need for additional runtime checks. WASM64, on the other hand, can escape without such checks.
koolala 12 hours ago [-]
I just meant getting to use 32-bit numbers in general and not just for memory. It's nice to be able to use them for speed but also when things like GPUs only support them. I wish JS supported using them too.
tracker1 16 hours ago [-]
deleted
flohofwoe 16 hours ago [-]
> WASM kind of inherited some of that legacy.
It didn't. WASM has true 64 bit integers (or specifically, the base types of WASM are: i32, i64, f32 and f64 - where the integer types are 'sign agnostic' like CPU registers).
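"Sign agnostic" can be illustrated in C by keeping values as raw 32-bit patterns and choosing the interpretation per operation, which is roughly how `i32.add` versus `i32.div_s` / `i32.div_u` behave. The helper names below are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the same 32 bits as signed or unsigned without relying
   on implementation-defined conversions. */
static int32_t as_signed(uint32_t bits) { int32_t v; memcpy(&v, &bits, sizeof v); return v; }
static uint32_t as_bits(int32_t v) { uint32_t b; memcpy(&b, &v, sizeof b); return b; }

/* Like i32.add: one operation is correct for both interpretations. */
static uint32_t add_bits(uint32_t a, uint32_t b) { return a + b; }

/* Like i32.div_u: the operands are treated as unsigned. */
static uint32_t div_u(uint32_t a, uint32_t b)
{
    assert(b != 0u);
    return a / b;
}

/* Like i32.div_s: the same bit patterns, treated as signed. */
static uint32_t div_s(uint32_t a, uint32_t b)
{
    assert(b != 0u);
    assert(!(as_signed(a) == INT32_MIN && as_signed(b) == -1));   /* overflow case */
    return as_bits(as_signed(a) / as_signed(b));
}
```

For the bit pattern 0xFFFFFFFE, addition gives the same result either way, while the two divisions by 2 yield 0x7FFFFFFF (unsigned) and 0xFFFFFFFF, i.e. -1 (signed).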
groundzeros2015 23 hours ago [-]
Because a web page shouldn’t use 4 GB of ram, and the win is that each pointer can be half the size (better for memory and cache).
The real mistake is requiring pointer to be 64 bit when most programs don’t use it.
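To make the size argument concrete, here is a hedged comparison of a tree node that links by pointer with one that links by 32-bit index into a node array. Both types are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Links are native pointers, so the node grows with the pointer width. */
typedef struct PtrNode {
    struct PtrNode *left;
    struct PtrNode *right;
    int32_t         key;
} PtrNode;

/* Links are 32-bit indices into an array of nodes, so the node is the
   same size on every target. */
typedef struct {
    uint32_t left;
    uint32_t right;
    int32_t  key;
} IndexNode;

_Static_assert(sizeof(IndexNode) == 12, "IndexNode should have no padding");
```

On a typical 64-bit target `PtrNode` is 24 bytes (two 8-byte pointers, the key, and 4 bytes of tail padding); on wasm32 it is 12, the same as `IndexNode`, which means twice as many nodes per cache line.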
DonHopkins 22 hours ago [-]
You sound like the misattributed Bill Gates of 2026.
groundzeros2015 17 hours ago [-]
No? Most consumer desktops have 8 or 16 GB and phones less. You want to use more than half for a web page?
For reference 4 GB is 8x more than a ps3.
frollogaston 16 hours ago [-]
Game consoles are weird though, they have less RAM than contemporary PCs. Like PS3 and Xbox360 both had 512MB, while my iMac G5 had 1GB. Maybe cause the console's RAM is unified with the GPU, while the G5 only had 64-128MB VRAM.
frollogaston 20 hours ago [-]
Even before RAM got very expensive recently, it had already plateaued. Like 32 GB was still considered a lot for a PC and was about the same price as a decade prior.
unwind 1 days ago [-]
Meta: a space is missing in the title.
Since this is one of the bugs, I always recommend writing
It's not 100% better, but it cuts out a few tokens which helps readability and moves the significant asterisk further left where I think it's easier to spot.
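A hedged sketch of the idiom this comment appears to describe (the exact line is not preserved in this copy): take `sizeof` of the dereferenced destination pointer rather than repeating a type name. `Item` and `make_items` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint32_t id;
    float    weight;
    char     name[24];
} Item;

/* Allocates room for `count` items, or returns NULL on failure.
   The buggy variant would be malloc(count * sizeof(Item *)), which
   reserves only pointer-sized slots: 8 bytes each on x86-64 and
   4 bytes each on wasm32. */
static Item *make_items(size_t count)
{
    assert(count > 0);
    if (count > SIZE_MAX / sizeof(Item)) {
        return NULL;                      /* count * size would overflow */
    }
    Item *items = malloc(count * sizeof *items);
    return items;
}
```

Because the size expression is derived from `items` itself, changing the element type later cannot leave a stale type name behind in the allocation.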
jstimpfle 23 hours ago [-]
It's totally true, using sizeof like a function is one of my pet peeves. Even the kernel people do it but it's WRONG and you are right.
But ACSHUALLY, how you write allocation is like this
The kernel people seem to finally have figured out this one in 2026.
unwind 3 hours ago [-]
Nah I'm still against repeating the type name all over the place, and the cast adds nothing good imnsho.
jstimpfle 3 hours ago [-]
The cast is at least 50% of why this is useful! You'll now get compile errors in case you did anything wrong.
DonHopkins 19 hours ago [-]
Nothing is sane in a language that lets you say 4["Foo!"]
Array indexing in C is just pointer arithmetic wearing Groucho Marx Glasses.
C combines the flexibility and power of assembly language with the user-friendliness of assembly language.
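For readers who have not met the construct: C defines `a[i]` as `*(a + i)`, and since the addition commutes, `i[a]` means the same thing. A tiny sketch with a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>

/* Returns nonzero when all three spellings of "element i of s" agree. */
static int indexing_forms_agree(const char *s, size_t i)
{
    assert(s != NULL);
    return s[i] == *(s + i) && s[i] == i[s];
}
```

So `4["Foo!"]` is just `"Foo!"[4]`, which is the terminating NUL of the string literal.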
jstimpfle 18 hours ago [-]
> Nothing is sane in a language that lets you say 4["Foo!"]
I just had a look at your HN profile page and was struck by the irony of seeing your Forth vs Lisp vs Postscript code examples there. Now consider that I've never written code like 4["Foo!"], even though I know it's possible, but in other languages you constantly have to do mental gymnastics to get any real work done, and those are allegedly so much saner !???
DonHopkins 17 hours ago [-]
When they were handing out brains, I thought they said suᴉɐɹq, and I said "180 rotate".
quietbritishjim 24 hours ago [-]
Honestly, I think I'm more likely to get your form wrong than the original one. This doesn't obviously look wrong to me:
Maybe I find this harder to parse because I'm not used to sizeof without brackets (though I know it's valid). But I think the bigger deal is that your version has a bug if the star is missing whereas theirs has a bug if the star is present; it's easier to spot something extra than it is to spot something missing.
ErroneousBosh 1 days ago [-]
> Meta: a space is missing in the title.
I like the word "everybug" :-D
Joker_vD 19 hours ago [-]
Frankly, "sizeof(T*)" should generate a warning if T is anything other than void, or a function type.
Yes, I know that C technically allows rather heterogenous representations for pointers to different types, but in practice there is difference only between object pointers and function pointers.
Panzerschrek 6 hours ago [-]
I did the same with one of the small games I have developed. It wasn't that hard. I only needed to tweak the build script and to fix some minor issues, like changing how the main function works and swapping color components in the resulting picture. I did use SDL2 for it, but without OpenGL, so I had no problems with shaders or anything similar.
Someone 21 hours ago [-]
FTA: I was serializing asset structs directly to disk (pak file) that had raw pointers in them
I’m surprised that that works in WASM. Wouldn’t a tiny change in your memory usage (say if you toggle your “log startup progress” flag) load data at a different address?
mwkaufma 20 hours ago [-]
Usually you do "pointer-fixup" where you convert them to relative-offsets on write and then back to absolute-offsets on read.
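A minimal sketch of the offset side of that scheme, under stated assumptions: `AssetHeader`, `to_offset` and `resolve_name` are hypothetical, and this variant resolves offsets on access instead of patching pointers in place after loading.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical pak layout: a header at offset 0 that refers to a name
   stored elsewhere in the same blob by its offset from the blob start. */
typedef struct {
    uint32_t name_offset;
    uint32_t name_length;
} AssetHeader;

/* Write side: turn a pointer into the blob into a base-relative offset. */
static uint32_t to_offset(const uint8_t *base, const void *ptr)
{
    assert(base != NULL && ptr != NULL);
    return (uint32_t)((const uint8_t *)ptr - base);
}

/* Read side: validate offset and length against the blob size, then
   turn the offset back into a pointer. Returns NULL if out of range. */
static const char *resolve_name(const uint8_t *base, size_t blob_size)
{
    assert(base != NULL);
    AssetHeader h;
    if (blob_size < sizeof h) return NULL;
    memcpy(&h, base, sizeof h);
    if (h.name_offset < sizeof h || h.name_offset > blob_size) return NULL;
    if (h.name_length > blob_size - h.name_offset) return NULL;
    return (const char *)(base + h.name_offset);
}
```

Since the file only ever stores distances from the start of the blob, it does not matter at which address the blob ends up, or whether pointers are 4 or 8 bytes wide.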
xydone 1 days ago [-]
The memory64 proposal was merged into upstream last year, any reason to opt into 32 bit despite that?
sestep 1 days ago [-]
It's slower. Wasm32 can just reserve 8 GiB (32-bit pointer + 32-bit offset) of the virtual address space from the OS for each memory, so checking for out-of-bounds memory accesses imposes no performance penalty. Wasm64 can't do that, so each memory access is a bit slower.
senfiaj 22 hours ago [-]
Sometimes I wonder whether it's possible to run the wasm code in a separate sandboxed process to eliminate a lot of checks. I mean optionally, because normally JS calls wasm code synchronously in the same address space. The bridge will add more latency when there is a transition between JS and wasm. It's obviously complicated because some data structures can also be shared, such as SharedArrayBuffer.
flohofwoe 16 hours ago [-]
> The bridge will add more latency when there is a transition between JS and wasm.
This would be similar to how NaCl/PNaCl communicated with the JS side (via message passing), and that really sucked and would also be prohibitively slow for talking to 'high frequency APIs' like WebGL2 or WebGPU (or the DOM heh).
xydone 24 hours ago [-]
Oh that's interesting, never noticed it in my experience but I have never written anything in wasm where it would matter. Makes perfect sense now that I think about it though. Thanks!
trumpdong 23 hours ago [-]
You don't need 4GB and it wastes memory to make pointers twice as big? Even Linux supports running 64-bit code in a 32-bit address space ("x32 ABI") for this reason.
Narishma 23 hours ago [-]
> Even Linux supports running 64-bit code in a 32-bit address space ("x32 ABI") for this reason.
I don't think that ever had much, if any, adoption and it looks like it will be removed in the next few releases.
flohofwoe 16 hours ago [-]
I already posted the link in another reply, but this is a good overview why wasm32 is usually the better choice over wasm64:
TL;DR: wasm64 has slower memory load/store operation because it requires 'software bounds checking', so unless you absolutely need more than 4 GB RAM, wasm32 is the better choice.
whizzter 1 days ago [-]
Apple
koolala 24 hours ago [-]
they limit some good things on purpose just for the sake of ecosystem competition. but with this they are slowly implementing it?
hiccuphippo 23 hours ago [-]
Fun game! The demo works great on mobile except for some small font sizes and you can't hover over items to see the tooltip before selecting them.
fyrn_ 20 hours ago [-]
You can get real breakpoints, memory watching, etc in browser with the chrome debugging extension
flohofwoe 15 hours ago [-]
I would recommend the VSCode WASM DWARF debugging extension instead of the Chrome extension nowadays:
which of these vulnerabilities are most concerning to you in wasm programs?
rvz 16 hours ago [-]
All of them.
koolala 12 hours ago [-]
Can you explain why at least one of them is bad in WASM in your own words? Why be concerned? There are not really that many capabilities inside the WASM program that can be exploited and it's hard to imagine a realistic example. An example that paper gave is doing document.write from WASM with unsanitized strings but that is bad practice even in Javascript.
The bounds checking story is only on the external limits of linear memory segments.
If memory gets corrupted inside a linear memory segment, it can equally well be exploited to change execution behaviour, which for many scenarios is already good enough for the attacker.
Yet these kind of attack vectors usually are dropped from blog posts selling WebAssembly as a revolutionary bytecode.
It is only yet another one since various others that came and went since UNCOL became an idea.
koolala 12 hours ago [-]
How could any general execution environment guarantee memory like that? That doesn't seem like a realistic expectation. You can write safe Rust code if you want memory guarantees in WASM but would you really want it to block the ability to run unsafe Rust code too?
pjmlp 8 hours ago [-]
Easy, see other bytecodes with bounds checking opcodes, and where use of unsafe bytecodes taint the executable on the verifier, which then requires explicit execution permission.
koolala 7 hours ago [-]
Taint it how? What kind of permissions? Your fix is a pop up warning on unsafe code?
pjmlp 3 hours ago [-]
See, this is where knowing the history of bytecode formats since UNCOL, would be relevant.
"In fact, all unsafe constructs are rejected by the NEWP compiler unless a block is specifically marked to allow those instructions. Such marking of blocks provide a multi-level protection mechanism."
"NEWP programs that contain unsafe constructs are initially non-executable. The security administrator of a system is able to "bless" such programs and make them executable, but normal users are not able to do this. (Even "privileged users", who normally have essentially root privilege, may be unable to do this depending on the configuration chosen by the site.) While NEWP can be used to write general programs and has a number of features designed for large software projects, it does not support everything ALGOL does."
"Normally, code that is not verifiably type safe cannot run, although you can set security policy to allow the execution of trusted but unverifiable code."
"SLIC enforces IBM i’s unique object-based model. Rather than managing raw memory locations or file descriptors, all resources (programs, files, queues, data areas, libraries) are managed as named objects with properties, ownership, and permissions. This object model permeates everything in IBM i, from file systems to program calls."
Aka capabilities, and what the CHERI project is pushing for as a means to fix C and C++ code at the hardware level.
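To make the "taint, then require permission" idea concrete, here is a toy sketch. It is not how NEWP, the CLR, or IBM i implement it; the `Op` type, the opcode names, and the `blessedByAdmin` flag are all hypothetical and exist only to illustrate the policy described in the quotes above.

```typescript
// Toy model: every instruction is either verifiably safe or marked unsafe.
// The names here are made up for illustration, not taken from a real bytecode.
type Op = { name: string; unsafe: boolean };

interface VerifiedModule {
  ops: Op[];
  tainted: boolean; // true if the verifier saw at least one unsafe op
}

// The verifier does not reject unsafe code; it records that it is present.
function verify(ops: Op[]): VerifiedModule {
  return { ops, tainted: ops.some((op) => op.unsafe) };
}

// Tainted modules only run when someone with authority has approved them.
function canExecute(mod: VerifiedModule, blessedByAdmin: boolean): boolean {
  return !mod.tainted || blessedByAdmin;
}

const safeModule = verify([{ name: "load.checked", unsafe: false }]);
const rawModule = verify([
  { name: "load.checked", unsafe: false },
  { name: "load.raw", unsafe: true },
]);
```

Under this policy `safeModule` runs for anyone, while `rawModule` runs only when `canExecute(rawModule, true)` is used, so unsafe code stays possible but is no longer the default.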
koolala 3 hours ago [-]
Isn't that like rejecting non-safe Rust code? Unsafe code plays an important role in the hot-loops of our ever-slowing computers.
DonHopkins 21 hours ago [-]
I've been porting Micropolis (SimCity Classic) to WASM / WebGPU / Svelte 5. Emscripten + Embind compile the C++ engine and glue it to TypeScript/Svelte/Runes/Reactivity; TypeScript owns UI, rendering, and callback handlers.
I agree with the article's main lessons: wasm32 pointer size, don't serialize structs with pointers, debug native 32-bit when you can, WebGL/WebGPU is stricter than desktop GL, Emscripten export flags still bite. I hit some of the same categories; the parts that were actually tricky for Micropolis are below.
Svelte 5 runes ($state, $derived, etc.) work in plain .ts modules, not just .svelte templates. That matters because the WASM bridge is a reactive module the HUD, command bus, and Vitest all import -- not a component-only trick. The file has to be MicropolisReactive.svelte.ts so runes compile under the same Vite/SvelteKit pipeline as the app; plain .ts breaks in Node with "$state is not defined".
Embind API surface -- what to expose and what to leave out:
// This file uses emscripten's embind to bind C++ classes,
// C structures, functions, enums, and contents into JavaScript,
// so you can even subclass C++ classes in JavaScript,
// for implementing plugins and user interfaces.
//
// Wrapping the entire Micropolis class from the Micropolis (open-source
// version of SimCity) code into Emscripten for JavaScript access is a
// large and complex task, mainly due to the size and complexity of the
// class. The class encompasses almost every aspect of the simulation,
// including map generation, simulation logic, user interface
// interactions, and more.
The comments in that file go on to describe the strategy for wrapping: Core Simulation Logic, Memory and Performance Considerations, Direct Memory Access, User Interface and Rendering, Callbacks and Interactivity, and Optimizations.
The engine callback virtual interface bridged C++ to JS via JSCallback:
In the old NeWS/Hyperlook, TCL/Tk/X11, SWIG/Python/PyGTK, and SWIG/Python/TurboGears/AMF/Flash versions, this callback interface used to be a stringly typed general purpose event callback interface, which I tightened up into a strict C++ interface and a corresponding TypeScript interface, so Embind could help me integrate it safely and cleanly with TypeScript and Svelte runes.
TypeScript handlers that update rune-backed state (sendMessage, didTool, budget hooks, etc.):
The pattern: C++ fires callbacks with enough context for the UI; TS updates $state; components read micropolisReactive (peek / poke / memory / getSnapshot) instead of calling Embind or touching HEAP* directly. That is where the rubber hits the road for interactivity.
Heap access is its own footgun. Emscripten may expose Module.wasmMemory, HEAPU16, or neither until init; some getters throw if you read too early. Centralized helper:
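The actual helper is not reproduced in this thread. As a rough illustration of the idea only, a defensive accessor might look like the sketch below. `ModuleLike` and `getHeapU16` are hypothetical names, and the property shapes are an assumption about what a given Emscripten build exposes.

```typescript
// Hypothetical subset of an Emscripten module object; real modules expose more.
interface ModuleLike {
  wasmMemory?: { buffer: ArrayBuffer };
  HEAPU16?: Uint16Array;
}

// Returns a 16-bit view of linear memory, or null if the module is not ready.
function getHeapU16(mod: ModuleLike): Uint16Array | null {
  try {
    const heap = mod.HEAPU16; // may be a getter that throws before init
    // A buffer detached by memory growth reports byteLength 0.
    if (heap && heap.buffer.byteLength > 0) {
      return heap;
    }
    const memory = mod.wasmMemory;
    if (memory && memory.buffer.byteLength > 0) {
      return new Uint16Array(memory.buffer);
    }
  } catch (_err) {
    // Treat "read too early" as "not ready" instead of crashing the caller.
  }
  return null;
}
```

Callers then have a single place to check for `null` and retry after initialization, rather than each component guessing which property exists.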
Map rendering: WebGPU tile renderer with canvas fallback (legacy WebGL frozen, now reimplementing in WebGPU). The renderer reads 16 bit flags + tile indices from direct simulator memory views into WASM linear memory (mapData / mopData), not per-frame Embind copies.
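A small sketch of what "views, not copies" means in practice. A plain `ArrayBuffer` stands in for WASM linear memory here, and the byte offset, cell count, and the 10-bit index / 6-bit flag split are assumptions made for the example rather than values taken from the repo.

```typescript
// Assumed bit layout, for illustration: low 10 bits = tile index, high 6 bits = flags.
const TILE_INDEX_MASK = 0x03ff;
const TILE_FLAG_MASK = 0xfc00;

interface Cell {
  tile: number;
  flags: number;
}

function decodeCell(raw: number): Cell {
  return { tile: raw & TILE_INDEX_MASK, flags: raw & TILE_FLAG_MASK };
}

// Stand-in for WASM linear memory; a real build would use the module's memory buffer.
const linearMemory = new ArrayBuffer(64 * 1024);
const mapByteOffset = 4096; // hypothetical address reported by the engine
const cellCount = 16; // hypothetical map size

// The typed array aliases the underlying bytes; nothing is copied.
const mapView = new Uint16Array(linearMemory, mapByteOffset, cellCount);

// Simulate the engine writing cell 3 through its own access path.
// WASM memory is little-endian, as are practically all current hosts.
const engineView = new DataView(linearMemory);
engineView.setUint16(mapByteOffset + 2 * 3, 0x8000 | 42, true);

// The renderer sees the new value without any per-frame copy.
const cell3 = decodeCell(mapView[3]);
```

Because `mapView` is created once and reused, the per-frame cost is just reading the cells; the one thing to watch for is recreating the view if the memory grows and the old buffer is detached.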
City saves are a defined binary format (.cty), not fwrite of engine structs. Live map data is views into WASM linear memory (mapData / mopData), not embedded native pointers -- same idea as the article's side-table fix, but that is how this codebase is already structured.
Why I find this stack interesting: original SimCity engine lineage, narrow Embind surface on purpose, reactive TS facade so automation and UI share one sim without reviving the old Python/SWIG/pyGTK path. Sprites (trains, choppers, generic orange monsters wreaking chaos and havoc -- definitely not Godzilla [TM], but possibly Trump adjacent) simulate in C++; compositing them in the WebGPU path is still work in progress.
The WebGPU renderer is being built as a general stack with pluggable layers, including Sims content rendering (characters, animations, terrain, objects, walls, floors, ui effects, etc).
It was a 20-year-old codebase from my old game in win32 and DirectX 9.
I first ported it to native and also switched to bgfx for rendering. This was the bulk of the work - converting all of the old DirectX fixed function pipeline code to shaders. Luckily all modern shaders can simulate all of the old fixed-function DX pipeline features with little effort. Including the coordinate system. Loading DDS textures didn't present a major challenge either.
Had similar native asset loading as yours - no deserializer. It loaded an entire asset file into a preallocated memory block, used packed structures and converted file offsets to pointers after loading. I had to convert it to 64bit for native first.
The most surprising thing: I had no idea WASM is 32bit until I read your article! Once I ported to 64bit, I then ported to WASM and I didn't even encounter any arch related bugs. In hindsight I guess it's because most of the original code was 32bit and the asset file format is still 32bit format. When I ported to 64bit I used a deserializer, so I guess that's why it all worked out in the end.
For native audio I ended up using SoLoud library, but for emscripten I #ifdef'd it out to use inline JS instead. I figured there is no point in having all that extra audio library code compiling to WASM when modern browsers natively support playing audio, oggvorbis, etc. It worked out ok, but there's still a minor bug where the music doesn't loop perfectly. You can hear a split second gap between end/start. I haven't looked deeply into it yet.
Originally when we wrote the game we had banned ourselves from using C++ Exception handling and RTTI. The decision likely paid off as it makes the generated binary smaller and faster. Although I haven't had time to measure. Supposedly C++ exceptions introduce a much heavier overhead in Emscripten.
You can see the port in action at https://scorchedplanets.com
WASM(32) is a hybrid 32/64 bit architecture. The address range (and thus pointer size) is 32 bits, but it has native 64-bit integers. E.g. it's similar to the Linux x32 ABI.
There is also a 'true' 64-bit wasm, but that's still too recent to be used in real-world code:
https://caniuse.com/wf-wasm-memory64
(but wasm64 doesn't really make sense unless you really need an address space greater than 32 bits, because the downside is slower performance)
Or unless you need to use integer types that depend on pointer size (such as size_t or usize), but your integers are too large to fit in 32 bits. That's a pretty common occurrence in bioinformatics. I've been waiting for years for Wasm to become usable, but it looks like Apple is still holding it back.
In practice, C doesn't do any padding shenanigans, but C++ does (but only for non-POD structs, and then you discover there's several slightly different definitions that mean basically "POD", so have fun predicting which one is the one that actually matters for your use case).
C++ "standard layout type" is the modern equivalent of "POD" I think.
Technically that's not true at least for booleans and enums, the C standard doesn't define specific sizes for those (bools are commonly 1 byte though, but for enums at least MSVC likes to disagree with Clang and GCC).
Using a direct struct memory layout for persistency and then expecting it to work across compilers, CPUs and ABIs is almost guaranteed to cause problems.
You really need a serializer for this sort of thing because it can also include forwards compatibility of your data structures.
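As a minimal sketch of that point: the record below is written field by field with explicit widths, explicit byte order, and a leading version byte, so the bytes on disk do not depend on any compiler's struct layout or pointer size. `PlayerRecord` and its fields are hypothetical and only serve the example.

```typescript
// Hypothetical record; the fields are illustrative, not from a real save format.
interface PlayerRecord {
  id: number; // stored as u32
  health: number; // stored as u16
  alive: boolean; // stored as u8 (0 or 1)
}

const FORMAT_VERSION = 1;
const RECORD_SIZE = 1 + 4 + 2 + 1; // version + id + health + alive

function encode(rec: PlayerRecord): ArrayBuffer {
  const buf = new ArrayBuffer(RECORD_SIZE);
  const view = new DataView(buf);
  view.setUint8(0, FORMAT_VERSION);
  view.setUint32(1, rec.id, true); // true = little-endian, stated explicitly
  view.setUint16(5, rec.health, true);
  view.setUint8(7, rec.alive ? 1 : 0);
  return buf;
}

function decode(buf: ArrayBuffer): PlayerRecord {
  const view = new DataView(buf);
  const version = view.getUint8(0);
  if (version !== FORMAT_VERSION) {
    // A real loader would branch here to read older layouts.
    throw new Error("unsupported format version " + version);
  }
  return {
    id: view.getUint32(1, true),
    health: view.getUint16(5, true),
    alive: view.getUint8(7) !== 0,
  };
}
```

The version byte is what gives room for compatibility later: a newer reader can recognise an older layout and upgrade it instead of misinterpreting the bytes.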
UCSD Pascal:
https://archive.org/details/UCSD_Pascal_1.1_1
Wizardry:
https://archive.org/details/WizardryProvingGrounds
https://en.wikipedia.org/wiki/SWEET16
https://techwithdave.davevw.com/2024/05/running-sweet-16-ste...
Is ActiveX platform independent? No, it's exclusive to Windows. Is it sandboxed? Nope, digital signing and prayer. Does it implement a virtual machine? Nope. Compromises out the wazoo? Efficiency, data orientation, or predictable performance? You betcha. ActiveX is closer to a DOM sandbox escape exploit than a real piece of engineering. Why do we need WASM when we've had GET since 1990?
Don't confuse the map for the territory. Implementation details matter; just labeling something "Mars Colonial Transporter" doesn't mean it actually flew to Mars.
All those "look Python on the browser!" were already done by ActiveState with Perl, Python and Tcl.
That's currently only not possible because nobody wants to do the work to create something like wasi-gfx (https://wasi-gfx.dev/), but for native UI frameworks instead of 3D APIs.
The inconvenient truth is that even "native" cross-platform applications hardly ever go through the trouble to target the platform-native UI framework (and instead they go through non-native frameworks like Qt or a webview wrapper).
Would be cool to get some standardization on at least a few APIs for default fonts, light/dark mode, background and accent colors, etc... so that apps are a little less alien in practice. I'm really not even against the idea of Tauri or similar using a native browser engine, but I'd like better skinning APIs so you can get something like Material, but tuned to better match the desktop you're on.
For that matter, a wasi component package would be nice as well. Harder for accessibility though.
I'm a bit disappointed though:
* There's still no way to do DOM manipulation. So then it's tempting to just grab a canvas and draw everything yourself, which of course wreaks havoc on things like accessibility. I'm no fan of the web, but at least it comes with a somewhat agreed-upon way to display graphical stuff – it's a bit of a shame if we're all gonna just treat it like a surface for pixels.
* WASI still leaves something to be desired. Why can't I have raw sockets and file access and stuff, in a POSIX-like way? I understand that sandboxing is important, so this can all be on a per-request-basis, but still. This "just another platform" is still too far from just that.
* The amount of JS glue needed to actually load WASM stuff in the browser is annoying. The idea of needing a bunch of magic "bundlers" is sad.
In the end the web is just another platform, but a platform that is quite a bit different from the UNIX/Windows duopoly we're used to.
Of course architecturally (also regarding your file access) it's better to use the wasm for logic as much as possible where the web (HTML/JS) provides the UI and IO, data flows into wasm for work and results flow back to the web.
This also has the benefit that you can keep your original C/C++ source code much more platform agnostic which helps reusability and testing.
Well sure. But for me, the promise of WASM was to make the browser "just another platform". Now it's "this special platform where you have to access some of the most important functionality through FFI interop with a very high-level, very opinionated language".
> Of course architecturally (also regarding your file access) it's better to use the wasm for logic as much as possible where the web (HTML/JS) provides the UI and IO, data flows into wasm for work and results flow back to the web.
OK, but like, I wanted the browser to be "just another platform". I don't want to use JS, and I consider HTML orthogonal to my logic. I realize that's not where we're at, but that's what I dreamt of. Hence my disappointment. Which is OK, I don't matter :)
> This also has the benefit that you can keep your original C/C++ source code much more platform agnostic which helps reusability and testing.
It feels the opposite to me.
Is it just a matter of WASM being too new to have full featured wrappers and APIs for your language of choice?
Web is "just another platform" with its own specifics, and the advantage is multiple OSes can run that platform pretty much the same way.
Something akin to raw sockets over a host interface (or WSS bridge) could be cool... similar for sandboxed FS access, which browsers are starting to improve upon.
Yes, fully WASI/WASM would be nicer than some of the JS glue... but it's still useful all the same.
FWIW, that's exactly what they shipped first, with WASI preview 1 (wasip1). You can still use this today, and all runtimes with any level of WASI support will be able to run it.
Notably, listen and connect are missing. But sockets themselves were in there.
At any rate: this doubly makes my point.
> Web is 32-bit. Your 64-bit structs will break. This was the root cause of most of my bugs. WASM is 32-bit address space, pointers are 4 bytes not 8.
2: IIRC WASM was initially designed to be shimmable via asm.js to force laggards (Apple, Google) to implement it; asm.js in turn relied on specific rules in JS to get reliable 32-bit arithmetic (which was impossible for 64-bit).
Wasm64 is implemented and works in Chrome and Firefox. Apple is lagging again with Safari.
1: True, although it also limits the addressable memory and the typical 4GB limit seems less these days. I’m thinking of large apps like Figma running in the browser.
2: Will existing 32-bit WASM binaries break on WASM64 engines or does the binary have a flag for compatibility?
2: Most runtimes are 64-bit already. A runtime that detects a wasm32 binary will just continue to generate code with the current JIT compiler, whilst WASM64 will require another JIT (and perhaps another memory system, since WASM32 runtimes are often based on "hacks" where 4 GB of address space is reserved but not backed by real memory, so that the JIT compiler gets an easier job without security implications).
the thing is in WASM "memory" is more or less a resizable ArrayBuffer
and while each has an effective 4GiB limit, wasm does allow passing more than one such buffer to any specific wasm "execution/thread"(1). You can then reference them in load/store instructions to load/store from "memories" other than the default one.
As general purpose languages tend not to model that, this isn't that easy to take advantage of, but it is still useful for all kinds of "tricks", like (non-exhaustive):
- working around 4GiB size limit
- persistent memory between otherwise clean restarts and/or software updates (like what you can get from systemd's file descriptor store and other means)
- easier handling of pre-populated memory (think large perfect hashmaps, trie, or similar)
- memory isolation, WASM memory can be shared, but for security and fault tolerance reasons it is often preferable if different workers have their own memory array as well as an additional shared memory array.
- This also allows stuff like security proxies where A->B have a shared memory IPC mechanism and B->C have that too, but A->C cannot directly communicate at all. Not that relevant in the browser and more for server side WASM usage.
- and more
Anyway, IMHO the main point of WASM64 is more the convenience benefits than the 32-bit memory limitations. Porting is easier because most software is 64-bit today; it's what people are used to. There are a lot of ways where overflows can happen with 32-bit but are practically impossible with 64-bit. E.g. overflowing 0u64 with +=1 at 6e9 ops/s takes nearly a century, but for 0u32 it's <1s. Stuff like that means you need far more sanity & safety checks in 32-bit, and it's easier to mess up edge cases.
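The overflow figures can be checked with a few lines of arithmetic. This is just a back-of-the-envelope calculation using the 6e9 increments per second from the comment; `secondsToWrap` is a hypothetical helper name.

```typescript
// Time for a counter starting at 0 to wrap around when incremented by 1 each op.
function secondsToWrap(bits: number, opsPerSecond: number): number {
  return Math.pow(2, bits) / opsPerSecond;
}

const SECONDS_PER_YEAR = 365.25 * 24 * 3600;
const OPS_PER_SECOND = 6e9;

const u32Seconds = secondsToWrap(32, OPS_PER_SECOND); // about 0.72 seconds
const u64Years = secondsToWrap(64, OPS_PER_SECOND) / SECONDS_PER_YEAR; // about 97 years

console.log(u32Seconds.toFixed(2) + " s for u32, " + u64Years.toFixed(0) + " years for u64");
```

So a 32-bit counter in a hot loop can wrap within the first second of running, while the 64-bit one outlives the hardware.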
https://spidermonkey.dev/blog/2025/01/15/is-memory64-actuall...
TL;DR: wasm64 requires explicit heap bounds checks, while in wasm32 the memory mapping hardware does it for free.
E.g. quote:
"The only reason to use Memory64 is if you actually need more than 4GB of memory.
Memory64 won’t make your code faster or more “modern”. 64-bit pointers in WebAssembly simply allow you to address more memory, at the cost of slower loads and stores."
It didn't. WASM has true 64 bit integers (or specifically, the base types of WASM are: i32, i64, f32 and f64 - where the integer types are 'sign agnostic' like CPU registers).
The real mistake is requiring pointer to be 64 bit when most programs don’t use it.
For reference 4 GB is 8x more than a ps3.
Since this is one of the bugs, I always recommend writing
Like this instead:
It's not 100% better, but it cuts out a few tokens, which helps readability, and moves the significant asterisk further left where I think it's easier to spot.
But ACSHUALLY, how you write allocation is like this
The kernel people seem to finally have figured out this one in 2026.
Array indexing in C is just pointer arithmetic wearing Groucho Marx Glasses.
C combines the flexibility and power of assembly language with the user-friendliness of assembly language.
I just had a look at your HN profile page and was struck by the irony of seeing your Forth vs Lisp vs Postscript code examples there. Now consider that I've never written code like 4["Foo!"], even though I know it's possible, but in other languages you constantly have to do mental gymnastics to get any real work done, and those are allegedly so much saner !???
I like the word "everybug" :-D
Yes, I know that C technically allows rather heterogeneous representations for pointers to different types, but in practice there is a difference only between object pointers and function pointers.
I’m surprised that that works in WASM. Wouldn’t a tiny change in your memory usage (say if you toggle your “log startup progress” flag) load data at a different address?
This would be similar to how NaCl/PNaCl communicated with the JS side (via message passing), and that really sucked and would also be prohibitively slow for talking to 'high frequency APIs' like WebGL2 or WebGPU (or the DOM heh).
I don't think that ever had much, if any, adoption and it looks like it will be removed in the next few releases.
https://spidermonkey.dev/blog/2025/01/15/is-memory64-actuall...
TL;DR: wasm64 has slower memory load/store operation because it requires 'software bounds checking', so unless you absolutely need more than 4 GB RAM, wasm32 is the better choice.
https://marketplace.visualstudio.com/items?itemName=ms-vscod...
This lets you set up an IDE-like 'press F5 to build and start a debug session' workflow in VSCode, with the debuggee running in Chrome.
E.g. see:
https://floooh.github.io/2023/11/11/emscripten-ide.html
[0] https://soft.vub.ac.be/Publications/2022/vub-tr-soft-22-02.p...
[1] https://www.usenix.org/system/files/sec20-lehmann.pdf
Burroughs (1961)
https://en.wikipedia.org/wiki/Burroughs_Large_Systems
CLR (2001)
https://learn.microsoft.com/en-us/dotnet/framework/tools/pev...
IBM i (nee AS/400)
https://medium.com/@dhemanthc/ibm-i-architecture-how-timi-an...
Embind API surface -- what to expose and what to leave out:
https://github.com/SimHacker/MicropolisCore/blob/main/packag...
The engine callback virtual interface bridged C++ to JS via JSCallback:
https://github.com/SimHacker/MicropolisCore/blob/main/packag...
TypeScript handlers that update rune-backed state (sendMessage, didTool, budget hooks, etc.):
https://github.com/SimHacker/MicropolisCore/blob/main/apps/m...
Simulator attach/detach, singleton engine load, wiring JSCallback into Micropolis:
https://github.com/SimHacker/MicropolisCore/blob/main/apps/m...
Centralized heap access helper:
https://github.com/SimHacker/MicropolisCore/blob/main/apps/m...
Bridge design, Vitest against real WASM, teardown order with Embind lifetimes:
https://github.com/SimHacker/MicropolisCore/blob/main/docume...
Map rendering (WebGPU tile renderer with canvas fallback):
https://github.com/SimHacker/MicropolisCore/blob/main/packag...
https://github.com/SimHacker/MicropolisCore/blob/main/docume...
Character animation demo:
https://vitamoo.space
VitaMoo code:
https://github.com/SimHacker/MicropolisCore/tree/main/packag...
Unified WebGPU Renderer:
https://github.com/SimHacker/MicropolisCore/blob/main/docume...
Render Core Package:
https://github.com/SimHacker/MicropolisCore/blob/main/docume...
Renderer Plugin Roadmap:
https://github.com/SimHacker/MicropolisCore/blob/main/docume...
Live Micropolis tile renderer and simulator demo (no other ui yet, work in progress):
https://micropolisweb.com
Demo of the simulator, cellular automata, and tile engine to Jerry Martin's music:
https://www.youtube.com/watch?v=319i7slXcbI
Repo:
https://github.com/SimHacker/MicropolisCore