|
yurume
|
|
yurume
Highway on Rust thread, following the comment by veluca:
|
|
2024-04-11 05:19:23
|
|
|
2024-04-11 05:19:24
|
https://discord.com/channels/794206087879852103/803574970180829194/1227556864317984859
|
|
2024-04-11 05:20:41
|
I think <@179701849576833024> may want to hear this, but as I said I have some prototype code for Highway on Rust (which can't be named `highway` btw, as that name was already taken by HighwayHash, so my working name is `expressway`)
|
|
2024-04-11 05:21:57
|
my main concern was an unaligned prefix and suffix, which can be *technically* handled in the same fashion, but needs some type contortion to get it as ergonomic as possible
|
|
2024-04-11 05:23:12
|
I think that has to be addressed somehow even though Highway doesn't handle them, because it will be needed for casting anyway
|
|
2024-04-11 05:24:10
|
for example, you can't blindly convert `&[f32xN]` into `&[f64xM]` where N and M are architecture-dependent
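A minimal sketch of the prefix/suffix problem, using the standard library's `slice::align_to`. The `F32x4` type and its fixed lane count of 4 are assumptions for illustration only; in an actual design the lane count would depend on the target, which is exactly why the split cannot be skipped.

```rust
/// Hypothetical 4-lane vector type (an assumption, not part of any real crate).
#[derive(Clone, Copy)]
#[repr(C, align(16))]
pub struct F32x4(pub [f32; 4]);

/// Splits a scalar slice into an unaligned prefix, an aligned vector middle,
/// and a leftover suffix.
pub fn split_aligned(data: &[f32]) -> (&[f32], &[F32x4], &[f32]) {
    // SAFETY: F32x4 is four f32 values with no padding, so every bit pattern
    // that is valid for [f32; 4] is also valid for F32x4.
    unsafe { data.align_to::<F32x4>() }
}

/// Sums a slice by handling the three parts separately.
pub fn sum(data: &[f32]) -> f32 {
    let (prefix, middle, suffix) = split_aligned(data);
    // Lane-wise accumulation over the aligned middle part.
    let mut acc = [0.0f32; 4];
    for v in middle {
        for lane in 0..4 {
            acc[lane] += v.0[lane];
        }
    }
    // The prefix and suffix fall back to scalar code.
    let edges: f32 = prefix.iter().chain(suffix).sum();
    acc.iter().sum::<f32>() + edges
}
```

The lengths of the prefix and suffix depend on where the slice happens to sit in memory, so callers cannot assume either one is empty.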
|
|
|
|
veluca
|
2024-04-11 05:34:28
|
how do you deal with dynamic dispatch?
|
|
|
yurume
|
2024-04-11 05:57:49
|
using a procedural macro
|
|
2024-04-11 05:57:56
|
my initial design was something like this:
|
|
2024-04-11 05:58:05
|
```rust
#[expressway::dispatch]
fn mul_add_loop<D: ValueOps<f32>>(
    d: D, mul: &[D::Item], add: &[D::Item], x: &mut [D::Item],
) {
    for ((mul, add), x) in mul.iter().zip(add).zip(x) {
        let mul = mul.load();
        let add = add.load();
        x.store(mul.mul_add(x.load(), add));
    }
}
```
|
|
2024-04-11 06:00:22
|
that should compile to something like this:
|
|
2024-04-11 06:01:04
|
```rust
fn mul_add_loop(...) { /* dispatch logic */ }
mod mul_add_loop {
    pub fn x86_64_3(...) { /* x86-64-3-specific code here */ }
    // ...
}
```
|
|
2024-04-11 06:01:43
|
(Rust allows a module name to coincide with a function name by the way)
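A hand-written sketch of what such an expansion might look like, assuming plain `f32` slices instead of `D::Item`. The `fallback` function and the choice of `avx2` plus `fma` as the feature set behind `x86_64_3` are assumptions made for illustration; this is not the output of any existing macro.

```rust
/// Dispatcher: lives in the value namespace, so it can share its name
/// with the module below (which lives in the type namespace).
pub fn mul_add_loop(mul: &[f32], add: &[f32], x: &mut [f32]) {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
            // SAFETY: the required CPU features were detected just above.
            return unsafe { mul_add_loop::x86_64_3(mul, add, x) };
        }
    }
    mul_add_loop::fallback(mul, add, x)
}

pub mod mul_add_loop {
    /// Compiled with AVX2 and FMA enabled, so the loop body inlined here
    /// may be auto-vectorized for that feature set.
    #[cfg(target_arch = "x86_64")]
    #[target_feature(enable = "avx2,fma")]
    pub unsafe fn x86_64_3(mul: &[f32], add: &[f32], x: &mut [f32]) {
        fallback(mul, add, x)
    }

    /// Portable implementation, also used as the shared loop body.
    #[inline(always)]
    pub fn fallback(mul: &[f32], add: &[f32], x: &mut [f32]) {
        for ((m, a), x) in mul.iter().zip(add).zip(x) {
            *x = m.mul_add(*x, *a);
        }
    }
}
```

Inside `mul_add_loop::x86_64_3`, another dispatched function's `x86_64_3` variant could be called directly, which is how chained calls could skip the dispatch logic.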
|
|
2024-04-11 06:03:11
|
so that part is actually easy, and you can even try to automatically chain dispatched functions
|
|
2024-04-11 06:03:58
|
(e.g. a function `a` with dispatch may call another function `b` with dispatch, but arch-specific implementations may entirely skip the dispatch logic)
|
|
|
|
veluca
|
|
yurume
(e.g. a function `a` with dispatch may call another function `b` with dispatch, but arch-specific implementations may entirely skip the dispatch logic)
|
|
2024-04-11 09:08:18
|
I thought of that, but in my mind I'd then have needed to use a macro for such calls
|
|
|
yurume
|
2024-04-11 09:26:55
|
it is technically possible with some shared state across multiple procedural macros
|
|
2024-04-11 09:27:02
|
anyway, that is the easy part
|
|
2024-04-11 09:28:20
|
highway tags can be mapped to Rust traits, where ScalableTag simply corresponds to a trait-generic function
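A rough sketch of that mapping, with a hypothetical `Tag` trait standing in for a Highway-style descriptor. The trait, its methods, and the `Scalar` and `Fixed4` tags are all assumptions for illustration; neither Highway nor the prototype is known to define them this way.

```rust
/// Hypothetical tag trait: each tag type selects a vector type and lane count.
pub trait Tag: Copy {
    type Vector: Copy;
    const LANES: usize;
    /// Loads the first `LANES` elements of `src`.
    fn load(self, src: &[f32]) -> Self::Vector;
    /// Stores `v` into the first `LANES` elements of `dst`.
    fn store(self, v: Self::Vector, dst: &mut [f32]);
    /// Computes `a * b + c` lane by lane.
    fn mul_add(self, a: Self::Vector, b: Self::Vector, c: Self::Vector) -> Self::Vector;
}

/// One-lane tag.
#[derive(Clone, Copy)]
pub struct Scalar;

impl Tag for Scalar {
    type Vector = f32;
    const LANES: usize = 1;
    fn load(self, src: &[f32]) -> f32 { src[0] }
    fn store(self, v: f32, dst: &mut [f32]) { dst[0] = v; }
    fn mul_add(self, a: f32, b: f32, c: f32) -> f32 { a.mul_add(b, c) }
}

/// Four-lane tag backed by a plain array.
#[derive(Clone, Copy)]
pub struct Fixed4;

impl Tag for Fixed4 {
    type Vector = [f32; 4];
    const LANES: usize = 4;
    fn load(self, src: &[f32]) -> [f32; 4] {
        let mut out = [0.0; 4];
        out.copy_from_slice(&src[..4]);
        out
    }
    fn store(self, v: [f32; 4], dst: &mut [f32]) { dst[..4].copy_from_slice(&v); }
    fn mul_add(self, a: [f32; 4], b: [f32; 4], c: [f32; 4]) -> [f32; 4] {
        std::array::from_fn(|i| a[i].mul_add(b[i], c[i]))
    }
}

/// Trait-generic function: the same body works for any tag.
pub fn mul_add_generic<D: Tag>(d: D, mul: &[f32], add: &[f32], x: &mut [f32]) {
    let n = x.len().min(mul.len()).min(add.len());
    let mut i = 0;
    while i + D::LANES <= n {
        let v = d.mul_add(d.load(&mul[i..]), d.load(&x[i..]), d.load(&add[i..]));
        d.store(v, &mut x[i..]);
        i += D::LANES;
    }
    // Scalar tail for elements that do not fill a whole vector.
    for j in i..n {
        x[j] = mul[j].mul_add(x[j], add[j]);
    }
}
```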
|
|
2024-04-11 09:29:20
|
I'm less sure about explicit `load` and `store` calls
|
|
2024-04-11 09:30:09
|
if a safe interface is desired, then they are not as necessary because iterators give them for free
|
|
2024-04-11 09:30:27
|
but we do have unaligned loads and stores as well, which have to be somehow mapped
|
|
2024-04-11 09:30:38
|
so that's another concern
|
|
|
|
veluca
|
|
yurume
it is technically possible with some shared state across multiple procedural macros
|
|
2024-04-11 10:53:50
|
ehhhh
|
|
2024-04-11 10:54:04
|
that's scary
|
|
2024-04-11 10:54:21
|
and I think it introduces dependencies on compilation order?
|
|
|
yurume
|
|
veluca
and I think it introduces dependencies on compilation order?
|
|
2024-04-11 10:58:41
|
If carelessly implemented, yes, but I think you can avoid that with some magic...
|
|
|
|
veluca
|
2024-04-11 12:09:58
|
yeah makes sense
|
|
2024-04-11 12:10:32
|
that said, before getting dynamic dispatch, I'd already love to be able to use intrinsics without lots of unsafe
|
|
2024-04-11 12:10:38
|
(and/or elide bounds checks)
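One way to approach the bounds-check part in safe Rust today is `chunks_exact`, sketched below. The lane count of 4 and the function name are assumptions; whether the checks are actually removed depends on the optimizer, so this is a pattern that usually helps rather than a guarantee.

```rust
/// Computes `x = mul * x + add` in fixed-size chunks, then handles the tail.
pub fn mul_add_chunks(mul: &[f32], add: &[f32], x: &mut [f32]) {
    const LANES: usize = 4; // assumed lane count

    // Trim all three slices to a common length up front.
    let n = x.len().min(mul.len()).min(add.len());
    let (mul, add, x) = (&mul[..n], &add[..n], &mut x[..n]);

    let mut xs = x.chunks_exact_mut(LANES);
    let mut ms = mul.chunks_exact(LANES);
    let mut as_ = add.chunks_exact(LANES);

    // Every chunk has exactly LANES elements, which gives the compiler
    // enough information to drop most per-element checks in practice.
    for ((xc, mc), ac) in xs.by_ref().zip(ms.by_ref()).zip(as_.by_ref()) {
        for i in 0..LANES {
            xc[i] = mc[i].mul_add(xc[i], ac[i]);
        }
    }

    // Leftover elements that did not fill a whole chunk.
    let tail = xs.into_remainder().iter_mut();
    for ((xv, mv), av) in tail.zip(ms.remainder()).zip(as_.remainder()) {
        *xv = mv.mul_add(*xv, *av);
    }
}
```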
|
|
|
yurume
|
2024-04-11 01:21:28
|
yeah, that was where I stopped, because I was unsure about the actual use pattern
|
|