I think you’re fixating on the very specific example. Imagine if instead of 2 + 2 it was multiplying large matrices. The compiler or runtime would be smart enough to figure out whether it’s worth dispatching the parallelism for you. Basically auto-vectorisation, but for parallelism.
Notably, in most cases there is no way the compiler can know which of these scenarios is going to happen at compile time.
At runtime, the CPU can figure it out though, eh?
I mean, theoretically it's possible. A super basic example: if the data size is known at compile time, the loop could be auto-parallelized, e.g.
int buf_size = 10000000;
auto vec = make_large_array(buf_size);
for (const auto& val : vec)
{
    do_expensive_thing(val);
}
this could clearly be parallelised. In a C++ world that doesn't exist, we can see that it's valid. If I replace it with

int buf_size = 10000000;
cin >> buf_size;
auto vec = make_large_array(buf_size);
for (const auto& val : vec)
{
    do_expensive_thing(val);
}
the compiler could generate some code that looks like: if buf_size >= SOME_LARGE_THRESHOLD { DO_IN_PARALLEL } else { DO_SERIAL }
With some background logic for managing threads, etc. In a C++-style world where "control" is important it likely wouldn't fly, but if this was python...
arr_size = 10000000
buf = [None] * arr_size
for x in buf:
    do_expensive_thing(x)
could be parallelised at compile time. But no one really writes code like that (data is generally provided at runtime), which is why ‘super smart’ compilers kinda went nowhere, eh?
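For what it's worth, the runtime-threshold dispatch described above can be done by hand today. Here's a minimal Python sketch, where `do_expensive_thing` is a stand-in for real per-element work and `THRESHOLD` is a made-up cutoff that a real runtime would have to tune:

```python
from multiprocessing import Pool

THRESHOLD = 100_000  # hypothetical cutoff; a real system would tune this

def do_expensive_thing(x):
    # stand-in for some genuinely expensive per-element computation
    return x * x

def process(buf):
    # Dispatch in parallel only when the input is large enough to
    # amortise the cost of spinning up worker processes; otherwise
    # run the plain serial loop.
    if len(buf) >= THRESHOLD:
        with Pool() as pool:
            return pool.map(do_expensive_thing, buf)
    return [do_expensive_thing(x) for x in buf]
```

Processes rather than threads because, for CPU-bound work in CPython, the GIL means threads wouldn't buy you any speedup. The point of the thread stands, though: it's the human, not the compiler, picking the threshold here.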