Lisp could be simple... but there are a lot of reasons it isn't.
It uses a different memory model than current hardware, which is optimized for C. While I don't know what goes on under SBCL's hood, the simpler Lisps I'm familiar with usually have a chunk of space for cons cells and a chunk of "vector" space kinda like a heap.
Lisp follows s-expression rules... except when it doesn't. Special forms, macros, and fexprs can basically do anything, and it's up to the programmer to know when sexpr syntax applies and when it doesn't.
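To make that concrete, here is a minimal Common Lisp sketch of a form that looks like a function call but doesn't evaluate like one. FUNCTION-IF and the two variables are hypothetical names made up for the illustration.

```lisp
;; Hypothetical helper: an ordinary-function version of IF.
(defun function-if (test then else)
  (if test then else))

;; IF is a special operator: only the chosen branch is evaluated,
;; so the ERROR form is never run.
(defparameter *special-result*
  (if t :yes (error "never evaluated")))

;; FUNCTION-IF is a normal call: every argument is evaluated first,
;; so the ERROR form runs before the function is even entered.
(defparameter *function-result*
  (handler-case (function-if t :yes (error "evaluated anyway"))
    (error () :signaled)))
```

Both forms have the same shape on the page, and nothing in the syntax tells you which evaluation rule applies. You just have to know that IF is special.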
Lisp offers simple primitives, but often also very complex functionality as part of the language. Just look at all the crazy stuff that's available in the COMMON-LISP package, for instance. This isn't really all that different than most high level languages, but no one would consider those "simple" either.
Lisp has a habit of using "unusual" practices. Consider Scheme's continuations and use of recursion, for example. Some of those - like first-class functions - have worked their way into modern languages, but imagine how they would have seemed to a Pascal programmer in 1990.
Finally, Lisp's compiler is way out there. Being able to recompile individual functions during execution is just plain nuts (in a good way). But it's also the reason you have EVAL-WHEN.
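A rough sketch of what I mean by recompiling a function while the program is running, in Common Lisp. GREET and CALL-GREET are made-up names, and the NOTINLINE declaration is only there so the example also behaves if you put it in a file and run it through COMPILE-FILE.

```lisp
;; Ask the compiler not to bind calls to GREET early, so that a
;; later redefinition is seen by existing callers.
(declaim (notinline greet))

(defun greet () "hello")
(defun call-greet () (greet))

(defparameter *before* (call-greet))

;; Compile a new definition and install it under the same name
;; while the image keeps running. CALL-GREET itself is untouched.
(compile 'greet '(lambda () "goodbye"))

(defparameter *after* (call-greet))
```

After this runs, *BEFORE* holds "hello" and *AFTER* holds "goodbye", even though CALL-GREET was never recompiled.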
All that said, I haven't investigated microcontroller Lisps. There may be one or more of those that would qualify as "simple."
Mostly we have eval-when because of outdated defaults that are worth re-examining.
A Lisp compiler today should by default evaluate every top-level form that it compiles, unless the program opts out of it.
I made the decision in TXR Lisp and it's so much nicer that way.
There are fewer surprises and less need for boilerplate for compile time evaluation control. The most you usually have to do is tell the compiler not to run that form which starts your program: for instance (compile-only (main)). In a big program with many files that could well be the one and only piece of evaluation control for the file compiler.
The downside of evaluating everything is that these definitions sit in the compiler's environment. This pollution would have been a big deal when the entire machine was running a single Lisp image. Today I can spin up a process just for the compiling, and all the definitions that are not relevant to the compile job go away when it exits. My compiler uses a fraction of the memory of something like GCC, so I don't have to worry about definitions lingering in memory during compilation when they could have been written to the object file and discarded.
Note how when eval-when is used, it's the club sandwich 99% of the time: all three toppings, :compile-toplevel, :load-toplevel and :execute, are present. The ergonomics are not very good. There are situations in which it would make sense to only use some of these, but they rarely come up.
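For anyone who hasn't run into it, this is roughly what the club sandwich looks like in Common Lisp. It's a sketch of the usual case, a helper function that a macro in the same file needs at expansion time. MAKE-ACCESSOR-NAME and DEFINE-GETTER are made-up names for the illustration.

```lisp
;; Without the EVAL-WHEN wrapper, COMPILE-FILE would only compile this
;; DEFUN, not define it, and the macro below could fail at expansion
;; time because the helper does not exist yet in the compiler's image.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun make-accessor-name (prefix slot)
    (intern (format nil "~A-~A" (symbol-name prefix) (symbol-name slot)))))

;; The macro calls the helper while it is being expanded, which
;; happens at compile time when the file is compiled.
(defmacro define-getter (prefix slot)
  `(defun ,(make-accessor-name prefix slot) (plist)
     (getf plist ,(intern (symbol-name slot) :keyword))))

;; Expands into (defun point-x (plist) (getf plist :x)).
(define-getter point x)

(point-x '(:x 1 :y 2)) ; => 1
```

With a compiler that evaluates top-level forms by default, the wrapper around the helper is exactly the boilerplate that goes away.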