Fantastic tech that Musk hates
In a recent No Priors podcast with the Waymo Co-CEO Dmitri Dolgov, he talks about how they evaluated just driving with cameras and how it isn't good enough for full autonomy and doesn't meet their bar for safety [1].
They went deep down the wrong path and need to justify their mistake. Waymo will be killed off any day now.
I find opinions like this to be almost as crazy as saying that the earth is flat because Waymo has a working, truly self-driving taxi service RIGHT FREAKING NOW while Musk is still promising to have one some day in the hazy future while NEVER making a single vehicle that can actually drive without someone in the car. Musk rejecting LIDAR means that he fundamentally doesn't understand the technological challenge of self-driving despite having access to the world's experts OR he is cynically using false promises of self-driving to pump up Tesla's share price. I know which one I think is true.
I think anyone who listens to Musk talking about something they themselves know a lot about quickly realises that Musk's skills are elsewhere. He can motivate and market the hell out of a business whilst snorting more ketamine than a herd of horses but he is not a technical genius by any means. He pays people well to agree with him and fires them when they don't, so I suspect that his companies that produce better and more stable products do so because he micromanages them less.
It's weird that what he does is so easy yet no one else is making EVs at scale in the USA, or landing rockets, 10 years after SpaceX did it.
Karpathy said in some podcast that Tesla uses LIDAR in training, and by doing this they can get a lot of the benefits. Not sure that all of the "world's experts" agree with you that you HAVE to use LIDAR. Rate of progress for FSD has been very impressive lately. I personally think that it's very plausible that Tesla might beat Waymo to large-scale, location-independent autonomous driving.
Waymo's recent experiment with multimodal models and a purely camera-based system (EMMA) validates some of the claims that using LIDAR data in training does help. Pretty neat! Still not as good as a LIDAR + RADAR based system.
It doesn't. It has a party trick that works in very specific conditions.
At least it works. Meanwhile Tesla have nothing to show, even in "very specific conditions".
But it works vastly better than anything Tesla has made so what does that say about Tesla?
From personal experience, the state-of-the-art Tesla vision FSD still can't drive east at sunrise, west at sunset, or in moderate rain. I haven't seen any sign of them solving that fundamental problem with vision, especially given there are existing non-vision solutions.
That's a bold claim. Care to justify it?
Yeah, it only works in extremely controlled environments driving really slowly.
The design is also flawed as it has to work with cameras anyway. The last thing you want is two systems arguing over what they see.
Extremely controlled environments like the entire city of San Francisco?
Sensor fusion is a thing. There are no two systems that “argue with each other”. I can’t believe the same old ignorant tropes are still making rounds.
Waymos don’t drive slowly, I don’t know where you’re getting this from. If anything, they drive too fast for a thing without a driver.
It doesn't have to be an argument. You know what each system is good at and prioritize inputs accordingly.
Nice, you just outed yourself as being completely clueless. There exist many good sensor fusion techniques for summing the output of disagreeing sensors.
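For anyone wondering what "summing the output of disagreeing sensors" can look like, here is a minimal sketch of one textbook technique, inverse-variance weighting of two independent estimates. This is not a claim about what Waymo or anyone else actually ships; the function name and all the numbers are made up for illustration.

```python
def fuse(est_a, var_a, est_b, var_b):
    """Inverse-variance weighted fusion of two independent estimates.

    The less noisy sensor (smaller variance) gets the larger weight,
    so the sensors do not "argue": each is trusted in proportion
    to how reliable it is assumed to be.
    """
    w_a = 1.0 / var_a
    w_b = 1.0 / var_b
    fused = (w_a * est_a + w_b * est_b) / (w_a + w_b)
    fused_var = 1.0 / (w_a + w_b)
    return fused, fused_var


# Hypothetical readings for the distance to an obstacle, in meters:
# a camera estimate of 42 m (variance 9) and a lidar estimate of 40 m (variance 0.25).
distance, variance = fuse(42.0, 9.0, 40.0, 0.25)
print(f"fused distance: {distance:.2f} m, variance: {variance:.3f}")
```

The fused estimate lands close to the more precise reading, and its variance is lower than either input's, which is the basic argument for adding a second sensor rather than removing one.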
Sounds like bad designs. If you can get rid of something to reduce complexity you absolutely should.
Did you forget why the 737 MAX had two crashes? The alert for a disagreement between the two sensors didn't work or wasn't there, so the system was relying on one sensor.
Except when getting rid of something results in a non-working system. Reduced complexity doesn't work as evidenced by Tesla's inability to have a single driverless mile after nearly a decade of development.
So there's a video of him addressing this - he doesn't hate the tech. He mentions that it's wildly expensive for cars. But they use it heavily for SpaceX.
The issue isn't that it's wildly expensive for cars. But rather for Tesla.
Because the company has promised that existing Tesla owners would be able to use FSD.
Having to retrofit them to add LiDAR sensors would be cost-prohibitive.
Also he wants to reuse the foundational machine vision tech in Optimus bot, which probably won't have lidar.
Based on presentations we've seen what sets Tesla apart are its datasets not the core technology.
And those don't translate across to the Optimus bot.
I think they will though, I think the enormous corpus of video data and the supercluster that powers self driving development are the machine vision analog of internet scale text data that gave rise to LLMs. We'll see the same moment for vision models that text prediction models had once the data is there, where an enormous foundation model becomes much much better, especially at zero-shot tasks.
FSD is already using the fruits of this today with their end to end NN.
And based on what we've seen the results haven't improved enough to put them close to Waymo.
Optimus should probably have LiDAR more than a car…
I would guess the plan is to have the foundational machine vision tech that becomes the core of robotics sensors. Not just Optimus but every robot arm in a factory, robot mule, etc. I don't think everything will have LIDAR if it's proven to be unnecessary.
The foundational tech Tesla is using is the same as everyone.
We know this because there have been public presentations about it.
And inventing groundbreaking new tech is so far the domain of academia and large, well funded R&D labs. And almost always shared.
It's not just Musk. Most automobile manufacturers have maintained that they need to find a way to do it with cheap and pretty sensors.
This is simply not true. Let's look at the best autonomous driving features available today, i.e. level 3:
Mercedes Drive Pilot: Uses a lidar (and a dummy unit) up front.
BMW Personal Pilot: Uses a lidar (and a dummy unit) up front
Honda SENSING Elite: Uses 5! lidars
They all use lidar, and some of the placement locations are downright hideous (Mercedes EQS). I think further development will require even more/better sensors, and manufacturers tend to agree on this point.
What are the benchmarks that say Mercedes, BMW, and Honda have the best level 3 features?
I ignore the Chinese because it is difficult to get reliable English information. Apart from those, these are the only level 3 systems available, and level 3 is the most advanced system that private individuals can currently get their hands on. Have I missed any?
It's not a benchmark, but there is a youtube channel (Out of Spec) which tests these systems, and I think they also say Mercedes are the best in their "Hogback challenge".
https://www.youtube.com/watch?v=xK3NcHSH49Q&list=PLVa4b_Vn4g...
Worth checking out, many cars are very bad.
All of these are far less capable than FSD. They might have more advanced regulatory approval because they have strong limitations on when they can be used, but if you drive the same route and compare, it's not even close.
I doubt it. Yes, FSD is more flexible and can also drive reasonably well on city streets, but there is a reason why it is not certified for level 3 on motorways. It would most likely fail certification. With a level 3 system, I can take my eyes off the road and watch a movie. Doing that with FSD, even in the best conditions, is suicidal. Level 3 vehicles must have an extremely low failure rate. Any crash would quickly be picked up by the media.
FSD is a versatile level 2 system, but at best a prototype for level 3. If we are talking about prototypes, it has to be compared to prototypes from other manufacturers like this <https://www.youtube.com/watch?v=0uSph0asNsk> fully autonomous system from ... 11 years ago. The reason FSD is available to the average consumer is mostly a matter of philosophy, not technology.
> With a level 3 system, I can take my eyes off the road and watch a movie. Doing that with FSD, even in the best conditions, is suicidal.
That is hyperbole at best. I've test driven a Tesla with FSD and it worked flawlessly, such that I would have been perfectly safe taking my eyes off the road. Of course one test drive is not sufficient data to say one should trust the system all the time, but you are making the claim that it is never trustworthy which isn't true.
Oh, it's 100% trustworthy until it suddenly isn't.
I have driven a number of level 2 cars on the motorway and almost all of them can do extended zero-intervention driving, but that does not make them safe. The failure rate compared to humans is still sky high.
Multiple independent FSD tests have shown that you need to take over several times an hour to avoid dangerous or illegal situations <https://electrek.co/2024/09/26/tesla-full-self-driving-third...>. The number will be lower on a motorway and you will sometimes have time to correct even if you are not looking, but the number of failures is still significant. If you take your eyes off the road, it is only a matter of time before you end up in a ditch.
I stand by my statement. The system is _never_ trustworthy enough to take your eyes off the road.
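To put "only a matter of time" into numbers, here is a rough sketch. It assumes that unnoticed critical failures arrive independently at a constant rate (a Poisson model), which is a simplification, and the rate used below is a hypothetical placeholder, not a measured figure for FSD or any other system.

```python
import math


def prob_at_least_one_failure(failures_per_hour, hours):
    """Probability of one or more failures in `hours` of driving,
    assuming independent failures at a constant rate (Poisson model)."""
    return 1.0 - math.exp(-failures_per_hour * hours)


# Hypothetical rate: one critical failure per 10 hours of eyes-off driving.
HYPOTHETICAL_RATE = 0.1

for hours in (1, 10, 100):
    p = prob_at_least_one_failure(HYPOTHETICAL_RATE, hours)
    print(f"{hours:>4} h of eyes-off driving: {p:.1%} chance of at least one failure")
```

Under this model, even a rate that feels flawless on a single test drive approaches certainty over a commuter's year, which is why one good drive says little about trustworthiness.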
Maybe they changed their mind on it in the last 10 years. My sources were a high-ranking BMW manager and an Audi one, each of whom made such a statement in a public lecture at a university.
After a bit of research, I found out that they apparently did. Obviously every manufacturer would like to be able to use only proven technologies such as cameras and radar because they are cheap. One of the early Mercedes prototypes seemingly didn't have lidar <https://www.youtube.com/watch?v=DlgGTi4Gs50&t=79>.
Since then, the consensus has been that without lidar, the systems would not meet safety standards. For example, the cars need to be able to detect fairly flat objects, such as pallets that have fallen onto the road, which are very difficult to see optically, especially in difficult lighting conditions. For this reason, and because the technology has come down in price, virtually everyone developing advanced driving systems, except Tesla, is using lidar.
This development is nearly a decade old. It is for this reason, combined with the overwhelming amount of Musk-related nonsense, that I objected so strongly.
> have maintained that they need to find a way to do it with cheap
If the goal is to make roads safer, aiming for cheap is good: it means more people can afford that safer car. If it's not safer than humans, it should not be on the road in the first place.
If you want conventional car utilization where the car sits in a parking spot most of the time then the extra cost from the lidars is much more of an issue than if you're operating a fleet that is acting as taxis most of the day.
Theoretically if a human can drive a car using a pair of eyes connected to brain, it should be possible to do that using two cameras connected to some kind of image processing unit.
> Theoretically it should be possible to do that using two cameras connected to some kind of image processing unit
That "some kind of image processing unit" in humans has an awful lot of compute power and software.
If you remove $100k of sensors but have to add $200k of compute to run more advanced computer vision software, then it's a bad tradeoff to use only cameras, even if in theory that software is possible.
In theory. In practice neither the cameras nor processors available in cars function anywhere near human level.
It's not even entirely true in theory. We use a lot of our senses when driving. Force feedback on the wheel. Sounds from the environment. Inertial senses. And our vision isn't fixed, it's constantly moving.
And yeah, as you mention, cameras don't really have the same level of range our eyes have and computers don't operate in the same way.
Only if we want the self-driving computer to be, at best, as good as a human. I can't see in the dark, can't see through fog, and have trouble with rain. Why is human visibility the bar to meet here?
Because we allow humans to drive, therefore if something can perform as well as a human it should be allowed. The bar is a floor, not a ceiling.
Theory isn't really all that applicable to this though - in theory nothing is stopping anyone from writing all code in assembly, but obviously that doesn't happen.
I think, more practically, cars have been adding driver assistance features for a while now - more cameras, blind spot monitoring, ultrasound for parking, lane drift indicators.
It is therefore not unreasonable to assume that adding more sensors is helpful (but even the old adage of more data is better than less would probably say that).
To be honest, it's possible that having too much data just causes problems for quick decision-making. Any redundant data will only slow down processing pipelines.
In practice humans aren't particularly safe drivers.
Is that because their vision fails to provide the information necessary to drive safely? Or is it due to distraction and/or poor judgment? I don't actually know the answer to this, but I assume distraction/judgment is a bigger factor.
I'm not a fan of the camera-only approach and think Tesla is making a mistake backing it due to path-dependence, but when we're _only_ talking about this in _broadly theoretical_ terms, I don't think they're wrong. The ideal autonomous driving agent is like a perfect Monday morning quarterback who gets to look at every failure and say "see, what you should have done here was..." and it seems like it might well both have enough information and be able to see enough cases to meet some desirable standard of safety. In theory. In practice, maybe they just can't get enough accuracy or something.
> Is that because their vision fails to provide the information necessary to drive safely?
In certain conditions, yes. Humans drive terribly in dark and low light, something lidar excels in.
Still, millions of humans drive every night and only a minuscule percentage cause any accidents. So maybe we are not so bad at this.
According to NHTSA, about half of all fatal crashes occur at night, even though only 25% of driving happens at nighttime. So yes, we are pretty bad at this.
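Taking the two figures quoted above at face value (half of fatal crashes at night, a quarter of driving at night), the implied per-mile risk can be worked out directly. A quick sketch, with the function name being my own and the inputs being only the shares from this comment:

```python
def night_vs_day_risk(night_crash_share, night_driving_share):
    """Ratio of the fatal-crash rate per unit of driving at night
    to the rate during the day, given the two shares."""
    night_rate = night_crash_share / night_driving_share
    day_rate = (1.0 - night_crash_share) / (1.0 - night_driving_share)
    return night_rate / day_rate


# Shares as quoted in the comment above: 50% of fatal crashes, 25% of driving.
ratio = night_vs_day_risk(0.50, 0.25)
print(f"night driving is about {ratio:.1f}x as deadly per unit of driving")
```

With those inputs the ratio comes out to about three. Note that this does not separate poor visibility from other night-time factors such as fatigue or alcohol.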
I totally agree, I think most accidents are caused by human nature (especially slow reaction time in specific conditions like being tired or drunk) and ignoring the laws of physics (driving too fast). And some are just pure bad luck (something/someone getting on the road right in front of the car).
Sure, but why strive for that? We can have better than human perception by adding lidar and radar.
Because Musk thinks he is much, much smarter than he actually is and refuses to listen to anyone. And between how many people he fired at Twitter, Tesla, and soon the US Federal Government, I think he gets off on it.
Musk has said several times Lidar is great. It's just a stupid idea for automotive use and he's not wrong.
There's nothing similar in nature for a reason.
Airplanes don't flap their wings and boats don't wag their tails.
Assuming that all technology should imitate nature is a naive engineering principle. The solution should solve the problem within the given constraints.
Nature came up with something much better in both those cases.
Portable, energy efficient, light, doesn't need refined oil, tightly steers...
Boats and aeroplanes are terrible in comparison. They only work due to a huge network of global effort.
>They only work due to a huge network of global effort.
And horses don’t need roads like cars do and cars only work thanks to a huge network of global effort. What point are you trying to make? That we abandon planes until we can develop flight as efficient as nature? Abandoning LIDAR until we can develop visual light perception and processing equal to the human eye and brain?
I don't see many birds around able to carry an extra 280,000lbs for 2,300 miles without having a meal.
Time of flight ranging is used in nature by bats and whales/dolphins.
My back of the napkin estimate is that a human using time of flight ranging would be unable to distinguish between an object directly in front of their face and 8.6 meters away[1]. I think human echolocation uses a different mechanism (presumably relating to amplitude)?
Skimming the Wikipedia article[2], it seems like animals do use time of flight, but also Doppler shifting.
(As a side note, some animals have apparently evolved active countermeasures to echolocation!? It seems obvious in retrospect but incredibly cool.)
There's interesting research into the mechanisms of human echolocation [3], but it was over my head. My impression was that the jury is out as far as the precise mechanisms involved but that there's a lot of evidence to be considered, I'm sure someone with a better background would get more out of it than I did.
(I'm just curious about the mechanism, I agree that LIDAR has natural analogs.)
[1] Speed of sound * 25ms, 25ms being the rule of thumb I've memorized for the minimum interval for two sounds to register as distinct from each other. This is just folk wisdom I've picked up hacking on audio, so perhaps I'm mistaken.
[2] https://en.wikipedia.org/wiki/Animal_echolocation
[3] https://durham-repository.worktribe.com/preview/1375913/1963...
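For reference, the arithmetic behind footnote [1] as a small sketch. The speed of sound used here (343 m/s, roughly dry air at 20 °C) is my assumption, since the footnote doesn't give one. I've also included the round-trip case: if the 25 ms is the gap between emitting a click and hearing its echo, the sound travels out and back, so the corresponding distance would be half.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 °C
MIN_INTERVAL_S = 0.025      # the 25 ms rule of thumb from footnote [1]


def one_way_distance(interval_s, speed=SPEED_OF_SOUND_M_S):
    """Distance sound covers in `interval_s` (the footnote's calculation)."""
    return speed * interval_s


def echo_range(interval_s, speed=SPEED_OF_SOUND_M_S):
    """Distance to a reflector whose echo returns after `interval_s`;
    the sound travels there and back, hence the division by two."""
    return speed * interval_s / 2.0


print(f"one-way:    {one_way_distance(MIN_INTERVAL_S):.2f} m")
print(f"round trip: {echo_range(MIN_INTERVAL_S):.2f} m")
```

The one-way figure matches the 8.6 m estimate; whether the one-way or the round-trip reading applies depends on what the 25 ms threshold is measuring.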
Rotor wings are also not found in nature, yet they give us the ability to navigate short-distance 3D space better than fixed wings.
Nature makes for bad drivers. For some age groups, cars are the largest cause of death. I think self-driving can do better.