Awesome work.
I've only watched the demo, but since there are several agent-decided steps in the model generation process, I think it'd be useful for Plexe to check in with the user between steps to confirm they're happy with the plan, so it's more interactive rather than a single, large one-shot run.
For example, Plexe could tell the user which features it plans to use, and the user could request changes before that step is executed.
I also wanted to ask how you plan to scale to more advanced (case-specific) models. I see this as a quick and easy way to get simpler models working, especially for less ML-experienced people, but I'm curious what would change for more complicated models or demanding users?
Agree. We've designed a mechanism to enable any of the agents to ask for input from the user, but we haven't implemented it yet. Especially for more complex use cases, or use cases where the datasets are large and training runs are long, being able to interrupt (or guide) the agents' work would really help avoid "wasted" one-shot runs.
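To make that concrete, here's a minimal sketch of the kind of agent-to-user check-in we have in mind; every name here (InteractionChannel, UserInputRequest, the example features) is hypothetical and not part of plexe's actual API. The interesting design question is what the agent does if the user doesn't respond: proceed with a stated default, or park the run until they do.

```python
# Hypothetical sketch of an agent-to-user check-in; these names are not
# part of plexe's actual API.
from dataclasses import dataclass


@dataclass
class UserInputRequest:
    agent: str      # which agent is asking, e.g. "planner"
    question: str   # what it wants the user to confirm or decide
    default: str    # what happens if the user doesn't intervene


class InteractionChannel:
    """Lets an agent pause and ask the user before committing to a step."""

    def ask(self, request: UserInputRequest) -> str:
        # A real implementation would surface this in the CLI/UI and block
        # (with a timeout) until the user responds or the default kicks in.
        print(f"[{request.agent}] {request.question} (default: {request.default})")
        answer = input("> ").strip()
        return answer or request.default


# Example: the planning agent checks in before feature engineering starts.
channel = InteractionChannel()
decision = channel.ask(UserInputRequest(
    agent="planner",
    question="Planned features: tenure, monthly_charges, contract_type. Proceed?",
    default="yes",
))
```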
Regarding more complicated models and demanding users, I think we'd need:
1. More visibility into the training runs: log more metrics to MLflow, visualise the state of the multi-agent system so the user knows who is doing what, and so on.
2. Give the user more control over the process, both before the build starts and while it's running, including the ability to override decisions made by the agents. This will require the mechanism I mentioned for letting the user and the agents send each other messages during the build.
3. Run model experiments in parallel. Currently the whole thing is single-threaded, but with better parallelism (and potentially launching the training jobs on a separate Ray cluster, which we've started working on) you could throw more compute at the problem. There's a rough sketch of what this could look like below.
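For points 1 and 3, the rough shape could be something like the following: candidate experiments launched as parallel Ray tasks, each logging its metrics to MLflow. The candidate models, dataset, and function names here are placeholders for illustration, not plexe's actual internals.

```python
# Rough sketch of running model experiments in parallel on Ray and logging
# metrics to MLflow; hypothetical structure, not plexe's actual internals.
import mlflow
import ray
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

ray.init()  # or ray.init(address="auto") to attach to a separate cluster


@ray.remote
def run_experiment(name, model):
    """Train one candidate model and log its score as an MLflow run."""
    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    score = cross_val_score(model, X, y, cv=3).mean()
    with mlflow.start_run(run_name=name):
        mlflow.log_param("model", name)
        mlflow.log_metric("cv_accuracy", score)
    return name, score


candidates = {
    "logreg": LogisticRegression(max_iter=500),
    "random_forest": RandomForestClassifier(n_estimators=200),
    "gbdt": GradientBoostingClassifier(),
}

# Launch all experiments concurrently instead of one after another.
futures = [run_experiment.remote(name, model) for name, model in candidates.items()]
results = ray.get(futures)
print(max(results, key=lambda r: r[1]))
```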
I'm sure there are many more things that would help here, but these are the first that come to mind.
What are your thoughts? Anything in particular that you think a demanding user would want/need?
Those sound like great next steps! Even without parallel runs, just having the ability to tweak things during model builds would be super valuable; sort of like how many of today's AI IDEs (e.g. Cursor) make changes in increments and, crucially, always ask you whether to proceed or not (and how). At least that's what comes to mind first!
Absolutely! We started off thinking we wanted to automate the whole process in one go, and then add restrictions and interruptions based on the areas where users run into issues. This is great feedback, so thank you!