Devin review: is it a better AI coding agent than Cursor?


Steve (Builder.io)

Read the full review: https://www.builder.io/blog/devin-vs-cursor

Video Transcript:

I paid the $500 a month to use Devin, the AI coding agent, so you don't have to. Let's compare it to Cursor agents and see if it's worth the $2 billion they valued their company at. The main thing to know about Devin is that it's primarily a Slack-based workflow, so it's not an IDE. You tag Devin in Slack and ask Devin to update something, fix something, etc. Devin's own interface includes a remote server, a browser, a VS Code editing interface, and the planner, and you can follow along step by step to see what it did and what

it's doing. In my case, I heard about this new image generation model that's supposed to be small enough to run on consumer-grade hardware. I was hoping for a basic web UI, but then I found all this and realized I don't code in Python and don't know what to do with this, so I asked Devin. Devin went to work, cloned the repo, got it spun up, and generated an image of a cat for me. It even attached the image back to me. I then asked for four more images of a dog riding in

a hot air balloon, and I got my images in full terrifying quality. Now, that's not Devin's fault, of course; that's the model we're using. I saw that one of the to-dos on this repo was to create a local real-time interactive app, so I asked Devin if he could do it: clone this repo and add a web-based UI to type prompts and see images. Devin began spinning things up and sending me updates. One really interesting thing Devin does is take notes and store them in this notes.txt file to refer back to and use in subsequent

prompts. This seems like an interesting technique to summarize information that's important and carry it across subsequent steps. Devin will also sometimes create knowledge entries, which are like bits of information that could be useful to refer back to in subsequent runs. It'll store these and look them up when needed, which is supposed to emulate the tribal knowledge that exists within a team. I will say that overall Devin's pretty impressive: it creates plans, it writes code, it finds the bugs in the code, it corrects the code, and it even runs its own end-to-end tests to verify it works.
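
To make the note-taking idea concrete, here is a minimal sketch of how an agent could carry notes from one step into the next prompt. This is not Devin's actual implementation, which the video doesn't show. The `NotesFile` class, its method names, and the prompt layout are all hypothetical and exist only for illustration; the only detail taken from the video is that notes live in a notes.txt file and get reused in later prompts.

```python
from pathlib import Path


class NotesFile:
    """Hypothetical helper: append-only notes that are prepended to later prompts."""

    def __init__(self, path):
        self.path = Path(path)

    def add(self, note):
        # Append one note per line so earlier notes are never overwritten.
        with self.path.open("a", encoding="utf-8") as f:
            f.write(note.strip() + "\n")

    def read(self):
        # Return all saved notes, skipping blank lines; empty list if no file yet.
        if not self.path.exists():
            return []
        text = self.path.read_text(encoding="utf-8")
        return [line for line in text.splitlines() if line]

    def build_prompt(self, task):
        # Put the saved notes in front of the new task so context carries over.
        notes = self.read()
        if not notes:
            return task
        bullets = "\n".join(f"- {n}" for n in notes)
        return f"Notes from earlier steps:\n{bullets}\n\nTask: {task}"


if __name__ == "__main__":
    import tempfile

    # Demo in a throwaway directory; the file name mirrors the one in the video.
    with tempfile.TemporaryDirectory() as tmp:
        notes = NotesFile(Path(tmp) / "notes.txt")
        notes.add("User wants iOS-style UI for the weather app")
        print(notes.build_prompt("Add a city search box"))
```

The design choice worth noticing is that the notes are plain text on disk, so they survive between runs and a human can read or edit them, which fits the "tribal knowledge" framing above.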

It'll even respond to your feedback if you find issues and attempt to address it. Anything you reply in Slack, Devin will start working on and reply to. In this case, it was able to verify we were hitting deployment issues. I kept working on debugging it, but unfortunately, after a lot of back and forth, it still never was able to solve it, and eventually I gave up because I was sick of trying. I then asked whether I could just pull this code down and run it locally, and it gave me instructions, but they weren't

valid because it didn't actually send this code in a pull request. Now, that's not to say that Devin can't do a pull request. In one of my very first runs of Devin, I had it add a feature to a weather app, and it was able to add the feature I wanted as well as respond to my feedback that I wanted it to look more like iOS styling. The final pull request is not bad. It added two packages, the styling does look iOS-like, and the code is pretty good, but there's a console log in the code

and it forgot to uninstall a package that it no longer needed after my feedback. But we can go in and just leave comments like a normal person, like "remove this log," or point out that this package is no longer needed. One cool thing Devin did when we were going back and forth on what the UI of this weather app update should be is that, without me asking, it actually generated a deployment with a preview URL, so when I type in a city I can see that the feature I wanted has an

iOS style like I asked. So even though I don't actually have a deploy preview set up on this repo, it deployed a version for me to see anyway. When it learned I wanted an iOS style for this app, it proposed this to save in the knowledge, and I can review and approve it, and it'll remember that during subsequent runs. For some reason, though, I couldn't get Devin to reply to my feedback, even though I've seen it do it before. I don't know what went wrong this time. I have had a few bugs along the way, but

nothing super crazy that I couldn't usually work around. A separate task I asked of Devin was to fix a bug in our Next.js website. It spun up a PR with a fix, finding this Boolean it needed to update from true to false, but then it updated some other stuff I didn't expect, like this fallback: true, even though getBuilderStaticPaths already sets fallback to blocking, as well as removing this check even though we already turned that value to false, and adding a type declaration that I know firsthand isn't needed. The cool part is I

asked in the PR why it did this, and Devin added the eyes emoji to tell me it sees this, and then it explained itself. I'll be honest, I was kind of hoping it would fix those things, but it did provide a thorough explanation. It just wasn't a good one. Most of this information is not actually true. Setting fallback to true does not enable client-side navigation or enable Builder.io's preview system, and fallback blocking, which is already used, is our preference. Also, the Tabler Icons React type definition is just not needed; it's included in the package. It made

some weird comment that these components are part of the client-side navigation system, whatever that means. But the nice part is I can talk to Devin like a human, leave a comment, and it can make updates accordingly. Hopefully this time it'll see the comment at least. Maybe my Devin session ended and I need to resume it somehow, I don't know, but it's pretty cool when it works. The last thing I asked Devin to do is implement a backend feature. I said: add to our GraphQL admin API the ability to read and write from

the comments collection, and Devin created a PR that was decent. It adds this reflect-metadata package that I don't think is needed, since we haven't needed it to date, but most importantly it did recognize that we use this resolver structure. It created a comment resolver and added it, and this code actually looks pretty typical of how we've written this on the back end. Now, it did make up a couple of fields, and it would have been nice if it had asked me what the schema is, but otherwise I'd say this is decent code. Now, overall, I'd say the biggest problem I have

with Devin is that this is just not my preferred workflow. I don't want to make an ask, wait 15 minutes for a pull request, and then have this back and forth on the pull request. I much prefer Cursor's workflow, where I have all of this right in my local environment, in my IDE. I can see the updates in real time, and I can commit and debug locally without jumping to some remote server and another set of tools I don't know, and without all these long waits and delays that are just unfamiliar and unproductive. I get that

the idea of Devin is to set some asynchronous agent co-workers off at a task, let them do lots of things in parallel, and have them just come to you with results, but that really isn't a great workflow until Devins are a lot better. I don't want the AI to just go off and do its thing and come back only when it's done unless I have high confidence it's going to be really, really reliable at that. Otherwise, I'd prefer my IDE just do it. So let's now try Cursor agents to fix the client-side routing bug. The big

difference between Cursor agents and the standard composer view is that you don't have to manually add files to the context. Cursor will scan your codebase, find the relevant files, and add them for you. Cursor was able to find this no-client-side-routing variable and flip it to false, and if I accept the updates, we can see it did exactly what we wanted: one basic, minimal diff. Cursor is not always perfect, but the part I like most is that I'm in control and in the driver's seat, and if I want something different I could

also say, "Just delete that variable and all references altogether," and I can see the update immediately. There's less waiting and more action. And while I'm more closely in the loop, I have more trust in this process because I know what I want. If it can scan my code, update multiple files, and not make me worry about the details, and I can provide real-time feedback, make hand modifications, and send the pull request my way, that's a much easier workflow to adopt for me and my team. So now if I look at the

code, we've now removed that variable entirely. It's totally gone, and I can commit and send a pull request like I always would. Now it's more clear who owns the pull request: it's me. I find this process faster, easier, and nicer. We don't have weird bots creating pull requests where it's unclear who actually owns them and who's responsible for making sure the code is good. Nobody has to clone down that bot's PR and push updates to it, and every update happens pretty quickly. I also tried the GraphQL prompt with our very large internal repo and Cursor

agent as well, and I got very similar results. It added the comments resolver, integrated it into the API, and added the types as well. So, pretty similar results, and what you'd expect with Cursor's composer view, but again, because of the agent mode, I didn't have to specify files. I just typed my prompt and it happened. That was nice. Now let's try a more agentic workflow where we have it clone this image generator model repo. You'll see the main difference between Cursor agents and Devin is that it asks me before it runs any commands. Cursor is

generally more cautious than Devin, which is nice because it's running on my local machine, but sometimes I also wish it would just run this stuff for me. I've noticed that if it catches an error, it'll automatically try to fix it, and I've seen it be successful at that, which is great. Now it's written the code, which I'll accept. It found an error, and it's rewriting the command accordingly. Now, unfortunately, my computer froze before I could show you whether Cursor was able to finish that task. It looked to me like it was generating the image fine, but it

turns out that model is meant for a real GPU, not for burning through my laptop CPU like I was trying to do. Now, overall, I don't think Devin will take off like Cursor, and it's not just because of the $500 a month starting point. Cursor is so much easier to adopt, and I like their incremental approach. Devin is trying to jump too far and raise all this money, saying there's this all-new way to build software with agents, and it just wasn't my preferred workflow. Maybe one day, when LLMs are even better

and agents are extremely reliable, but I'm not sure the rate of progress will get us there really soon, and I personally believe more in Cursor's incremental approach than Devin's let's-change-everything approach. My preferred workflow looks more like this: a developer works iteratively with Cursor, and other teammates, like designers, iterate with their tools. Products like Builder.io can convert designs to code and also patch in design updates as they're needed, and ultimately your workflow doesn't change much. You're still coding and debugging locally, and you're pushing changes as needed. But I will say that I'm excited to

have a new player in the agent coding space to push Cursor even further, and I can't wait to see what comes out as a result of this. But that's my quick take. From everything you saw, what do you think? Let me know in the comments, and if you made it to the end and you want to see more videos like this, be sure to like and subscribe.