Vibe Coding Tutorial and Best Practices (Cursor / Windsurf)
33.17k views · 4,369 words
Matthew Berman
Got a lot of questions asking about my stack and what I do when vibe coding. So I made a full video ...
Video Transcript:
I've been vibe coding like crazy lately, which basically just means using an AI agent to do my coding for me, with very little actual writing of code (in fact, basically none), and I'm going to tell you all of the things I learned along the way to get the most out of it. Let's get right into it.

All right, so quickly, what I'm talking about with vibe coding, or really agentic coding, whatever you want to call it, is using the agent-based coding in Cursor or Windsurf. You can see it right here: there's Agent, Ask, and Edit. This is not you starting to type some code, hitting Tab, and having it finish that piece of code for you. This is me literally trying to get AI to write the entire application end to end, so it's a little bit more difficult to do, and there are definitely some limitations.

Now, first: what is my setup? Either Cursor or Windsurf, and I have almost exclusively been using the Claude 3.7 models, mostly focused on the thinking version, so I'm using Claude 3.7 Sonnet Thinking right here. In Cursor and Windsurf you can set up really any model you want. To do that in Cursor, you go up to Cursor Settings, where it comes with a predefined list of models you can choose from, all provided by Cursor directly (or by Windsurf), and you can also add models down here. The way you add custom models, and it's not super clean, is essentially by overtaking the OpenAI API key field: you turn it on, add your key, and override the base URL. You can only do that once, which means you can only really have one non-OpenAI API service, but that's okay. Here I use Grok, so I put my Grok key in. Grok is formatted using the OpenAI API standard, so it just works (there's a small sketch of this below).

Now, whatever model you choose, if you're using it with the agent feature, you want to make sure it really supports agentic behavior, function calling, and tool calling well. Most models don't; Claude 3.7 Thinking does.
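To make the base-URL override concrete, here's a minimal sketch of the same trick in plain Python: any provider that speaks the OpenAI API standard can be reached through the standard OpenAI client. The endpoint shown is xAI's documented OpenAI-compatible URL, and the model name and environment variable are assumptions; swap in whatever your provider lists.

```python
# A provider that follows the OpenAI API standard "just works" once you
# override the base URL, which is the same thing Cursor's settings do.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["XAI_API_KEY"],   # assumed env var for your key
    base_url="https://api.x.ai/v1",      # xAI's OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="grok-3",  # check your provider's model list; name may differ
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)
```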
So, what I would recommend doing first is getting a very specific spec written. That just means a very detailed explanation of what you want built, and I'm actually using Grok 3 for that: "Write a spec for an application that will be a Twitter clone. Be as specific as possible. We're going to be using Python for the back end." I'll just hit Enter, and I work with the AI separately to help me write that spec. Here you can see it writes out not only the technical specification but also how the app operates, how it works, etc. You can even see it writing out the database schema and the API endpoints, so this is really complete.

Once it's done, you basically take all of that and paste it into Cursor. I'll show you what that looks like. We have a brand new Cursor window; you come up to the top right and click this toggle to open the AI pane, which brings up the chat window. If you have a recent version of Cursor, Agent will be the default. I have Claude 3.7 Sonnet Thinking as my model.

Here's the important part. We're going to paste our spec in, but before doing that, one other thing I learned, and this is especially important as the codebase gets larger, is to use rules: Cursor rules or Windsurf rules. Both of the IDEs support rules, and rules are basically a way to tell the agent how you want it to code, what technologies it should use, and what workflows you want. This was a huge discovery for me, because frequently I would try to build something and the agent would build it using some technology that wasn't already in my stack, and it would break other things. If there was a bug and I was trying to fix it, the agent would try to fix it with a different technology. A good example: I wanted to use a SQL database for everything, but every time I ran into an issue with the SQL database, the agent would try to fix it by switching to JSON storage in a file. It was just super annoying, because I would think it was working, and then nothing would be appearing in the database, and so on and so forth.

All right, switching back to my actual project, I'm going to show you where the rules live. In Cursor Settings you go down to Rules. You can have user-specific rules, which apply to any project, but I like adding them as project rules. You add a new rule and it essentially creates this .cursor folder, and within that a rules folder. All of these MDC files are the rules, and you can reference them kind of like a system message for the AI to work with.

Thanks to the sponsor of this segment, Mammouth: access the best generative AI for just $10 per month. This includes the best LLMs, like Claude, DeepSeek, GPT-4o, Llama models, Mistral, Gemini, and Grok, plus reasoning models like DeepSeek R1 and o3-mini. For that same price it also includes the best image-generation models: Midjourney, Flux Pro, Recraft, DALL·E, and Stable Diffusion, all in one place. You can create custom assistants to help you on your projects; these are kind of like agents, where you give them all your custom context and they will know what you need them to do. You can install it on any device (Apple, Android, Windows, Linux), you can do one-click re-prompting, and so much more. So definitely check out Mammouth; they've been a great partner to this channel. Tell them I sent you, links down below. Now back to the video.

So let me show you my coding preferences. This is how a rule is formatted: an all-natural-language description (there's a rough sketch of one just below). You can auto-attach different files, but I didn't use that. These are my coding pattern preferences.
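For illustration, here's a rough sketch of what one of these rule files might look like, say .cursor/rules/coding-preferences.mdc. The location and the front-matter keys follow Cursor's MDC convention, but this file name and wording are hypothetical, assembled from the preferences discussed next:

```
---
description: Coding pattern preferences
globs:
alwaysApply: true
---

- Always prefer simple solutions.
- Avoid duplication of code; check other areas of the codebase for
  similar functionality before writing anything new.
- Keep dev, test, and prod environments separate.
- Only make changes that are requested or clearly related to the request.
```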
Let me go through each of these and tell you why I added them, because hopefully it will save you a lot of time to understand why these things are necessary. They reflect some of the tendencies these AI coding agents have that don't really work all that well.

"Always prefer simple solutions." That speaks for itself.

"Avoid duplication of code whenever possible, which means checking for other areas of the codebase that might already have similar code and functionality." What I found is that when adding new functionality, or trying to fix old functionality, the agent would often duplicate code, thinking that the code did not already exist. So now I'm explicitly saying: make sure you check everywhere, make sure this code doesn't already exist, and if it does, fix that rather than adding new code.

Another thing I struggled with is that the AI agents didn't have a good sense of the split between dev, test, and prod environments, and this was a huge, huge learning for me. I would make some change and write some tests for it, the tests would affect the dev environment, the production environment would be trying to use local stuff, and it was just a mess. So now I say: in everything you do, take into account that we want separate dev, test, and prod environments (there's a sketch of one way to wire that up below).

Here's another one: "You are careful to only make changes that are requested, or that you are confident are well understood and related to the change being requested." The issue I ran into is that I would ask for a change to one little piece of functionality, and like three other things would just break for no reason, because the agent was touching unrelated pieces of code. So I'm really emphasizing: only focus on the thing I'm asking you to focus on.

Next, re-emphasizing that, don't introduce new things: "When fixing an issue or bug, do not introduce a new pattern or technology without first exhausting all options for the existing implementation. And if you finally do, make sure to remove the old implementation afterwards so we don't have duplicate logic." Again, I kept finding that something would be done one way, it would not work perfectly, I would say "fix that," and rather than fixing it, the agent would rewrite it in a completely different way, thinking that new code would hopefully fix it.

"Keep the codebase very clean and organized." I would often find disorganization in the code.

"Avoid writing scripts in files if possible, especially if the script is likely only to be run once." What I found, again, is that when the agent was testing something, say one of the API endpoints or one of the pages was broken, it would write a script to test it, which is fine, but then it would just leave the script in my codebase, and I ended up with a bunch of one-off scripts that would likely never be used again. So I want it either to write the script and execute it inline, or to delete the file once it's done with it.

All right, next, this is an obvious one: "Avoid having files over 200 to 300 lines of code; refactor at that point." That's not usually a good hard-and-fast rule to have, but for AI I think it's appropriate. I would frequently find files that were just absolutely massive, and by the time I noticed and asked the agent to refactor, the refactoring would break all the tests and take a long time to sort out. So just get ahead of it: as soon as a file looks like it's about to cross that line count, go ahead and refactor.
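To make the dev/test/prod separation concrete, here's a minimal sketch of one way to wire it up in Python. The APP_ENV variable and the database names are illustrative assumptions, not taken from the video:

```python
# Select a database per environment so tests can never touch dev or prod
# data. The variable and file names here are hypothetical.
import os

DATABASES = {
    "dev": "sqlite:///app_dev.db",
    "test": "sqlite:///app_test.db",
    "prod": os.environ.get("PROD_DATABASE_URL", "sqlite:///app_prod.db"),
}

ENV = os.environ.get("APP_ENV", "dev")
if ENV not in DATABASES:
    raise ValueError(f"Unknown APP_ENV: {ENV!r}")

DATABASE_URL = DATABASES[ENV]
```

Run the test suite with APP_ENV=test and it gets its own database; the dev and prod stores stay untouched.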
I also found that it would stub and mock data like crazy, which basically just means having kind of backup fake data, and that doesn't really work in dev or production environments. Let's say I asked it to go scrape an article, and for whatever reason the scraping failed: it would catch the error, fall back to using mock data, and think everything worked. But it didn't work, and that is not what I wanted the behavior to be. What I really wanted was for it to actually scrape the data properly, yet it would always use mock data, and it was really frustrating. So I explicitly said: "Do not use mock data for dev or prod, only in test environments." And I kind of doubled down and said it again: "Never add stubbing or fake data patterns to code that affects the dev and prod environments." (There's a sketch below of what confining fake data to tests looks like.)

Next, I actually found it overwriting my .env file, so I had to keep getting new API keys. I just said: don't do that.

Another rule is my stack, and I could probably get even more detailed with this, but it's simply: what is your technical stack? Because, again, I would find that if SQL didn't work, the agent would just switch to using a JSON store in a file, which was not what I wanted. So I said: Python for the back end; HTML and JS for the front end; SQL database, never JSON file storage; separate databases for dev, test, and prod (again, just re-emphasizing that); Elasticsearch for search, using a hosted version, because whenever I mentioned Elasticsearch it would assume I wanted to run it locally (sometimes it would use a hosted version, but I only wanted hosted); and Python tests, which I'll get to in a moment.

Next I had my coding workflow preferences. There's a little bit of duplicated information here, but hopefully it just gives the agent more guidance. "Focus on the areas of code relevant to the task. Do not touch code that is unrelated to the task." Obviously, you can see I ran into a lot of issues with this. "Write thorough tests for all major functionality." I was doing this manually: I would have it write some code and then say, "OK, now write tests for it," and sometimes I would forget, so now it should just write tests for all major functionality by default. "Avoid making major changes to the patterns and architecture of how a feature works, after it has been shown to work well, unless explicitly instructed" (the rule on screen has a typo; it should say "instructed"). Again, I would say, "hey, fix this issue," and to fix it, the agent would throw away the original pattern and rewrite the feature from scratch using some other pattern or tech, and I didn't want that; I wanted it to just fix what was already there. And finally, "always think about what other methods and areas of code might be affected by code changes."

So those are the rules. You can add more and give the agent as much instruction as you see fit, because it's going to be really helpful as you go. Now we have that full spec back in Grok. What I would do is obviously go through it, work on it with Grok (or whatever other AI you want to use), make sure it is exactly what you want, make the appropriate changes, copy it all, paste it in here, and you're good to go. I'm not going to iterate on this; I'm just going to take it in one shot. I'll highlight it all, paste it right into the window, and say, "Build this based on this spec," and let's see. It's going to go; I'm not going to sit and watch it. I'll show you what it looks like in a moment, but first let me show you some more best practices.
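As an illustration of keeping fake data strictly inside the test environment, here's a minimal sketch using pytest's monkeypatch. The scraper module and its fetch_html / scrape_article functions are hypothetical stand-ins for the article-scraping example above:

```python
# test_scraper.py: fake data lives ONLY here, never in dev/prod code paths.
import pytest

import scraper  # hypothetical module with fetch_html() and scrape_article()


def test_scrape_article_parses_title(monkeypatch):
    # Stub the network fetch inside the test; app code stays untouched.
    monkeypatch.setattr(
        scraper, "fetch_html",
        lambda url: "<html><title>Example</title></html>",
    )
    article = scraper.scrape_article("https://example.com/post")
    assert article["title"] == "Example"


def test_scrape_article_raises_on_failure(monkeypatch):
    # The desired behavior: surface the failure rather than silently
    # falling back to mock data and pretending everything worked.
    def boom(url):
        raise ConnectionError("fetch failed")

    monkeypatch.setattr(scraper, "fetch_html", boom)
    with pytest.raises(ConnectionError):
        scraper.scrape_article("https://example.com/post")
```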
OK, so back to my existing code. One thing you're going to notice, and should get used to, is this panel right here: it shows all the context you're giving the agent. This is important because at a certain point it becomes too much context and the model starts not performing as well, so you have to be very aware of how much context you're providing and how long the conversation has been running before starting a new chat. You can see right here, in dark gray, "start a new chat for better results." This is something you're going to have to figure out by playing with it: if you start a new chat, you lose context; you can move some of the context over to a new chat window, but that's annoying. You want to give it as much context as possible, but if you give it too much, it starts performing poorly, so make sure you experiment with that.

All right, I'm going to start a new chat. Immediately, the first thing it does is insert my workflow preferences, but I want all of my rules in there every time, so I'll click "my stack" and "coding preferences," and now it has the context of all three rules. I don't actually know if it does this automatically, but it seems like I do have to manually insert these three files, which, you know, fine.

My next suggestion: stay very narrow with your requests to the agent, meaning fix little things, add little features, and test as much as you can; have it write tests. Now, testing is an entire thing of its own. What I have found works best is end-to-end testing, that is, testing where something is actually clicking through the app and attempting to do what you as a user would be doing, rather than unit tests, for example (there's a sketch of a simple end-to-end test below). And keep in mind, every time it writes something, you're going to run the tests. If the tests pass, great. If they don't, have the agent fix them, but the fixing of the tests is interesting: you have to keep a close eye on it, because sometimes it will "fix" a test in a way that changes production code, when simply addressing the failure within the test itself is really the right way to go. So you've got to keep an eye on it. You don't necessarily have to code anything yourself or manually adjust things, but make sure the tests pass, then use the integration tests to confirm, or simply open the app and test the functionality yourself. If you look right here, we have tons of different tests, tests for everything.

Another recommendation, kind of jumping all over the place: use a popular stack. If you're using some nascent technology, AI is probably not going to do as well, because it hasn't had as much exposure to that language or stack, and it's just not going to perform as well. So choose really common stacks. For me, that's Python, with HTML and JavaScript on the front end (keep it very simple), SQL for the database, and Elasticsearch if I'm doing search. Choose very popular technology stacks and there will be a lot more documentation for the AI to go research.

All right, let me show you an example chat and what it actually looks like. This is an older feature that I built: "Enforce a maximum tag length of 20 characters. Make sure we don't already have code for this. Write tests for this after implementation." Obviously I'm reinforcing, over and over again, how I want it to code; it can't hurt. Then you send, and it starts. First it thinks, and you can actually read the thought process, which is really nice. Then you have tool calling: where you see "listed directory," "read file," "searched files," those are the tools it has available to it.
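Here's the end-to-end test sketch mentioned above: a minimal example using Playwright for Python, driving a real browser the way a user would. The URL, the selectors, and the validation behavior are hypothetical, loosely modeled on the 20-character tag limit:

```python
# test_e2e_tags.py: an end-to-end test clicks through the app like a user,
# instead of unit-testing internals. Assumes the app is running locally
# and has the (hypothetical) form fields below.
from playwright.sync_api import sync_playwright


def test_tag_longer_than_20_chars_is_rejected():
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("http://localhost:5000/posts/new")
        page.fill("#tag-input", "x" * 21)  # one character over the limit
        page.click("#submit")
        # The app should show a validation error instead of saving the tag.
        assert page.locator(".error").inner_text() != ""
        browser.close()
```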
You can also use MCP servers if you want to give it external tools, but that's a whole separate topic. I actually haven't used MCP at all in any of my code yet; I need to play around with it some more, so I'll get to it eventually, but that's for another video.

So you can see here all this stuff happening, and you can actually click through, really dig in, and get a feel for how the agent operates with your codebase. I'm going to scroll down: it read the file, read the file, and then it says, "Based on my analysis of the codebase, here's what I found," tells me what it's going to do, and then it starts.

Now, there are three settings for executing things. You can run completely manually, which means every time the agent is about to make a change or execute something that might affect your files, it asks, and you have to approve it every time. The opposite is what I use: YOLO mode, and it is literally called YOLO mode. It just auto-executes everything, and it's risky for sure, because it will push to GitHub and it will deploy to production. So if you actually have a production-level codebase with real users, I would not recommend it, but for vibe coding something from scratch, go ahead. Then there's also an in-between (I think it's called Auto), which means it decides which commands to run automatically and which to ask you to approve.

So it makes some changes, and you can see it's running these tests. Out of all my tests, one failed, so we need to fix the failing test. It looks at why it failed and starts changing the code, and you can see here more mock-repository stuff. I couldn't stand that; it frustrated me like crazy, which is why I mentioned, like, five different times: don't use any mock or stub data at all in dev or production. It continued on, all the tests passed, great. It gave me a summary at the very end of everything that changed, and then I just continued. But remember, the more you continue, the more the context window becomes bloated, and you want to start a new chat as often as it makes sense. I know that's not a very prescriptive rule, but you're going to have to figure out what works best for you.

I really became pretty darn dependent on the agents to do things for me. I would even say, "OK, commit this code, write a good description, and deploy it to Heroku," and it would do that, and it really didn't have any problems.

The thing is, all of this is pretty slow. Now, I feel a little spoiled saying that, because it's writing more code than I could in a much shorter period of time, but every command I sent off would take between two and sometimes up to fifteen minutes to finish the iteration cycle: it would test things, try things, run the tests, the tests would fail, it would fix them, it would verify things were working. So it's a little slow in that sense. One potential solution is to have different Cursor windows open and let them operate on different branches, so you have two branches going at the same time, and you could do that multiple times. You can have a lot of different branches you're working on simultaneously and then merge them all together at the end (one way to set that up is sketched below). I occasionally switch back and forth between Claude 3.7 Thinking and Claude 3.
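On the parallel-branches idea above: one way to give each Cursor window its own working copy of the same repo is git worktree, which checks each branch out into its own directory. This is a general technique, not something shown in the video, and the directory and branch names are hypothetical:

```
# One working copy per branch, so two agent sessions never touch
# each other's files.
git worktree add ../myapp-feature-a -b feature-a
git worktree add ../myapp-feature-b -b feature-b

# Open each directory in its own Cursor window and let the agents work.

# When both are done, merge the branches back.
git checkout main
git merge feature-a
git merge feature-b

# Clean up the extra working copies.
git worktree remove ../myapp-feature-a
git worktree remove ../myapp-feature-b
```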