How I reduced 90% errors for my Cursor (Part 2)


AI Jason

Test Driven Development Memory bank Access firecrawl's FIRE-1 Scraping Agent: https://firecrawl.d...

Video Transcript:

This video is sponsored by Firecrawl, one of the best open-source solutions to power your AI application with clean data from any website. Last week, I got a lot of good feedback on breaking down a PRD into smaller tasks. Today, I want to share two more really important tips and workflows that help you dramatically reduce errors in bigger, more complex projects with Cursor.

One is called test-driven development. The other is setting up a memory bank for your Cursor project. With those two workflows, I was able to get Cursor to implement complex functions and make sure the generated code integrates well into the existing codebase.

I will show you my best-practice workflow so you can adopt it for your own project. The first thing I want to talk about is test-driven development. Sometimes you want Cursor to build a complex function that it won't get right on the first try.

In this type of scenario you will often encounter an error, then paste the error into Cursor and tell it to update the code and fix it. But then you might get more errors after it tries different fixes, and you end up being the messenger between the test results and Cursor, over and over, until it kind of works. And while it is iterating on that specific function, it will often break something else.

This is very common and super frustrating. What I found really useful is the concept of test-driven development: a software development process where you define and write the tests for your functions or features before you write the code to implement them. For example, you can define a test script that specifies the expected input for the function you're going to write and validates the expected output. You then write just enough code to pass the tests and improve the code while ensuring all the tests still pass. The test run tells you whether each test passed or not, so you can iterate on the function based on the test results.

On one hand, this helps you align with Cursor on the requirements, meaning the expected input and output. But most importantly, because Cursor can run the tests from the command line by itself, you can just turn on YOLO mode and ask it to write a test, write the function, run the test, and iterate on the function until all the tests pass. I will show you a quick example.
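To make the test-first loop described above concrete, here is a minimal Python sketch. The `slugify` function and its expected behavior are hypothetical and chosen only for illustration; the point is that the test cases pin down the expected input and output, and the implementation is just enough code to pass them.

```python
import re
import unittest


def slugify(title: str) -> str:
    """Hypothetical function under test: lowercase, hyphen-separated slug."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


class SlugifyTests(unittest.TestCase):
    # In test-driven development these cases are written before slugify exists.
    def test_basic_title(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_punctuation_is_dropped(self):
        self.assertEqual(slugify("Cursor, Part 2!"), "cursor-part-2")

    def test_empty_string(self):
        self.assertEqual(slugify(""), "")


def run_tests() -> bool:
    """Run the suite and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTests)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

An agent with permission to run commands can call something like `run_tests()` (or the equivalent test command) after every change and keep editing `slugify` until it returns `True`.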

But before I dive into that, I know many of you are building AI applications or AI agents that need to access website data, from researching a business to getting the latest news, and a smart and reliable way to turn any website's data into a large-language-model-friendly format is critical. That's why I want to introduce you to Firecrawl. If you don't know Firecrawl, it was the first open-source project to offer an effective way to turn any website into a clean format that is optimized for large language models to consume.

It can browse through all the subpages of a website, and it handles PDF, Word, or Excel files attached to a URL. It also has capabilities like smart wait, so you can access websites with stricter anti-bot setups. What's really cool is that they just had a launch week where a whole bunch of features were introduced that make web scraping 10x easier. For example, they introduced an extract endpoint that can handle pagination and page interactions. So if you want to scrape an e-commerce website where you need to click through multiple pages to get the full content, you can use the endpoint very easily and get structured product information out. And my personal favorite is the FIRE-1 web action agent.

They went beyond just scraping a specific web page by building an intelligent agent that can navigate through a specific domain to uncover the niche data you want. For example, I can give it a URL and just prompt it about the information I care about. The agent will open up a browser and navigate through the website across multiple pages until it finds a page that contains the information, and it can also handle scenarios that require logging in.

You can simply give it a prompt and it will take those complex actions and extract structured data. Most importantly, those abilities can be called via API directly. So you can write a simple script like this to have a research agent available on demand, or build automations with Zapier, Make.com, and many others by calling the Firecrawl API directly.

So if you're looking for a smarter way to scrape internet data, I highly recommend you go check out Firecrawl. Now let me show you an example of how to use this test-driven development process.

I want Cursor to build a solution where I can paste in a JSON-like string, which includes Pydantic models as well as nested JSON strings, and have a very easy way to view it. There are existing JSON viewer tools where you can paste in that type of string and quickly view the JSON structure like this.

But they generally don't work when you mix Pydantic models together with nested JSON strings. So I wanted something better, and I tried multiple times with the latest coding models, but they almost always failed because it's actually pretty complex. So this is a great example.

It is a complex function. Very likely Cursor won't get it right at the beginning, but by defining test cases clearly you can put Cursor in the driver's seat and let it iterate until the function is ready. So I will open the Cursor chat and give the prompt: create a function that converts a JSON-like string into proper, readable JSON.

First, let's write some tests, then implement the code, then run the tests and iterate on the code until all tests pass. With this prompt it should already start writing some tests. But in this case I'll also include an example input and output so it understands better what I want.

So I paste in the example input as well as the example output that I expect it to deliver. Then I click send. And as I mentioned before, to make this work really well, you want to turn on auto-run mode so that it can write the tests, run them by itself, and try to fix the code.

It creates some test examples, from basic simple cases to more complicated nested examples. Then it writes a function to actually convert the string to proper JSON. What happens next is that it tries to run the tests.

Here it is using unittest. You can see one out of five tests failed. Then it goes back to iterate on the code.

It runs the tests again. This time more tests actually failed, but that doesn't mean it is getting worse. It just means it's discovering new stuff.

While it is failing, it also tries to debug by adding some debug logs, so that if things fail it can better understand where things go wrong. Great, now after four or five iterations all five tests pass, and it also writes a new example to test. But this time it finds a weakness in the code, where a certain nested JSON string is not parsed properly.

So it goes back to fix the code some more. Great. So this is an example of how you can use the test-driven development method to build much more complicated functions with lots of edge cases.
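For a sense of what such a converter and its tests might look like, here is a heavily simplified Python sketch. It is not the code Cursor generated in the video. It assumes the input looks like a Python `repr` (for example `User(name='Ann')`, the style a Pydantic model prints with `repr()`), and the names `to_json_data` and `ConverterTests` are made up for illustration. JSON strings nested inside an already decoded JSON string are not handled.

```python
import ast
import json
import unittest
from typing import Any


def _maybe_nested_json(value: Any) -> Any:
    """Decode a string that itself holds a JSON object or array; else return it unchanged."""
    if isinstance(value, str) and value.strip()[:1] in ("{", "["):
        try:
            return json.loads(value)
        except json.JSONDecodeError:
            return value
    return value


def _convert(node: Any) -> Any:
    """Turn a parsed Python expression node into JSON-compatible data."""
    if isinstance(node, ast.Constant):
        return _maybe_nested_json(node.value)
    if isinstance(node, ast.Dict):
        return {str(_convert(k)): _convert(v) for k, v in zip(node.keys, node.values)}
    if isinstance(node, (ast.List, ast.Tuple)):
        return [_convert(item) for item in node.elts]
    if isinstance(node, ast.Call):
        # Model-style repr such as User(name='Ann'): keep the keyword fields as a dict.
        return {kw.arg: _convert(kw.value) for kw in node.keywords if kw.arg is not None}
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_convert(node.operand)  # negative numbers parse as a unary minus
    raise ValueError(f"Unsupported syntax: {type(node).__name__}")


def to_json_data(text: str) -> Any:
    """Parse a JSON-like / Python-repr-like string without executing it."""
    tree = ast.parse(text.strip(), mode="eval")
    return _convert(tree.body)


class ConverterTests(unittest.TestCase):
    def test_plain_dict(self):
        self.assertEqual(to_json_data("{'a': 1, 'b': [True, None]}"), {"a": 1, "b": [True, None]})

    def test_model_repr_with_nested_json_string(self):
        text = """User(name='Ann', meta='{"tags": ["a", "b"]}')"""
        self.assertEqual(to_json_data(text), {"name": "Ann", "meta": {"tags": ["a", "b"]}})

    def test_nested_models(self):
        text = "Order(user=User(name='Ann'), total=-2.5)"
        self.assertEqual(to_json_data(text), {"user": {"name": "Ann"}, "total": -2.5})
```

Parsing with `ast` instead of `eval` means the pasted string is never executed, which matters for a tool that accepts arbitrary input. Each new edge case that turns up becomes one more test method before the function is changed.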

And now I want to convert this into a simple web app that I can use. After a few iterations, this website looks pretty good. I can paste in any Python-object-style string and it will automatically convert it into a tree structure like this.

What I want to showcase next is a concept called memory bank. It is a concept introduced by Cline initially. If you don't know Cline, it is basically an open-source Cursor alternative where you bring your own API key, and it works as a VS Code extension.

What they introduced is this idea: what if AI coding agents could have a memory of what has been done, what tech stack choices we have made, and the latest task being worked on? Then the next time you ask the coding agent to build some new feature, it can read these memory bank files first to get a download of what the project is, what has been done, and what implementation decisions have been made. That gives it enough context to build new features on top of what you have instead of messing up your existing files.

And to achieve that, there is actually nothing magic. They came up with a system where they define a few files to store different types of context. The project brief doc defines the core requirements and goals. Product context covers the user experience and how the product should work. Active context contains the current work focus, recent changes, active decisions and considerations, and important patterns and preferences. System patterns stores the system architecture and key technical decisions. Tech context covers the tech stack as well as constraints and dependencies. And progress.md records what works and what's left to build. They also have something like Cursor rules, or custom instructions, to teach the AI coding agent how and when to use those files. What's really cool is that it doesn't only work for a new project that you start from scratch.

You can also use the same method on your existing project: ask the AI coding agent to look at your project files, understand what's going on, and generate those files so that it has context when developing new features. I'm going to quickly show you how it works in Cline first, and then how you can adopt the same method in Cursor with the recent custom mode feature. Let's first try out Cline. If you haven't installed Cline yet, you can download it; it basically works as a VS Code extension.

What you need to do is click on the settings button, copy this prompt, add it here as custom instructions, and click done. Then, in the project folder, create a new folder called memory-bank. Then there are a few commands you can use.

At the beginning, you can say "initialize memory bank", which will start by setting up all those files. Another command forces Cline to read the memory bank files and continue where you left off. And "update memory bank" is something you can do after Cline finishes a certain task.
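The file layout described earlier can also be sketched as a small scaffolding script. This is only an illustration of the structure, not what Cline runs: in the video the agent writes these files itself. The exact file names and the `memory-bank` folder name are assumptions based on the names mentioned above, and `init_memory_bank` is a hypothetical helper.

```python
from pathlib import Path

# File names follow the memory bank layout described above; treat the exact
# spellings as an assumption and adjust them to match the rules you install.
MEMORY_BANK_FILES = {
    "projectbrief.md": "Core requirements and goals",
    "productContext.md": "User experience and how the product should work",
    "activeContext.md": "Current focus, recent changes, active decisions",
    "systemPatterns.md": "Architecture and key technical decisions",
    "techContext.md": "Tech stack, constraints, dependencies",
    "progress.md": "What works and what is left to build",
}


def init_memory_bank(project_root: str, folder: str = "memory-bank") -> list:
    """Create any missing memory bank files; return the names that were created."""
    bank = Path(project_root) / folder
    bank.mkdir(parents=True, exist_ok=True)
    created = []
    for name, purpose in MEMORY_BANK_FILES.items():
        path = bank / name
        if not path.exists():  # never overwrite notes the agent already wrote
            path.write_text(f"# {name}\n\n{purpose}\n", encoding="utf-8")
            created.append(name)
    return created
```

Running it twice is safe: the second call creates nothing and returns an empty list, which mirrors the idea that the memory bank is updated over time rather than regenerated.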

So I'll give the prompt: try to understand the current project we already set up and initialize the memory bank. You can see it tries to view what's inside the memory bank folder, which should be empty. Then it decides to read the README and then reads the project page by page.

Then it first creates the project brief file, which should contain the core requirements. Here you can see that it actually captured the core requirements very clearly. So I save this, and it continues to create the product context file, which includes more detailed user experience requirements.

Then it also created system patterns, which includes the whole architecture and data flow as well as the key components. In the tech context it lays out the whole tech stack, dependencies, and file structure, as well as implementation details of what we have done so far. In active context it summarizes what has been done and what the next steps are, and progress lists out the detailed tasks, similar to Taskmaster. We could probably update the prompt a little to make sure it works with Taskmaster as well.

Once it has finished, you can see in the memory bank that all the key information about where we are with this project has been stored. Now let's start the really interesting part. I will start a new task and I won't give much context.

I just give the prompt: implement a search function. What I expect it to do is add a search function to this JSON output viewer so I can search for a specific data point. Because of the custom instructions, before it starts implementing it will look at the project brief doc to understand the project we already set up, and it continues looking through the other docs to get the full picture.

Then it decides to read the relevant files to get more context and understand the specific part it needs to change. Then it starts making changes to the JSON display component and adding the new search function. Then it adds a new feature to highlight the search results.

After it finishes, it also goes back to active context to record that it just added the search functionality, and updates the progress file as well. Now if I go back to my application, I can see a search function here. It is fully functional.

I can search for anything and it'll automatically highlight the text, which is awesome. The most amazing part is that this is a system that continuously updates the memory bank. So I can keep piling more and more complex features onto the existing project without worrying about it messing up my project.
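The search-and-highlight behavior shown in the viewer could be sketched roughly like this. The video's implementation lives in a web component and is not shown, so this Python version, including the names `search_json` and `highlight` and the `$`-style path format, is an assumption made purely for illustration.

```python
import re
from typing import Any, Iterator, Tuple


def search_json(data: Any, term: str, path: str = "$") -> Iterator[Tuple[str, Any]]:
    """Yield (path, value) for every key or leaf value containing term, ignoring case."""
    needle = term.lower()
    if isinstance(data, dict):
        for key, value in data.items():
            child = f"{path}.{key}"
            if needle in str(key).lower():
                yield child, value
            yield from search_json(value, term, child)
    elif isinstance(data, list):
        for index, item in enumerate(data):
            yield from search_json(item, term, f"{path}[{index}]")
    elif needle in str(data).lower():
        yield path, data


def highlight(text: str, term: str, mark: str = "**") -> str:
    """Wrap each case-insensitive occurrence of term in text with mark."""
    if not term:
        return text
    return re.sub(re.escape(term), lambda m: f"{mark}{m.group(0)}{mark}", text, flags=re.IGNORECASE)
```

For example, searching `{"user": {"tags": ["admin", "staff"]}, "admin_note": "none"}` for `admin` yields the paths `$.user.tags[0]` and `$.admin_note`, and a UI would then render the matching text with `highlight`.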

Getting this memory bank to work in other IDEs like Cursor is also pretty straightforward. There's a Cursor memory bank rules file already written that follows a similar structure, where you can just click on raw and copy the whole thing, then go back to Cursor, where they recently introduced the custom mode feature. So you can literally come here, give it a name, and in the advanced options just paste in the custom instructions, and that's it.

You can choose this mode and then tell it to initialize the memory bank. This should start working and set up the memory bank for your project out of the box. Apart from this basic setup, there's one project that I thought was super interesting, called cursor memory bank.

It basically took what Cline has and pushed it a lot further. It has a few different modes. One is VAN mode, to do the initial setup.

Plan mode does the actual task planning; think of it as an engineering manager. And then there is creative mode. This is an interesting one.

It's basically a mode specifically designed for thinking and exploration. You can use it to discuss back and forth which tech stack you should go with, and you can also use it for debugging. So if you have a bug where Cursor keeps getting stuck in a loop, you can use creative mode to ask it to figure out three or four potential root causes.

It tries to think them through, play them out, and figure out which is the most likely one. And then there is a build mode to actually build out the functionality. The project also creates a whole bunch of Cursor rules that are dynamically retrieved based on the specific mode the agent is in. So this setup is a lot more sophisticated, but it's really, really interesting. I'm going to quickly show you how you can use it.

To use this new memory bank I can, for example, set up a new Next.js project and open my app in Cursor, and then git clone this project so that it downloads this folder. The folder contains a lot of different things, but the actual thing you need is this Cursor rules folder, so I'm going to drag it to the top level. That gives me this list of Cursor rules, which I'd like to dive a bit deeper into. The other thing is the custom instructions for the different modes. I copy each custom instruction and then create the matching custom mode in Cursor. The first one can be VAN, which is basically the mode that initializes the memory bank, so I paste in the whole thing. Then I create another custom mode called plan and paste in its specific custom instructions, and do the same thing for creative mode and build mode. So now we have four different modes that can be used here.

I will click on VAN. As I mentioned before, the rest of the cloned folder you actually don't need; you can just delete it. Next we can start initializing the project and set up its memory bank. If you're already part of the AI Builder Club that I'm building, you can use tools like 10xCoder, which helps you flesh out a detailed PRD and the requirements of the project you're trying to build, including project structure and database schema, which you can copy and paste into the prompt as context for this project.

I have put the link to the AI Builder Club in the description below so you can click and join. But for this purpose, I'm just going to prompt it here directly with a simple to-do app.

The prompt is: VAN. I want to build a to-do app with Next.js and I already set up the project. Let's first initialize the memory bank.

So now it will start doing a few things. You can see that it loads a few different rules, and this is the part that becomes really interesting. If I go inside the Cursor rules folder, it has a whole bunch of different rules.

The first rule is this VAN mode map. This is the detailed instruction, a prompt about what kind of process it should follow. Cursor also introduced new rule types, which let you control when a rule should be attached. First it checks this VAN mode map, and when you look inside it, what it really does is instruct the agent on the things it should do to analyze the existing project.

For example, here it is first asked to detect which platform and OS it is running on, the file paths, and the command-line commands it needs, followed by a platform checkpoint. Each one leads to more specific Cursor rules. Platform detection has specific rules for deciding whether it's on Windows, Mac, or Linux, which probably leads to different command-line commands. File verification is a rule that tells Cursor the best-practice steps to follow to understand an existing project. There is also something called complexity determination.

This is the part where it analyzes the complexity. If the task is simple, like level one complexity, it doesn't need to switch modes. It will just continue building in the same mode.

But for something more complicated, like level two to level four, it has more rules that are loaded dynamically. This is a super interesting part. As you can see, a giant set of rules has been built out, and all those rules are dynamically fed to the Cursor agent.
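In the project these checks are written as natural-language rules for the agent, not as code. Purely to make the logic of platform detection and complexity determination easier to follow, here is a rough Python sketch. The command table, the function names, and the choice to route every level above one to plan mode are assumptions; the video only confirms that level one stays in the current mode and that level two calls for plan mode.

```python
import platform
from typing import Optional

# Hypothetical command table; the project's real rules may pick different commands.
SHELL_COMMANDS = {
    "Windows": {"list_files": "dir", "path_separator": "\\"},
    "Darwin": {"list_files": "ls -la", "path_separator": "/"},  # macOS
    "Linux": {"list_files": "ls -la", "path_separator": "/"},
}


def detect_platform(system: Optional[str] = None) -> dict:
    """Return the OS name plus the command-line conventions to use on it."""
    name = system or platform.system()
    conventions = SHELL_COMMANDS.get(name, SHELL_COMMANDS["Linux"])  # default to POSIX style
    return {"os": name, **conventions}


def next_mode(complexity_level: int, current_mode: str = "van") -> str:
    """Pick the mode to continue in from a complexity level between 1 and 4."""
    if complexity_level not in (1, 2, 3, 4):
        raise ValueError("complexity level must be between 1 and 4")
    if complexity_level == 1:
        return current_mode  # simple task: keep building without switching modes
    return "plan"  # levels 2-4 (assumed): break the work into smaller tasks first
```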

With this, it basically sets up a few files: a project brief to understand the goal of the project, the tech context including the tech stack it should use or already uses, the system patterns, and the active context. In the end it gave me the conclusion that this project classifies as level two complexity, which means it requires plan mode to break the task down into smaller ones. So now let's try it.

I switch the mode to plan mode and accept the previous changes. As it enters plan mode, it starts loading a few other rules dynamically, checks the implementation plan that was written before, and makes updates, which I can now accept. So it updates the implementation plan and breaks the task down into smaller, more modular ones.

Now I have a few options. I can go to creative mode if I want to, let's say, understand the trade-off between using Next.js versus Vue.

I can just switch to creative mode and bounce ideas off it. But I can also just click on build mode to start building. So I'm going to change to build mode and say implement.

During the implementation, as you can see, it again loads those rules dynamically. This is probably one of the most interesting parts of this project: it really tries to push the boundary and experiment with how you can use custom rules together with custom modes to create a really dynamic experience. I will just set this build mode to auto-run.

Why not? Cool. So it built this app, and I can add to-dos easily, like buy food and have dinner, and it actually built a bit more functionality than what you normally get with one shot.

There are tabs for filtering, and I can edit things with some shortcuts and delete things. So it's fully functional, which is really, really cool. And as it builds things out, it also goes back to the memory bank files and makes updates. So the next time I open even a new chat, it will still have context about what has been done before instead of starting from scratch and messing things up.

But what's really interesting is that I can actually delete this memory bank, and because of all the Cursor rules it has, it can work with any existing project as well, by looking through the whole project and generating a memory bank about what the existing project is about. For example, let's try this. I delete the existing memory bank that I created before.

I switch to VAN mode and prompt: I already set up a to-do app in Next.js. Let's initialize a memory bank so you have context about what has been done.

You can see that it again reads those relevant rules. Then it starts reading through all the different files and the project structure, based on the best-practice rules it was given, and starts writing down the tech stack and what has been done into the memory bank. This is probably the most interesting and useful part: it shows a potential pathway to using Cursor on your existing project, which is normally the hardest part.

This context gives you a much better starting point when you want Cursor to implement new features. As a quick example, let's say I want to add some new features: each task can be given a due date and a priority, as well as rich text for the description.
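As a sketch of the data this feature request implies, a task with those three new fields might be modeled as below. The app in the video is a Next.js project whose code is not shown, so this Python version is only an illustration: the field names, the three priority levels, storing the rich text as HTML, and the `sort_for_display` helper are all assumptions.

```python
from dataclasses import dataclass
from datetime import date
from enum import IntEnum
from typing import List, Optional


class Priority(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Task:
    title: str
    done: bool = False
    due_date: Optional[date] = None
    priority: Priority = Priority.MEDIUM
    description_html: str = ""  # rich-text description stored as HTML (an assumption)


def sort_for_display(tasks: List[Task]) -> List[Task]:
    """Order tasks by priority (high first), then by due date with undated tasks last."""
    return sorted(tasks, key=lambda t: (-t.priority, t.due_date is None, t.due_date or date.max))
```

Writing the model down like this also gives the agent something concrete to test against, in the same spirit as the test-first workflow from the first half of the video.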

Now it makes a plan for the core features. Let's get into build mode. Let's add those new features.

Okay, so now I got this error that it tried to resolve a few times but just couldn't. This is where I think creative mode will be interesting. I can paste this in.

I still get this error. Help me debug, please. Creative mode is supposed to think through a few potential options before it actually takes action.

So it's more likely to solve the actual error instead of getting stuck in this kind of loop. And now if I switch to the app and click add. Great.

Now it's working. It doesn't have any errors. So this is a quick example of how you can apply a memory bank to either a new project or an existing project.

It's still at a pretty early stage, but I can already see it improving performance a lot. If you want to get the prompts and Cursor rules showcased in this video, you can join the AI Builder Club I'm building, where I share the step-by-step process of my best-practice AI coding workflow as well as building production-ready agents. You can also access platforms like 10xcoder.dev, which helps you build and generate PRDs that can instruct Cursor much better, as well as Next.js boilerplates that already have authentication, payments, and Supabase set up, so you can launch your own SaaS in just a weekend.

But most importantly, we have a growing community of top AI builders who are launching their AI products, so if you have any questions or doubts you can just come here and make a post, and I and other community members can share our learnings. I have put the link in the description below so you can click and join. I hope you enjoyed this video.