"Hey, can you hear me?"
"Yes, I can hear you. How can I help you today?"
"Can you please check my calendar for today?"
And this is when it will call the n8n tool.
"Today, December 20th, 2024, you have five events scheduled. They are: a train journey to London from 11:00 a.m. to 12:00 p.m.; client work booked for the day from 12:00 p.m. to 6:00 p.m.; a return journey from London from 7:00 p.m. to 9:00 p.m.; a meeting with Al from 9:45 p.m. to 10:15 p.m.; and some decompress time from 10:15 p.m. to 10:30 p.m."
"I'm just wondering, I think I have to go for a car wash tomorrow. Can you please book, let's say, a 15-minute car wash session from 8:30 to, I don't know, 8:45?"
"Okay, booking the car wash for tomorrow."

So Google has finally caught up in this AI race, and what a way to catch up with the market. If you look at the leaderboards now, OpenAI is no longer at the top; Google has captured the leaderboards. They have also released their video models, if you have not seen them. On top of that, they are focusing on different agent use cases, which is what I really love: they are looking at specific problems and trying to solve them rather than just focusing on creating an intelligent model. So this is really interesting, and I think there is a lot more interesting stuff coming up. It is high time that you get to know the multimodal use cases, and this is the best channel to learn about multimodal and voice agent use cases, because I have been covering these areas over the last four or five months. I have also created an open source project; if you do not know about it, you can check it out in the previous videos. But in this particular video, we are going to explore Google Gemini's function calling, or AI tool calling, ability with n8n. So let me share my screen and take you through the journey.

Hey, welcome back. What we are trying to do in this video is introduce function calling, or tool calling, with Google Gemini 2.0. For this, we are going to modify the AI playground code that Google provided as part of their Gemini 2.0 launch. You have probably seen it, and I covered the AI playground code in more depth in the last video. I will attach the link in the description so that you can check that video if you don't know how the code works. Of
course, in this demonstration you will also get to know it much better, because in this video we are going to introduce custom tool calling into the application that they have already provided. My end goal with this journey is to build a bunch of AI agents in my self-hosted n8n instance and then call them from the Google Gemini AI playground through an n8n webhook or API.

As I said in the previous video, the playground code that Google has provided is not really meant for production use. You cannot really host it on the internet for your users, but it is absolutely fine to use in our local environments. That is why I have started working on production-standard n8n agents. Basically, I am creating a framework for n8n agents that can be used by the different types of AI assistants I run as part of my company or for my personal use. The larger goal is to create a service like Jarvis; more about that in the later part of the video. So please stay tuned for the upcoming videos and subscribe to the channel, because this is only going to get more interesting as the days go on.

What I have done so far is create a bunch of AI agents in n8n, which I am going to show you quickly, and as part of this video I am just going to integrate the calendar agent. The whole idea is that I don't want to give every function, every tool call, to Gemini and overwhelm it, because every AI agent has a limit on the number of tools it can handle at the same time. So my idea is to create a framework where each agent is modularized with its own specific tools or function calls, and then call that agent as another tool. I am going to talk in more detail in a separate video about how
you should actually create a large AI agent framework with a lot of tools and a lot of different SaaS services or APIs in the background. But let's go to the code base and understand how we are doing it for this session.

I think the first thing I should show you is the agent. I created this calendar agent, which is basically a simple Tools Agent that is already provided by n8n. All I have done is add a system message and give it specific tools: a tool for creating an event, a tool for updating events (let's just name it Update Event), a tool to create an event with an attendee, a tool to get all the events, and a tool to check the current calendar availability for booking meetings.

For testing, I have used a chat trigger node so that I can test it locally. Additionally, I have added a webhook, and that is the webhook the Gemini API will be able to call using custom tool calling, or function calling. Because the way chat works is a bit different (it directly sends json.query), I have added the chat input like this from the body of the webhook request, and then we just send it into the calendar agent. So here we are just reading json.chatInput, which is now standardized: whether it comes from the webhook or from the chat message, it is going to have the same field, json.chatInput.

Now, one of the key things here is that we want to make it seamless. That is why I have added a Redis memory as well, with a context length of five, which is the default. But for the memory to work we need a session ID. If you have ever used the chat message in n8n, you know that it uses something called a session ID, and that is what is used to keep track of the memory. I think you can see it here; this is the session ID. That is why even with the webhook we have to send a session ID, and if you look at the Edit Fields node, we are actually getting the session ID there.

This session ID is really important for keeping the context of the conversation. For example, if I ask the AI agent "Hey, what is the date today?" and then immediately ask "Can you book me a meeting with Andrew?", it should know automatically that I am asking about a meeting today, not in the future, because it has the context. That is the whole reason for introducing the session ID. In the end, the workflow only returns the response generated by the calendar agent, so all the AI agent processing and the calls to the actual calendar tools happen in n8n itself. That means all the Google Gemini API playground has to do is call that AI agent, that's it. It just sends the instruction that I give to Gemini, and n8n does all the processing, so we keep less logic in this application. You might ask me why you
are doing that. It is because, as I told you in the last video, this is a React-based application, which means the entire application runs on the client side, in your browser. That is why you don't want to keep a lot of logic in your client-side code.

Cool. Let me quickly show you what I have done; I have not done a lot, really. If you have seen the previous video, you know that Altair.tsx, which comes by default with this playground code, already has this function, render_altair. I have kept it for reference, but it is not being used anymore. Instead, I have created another function declaration called manage calendar.

Manage calendar basically has two inputs. One is the query, the same one that I ask Gemini, so that it can formulate the query properly and send it to this tool. The other is the context of the conversation, like what I said about the date, so it should send that as context. Once it does that, the call should go to n8n, and n8n should be able to process it based on whatever is being asked, along with the context.

Now you can ask me: "But you have shown me that the webhook accepts the chat input as well as a session ID. You are not sending the session ID, and where is the server URL and all of these bits?" For that, we have created these two services, one of which is called the calendar service. The calendar service .ts file is what actually makes the call to the n8n agent: it calls the calendar API URL to send the message, and it needs a session ID and the chat input.

But where is it getting them from? That is the important part, and that is why you need to look back at the function declaration and note the name, manage calendar. If we search for manage calendar, you will see this logic that checks whether the function call name is manage calendar. If so, we get the query and the context, as you saw defined here, and then we call the handle calendar query method of the calendar service, passing in the query and context that Gemini produced for the function call.

If we go to that function and see what it is doing, it is basically making a call to the make request function, again with the query and the context. So let's go to make request and see what it does. Make request is a function in the calendar service .ts file, and this is where it first checks whether the context of this conversation is new, that is, whether it is a new conversation or a completely new topic. If it is a new context, this is set to true, and then it sets a new session ID. So here it is fetching the context that came with the JSON request Gemini sent as part of the function call, and then, if you go into this function, it checks: if it is not equal to the current context, return true. That means that if there is already a current context and the context that came with the function call is not the same, it returns true, meaning it is a new context. I have also added some more logic that checks whether the current query is related to the previous conversation or not. This is all just a safeguard so that we keep the context of the conversation intact.

Then, if we go back to where it was called, it creates a session ID and calls our n8n API URL with the session ID and the chat input that came with the function call from Gemini 2.0. Once that is done, it awaits the response. Once the response is available, it updates the current conversation context with the last response data as well, and then it returns that response, which is the data and the session ID, back to the application here, as part of this tool call. Gemini then uses that and responds to me using his or her voice.

So this is the entire architecture. I am of course going to give you the code so that you can do a bit more analysis if you have any questions or any confusion. I have plans to add more tools and, as I've said, the
plan is to create a bunch more agents, so let me show you. The plan is to create a bunch of AI agents and then make all of them part of the main assistant agent. This is the personal assistant agent that I am building, as you can see, and it has the email agent, the calendar agent, the Jarvis project agent, and everything. I am also building each of these agents individually at the moment. That means we will have a segregation of agents, each with its own specific tools, and that will help a large language model like Gemini, or whatever is used for multimodal use cases, not get overwhelmed by a bunch of function calls and tools. Otherwise, Gemini 2.0 or another multimodal AI agent can easily get overwhelmed if, say, you plan to use more than 40 or 50 functions or tools, and it might call tools that are not relevant to the particular input, question, or query from the user. So this is what I am planning to do in the coming days. Please stay tuned; I am going to give more detailed updates on that.

To get the code, you can check out the link in the description and grab it absolutely free; there is no charge at all. But if you are planning to build a production-like agent, I have the SaaS course that I have launched, which I have already talked about. This weekend I am going to talk about the design of that SaaS, and in the later part of the video you will be able to see what I am talking about.

For now, let's go ahead and test this agent, shall we? Now it is time to test this Jarvis agent that I am building, with tool calling enabled, with Gemini 2.0. Let's try that. Here we are in the UI; I am just going to start it.

"Hey, can you hear me?"
"Yes, I can hear you. How can I help you today?"
"Do you know what date and time it is?"
"Yes, I do. Today is October 26th, 2023, and the time is 7:14 p.m."
"No, that's wrong."
"I apologize for that error. Could you please tell me the cor... My apologies, it seems I made a mistake earlier. The current date and time is December 20th, 2024, 12:42 a.m. Coordinated Universal Time."
"That's right. Can you please check my calendar for today?"
And this is when it will call the n8n tool.
"Today, December 20th, 2024, you have five events scheduled. They are: a train journey to London from 11:00 a.m. to 12:00 p.m.; client work booked for the day from 12:00 p.m. to 6:00 p.m.; a return journey from London from 7:00 p.m. to 9:00 p.m.; a meeting with Al from 9:45 p.m. to 10:15 p.m.; and some decompress time from 10:15 p.m. to 10:30 p.m. Would you like more details about any of these events?"
"No. I'm just wondering, I think I have to go for a car wash tomorrow. Can you please book, let's say, a 15-minute car wash session from 8:30 to, I don't know, 8:45?"
"Okay, booking the car wash for tomorrow, December 21st, from 8:30 a.m. to 8:45 a.m."
"Okay, I have scheduled a car wash for tomorrow, December 21st, from 8:30 a.m. to 8:45 a.m."
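
For anyone following along without the repository open, the tool-calling flow walked through earlier (a function declaration with a query and a context, a new-context check that rotates the session ID, and a POST to the n8n webhook) might look roughly like the TypeScript sketch below. This is not the code from the video: the names `manageCalendarDeclaration`, `SessionTracker`, and `askCalendarAgent`, the session ID format, the way the query and context are combined into the chat input, and the plain-text response are all assumptions made for illustration. The extra safeguard that checks whether a query relates to the previous conversation is left out.

```typescript
// Illustrative sketch only: every name here is hypothetical, not taken from the playground code.

// Tool declaration with the two inputs described in the video: a query and a context.
const manageCalendarDeclaration = {
  name: "manage_calendar",
  description: "Forward a calendar request to the n8n calendar agent.",
  parameters: {
    type: "object",
    properties: {
      query: { type: "string", description: "The user's calendar request." },
      context: { type: "string", description: "Conversation context, such as today's date." },
    },
    required: ["query", "context"],
  },
};

// Payload the webhook is described as expecting: a session ID and the chat input.
interface WebhookPayload {
  sessionId: string;
  chatInput: string;
}

// Remembers the current context and starts a new session whenever it changes.
class SessionTracker {
  private currentContext: string | null = null;
  private sessionId = "";
  private counter = 0;

  // A context is new when nothing is stored yet or it differs from the stored one.
  isNewContext(context: string): boolean {
    return this.currentContext === null || context !== this.currentContext;
  }

  // Build the webhook payload, rotating the session ID for a new context.
  buildPayload(query: string, context: string): WebhookPayload {
    if (this.isNewContext(context)) {
      this.counter += 1;
      this.sessionId = `session-${Date.now()}-${this.counter}`;
      this.currentContext = context;
    }
    // Assumption: query and context are merged into a single chat input string.
    return { sessionId: this.sessionId, chatInput: `${query}\n\nContext: ${context}` };
  }
}

// Send the payload to the webhook and hand back the agent's reply plus the session ID.
async function askCalendarAgent(
  tracker: SessionTracker,
  webhookUrl: string,
  query: string,
  context: string,
): Promise<{ data: string; sessionId: string }> {
  const payload = tracker.buildPayload(query, context);
  const res = await fetch(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  if (!res.ok) {
    throw new Error(`Calendar webhook returned HTTP ${res.status}`);
  }
  return { data: await res.text(), sessionId: payload.sessionId };
}
```

In this sketch, two requests that carry the same context string reuse one session ID, so the memory on the n8n side sees them as one conversation, while a different context string starts a fresh session. Keeping the session logic separate from the network call also makes it easy to check without a running n8n instance.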