Answering Your Questions On The NVIDIA Jetson Orin Nano Super
Ominous Industries
Timestamps:
00:00 - Intro
00:36 - LLM Token Speed
02:13 - Compared to a 3060
04:44 - No Pi5 Comp
05...
Video Transcript:
While I don't intend on becoming a strictly Jetson-focused channel, I really can't help but notice that this little device garnered a rather large amount of interest. That being the case, I noticed there were some commonly recurring questions about the abilities, or perhaps lack thereof, of this little device. So for today's video we're going to do a more casual, higher-level overview addressing some of these questions about the capabilities of the Jetson Orin Nano Super. Let's get right into it.

The first thing I want to do is something I feel I neglected to touch on in the previous two videos I made on this device, and that is simply give the token speed for running LLMs on it. I am using Ollama with the verbose flag, and we're going to quickly test Llama 3.2 3B to get its token speed, then test Llama 3.1 8B, and then compare them to a somewhat comparable device. I'll ask it some quantum computing questions, which will get it to produce a good amount of text so we can get a good speed reading. We can see here that we get a lot of output, but the line we're looking for is the bottom one, the eval rate, and it shows we are getting 18.2 tokens per second. I'm going to keep that measurement in mind for now, because later we'll see whether that's good or bad and how it measures up to other similar devices. Now we do the same exact thing, except running Llama 3.1 8B, and I'll ask the exact same question. We can see at the bottom that our eval rate was 6.09 tokens per second, so there's definitely a big difference between the two models. For our purposes, though, I think we're a little more interested in seeing how this compares to a different card.

I want to preface the comparison we're about to see by saying it isn't directly fair or one-to-one. However, the Orin Nano uses NVIDIA's Ampere architecture, which is also what's behind the 3000-series consumer GPUs, so for a somewhat fitting comparison I have an NVIDIA 3060 desktop GPU. This GPU costs about $250 new, has 12 GB of video RAM, and, truth be told, is a stellar option for running locally hosted AI solutions. I'm actually testing on a machine that has two of them, and this card is really just fantastic even though it's getting old at this point. I want to make it clear that this is going to be a ton faster than the Orin Nano, and that's where the comparison isn't quite fair: this card will pull around 170 watts max, compared to the 25 the Orin Nano will. You could say, "Hey, they're the same price, why wouldn't I just get the 3060?" which is fair, but you then have to factor in the additional cost of the power supply, CPU, motherboard, case, and so on, plus the much larger footprint a full PC takes.

So I'm going to do the same exact thing we did before, running Llama 3.2 3B and asking what a qubit is, and we can see the speed here is rather quick: all said and done, 109 tokens per second, which is extremely fast. This is a small model, of course, but compared to what we saw on the Orin Nano it's quite a big difference. Doing the same with Llama 3.1 8B, I'll ask what a qubit is, and we can see it's significantly slower than the 3B, which mirrors what we saw while running this on the Orin: about 45 tokens per second. Overall, the desktop 3060, as expected, is much quicker, but I wanted to provide some level of benchmark so we at least have a reference point for how quick this Orin Nano is at running LLMs. Of course, the desktop card draws far more power and carries many more considerations, so one-to-one the desktop card is going to smoke it, but they do have different end use cases.

As a side note, I want to mention why I didn't include the Pi 5 in that comparison: the Pi 5 with Llama 3.2 3B will benchmark somewhere around 3 to 4 tokens per second, as opposed to the 18 or so we saw on the Orin Nano. It wouldn't really have been a fair comparison, and I wanted to compare this to another NVIDIA GPU that folks may be more familiar with.

With that out of the way, the next, and I think most-asked, question about this device was whether or not it could run Windows. Initially I responded to some of those comments saying I didn't see why it wouldn't be able to, and I think it's important to say I was completely off base and wrong: it is not likely that this will be able to run Windows, and I can say that definitively now because we're actually going to try it. I have a microSD card here with an ARM build of Windows 11 that is supposed to install Windows 11 on specific ARM devices. This Jetson has an Arm Cortex processor, I think an A78 (don't quote me on that), and this build would be our best chance at getting an out-of-the-box Windows installation working on it. You can see on the screen right now that there's a kind of boot menu, and from here we're going to select the microSD to boot and see what happens. We can see our boot devices listed, and one of them is actually shown as the NVMe SSD. A few folks had mentioned being able to use that as the boot disk, which, based on this, does seem possible, and I will look into doing that in the future and see if I can put a quick guide together on how to do it. But if I press enter here to boot from the SD card with the Windows install, we see this error, which is above my head, but essentially what it boils down to, supposedly, is that the boot libraries will not load for Windows. Without somebody with deep expertise in getting Windows running on devices like this, we are likely not going to see a Windows install for the Jetson Orin Nano Super.

I'd also like to quickly note something I saw a few people having trouble with: they got one of these Jetsons, but the firmware may not have been the correct version that would allow them to follow the tutorial I made where you could essentially go straight to installing JetPack 6.
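For reference, the eval rate quoted throughout comes from Ollama's `--verbose` output (invocations along the lines of `ollama run llama3.2:3b --verbose`; exact model tags may vary by Ollama version), which reports an eval count and eval duration alongside the rate. A minimal sketch of how that figure is derived, plus a rough tokens-per-watt comparison using the video's measured numbers (the token and duration counts below are hypothetical and chosen only to illustrate the ratio; the 25 W and 170 W figures are the video's, not lab measurements):

```python
# Sketch: how Ollama's --verbose "eval rate" is derived, and a rough
# performance-per-watt comparison from the video's benchmark figures.

def eval_rate(eval_count: int, eval_duration_s: float) -> float:
    """Tokens generated per second: eval count divided by eval duration."""
    return eval_count / eval_duration_s

# Hypothetical run: 546 response tokens in 30 s matches the Orin Nano's
# observed ~18.2 tok/s on Llama 3.2 3B.
print(f"{eval_rate(546, 30.0):.1f} tokens/s")  # 18.2 tokens/s

# Tokens per watt, using the Llama 3.2 3B results and quoted power draws.
devices = {
    "Jetson Orin Nano Super": (18.2, 25.0),   # (tokens/s, watts)
    "RTX 3060 desktop":       (109.0, 170.0),
}
for name, (tok_s, watts) in devices.items():
    print(f"{name}: {tok_s / watts:.2f} tokens/s per watt")
```

Interestingly, by this rough measure the Orin Nano comes out slightly ahead per watt (about 0.73 vs. 0.64 tokens/s per watt), which supports the video's point that the two devices serve different end use cases even though the desktop card is far faster in absolute terms.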