Why captchas are getting harder

Vox

Want to watch this video? Please identify all the traffic lights first.

Subscribe and turn on notifications (🔔) so you don't miss any videos: http://goo.gl/0bsAjO

It's not you: captchas really are getting harder. The worst part is that you’re partly to blame.

Correction: At 5:22, we say that Google uses reCaptcha V2 data to train their self-driving cars and improve Google maps. While they have used V2 tests to help improve Google maps, according to an email from a Waymo representative (Google’s self-driving car project), they aren’t using this image data to train their autonomous cars.

For more on the future of self-driving cars, check out this article from Vox’s Kelsey Piper: https://www.vox.com/future-perfect/2020/2/14/21063487/self-driving-cars-autonomous-vehicles-waymo-cruise-uber

A captcha is a simple test that intends to distinguish between humans and computers. While the test itself is simple, there's a lot happening behind the scenes. The answers we give captchas end up being used to make AI smarter, thus ratcheting up the difficulty of future captcha tests.

But captchas can be broken by hackers. The tests we’re most familiar with already have been broken. Captcha makers try to stay ahead of the curve but have to balance increasing the difficulty of the test with making sure any person on earth (regardless of age, education, language, etc.) can still pass it. And eventually, they might have to phase out the test almost entirely.

Read more about captchas from the Verge: https://www.theverge.com/2019/2/1/18205610/google-captcha-ai-robot-human-difficult-artificial-inte

Subscribe to our channel! http://goo.gl/0bsAjO

Vox.com is a news website that helps you cut through the noise and understand what's really driving the events in the headlines. Check out http://www.vox.com.

Watch our full video catalog: http://goo.gl/IZONyE
Follow Vox on Facebook: http://goo.gl/U2g06o
Or Twitter: http://goo.gl/XFrZ5H


Video Transcript:

I am not a robot, and yet my computer accuses me of being one constantly. Sign up for a fitness app profile? Captcha. Getting a vaccine appointment? Captcha. Buying dumbbells? Captcha. Ordering cookies online because I have no self-control? And the most annoying part: I don't always pass these captcha tests on the first try. It feels like captchas are getting harder, and they are. But it turns out there's a lot more going on behind the scenes than just proving you're human.

[Music]

The word captcha is an acronym. It stands for Completely Automated Public Turing test to tell Computers and Humans Apart. So there's a little bit of cheating, because there's a lot of T's in there: Turing, test, to, tell.

Luis von Ahn invented captchas in the year 2000. He was a first-year PhD student at Carnegie Mellon University attending a talk by the chief scientist at Yahoo. In the year 2000, Yahoo was like the biggest tech company out there. The talk was about 10 problems they didn't know how to solve, and one in particular stood out. They had this problem that people would write programs to obtain millions of email accounts from Yahoo, and the people who did that were spammers. So they just couldn't figure out how to stop it. What we need is a test that can distinguish humans from computers.

The test needed to be passable by any human, regardless of age, gender, education, or language. That becomes even more challenging, because this is a test that a computer should not be able to pass, but a computer should be able to grade. So it's kind of a paradoxical idea.

The epiphany came when they realized that humans are really good at optical character recognition, aka reading. We read text at all kinds of angles, in different lighting conditions, when it's bent over the seams of a book, when it's in scratchy doctor handwriting. And we've been training ourselves on how to do this since we were kids. You don't need to be all that smart or know how to spell or anything. You're just kind of pattern matching.
Computers of the era were really bad at this, making it the perfect test. Captcha programmers would give the computer the correct text so it knew the answer. Then they'd stretch that text and warp it. The computer with the answer would be able to grade it, but a new bot that didn't have the answer wouldn't be able to understand it. Having cracked the code, they gave it to Yahoo, who started using it on their front page for sign-ups. Within a couple of weeks of the first implementation, it was being used millions of times a day.

And the test worked. It differentiated between humans and computers and helped stop bots. But in the background, all the letters and numbers humans typed were doing something else: making computers smarter.

In 2005, a new version of the test debuted, called reCaptcha. It used two words. One was generated, so that the computer knew the answer. The second word was pulled from a book or an old, distorted New York Times article, and the computer had no idea what that word was. When a human got the generated word right, the program assumes they likely got the other word correct as well, though they'd distribute the same word to several other people just to be sure. If there was consensus, they'd approve the word. So many tests were taken that a year's worth of New York Times articles were digitized roughly every four days. Then Google acquired reCaptcha in 2009 and began using the tech to digitize their scanned books and news archive.

When you repeat this process enough times, you begin to build a robust image library of distorted characters. And eventually, with enough images in this dataset, the computer becomes smart enough to extrapolate letters and words from new images. Captchas basically taught computers how to read extremely warped text. In a test by Google in 2014, a human could read their most distorted captchas with about 33 percent accuracy. Their AI got it right with 99.8 percent accuracy.

And once the computers got better than humans, the test had to change. Enter reCaptcha v2, which
features images instead of text. They serve the same purpose: differentiating between humans and computers and keeping the bots out. But this time, Google leveraged the tests by getting humans to teach machines how to identify real-world objects. You might have noticed that v2 tests often have us selecting transportation photos: fire hydrants, traffic lights, crosswalks, and more. Google uses this data to train their self-driving cars to see these objects, as well as to improve Google Maps.

But just like computers learned how to read warped text better than humans, they're also getting better than us at figuring out these picture puzzles. So much so that the test had to change again, as did the way the computers graded the test. No Captcha and its most recent counterpart, reCaptcha v3, verify that you're human just based on your behavior.

So how does that work? There's a secret test constantly running in the background, making this captcha nearly invisible. If you seem bot-like, like if you click around too quickly or type out paragraphs of text in seconds, then they'll make you take a standard picture test or ask you to verify yourself with two-factor authentication.

Pretty much now, if you use the web, basically you're being tracked. That's just it. The idea is, now we can tell that you're a bot or not because we can tell who you are. You can say this is creepy, but from a usability standpoint, that's a lot better. As opposed to me having to do some puzzle or whatever, you kind of already know: yep, this is a human.

But unlike previous versions of the test, there's no public-facing answer for what our clicks might be training computers to do. And it's not clear how long behavior-tracking captcha tests will last before computers can outsmart them.

It is my belief that at some point computers are going to be able to do everything that humans can. It may take a while, but at some point they'll be able to. And so there's not going to be a way to differentiate between a human and a computer.

This was not
the first idea we had. Actually, the first idea we had was giving you some images, and then we would ask you, what are these images of? Basically, we'll go find a lot of images of flowers, we'll give you a lot of flowers, and we would say, hey, can you tell us what these are images of? The problem with that is that humans were not that great at it. For one, it kind of required them to spell, and you'd be surprised how bad people are at spelling. And then secondly, if it's flowers, people could say plants. Or cars, but it turned out all cars also had tires, so people could say tire. And so it was kind of hard to get it right. Whereas with the text, it's this beautiful thing where not only are humans trained on it from very early, but also there is a key for everything that we display, like on the keyboard. So it's like: R? Yes, R. T? Yes, T. So that's why we settled on that.
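
Editor's sketch: the two-word scheme described earlier in the transcript (one generated word the computer can grade, one scanned word it cannot, approved once several people agree) could be modeled roughly as follows. This is a toy illustration in Python, not reCaptcha's actual code. The class name `TwoWordCaptcha`, the `normalize` helper, the word ids, and the three-vote threshold are all assumptions made for the example; the video only says the unknown word was shown to "several other people."

```python
from collections import Counter, defaultdict

# Hypothetical threshold: matching answers needed before a word is approved.
CONSENSUS_VOTES = 3


def normalize(text):
    """Compare answers case-insensitively and ignore surrounding spaces."""
    return text.strip().lower()


class TwoWordCaptcha:
    """Toy model of a control word plus an unknown scanned word."""

    def __init__(self, consensus_votes=CONSENSUS_VOTES):
        self.consensus_votes = consensus_votes
        # unknown word id -> tally of answers from people who passed the control word
        self.votes = defaultdict(Counter)
        # unknown word id -> transcription approved by consensus
        self.approved = {}

    def submit(self, control_answer, control_guess, unknown_id, unknown_guess):
        """Grade one attempt; return True if the user passes."""
        # Only the control word can be graded, because only its answer is known.
        if normalize(control_guess) != normalize(control_answer):
            return False
        # The user got the control word right, so their reading of the
        # unknown word is counted as one vote.
        if unknown_id not in self.approved:
            guess = normalize(unknown_guess)
            self.votes[unknown_id][guess] += 1
            if self.votes[unknown_id][guess] >= self.consensus_votes:
                self.approved[unknown_id] = guess
        return True
```

In this sketch, a caller would create one `TwoWordCaptcha` and call `submit` for each attempt; once enough passing users type the same reading for a scanned word, it appears in `approved`. A real system would also need to generate and distort the control word, which is left out here.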
