TWIL vol.4 (blocklists and APIs, encoding vs encryption, and does AI have an uncanny valley?)

This week I learned about malicious site blocklists and some APIs that might be interesting to dig into, read up on encoding vs encryption, and pondered whether AI could dip into the uncanny valley.

Photo by Siora Photography / Unsplash

A few things I learned, relearned, or just learned more about this week, mostly by reading, since I can't absorb knowledge by osmosis like the woman above seems to be able to. Yet.

Blocklists and APIs

When I hear about devices like Gryphon or Bark having access to hundreds of thousands of sites, or a browser, browser addon, search engine, or whatever else checking for malicious/unsafe sites, I've wondered how exactly they're doing that. It'd be incredibly inefficient for everyone to amass and maintain their own lists (although some probably try), but I never bothered to look for the source.

This week I stumbled on a site with a list of blocklists (how meta), which solves that mystery I suppose. Contact me for the movie rights. 🕵️‍♂️

Free Blocklists of Suspected Malicious IPs and URLs
Several organizations maintain and publish free blocklists of IP addresses and URLs of systems and networks suspected in malicious activities on-line. Some of these lists have usage restrictions:

Some of the links are broken, but there's some interesting stuff in there...

  • Lists of phishing sites by OpenPhish and Artists Against 419 (sites trying to trick you into entering your creds for another site, usually by taking advantage of typos or by using a popular domain as the subdomain of their own dodgy website)
  • Lists of malware sites by URLhaus (sites that trick you into downloading malicious software, either by making it look like a legit app or by scaring you with big popups warning that you have a virus and need to download their antivirus app)
  • APIs you can use in your own app, like the one from URLhaus and Google's SafeBrowsing API
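To make the idea concrete, here's a minimal sketch of how an app might check a URL against one of these lists once it's been downloaded. I'm assuming a plain-text feed with one URL or hostname per line and "#" for comments; real feeds vary in format, and the function names and sample entries below are made up for illustration.

```python
from urllib.parse import urlsplit


def load_blocklist(lines):
    """Build a set of blocked hostnames from feed lines (one URL or hostname per line)."""
    blocked = set()
    for line in lines:
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue  # skip blank lines and comments
        # Accept either a full URL or a bare hostname (an assumption about the feed format).
        host = urlsplit(entry).hostname if "://" in entry else entry
        if host:
            blocked.add(host.lower())
    return blocked


def is_blocked(url, blocked_hosts):
    """Return True if the URL's hostname, or any parent domain of it, is on the list."""
    host = (urlsplit(url).hostname or "").lower()
    parts = host.split(".")
    # Check "a.b.example.com", then "b.example.com", then "example.com", and so on.
    return any(".".join(parts[i:]) in blocked_hosts for i in range(len(parts)))


# Made-up feed contents, for illustration only.
sample_feed = [
    "# hypothetical feed",
    "http://paypal.com.dodgy-site.example/login",
    "malware-downloads.example",
]
blocked = load_blocklist(sample_feed)

print(is_blocked("https://paypal.com.dodgy-site.example/verify", blocked))  # True
print(is_blocked("https://www.paypal.com/signin", blocked))  # False
```

The first test URL is the "popular domain as a subdomain" trick from the phishing bullet: the real hostname is the dodgy site, and "paypal.com" is just decoration in front of it. Matching on the parsed hostname rather than searching the raw URL text is what keeps the legitimate site from being flagged.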

Encoding vs Encryption

The world of IT is filled to the brim with acronyms and terms, and it's easy to get some of them mixed up. Some of them are important to distinguish though, like the difference between encoding and encryption.

I came across some code online this week that was base64 encoding some values, but the names of the classes, methods, etc suggested the values were being encrypted. They're definitely not the same.

Encoding is for transforming data for one reason or another. For example, base64 encoding and decoding a string in C# is simple and not at all secure... it's not meant to be.
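The same round trip is just as short in Python (my choice for the sketches here, not the language of the code I was reading). Notice that decoding needs no key or password, which is exactly why it isn't security:

```python
import base64

original = "hello"

# Encode: text -> bytes -> base64 text.
encoded = base64.b64encode(original.encode("utf-8")).decode("ascii")

# Decode: no key or password required, anyone can do this.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)  # aGVsbG8=
print(decoded)  # hello
```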

There are neat things you can do with it too, such as encoding an image and embedding it in a webpage, like this:

<img title="Link" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAEAAAABYBAMAAAC3wuwbAAAALVBMVEUAAAApKSnGGCGMWilSlBA5lGu9ayH/ewDvY7V7vSFC3nPnlFL3pWv33kL///9VMw/WAAAAAXRSTlMAQObYZgAAAPtJREFUaN691lERgzAQBNBYOAu1gAUsxEIsxAIWsBALWMACFqqh7CyXpJ3Sr2b3A0LufdyEMCSELnYl3GU4MJvPrOs836DhwMsMkBJgiLZy9iZzBq+tDgdm+262LCQMxgCoKAHiG8bHOpBSIy0op6QDvlB9MKMC4SIpPc+k5PdalgASFLA4WDTCt09HAPyVEfA1KYFZKTGSMCjHWEpdycHAbJrwCMIFw53PqKgACVttI85rwLZ5W6WGFBUVwCBGJxz5vAqgfJwxa1cSDfBjBRo7rvj2qbt6OAB5PBx5ETPd1y0A+KkSscifqxK0Y16fr+fAgeAT/TyP/hu8AMIVt4cNSUt4AAAAAElFTkSuQmCC" />
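That src value is a data URI: the image's MIME type, the word "base64", and the base64-encoded bytes of the file. Here's a rough sketch of building one yourself. The helper names are hypothetical, and I'm using the 8-byte PNG file signature as stand-in data rather than a full image.

```python
import base64
import html


def to_data_uri(data: bytes, mime_type: str = "image/png") -> str:
    """Return a data: URI that embeds the given bytes as base64."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{payload}"


def to_img_tag(data: bytes, title: str, mime_type: str = "image/png") -> str:
    """Wrap the data URI in an img tag, escaping the title for use in an attribute."""
    safe_title = html.escape(title, quote=True)
    return f'<img title="{safe_title}" src="{to_data_uri(data, mime_type)}" />'


# Stand-in data: the signature every PNG file starts with, not a complete image.
# For a real image you would read the file's bytes instead.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

print(to_data_uri(PNG_SIGNATURE))
print(to_img_tag(PNG_SIGNATURE, "Link"))
```

Every base64-encoded PNG starts with the same "iVBORw0KGgo" characters, since they all begin with that signature. You can see it at the start of the image data above.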

Encryption, on the other hand, is about keeping your data safe. You can certainly encode an encrypted value, but the intent of encryption is different: without the right key, it can't be easily undone the way simple encoding can be decoded.
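To show the difference in code, here's a toy sketch using only Python's standard library. It encrypts with a one-time pad (XOR against a random key as long as the message), then base64 encodes the result. This is my own illustration, not something to ship: a real app should use a vetted crypto library rather than hand-rolled XOR, and the helper name is made up.

```python
import base64
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of the same length (one-time pad; the key must never be reused)."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"meet me at noon"
key = secrets.token_bytes(len(message))  # random and secret: this is what encoding lacks

ciphertext = xor_bytes(message, key)

# Encoding an encrypted value, e.g. so it can travel as text.
printable = base64.b64encode(ciphertext).decode("ascii")

# Anyone can strip the base64 layer, but that only gets them the ciphertext...
undone = base64.b64decode(printable)

# ...and recovering the message takes the key.
recovered = xor_bytes(undone, key)

print(recovered == message)  # True
```

The base64 step is reversible by anyone, while the XOR step is only reversible by whoever holds the key. Code that does the first and names it "Encrypt" is promising something it doesn't deliver.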

Here's a nice explanation and comparison:

Hashing vs. Encryption vs. Encoding vs. Obfuscation
This article concisely explains the differences between encryption, encoding, hashing, and obfuscation. Includes multiple examples of each…

AI and the uncanny valley

A new service called HereAfter claims to use AI to help loved ones deal with loss. Someone records a bunch of stories about themselves, and then family members can ask natural questions and hear answers from them after they've died. There's something unsettling about it, and the term "uncanny valley" popped into my head as I was reading about it.

If you've never heard of the uncanny valley, it's when something gets too close to realistic but misses the mark in a bad way. Like Dwayne Johnson in The Mummy Returns or anything in The Polar Express. One way to put it is that it's our brains experiencing "a human doing a terrible job at acting like a normal person". If something looks and acts mostly like a human but is missing some key quality, we reject it.

That had me contemplating whether a chat app pretending to be a human (especially a human you knew well) could fall into the uncanny valley too. The creator of HereAfter specifically says "We don't want to create a chatbot version of someone that kinda goes on living in the world today. . . [T]hat would start to raise the hair on the back of my neck." But someone will try it eventually.

ChatGPT has been a huge topic lately. You can ask it pretty much any question or make any request of it, and it'll spit out a really convincing response. What if someone combined that with a service like HereAfter, generating new responses that the person would likely have said? What if it mostly sounded like them, but was just slightly off? Would we be able to tell? What if all the generated responses sounded convincing, but then one response was so off the mark that we just knew they would never have said it in real life? 🤔