The coming wonder of the world in software and information circles, and especially among those who talk about them, is AI. Give a magical machine a pile of information, ask it a question, and it will give you a meaningful and useful answer. It will create art, write books, compose music, and generally change the world as we know it. All of this is really impressive stuff, as anyone who has played with DALL-E will tell you. But to avoid getting caught up in the hype, it’s important to think about what the technology can and can’t do that’s genuinely new, and when I do, I’m immediately drawn back to a past career of mine.
I knew I should have taken that 8051 firmware job instead
I am an electronic engineer by training, but when I graduated in the 1990s I was seduced by the Commodore CDTV into the world of electronic publishing. CD-ROMs were the thing, then suddenly they weren’t, so I drifted through games and online companies, and unexpectedly ended up working for Google. Was I by Larry and Sergey’s side? Hardly. The company I had worked for folded, so I found some temp agency work as a search engine quality assessor.
It’s a fascinating job that teaches you a lot about how search engines work, but as one of the trained monkeys against which the algorithm is tested, you’re at the bottom of the Google pile. This led me into the strange world of white hat search engine marketing companies, where my job turned into discovering the field of computational linguistics for myself without realizing it was already a thing, and using it to guide clients toward creating better content for their websites.
At this point, it’s probably time to talk about how the search engine marketing business works. If you own a website, at some point you will almost certainly have been bombarded by search engine optimization, or SEO, companies offering you the chance to be number one on Google. As we used to say: if someone says that to you, ask their name. If it’s Larry Page or Sergey Brin, hire them. Otherwise, don’t.
What the majority of these companies did was find chinks in the search giant’s armor: ways to exploit the algorithm to deliver a good result on a carefully selected keyword. The result was a constant battle between the SEOs and the algorithm developers, something we saw first hand as quality assessors. If you’re unwise enough to hire a black hat SEO company, any success you achieve will inevitably be taken away by an algorithm update, and you’ll likely be cast into search engine hell as a result.
At the white hat end of the scale, the job is different. You have a customer with a website they think is good, but because it has little interesting content beyond what they sell, the search engine doesn’t agree with them. Your job is to help them turn it into an amazing website full of interesting, authoritative, and constantly updated content, and there are no shortcuts. Computational linguistic analysis of competitors’ search results pages and websites would provide a healthy list of things to talk about, but making it happen was impossible without someone putting in a lot of hard graft and creating the content. If you think about Hackaday for a moment, my colleagues have an incredible breadth of experience and are really good writers, so this site has really good content, but behind all that is a lot of work as we pound away at our keyboards creating it.
Does a thing have to be smart to tell you things you didn’t know?
If there’s one amazing thing that corpus text analysis can do for you, it’s to tell you something you didn’t know about something you thought you knew. There were many times when our clients gained a whole new insight into their industry by looking at a corpus of the rest of the industry’s information. They may know everything there is to know about the widgets they produce, but it turns out they often know very little about how the world talks about those widgets.
But at this point it’s super important to understand that a corpus analysis system isn’t smart, and it doesn’t try to be. Compared with AI, it’s a big pot full of sentences where the idea is to make the things you want float to the top when you stir it, while AI is an attempt to create a magical smart box that knows all that information and tells you the right things from its own mind when you ask. For simplicity, I will refer to the two as simply dim and bright.
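To make the pot-of-sentences idea concrete, here is a minimal Python sketch. It is not the actual system described in this article: the tiny corpus, the stopword list, and the function names are all invented for illustration. It simply counts which words share a sentence with a seed term, so that the most common companions float to the top.

```python
import re
from collections import Counter

# Hypothetical mini-corpus; a real one would hold many thousands of sentences.
CORPUS = [
    "Our widgets are machined from solid brass.",
    "Customers complain that cheap widgets rust within a year.",
    "Brass widgets resist rust far better than steel ones.",
    "Shipping costs for steel fittings rose again.",
    "Rust is the main reason people replace widgets.",
]

# Hand-picked filler words to ignore (an assumption; real lists are much longer).
STOPWORDS = {"a", "are", "far", "for", "from", "is", "ones", "our",
             "than", "that", "the", "within"}


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic words."""
    return re.findall(r"[a-z]+", sentence.lower())


def cooccurring_terms(corpus, seed, top_n=3):
    """Count words appearing in the same sentence as `seed`, most common first."""
    counts = Counter()
    for sentence in corpus:
        words = set(tokenize(sentence))  # count each word once per sentence
        if seed in words:
            counts.update(words - STOPWORDS - {seed})
    return counts.most_common(top_n)


print(cooccurring_terms(CORPUS, "widgets", top_n=2))
```

With this toy data, the words that surface for "widgets" are "rust" and "brass". There is no understanding involved, only counting, which is exactly what makes this kind of tool dim rather than bright.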
I’m very happy to be writing for Hackaday and not optimizing websites anymore, but I still follow the world of content analysis because it interests me. I’ve noticed a tendency in that world to discover AI and have a mind-blowing moment. This technology is amazing, they say, it can do all these things! And it can, but this is where I find myself puzzled. I see people who presumably have access to and experience with the “dim” tools that do statistical analysis of a pile of data reacting with surprise when a “bright” tool does the same job using an AI model trained on the same data. And I guess that is my point. AI is a very cool technology, but it’s cool because it can do new things, not because it can do things other tools already do. I’ve even read search engine marketers gushing about how an AI can tell you how to be a search engine marketer, when all I see is an AI that presumably has a few search engine marketing guides in its training data, just repeating what it knows from them.
Don’t put AI on a pedestal just because it’s new to you
A friend of mine is somewhere near the bleeding edge of text-based AI, and I’ve taken the opportunity to increase my knowledge by asking him to show me what’s under the hood. It’s a technology that can sometimes amaze you by seeming smart and human (one of the things he demonstrated was a model that makes a very passable D&D DM, and being a DM is something that takes some skill to do well), but I despair that it has been placed on a hype pedestal. It’s clear that AI tools will find their place and become an indispensable part of our technological future, but let’s have some common sense when we get excited about them, please!
My pot of sentences eventually developed into a full-fledged corpus analysis system that got me a job at a well-known academic publisher. When fed news data, it could sometimes predict election results, but even with that party trick, I never found a freelance client for it. Maybe its time has passed, and an AI can now do a better job.
Meanwhile, I worry about how the black hats in my former industry will use the new tools, and that an avalanche of AI-generated content that appears to be of higher quality than it is will pollute search results with unfilterable junk. Who knows, maybe an AI will be employed to detect it. One thing you can count on, though: Hackaday content will remain written by real people with demonstrable knowledge of the subject!