
There are two sides to every coin. In this case, the other side of the coin is a crucial stage in the AI development process: data creation and data annotation.
Whether it be the ability to identify cars in an image or understand human speech, AI technologies are built to replicate or even surpass the things humans can do.
In order to do this, the AI has to learn from humans, that is, from human-annotated data.
This data is often referred to as AI training data or ground truth data. It serves as the building blocks the AI algorithm learns from in order to perform well.
For a chatbot, this data could be in the form of customer service email histories.
With speech recognition technology, like Amazon Alexa, this training data might be recordings of natural conversations.
For autonomous vehicles, some of the ground truth data would be in the form of traffic images with cars, traffic lights, and pedestrians.
However, autonomous vehicles require more than just raw traffic images. They also require the cars, bikes, and pedestrians in those images to be accurately labeled, as in the image below.
[Image via lionbridge.ai]
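To make the idea of a labeled traffic image concrete, here is a minimal sketch in Python of what one annotation record might look like. The class names, field names, file name, and pixel coordinates are all hypothetical and chosen for illustration; real annotation tools each define their own formats.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class BoundingBox:
    """One labeled object: a class name plus a box in pixel coordinates."""
    label: str   # object class, e.g. "car" or "pedestrian"
    x_min: int   # left edge
    y_min: int   # top edge
    x_max: int   # right edge
    y_max: int   # bottom edge

    def area(self) -> int:
        # Box area in square pixels
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


@dataclass
class ImageAnnotation:
    """All boxes an annotator drew on a single image."""
    image_file: str
    boxes: list


def label_counts(annotation: ImageAnnotation) -> dict:
    # Count how many objects of each class were labeled in the image
    counts = {}
    for box in annotation.boxes:
        counts[box.label] = counts.get(box.label, 0) + 1
    return counts


# Hypothetical example: one traffic image with three labeled objects
annotation = ImageAnnotation(
    image_file="traffic_0001.jpg",
    boxes=[
        BoundingBox("car", 40, 120, 220, 260),
        BoundingBox("car", 300, 130, 460, 250),
        BoundingBox("pedestrian", 500, 100, 540, 240),
    ],
)

print(label_counts(annotation))
# Serialize the record so it can be stored alongside the raw image
print(json.dumps(asdict(annotation), indent=2))
```

A record like this, paired with the raw image, is roughly what a model would learn from: the image is the input, and the labeled boxes are the ground truth.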
The task of labeling images, text, audio, and other forms of data is done by human workers. Humans look at each piece of data and label it according to the project’s needs.
Hundreds of thousands, if not millions, of jobs in AI come from a growing demand for annotated data.
However, when we talk about annotating data, we don’t mean a couple hundred images or words.
In fact, many AI algorithms require millions of pieces of data in order to perform and generalize well. That means millions of cars need to be labeled or millions of audio clips need to be transcribed.
As a result, many annotation projects require thousands of contributors.
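A quick back-of-the-envelope calculation shows how millions of items can translate into thousands of contributors. The sketch below is only an illustration: the function name and every input figure (item count, seconds per item, hours per contributor) are assumptions, not numbers from a real project.

```python
import math


def contributors_needed(num_items: int,
                        seconds_per_item: float,
                        hours_per_contributor: float) -> int:
    """Estimate how many contributors a project needs, rounding up."""
    total_hours = num_items * seconds_per_item / 3600
    return math.ceil(total_hours / hours_per_contributor)


# Hypothetical project: 5 million images, 30 seconds per image,
# and each contributor working 20 hours on the project in total.
print(contributors_needed(5_000_000, 30, 20))  # 2084
```

Under these assumed figures, the work adds up to more than 41,000 hours, which is far more than a small in-house team could absorb.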
However, given the simplicity of the tasks, e.g., drawing a box around a car, it is inefficient for data scientists to spend their time on annotation.
Data scientists want to spend their time researching, developing, and testing.
Therefore, many tech companies outsource their annotation tasks rather than assign the work in-house.
Thus, an entire market around AI training data was born. Today, there is a multitude of data annotation companies that offer a variety of services and annotation tools.
The best part about data annotation jobs is that most tasks can be done completely remotely. All you need is a decent laptop and an internet connection.
The remote nature of the work allows employers to hire annotators all over the world.
If you’re looking for entry-level online jobs in data annotation, here are a few places to start:
Limarc is a Canada-born writer specializing in AI, tech, video games, and pop culture. He currently works as Director of Content & Marketing at ISNation. He has been published on various high-profile websites such as Towards Data Science, Hacker Noon, Becoming Human, and of course, Data Driven Investor. Outside of work, he spends most of his time reviewing game titles, writing guides, getting lost in virtual reality, and exploring Japan.