Whether it is neural networks, deep learning or natural language processing, machine learning is rapidly expanding into more and more exciting projects through a wide variety of open-source code. You can swap faces, build recommendation engines, get your code to write any text: the sky is the limit.
With thousands of ML projects out there, it becomes more difficult to pick what to work on next. It should be something useful but fun, be highly rated by others and teach you something along the way.
To help you choose your next machine learning project, we have gathered 11 of our favourite use cases. We have scoured the interwebz, looked at fun levels, complexities and GitHub ratings. Mind you: this might not be the beginner playground anymore.
Let’s dive in!
See what people say on Twitter with sentiment analysis
Twitter is an ever-evolving well of opinion and information. But how can you filter to only see the information you need? This is where Twitter sentiment analysis comes in handy. This basically means that you can run through the whole set of tweets and select only those that are related to a topic relevant to you.
In this example, Data Driven Investor has used this exact technique to find disaster-related tweets. This process could help to surface relevant news and improve disaster response by making tweets about an event easy to find.
This process can also be used to track tweets about any subject. The traditional sentiment analysis algorithm is adjusted: instead of labelling each tweet as positive or negative, it labels each tweet as related or unrelated to the topic.
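To make the "related or unrelated" idea concrete, here is a minimal sketch of such a classifier. It is not the approach from the linked example: it swaps in a tiny Naive Bayes model written with the standard library only, and the tweets, labels and function names are all invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy tweets, invented for illustration.
# Label 1 = related to a disaster, label 0 = not related.
TRAIN = [
    ("forest fire spreading near the town evacuate now", 1),
    ("earthquake damage reported downtown roads closed", 1),
    ("flood waters rising residents evacuate the area", 1),
    ("this new album is fire love every track", 0),
    ("traffic was a disaster but the concert was great", 0),
    ("my phone battery died during the movie", 0),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count words per class and documents per class."""
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the higher Naive Bayes log-score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (0, 1):
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing so unseen class/word pairs do not zero the score.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAIN)
print(predict(model, "residents evacuate as flood damage closes roads"))
print(predict(model, "love this new movie"))
```

With only six training tweets this is a toy, but the shape is the same as in a real project: swap the labels from positive/negative to related/unrelated and the rest of the pipeline stays put.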
Start here:
Find the practice dataset here, and more about this technique at fast.ai.
Making art and music with Magenta
Are you ready to make art with nothing but code?
Magenta is a research project exploring the role of machine learning in the process of creating art and music. They develop deep learning and reinforcement learning algorithms for generating songs, images, drawings, and other materials. It’s also an exploration of building smart tools and interfaces that allow artists and musicians to extend their processes using these models. Magenta was started by some researchers and engineers from the Google Brain team, but many others have contributed significantly to the project.
Start here:
Access the Magenta project with an impressive 14K stars on GitHub, with hundreds of contributors. Let’s get composing!
Digitize handwriting using the MNIST dataset
Deep learning and neural networks play a vital role in image recognition. Together they can help you recognize a single handwritten digit and translate it into digital text. This machine learning project helps you build your deep learning and logistic regression skills, and teaches you how to convert pixel data into an image.
This project is suitable for beginners too, and you can raise the difficulty by extending your dataset to other handwriting.
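If "converting pixel data into an image" sounds abstract, the sketch below shows the idea with the standard library only. MNIST stores each digit as a flat run of 784 grayscale values (28 x 28 pixels, 0 for background up to 255 for ink); here a made-up 5 x 5 digit stands in for the real thing, and the helper names are hypothetical.

```python
def to_rows(pixels, width):
    """Split a flat list of grayscale values into rows of `width` pixels."""
    if len(pixels) % width != 0:
        raise ValueError("pixel count must be a multiple of the row width")
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


def render(rows, threshold=128):
    """Draw each pixel as '#' (ink) or '.' (background)."""
    return "\n".join(
        "".join("#" if value >= threshold else "." for value in row)
        for row in rows
    )


# A hypothetical 5x5 "image" of the digit 1, flattened row by row.
flat = [
    0, 0, 255, 0, 0,
    0, 255, 255, 0, 0,
    0, 0, 255, 0, 0,
    0, 0, 255, 0, 0,
    0, 255, 255, 255, 0,
]

image = to_rows(flat, width=5)
print(render(image))
```

For a real MNIST record you would call `to_rows(pixels, width=28)` on its 784 values; a plotting library can then show the rows as a proper grayscale picture.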
Start here:
You can use the MNIST dataset, which contains 70,000 labelled images of digits (60,000 for training and 10,000 for testing). More about this technique here.
Transfer styles for images and videos with TensorFlow
Have you ever wondered if you could just transfer the style of one image to another? To mash the Mona Lisa with Andy Warhol? This is just one of the most exciting projects you can do with TensorFlow in just a few seconds.
Logan Engstrom’s implementation is based on a combination of Gatys’ A Neural Algorithm of Artistic Style, Johnson’s Perceptual Losses for Real-Time Style Transfer and Super-Resolution, and Ulyanov’s Instance Normalization.
The project is impressively easy to implement and has earned over 8,000 stars on GitHub.
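Of the three ingredients, instance normalization is the easiest to show in a few lines. The sketch below is not taken from the project's code: it is a plain-Python illustration of the core idea, with nested lists standing in for tensors and without the learnable scale and shift that real implementations usually add. Each channel of a single image is normalized with its own mean and variance, which strips out image-specific brightness and contrast.

```python
import math


def instance_norm(image, eps=1e-5):
    """Normalize each channel of one image to zero mean and unit variance.

    `image` is a list of channels; each channel is a list of rows of floats.
    """
    normalized = []
    for channel in image:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)  # eps avoids division by zero
        normalized.append([[(v - mean) * scale for v in row] for row in channel])
    return normalized


# Two hypothetical 2x2 channels with the same pattern
# but different brightness and contrast.
image = [
    [[0.0, 2.0], [4.0, 6.0]],
    [[10.0, 10.5], [11.0, 11.5]],
]

out = instance_norm(image)
for channel in out:
    print([[round(v, 3) for v in row] for row in channel])
```

After normalization both channels hold almost identical values, which is the point: the pattern survives while the original brightness and contrast do not.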
Start here:
Access the GitHub documentation here.
Try deep universal probabilistic programming with Pyro
Do you know how Uber matches drivers to riders, optimizes routes and builds the next generation of driving? They do it with the help of Pyro, their own AI framework, which they open-sourced in 2017.
Pyro itself brings together the best of modern deep learning, Bayesian modeling, and software abstraction: it is a modern, universal, deep probabilistic programming language. It elevates the usually cumbersome probability techniques by marrying probability with the representational power of programming languages. It is universal, scalable, minimal and flexible – which means you can use it freely for your own probability queries.
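To get a feel for what a "probability query" is, here is a tiny sketch in plain Python. To be clear, this does not use Pyro or imitate its API: it answers one simple query by brute-force enumeration, which is the kind of bookkeeping a probabilistic programming language automates and scales up. The coin example and all names are invented for illustration.

```python
from math import comb


def posterior_over_bias(heads, flips, candidates):
    """Posterior over a coin's bias after seeing `heads` in `flips` tosses.

    Starts from a uniform prior over the candidate bias values.
    """
    prior = 1.0 / len(candidates)
    unnormalized = {
        p: prior * comb(flips, heads) * p ** heads * (1 - p) ** (flips - heads)
        for p in candidates
    }
    evidence = sum(unnormalized.values())
    return {p: weight / evidence for p, weight in unnormalized.items()}


# Query: which bias best explains 8 heads in 10 flips?
candidates = [0.1, 0.3, 0.5, 0.7, 0.9]
posterior = posterior_over_bias(heads=8, flips=10, candidates=candidates)
best = max(posterior, key=posterior.get)

for p, prob in posterior.items():
    print(f"bias {p}: {prob:.3f}")
print("most probable bias:", best)
```

Enumeration only works because there are five candidate values; with continuous parameters and neural networks in the model, you need the approximate inference machinery that a framework like Pyro provides.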
Start here:
Access it here through GitHub, with over 5,000 stars.
Real-time face detection and emotion/gender classification
Get ready to read emotions – the machine learning way! Through this handy project, you can determine both the gender and the emotional state of the people in an image. It is based on a convolutional neural network, which in one step detects a face, defines gender and classifies the emotion. The algorithm reported 96% accuracy when tested on the IMDB gender dataset and 66% on the FER-2013 emotion dataset.
Over 4,000 coders have given it a star on GitHub and can now detect emotions in real time through the algorithm.
Start here:
Access documentation, code and everything you need on GitHub.
Predict diseases with code
Medical science is one of the areas which can greatly benefit from machine learning. By considering a wide array of diverse data, we can make better-informed assumptions to prevent and cure diseases.
For example, you can build a model to predict breast cancer in RStudio. You can use the Breast Cancer Wisconsin (Diagnostic) Dataset from the UC Irvine Machine Learning Repository. With the random forest method, you will be predicting one of two classes: a malignant or a benign breast mass.
Start here:
Get a detailed description of the project here.
Recognize human activities
We can even teach our machines to recognize human movements, whether for fitness, posture or body measurements. It works on the basis of simple multi-class classification, using support vector machines (SVMs) and AdaBoost.
You will be using the Human Activity Recognition dataset. It was built from the recordings of 30 study participants performing activities of daily living (ADL) while carrying a waist-mounted smartphone with embedded inertial sensors. The objective is to classify activities into one of the six activities performed (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING).
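The overall pipeline can be sketched in plain Python: cut the sensor signal into windows, summarize each window with a couple of features, then classify. Note the swap: to stay dependency-free, the sketch uses a nearest-centroid classifier as a stand-in for the SVMs and AdaBoost mentioned above, and the sensor readings, window sizes and function names are invented for illustration.

```python
import statistics


def window_features(signal, size, step):
    """Slide a fixed-size window over a 1-D signal; return (mean, std) per window."""
    feats = []
    for start in range(0, len(signal) - size + 1, step):
        window = signal[start:start + size]
        feats.append((statistics.fmean(window), statistics.pstdev(window)))
    return feats


def fit_centroids(features, labels):
    """Average the feature vectors of each class."""
    grouped = {}
    for feat, label in zip(features, labels):
        grouped.setdefault(label, []).append(feat)
    return {
        label: tuple(statistics.fmean(dim) for dim in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(centroids, feat):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(centroids[label], feat)),
    )


# Hypothetical accelerometer magnitudes, invented for illustration.
sitting = [1.00, 1.01, 0.99, 1.00, 1.01, 0.99, 1.00, 1.00]
walking = [0.6, 1.5, 0.7, 1.4, 0.5, 1.6, 0.8, 1.3]

# Windows of 4 readings with 50% overlap give 3 windows per recording.
train_feats = window_features(sitting, 4, 2) + window_features(walking, 4, 2)
train_labels = ["SITTING"] * 3 + ["WALKING"] * 3
centroids = fit_centroids(train_feats, train_labels)

new_window = window_features([0.7, 1.4, 0.6, 1.5], 4, 4)[0]
print(classify(centroids, new_window))
```

Two features and two classes keep the example readable; the real dataset covers all six activities, and a stronger classifier is what the project itself calls for.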
Start here:
This project is also fit for beginners, leading them into the world of classification. Access the data here.
Build a self-driving car
While the automotive industry is busy refining the future trend of self-driving cars, you don’t have to feel left out. You can build a donkeycar: a small self-driving RC car with the appropriate software and hardware. It helps you to experiment with autopilots, mapping, computer vision and neural networks. You can drive your car using a computer or phone, record images, steering angles and throttle values, then train neural net pilots to drive your car on different tracks. You can even race your car in a DIY Robocars race.
Start here:
Find all the information about the project on the Donkeycar website or on GitHub.
FaceSwap in videos
Not long ago the internet was full of faceswap images: you could swap faces with your buddies, celebrities or even the Queen of England! This technology is enabled by deep learning and neural networks, which means you can program your own tool too.
DeepFaceLab has built a whole set of tools to replace faces in videos, and has earned a highly impressive 10K+ stars on GitHub! They also have a ready-to-use face set, tons of instructions and dedicated communities in several languages.
Start here:
Access the GitHub library and all associated pages from here.
Predict stock prices
Financial institutions have pioneered machine learning, from automated trading to price predictions and advanced analysis. Now you can get a slice of their success too!
There is a great collection of tools named “bulbea” – referring to the bull and bear movements of the stock market. It is an Open Source Python module (released under the Apache 2.0 License) that consists of a growing collection of statistical, visualization and modelling tools for financial data analysis and prediction using deep learning.
With over 1,000 stars, it is the most popular stock prediction project on GitHub.
Start here:
Access the code base here and more documentation on bulbea’s website.
Clone a voice and generate arbitrary speech
Would you like to make Morgan Freeman talk? Or your favourite cartoon character? Maybe you want to clone the voice of your boss? Well, now you can! With as little as 5 seconds of voice recording, this algorithm will capture the identity of the speaker and can play it back on any text. The neural network-based system for text-to-speech (TTS) synthesis is able to generate speech audio in the voices of many different speakers, including those unseen during training.
Start here:
You can see a video demonstration here about how it works, and find the documentation on GitHub with over 8,800 stars.
The possibilities of what machine learning can do are expanding day by day. New techniques are popping up and luckily many of them are available through open-source projects. You can predict, edit and create endlessly with the help of just your computer.
So, what are you waiting for? Get ready to dive in and share with us your favorite machine learning projects!