With all the talk about machine learning in recent years, it’s hard to ignore the desire to try it for yourself. The technology seems to be evolving at a very rapid pace, and it’s already found applications in many environments. If you have some programming skills and a basic understanding of some concepts about statistics, you’re good to go in terms of skill requirements.
However, you’ll also have to consider the hardware that you’ll need for this. You can either host everything at home or use remote services—both have their advantages and disadvantages.
Basic Requirements for Machine Learning Development
You’re going to need some relatively powerful hardware to get things going. While you can run most related tools on an inexpensive laptop, you’ll be severely limited in your learning potential, and everything will take much longer than it needs to.
Your GPU (Graphics Processing Unit) is the most important component here. It doesn’t have anything to do with graphics directly. It’s just that GPUs happen to be better suited for the types of calculations that machine learning relies on.
A GPU that supports CUDA will be even better here, though it will cost you more to get your hands on one. Don’t worry if you can’t afford this kind of hardware at the moment. You can also run your solutions remotely, though you’ll have to deal with the ups and downs of that setup.
Why Your Costs May Be Higher in 2021
It’s also worth noting that shopping for new hardware for machine learning can be even more challenging right now. There’s a tricky global situation developing revolving around a shortage of semiconductors used in the manufacturing of various consumer electronics. From GPUs to smartphones and other devices, a lot of markets have been affected.
Some predictions claim that this shortage could last for several more years, as it was the result of several factors aligning unexpectedly. Between the pandemic impairing production capabilities and boosting demand, and miners and scalpers buying out the whole stock, the situation has been challenging for those who just want to get a new GPU.
It’s not clear when prices are going to normalize either - prices might continue to climb. Looking for a used GPU could be a better option, though you can’t guarantee that you’re going to find something suitable.
Benefits and Disadvantages of Hosted Platforms
A hosted platform for machine learning development will allow you to focus on the actual development work without worrying about hardware considerations. You will benefit from advanced processing power, and these platforms can typically run your solutions much faster than anything you could build at home.
Of course, this kind of power doesn’t come for free. You’ll have to pay a subscription fee to use most of these services. The ones that are offered for free come with their own separate limitations.
For example, you may not be able to run your program on demand, and may have to wait in a queue. This can be particularly problematic for longer training sessions, where you will have to add a few extra hours on top of an already long waiting period.
And then, some people just feel more comfortable in their work when they have everything available locally. It can certainly be more convenient to work with machine learning this way when some models can be several gigabytes, and it can take some time to transfer them to and from the appropriate servers.
The Best of Both Worlds
You could use a mixed approach. Do most of your development locally–like actual work on your algorithms and models–and use a hosted service for major, expensive processing.
You can usually submit your data in batches to have it processed all at once over a period of time, and you just have to come back to retrieve your results afterwards. This can work well when you don’t need immediate results, and it can allow you to perform expensive training at a relatively low cost.
This is the approach most people go for these days. If you don’t want to spend too much on hardware, but are fine with the idea of spending some money on this in the first place, this is likely what you should be looking into.
There are various offers on the market, with some aimed at people with smaller budgets, so take a look around and see what’s available out there. Sometimes you can get away with having your projects hosted for surprisingly little, as long as they don’t have any complex requirements.
Be Careful With Sensitive Data
Remember that machine learning can often involve working with sensitive data. For example, you may be tasked with processing medical records or other personal information. It should go without saying that you need to be much more careful in these situations if you’re working with remote hosted services.
You have to be aware of the implications of transmitting that data to remote servers. Sometimes you might find yourself in violation of certain legal frameworks without even realizing it. In the European Union for example, you have to be very careful with GDPR.
It’s a good idea to consult a legal specialist if your machine learning exercises are going to involve any kind of sensitive data. Even better, you should probably not use this kind of data for your first training projects in the first place. Just go with something that’s safer and easier to handle.
Machine Learning on Your Own
Machine learning at home is feasible, and it has many advantages. But it also has some negative implications that you need to consider, and you have to make sure to find a balanced approach in the end. Pay special attention to details like working with sensitive data, and always familiarize yourself with any legal requirements your situation might impose on you.
In the end, this can be a very fun and productive experience that can put you in a great position in the job marketplace.
About The Author