How to Become a Data Scientist

By - Source: Tom's IT Pro

Whichever road you take into data science, it leads to the same destination: a job assembling, analyzing and interpreting large data sets to look for information of interest or value.

Data science encompasses "Big Data," data analytics, and more. It is becoming a vital discipline in IT because it enables businesses to extract value from the many kinds of data they collect in doing whatever it is that they do.

For those who do business with customers, it lets them learn more about those customers. For those who maintain a supply chain, it helps them to understand more and better ways to request, acquire and manage supply components. For those who follow (or try to anticipate) markets – such as financials, commodities, employment and so forth – it helps them construct more accurate and insightful models for such things.

In fact, no matter where you look for data, if large amounts of information are routinely collected and stored, data science can probably find something useful or interesting to say about such collections, if those who examine them can frame and process the right kinds of queries against that data. That's what explains the increasing and ongoing value of data science for most companies and organizations, since all of them routinely collect and maintain various kinds of data nowadays.

Basic Educational Background

The basic foundation for a long-lived career in IT, for anybody getting started, is a bachelor's degree in something computing related. This usually means a degree in computer science, management information systems (MIS), computer engineering, informatics or something similar. Plenty of people transition in from other fields, to be sure, but the more math and science under one's belt when making that transition, the easier it will be. Given projected shortages of IT workers, especially in high-demand subject areas – which include not only data science but also networking, security, software development, IT architecture and its various specialty areas, and more – it's hard to go wrong with this kind of career start.

For data scientists in particular, a strong mathematics background, particularly in statistics and analysis, is strongly recommended, if not required outright. This goes naturally along with an equally strong academic foundation in computing. Those willing to slog on through to a master's or PhD before entering the workforce may find data science a particularly appealing and remunerative field of study when their slog comes to an end. If so, they can also jump directly into mid- or expert/senior-level career steps, respectively.

Early Career Work Focus and Experience

If data science is a long-term goal, the more experience one has in working with data, the better. Traditional paths into data science may start directly in that field, though many IT professionals also cross over from programming or database positions.

Much of the focus in data science comes from working with so-called "unstructured data" – a term used to describe collections of information usually stored outside a database, such as large collections of event or security logs, e-mail messages, customer feedback responses, other text repositories and so forth. Because working with unstructured data is an increasingly large part of what data scientists do, many IT pros find it useful to dig into NoSQL technologies and data platforms such as Hadoop, Cloudera and MongoDB. Early-career IT pros will usually wind up focusing on programming for big data environments, or working under the direction of more senior staff to groom and prepare big data sets for further interrogation and analysis.

At this early stage of one's career, exposure to text-oriented programming and basic pattern-matching or query formulation is a must, along with a strong and expanding base of coding, testing and code maintenance experience. Development of basic soft skills in oral and written communications is a good idea, as is some exposure to basic business intelligence and analysis principles and practices. This leads directly into the early-career certifications mentioned in the next section.
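To make "text-oriented programming and basic pattern-matching" a little more concrete, here is a minimal sketch in Python of the kind of exercise an early-career practitioner might try: turning raw log lines into records and asking simple questions of them. The log format, the sample lines and every function name below are invented for illustration; real event or security logs vary widely, and this is only one of many reasonable ways to approach the task.

```python
import re
from collections import Counter

# Hypothetical log lines, made up for this example. Real logs will differ.
SAMPLE_LOG = """\
2017-03-01 08:15:02 ERROR auth Failed login for user alice
2017-03-01 08:15:09 INFO web GET /index.html 200
2017-03-01 08:16:44 ERROR auth Failed login for user bob
2017-03-01 08:17:30 WARN disk Usage at 91 percent
2017-03-01 08:18:05 ERROR auth Failed login for user alice
"""

# Assumed layout: date, time, level, source, then free-form message text.
LINE_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<source>\w+) (?P<message>.*)$"
)

# A second pattern that digs a detail out of the free-form message.
FAILED_LOGIN = re.compile(r"Failed login for user (?P<user>\w+)")


def parse_lines(text):
    """Turn raw text into a list of dicts, skipping lines that do not match."""
    records = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line)
        if match:
            records.append(match.groupdict())
    return records


def count_by_level(records):
    """Count how many records carry each severity level."""
    return Counter(record["level"] for record in records)


def failed_logins_by_user(records):
    """Count failed-login messages per user name."""
    counts = Counter()
    for record in records:
        match = FAILED_LOGIN.search(record["message"])
        if match:
            counts[match.group("user")] += 1
    return counts


if __name__ == "__main__":
    records = parse_lines(SAMPLE_LOG)
    print(count_by_level(records))
    print(failed_logins_by_user(records))
```

The same idea, imposing just enough structure on free-form text to count and compare things, carries over to larger tools; on a big data platform the parsing step would typically run in parallel across many files rather than over a single string.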

Early-Career Big Data Certifications and Learning

Basic data science training is now readily available online in the form of massive open online courses, or MOOCs. Among the many offerings currently available, the KDnuggets article Top 20 Data Science MOOCs lists courses from such institutions as MIT, Harvard, Caltech, Stanford, Brown and others, along with edX courses from Microsoft. Microsoft has since instituted a Microsoft Professional Degree in Data Science that includes nine courses on a variety of related topics, plus a capstone project, to present a reasonably complete introductory curriculum on this subject matter. (Courses aren't free, but at $59 each, they are pretty inexpensive.)

Mid-career Work Focus and Experience

Data science is a big subject area, so by the time you've spent three to five years in the workforce and have started to zero in on a career path, you'll also start narrowing in on one or more data science specialties and platforms. These include areas such as big data programming, analysis, business intelligence and more. Any or all of them put you in a front-line data science job of some kind, even as you narrow your focus on the job.

This is the career stage at which you'll develop increasing technical skills and knowledge, as you also start to gain more seniority and responsibility among your peers. Soft skills become more important mid-career as well, because you'll have to start drawing on your abilities to communicate with and lead or guide others (primarily on technical subjects related to data science and its outputs or results) during this career phase.

Mid-career Big Data Certifications

This is a time for professional growth and specialization. That's why there is a much broader array of topics and areas to consider as one digs deeper into data science to develop more focused and intense technical skills and knowledge. Data science-related certifications can really help with this, but will require some careful research and consideration. Thus, for example, one person might decide to dig into certifications related to a particular big data platform or toolset – such as Cloudera, EMC, Microsoft, Oracle or SAS. You'll find all of these described in our TIP story, Best Big Data Certifications for 2017.

This is a point at which one might choose to specialize more in big data programming for Hadoop, Cloudera or MongoDB on the one hand, or in running analyses and interpreting results from specific big data sets on the other. Cloudera covers most of these bases all by itself, which makes its offerings worth checking out: among many other certifications, it offers Data Scientist, Data Engineer, Spark and Hadoop Developer, and Administrator for Apache Hadoop credentials. There are dozens of Big Data certifications available today, with more coming online all the time, so you'll have to follow your technical interests and proclivities to learn more about which ones are right for you.

Expert or Senior Level Work Focus and Experience

After 10 or more years in the workforce, it's time to get serious about data science/Big Data. This is the point at which most IT professionals start reaching for higher rungs on the ladder of job roles and responsibilities.

Jobs with such titles as senior data analyst, senior business intelligence analyst, senior data scientist, big data platform specialist (where you can plug in the name of your chosen platform in searching for opportunities), senior big data developer, and so forth, represent the kinds of positions that data science pros are likely to occupy at this point on the career ladder. Expert or senior level IT pros will often be spearheading project teams of varying sizes by this point as well, even if their jobs don't carry a specific management title or overt management responsibilities. This means that soft skills are even more important, with an increasing emphasis on leadership and vision, along with skills in people and project management, plus oral and written communications.

Expert or Senior Level Big Data Certifications

This is the career step at which one typically climbs near or to the top of most technical certification ladders. Many of these credentials – like the SAS "Advanced Analytics" credentials (four at present) – actually include the term "advanced" or "expert" in their certification monikers.

The SAS Institute and Dell EMC, in particular, have rich and deep certification programs, with various opportunities for interested data scientists or Big Data folks to specialize and develop their skills and knowledge. Database platform vendors, such as Oracle, IBM and Microsoft, are also starting to recognize the potential and importance of Big Data and are adding related elements to their certification programs all the time. Because this field is still relatively young and new cert programs are still coming online, the shape of the high end of the cert landscape for Big Data is very much a work in progress.

Whatever Big Data platform or specialty you choose to pursue, this is the career stage where deep understanding of the principles and practices in the field and an understanding of their business impact and value must begin to combine. It is also where people must focus on their soft skills at the highest level, because senior data scientists or Big Data experts must be able to lead teams of high-level individuals in the organizations they serve, including top executives, high-level managers, and other technical experts and consultants. As you might expect, this kind of work is as much about soft skills in communication and leadership as it is about in-depth technical knowledge and ability.

Continuing Education: Master's or PhD?

Depending on where you are in terms of work experience, family situation and finances, it may be worth considering a master's degree with a focus on data science or some other aspect of Big Data as a major step in your career development. For most working adults, this will mean getting into a part-time or online advanced degree program.

Many such programs are available, but you'll want to consider the name recognition value and the cost of those offerings when choosing a degree plan to pursue. If pursued later in life (after one's 20s), a Ph.D. is probably only for someone with strong interests in research or teaching, and will not be an option for most readers unless they plan and budget for a lengthy interruption in their working lives (nearly all Ph.D. programs require full-time attendance on campus, and take from three to six years to complete).

With proper education, certification, planning and experience, working as a data scientist, or in some other Big Data role, is an achievable goal. It will take at least three to five years for entry-level IT professionals to work their way into such a position (less for those with more experience or an advanced degree in the field), but it's a job that offers high pay and one that is expected to stay in high demand for the foreseeable future. Because the amount of data stored in the world is only increasing year over year, this appears to be a good specialty area in IT that's long on opportunity and growth potential.
