Data Science: It’s About More Than Tools

By Dan Sullivan September 24, 2012 11:10 PM

Salary surveys are one way to get someone’s attention, especially when the survey focuses on an emerging IT job role such as data scientist.

SiSense provides business intelligence solutions, so it’s no surprise they put out a salary survey for data scientists (available here). The results are positive if analytics is your thing.

Salaries are up around the globe, with 61 percent of the 400 respondents reporting higher incomes this year than last year. Women in the field are making as much as or more than men, which is a welcome change of pace when we are used to hearing about female professionals earning less than their male counterparts.

As you would expect, salaries increase with more education and years of experience.

Life is good if you’re a data scientist, right?

Yes, at least in the short run and maybe even in the medium term. The long run is not necessarily so positive if you don’t have the right skills. I wrote earlier about learning the right skills to be a data scientist. I put in my plug for R and other tools in that post. Others have written on the role of Hadoop in big data and data science. Learning software is not enough, though.

When we work with big data, we are really working on a problem that generates big data. Good data scientists aren’t necessarily experts in every big data tool out there, but they are good at thinking about problems, abstracting away unimportant details, and identifying the key elements of a problem that can yield the kind of insight you are looking for.

Sometimes this means understanding your problem in terms of predicting future trends based on past outcomes (linear regression might be a good method for that), learning which customers are likely to churn (classification algorithms like SVM fit the bill here), or understanding how diverse types of data give insight into some phenomenon (graph theory and graph databases might be a good starting point in this case).
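To make the first of those cases concrete, here is a minimal sketch of fitting a linear trend to past outcomes and projecting it one period ahead. The function names and the quarterly sales figures are made up for illustration; they do not come from the survey or from any particular tool, and a real analysis would likely use R or a statistics library rather than hand-rolled arithmetic.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Project the fitted line to a new x value."""
    return slope * x + intercept


# Hypothetical data: sales for six past quarters (illustrative only).
quarters = [1, 2, 3, 4, 5, 6]
sales = [102.0, 108.0, 115.0, 119.0, 127.0, 131.0]

slope, intercept = fit_line(quarters, sales)
forecast = predict(slope, intercept, 7)
print(f"Estimated growth per quarter: {slope:.2f}")
print(f"Projected sales for quarter 7: {forecast:.2f}")
```

The arithmetic is the easy part. Deciding that the problem is a trend question at all, and that a straight line is a reasonable model for it, is where the thinking goes.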

Things are looking good for the data science profession. We just need to remember that succeeding in this field requires more than learning some tools. We should learn techniques and methods. We apply these techniques and methods using tools like Hadoop, R, and AllegroGraph. And most of all we need to keep our eyes on the prize: the business value of the data we analyze and the results we produce.

The survey report concludes with a section called “Final Words of Wisdom.” One point is worth repeating here:

“Understand the business needs for this data as much as you understand the technology that drives it.”


Dan Sullivan is an author, systems architect, and consultant with over 20 years of IT experience, including engagements in systems architecture, enterprise security, advanced analytics, and business intelligence. He has worked in a broad range of industries, including financial services, manufacturing, pharmaceuticals, software development, government, retail, gas and oil production, power generation, life sciences, and education. Dan has written 16 books and numerous articles and white papers on topics ranging from data warehousing, cloud computing, and advanced analytics to security management, collaboration, and text mining.

See here for all of Dan's Tom's IT Pro articles.

(Shutterstock image credit: Cloud Data Folder)
