Confident Data Skills

Master the Fundamentals of Working with Data and Supercharge Your Career

Kogan Page, 2018

Editorial Rating

7

Qualities

  • Applicable
  • Overview
  • Concrete Examples

Recommendation

Data scientist, entrepreneur and author Kirill Eremenko is enthusiastic about how data science can solve real-world problems, including in medicine and business. A former consultant and now CEO of the online educational portal SuperDataScience, he writes for both business leaders and recent college graduates exploring data science careers. Eremenko details the “Data Science Process” and explains algorithms in a way even readers with no technical background can understand. His clear manual will help anyone who wants to understand the processes and potential of data analytics.

Summary

You leave a “data exhaust” trail others can collect and analyze.

Scientists define big data relative to current hardware and software. They base their definitions on the three “Vs”:

  1. Volume – In the billions of rows.
  2. Velocity – The speed of gathering data.
  3. Variety – The types of data included.

Museums, governments and companies all gathered data for years – in the form of hard copies – before the technology emerged to collect, store and analyze it. Data science relies on technology, but stories built of data nuggets are at its core, since data in whatever form tells the story of a culture. When you post on social media, drive past a security camera or shop online, you create “data exhaust” – the information used by data analytics. Businesses, governments and individual researchers have multiple ways to collect your data exhaust, an extremely important commodity. As The Economist reported in 2017, “Data has superseded oil as the world’s most valuable resource.”

Use the five-step “...

About the Author

Kirill Eremenko, CEO and founder of the online education portal SuperDataScience, provides online courses to more than 300,000 people.