The ultimate goal of data science is to improve decision-making in organizations. It already drives decisions in every part of modern society: it suggests the advertisements you watch, the books you read, and the music you listen to; it recommends friends on social media; and it filters emails into the spam folder of your mailbox.
The massive use of data in the era of social media, the growth of computing power, the evolution of AI algorithms, and the falling cost of computer memory have led to huge growth in data science applications across all industries.
As more organizations exploit data, the ethical challenges of using data and protecting data privacy become more pressing.
What is Data Science?
Data science draws on other fields such as mathematics, statistics, machine learning, and programming to extract useful patterns from large data sets and improve decision-making.
Data science vs. data mining vs. machine learning:
Machine learning focuses on designing and evaluating algorithms to extract patterns from data.
Data mining focuses on analyzing structured data and is often applied to commercial use cases.
Data science is broader and uses some techniques from both data mining and machine learning depending on the problem studied.
Challenges of data science:
- Capturing and collecting data.
- Transforming unstructured data.
- Big data technologies for storage and processing.
- Data ethics questions.
Real use cases of data science:
By using data science, organizations can extract different types of patterns. Some examples are:
- Group customers into segments that behave similarly, a process called customer segmentation.
- Find products that are frequently bought together in a shop, a process called association-rule mining.
- Detect abnormal events such as fraudulent insurance claims, a process known as anomaly detection.
- Identify which emails in a mailbox are spam, a process called prediction (classification).
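To make one of these pattern types concrete, here is a minimal sketch of association-rule mining in plain Python. The baskets, the thresholds, and the function name `pair_rules` are all made up for illustration; real projects usually rely on a dedicated library and far more data. Support is the share of baskets that contain both products, and confidence is the share of baskets containing the first product that also contain the second.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets: each set is one customer's purchase.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "cereal"},
    {"bread", "butter", "jam"},
]


def pair_rules(baskets, min_support=0.4, min_confidence=0.7):
    """Return rules (lhs, rhs, support, confidence) meaning 'if lhs then rhs'."""
    n = len(baskets)
    # How many baskets contain each single item.
    item_counts = Counter(item for basket in baskets for item in basket)
    # How many baskets contain each pair of items.
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # the pair is too rare to be interesting
        # Try the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


rules = pair_rules(baskets)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

With this toy data, the rule "butter -> bread" has a confidence of 1.00 because every basket that contains butter also contains bread.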
When should we use data science?
If a person can easily spot and describe a pattern in their own mind, it is generally not worth the time and effort to use data science.
Data science becomes useful when we have a large number of data examples and the patterns are too complex for humans to discover and extract manually.
Humans can usually check patterns that involve fewer than three variables. Beyond that, it becomes a struggle, and we need more complex algorithms to find patterns in data sets with hundreds or even millions of attributes (variables) and to extract insights (actionable information about the problem that is not obvious).
For example, at a mobile phone company, many customers are switching to competitors (the churn problem).
How can we use data science in this case?
We can extract patterns from previous customers who left and try to predict which current customers are likely to switch to another company. Then, we can persuade them to stay with personalized offers.
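As a rough illustration of the churn idea, the sketch below scores a current customer by looking at the most similar past customers (a k-nearest-neighbors approach). Everything here is an assumption made for the example: the two features, the toy history, the names `churn_score` and `at_risk`, and the 0.5 threshold. A real model would use many more attributes, rescale them so that no single one dominates the distance, and be evaluated on held-out data.

```python
import math

# Hypothetical history of past customers:
# (support_calls_last_month, months_as_customer, churned)
past_customers = [
    (5, 3, True),
    (4, 6, True),
    (6, 2, True),
    (0, 24, False),
    (1, 36, False),
    (0, 12, False),
    (2, 30, False),
]


def churn_score(customer, history, k=3):
    """Fraction of the k most similar past customers who churned."""
    def distance(past):
        # Euclidean distance on the two features (the label is ignored).
        return math.dist(customer, past[:2])

    nearest = sorted(history, key=distance)[:k]
    return sum(1 for _, _, churned in nearest if churned) / k


# Hypothetical current customers: (support_calls_last_month, months_as_customer)
current_customers = {"alice": (5, 4), "bob": (1, 28)}

# Customers whose neighbors mostly churned are candidates for a retention offer.
at_risk = [
    name
    for name, features in current_customers.items()
    if churn_score(features, past_customers) >= 0.5
]
print(at_risk)
```

Here "alice" resembles the customers who left (many support calls, short tenure), so she is flagged, while "bob" is not.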
How is data science used in real life?
Data science is used for decision-making in many domains. Here are three cases:
Sales and Marketing:
Companies such as Walmart use data science for sales and marketing.
Walmart has access to large datasets about its customers, which it builds by tracking their behavior and comments on its website.
Walmart used data science to introduce new products and make product recommendations by analyzing social media trends and credit card activity, which led to a more personalized online experience for its customers. Those optimizations led to a 15% increase in online sales.
Similarly, Netflix uses recommendation systems to suggest your next movie, and Amazon recommends other products you might like to buy next.
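To give a feel for how a recommendation system can work, here is a minimal sketch of user-based collaborative filtering: find users whose ratings look like yours, then suggest titles they rated that you have not rated yet. This is only a toy illustration and not the actual method used by Netflix, Amazon, or Walmart. The ratings, user names, and the functions `cosine` and `recommend` are all hypothetical.

```python
import math

# Hypothetical ratings (1 to 5) per user; titles missing from a user are unrated.
ratings = {
    "ana": {"Movie A": 5, "Movie B": 4, "Movie C": 1},
    "ben": {"Movie A": 4, "Movie B": 5, "Movie D": 2},
    "chloe": {"Movie C": 5, "Movie D": 4},
    "dev": {"Movie A": 5, "Movie C": 1},
}


def cosine(u, v):
    """Cosine similarity between two rating dicts (unrated titles count as 0)."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[title] * v[title] for title in shared)
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)


def recommend(user, ratings, top_n=1):
    """Rank titles the user has not rated by similarity-weighted ratings."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine(ratings[user], other_ratings)
        for title, rating in other_ratings.items():
            if title not in ratings[user]:
                scores[title] = scores.get(title, 0.0) + similarity * rating
    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores, key=lambda title: (-scores[title], title))
    return ranked[:top_n]


print(recommend("dev", ratings))
```

In this toy data, "dev" rates movies much like "ana" and "ben" do, and both of them rated "Movie B" highly, so it comes out on top.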
Governments using data science:
Governments in the US use data science to improve health, criminal justice, and urban planning.
The US government uses a huge dataset contributed by volunteers to help improve precision medicine.
Data science has helped improve many systems that predict diseases before they occur or develop.
Data science is also helping police predict crime hot spots.
Sports and data science:
Professional sports franchises use data science in player recruitment.
The strategy of using data science to help sports teams recruit their next player took off after the Oakland A's baseball team used it to improve its player recruitment (Lewis 2004).
If the right data are available and the problem can be clearly defined, then data science can help.
Why Now?
Data science has recently come into the spotlight for several reasons:
- The emergence of big data, driven by internet users on social media, e-commerce websites, and online platforms.
- The growth of computing power (GPUs).
- The development of data science and analysis tools, and the advancement of ML methods in the last 15 years.
In particular, deep learning (which uses neural networks that work with large and complex data sets) has revolutionized how language and image data are processed.
One example from the gaming world is AlphaGo. In 2016, AlphaGo beat Lee Sedol, an 18-time Go world champion, becoming the first computer program to defeat a Go world champion.
Conclusion
Data science is changing the business game.
In the next articles, we will cover the actual algorithms in more detail, along with their Python implementations.
Source: Data Science (The MIT Press Essential Knowledge series).
Hey there! I am the creator of AI Decoder.
I am a data scientist by training and a Ph.D. student in AI. In this blog, I try to explain the knowledge I learn in simple words and help someone somewhere.