
#12: 5 Cases When Gradient Boosting Fails | Why NOT to be an ML/DS Generalist

by Timur Bikmukhametov
May 02, 2025
Reading time - 7 mins

 

1. ML Picks of the Week

A weekly dose of ML tools, concepts & interview prep. All in under 1 minute.

 

2. Technical ML Section

Learn When Gradient Boosting Fails for Tabular Data

 

3. Career ML Section

Learn why NOT to be an ML/DS generalist to build a successful ML career


1. ML Picks of the Week

🥇ML Library tsfresh

Tsfresh is a powerful Python library for automated feature extraction from time series data.

It’s designed to help you transform raw sequences into meaningful features, without manually crafting time lags or domain-specific indicators.

What makes tsfresh worth learning:

✅ Automatically extracts hundreds of time series features
✅ Filters out irrelevant ones using built-in statistical tests
✅ Works well with both univariate and multivariate time series

So, if you’re working on time series-related problems, tsfresh helps you save time and avoid manual errors in early pipeline stages.
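Here is a minimal sketch of the tsfresh workflow on toy data; the column names ("id", "time", "value") and the random series are purely illustrative:

```python
import numpy as np
import pandas as pd
from tsfresh import extract_features, select_features
from tsfresh.utilities.dataframe_functions import impute

# Toy long-format data: one row per (series id, timestamp) pair
rng = np.random.default_rng(0)
n_series, length = 20, 50
df = pd.DataFrame({
    "id": np.repeat(np.arange(n_series), length),
    "time": np.tile(np.arange(length), n_series),
    "value": rng.normal(size=n_series * length),
})

# Extract hundreds of candidate features per series
features = extract_features(df, column_id="id", column_sort="time")
impute(features)  # replace NaN/inf left by features undefined on some series

# Keep only features that pass the built-in statistical relevance tests
y = pd.Series(rng.integers(0, 2, n_series), index=np.arange(n_series))
selected = select_features(features, y)
print(selected.shape)
```

With a random target like this, select_features may keep nothing; on real data it typically trims hundreds of candidates down to a relevant subset.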


📈 ML Concept
p-value

The p-value is a key metric in statistical hypothesis testing.

It measures the probability of getting results at least as extreme as your observation, assuming the null hypothesis is true.

In simpler terms: a small p-value means your result would be unlikely if the null hypothesis were true, which is evidence against it.

Why it matters in ML:

✅ Used for feature selection in statistical models
✅ Supports A/B testing and experiment analysis
✅ Provides evidence-based support for decision-making

Just remember:
A low p-value means “statistically significant,” not necessarily important in practice.
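For example, here is a small sketch of a two-sample t-test for an A/B experiment; the data is simulated just to show the mechanics:

```python
import numpy as np
from scipy import stats

# Simulated metric samples for control and treatment groups
rng = np.random.default_rng(42)
control = rng.normal(loc=10.0, scale=2.0, size=500)
treatment = rng.normal(loc=10.3, scale=2.0, size=500)

# Null hypothesis: both groups have the same mean
t_stat, p_value = stats.ttest_ind(treatment, control)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# If p < 0.05, we reject the null at the 5% level. Still, check the
# effect size before calling the result practically important.
```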

Read a deeper breakdown of the p-value HERE.


🤔 ML Interview Question
What are the 4 main limitations of K-Means clustering?

K-Means is fast and easy to implement — but it has several core weaknesses you should know before using it in production.

📌 Requires specifying K:
You must define the number of clusters up front, and choosing the right K isn’t always clear.

📌 Assumes equal-sized, spherical clusters:
K-Means struggles with elongated or unevenly sized clusters. It works best when groups are compact and similar.

📌 Sensitive to outliers:
A single outlier can pull a centroid far off, distorting results.

📌 Initialization matters:
Different random starts can lead to different outcomes. K-Means++ helps, but doesn’t guarantee optimal clusters.
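A quick scikit-learn sketch illustrating two of these limitations, elongated clusters and initialization sensitivity (the stretch matrix is arbitrary, for illustration only):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Make blobs, then stretch them so the clusters become elongated
X, _ = make_blobs(n_samples=500, centers=3, random_state=0)
X = X @ np.array([[0.6, -0.6], [-0.4, 0.8]])

# k-means++ seeding vs. a single purely random initialization
km_pp = KMeans(n_clusters=3, init="k-means++", n_init=10, random_state=0).fit(X)
km_rand = KMeans(n_clusters=3, init="random", n_init=1, random_state=0).fit(X)

print("inertia, k-means++:", round(km_pp.inertia_, 1))
print("inertia, random   :", round(km_rand.inertia_, 1))  # often worse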

Read more about K-Means clustering HERE.


2. Technical ML Section

When Gradient Boosting Fails for Tabular Data

Gradient Boosting (GB) is considered the go-to algorithm for tabular datasets, and not without reason: GB has strong predictive power on tabular data.

However, as is often the case in ML, GB is not a silver bullet.

In this Maistermind issue, you will learn 5 cases when not to use Gradient Boosting even if you have a tabular dataset.


1️⃣ Don't use GB when features <-> target relationships are mostly linear

In this case, Gradient Boosting will barely beat Linear or Logistic Regression.

But linear models will:

  • Train much faster
  • Be more interpretable
  • Be easier to tune (fewer hyperparameters)

Below is an example of a relationship where GB is overkill.
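A small synthetic sketch of this case: on purely linear data, Ridge matches Gradient Boosting while training far faster (the data and models here are illustrative):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Purely linear target with a little noise
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.1, size=2000)

for model in (Ridge(), GradientBoostingRegressor()):
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{type(model).__name__}: R^2 = {r2:.3f}")
```

Expect Ridge to land at near-perfect R^2 in milliseconds, while GB trails slightly despite costing far more compute.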


2️⃣ Don't use GB when you have noisy and sparse data with low variability 

In this case, Gradient Boosting will again barely beat Linear Models.

Gradient Boosting likes:

  • Data with a low noise level (so it will not overfit)
  • High data variability with many distinct feature values

 

Below is an example of target and feature behavior over time where it does not make sense to use Gradient Boosting.

You can often see such data in industrial processes, where a system behaves stably over a long period of time.
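A hedged sketch of this situation: barely-varying features buried in heavy noise, where GB gains nothing over Ridge (the data is synthetic and exact scores will vary):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Low-variability features plus heavy noise, as in a stable industrial process
rng = np.random.default_rng(1)
X = rng.normal(scale=0.05, size=(1000, 5))
y = X[:, 0] + rng.normal(scale=1.0, size=1000)

for model in (Ridge(), GradientBoostingRegressor()):
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{type(model).__name__}: R^2 = {r2:.3f}")
```

Here GB tends to fit the noise and score at or below the linear model.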


3️⃣ Don't use GB when model extrapolation is important

In most cases, we use Gradient Boosting with Decision Trees as weak learners. They don’t extrapolate.

 

Yes, you can use Gradient Boosting with linear models as weak learners. They can extrapolate outside the training range.

However, people don't do that in 99.9% of cases.

If you need extrapolation for non-linear data, use smooth non-linear models, e.g., Neural Networks or Gaussian Processes.

Performance is not guaranteed, but with smooth functions it can still be reasonably good as long as you are not too far from the training set. Then, at some point, you retrain the model and are back on track.
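A toy sketch of the extrapolation failure: a tree-based GB model predicts a constant outside the training range, while a linear model continues the trend:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression

# Train on x in [0, 10] with a simple linear trend y = 2x + 1
X_train = np.linspace(0, 10, 200).reshape(-1, 1)
y_train = 2.0 * X_train.ravel() + 1.0

gb = GradientBoostingRegressor().fit(X_train, y_train)
lr = LinearRegression().fit(X_train, y_train)

X_test = np.array([[15.0], [20.0]])  # outside the training range
print("GB:", gb.predict(X_test))     # flat, stuck near the max train target (~21)
print("LR:", lr.predict(X_test))     # keeps the trend: ~31 and ~41
```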


4️⃣ Don't use GB when you want to have a baseline non-linear model

Gradient Boosting is hard to tune because it has many hyperparameters and it's prone to overfitting.

If you need a quick non-linear baseline, use Random Forest. It is easy to tune and hard to overfit out of the box.
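A minimal Random Forest baseline sketch (synthetic data standing in for your tabular dataset); near-default settings are usually a sane starting point:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic regression task for illustration
X, y = make_regression(n_samples=1000, n_features=20, noise=10.0, random_state=0)

# Near-default settings: no learning rate or early stopping to babysit
rf = RandomForestRegressor(n_estimators=300, random_state=0)
scores = cross_val_score(rf, X, y, cv=5, scoring="r2")
print(f"baseline R^2 = {scores.mean():.3f}")
```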

 


5️⃣ Don't use GB when you want to use the ML model for optimization

Gradient Boosting with Decision Trees is a piecewise-constant, non-smooth model.

This makes optimization gradients noisy and unstable.

In this case, use smooth non-linear models, e.g., Neural Networks, Gaussian Processes, Splines.

Below is an example comparing a smooth and a piecewise-constant function approximation.
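A toy sketch of the gradient problem: finite-difference gradients of a GB model are mostly zero (inside a constant piece) or spiky (across a split), while a Gaussian Process tracks the true derivative (the data and helper below are illustrative):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.gaussian_process import GaussianProcessRegressor

# Smooth 1-D ground truth
X = np.linspace(0, 10, 300).reshape(-1, 1)
y = np.sin(X).ravel()

gb = GradientBoostingRegressor().fit(X, y)
gp = GaussianProcessRegressor().fit(X, y)

def num_grad(model, x, eps=1e-3):
    # Central finite difference of the model prediction at x
    lo, hi = model.predict([[x - eps]]), model.predict([[x + eps]])
    return (hi[0] - lo[0]) / (2 * eps)

for x in (2.0, 5.0):
    print(f"x={x}: GB {num_grad(gb, x):+.2f}, "
          f"GP {num_grad(gp, x):+.2f}, true {np.cos(x):+.2f}")
```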


6️⃣ Summary

  1. Don't use GB when features <-> target relationships are mostly linear
  2. Don't use GB when you have noisy and sparse data with low variability 
  3. Don't use GB when model extrapolation is important
  4. Don't use GB when you want to have a baseline non-linear model
  5. Don't use GB when you want to use the ML model for optimization

 


That's it for the Technical Part! 

Follow me on LinkedIn for more daily ML breakdowns.


 

3. Career ML Section

Why NOT to be an ML/DS generalist to build a successful ML career

 

Too many data scientists try to learn everything.

NLP, computer vision, reinforcement learning, fraud detection, demand forecasting, marketing analytics, MLOps, causal inference... all at once.

That’s a mistake.

Having a broad awareness is helpful.

But trying to master every area at once often leads to shallow understanding and slow progress.


💰 The highest-paid data scientists usually follow a different path:

→ Go deep in 1–2 ML domains (e.g., tabular ML and time series forecasting)
→ Focus on 1–2 business verticals (e.g., retail or insurance)

Why does it work well?

When your experience consistently maps to specific problems in a domain (e.g., demand forecasts for retail inventory), you’re no longer just “another data scientist.”

⭐ You become the top 1% candidate for those roles:

→ The one who gets shortlisted first.
→ The one who gets paid more.

This is because you’ve already solved their exact problems.


👉 So, my recommendation:
→ Pick one ML domain and one business domain.
→ Go deep. Build intuition. Work on real use cases.

That’s how you build a successful ML career!


Related articles

  1. Gradient Boosting Hyperparameters & Tuning Tips
  2. Gradient Boosting - Learning Guide

 


That is it for this week!

If you haven’t yet, follow me on LinkedIn where I share Technical and Career ML content every day!


Whenever you're ready, there are 3 ways I can help you:


1. ML Job Landing Kit

Get everything I learned about landing ML jobs after reviewing 1000+ ML CVs, conducting 100+ interviews & hiring 25 Data Scientists. The exact system I used to help 70+ clients get more interviews and land their ML jobs.

 

2. ML Career 1:1 Session 

I’ll address your personal request & create a strategic plan with the next steps to grow your ML career.

 

3. Full CV & LinkedIn Upgrade (all done for you)

I review your experience, clarify all the details, and create:
- Upgraded ready-to-use CV (Doc format)
- Optimized LinkedIn Profile (About, Headline, Banner, and Experience Sections)


Join Maistermind for 1 weekly piece with 2 ML guides:


1. Technical ML tutorial or skill learning guide
2. Tips list to grow ML career, LinkedIn, income

 

Join here! 

 

 
 