Overfitting and Underfitting in Machine Learning
How can you tell whether a model is overfitting?

A common sign is a large gap between training and validation/test results: the model scores high on the training data but drops noticeably on unseen data. Confirm the pattern with cross-validation and learning curves, and rule out data leakage before drawing conclusions.
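To make that check concrete, here is a minimal sketch using scikit-learn (an assumption; the article names no library). The synthetic dataset and unconstrained random forest are placeholders for your own data and model:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

# Illustrative synthetic data; substitute your own dataset.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# A deep, unconstrained forest is prone to memorizing the training set.
model = RandomForestClassifier(max_depth=None, random_state=0)

# return_train_score=True lets us compare train vs. validation accuracy
# across folds; a large gap is the classic overfitting signal.
scores = cross_validate(model, X, y, cv=5, return_train_score=True)
train_acc = scores["train_score"].mean()
val_acc = scores["test_score"].mean()

print(f"train accuracy:      {train_acc:.3f}")
print(f"validation accuracy: {val_acc:.3f}")
print(f"gap:                 {train_acc - val_acc:.3f}")
```

There is no universal threshold for the gap; judge it against the noise level of the task and how much the validation scores vary across folds.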
Will adding more data fix overfitting?

Often, yes, provided the added data is diverse and representative of production. More coverage reduces variance and makes it harder for the model to memorize noise, but careful evaluation and leakage checks still matter.
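A learning curve makes the question empirical. The sketch below uses scikit-learn's learning_curve helper on synthetic placeholder data; the model and sizes are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import learning_curve
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
model = DecisionTreeClassifier(random_state=0)

# Train on growing fractions of the data and score each size with 5-fold CV.
sizes, train_scores, val_scores = learning_curve(
    model, X, y, train_sizes=np.linspace(0.1, 1.0, 5), cv=5
)

for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n={n:5d}  train={tr:.3f}  val={va:.3f}")

# If the validation score is still climbing at the largest size, more
# representative data is likely to help; if both curves have plateaued
# with a persistent gap, look at regularization or features instead.
```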
How do you fix an underfitting model?

Start by increasing model capacity (or choosing a more flexible algorithm) and improving the features. If you're using strong regularization or very restrictive hyperparameters, relax them and re-evaluate with a consistent validation setup.
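A minimal sketch of that recipe, again assuming scikit-learn: a heavily regularized linear model is compared against a higher-capacity pipeline with lighter regularization, under the same cross-validation setup. The dataset and alpha values are illustrative, not recommendations:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)

# Strongly regularized linear model: coefficients are shrunk so hard
# that it is likely to underfit.
rigid = Ridge(alpha=1000.0)

# More capacity (polynomial features) plus lighter regularization.
flexible = make_pipeline(PolynomialFeatures(degree=2), Ridge(alpha=1.0))

# Same validation setup for both, so the comparison is fair.
for name, est in [("rigid", rigid), ("flexible", flexible)]:
    score = cross_val_score(est, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean CV R^2 = {score:.3f}")
```

Changing one knob at a time (capacity, regularization strength, features) while holding the validation setup fixed is what makes the before/after scores comparable.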