Data science

Python libraries

  • Pandas

  • NumPy

  • Exploratory data analysis

  • Data visualization

  • Big data

  • Data mining

  • Network analysis

  • Kaggle checklist

    • Preprocessing: EDA, normalization, filling null values, etc.
    • Feature engineering: building new features from the base ones, adding polynomial features, etc.
    • Model training: train candidate models on the engineered features
    • CV testing: run cross-validation to estimate the test error
    • Ensembling: combine several diverse models to get better performance
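
The checklist steps can be sketched end to end with scikit-learn. This is only a minimal illustration, not a competition-ready recipe: the dataset is synthetic, and the choice of library, models, imputation strategy, polynomial degree and fold count are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in for a competition dataset (assumption: binary classification).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, random_state=0
)

# Blank out roughly 5% of the values so there are nulls to fill.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Preprocessing + feature engineering + model training, bundled in one pipeline
# so every step is re-fit inside each cross-validation fold (no leakage).
linear_model = make_pipeline(
    SimpleImputer(strategy="median"),                  # fill null values
    StandardScaler(),                                  # normalization
    PolynomialFeatures(degree=2, include_bias=False),  # polynomial features
    LogisticRegression(max_iter=1000),
)

# A second, structurally different model to give the ensemble some diversity.
forest_model = make_pipeline(
    SimpleImputer(strategy="median"),
    RandomForestClassifier(n_estimators=100, random_state=0),
)

# Ensembling: average the predicted class probabilities of both models.
ensemble = VotingClassifier(
    estimators=[("linear", linear_model), ("forest", forest_model)],
    voting="soft",
)

# CV testing: 5-fold cross-validated accuracy as an estimate of test performance.
candidates = {"linear": linear_model, "forest": forest_model, "ensemble": ensemble}
scores = {name: cross_val_score(model, X, y, cv=5) for name, model in candidates.items()}

for name, fold_scores in scores.items():
    print(f"{name}: mean accuracy {fold_scores.mean():.3f} (+/- {fold_scores.std():.3f})")
```

Keeping the imputer and scaler inside each pipeline matters: fitting them on the full dataset before cross-validation would leak information from the held-out folds and make the error estimate too optimistic.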