Seminar: Advanced Topics in Data Mining

    About the seminar

    The goal of the seminar is independent research on a scientific topic based on a publication, together with a high-quality presentation of the topic in both written form (report) and spoken form (presentation and Q&A sessions).

    This seminar is dedicated to the discussion of selected topics in data mining. Each semester focuses on a different topic, for example, particular learning techniques (such as clustering, classification, …), evaluation methods, mining different data types (such as time series, trajectories, text, …), or feature selection.


    SoSe19 focus: Top algorithms in Data Mining

    In SoSe19, we will focus on top algorithms for data mining. In particular, we will review each of these algorithms together with selected extensions.

    Organisation

    • Contact person: Prof. Dr. Eirini Ntoutsi
    • Participation is equivalent to 3 ECTS credits (1 ECTS credit equals 30 hours of study).
    • We will use Stud.IP for discussions, announcements, material, and the schedule.
    • Kick-off meeting: Wednesday 17/04/2019

    Process

    Each student chooses a topic from the list of provided topics. For each topic, two papers will be provided: one describing the original algorithm and one describing an extension of it. Students are expected to carefully read both papers as well as the related work (at least 2 additional papers) necessary to understand the topic, i.e., both the original algorithm and the extension.

    Each student will have an advisor who guides the whole process and helps with difficulties and questions. Students are expected to write a report on their research topic, present their findings to the class, and be able to pose and answer questions in the Q&A sessions.

    The final grade depends on the presentation, the report, and overall participation and engagement.

    Schedule

    There will be a few group meetings and regular meetings with the advisors throughout the semester. Participation in these meetings is mandatory, and students are expected to actively engage in the discussions and Q&A sessions.

    Check Stud.IP for the current schedule and announcements for the seminar in SoSe19.

    Topics

    The selection of the top algorithms is based on the ICDM 2008 paper "Top 10 algorithms in data mining". The extensions to these algorithms are selected by the mentors and/or the professor.

    List of topics

    1. C4.5 & fairness-aware DT induction
    2. Naive Bayes & tackling the poor assumptions
    3. Naive Bayes & model merging
    4. Naive Bayes & fairness-aware NB
    5. AdaBoost & cost-based learning
    6. AdaBoost & fairness-aware AdaBoost
    7. kNN & class imbalance
    8. kNN & high dimensionality
    9. SVM & fairness-aware learning
    10. SVM & cost-sensitive learning
    11. CART & gradient tree boosting
    12. k-Means & stability
    13. k-Means & Stream k-Means
    14. EM & dealing with outliers
    15. DBSCAN & density-based stream clustering
    16. Apriori & redundancy reduction

    Mentors SoSe19

    • Prof. Dr. Eirini Ntoutsi
    • MSc Vasileios Iosifidis
    • MSc Tai Le Quy
    • MSc Felipe Reis
    • MSc Amir Abolfazli