WI Exam

January 4-6, 2021

  • You will get access to the assignment at the start of the exam (January 4, 2021 at 9.00)
  • Your paper must be uploaded by the end of the exam (January 6, 2021 at 23.59)
  • In case of errors in the assignment, please contact your semester secretary (9940 8854/[email protected])

Topics

  1. Web crawling. Frontier, robustness, politeness, Mercator frontier, duplicate identification, Shingles
  2. Index construction. Text preprocessing, inverted index, Boolean retrieval, merge algorithm, positional index
  3. Content-based Ranking. TF-IDF, vector space model, cosine similarity, SMART notation
  4. Structure-based Ranking. Centrality, prestige, PageRank, random walks and Markov chains (exclude: HITS)
  5. Recommendation: basics and content-based. Cold start problem, serendipity, explicit/implicit feedback, evaluation measures, content-based recommendation, naive Bayes, content-based recommendation with implicit feedback.
  6. Recommendation: basics and collaborative. Cold start problem, serendipity, explicit/implicit feedback, evaluation measures, neighborhood methods, personalized PageRank (exclude: ItemRank), matrix factorization from an error-function minimization perspective (exclude: matrix theory behind SVD).
  7. Community Detection: basics and classics. Network characteristics, degree distribution, power law, diameter, clustering coefficient, Kernighan-Lin algorithm, Newman-Girvan algorithm, modularity
  8. Community Detection: basics and latent. Network characteristics, degree distribution, power law, diameter, clustering coefficient, matrix factorization techniques, adjacency matrix, Laplacian matrix
  9. Node classification. Inductive, transductive, homophily, node features, independent classification, collective classification, iterative independent classification, label propagation

Procedure

You will receive an assignment via Digital Exam on January 4 and must submit a written solution by the end of January 6.

Everyone receives one assignment, randomly allocated from among the 9 topics listed above. Each assignment can be solved in one of two 'modes':

  • Theoretical
    • Your solution to the assignment consists of explanations, examples, discussion, figures, etc. that are based on the lecture slides and background literature. No Python programming is required.
  • Experimental
    • Your solution will still require some explanation and discussion, but is otherwise largely based on implementation and experimentation. Python code developed in the self-study exercises can often serve as a basis for this type of solution.

Solutions to the assignments must be prepared strictly individually, and collaboration between students with the same assignment is not allowed. However, Python code developed with other students in the self-study exercises can be freely used.


Last update: December 30, 2020