Sciences

Subject: INFORMATION MANAGEMENT (A.A. 2024/2025)

degree course in COMPUTER SCIENCE

Course year 3
CFU 6
Teaching units Unit Gestione dell'informazione
Information Technology (lesson)
  • TAF: Compulsory subjects, characteristic of the class SSD: ING-INF/05 CFU: 6
Teachers: Federica MANDREOLI
Exam type oral
Evaluation final vote
Teaching language Italian

Teachers

Federica MANDREOLI

Overview

The course introduces the student to the main techniques for information management and retrieval in different application domains, including WWW, Semantic Web, and Social Networks. The addressed information types are textual data and semi-structured data and the focus is on how to manipulate information, store large amounts of information and support effective and efficient searches through ad-hoc methodologies and data structures for the implementation of applications that access such information.

The ability to apply the knowledge will be reflected primarily in the ability to exploit and devise techniques for managing complex information, to design advanced data-centric applications and in the ability to design and provide complete implementations using web and database technologies.

Through the core analysis and project activities, the course will develop the student's ability to make judgments, to justify the choices made and to critically evaluate the results obtained. The course also encourages teamwork, through which students develop skills in peer interaction and communication.

Finally, through pointers and short seminars on advanced topics in the field of information management, the course will provide students with the ability to interact, learn and keep up to date on ever-evolving information management technologies and methodologies.

Admission requirements

Mandatory: Algorithms and Data structures

Prerequisites: Deep understanding of the main data structures and algorithms for massive structured data management, relational databases, declarative query languages.

Course contents

The course is offered in the first semester of the third year, for a total of 48 hours of face-to-face teaching (6 CFU) divided between hours of "theory", i.e. lessons in which the course topics are introduced and illustrated, and hours of "exercises", consisting of small full-text processing projects.
The number of hours per topic is purely indicative. It may be subject to changes during the course according to feedback and student participation.

Introduction (2 hours):
___________
Overview of the types of information that go beyond relational data and of recent developments in the management of such information in advanced application scenarios such as data exchange, semantic web, search engines, mobile and pervasive systems.

Management of full-text information (30 hours):
___________________________________
Information retrieval systems.

Techniques for the manipulation of textual data contained in web pages, e-mails, electronic documents, etc. Text processing.

Definition, creation and updating of main and secondary memory data structures (inverted index, suffix tree, PAT trees, etc.) for efficient search in texts and character strings (e.g. biological sequences).
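
As an illustration of the first data structure named above, the following is a minimal in-memory sketch of an inverted index with a boolean AND lookup. It is not taken from the course materials: the function names (tokenize, build_inverted_index, search_and), the simple tokenizer and the sample documents are all assumptions made for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the sorted list of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search_and(index, query):
    """Return the ids of documents that contain every query term."""
    postings = [set(index.get(term, [])) for term in tokenize(query)]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


# Hypothetical sample collection, used only to exercise the sketch.
docs = {
    1: "Information retrieval systems",
    2: "Retrieval of biological sequences",
    3: "Database systems and information management",
}
index = build_inverted_index(docs)
print(search_and(index, "information systems"))
```

A production index would typically also store term positions and frequencies and would keep postings on secondary storage; this sketch keeps everything in memory for brevity.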

Search algorithms and phrase queries.

Approximate search models and result ranking. Classical models: boolean model, vector-space model, probabilistic model. Advanced models.
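
To make the vector-space model concrete, here is a small sketch that ranks documents by the cosine similarity between TF-IDF vectors. The weighting scheme (raw term frequency times log(N/df)), the whitespace tokenization, the function names and the sample documents are assumptions chosen for this example; the course may present different variants.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document, plus the idf table."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(docs, query):
    """Return indices of documents with a positive score, best match first."""
    vectors, idf = tf_idf_vectors(docs)
    counts = Counter(query.lower().split())
    q = {term: tf * idf.get(term, 0.0) for term, tf in counts.items()}
    scores = [(cosine(q, vec), i) for i, vec in enumerate(vectors)]
    ordered = sorted(scores, key=lambda s: (-s[0], s[1]))
    return [i for score, i in ordered if score > 0]


# Hypothetical sample collection, used only to exercise the sketch.
docs = [
    "web search engines rank pages",
    "suffix trees index strings",
    "search engines index web pages",
]
print(rank(docs, "suffix index"))
```

Unlike the boolean model, which only returns the set of matching documents, this approach orders results by degree of similarity, so partial matches are still retrieved but ranked lower.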

Tolerant retrieval.


Random walk models for web page ranking (6 hours):
________________________________________________
WWW as a graph of web pages, web crawling and visits of graphs.

Ranking Web pages in search engines: PageRank and HITS.
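
The random-walk idea behind PageRank can be sketched with a short power iteration. This is an illustrative sketch, not the formulation used in the course: the damping factor 0.85, the fixed iteration count, the handling of pages without out-links, the function name and the sample graph are all assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank scores by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web graph, used only to exercise the sketch.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(graph)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this graph, page C receives links from three pages and ends up with the highest score, while page D, which nothing links to, keeps only the baseline share contributed by the random jump.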


Seminars (10 hours):
__________________________________________
The course includes introductory seminars on other cutting-edge topics and/or concerning the proposed project.

Teaching methods

Teaching is based on frontal lectures. Besides providing in-depth analysis of the theory behind the proposed techniques, the lectures include a series of hands-on activities on the main technological solutions. At the end of the course, students will thus have a complete view of how to design, structure and implement data-centric applications in the application domains considered. Finally, short seminars on current information management topics will provide pointers for deepening the lecture contents and will keep students up to date on relevant new technologies in the field. Questions and contributions from students are welcome and encouraged. Attendance is not compulsory but strongly recommended. The course is delivered in Italian. All technical and organizational information on the course, as well as the teaching materials, will be uploaded to the Moodle platform. Students are invited to register and to visit the Moodle course page regularly.

Assessment methods

The course includes a group project and a written examination. The group project allows students to deepen the techniques shown in class and to apply them in the context of a real data-centric application, thus requiring the ability to respond to specific information management requirements with effective and efficient solutions. During the oral presentation of the project, in addition to identifying a correct and adequate solution, it is also important to explain the project clearly, to justify the design and technological choices made and to show, including through experimental tests and comparisons, the adequacy of the solution. The project topic, development rules, and main project features will be presented during the course. The written exam includes some open questions about the course topics and very simple exercises. Each answer must be given in a limited space (half a page to a full page). Grades are expressed out of 30, with 32/30 awarded for laude. The final grade is the weighted average of the written grade (60%) and the project grade (40%). During the academic year, there are 6 written exam sessions and 5 oral sessions for project presentation.
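
The 60/40 weighting stated above can be worked through with a short calculation. This is only a sketch: the function name and the sample grades are hypothetical, and since no rounding rule is stated here, the result is left unrounded.

```python
def final_grade(written, project):
    """Weighted average: 60% written grade, 40% project grade (unrounded)."""
    return (60 * written + 40 * project) / 100


# Hypothetical grades: 30 in the written exam, 25 in the project.
print(final_grade(30, 25))
```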

Learning outcomes

Knowledge and understanding: Through lectures, students will gain sound knowledge and understanding of the theory of non-traditional information management, from textual to semi-structured and graph-oriented information; they will also understand the main technologies and techniques used in common data-centric applications.

Applying knowledge and understanding: Through practical computer exercises and individual and group project activities, the student will be able to apply the gained knowledge in the design and implementation of information management techniques and of applications based on them.

Making judgments: Through individual and group project activities, the student will be able to evaluate, explain and critically discuss the design decisions taken and the results obtained in the context of a real data-centric application.

Communication skills: The preparation and presentation of the project report will allow students to organize and present the results of their work clearly and concisely, using appropriate technical language. In addition, implementing the project will require the practical ability to read English technical documentation effectively.

Learning skills: The described activities will enable students to acquire the methodological tools to continue their studies and to keep their knowledge up to date independently; this is especially crucial in an area such as computer information management, where key technologies are ever-evolving.

Readings

The reference textbook is Baeza-Yates, Ribeiro-Neto, “Modern Information Retrieval: The Concepts and Technology Behind Search”. Addison Wesley.

Book with an online version: Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press. 2008. Available at the following link: https://nlp.stanford.edu/IR-book/

Lecture notes in English, prepared by the teacher, are available on the course website.
The lecture notes include references for each of the topics covered, recommended for individual further study.