- Friday, 8 May 2015
Machine Learning for the Web
Professor Olfa Nasraoui, guest of ESEN
As part of the new research Master's program "Web Intelligence", we are pleased to inform you that Professor Olfa Nasraoui will be joining us from 11 May to teach the course "Machine Learning for the Web".
Dr. Olfa Nasraoui: Professor of E-commerce and head of the "Computer Engineering and Computer Science" department at the University of Louisville, Kentucky, USA. She is also director of the "Knowledge Discovery and Web Mining" laboratory.
Dr. Nasraoui: author of more than 160 publications in artificial intelligence, information science, data mining, and web mining. She has received more than 3787 citations, and her h-index is 31 according to Google Scholar.
Web pages:
Her laboratory: http://webmining.spd.louisville.edu/
ResearchGate: http://www.researchgate.net/profile/Olfa_Nasraoui
Google Scholar: https://scholar.google.com/citations?user=SGscZDgAAAAJ&hl=fr
The "Machine Learning for the Web" course she will teach is described as follows:
Machine Learning for the Web
Short Description: Fundamentals of machine learning and knowledge discovery in semi-structured/unstructured data with an emphasis on the World Wide Web: Web usage, content, and structure/social network mining; applications to personalization, e-commerce, information retrieval, text mining, adaptive Web sites, etc.
Prerequisites: Graduate standing; statistics and probability. Having taken the data mining course is helpful but not required.
Textbooks:
Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, 2nd Edition, by Bing Liu, Publisher: Springer (Series: Data-Centric Systems and Applications), 2011, ISBN 978-3-642-19459-7 (book web page: http://www.cs.uic.edu/~liub/WebMiningBook.html)
Reference books and literature:
- Web Mining
o Modeling the Internet and the Web: Probabilistic Methods and Algorithms, by Baldi, Frasconi, and Smyth, Wiley, 2003
o Mining the Web: Analysis of Hypertext and Semi Structured Data, by Soumen Chakrabarti, Morgan Kaufmann, 2002
o Mining the Web: Transforming Customer Data into Customer Value, by Michael J. A. Berry and Gordon S. Linoff, J. Wiley, 2002
- Data Mining
o Introduction to Data Mining, by Tan, Steinbach, and Kumar, Addison Wesley, 2006
- Research papers: provided by the instructor as needed.
Machine Learning for the Web:
The Web is a key driving force for a large spectrum of applications in which users interact with or within companies, organizations, governmental agencies, and educational or collaborative environments. User preferences and expectations, together with usage, content, and structural patterns obtained from the Web, form the basis for intelligent, personalized, and business-optimal services. Enabling technologies include machine learning, data mining, scalable data warehousing and preprocessing, sequence discovery, real-time processing, document classification, user modeling, and evaluation models. Application areas that call for web mining and machine learning include web search and information retrieval, information filtering, recommendation systems, Web analytics applications, social media analysis, social network analysis, community detection, content management systems, and fraud or intrusion detection systems. The inherent and increasing heterogeneity of the Web requires Web-based applications to integrate a variety of data types more effectively, across multiple channels and from different sources. Developing techniques and architectures for more effective integration and mining of content, usage, and structure data from different sources is likely to lead to the next generation of more useful and more intelligent Web applications. This sets the stage for the future of the semantic web and intelligent information retrieval.
Course Structure:
1. Day 1: Background: Mathematical background review, overview of the various facets of Web data and its mathematical models, overview of machine learning.
2. Days 1-3: Machine Learning
o Day 1 (continued): Supervised Learning
o Day 2: Unsupervised Learning
o Day 3: Semi-supervised Learning
3. Machine Learning for the Web
o Days 3-4: Text Mining and Information Retrieval
o Day 5: Web Recommender Systems
o Day 6: Link and Social Network Analysis
Main Assessment:
- Optional project: the study and solution of a challenging Web mining problem using your own collected data or public data sets, together with a computer implementation of the solution (using existing tools and public code is allowed and encouraged). The project proposal and report must be submitted in PDF format. The final project report is limited to 5 pages, plus an optional appendix for experimental results or code.
o The report should include a title, an introduction, and a literature review covering the studied problem, related problems, and prior methods used to solve the problem. It must also include a detailed algorithm, a computational complexity analysis, implementation details, experimental results with an analysis of the proposed solution(s), conclusions, and bibliographic references. The project grade will be evenly distributed among report structure, algorithms, and the project poster presentation.
o Follow ACM formatting guidelines