2021 Summer Semester: "Legal Data Science and Informatics" (IN2395)

Master's Level Elective Module

4 SWS / 6 credits

Instructor: Matthias Grabmair (matthias.grabmair@tum.de)

Session times: Tuesday & Thursday, 14:00-16:00 (starting on time)

Media: The module will be administered via Moodle; sessions will take place online via video conference

Content Outline

The way lawyers, judges, corporate legal counsel, government agencies, and businesses engage with legal systems, requirements, and processes is increasingly influenced by technology. Prominent areas of practical interest are the intelligent search and analysis of legal documents, the role of machine learning in supporting legal decision making, and the modeling of legal processes using expertise encoded in formal rule systems. This module provides an overview of, and a practical introduction to, the research and state of the art in applying data science and artificial intelligence methods to tasks and problems arising in and around the public and private practice of law.

Legal decision making, legal data, and legal documents in particular challenge many mainstream modeling and analysis techniques. Hence, the module is intended to be taken by (1) students from technical majors who are interested in challenging interdisciplinary work, and (2) political science, business, and law students seeking to enhance their understanding of how new technologies can shape their field.

The module consists of a mix of lecture and discussion sessions following a thematic progression:

  • Introduction to legal systems, legal reasoning, and the impact of AI on legal practice
  • Basics of machine learning and natural language processing (NLP) (intended as a primer/refresher for nontechnical students; largely tailored to specific legal application contexts)
  • Case- and rule-based formalisms of legal reasoning
  • Legal data analytics, including case outcome prediction and empirical legal studies
  • Equal treatment imperatives and fair machine learning
  • Applications of NLP to legal text, including information retrieval and text-based outcome prediction

Module sessions will cover concepts in an example-driven way through a mix of lectures, guided programming workshops, and discussion of topical research publications that students are expected to read before class.

The course belongs to the subject area (Fachgebiet) "MLA (Machine Learning & Analytics)".

Learning Outcomes

After completing this module, and depending on the focus of their final project, students will be able to:

  • explain knowledge representation and argumentation formalisms used in AI&Law
  • explain the application of techniques from statistics, applied machine learning, and natural language processing to legal data
  • examine and critique experimental work and systems in legal data science/informatics
  • plan, implement, and evaluate a basic legal data science/informatics project


This module focuses on the interdisciplinary application of modern data science / artificial intelligence methods to a predominantly unstructured domain, and hence requires frequent and diverse interaction with the subject matter. TUM political science students will be invited to take the module to facilitate cross-disciplinary discussion.

Grading is based on an individual project that students complete incrementally, with step-wise deadlines starting around the middle of the semester. The project components are (1) a text annotation task (15%)*, (2) a 2-3 page written literature survey (15%)*, (3) an experiment in data analysis and model evaluation, submitted as code (20%)* and requiring intermediate-level programming in Python with common libraries, and (4) an 8-10 page final written report explaining and discussing the data, analytical methods, and results (50%)*. By default, all students conduct the same project on their own, without collaborating with other course participants. Students who want to conduct a special project of their own design can propose it to the instructor in writing using a template; the instructor may approve it if he deems it educationally comparable to the default project. In any case, the collective annotation task is a mandatory part of all course projects.

(*) Percentage of the final grade contributed by the component

Students are encouraged to submit questions about the weekly reading assignments (i.e., topical publications) ahead of each session; these questions will be picked up in the in-class discussion. Submitting a minimum number of quality reading questions will lead to a grade bonus of 0.3 at the end of the semester.
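
As a rough illustration of how the component percentages and the reading-question bonus could combine, the following Python sketch computes a final grade from four component grades. Only the weights and the bonus size of 0.3 come from the description above. Everything else is an assumption made for illustration: that components are graded on the German scale where 1.0 is best, that they are combined as a weighted average, that the bonus lowers (improves) the result, and that the result cannot go below 1.0. The function name final_grade and the component keys are hypothetical.

```python
# Weights taken from the project description (they sum to 1.0).
WEIGHTS = {
    "annotation": 0.15,
    "literature_survey": 0.15,
    "experiment_code": 0.20,
    "final_report": 0.50,
}

READING_BONUS = 0.3  # bonus size stated in the module description


def final_grade(component_grades, reading_bonus=False):
    """Combine component grades into one final grade.

    Assumptions (not stated in the module description): grades use the
    German scale where 1.0 is best, components are combined as a weighted
    average, and the bonus improves the grade but never below 1.0.
    """
    if set(component_grades) != set(WEIGHTS):
        raise ValueError("expected components: " + ", ".join(WEIGHTS))
    # Weighted average over the four project components.
    grade = sum(WEIGHTS[name] * value for name, value in component_grades.items())
    if reading_bonus:
        # Subtract the bonus, clamped at the best possible grade.
        grade = max(1.0, grade - READING_BONUS)
    return round(grade, 2)


# Hypothetical component grades for one student.
example = {
    "annotation": 1.0,
    "literature_survey": 2.0,
    "experiment_code": 2.0,
    "final_report": 2.0,
}
print(final_grade(example))
print(final_grade(example, reading_bonus=True))
```

Because the final report carries half of the weight, it dominates the result in this sketch; the actual rounding and bonus rules are set by the instructor and may differ.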

Enrollment & Prerequisites

  • IN0002: Fundamentals of Programming
  • IN8026: Einführung in die Programmierung mit Python / Introduction to Programming with Python (or equivalent; students must be able to autonomously work with Jupyter notebooks in the Python ecosystem)
  • IN0018: Diskrete Wahrscheinlichkeitstheorie / Discrete Probability Theory (or equivalent; students must be able to work with basic concepts from probability and statistics)
  • IN2332: Statistical Modeling and Machine Learning
  • IN2062: Grundlagen der künstlichen Intelligenz / Foundations of Artificial Intelligence
  • Willingness/ability to work intensively across disciplines (reading legal text, drafting specifications, programming, and domain-specific data analysis)

A formal sign-up procedure will be announced.

Literature Sample

Interested students who wish to learn more can look at these sample publications, most of which will be discussed in the course.

  • Branting, L. Karl. "Data-centric and logic-based models for automated legal problem solving." Artificial Intelligence and Law 25, no. 1 (2017): 5-27. [link]
  • Simon, Michael, Alvin F. Lindsay, Loly Sosa, and Paige Comparato. "Lola v. Skadden and the Automation of the Legal Profession." Yale JL & Tech 20 (2018). [link]
  • LOLA v. SKADDEN, ARPS, SLATE, MEAGHER & FLOM LLP, Court of Appeals, 2nd Circuit 2015 [link]
  • Dynamo Holdings et al. vs. Commissioner of Internal Revenue, Docket No. 2685-11, 8393-12; July 13, 2016 [link]
  • Surdeanu, Mihai, Ramesh Nallapati, George Gregory, Joshua Walker, and Christopher D. Manning. "Risk analysis for intellectual property litigation." In Proceedings of the 13th International Conference on Artificial Intelligence and Law, pp. 116-120. ACM, 2011. [link]
  • Katz, Daniel Martin, Michael J. Bommarito II, and Josh Blackman. "A general approach for predicting the behavior of the Supreme Court of the United States." PLOS ONE (April 12, 2017). [link]
  • Ashley, Kevin D., and Vern R. Walker. "From Information Retrieval (IR) to Argument Retrieval (AR) for Legal Cases: Report on a Baseline Study." In JURIX, pp. 29-38. 2013. [link]
  • Conrad, Jack G., and Khalid Al-Kofahi. "Scenario analytics: analyzing jury verdicts to evaluate legal case outcomes." In Proceedings of the 16th edition of the International Conference on Artificial Intelligence and Law, pp. 29-37. ACM, 2017. [link]
  • Shulayeva, Olga, Advaith Siddharthan, and Adam Wyner. "Recognizing cited facts and principles in legal judgements." Artificial Intelligence and Law 25, no. 1 (2017): 107-126. [link]
  • Cardellino, Cristian, Milagro Teruel, Laura Alonso Alemany, and Serena Villata. "A low-cost, high-coverage legal named entity recognizer, classifier and linker." In Proceedings of the 16th edition of the International Conference on Artificial Intelligence and Law, pp. 9-18. ACM, 2017. [link] [better formatted version]
  • Grabmair, Matthias, Kevin D. Ashley, Ran Chen, Preethi Sureshkumar, Chen Wang, Eric Nyberg, and Vern R. Walker. "Introducing LUIMA: an experiment in legal conceptual retrieval of vaccine injury decisions using a UIMA type system and tools." In Proceedings of the 15th International Conference on Artificial Intelligence and Law, pp. 69-78. ACM, 2015. [link]
  • Aletras, Nikolaos, Dimitrios Tsarapatsanis, Daniel Preoţiuc-Pietro, and Vasileios Lampos. "Predicting judicial decisions of the European Court of Human Rights: A natural language processing perspective." PeerJ Computer Science 2 (2016): e93. [link]
  • Branting, L. K., C. Pfeifer, B. Brown, L. Ferro, J. Aberdeen, B. Weiss, M. Pfaff, and B. Liao. "Scalable and explainable legal prediction." Artificial Intelligence and Law (2020): 1-26.
  • Engel, Christoph, and Keren Weinshall. Manna from Heaven for Judges–Judges’ Reaction to a Quasi-Random Reduction in Caseload. No. 2020_01. Max Planck Institute for Research on Collective Goods, 2020. [link]
  • Wachter, Sandra, Brent Mittelstadt, and Chris Russell. "Why Fairness Cannot Be Automated: Bridging the Gap Between EU Non-Discrimination Law and AI." (March 3, 2020). Available at SSRN: https://ssrn.com/abstract=3547922 or http://dx.doi.org/10.2139/ssrn.3547922