Detecting Misconduct and Malfeasance within Financial Institutions

Wednesday, September 19, 2018 - 18:00
NYC Open Data
New York

We're excited to partner with ACM NY and the Dataiku Meetup Group to co-host this event.

6:00pm: Pizza + Beer + networking
6:15pm: Dr. Panos Ipeirotis, Professor at NYU
7:15pm: Alexander Wolf, Data Scientist at Dataiku


Detecting Misconduct and Malfeasance within Financial Institutions by Dr. Ipeirotis:

Misbehavior in the online world manifests itself in several forms, and often depends on the domain at hand. In the financial domain, firms have a regulatory obligation to monitor the activities of their own employees (e.g., emails, chats, phone calls) in order to detect any form of misconduct. Some forms of misconduct are illegal activities (e.g., insider trading, bribery), while others are policy violations (e.g., improper security practices or inappropriate language use). Traditionally, because they are easy to understand and implement, firms have deployed relatively archaic, rule-based systems for employee surveillance. Such rule-based systems generate a large number of false-positive alerts and are hard to adapt to changing environments. More recent techniques have tried to solve the problem by simply moving from rule-based techniques to statistical machine learning approaches that treat misconduct detection as a single-document classification problem. We discuss why approaches that try to identify misconduct within single documents are destined to fail, and we present a set of approaches that focus on actors, on connections among actors, and on cases of misconduct. Furthermore, we highlight the importance of a "human in the loop" approach, in which humans are guided by the system and guide it in turn, in order to detect malfeasance faster and to adapt to changing environments. We also show how humans can play an important role in detecting shortcomings of existing machine-learning-based malfeasance-detection systems, and how they can be incentivized to find such shortcomings. Our multifaceted approach has been used in real environments within both large multinational and smaller financial institutions; we discuss practical constraints and lessons learned from operating in such non-tech, highly regulated environments.

Transformers in NLP - Building a Neural Machine Translator by Alexander Wolf:

In the past, Natural Language Processing was dominated by recurrent models, but a new architecture called the Transformer has now been shown to outperform them in many NLP domains. The model uses no recurrence, only attention, and achieves state-of-the-art accuracy in a fraction of the training time of other Deep Learning models. Alex has built a translator using this architecture. He will give an introduction to the Transformer and a deep dive into how it works, explain how it overcomes the pitfalls of RNN/LSTM models, and present a history of NLP and translation systems.

Panos Ipeirotis is a Professor and George A. Kellner Faculty Fellow in the Department of Information, Operations, and Management Sciences at the Leonard N. Stern School of Business of New York University. He received his Ph.D. degree in Computer Science from Columbia University in 2004. He has received nine "Best Paper" awards and nominations and a CAREER award from the National Science Foundation, and he is the recipient of the 2015 Lagrange Prize in Complex Systems for his contributions in the fields of social media, user-generated content, and crowdsourcing.

Alex is a Data Scientist at Dataiku, working with clients around the world to organize their data infrastructures and deploy data-driven products into production. Previously, he worked in software and business development in the tech industry, and he studied Computer Science and Statistics at Dartmouth College. He's passionate about the latest developments in Deep Learning and tech, and he contributes to Dataiku's NLP features.

NYC Data Science Academy

500 8th Ave, Suite 905