Publications

My CV and Google Scholar page


Interpretable Machine Learning


  • Partially Interpretable Estimators (PIE): Black-Box-Refined Interpretable Machine Learning
    We propose Partially Interpretable Estimators (PIE), which attribute a prediction to individual features via an interpretable model, while a (possibly) small part of the PIE prediction is attributed to the interaction of features via a black-box model, with the goal of boosting predictive performance while maintaining interpretability. As such, the interpretable model captures the main contributions of features, and the black-box model attempts to complement the interpretable piece by capturing the “nuances” of feature interactions as a refinement. We design an iterative training algorithm to jointly train the two types of models. Experimental results show that PIE is highly competitive with black-box models while outperforming interpretable baselines. In addition, the understandability of PIE is comparable to that of simple linear models, as validated via a human evaluation.
    T Wang, J Yang, Y Li, B Wang
    Under Review

  • Interpretable Sequence Classification via Prototype Trajectory (paper) (code)
    We propose a novel interpretable recurrent neural network (RNN) model, called ProtoryNet, in which we introduce a new concept of prototype trajectories. Motivated by the prototype theory in modern linguistics, ProtoryNet makes a prediction by finding the most similar prototype for each sentence in a text sequence and feeding an RNN backbone with the proximity of each of the sentences to the prototypes. The RNN backbone then captures the temporal pattern of the prototypes, which we refer to as prototype trajectories. The prototype trajectories enable intuitive, fine-grained interpretation of how the model reached the final prediction, resembling the process of how humans analyze paragraphs. Experiments conducted on multiple public data sets reveal that the proposed method is not only more interpretable but also more accurate than the current state-of-the-art prototype-based method. Furthermore, we report a survey result indicating that human users find ProtoryNet more intuitive and easier to understand compared to the other prototype-based methods.
    D Hong, S Baek, T Wang*
    (* corresponding author)
    Under Review

  • Disjunctive Rule List
    We present a Disjunctive Rule List (DRL) for interpretable regression that achieves better trade-offs between predictive performance and model complexity compared to existing rule-based regressors. A DRL model consists of a list of disjunctive rules embedded in an if-else logic structure which stratifies the data space, and each rule captures one stratum. DRL is a generalized form of rule lists. Compared to traditional decision trees and other rule list models that stratify the feature space with single itemsets, DRL uses a set of itemsets to capture a sub-region, which avoids unnecessary partitions of the data space and allows the model to capture heterogeneous characteristics within a stratum. We define a global objective that considers both the predictive performance and model complexity. To train the model, we devise a hierarchical stochastic local search algorithm that exploits the properties of DRL’s unique structure to improve search efficiency. Experiments on public datasets demonstrate that DRL outperforms baseline interpretable models.
    R Ragodos, T Wang*
    (* corresponding author)
    INFORMS Journal on Computing (IJOC), 2022

  • Augmented Fairness: An Interpretable Model Augmenting Decision-Makers’ Fairness
    We propose a model-agnostic approach for mitigating the prediction bias of a black-box decision-maker, and in particular, a human decision-maker. Our method detects the regions of the feature space where the black-box decision-maker is biased and replaces it there with a few short decision rules, acting as a “fair surrogate”. The rule-based surrogate model is trained under two objectives, predictive performance and fairness. Our model focuses on a setting that is common in practice but distinct from other literature on fairness. We only have black-box access to the model, and only a limited set of true labels can be queried under a budget constraint. We formulate a multi-objective optimization for building a surrogate model, where we simultaneously optimize for both predictive performance and bias. To train the model, we propose a novel training algorithm that combines a nondominated sorting genetic algorithm with active learning. We test our model on public datasets where we simulate various biased “black-box” classifiers (decision-makers) and apply our approach for interpretable augmented fairness.
    T Wang, M Saar-tsechansky
    NeurIPS 2020 Workshop Algorithmic Fairness through the Lens of Causality and Interpretability, 2020
    INFORMS Workshop on Data Science, 2020
    Best Paper Award at INFORMS Workshop on Data Science, 2020.

  • Explaining a Reinforcement Learning Agent via Prototyping (paper)
    While deep reinforcement learning has proven to be successful in solving control tasks, the “black-box” nature of an agent has raised increasing concerns. We propose a prototype-based post-hoc policy explainer, ProtoX, that explains a black-box agent by prototyping the agent’s behaviors into scenarios, each represented by a prototypical state. When learning prototypes, ProtoX considers both visual similarity and scenario similarity. The latter is unique to the reinforcement learning context, since it explains why the same action is taken in visually different states. To teach ProtoX about visual similarity, we pre-train an encoder using contrastive learning via self-supervised learning to recognize states as similar if they occur close together in time and receive the same action from the black-box agent. We then add an isometry layer to allow ProtoX to adapt scenario similarity to the downstream task. ProtoX is trained via imitation learning using behavior cloning, and thus requires no access to the environment or agent. In addition to explanation fidelity, we design different prototype shaping terms in the objective function to encourage better interpretability. We conduct various experiments to test ProtoX. Results show that ProtoX achieved high fidelity to the original black-box agent while providing meaningful and understandable explanations.
    R Ragodos, T Wang*, Q Lin, X Zhou
    (* corresponding author)
    Conference on Neural Information Processing Systems (NeurIPS), 2022

  • AdaAX: Explaining Recurrent Neural Networks by Learning Automata with Adaptive States (paper)
    Recurrent neural networks (RNNs) are widely used for handling sequence data. However, their black-box nature makes it difficult for users to interpret the decision-making process. We propose a new method to construct deterministic finite automata to explain RNNs. In an automaton, states are abstracted from hidden states produced by the RNN, and the transitions represent input symbols. Thus, users can follow the paths of transitions, called patterns, to understand how a prediction is produced. Existing methods for extracting automata partition the hidden state space at the beginning of the extraction, which often leads to solutions that are either inaccurate or too large in size to comprehend. Unlike previous methods, our approach allows the automata states to be formed adaptively during the extraction. Instead of defining patterns on pre-determined clusters, our method identifies small sets of hidden states, determined by patterns with finer granularity in data. Then these small sets are gradually merged to form states, allowing users to trade fidelity for lower complexity. Experiments show that our automata are capable of achieving higher fidelity while being significantly smaller in size than baseline methods on synthetic and complex real datasets.
    D Hong, A Maria Segre, T Wang*
    (* corresponding author)
    SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2022

  • Hybrid Predictive Model: When an Interpretable Model Collaborates with a Black-box Model (paper) (code)
    This work addresses the situation where a black-box model with good predictive performance is chosen over its interpretable competitors, and we show interpretability is still achievable in this case. Our solution is to find an interpretable substitute on a subset of data where the black-box model is overkill or nearly overkill while leaving the rest to the black-box. This transparency is obtained at minimal or no cost to predictive performance. Under this framework, we develop a Hybrid Rule Sets (HyRS) model that uses decision rules to capture the subspace of data where the rules are as accurate or almost as accurate as the black-box provided. To train a HyRS, we devise an efficient search algorithm that iteratively finds the optimal model and exploits theoretically grounded strategies to reduce computation. Our framework is agnostic to the black-box during training. Experiments on structured and text data show that HyRS obtains an effective trade-off between transparency and interpretability.
    T Wang, Q Lin
    Journal of Machine Learning Research (JMLR), 2021
    This project was a finalist in the Best Paper Competition at the 13th INFORMS Workshop on Data Mining & Decision Analytics, 2018.

  • Causal Rule Sets for Identifying Subgroups with Enhanced Treatment Effect (paper)
    We introduce a novel generative model for interpretable subgroup analysis for causal inference applications, Causal Rule Sets (CRS). A CRS model uses a small set of short rules to capture a subgroup where the average treatment effect is elevated compared to the entire population. We present a Bayesian framework for learning a causal rule set. The Bayesian framework consists of a prior that favors simple models and a Bayesian logistic regression that characterizes the relation between outcomes, attributes, and subgroup membership. We find maximum a posteriori models using discrete Monte Carlo steps in the joint solution space of rule sets and parameters. We provide theoretically grounded heuristics and bounding strategies to improve search efficiency. Experiments show that the search algorithm can efficiently recover a true underlying subgroup and that CRS shows consistently competitive performance compared to other state-of-the-art baseline methods.
    T Wang, C Rudin
    INFORMS Journal on Computing (IJOC), 2022

  • Interpretable Companions for Black-box Classifiers (paper)
    We present an interpretable companion model for any pre-trained black-box classifier. The idea is that for any input, a user can decide to either receive a prediction from the black-box model, with high accuracy but no explanations, or employ a companion rule to obtain an interpretable prediction with slightly lower accuracy. The companion model is trained from data and the predictions of the black-box model, with the objective combining area under the transparency–accuracy curve and model complexity. Our model provides flexible choices for practitioners who face the dilemma of choosing between always using interpretable models and always using black-box models for a predictive task, so users can, for any given input, take a step back to resort to an interpretable prediction if they find the predictive performance satisfying, or stick to the black-box model if the rules are unsatisfying. To show the value of companion models, we design a human evaluation with more than seventy people to investigate the tolerable accuracy loss to gain interpretability for humans.
    D Pan, T Wang, and S Hara
    International Conference on Artificial Intelligence and Statistics (AISTATS), 2020

  • Model-Agnostic Linear Competitors (paper)
    We present the Model-Agnostic Linear Competitors (MALC) for partially interpretable multi-class classification. MALC is a hybrid model that uses linear models to locally substitute any black-box model, capturing subspaces that are most likely to be in a class while leaving the rest of the data to the black box. MALC brings together the interpretable power of linear models and the good predictive performance of a black-box model. We formulate the training of a MALC model as a convex optimization, where predictive accuracy and transparency (defined as the percentage of data captured by the linear models) are balanced through a carefully designed objective function, and solve it with the accelerated proximal gradient method. Experiments show that MALC can effectively trade prediction accuracy for transparency and provide an efficient frontier that spans the entire spectrum of transparency.
    H Rafique, T Wang*, Q Lin
    (* corresponding author)
    International Conference on Machine Learning (ICML), 2020
    Runner-up for Best Paper Award at INFORMS Workshop on Data Science, 2019.

  • Gaining Free or Low-Cost Transparency with Interpretable Partial Substitute (code) (paper)
    This work addresses the situation where a black-box model outperforms all its interpretable competitors. The existing solution to understanding the black-box is to use an explainer model to generate explanations, which can be ambiguous and inconsistent. We propose an alternative solution by finding an interpretable substitute on a subset of data where the black-box model is overkill or nearly overkill and using this interpretable model to process this subset of data, leaving the rest to the black-box. This way, on this subset of data, the model gains complete interpretability and transparency, replacing otherwise imperfect approximations by an external explainer. This transparency is obtained at minimal or no cost to predictive performance. Under this framework, we develop the Partial Substitute Rules (PSR) model that uses decision rules to capture the subspace of data where the rules are as accurate or almost as accurate as the black-box provided. PSR is agnostic to the black-box model. To train a PSR, we devise an efficient search algorithm that iteratively finds the optimal model and exploits theoretically grounded strategies to reduce computation. Experiments on structured and text data show that PSR obtains an effective trade-off between transparency and interpretability.
    T Wang
    International Conference on Machine Learning (ICML), 2019

  • Multi-value Rule Sets for Interpretable Classification with Feature-Efficient Representations (code) (paper)
    We present the Multi-value Rule Set (MRS) for interpretable classification with feature-efficient representations. Compared to rule sets built from single-value rules, MRS adopts a more generalized form of association rules that allows multiple values in a condition. Rules of this form are more concise than classical single-value rules in capturing and describing patterns in data. Our formulation also pursues a higher efficiency of feature utilization, which reduces possible cost in data collection and storage. We propose a Bayesian framework for formulating an MRS model and develop an efficient inference method for learning a maximum a posteriori model, incorporating theoretically grounded bounds to iteratively reduce the search space and improve the search efficiency. Experiments on synthetic and real-world data demonstrate that MRS models have significantly smaller complexity and fewer features than baseline models while being competitive in predictive accuracy. Human evaluations show that MRS is easier to understand and use compared to other rule-based models.
    T Wang
    Conference on Neural Information Processing Systems (NeurIPS), 2018

  • A Bayesian Framework for Learning Rule Sets for Interpretable Classification (code) (paper)
    We present a machine learning algorithm for building classifiers that are composed of a small number of short rules. These are restricted disjunctive normal form models. An example of a classifier of this form is as follows: If X satisfies (condition A AND condition B) OR (condition C) OR ..., then Y=1. Models of this form have the advantage of being interpretable to human experts since they produce a set of rules that concisely describe a specific class. We present two probabilistic models with prior parameters that the user can set to encourage the model to have a desired size and shape, to conform with a domain-specific definition of interpretability. We provide a scalable MAP inference approach and develop theoretical bounds to reduce computation by iteratively pruning the search space. We apply our method (Bayesian Rule Sets, BRS) to characterize and predict user behavior with respect to in-vehicle context-aware personalized recommender systems. Our method has a major advantage over classical associative classification methods and decision trees in that it does not greedily grow the model.
    T Wang, C Rudin, F Doshi, Y Liu, E Klampfl, P MacNeille
    Journal of Machine Learning Research (JMLR), 2017 

  • Bayesian Rule Sets for Interpretable Classification (code) (paper)
    A Rule Set model consists of a small number of short rules for interpretable classification, where an instance is classified as positive if it satisfies at least one of the rules. The rule set provides reasons for predictions, and also descriptions of a particular class. We present a Bayesian framework for learning Rule Set models, with prior parameters that the user can set to encourage the model to have a desired size and shape in order to conform with a domain-specific definition of interpretability. We use an efficient inference approach for searching for the MAP solution and provide theoretical bounds to reduce computation. We apply Rule Set models to ten UCI data sets and compare the performance with other interpretable and non-interpretable models.
    T Wang, C Rudin, VD Finale, YI Liu, E Klampfl, P MacNeille
    The IEEE International Conference on Data Mining (ICDM), 2016
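
The rule-set idea that recurs in the entries above (an instance is classified as positive if it satisfies at least one short rule, and in the hybrid models the rules handle the inputs they cover while a black box handles the rest) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the rule representation, the function names (satisfies, rule_set_predict, hybrid_predict, transparency), and the toy features are all assumptions made for illustration, and no rule learning or Bayesian inference is shown.

```python
from typing import Callable, Dict, List, Tuple

Instance = Dict[str, object]
# Hypothetical representation: a rule is a conjunction of (feature, value) conditions.
Rule = List[Tuple[str, object]]


def satisfies(x: Instance, rule: Rule) -> bool:
    """True if x meets every condition of the rule (logical AND)."""
    return all(x.get(feature) == value for feature, value in rule)


def rule_set_predict(x: Instance, rules: List[Rule]) -> int:
    """Disjunctive normal form: predict 1 if x satisfies at least one rule."""
    return int(any(satisfies(x, rule) for rule in rules))


def hybrid_predict(
    x: Instance,
    pos_rules: List[Rule],
    neg_rules: List[Rule],
    black_box: Callable[[Instance], int],
) -> Tuple[int, bool]:
    """Return (prediction, covered). Covered inputs get an interpretable
    rule-based prediction; everything else is deferred to the black box.
    Checking positive rules before negative ones is an assumption of this sketch."""
    if any(satisfies(x, r) for r in pos_rules):
        return 1, True
    if any(satisfies(x, r) for r in neg_rules):
        return 0, True
    return black_box(x), False


def transparency(data: List[Instance], pos_rules: List[Rule], neg_rules: List[Rule]) -> float:
    """Fraction of instances that are handled by the rules rather than the black box."""
    covered = sum(
        any(satisfies(x, r) for r in pos_rules + neg_rules) for x in data
    )
    return covered / len(data)


if __name__ == "__main__":
    pos = [[("age", "young"), ("student", True)], [("income", "high")]]
    neg = [[("age", "old")]]
    x = {"age": "young", "student": False, "income": "low"}
    # Not covered by any rule, so the (stand-in) black box decides.
    print(hybrid_predict(x, pos, neg, black_box=lambda inst: 1))
```

The transparency function mirrors the notion used in several abstracts above: the share of data that receives a fully interpretable prediction.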

Machine Learning for Information Systems & Business Decision-Making


  • Can Your Toothpaste Shopping Predict Mutual Fund Purchasing? – Transfer Knowledge from Consumer Goods to Financial Products via Machine Learning (paper)
    The rapid growth of e-commerce and the growing number of online customers have offered online retailers such as Amazon and Alibaba opportunities to extend their product categories. To better understand and predict customers’ purchasing behavior for products of new categories, it is natural to obtain information on customers’ prior browsing and shopping history from existing categories of products. However, there are several challenges when a new category is distinct from existing categories. First, it is unknown whether customer historical information is beneficial to predicting customer behavior in a new category that is distinct from existing categories. Second, the methodologies for such predictions are limited in industry, since customers have heterogeneous browsing histories involving different product categories. Third, it is unknown how to solve the cold-start issue in this context. To overcome these challenges, in this paper, we study whether and how customer historical information is transferable to a new category, which is distinct from existing categories, using data from one of the largest e-commerce platforms in China. Specifically, we use information from customers’ browsing history for shampoo, toothpaste, and washers to predict their mutual fund purchases. We propose two types of knowledge transfer, embedded feature transfer and model transfer. Results show that information extracted from the three products is beneficial for predicting customers’ purchases of mutual funds. Compared with direct feature engineering, embedded feature transfer achieves the best performance, and model transfer solves the cold-start issue.
    S Wang, T Wang, C He, Y Hu
    CIST, 2021
    Under Review

  • Same-Day Delivery with Fair Customer Service
    The demand for same-day delivery (SDD) has increased rapidly in the last few years and has particularly boomed during the COVID-19 pandemic. Existing literature on the problem has focused on maximizing the utility, represented as the total number of expected requests served. However, a utility-driven solution results in unequal opportunities for customers to receive delivery service, raising questions about fairness. In this paper, we study the problem of achieving fairness in SDD. We construct a regional-level fairness constraint that ensures customers from different regions have an equal chance of being served. We develop a reinforcement learning model to learn policies that focus on both overall utility and fairness. Experimental results demonstrate the ability of our approach to mitigate the unfairness caused by geographic differences and resource constraints, at both coarser- and finer-grained levels and with a small cost to utility. In addition, we simulate a real-world situation where the system is suddenly overwhelmed by a surge of requests, mimicking the COVID-19 scenario. Our model is robust to the systematic pressure and is able to maintain fairness with little compromise to the utility.
    X Chen, T Wang, B Thomas, MW Ulmer
    European Journal of Operational Research (EJOR), 2022

  • Evaluating the Effectiveness of Marketing Campaigns for Malls Using A Novel Interpretable Machine Learning Model (paper)
    Evaluating the returns to marketing spending and designing optimal budget allocations have been challenging tasks for businesses. New data availability and novel machine learning methods provide new opportunities to significantly improve such decisions. The goal of this paper is to prescribe the optimal marketing budget allocation for a major shopping mall chain, specifying the timing, content, and budget allocated to different campaigns. We use a unique daily-level dataset on customer traffic and campaigns across 25 malls during a two-year period for the analysis. We then propose a novel machine learning model, a generalized additive model with a neural network term, to learn the relationship between campaign budget and customer traffic. We classify the campaigns into different categories based on customer intentions, sales incentives, experience incentives, and online promotion conflicts and compare the ROI for each category. Results indicate that during the off-season, campaigns with experience incentives lead to larger increases in customer traffic than campaigns with sales incentives only; during the peak season, campaigns with both incentives have similar impacts on customer traffic as those with experience incentives only. In addition, we find that malls can piggyback on online shopping promotion events and boost customer traffic with sufficient marketing spending in the same period. We further compute an optimal budget allocation scheme based on the prediction results. The optimization step yields an additional insight that malls should reduce budget during the peak season and increase budget during the off-season to avoid over-marketing. Overall, we estimate that, holding the total budget fixed, malls are expected to see an 11% increase in ROI from implementing the budget optimization.
    T Wang, C He, F Jin and Y Hu
    Information Systems Research, 2021

  • Nonverbal Cues in Text: An Algorithm for Automatic Coding of Textual Paralanguage
    Brands and consumers alike have become creators and distributors of digital words, thus generating increasing interest in insights to be gained from text-based content. This work develops an algorithm to identify textual paralanguage, which are nonverbal parts of speech expressed in online communication. The textual paralanguage classifier (TPLC) is developed and validated utilizing social media data from Twitter and YouTube (N = 922,524 posts). Based on auditory, tactile, and visual properties of text, this tool detects nonverbal communication cues. These nonverbal cues are critical indicators of sentiment polarity and intensity, yet are often neglected by other word-based sentiment lexicons. We demonstrate the predictive power of automatically detected textual paralanguage in its ability to predict consumer engagement over and above existing text analytic tools. This algorithm is designed for researchers, scholars, and practitioners seeking to optimize marketing communications and offers a methodological advancement to quantify the importance of not only what is said verbally, but how it is said nonverbally.
    A Luangrath, Y Xu, and T Wang
    Journal of Marketing Research, 2022

  • A Holistic Approach to Interpretability in Financial Lending: Models, Visualizations, and Summary-Explanations (paper)
    We propose a possible solution to a public challenge posed by the Fair Isaac Corporation (FICO), which is to provide an explainable model for credit risk assessment. Rather than present a black box model and explain it afterwards, we provide a globally interpretable model that is as accurate as other neural networks. Our “two-layer additive risk model” is decomposable into subscales, where each node in the second layer represents a meaningful subscale, and all of the nonlinearities are transparent. We provide three types of explanations that are simpler than, but consistent with, the global model. One of these explanation methods involves solving a minimum set cover problem to find high-support globally-consistent explanations. We present a new online visualization tool to allow users to explore the global model and its explanations.
    C Chen, K Lin, C Rudin, Y Shaposhnik, S Wang, T Wang
    (Authors are listed in alphabetical order)
    A shorter version was published at the NIPS 2018 Workshop on Challenges and Opportunities for AI in Financial Services: the Impact of Fairness, Explainability, Accuracy, and Privacy
    Decision Support Systems, 2021
    This project won the FICO Recognition Award for the FICO xML Challenge, for building an interpretable model that beats black-box models. See the blog on the FICO website about this entry.

  • A Crystal Ball for Product Success: Accurate and Early Predictions in Medical Crowdfunding with a New Deep Learning Approach (paper)
    Medical crowdfunding has seen rapid growth in recent years and it has become a popular channel for people needing financial help. However, there exists large heterogeneity in donations across cases and fundraisers face significant uncertainty in whether their crowdfunding campaigns can meet fundraising goals. We aim to develop novel algorithms to provide accurate and timely predictions of fundraising performance, to better inform fundraisers. For this purpose, we use a combination of machine learning techniques to extract interpretable insights and provide accurate predictions. We start with a model using only the time-invariant features of cases, to provide an immediate evaluation of fundraising performance. Then we analyze the time-varying features from daily observations of case metrics, conduct a multivariate time series clustering and identify four typical temporal donation patterns. Finally, we incorporate the clustering patterns to design a deep learning model that provides daily updated predictions of the total amount of money a fundraiser is likely to receive. Compared with baseline models, our model achieves better accuracy on average and requires a shorter observation window of the time-varying features from the campaign launch to provide robust predictions with high confidence. Our modeling approach can be applied to assist fundraisers’ decisions on promoting their campaigns better and can potentially help crowdfunding platforms design more customized suggestions to improve the chances of success for all cases. The proposed framework can be generalized to other fields with both time-varying and time-invariant information.
    T Wang, F Jin, Y Hu, Y Cheng
    Under Review

  • Collaboration in the Digital Era: The Implications of Social Network Structures and Employee Turnover (paper)
    N Li, J Yu, T Wang
    Academy of Management Proceedings, 2018. 

Machine Learning for Social Good


  • Surviving COVID-19: Recovery Curves of Mall Traffic in China. (paper)
    The outbreak of COVID-19 has caused huge disruptions to the world economy. As a number of countries make progress in containing this outbreak, some of them have started to reopen their economies. We study the curves of recovery after reopening the economy, using a unique real-time dataset of daily customer traffic of 463 malls from 88 cities in China. Our results demonstrate that 9 weeks after reopening the economy, mall traffic has recovered to 64.0% of its level before this outbreak. In addition, progress in containing this outbreak, such as reporting zero new local cases and clearing all existing cases, could significantly boost the recovery of mall traffic. Furthermore, we find that the recovery follows different curves across different cities, and this heterogeneity can be explained by pandemic situations, city tiers and city characteristics such as population, GDP, industrial structure, etc. More specifically, faster recovery speeds are observed in cities with better pandemic situations, lower city tiers, smaller migrant population, lower proportion of tertiary industry, higher proportion of secondary industry and higher GDP per capita.
    C He, T Wang, X Luo, Z Luo, J Guan, H Gao, K Zhu, L Feng, Y Xu, Y Cheng, Y Hu

  • Finding patterns with a rotten core: Data mining for crime series with cores (paper)
    One of the most challenging problems facing crime analysts is that of identifying crime series, which are sets of crimes committed by the same individual or group. Detecting crime series can be an important step in predictive policing, as knowledge of a pattern can be of paramount importance toward finding the offenders or stopping the pattern. Currently, crime analysts detect crime series manually; our goal is to assist them by providing automated tools for discovering crime series from within a database of crimes. Our approach relies on a key hypothesis that each crime series possesses at least one core of crimes that are very similar to each other, which can be used to characterize the modus operandi (M.O.) of the criminal. Based on this assumption, as long as we find all of the cores in the database, we have found a piece of each crime series. We propose a subspace clustering method, where the subspace is the M.O. of the series. The method has three steps: first, we construct a similarity graph to link crimes that are generally similar; second, we find cores of crime using an integer linear programming approach; and third, we construct the rest of the crime series by merging cores to form the full crime series. To judge whether a set of crimes is indeed a core, we consider both pattern-general similarity, which can be learned from past crime series, and pattern-specific similarity, which is specific to the M.O. of the series and cannot be learned. Our method can be used for general pattern detection beyond crime series detection, as cores exist for patterns in many domains.
    T Wang, C Rudin, D Wagner, R Sevieri
    Big Data, 2015

  • Detecting patterns of crime with series finder (code) (paper)
    Many crimes can happen every day in a major city, and figuring out which ones are committed by the same individual or group is an important and difficult data mining challenge. To do this, we propose a pattern detection algorithm called Series Finder, that grows a pattern of discovered crimes from within a database, starting from a “seed” of a few crimes. Series Finder incorporates both the common characteristics of all patterns and the unique aspects of each specific pattern. We compared Series Finder with classic clustering and classification models applied to crime analysis. It shows promising results on a decade’s worth of crime pattern data from the Cambridge Police Department.
    T Wang, C Rudin, D Wagner, R Sevieri
    Twenty-Seventh AAAI Conference on Artificial Intelligence, Late Breaking track, 2013

  • Learning to detect patterns of crime (code) (paper) Our goal is to automatically detect patterns of crime. Among a large set of crimes that happen every year in a major city, it is challenging, time-consuming, and labor-intensive for crime analysts to determine which ones may have been committed by the same individual(s). If automated, data-driven tools for crime pattern detection are made available to assist analysts, these tools could help police to better understand patterns of crime, leading to more precise attribution of past crimes and the apprehension of suspects. To do this, we propose a pattern detection algorithm called Series Finder that grows a pattern of discovered crimes from within a database, starting from a “seed” of a few crimes. Series Finder incorporates both the common characteristics of all patterns and the unique aspects of each specific pattern, and has had promising results on a decade’s worth of crime pattern data collected by the Crime Analysis Unit of the Cambridge Police Department.
    T Wang, C Rudin, D Wagner, R Sevieri
    Joint European conference on machine learning and knowledge discovery in databases (ECML), 2013
    – This project was the second-place winner of “Doing Good with Good OR”, INFORMS, 2015, and was reported by several media outlets, including WIRED.com and Wiki Crime Analysis.
    – Ideas from this work were implemented at the NYPD by Alex Chohlas-Wood and E.S. Levine in their algorithm Patternizr (please see this paper for the implementation details), which has operated live in New York City since 2019.

  • Next Hit Predictor – Self-exciting Risk Modeling for Predicting Next Locations of Serial Crimes. Our goal is to predict the location of the next crime in a crime series, based on the identified previous offenses in the series. We build a predictive model called Next Hit Predictor (NHP) that finds the most likely location of the next serial crime via a carefully designed risk model. The risk model follows the paradigm of a self-exciting point process, which consists of a background crime risk and triggered risks stimulated by previous offenses in the series. Thus, NHP creates a risk map for a crime series at hand. To train the risk model, we formulate a convex learning objective that considers pairwise rankings of locations and use stochastic gradient descent to learn the optimal parameters. Next Hit Predictor incorporates both spatial-temporal features and geographical characteristics of prior crime locations in the series. Next Hit Predictor has demonstrated promising results on decades’ worth of serial crime data collected by the Crime Analysis Unit of the Cambridge Police Department in Massachusetts, USA.
    Y Li, T Wang
    AI for Social Good NIPS Workshop, 2018.

  • Humans in the Loop: Priors and Missingness on the Road to Prediction (paper) Survey datasets are often wider than they are long. This high ratio of variables to observations raises concerns about overfitting during prediction, making informed variable selection important. Recent applications in computer science have sought to incorporate human knowledge into machine learning methods to address these problems. We implement such a “human-in-the-loop” approach in the Fragile Families Challenge. We use surveys to elicit knowledge from experts and laypeople about the importance of different variables to different outcomes. This strategy gives us the option to subset the data before prediction or to incorporate human knowledge as scores in prediction models, or both together. We find that human intervention is not obviously helpful. Human-informed subsetting reduces predictive performance, and, considered alone, approaches incorporating scores perform marginally worse than approaches that do not. However, incorporating human knowledge may still improve predictive performance, and future research should consider new ways of doing so.
    A Filippova, C Gilroy, R Kashyap, A Kirchner, A C Morgan, K Polimis, and T Wang
    (Authors are listed in alphabetical order)
    Socius, 2019.

Machine Learning for Healthcare


  • Dental Anomaly Image Classification Using Transfer Learning and a Convolutional Neural Network
    R Ragodos*, T Wang*, G Wehby, S Weinberg, D Dawson, M Marazita, L Moreno Uribe, B Howe*
    (* Equal Contribution)
    Nature Scientific Reports, 2022

  • Does Having More Friends Reduce Cancer Risks? — Evidence from New Mobile-Based Health Data
    Y Cheng, Y Hu, F Jin, and T Wang
    (Authors are listed alphabetically)
    SCECR, 2019.

  • Use of Extracorporeal membrane oxygenation and Associated Outcomes in Children Hospitalized Due to Sepsis in the United States: A large population based study (paper)
    K Robb, A Badheka, T Wang, S Rampa, V Allareddy, V Allareddy
    PLOS ONE, 2019.

  • Outcomes Associated with Peripherally Inserted Central Catheters in Hospitalized Children: 7-year single center experience (paper)
    J Bloxham, A Badheka, A Schmitz, B Freyenberger, T Wang, S Rampa, V Allareddy, M Auslender, V Allareddy
    BMJ Open, 2019.

  • Interpretable Patient Mortality Prediction with Multi-value Rule Sets
    T Wang, V Allareddy, S Rampa, V Allareddy
    KDD Workshop on Machine Learning for Healthcare and Medicine, 2018.

  • Prevalence and predictors of C. difficile infections in hospitalized patients with major surgical procedures in the USA: Analysis using traditional and machine learning methods (paper)
    V Allareddy, T Wang, S Rampa, J Caplin, R Nalliah, A Badheka, V Allareddy
    American Journal of Surgery, 2018.
Publications was last modified: December 22nd, 2022 by Rachel Stewart