Understanding Black-box Predictions via Influence Functions


How can we explain the predictions of a black-box model? With the rapid adoption of machine learning systems in sensitive applications, there is an increasing need to make black-box models explainable (Goodman & Flaxman, 2017). In "Understanding Black-box Predictions via Influence Functions" (ICML 2017 best paper), Pang Wei Koh and Percy Liang use influence functions, a classic technique from robust statistics (Cook, 1977; Cook & Weisberg, 1980), to trace a model's prediction through the learning algorithm and back to its training data, thereby identifying the training points most responsible for a given prediction. They show that even on non-convex and non-differentiable models, where the theory breaks down, approximations to influence functions can still provide valuable information. To scale influence functions up to modern machine learning settings, the paper develops a simple, efficient implementation that requires only oracle access to gradients and Hessian-vector products. Related work on algorithmic transparency includes quantitative input influence (Datta et al., 2016).
Influence functions efficiently estimate the effect of removing a single training data point on a model's learned parameters, without actually retraining the model. Given training points z_1, ..., z_n with z_i = (x_i, y_i) in X x Y, let theta-hat minimize the empirical risk (1/n) sum_i L(z_i, theta). A naive way to measure influence is to delete the instance from the training data, retrain the model on the reduced dataset, and observe the difference in the model parameters or predictions (either individually or over the complete dataset); for modern models this is far too expensive to do for every training point. The idea behind influence functions is instead to compute the parameter change if z were upweighted by some small epsilon, giving new parameters

$$\hat{\theta}_{\epsilon, z} \stackrel{\text{def}}{=} \arg\min_{\theta \in \Theta} \frac{1}{n} \sum_{i=1}^{n} L(z_i, \theta) + \epsilon L(z, \theta).$$

A classical result gives the influence of upweighting z on the parameters,

$$\mathcal{I}_{\text{up,params}}(z) \stackrel{\text{def}}{=} \left.\frac{d \hat{\theta}_{\epsilon, z}}{d \epsilon}\right|_{\epsilon=0} = -H_{\hat{\theta}}^{-1} \nabla_{\theta} L(z, \hat{\theta}),$$

where $H_{\hat{\theta}}$ is the Hessian of the training loss at $\hat{\theta}$. Applying the chain rule, the influence of upweighting z on the loss at a test point z_test is

$$\mathcal{I}_{\text{up,loss}}(z, z_{\text{test}}) \stackrel{\text{def}}{=} \left.\frac{d L(z_{\text{test}}, \hat{\theta}_{\epsilon, z})}{d \epsilon}\right|_{\epsilon=0} = \left.\nabla_{\theta} L(z_{\text{test}}, \hat{\theta})^{\top} \frac{d \hat{\theta}_{\epsilon, z}}{d \epsilon}\right|_{\epsilon=0} = -\nabla_{\theta} L(z_{\text{test}}, \hat{\theta})^{\top} H_{\hat{\theta}}^{-1} \nabla_{\theta} L(z, \hat{\theta}).$$

Removing z corresponds to upweighting it by epsilon = -1/n, so the change in test loss caused by removing z can be approximated by $-\frac{1}{n}\mathcal{I}_{\text{up,loss}}(z, z_{\text{test}})$ without ever retraining the model. Training points whose upweighting decreases the test loss are helpful for that prediction, and those whose upweighting increases it are harmful.
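To make the closed-form expression concrete, here is a minimal sketch (not taken from the paper or the accompanying repository) that computes I_up,loss for a tiny logistic-regression model by forming the Hessian explicitly. The data, model, damping constant, and function names are illustrative assumptions; explicit Hessians are only feasible when the number of parameters is small.

```python
import torch

torch.manual_seed(0)

# Toy setup: binary logistic regression with d = 5 parameters,
# so the Hessian of the training loss can be formed and inverted explicitly.
n, d = 200, 5
X = torch.randn(n, d)
y = (torch.rand(n) < torch.sigmoid(X @ torch.randn(d))).float()
w = torch.zeros(d, requires_grad=True)

def point_loss(w, x, t):
    # Loss L(z, w) of a single example z = (x, t).
    return torch.nn.functional.binary_cross_entropy_with_logits((x @ w).reshape(1), t.reshape(1))

def train_loss(w):
    # Empirical risk (1/n) * sum_i L(z_i, w).
    return torch.nn.functional.binary_cross_entropy_with_logits(X @ w, y)

# Fit w to (approximately) minimize the training loss.
opt = torch.optim.LBFGS([w], max_iter=100)
def closure():
    opt.zero_grad()
    loss = train_loss(w)
    loss.backward()
    return loss
opt.step(closure)

# Explicit Hessian of the training loss at the optimum, with a small damping term
# for numerical stability (an illustrative, not prescribed, choice).
H = torch.autograd.functional.hessian(train_loss, w.detach()) + 1e-3 * torch.eye(d)

# A hypothetical test point z_test.
x_test, y_test = torch.randn(d), torch.tensor(1.0)
g_test = torch.autograd.grad(point_loss(w, x_test, y_test), w)[0]

def influence_up_loss(i):
    # I_up,loss(z_i, z_test) = -grad L(z_test)^T H^{-1} grad L(z_i)
    g_i = torch.autograd.grad(point_loss(w, X[i], y[i]), w)[0]
    return -(g_test @ torch.linalg.solve(H, g_i))

scores = torch.stack([influence_up_loss(i) for i in range(n)])
# Predicted change in test loss from removing point i is -scores[i] / n.
print(scores.argsort()[:5])  # training points whose upweighting most decreases the test loss
```

For models this small the estimates can be checked directly against leave-one-out retraining, which is essentially the validation the paper performs on its convex test cases.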
The PyTorch implementation

This is a PyTorch reimplementation of influence functions from the ICML 2017 best paper by Pang Wei Koh and Percy Liang (nimarb/pytorch_influence_functions on GitHub). It calculates the influence of the individual samples of your training dataset on the final predictions of your model. You can either install the package directly through pip or import it as a package after it is on your path; its dependencies are NumPy, SciPy, scikit-learn, and Pandas, with further requirements only needed to run the tests. Configuration is handled through a dictionary: you can get the default config by calling ptif.get_default_config() and then override individual entries, for example the output directory, which is created automatically to prevent a runtime error.

The code supports two modes. The first mode is called calc_img_wise: for each test image it calculates the two required quantities (s_test for the test image and grad_z for every training image) and then calculates the influence of all training images on that single image, returning the training samples ordered by helpfulness and harmfulness. The second mode is called calc_all_grad_then_test: it first calculates and stores all grad_z values, then calculates all s_test values and saves those to disk, and the algorithm will then calculate the influence functions for all images by combining the stored quantities. This avoids recomputation if the influence function is calculated for multiple test samples, at the cost of substantial storage; in the calc_img_wise mode, by contrast, we throw away all grad_z values as soon as they have been used. TL;DR: the recommended way is using calc_img_wise unless you need to reuse the same training gradients across a very large number of test points.
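A minimal usage sketch is below. It assumes the entry points mentioned above (ptif.get_default_config and a callable calc_img_wise) plus illustrative CIFAR-10 loaders and a stand-in model; argument names and the return format may differ in the installed version, so treat this as a starting point rather than a definitive API reference.

```python
import torch
import torchvision
import pytorch_influence_functions as ptif

# Illustrative CIFAR-10 loaders and model (assumptions, not from the repository's README).
transform = torchvision.transforms.ToTensor()
trainset = torchvision.datasets.CIFAR10("./data", train=True, download=True, transform=transform)
testset = torchvision.datasets.CIFAR10("./data", train=False, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=1, shuffle=False)
testloader = torch.utils.data.DataLoader(testset, batch_size=1, shuffle=False)

model = torchvision.models.resnet18(num_classes=10)  # assumed to be already trained

config = ptif.get_default_config()  # dictionary of default settings
config["outdir"] = "outdir"         # created automatically if it does not exist
# Other entries control the s_test estimation and logging; see the default config.

# calc_img_wise: per test image, estimate s_test and then the influence of every training image.
results = ptif.calc_img_wise(config, model, trainloader, testloader)
# results holds influence scores with helpful/harmful rankings per test sample
# (exact return format depends on the package version).
```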
In the paper's notation, we are given training points z_1, ..., z_n, where z_i = (x_i, y_i) in X x Y. For each test point the implementation needs two kinds of quantities: grad_z, the gradient of the loss at a single training point, and s_test, an estimate of the inverse-Hessian-vector product H^{-1} grad L(z_test, theta-hat), obtained stochastically so that only Hessian-vector products are ever required. While one grad_z is also used to estimate the initial value of the Hessian approximation during the s_test calculation, this cost is insignificant. Influence functions can of course also be used for data other than images, even though the mode names and bundled examples refer to images. The original code and the datasets for the paper's experiments can be found at the authors' Codalab link. In the example here, CIFAR-10 is used as the dataset: for a test image of a ship, the method returns the training images that were most helpful for getting that prediction right, and comparing the rankings produced by different trained models, we can see that different models learn more from different images.
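The stochastic estimation of s_test only needs Hessian-vector products. The following is a rough, self-contained sketch of that recursion in plain PyTorch; the damping, scale, and step counts are illustrative assumptions and the repository's exact implementation may differ.

```python
import torch

def hvp(loss, params, vec):
    """Hessian-vector product: returns H v, where H is the Hessian of `loss` w.r.t. `params`."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    dot = sum((g * v).sum() for g, v in zip(grads, vec))
    return torch.autograd.grad(dot, params)

def s_test(model, loss_fn, z_test, train_loader, damping=0.01, scale=25.0, steps=100):
    """Stochastically approximate H^{-1} grad L(z_test) using only Hessian-vector products.

    Recursion: h_0 = v;  h_t = v + (1 - damping) * h_{t-1} - (H_batch h_{t-1}) / scale;
    the final estimate is h_T / scale.
    """
    params = [p for p in model.parameters() if p.requires_grad]
    x_t, y_t = z_test
    v = torch.autograd.grad(loss_fn(model(x_t), y_t), params)
    h = [vi.clone() for vi in v]
    data_iter = iter(train_loader)
    for _ in range(steps):
        try:
            x, y = next(data_iter)
        except StopIteration:
            data_iter = iter(train_loader)
            x, y = next(data_iter)
        batch_loss = loss_fn(model(x), y)
        Hh = hvp(batch_loss, params, h)
        h = [vi + (1 - damping) * hi - Hhi / scale for vi, hi, Hhi in zip(v, h, Hh)]
    return [hi / scale for hi in h]
```

Once s_test is available, the influence of a training point z on the test loss is just the negative dot product of s_test with grad_z, which is exactly what the two modes above compute in different orders.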
The paper also derives a perturbation version of influence. For z = (x, y), define z_delta := (x + delta, y) and

$$\hat{\theta}_{\epsilon, z_{\delta}, -z} \stackrel{\text{def}}{=} \arg\min_{\theta \in \Theta} \frac{1}{n} \sum_{i=1}^{n} L(z_i, \theta) + \epsilon L(z_{\delta}, \theta) - \epsilon L(z, \theta),$$

the parameters obtained when weight is moved from z onto its perturbed copy z_delta. Then

$$\left.\frac{d \hat{\theta}_{\epsilon, z_{\delta}, -z}}{d \epsilon}\right|_{\epsilon=0} = \mathcal{I}_{\text{up,params}}(z_{\delta}) - \mathcal{I}_{\text{up,params}}(z) = -H_{\hat{\theta}}^{-1}\left(\nabla_{\theta} L(z_{\delta}, \hat{\theta}) - \nabla_{\theta} L(z, \hat{\theta})\right),$$

and for small delta,

$$\left.\frac{d \hat{\theta}_{\epsilon, z_{\delta}, -z}}{d \epsilon}\right|_{\epsilon=0} \approx -H_{\hat{\theta}}^{-1}\left[\nabla_{x} \nabla_{\theta} L(z, \hat{\theta})\right] \delta, \qquad \hat{\theta}_{z_{\delta}, -z} - \hat{\theta} \approx -\frac{1}{n} H_{\hat{\theta}}^{-1}\left[\nabla_{x} \nabla_{\theta} L(z, \hat{\theta})\right] \delta.$$

Differentiating the test loss with respect to the perturbation gives

$$\mathcal{I}_{\text{pert,loss}}(z, z_{\text{test}})^{\top} \stackrel{\text{def}}{=} \left.\nabla_{\delta} L\left(z_{\text{test}}, \hat{\theta}_{z_{\delta}, -z}\right)^{\top}\right|_{\delta=0} = -\nabla_{\theta} L(z_{\text{test}}, \hat{\theta})^{\top} H_{\hat{\theta}}^{-1} \nabla_{x} \nabla_{\theta} L(z, \hat{\theta}),$$

the direction in which perturbing the training input x most increases the loss on z_test. For binary logistic regression, the upweighting influence has the closed form

$$\mathcal{I}_{\text{up,loss}}(z, z_{\text{test}}) = -y_{\text{test}}\, y \cdot \sigma\left(-y_{\text{test}} \theta^{\top} x_{\text{test}}\right) \cdot \sigma\left(-y \theta^{\top} x\right) \cdot x_{\text{test}}^{\top} H_{\hat{\theta}}^{-1} x,$$

which makes explicit that influential training points have high training loss and align with the test point through the inverse Hessian. The appendix of the paper gives a standard derivation of I_up,params in the context of loss minimization (M-estimation). Explicitly forming and inverting H is infeasible for large models, so the inverse-Hessian-vector products are computed with a stochastic estimator whose cost is O(np) in the number of training points n and parameters p, requiring only Hessian-vector products.

Experiments. On a dog-vs-fish task built from ImageNet with 900 training images, the paper compares the points that most influence an Inception v3 network (Szegedy et al., 2016) with those that most influence an SVM with RBF kernel; the influential training points differ markedly between the two models. Even for models that violate the convexity and smoothness assumptions, such as the hinge-loss SVM, influence values computed with a smoothed loss track the results of actual leave-one-out retraining closely. The perturbation influence yields training-set (data poisoning) attacks: visually indistinguishable modifications of training images that flip the model's predictions on targeted test inputs, connecting influence functions to work on adversarial examples (Goodfellow et al., 2015) and adversarial machine learning more broadly (Huang et al., 2011; Mei & Zhu, 2015). Finally, the self-influence I_up,loss(z_i, z_i) can be used to debug training data: after flipping 10% of the labels in a training set, checking examples ranked by self-influence recovers far more of the mislabeled points than checking examples ranked by training loss or chosen at random (see Frenay & Verleysen, 2014, on label noise). Follow-up work extends these ideas, for example to data subsampling ("Less Is Better: Unweighted Data Subsampling via Influence Function").
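As an illustration of how the perturbation influence can be evaluated with automatic differentiation, here is a rough PyTorch sketch. The function names and setup are assumptions, s_test_vec is assumed to come from an s_test estimator such as the one sketched above, and a real training-set attack would iterate this step with projection to the valid input range, which is omitted.

```python
import torch

def pert_influence_direction(model, loss_fn, z_train, s_test_vec):
    """Compute I_pert,loss(z, z_test)^T = -(d/dx) [ s_test . grad_theta L(z, theta) ].

    s_test_vec: list of detached tensors approximating H^{-1} grad_theta L(z_test, theta).
    Returns a tensor with the shape of the training input x (batch dimension included),
    pointing in the direction that most increases the test loss when x is perturbed.
    """
    params = [p for p in model.parameters() if p.requires_grad]
    x, y = z_train
    x = x.clone().requires_grad_(True)
    loss = loss_fn(model(x), y)
    # grad_theta L(z, theta), kept in the graph so we can differentiate it w.r.t. x.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    # s_test^T grad_theta L(z, theta); s_test_vec is treated as a constant.
    dot = sum((g * s).sum() for g, s in zip(grads, s_test_vec))
    (grad_x,) = torch.autograd.grad(dot, x)
    return -grad_x
```

In the paper this direction is applied iteratively to construct the visually indistinguishable training-set attacks; the exact step size and projection details are described there.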
In short, the recipe is straightforward: differentiate the test loss through the response of the learned parameters to an infinitesimal reweighting or perturbation of the training data. That single idea yields a practical tool for interpreting predictions, crafting training-set attacks, and debugging mislabeled data, and it earned the paper the ICML 2017 best paper award.

References

Cook, R. D. Detection of influential observation in linear regression. Technometrics, 1977.
Cook, R. D. and Weisberg, S. Characterizations of an empirical influence function for detecting influential cases in regression. Technometrics, 1980.
Datta, A., Sen, S., and Zick, Y. Algorithmic transparency via quantitative input influence: Theory and experiments with learning systems. IEEE Symposium on Security and Privacy, 2016.
Frenay, B. and Verleysen, M. Classification in the presence of label noise: a survey. IEEE Transactions on Neural Networks and Learning Systems, 2014.
Goodfellow, I. J., Shlens, J., and Szegedy, C. Explaining and harnessing adversarial examples. ICLR, 2015.
Goodman, B. and Flaxman, S. European Union regulations on algorithmic decision-making and a "right to explanation". AI Magazine, 2017.
Huang, L., Joseph, A. D., Nelson, B., Rubinstein, B. I., and Tygar, J. Adversarial machine learning. AISec, 2011.
Koh, P. W. and Liang, P. Understanding black-box predictions via influence functions. In Proceedings of the 34th International Conference on Machine Learning (ICML), pages 1885-1894, 2017.
Mei, S. and Zhu, X. The security of latent Dirichlet allocation. AISTATS, 2015.
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. Rethinking the Inception architecture for computer vision. CVPR, 2016.
