Yeah! I feel like the possibilities are endless here. Good point on what the Logit Lens would look like with multimodal models. My colleague Ana Crisan at Waterloo did some really cool work developing a tool to link individual tokens to attention maps (I don’t have the link handy now). There is so much more to do!!! (P.S. Thanks for leaving a comment! Much appreciated.)