Improved representation learning for semantic role labeling
Semantic role labeling (SRL) is a shallow semantic parsing task that determines
who did what to whom, when, and where by recovering the latent
predicate-argument structure of a sentence. SRL is a fundamental problem in NLP that is
also useful in applications such as question answering, machine translation, and information
extraction. I've been working on reducing labeling errors in a state-of-the-art SRL model
called Linguistically-Informed Self-Attention (LISA),
developed at UMass by Emma Strubell et al. LISA is a
neural network model that performs multi-task learning across dependency parsing,
part-of-speech tagging, predicate detection and SRL.
Error analysis on the model revealed that if labeling errors alone were fixed, the score
would improve by 5.8 absolute F1. That is, the predicates and arguments were identified
correctly in several cases, but classified incorrectly. My analysis further showed that 31% of the
labeling errors were due to core argument confusion in the PropBank label set (verb-specific
roles ARG0-ARG5). While PropBank is a useful semantic formalism, it defines
coarse-grained labels that aren't strictly tied to a single semantic role. Since the meaning of a
role changes across predicates, these roles can be difficult for the model to learn.
To fix this, we're augmenting PropBank labels with finer-grained VerbNet labels by
first predicting VerbNet roles and using the predictions to compose an auxiliary role
representation which is then utilized for PropBank SRL.
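Stripped of model details, the composition step can be sketched as follows; the sizes, parameter names, and random values here are hypothetical stand-ins, not LISA's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

n_verbnet_roles = 30      # hypothetical VerbNet role inventory size
hidden_dim = 64           # hypothetical token representation size
role_dim = 16             # hypothetical embedding size per VerbNet role

# Stand-ins for learned parameters and encoder outputs.
role_embeddings = rng.normal(size=(n_verbnet_roles, role_dim))
token_repr = rng.normal(size=hidden_dim)           # encoder output for one token
verbnet_logits = rng.normal(size=n_verbnet_roles)  # VerbNet classifier scores

# Soft VerbNet prediction: a distribution over fine-grained roles.
probs = np.exp(verbnet_logits - verbnet_logits.max())
probs /= probs.sum()

# Auxiliary role representation: the expectation of the role embeddings
# under the predicted VerbNet distribution.
aux_role_repr = probs @ role_embeddings

# Augmented input fed to the PropBank role classifier.
augmented = np.concatenate([token_repr, aux_role_repr])
print(augmented.shape)  # (80,)
```

Using the soft distribution rather than the argmax role lets gradients flow through the VerbNet predictions during joint training.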
|
Jun 2018 - Present
|
Efficient Graph-Based Word Sense Induction
arxiv | poster
This project was undertaken with the hypothesis that resolving polysemy would help improve
sentiment analysis. Polysemy is the phenomenon of a
single word having multiple senses, like bank as a financial institution and
bank as in river bank. The task of selecting the right sense is called word sense
disambiguation, while the unsupervised discovery of latent senses is called word sense
induction (WSI). We developed an efficient method to perform word sense
induction using graph-based clustering.
Typically, graph-based clustering methods for WSI construct an 'ego-network' by finding the
nearest neighbors of the target word in the word-embedding space. However, this can be
computationally expensive if the graph is large, so we instead proposed to group words into
basis indexes that resemble topics, and then construct a graph in which each node is a basis
index relevant to the target word. To obtain these basis indexes, we use
Distributional Inclusion Vector Embeddings (DIVE), developed by Haw-Shiuan Chang et al.
Sense clusters are
then obtained by clustering basis indexes using spectral clustering. We represent each sense
cluster by a sense embedding: the average of the topic embeddings in the cluster,
weighted by their relevance to the target word.
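A toy sketch of this step, with random vectors standing in for the DIVE basis indexes; the balanced median split on the Fiedler vector is a simplification of full spectral clustering, and all names and sizes here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Random stand-ins for topic embeddings of basis indexes relevant to one
# target word, plus each index's relevance weight to that word.
n_topics, dim = 10, 8
topics = rng.normal(size=(n_topics, dim))
relevance = rng.uniform(0.1, 1.0, size=n_topics)

# Similarity graph over basis indexes (cosine similarity, clipped at 0).
unit = topics / np.linalg.norm(topics, axis=1, keepdims=True)
W = np.clip(unit @ unit.T, 0, None)
np.fill_diagonal(W, 0)

# Spectral bipartition: split on the Fiedler vector (second-smallest
# eigenvector) of the normalized Laplacian. Splitting at the median keeps
# both clusters non-empty in this toy; a sign split is the usual choice.
d = W.sum(axis=1) + 1e-12
L = np.eye(n_topics) - W / np.sqrt(np.outer(d, d))
fiedler = np.linalg.eigh(L)[1][:, 1]
labels = (fiedler > np.median(fiedler)).astype(int)

# Each sense embedding is the relevance-weighted average of the topic
# embeddings in its cluster.
sense_embeddings = np.array([
    np.average(topics[labels == c], axis=0, weights=relevance[labels == c])
    for c in (0, 1)
])
print(sense_embeddings.shape)  # (2, 8)
```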
We then perform expectation-maximization to
refine the sense embeddings, where the E-step replaces each occurrence of a word in the
corpus with its induced sense, and the M-step retrains word embeddings on the relabeled corpus.
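The refinement loop can be sketched with k-means-style updates; in the real method the M-step retrains word embeddings on the relabeled corpus, and the simple cluster mean below is a stand-in for that, with all data synthetic:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic context vectors for occurrences of one polysemous word,
# and current sense embeddings (e.g. from the spectral step).
contexts = rng.normal(size=(100, 8))
senses = rng.normal(size=(2, 8))

for _ in range(5):
    # E-step: relabel each occurrence with its closest sense.
    assign = (contexts @ senses.T).argmax(axis=1)
    # M-step: re-estimate each sense embedding from its assigned contexts
    # (stand-in for retraining embeddings on the relabeled corpus).
    for s in range(2):
        if (assign == s).any():
            senses[s] = contexts[assign == s].mean(axis=0)

print(senses.shape)  # (2, 8)
```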
Our method beats the previous state of the art on several word-context relevance tasks while
producing more interpretable sense clusters, and does so more efficiently. While we haven't yet
been able to correlate better word sense disambiguation with improvements in sentiment analysis,
we plan to return to this question in the near future.
|
Feb 2018 - Apr 2018
|
Low-shot visual recognition for faces
report | code
Low-shot learning, the ability to learn from a small
number of examples, is a relevant problem in the domain of
facial recognition, where access to training data is limited by cost and privacy concerns.
We explored a solution based on data augmentation by hallucinating
new examples, an approach that researchers at Facebook AI Research found to work well
for classification on ImageNet. We used a subset of MS-Celeb-1M
for training data.
In the low-shot
learning set-up, there are a fixed number of base classes, for
which a large number of training examples are available,
and then there are novel classes, for which a limited number
of training examples are available. The classifier is then evaluated based on its ability to
correctly classify both the base and novel classes. Our data augmentation method creates new
examples for the novel classes as follows: the features of each base class are grouped
into clusters using k-means. The difference between two clusters can be
considered a transformation, such as turning a front-facing image into a side-facing one.
These transformations are mined from all base classes and used to train a generator. Then, for
each image in a novel class, a set of transformations is applied to it to
generate new examples. We compare against a baseline where images are generated by naive
approaches such as jittering, and achieve a significant improvement.
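A minimal sketch of the transformation-mining idea with synthetic features; the actual system trains a generator on the mined transformations, whereas here the raw centroid difference is added directly, and all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

def kmeans(X, k, iters=10):
    """Tiny k-means: returns the cluster centroids of X."""
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        assign = ((X[:, None] - centroids[None]) ** 2).sum(-1).argmin(1)
        for c in range(k):
            if (assign == c).any():
                centroids[c] = X[assign == c].mean(0)
    return centroids

# Synthetic features for one base class (many examples) and one
# novel-class example from the same hypothetical face encoder.
base_feats = rng.normal(size=(200, 32))
novel_feat = rng.normal(size=32)

# Mine transformations as differences between cluster centroids of the
# base class (e.g. front-facing -> side-facing).
cents = kmeans(base_feats, k=4)
transforms = [cents[j] - cents[i] for i in range(4) for j in range(4) if i != j]

# Hallucinate new novel-class examples by applying each transformation.
hallucinated = np.stack([novel_feat + t for t in transforms])
print(hallucinated.shape)  # (12, 32)
```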
|
Mar 2018 - Apr 2018
|
Visual Place Recognition
report
As a course project for Computer Vision, I worked on automatically identifying the location
of a place given only an image and no other metadata. We compared three
machine learning models that use different kinds of features
and evaluated their suitability for this task. The first
approach uses a color histogram to represent the image; the
second approach uses the GIST global descriptor as image
features; the third approach uses raw images with a convolutional
neural network without explicitly extracting features. We used the Google Street View dataset,
which contains 62058 images from three cities: Pittsburgh, Orlando, and New York.
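The first, simplest feature is easy to sketch; this is a generic per-channel color histogram, not the project's exact binning:

```python
import numpy as np

rng = np.random.default_rng(4)

def color_histogram(img, bins=8):
    """Concatenated per-channel histograms, each normalized to sum to 1."""
    feats = []
    for c in range(3):  # R, G, B channels
        h, _ = np.histogram(img[..., c], bins=bins, range=(0, 256))
        feats.append(h / h.sum())
    return np.concatenate(feats)

# Stand-in for a street-view RGB image.
img = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
feat = color_histogram(img)
print(feat.shape)  # (24,)
```

Such a histogram is invariant to spatial layout, which is exactly why the GIST descriptor and the CNN features fare better on this task.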
|
Oct 2017 - Dec 2017
|
The Sound of Sirens
This was a project that I worked on over 36 hours at HackUMass 2017 along with some cool
undergrads that I met at the venue. We built a signaling system that could alert hearing-impaired
drivers when a vehicle with sirens is in the vicinity. To detect sirens, we trained a
neural network on the UrbanSounds dataset, extracting features such as the short-time
Fourier transform, mel spectrogram, and spectral contrast. Our model achieved a test accuracy of
over 91%. We then interfaced with a Myo wristband using the Myo SDK to make the wristband
vibrate every time a siren was detected.
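The first of those features can be sketched in a few lines; this is a generic Hann-windowed STFT on a synthetic tone, not our exact preprocessing pipeline:

```python
import numpy as np

def stft_magnitude(signal, frame_len=512, hop=256):
    """Magnitude spectrogram via a Hann-windowed short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, frame_len//2 + 1)

# Stand-in for one second of audio at 16 kHz: a pure 1 kHz tone.
sr = 16000
t = np.arange(sr) / sr
audio = np.sin(2 * np.pi * 1000 * t)

spec = stft_magnitude(audio)
print(spec.shape)  # (61, 257)
```

With a 512-sample frame at 16 kHz, each frequency bin spans 31.25 Hz, so the tone's energy lands in bin 32.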
|
Nov 2017 - Nov 2017
|
CEGAR-based tool for specifying system properties
I worked on this project in collaboration with the Institute of Mathematical Sciences,
Chennai, advised by Prof. Ramanujam and Prof. Sheerazuddin. Our objective was to build a
tool
to make model checking more accessible to software designers. Model checking is a technique
for formal verification of concurrent or distributed systems, to ensure that the system
meets specifications such as safety and liveness. Automated model checkers like NuSMV and
SPIN take a description of the program and its desired properties, and check whether the
program satisfies them. However, the properties typically have to be specified as
formulae in Linear Temporal Logic (LTL), which makes it hard to do model checking at the
design stage.
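For illustration, here are two textbook-style LTL properties (not drawn from this project): a safety property forbidding two processes from being in their critical sections at once, and a liveness property requiring every request to eventually be granted:

```latex
% Safety: the two critical sections are never occupied simultaneously.
\mathbf{G}\,\neg(\mathit{crit}_1 \land \mathit{crit}_2)

% Liveness: every request is eventually followed by a grant.
\mathbf{G}\,(\mathit{request} \rightarrow \mathbf{F}\,\mathit{grant})
```

Writing such formulae correctly requires familiarity with temporal operators, which is precisely the barrier our sequence-diagram interface removes.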
To solve this problem, we built a tool based on the Counter-Example Guided Abstraction
Refinement mechanism that generates LTL formulae when the property is specified in the form
of a sequence diagram. This is done iteratively to help the user specify the property
correctly while concurrently identifying problem areas in the system. That is, the initial
'draft' of the formula is fed to the model checker along with the system description. If the
property is violated, the model checker generates a trace: a sequence of states and
actions in which the system violates the property, either because of a bug or because the
property was incorrectly specified. To determine which, we convert the trace
back to a diagram which intuitively shows the possible sequence of messages that caused a
failure. In this manner, the user can iteratively improve the system and the specification.
|
Jan 2017 - May 2017
|
Detecting variability in multi-word expressions
As a research intern at the Computational
Linguistics Lab of Nara Institute of
Science and Technology, Japan, I worked on multi-word expressions (MWEs) with Professor Yuji Matsumoto. MWEs are made up
of two or more words but tend to act as a single lexical unit, such as by the way.
While the above expression is fixed, and always occurs exactly in one form, some MWEs may be
flexible and thus harder to deal with, such as under the circumstances occurring as
under the specific circumstances. My project was to
automatically detect flexible-type MWEs in English and compile them
into a dictionary to enable MWE-aware POS tagging.
Starting with a candidate list of over 2,600 MWEs, I implemented a rule-based system to
detect occurrences of each MWE and its possible variations in the LDC Gigaword corpus. The
initial
rules were as follows: allow up to two intervening words at all positions in the MWE (for
the MWE a number of, also look for a large number of and a very large
number of), allow interchangeable articles and pronouns (apple of his/her/their
eye), allow plural forms of nouns, and tense variations in verbs, etc. I then counted
the usage of the original MWE in comparison with its modified versions and imposed a
threshold to classify each as fixed or flexible, as well as to prune rules that didn't work
well.
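The first rule, allowing up to two intervening words, maps naturally onto a regular expression; this is a simplified sketch of that one rule, not the full rule system:

```python
import re

def mwe_pattern(mwe, max_gap=2):
    """Regex matching an MWE with up to `max_gap` intervening words
    between consecutive components."""
    gap = r"(?:\w+\s+){0,%d}" % max_gap
    parts = [re.escape(w) for w in mwe.split()]
    return re.compile(r"\b" + (r"\s+" + gap).join(parts) + r"\b",
                      re.IGNORECASE)

pat = mwe_pattern("a number of")
for text in ["a number of people", "a large number of people",
             "a very large number of people"]:
    print(bool(pat.search(text)))  # True for all three
```

Counting how often the gapped variants match relative to the exact form then gives the fixed-versus-flexible signal described above.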
|
Jun 2016 - Jul 2016
|
Sentiment analysis for foreign exchange trading
As a data science intern at Serendio Inc., I worked on developing a machine learning model
to gauge expert opinion on trends in currency exchange using sentiment analysis. The task
was to predict whether sentiment about a specific currency pair, such as USD/EUR, was
bullish (favors buying) or bearish (favors selling) based on posts on financial forums such
as Bloomberg, Moneycontrol, etc. To gather training data, we scraped posts from StockTwits,
a forum on which users post their opinion on stock market trends and tag them as bullish or
bearish. We then built an ensemble of multiple machine learning models to do sentiment
classification. Specifically, I implemented a Naive Bayes classifier that achieved an
accuracy of over 94% on the validation set.
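A self-contained sketch of such a classifier on toy StockTwits-style posts; the data, vocabulary, and smoothing choice here are illustrative, not the production model:

```python
import math
from collections import Counter, defaultdict

# Toy stand-in for StockTwits-style labeled posts.
train = [
    ("usd looking strong buy the dip", "bullish"),
    ("breakout incoming going long", "bullish"),
    ("eur weakness sell now", "bearish"),
    ("downtrend continues short this pair", "bearish"),
]

# Multinomial Naive Bayes with Laplace (add-one) smoothing.
class_counts = Counter(label for _, label in train)
word_counts = defaultdict(Counter)
vocab = set()
for text, label in train:
    for w in text.split():
        word_counts[label][w] += 1
        vocab.add(w)

def predict(text):
    scores = {}
    for label in class_counts:
        total = sum(word_counts[label].values())
        # Log prior plus smoothed log likelihood of each word.
        score = math.log(class_counts[label] / len(train))
        for w in text.split():
            score += math.log((word_counts[label][w] + 1) /
                              (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

print(predict("going long on this breakout"))  # bullish
```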
|
Jan 2016 - Feb 2016
|