Portfolio
NLP Projects
Given a set of restaurant reviews, identify and extract the words or targets that carry the user sentiment expressed in them. Domain knowledge improves the topic model's performance by setting prior probabilities for the keywords.
🆕 Content-Based Filtering: NLP Based Book Recommender Using BERT-Embeddings
I created a content-based book recommendation system that, given a book name, suggests similar books. The choice is based on concise information about the book, such as its theme, author, series, and a summary of its description. The keyword data supplied to the recommender is generated using NLP techniques such as word embeddings. Keywords that best describe the book are extracted from its description using BERT embeddings; this collection is then reduced using TF-IDF, a frequentist feature-extraction method that ranks words by their frequency in the book and in the corpus.
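The TF-IDF reduction step can be illustrated with a minimal sketch. It assumes whitespace tokenization and one common TF-IDF variant (term frequency times the log of inverse document frequency); the function name `tfidf_top_keywords` and the sample descriptions are hypothetical, and the BERT-embedding extraction that precedes this step in the project is not shown.

```python
import math
from collections import Counter

def tfidf_top_keywords(docs, doc_index, top_k=3):
    """Rank the words of one document by TF-IDF against the whole corpus."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    tokens = tokenized[doc_index]
    tf = Counter(tokens)
    scores = {
        word: (count / len(tokens)) * math.log(n_docs / df[word])
        for word, count in tf.items()
    }
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_k]]

# Hypothetical book descriptions, used only to exercise the function.
descriptions = [
    "a young wizard attends a school of magic",
    "a detective solves a murder in london",
    "a wizard and a dragon guard a mountain of gold",
]
keywords = tfidf_top_keywords(descriptions, 0, top_k=3)
print(keywords)
```

Words that appear in every description, such as "a", score zero and drop out, while words unique to one book rise to the top.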
Data Science Projects
Payment Default Prediction
Using a dataset containing a series of account statements, with pre-processed features representing a time frame, I developed a logistic regression model to predict the probability of each customer (account holder) defaulting on their next payment. The final working model is built into an AWS web application that accepts the account data and displays the statement ID and its default prediction.
Click-Stream Data Anomaly Detection
In this project I identify anomalous session activities based on the click-stream collected for each session. All session activities are recorded for each individual user (IP address) on a single web page. Using a stochastic process called a Markov model, implemented in R, I build an intuition for how categorical temporal data such as a click-stream changes over time at discrete time stamps.
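One way to picture the idea is a first-order Markov chain over page states. The sketch below is written in Python rather than the R used in the project. It assumes additive smoothing, an average log-likelihood score, and sessions with at least two clicks; the page names and function names are hypothetical.

```python
import math
from collections import defaultdict

def fit_transitions(sessions, smoothing=1.0):
    """Estimate first-order transition probabilities with additive smoothing."""
    states = sorted({page for s in sessions for page in s})
    counts = defaultdict(lambda: defaultdict(float))
    for s in sessions:
        for cur, nxt in zip(s, s[1:]):
            counts[cur][nxt] += 1.0
    probs = {}
    for cur in states:
        total = sum(counts[cur].values()) + smoothing * len(states)
        probs[cur] = {nxt: (counts[cur][nxt] + smoothing) / total for nxt in states}
    return probs

def avg_log_likelihood(session, probs):
    """Mean log-probability per transition; lower values mean a more unusual session."""
    steps = list(zip(session, session[1:]))
    return sum(math.log(probs[c][n]) for c, n in steps) / len(steps)

# Hypothetical click-streams used as the "normal" training data.
train = [
    ["home", "search", "product", "cart"],
    ["home", "search", "product", "cart"],
    ["home", "product", "cart"],
]
probs = fit_transitions(train)

typical = ["home", "search", "product", "cart"]
odd = ["cart", "home", "cart", "home"]
typical_score = avg_log_likelihood(typical, probs)
odd_score = avg_log_likelihood(odd, probs)
print(typical_score, odd_score)
```

A session whose score falls well below the scores of ordinary sessions can be flagged as anomalous; where to place that threshold is a separate modeling choice.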
Predictive Modeling For Real Estate Properties
Implementation and comparative analysis of various predictive models (such as Multiple Linear Regression, Support Vector Machine, Regression Trees, and Random Forest) to predict features such as property rent price and the cost of hosting extra people using real estate data.
Determine and model the performance metrics that influence the selection of a player, and suggest a set of players to aid the team selection process. We used the unsupervised machine learning technique k-Means clustering to categorize players and then ranked them within their assigned category to evaluate their chance of selection.
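The cluster-then-rank idea can be sketched as follows. This is a minimal illustration, assuming two made-up performance metrics per player, a deterministic initialization from the first k points, and a simple sum of metrics as the ranking score; none of these choices come from the original project.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means; centroids start at the first k points for determinism."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels

def rank_within_clusters(names, points, labels, score):
    """Group players by cluster label and sort each group by score, best first."""
    groups = {}
    for name, p, lab in zip(names, points, labels):
        groups.setdefault(lab, []).append((score(p), name))
    return {lab: [n for _, n in sorted(members, reverse=True)] for lab, members in groups.items()}

# Hypothetical players with two made-up performance metrics each.
names = ["P1", "P2", "P3", "P4", "P5", "P6"]
points = [(9.0, 1.0), (1.0, 9.0), (8.0, 1.5), (1.5, 8.0), (7.0, 1.5), (1.5, 7.0)]

labels = kmeans(points, k=2)
ranked = rank_within_clusters(names, points, labels, score=sum)
print(ranked)
```

Ranking inside each cluster, rather than across all players, compares each player only with others of a similar profile.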
Risk Factor Prediction For Cardiac Disease
Used the Fuzzy C-Means clustering algorithm on patients' lipid profile data to predict their likelihood of developing cardiac disease. Statistical analysis was carried out using R.
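Unlike k-means, Fuzzy C-Means gives every point a degree of membership in each cluster. The sketch below shows the standard membership and center updates on a single feature, in Python rather than R. The input values are made up for illustration and are not clinical thresholds; the fuzzifier `m=2.0`, the initial centers, and the function names are assumptions.

```python
def memberships_for(values, centers, m):
    """Membership of each value in each cluster; every row sums to 1."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for x in values:
        # A small epsilon avoids division by zero when x sits exactly on a center.
        dists = [abs(x - c) + 1e-12 for c in centers]
        rows.append([1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists])
    return rows

def fuzzy_c_means_1d(values, initial_centers, m=2.0, iterations=50):
    """Fuzzy C-Means on scalar data; returns (centers, memberships)."""
    centers = list(initial_centers)
    for _ in range(iterations):
        u = memberships_for(values, centers, m)
        # Each center becomes the mean of all values weighted by membership ** m.
        for i in range(len(centers)):
            weights = [row[i] ** m for row in u]
            centers[i] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    return centers, memberships_for(values, centers, m)

# Made-up measurements forming a lower and a higher group.
values = [150.0, 160.0, 170.0, 240.0, 250.0, 260.0]
centers, memberships = fuzzy_c_means_1d(values, initial_centers=[100.0, 300.0])
print(centers)
print(memberships[0], memberships[-1])
```

The soft memberships can be read as graded likelihoods of belonging to the higher-risk group, which is what makes the method a natural fit for risk scoring.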
Publication
Updating Singular Value Decomposition for Rank One Matrix Perturbation
Implemented an algorithm for updating the Singular Value Decomposition (SVD) of a rank-1 perturbed matrix using the Fast Multipole Method (FMM), with a running time that depends on the precision of computation.
© 2022 Amoli Rajgor. Powered by Jekyll and the Minimal Theme.