Amoli Rajgor


Resume | LinkedIn | ResearchGate | GitLab | GitHub

Data scientist with 3+ years of experience in designing, developing, and deploying data science projects. My abilities as a data scientist are rooted in a solid education in mathematics.

I began with a bachelor's degree in Information Technology. My keen interest in statistics and analytics led me to pursue a master's degree in Computer Science & Engineering with a specialization in Data Science and Analytics.

Portfolio


NLP Projects

🚧 🆕 Semisupervised Aspect Extraction: Demonstration of Topic Modelling Techniques on Restaurant Reviews

stability-wip Open Notebook

Given a set of restaurant reviews, identify and extract the words (aspect targets) that the user sentiment in each review refers to. Domain knowledge is used to improve the topic model's performance by setting prior probabilities for the keywords.




🆕 Content-Based Filtering: NLP Based Book Recommender Using BERT-Embeddings

Open Notebook View on GitHub

I created a content-based book recommendation system that, given a book title, suggests similar books. Recommendations are based on concise information about each book: its theme, author, series, and a summary of its description. The keywords fed to the recommender are generated with NLP techniques. The keywords that best describe a book are first extracted from its description using BERT embeddings; this set is then reduced with TF-IDF, a frequency-based feature extraction method that ranks words by their frequency in the book relative to the corpus.
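As a rough illustration of the TF-IDF ranking step only, here is a minimal pure-Python sketch. It is not the notebook's code: the BERT-embedding extraction is omitted, and the function names (`tfidf_scores`, `top_keywords`) and the three tiny "books" are hypothetical.

```python
import math
from collections import Counter

def tfidf_scores(docs):
    """Return one {term: tf-idf} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

def top_keywords(score, k):
    """Keep the k highest-scoring terms, breaking ties alphabetically."""
    ranked = sorted(score.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

# Hypothetical candidate keywords for three books (illustrative data only).
books = [
    "wizard school magic wizard friendship".split(),
    "detective murder london mystery detective".split(),
    "magic dragon kingdom quest friendship".split(),
]
scores = tfidf_scores(books)
keywords = [top_keywords(s, 2) for s in scores]
```

Terms shared across books, such as "magic", score low because their inverse document frequency is small, so each book keeps the words that set it apart from the rest of the corpus.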




Data Science Projects

Payment Default Prediction

Open Web App Open Notebook View on GitLab

Using a dataset of account statements, each with pre-processed features representing a time frame, I developed a logistic regression model to predict the probability of each customer (account holder) defaulting on their next payment. The final model is deployed as an AWS web application that accepts account data and displays the statement ID and its default prediction.
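For readers unfamiliar with the technique, here is a minimal sketch of logistic regression fitted by batch gradient descent. It is an assumption-laden illustration, not the project's model: the two features (utilisation ratio, missed payments), the data, and the hyperparameters are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, x):
    """Probability of default for feature vector x; w[0] is the bias."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        # Step against the mean gradient.
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

# Hypothetical features: [utilisation ratio, missed payments]; label 1 = default.
X = [[0.1, 0], [0.2, 0], [0.3, 1], [0.8, 2], [0.9, 3], [0.7, 2]]
y = [0, 0, 0, 1, 1, 1]
weights = fit_logistic(X, y)
p_low_risk = predict_proba(weights, [0.1, 0])
p_high_risk = predict_proba(weights, [0.9, 3])
```

The model outputs a probability rather than a hard label, so the decision threshold for flagging a statement can be chosen separately from training.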




Click-Stream Data Anomaly Detection

Read Blog View on GitLab

In this project I identify anomalous session activity from the click-stream collected for each session. All session activities are recorded per user (IP address) on a single web page. Using a Markov model, a stochastic process, implemented in R, I build an intuition for how categorical temporal data such as a click-stream changes over time at discrete time steps.
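The idea can be sketched as a first-order Markov chain: estimate transition probabilities from observed sessions, then score a new session by its average log-likelihood. The project itself was implemented in R; the Python below is only an illustrative sketch, and the page names, sessions, and add-one smoothing are assumptions.

```python
import math
from collections import Counter, defaultdict

def fit_transitions(sessions, smoothing=1.0):
    """Estimate first-order transition probabilities with additive smoothing."""
    states = sorted({s for session in sessions for s in session})
    counts = defaultdict(Counter)
    for session in sessions:
        for prev, nxt in zip(session, session[1:]):
            counts[prev][nxt] += 1
    probs = {}
    for prev in states:
        total = sum(counts[prev].values()) + smoothing * len(states)
        probs[prev] = {
            nxt: (counts[prev][nxt] + smoothing) / total for nxt in states
        }
    return probs

def avg_log_likelihood(session, probs):
    """Mean log-probability per transition; lower values are more unusual."""
    steps = list(zip(session, session[1:]))
    return sum(math.log(probs[a][b]) for a, b in steps) / len(steps)

# Hypothetical click-stream sessions (illustrative data only).
history = [
    ["home", "search", "product", "cart", "checkout"],
    ["home", "search", "product", "search", "product", "cart"],
    ["home", "product", "cart", "checkout"],
]
probs = fit_transitions(history)
typical = avg_log_likelihood(["home", "search", "product", "cart"], probs)
unusual = avg_log_likelihood(["checkout", "home", "checkout", "home"], probs)
```

Sessions whose score falls well below that of typical traffic are candidates for anomalies; where to place that cutoff is a separate modelling choice.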




Predictive Modeling For Real Estate Properties

Read Blog View on GitLab

Implementation and comparative analysis of several predictive models (Multiple Linear Regression, Support Vector Machine, Regression Trees, and Random Forest) for predicting targets such as property rent price and the cost of hosting extra people from real estate data.




Performance Based Recommendation Of IPL Playing XI Team

Read Blog

Determine and model the performance metrics that influence player selection, and suggest a set of players to aid the team selection process. We used k-means clustering, an unsupervised machine learning technique, to categorize players and then ranked them within their assigned category to evaluate their chance of selection.
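The cluster-then-rank idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: the player names, the two metrics (batting average, wickets), the deterministic seeding, and the ranking score (sum of metrics) are all hypothetical.

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical players: (batting average, wickets taken).
players = {
    "A": (45.0, 1.0),
    "B": (5.0, 20.0),
    "C": (38.0, 0.0),
    "D": (8.0, 15.0),
    "E": (50.0, 2.0),
}
names = list(players)
labels = kmeans([players[n] for n in names], k=2)

# Group players by cluster, then rank within each cluster by an assumed score.
ranked = {}
for name, label in zip(names, labels):
    ranked.setdefault(label, []).append(name)
for group in ranked.values():
    group.sort(key=lambda n: -sum(players[n]))
```

Here the clusters separate batting-heavy from bowling-heavy profiles, and ranking happens only among players of the same category.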




Risk Factor Prediction For Cardiac Disease

Applied the Fuzzy C-Means clustering algorithm to patients' lipid profile data to predict their likelihood of developing cardiac disease. Statistical analysis was carried out in R.
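Unlike k-means, Fuzzy C-Means gives each patient a degree of membership in every cluster. A minimal one-dimensional sketch of the standard update rules is shown below; the original analysis was done in R, and the cholesterol values, initial centers, and fuzzifier `m = 2` here are assumptions for illustration only.

```python
def fuzzy_c_means(values, centers, m=2.0, iters=50):
    """1-D fuzzy c-means. Returns (centers, memberships), where
    memberships[i][j] is the degree to which value i belongs to cluster j."""
    centers = list(centers)
    memberships = []
    for _ in range(iters):
        memberships = []
        for x in values:
            # Small epsilon avoids division by zero when x sits on a center.
            dists = [abs(x - c) + 1e-12 for c in centers]
            memberships.append([
                1.0 / sum((d_j / d_k) ** (2.0 / (m - 1.0)) for d_k in dists)
                for d_j in dists
            ])
        # Centers are membership-weighted means (weights raised to the power m).
        centers = [
            sum((u[j] ** m) * x for u, x in zip(memberships, values))
            / sum(u[j] ** m for u in memberships)
            for j in range(len(centers))
        ]
    return centers, memberships

# Hypothetical LDL cholesterol readings in mg/dL (illustrative data only).
ldl = [95.0, 100.0, 110.0, 160.0, 170.0, 185.0]
centers, memberships = fuzzy_c_means(ldl, centers=[100.0, 170.0])
```

Each row of `memberships` sums to one, so a borderline reading shows up as a split membership rather than being forced into a single group.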




Publication

Updating Singular Value Decomposition for Rank One Matrix Perturbation

Read Paper View on GitHub

Implemented an algorithm for updating the Singular Value Decomposition (SVD) of a rank-1 perturbed matrix using the Fast Multipole Method (FMM) in $O(n^2 \log(1/\epsilon))$ time, where $\epsilon$ is the precision of computation.



© 2022 Amoli Rajgor. Powered by Jekyll and the Minimal Theme.