

Implemented models in TensorFlow and optimized them to produce more readable text summaries
Used three model architectures (T5, BART, and PEGASUS) drawn from academic NLP papers
Validated the models using ROUGE scores and perplexity graphs
Achieved highly accurate, well-performing summaries using PEGASUS
Co-wrote a research paper and published it in the Drexel Research Journal
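
As a rough illustration of the ROUGE validation mentioned above, here is a minimal sketch of ROUGE-1 F1 (unigram overlap). It assumes simple whitespace tokenization and lowercasing; the function name `rouge1_f1` and the example sentences are hypothetical and are not taken from the project, which may have used a library implementation instead.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """ROUGE-1 F1: unigram overlap between a reference and a candidate summary."""
    # Assumption: whitespace tokenization and lowercasing are sufficient here.
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a word counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical example sentences, for illustration only.
reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
score = rouge1_f1(reference, candidate)
print(f"ROUGE-1 F1: {score:.3f}")
```

Five of the six unigrams match in each direction, so precision and recall are both 5/6 in this toy case.
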
Built models to predict, from the chosen parameters, whether a respondent will have a heart attack
Cleaned the dataset with Pandas and plotted graphs to identify the best parameters for the models
Applied a Min-Max scaler to normalize features with differing ranges in the dataset
Combined hyperparameter tuning with optimal thresholding to increase the true positive rate
Trained four models: KNN, Logistic Regression, Random Forest, and Decision Tree
Achieved 92% accuracy using Logistic Regression Classification
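
For reference, Min-Max scaling maps each feature to the range [0, 1] via x' = (x - min) / (max - min). The sketch below is a plain-Python illustration of that formula only; the helper name `min_max_scale` and the sample values are hypothetical, and the project itself presumably relied on a library scaler.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no range information; map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical feature column (not from the project's dataset).
ages = [29, 45, 61, 77]
scaled = min_max_scale(ages)
print(scaled)
```

Scaling this way keeps distance-based models such as KNN from being dominated by whichever feature happens to have the largest numeric range.
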
Completed the full project using Spark on Google Colab Pro for its NVIDIA GPU accelerator support
Created visualizations with Pandas to find the optimal parameters for the ML models
Implemented Youden’s J statistic to select better classification thresholds and increase model accuracy
Generated predictions using five models: logistic regression, naïve Bayes, random forest, decision tree, and linear SVC
Applied hyperparameter tuning via Spark ML to further optimize the model and reduce the false positive rate
Achieved 95% accuracy using Random Forest Classification
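
Youden's J statistic is defined as J = sensitivity + specificity - 1, which equals TPR - FPR; the threshold that maximizes J is a common choice of cut-off. The following is a minimal plain-Python sketch of that selection, not the project's Spark code. The function name `youden_j_threshold`, the labels, and the scores are hypothetical.

```python
def youden_j_threshold(y_true, y_score):
    """Return (threshold, J) maximizing J = TPR - FPR over the observed scores."""
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    best_threshold, best_j = None, float("-inf")
    for threshold in sorted(set(y_score)):
        # Predict positive when the score is at or above the threshold.
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
        j = tp / positives - fp / negatives
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Hypothetical labels and predicted probabilities, for illustration only.
labels = [0, 0, 0, 1, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
threshold, j = youden_j_threshold(labels, scores)
print(threshold, j)
```

In this toy data, a cut-off of 0.4 captures every positive while admitting one of the four negatives, giving J = 1.0 - 0.25 = 0.75.
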
Imported data from the IMDb datasets and selected the CSV files containing the needed data
Cleaned the selected datasets with Pandas and created visualizations with Matplotlib
Selected the most important parameters and built a new DataFrame for use in the ML programs
Created a pipeline and ran ML algorithms to recommend similar movies to the selected movie
Compared the similar-movie results with Netflix recommendations and judged them to be very accurate
Conducted web scraping using BeautifulSoup to retrieve daily-updated CSV files from the OpenDataPhilly website
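
The bullets above do not say which similarity method the recommender used. Purely as an assumption, the sketch below shows one common approach: rank the other movies by cosine similarity of their feature vectors. The function names, movie titles, and feature values are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(title, features, top_n=2):
    """Return the top_n titles whose vectors are closest to the given title's."""
    target = features[title]
    ranked = sorted(
        ((other, cosine_similarity(target, vec))
         for other, vec in features.items() if other != title),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [name for name, _ in ranked[:top_n]]


# Hypothetical feature vectors: [action, comedy, drama] weights.
movies = {
    "Movie A": [1.0, 0.0, 0.2],
    "Movie B": [0.9, 0.1, 0.3],
    "Movie C": [0.0, 1.0, 0.1],
    "Movie D": [0.1, 0.2, 1.0],
}
recommendations = most_similar("Movie A", movies)
print(recommendations)
```
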
Extracted text from news articles using BeautifulSoup and retrieved keywords from each article
Applied NLP sentiment analysis to identify the news articles that deal with crime in Philadelphia
Concatenated the two datasets to create a comprehensive database of gun violence in Philadelphia
Created heatmaps on the resulting dataset using the GeoPandas and Shapely libraries
Designed a Flask-powered interactive website that shows the current stock price and future price projections for a user-provided ticker symbol
Created a Python script that uses BeautifulSoup to scrape data from Yahoo Finance when the user enters a symbol
Implemented TensorFlow/Keras, SVM, and KNN (with grid search) models to predict future stock prices for the selected companies
Created a website using Flask that tracks the number of COVID cases in the US (updated daily)
Stored the data using PostgreSQL and connected the data to Flask using SQLAlchemy
Implemented bar charts and histograms using Leaflet to display COVID total cases, recoveries, and deaths per filter
Created an interactive bubble map using Leaflet to display COVID information by state
Used TensorFlow to build predictive models of the spread of COVID in the US over the course of 2021