Tejaswini CN - Portfolio

My Projects

Diabetes Prediction Using Machine Learning and Scalable Analytics

Developed a Streamlit application that lets users enter health-related parameters and predicts the likelihood of diabetes using a pre-trained machine learning model.

Python Streamlit Pandas EDA Numpy PySpark Gradient Boosting Random Forest

View Code

Reddit ETL Pipeline

Developed an automated Reddit ETL pipeline using Apache Airflow, AWS Glue, Redshift, and Python to extract and analyze vendor-related discussions from Reddit. The pipeline pulls data via the Reddit API, performs sentiment analysis and topic tagging, and stores the results in Redshift for reporting. I built dashboards in Tableau to help compliance and procurement teams monitor vendor perception in real time, reducing manual review by 80% and enabling faster, data-driven decisions. This project demonstrates my ability to integrate cloud services, orchestrate complex workflows, and turn unstructured data into actionable business insights.

Python Airflow AWS Glue AWS S3 Redshift Athena Tableau Reddit API PostgreSQL

View Code

Football Database Management System

A football database management system (DBMS) that stores, manages, and retrieves football-related data such as player details, team information, matches, and statistics. The system uses SQL (MySQL) for database creation and management, along with a Python-based command-line interface for interaction.

SQL HTML CSS Database Design Schema PostgreSQL Normalization Query Performance Tuning

View Code

Detection of Phishing Websites

This project detects phishing websites using multiple machine learning models. Algorithms including Logistic Regression, K-Nearest Neighbors (KNN), Decision Trees, Random Forest, and Support Vector Machines (SVM), among others, classify websites as phishing or legitimate based on several features.

R Programming Machine Learning Logistic Regression KNN SVM Random Forest Decision Tree Feature Extraction

View Code

Land Mines Detection

This project applies machine learning techniques to classify landmines using environmental and sensor data from the UCI Machine Learning Repository. Several classification models, including Logistic Regression, Decision Trees, Random Forest, K-Nearest Neighbors, Naive Bayes, and Support Vector Machines (SVM), were tested. The best-performing model, SVM with an RBF kernel, achieved an accuracy of 61.62%.

R Programming EDA Logistic Regression Decision Tree Random Forest KNN Naive Bayes SVM Grid Search Cross Validation

View Code

Hospital Length of Stay Prediction

This project is part of the Healthcare Analytics course, where we explore hospital inpatient data during the COVID-19 pandemic. The analysis includes preprocessing datasets, building predictive models to estimate hospital length of stay (HLOS), and conducting trials using both MLP and RNN models.

Python KNN Deep Learning Multilayer Perceptron Recurrent Neural Networks

View Code

British Airways Dashboard

Developed an interactive dashboard using Tableau for British Airways data.

Tableau EDA Interactive Dashboard

View Dashboard

Customer Analysis Dashboard

Developed an interactive dashboard using Tableau for customer sales data.

Tableau EDA Interactive Dashboard

View Dashboard

Netflix Dashboard

Developed an interactive dashboard using Tableau for Netflix data.

Tableau EDA Interactive Dashboard

View Dashboard

About Me

I am currently exploring the world of data science as I pursue my Master's in Engineering Science (Data Science) at the University at Buffalo (UB). With a background in software engineering, I’ve always been captivated by the intersection of technology and real-world problem-solving. My journey is fueled by a passion for using data to create meaningful change, whether it's driving equitable resource use or tackling societal challenges as an ELISS Fellow.

Beyond crunching numbers and building models, I’m a lifelong learner with a love for discovery, always on the lookout for new ways to expand my horizons. When I’m not immersed in code or analytics, you’ll probably find me on a fitness adventure or exploring new destinations. This portfolio is a glimpse into my world—where curiosity meets innovation!

Education

Master of Science in Data Science

University at Buffalo, Buffalo, NY, USA (January 2024 – May 2025)

GPA: 3.667/4.0

Coursework: Python, R, Machine Learning, Statistical Data Mining, Numerical Mathematics, Probability, Data Intensive Computing, Cloud Computing, Data Models and Query Language, Big Data Tools

Bachelor of Technology in Mechanical Engineering

Jawaharlal Nehru Technological University, Anantapur, Andhra Pradesh, India (2018 – 2022)

GPA: 8.0/10

Coursework: C/C++ Programming, Data Structures and Algorithms, Calculus, Probability, Statistical Mathematics, Operations Management

Experience

Junior Data Engineer, Cognizant Technology Solutions Corporation

Hyderabad, Telangana (Oct 2022 – Dec 2023)

Contributed to building data pipelines in Databricks to process healthcare data from AWS S3 and on-premise sources, supporting unified access for analytics teams.

Cleaned and transformed large datasets using PySpark and SQL, and implemented SCD Type 2 logic under guidance to track historical changes accurately.

Assisted in automating daily ingestion workflows using AWS Glue and Airflow, which helped reduce manual data refresh time by ~40%.

Developed Python and Excel-based scripts to validate key metrics and flag anomalies across staging and curated layers.

Applied row- and column-level filters to safeguard PII data in compliance with HIPAA guidelines, working closely with senior engineers and QA teams.

Programmer Analyst Trainee - Big Data Intern, Cognizant Technology Solutions Corporation

Hyderabad, India (Mar 2022 – Jul 2022)

Completed extensive training in Big Data and Data Engineering, covering key concepts like data wrangling, pipeline design, SQL querying, and cloud fundamentals.

Built and delivered a real-time team project involving integration and cleaning of large datasets, implementation of data quality checks, and creation of summary reports for analysis.

Gained hands-on experience in SQL and Python by working on data preprocessing tasks and performing exploratory analysis to identify patterns and inconsistencies.

Collaborated closely with teammates to evaluate basic machine learning models and contribute to project documentation, ensuring clear communication and structured delivery.

Demonstrated commitment to continuous learning by actively participating in use case-based learning modules and completing all assigned project milestones on time.

Hello, I'm Tejaswini

Skills

Certifications

Databricks Apache Spark Developer Associate

AWS Certified Solutions Architect Associate

AWS Certified Cloud Practitioner

Google Data Analytics Professional Certificate

Get In Touch

Contact Information

Phone

Email

Location