Portfolio

Python coding projects

This project’s goal is to prepare a prototype of a machine learning model for Zyfra. The company develops efficiency solutions for heavy industry. The model should predict the amount of gold recovered from gold ore. You have the data on extraction and purification. The model will help to optimize the production and eliminate unprofitable parameters.

Oil well drilling intelligence - ML Project

You work for the OilyGiant mining company. Your task is to find the best place for a new well. Steps to choose the location:

Collect the oil well parameters in the selected region: oil quality and volume of reserves;
Build a model for predicting the volume of reserves in the new wells;
Pick the oil wells with the highest estimated values;
Pick the region with the highest total profit for the selected oil wells.

You have data on oil samples from three regions. Parameters of each oil well in the region are already known. Build a model that will help to pick the region with the highest profit margin. Analyze potential profit and risks using the bootstrapping technique.
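The bootstrap step for a single region can be sketched as follows. This is a minimal standard-library illustration, not the project's actual code: the function name `bootstrap_profit` and the parameters `n_select`, `revenue_per_unit`, and `budget` are hypothetical, and real values would come from the project brief.

```python
import random


def bootstrap_profit(predicted, actual, n_select, revenue_per_unit, budget,
                     n_samples=1000, sample_size=None, seed=0):
    """Estimate mean profit, a ~95% interval, and loss risk for one region.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    indices = range(len(predicted))
    sample_size = sample_size or len(predicted)
    profits = []
    for _ in range(n_samples):
        # Resample wells with replacement.
        sample = rng.choices(indices, k=sample_size)
        # Pick the wells with the highest predicted reserves...
        top = sorted(sample, key=lambda i: predicted[i], reverse=True)[:n_select]
        # ...but compute revenue from their actual reserves.
        revenue = sum(actual[i] for i in top) * revenue_per_unit
        profits.append(revenue - budget)
    profits.sort()
    mean_profit = sum(profits) / n_samples
    # Approximate 2.5% and 97.5% quantiles of the bootstrap distribution.
    lower = profits[int(0.025 * n_samples)]
    upper = profits[int(0.975 * n_samples) - 1]
    # Share of bootstrap samples that end in a loss.
    loss_risk = sum(p < 0 for p in profits) / n_samples
    return mean_profit, (lower, upper), loss_risk
```

Running this once per region gives a mean profit and a loss risk for each, which is one way to compare the regions before choosing among them.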

Bank client analytics - ML Project

Beta Bank customers are leaving: little by little, chipping away every month. The bankers figured out it’s cheaper to save the existing customers rather than to attract new ones. We need to predict whether a customer will leave the bank soon. You have the data on clients’ past behavior and termination of contracts with the bank. Build a model with the maximum possible F1 score. To pass the project, you need an F1 score of at least 0.59. Check the F1 for the test set. Additionally, measure the AUC-ROC metric and compare it with the F1.
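To make the two metrics concrete, here is a small standard-library sketch of how F1 and AUC-ROC can be computed by hand. The helper names `f1` and `auc_roc` are illustrative; in practice a library such as scikit-learn (`f1_score`, `roc_auc_score`) would normally be used instead.

```python
def f1(y_true, y_pred):
    """F1 score for binary labels, where 1 is the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc_roc(y_true, scores):
    """AUC-ROC as the probability that a random positive outranks a random negative.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

F1 depends on the hard predictions and therefore on the chosen decision threshold, while AUC-ROC depends only on how the scores rank the customers. That is why the two can disagree and are worth comparing.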

Telecom analytics - ML Project

Mobile carrier Megaline has found out that many of their subscribers use legacy plans. They want to develop a model that would analyze subscribers’ behavior and recommend one of Megaline’s newer plans: Smart or Ultra. You have access to behavior data about subscribers who have already switched to the new plans (from the project for the Statistical Data Analysis course). For this classification task, you need to develop a model that will pick the right plan. Since you’ve already performed the data preprocessing step, you can move straight to creating the model. Develop a model with the highest possible accuracy. In this project, the threshold for accuracy is 0.75. Check the accuracy using the test dataset.

Ride share app EDA

Exploratory data analysis for a ride share app. Tasks included testing the hypothesis: “The average duration of rides from the Loop to O’Hare International Airport changes on rainy Saturdays.”
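A hypothesis like this compares mean ride duration between two groups (rainy Saturdays versus other Saturdays). The sketch below uses a two-sided permutation test, chosen here only because it needs nothing beyond the standard library; the original analysis may well have used a t-test instead. The function name and its parameters are illustrative.

```python
import random
from statistics import mean


def permutation_test(a, b, n_permutations=10000, seed=0):
    """Two-sided permutation test for a difference in means; returns a p-value."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the pooled durations and split them into two pseudo-groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value above zero.
    return (extreme + 1) / (n_permutations + 1)
```

A p-value below the chosen significance level (commonly 0.05) would count as evidence that the average duration differs on rainy Saturdays.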

Borrowers’ Risk of Defaulting

This project is to prepare a report for a bank’s loan division. We need to find out if a customer’s marital status and number of children have an impact on whether they will default on a loan. The bank already has some data on customers’ creditworthiness.
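One simple way to look at such a question is to compare default rates across groups. The sketch below assumes each customer is a dictionary with a group field and a 0/1 default flag. The field names `family_status` and `defaulted` are hypothetical and not taken from the bank's data.

```python
from collections import defaultdict


def default_rate_by_group(records, group_key, default_key="defaulted"):
    """Return the share of defaulting customers for each value of group_key."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for row in records:
        group = row[group_key]
        totals[group] += 1
        if row[default_key]:
            defaults[group] += 1
    return {group: defaults[group] / totals[group] for group in totals}


# Illustrative records only; the values are made up for the example.
customers = [
    {"family_status": "married", "defaulted": 0},
    {"family_status": "married", "defaulted": 1},
    {"family_status": "single", "defaulted": 0},
    {"family_status": "single", "defaulted": 0},
]
rates = default_rate_by_group(customers, "family_status")
```

The same function can be called with a number-of-children field as `group_key` to cover the second factor.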

Telecom users analytics

The task is to analyze data for telecom operator Megaline. The company offers its clients two prepaid plans, Surf and Ultimate. The commercial department wants to know which of the plans is more profitable in order to adjust the advertising budget. The task is to carry out a preliminary analysis of the plans based on a relatively small client selection. We have the data on 500 Megaline clients: who the clients are, where they’re from, which plan they use, and the number of calls they made and text messages they sent in 2018. The job is to analyze clients’ behavior and determine which prepaid plan is more profitable.

Video game success patterns

Analysis of the video game market to find patterns that determine a video game’s success.

Apartment advertisement insights

Uses data from a real estate agency. It is an archive of sales ads for realty in St. Petersburg, Russia, and the surrounding areas collected over the past few years. I determine the market value of real estate properties and define the parameters that increase or decrease the cost. This will make it possible to build an automated system that is capable of detecting anomalies and fraudulent activity.

Page template forked from evanca