A motivated, adaptable, and diligent professional with more than 4 years of experience in the IT industry. A Big Data enthusiast certified in Data Science, with 2+ years of experience in data warehouse technical architectures and ETL (with technologies such as Hadoop, Spark, and Python).
Data Ingestion project for a Mining Company
(05/2020 - Present)
Migrating 10 years of historical data from SQL to Hive using PySpark.
Extracting ongoing data from replication tools, building OGG tools to replicate the data, and loading it to the EDH layer using Kafka streaming and HBase.
Ensuring the timeliness of real-time data captures and handling deployments and bug fixes (if any).
Nielsen, a leading global information & measurement company based in the US (2018 - 2020)
Media Reporting - Data Analytics
Migration of a reporting tool from Java code to Spark using Scala. The company has two divisions: Watch and Buy. I was part of the Media Viewing team.
Proposed and completed spec optimization for the Spark migration, which reduced development hours and increased efficiency.
Apache Spark—Real-Time Project—Marketing Analysis (2017)
A Portuguese banking institution ran a marketing campaign to convince potential customers to invest in bank term deposits. The marketing campaigns were based on phone calls. Often, the same customer was contacted more than once by phone to assess whether they would subscribe to the bank term deposit.
Loading data and creating Spark DataFrames; calculating the marketing success rate, failure rate, and quality of customers; performing feature engineering on customer age and its impact on the campaign using real-time data. Creating data pipelines.
