ABOUT ME


My name is Gustavo Cunha.

I am a self-disciplined, resilient, and proactive person.
I have a military academy background (former Brazilian Navy officer) and a very curious and active mind.
Some time ago, I fell in love with data science and, since then, I've been focusing my energy and time on projects to solve business challenges using data science concepts and tools.

Currently, I work as a data scientist at Nubank, as an AI mentor at the "MIT Applied Data Science Program: Leveraging AI for Effective Decision-Making" and as an AI mentor at Social Good Brasil. I am also an AWS certified machine learning specialist and AWS certified cloud practitioner.

Besides, I am committed to developing my skills in areas closely related to applied data science such as product management, entrepreneurship, business and startups.

PROFESSIONAL TIMELINE

  • 2011: I ranked first in the Brazilian Naval Academy admission process.
  • From 2012 until 2017: I joined the Brazilian Naval Academy as an aspirant in 2012 and left the Brazilian Navy as a second lieutenant in 2017. As a naval officer, I had the opportunity to serve aboard Brazilian naval vessels on missions to many Brazilian states and 16 countries (in South America, Africa, Europe, North America and Central America).
  • From 2017 until 2020: I decided to leave the Navy and took a break, investing my time and energy in a personal project.
  • Mid-2021: I passed the academic and physical exams of the Brazilian Federal Police and the Brazilian Federal Highway Police admission processes (both of them with more than 300,000 candidates).
  • June 2021: I started my studies in computer science and data science to become a data scientist.
  • September 2021: I started to work as a freelance teacher in a Brazilian public exam preparation school.
  • February 2022: among more than 8 thousand candidates and only 9 months after starting to code, I ranked 36th in the Petrobras data scientist admission process. Petrobras is a state-owned Brazilian multinational corporation in the petroleum industry.
  • March 2022: I started doing internal projects for Comunidade DS as a data scientist.
  • April 2022: I did my first volunteer work in data science (as a data miner at Scoutfy).
  • April 2022: I became a teaching assistant at Le Wagon data science bootcamps.
  • June 2022: I became a teacher at Le Wagon data science bootcamps.
  • July 2022: I started working as a data scientist at A3Data (a Brazilian data intelligence consultancy company).
  • November 2022: I became an AWS Certified Cloud Practitioner.
  • January 2023: I became an AWS Certified Machine Learning Specialist.
  • January 2023: I was the Machine Learning leader of a volunteer Sri Lanka Omdena project where ML was applied to predict and screen autism in children.
  • June 2023: I was the Machine Learning leader of a volunteer São Paulo Omdena project where ML was applied to predict subway passenger demand in São Paulo city.
  • July 2023: I started working as a data scientist on the Artificial Intelligence team of Hotmart (an international tech company, leader in digital products and focused on the creator economy).
  • August 2023: I became a teacher at Comunidade DS.
  • November 2023: I became an AI mentor at Social Good Brasil (a non-profit organization, partner of the United Nations Foundation, dedicated to data literacy for society).
  • February 2024: I became an AI mentor at Technovation Girl Florianópolis (an initiative that, through business and technology mentoring, empowers girls aged 10 to 18 from public schools to solve real-world problems by building mobile apps).
  • February 2024: I founded an AI consultancy, L.Ai.ght, so as to share my data science experience and help local small and mid-size companies bridge the business world with AI solutions.
  • April 2024: I was a tech mentor at Startup Weekend Florianópolis (a 54-hour weekend immersion where people from different areas, such as business, technology and creativity, come together with a common goal: validating a business idea and turning it into a potential startup).
  • July 2024: I taught an AI workshop at Prototipando a Quebrada (a social project focused on technological education for underprivileged youth in the peripheral regions of Florianópolis).
  • August 2024: I presented “Demystifying Artificial Intelligence: A Practical Guide for Business Leaders” as a speaker at Hacktown (the most innovative festival in Latin America which features over 800 simultaneous activities on topics like technology, people, music, entrepreneurship, and the arts).
  • September 2024: I presented “Demystifying Artificial Intelligence: Practical Foundations for Proposing Realistic AI Solutions” as a speaker at AWS User Group Florianópolis (a community of AWS service users). The recording is available on YouTube.
  • September 2024: I was a tech mentor at Startup Weekend Florianópolis for the second time.
  • November 2024: I started working as a data scientist at Nubank (one of the leading technology companies in the world as well as one of the world’s largest digital banking platforms, serving more than 100 million customers across Brazil, Mexico, and Colombia).
  • November 2024: In collaboration with Great Learning and the prestigious Massachusetts Institute of Technology (MIT), I started working as a data science and AI mentor at the "MIT Applied Data Science Program: Leveraging AI for Effective Decision-Making".
  • January 2025: I was one of the speakers at the "Data-Driven Opinion Panel" during the Python Floripa meeting, where I shared insights on navigating the data industry, discussed career pathways, and fostered professional development (Python Floripa is the open Python community in Florianópolis, SC, Brazil).
  • April 2025: I talked with graduating students in a Q&A session about careers in tech and data at the Faculdade Municipal de Palhoça (Faculdade Municipal de Palhoça is the public university of Palhoça, SC - Brazil).

EDUCATION

  • PM3 Product Analytics Bootcamp (Apr 2024): I attended this bootcamp to improve my ability to connect business problems with analytics solutions so as to deliver better results and business impact as a data professional.
  • PM3 Product Management Bootcamp (Mar 2024): I attended this bootcamp to increase my business understanding and improve my translation of data science results into business results.
  • Causal Inference and Personalization (Mar 2023): a hands-on, five-project course that explores a variety of causal inference techniques, one technique per project, to help optimize the discounting strategy of an e-commerce business. The certification also required an online test and an interview with the project mentor.
  • Le Wagon Data Science Bootcamp (Jan 2022 - Mar 2022): it teaches all the skills needed for a Data Scientist to solve real-world problems in 9 intensive weeks.
  • Comunidade DS (Aug 2021 - Dec 2021): it is a community focused on teaching data science by doing data science projects to solve real companies' problems.
  • Brazilian Naval Academy (Jan 2012 - Dec 2015): bachelor's degree in Naval Sciences. The Brazilian Naval Academy is the oldest higher education institution in Brazil.

TECHNICAL SKILLS

Data Extraction, Storage and Processing

  • SQL, Postgres, MySQL, SQLite
  • ElasticSearch
  • MongoDB
  • Python
  • Spark (PySpark API)

Statistics and Data Visualization

  • Descriptive statistics, cohort analysis, inferential statistics, causal inference, AB testing and survival analysis.
  • Matplotlib, Seaborn and Plotly
  • Streamlit, Metabase and PowerBI

Artificial Intelligence

  • Data cleaning, feature engineering, data preparation, dimensionality reduction, addressing class imbalance, feature selection and model tuning
  • Machine Learning, Deep Learning and LLMs for classification, regression, clustering, time series, NLP and multi-agent systems
  • Performance metrics to evaluate artificial intelligence algorithms as well as model explainability to understand model predictions

Development Tools and Deployment

  • Git, Github and Gitlab
  • Linux and macOS
  • Continuous Integration and Continuous Deployment
  • Flask API and FastAPI
  • Docker, MLFlow, Airflow and Telegram bot
  • AWS, Google Cloud Platform and Streamlit Cloud

CERTIFICATIONS

SOFT SKILLS

  • Problem Solving
  • Communication
  • Initiative and Proactivity
  • Resilience
  • Analytical Thinking and Critical Thinking
  • Leadership
  • Self-motivation
  • Adaptability
  • Teamwork
  • Lifelong learning

PROBLEM SOLVING MINDSET

Problem Solving Checkpoints

November 2022

After reading many books, attending many courses and doing a bunch of data science projects, I felt the need to define how I should move from real-world problems to real-world solutions in a structured way.

So, the purpose of this brief material is to share my initial summary of how to structure a problem-solving strategy. I emphasize that it is just my initial MVP on this subject. In other words, it is not supposed to be a definitive solution, nor to replace any already tested framework!

I'm sharing this compilation so anyone interested in this topic can learn or remember something relevant to solve some real problem: if this happens somehow, I would be delighted!

DATA ENGINEERING PROJECTS

Synthetic Data Ingestion

July 2022

The idea is to create synthetic data regarding customer behaviour for two groups of customers: control and treatment. We would generate this behaviour with statistical distributions (e.g. Poisson and Gamma distributions) and would ingest both the created customer behaviour and the statistical distribution params in the data engineering architecture. The data would flow throughout the architecture, e.g. data ingestion layer, a bronze layer, a silver layer, etc. As the output, we would have the data regarding the customer behaviour and its statistical distribution blueprint.

Then, we could use A/B testing tools to check if there is a statistically significant difference between the control and the treatment groups. However, since we know the original distribution of both groups, we know whether they are different or not, so we will be able to check if the A/B tests give us the correct result or not (especially regarding type I and type II errors).
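The check described above can be sketched in a few lines. This is only a minimal illustration, not the project's actual pipeline: it assumes Gamma-distributed customer spend, a large-sample two-sided z-test as the A/B testing tool, and made-up parameters (`shape`, `scale`, `n`, `runs`, `alpha`). Because the generating distributions are known, the fraction of rejections estimates the type I error rate when both groups share the same distribution, and the power (1 minus the type II error rate) when they differ.

```python
import random
from statistics import NormalDist, fmean, variance


def z_test_p_value(a, b):
    """Two-sided p-value of a large-sample z-test for a difference in means."""
    standard_error = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (fmean(a) - fmean(b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def rejection_rate(shape_control, shape_treatment, scale=10.0,
                   n=200, runs=500, alpha=0.05, seed=0):
    """Fraction of simulated experiments in which the test rejects 'no difference'."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(runs):
        # Synthetic customer behaviour drawn from known Gamma distributions.
        control = [rng.gammavariate(shape_control, scale) for _ in range(n)]
        treatment = [rng.gammavariate(shape_treatment, scale) for _ in range(n)]
        if z_test_p_value(control, treatment) < alpha:
            rejections += 1
    return rejections / runs


# Same distribution in both groups: every rejection is a type I error.
type_1_rate = rejection_rate(2.0, 2.0)
# Different distributions: every non-rejection is a type II error.
power = rejection_rate(2.0, 2.6)
type_2_rate = 1 - power
print(f"type I rate ~ {type_1_rate:.3f}, type II rate ~ {type_2_rate:.3f}")
```

With a well-behaved test, the estimated type I rate should sit close to `alpha`, which is exactly the property the synthetic ground truth lets us verify.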

Tools:

  • SQL and Postgres.
  • Python.
  • Docker and Docker-compose.
  • Git, Github, Continuous Integration and Linux.
  • FastAPI and Airflow.
  • AWS: EC2, RDS, Lambda, DynamoDB and S3.

DATA SCIENCE PROJECTS

Bottomline

March 2022

We all live in a society that produces an overwhelming amount of information daily. Information per se is valuable but it's often very challenging to spotlight the essential part of it - the bottomline, so to say. This mental-filtering process can be very time consuming and also confusing sometimes.

With our technical solution, we provide an automated service that identifies the text's most relevant sentences so as to summarize the text. Additionally, the service provides the general sentiment (positive, neutral or negative) of the text. In other words, the final product will give the user a general idea about the text content as well as its most prominent sentiment.
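To make "identifying the most relevant sentences" concrete, here is a deliberately simple sketch of extractive summarization. It is not the project's model: the project used Machine Learning and Deep Learning models, whereas this illustration swaps in a plain word-frequency score, and the function name, the length-based stop-word filter and the sample text are all assumptions made for the example.

```python
import re
from collections import Counter


def summarize(text, n_sentences=2):
    """Return the n highest-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    # Crude stop-word filter (an assumption): ignore words of 3 letters or fewer.
    freq = Counter(w for w in words if len(w) > 3)

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:n_sentences])]


sample = ("Data science turns data into decisions. I like tea. "
          "Good data science needs clean data.")
summary = summarize(sample, n_sentences=2)
print(summary)
```

A sentiment label could be attached in the same spirit, but that step depends on a trained model and is left out of this sketch.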

Tools:

  • Python.
  • Git, Github, Gitlab and Linux.
  • Machine Learning and Deep Learning models for text summarization and sentiment analysis.
  • FastAPI.
  • Docker.
  • Google Cloud Platform and Heroku Cloud.
  • MongoDB.
  • Streamlit.

Fraud Detection

December 2021

Blocker Fraud Company is a company specialized in the detection of fraud in financial transactions made through mobile devices.
The company is expanding in Brazil and, to find new customers more quickly, it has adopted a very aggressive strategy.
The strategy works as follows:

  • The company will receive 25% of each transaction value that was correctly detected as fraud;
  • The company will receive 5% of each transaction value that was detected as fraud despite being legitimate;
  • The company will return 100% of each transaction value that was detected as legitimate despite being a fraud.
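The three rules above translate directly into a revenue function. The sketch below is only an illustration of that arithmetic: the tuple layout, the function name and the sample transactions are hypothetical, and it assumes that legitimate transactions correctly left unflagged generate no revenue, since the rules do not mention them.

```python
def company_revenue(transactions):
    """Sum revenue over (amount, is_fraud, flagged_as_fraud) tuples."""
    total = 0.0
    for amount, is_fraud, flagged in transactions:
        if flagged and is_fraud:
            total += 0.25 * amount  # fraud correctly detected: receive 25%
        elif flagged and not is_fraud:
            total += 0.05 * amount  # legitimate but flagged as fraud: receive 5%
        elif is_fraud:
            total -= 1.00 * amount  # fraud missed: return 100%
        # Legitimate and not flagged: assumed to generate no revenue.
    return total


# Hypothetical transactions, one per possible outcome.
example = [
    (1000.0, True, True),    # true positive
    (200.0, False, True),    # false positive
    (400.0, True, False),    # false negative
    (50.0, False, False),    # true negative
]
revenue = company_revenue(example)
print(revenue)
```

Note how a single missed fraud can outweigh several correct detections, which is why recall on the fraud class matters so much under this pricing strategy.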

The final solution includes a Power BI reporting dashboard with answers to business questions as well as a Docker container with an API implementation, made with FastAPI and PySpark, and a MongoDB database where API requests are saved for future analyses. The estimated profit using this solution is BRL 230,133,584.05.

Tools:

  • MongoDB.
  • Spark.
  • Git, Github, Gitlab and Linux.
  • Classification machine learning algorithms.
  • FastAPI.
  • Docker.
  • Power BI.

Insiders Project

November 2021

The All in One Place company is a multi-brand outlet company that sells second-line products of several brands at a lower price through e-commerce.
Within just one year of operation, the marketing team realized that some customers buy more expensive products with high frequency and contribute to a significant portion of the company's revenue.
This project aims to determine which customers are eligible to participate in the Insiders program. Once this list is ready, the Marketing team will carry out a sequence of personalized and exclusive actions for this group of people to increase their sales and purchase frequency.

The final solution answers business questions, validates business hypotheses, creates a Metabase reporting dashboard and implements a solution architecture in the AWS cloud.

Tools:

  • SQL, SQLite & MySQL.
  • Python.
  • Git, Github, Gitlab and Linux.
  • Clustering machine learning algorithms.
  • Airflow.
  • AWS and Streamlit Cloud.
  • Metabase.

Sales prediction

October 2021

Rossmann is a company that operates over 3,000 drug stores in 7 European countries. Its product range includes up to 21,700 items and can vary depending on the size of the shop and the location.
Rossmann store managers need daily sales predictions for up to six weeks in advance so as to plan infrastructure investments in their stores (will the next six weeks' sales be high enough to balance infrastructure investment?).

The final solution for this problem is a Telegram bot where the user just needs to type the number of the store and the bot will quickly reply with the sales prediction for that store over the next six weeks.
Besides, if the final user wants more detailed information about this six-week prediction, they can get further details on a Streamlit data app, with an interactive chart of the sales prediction over these six weeks.
Furthermore, on this data app, the user can also read the entire project overview to understand further how this prediction is made.

Tools:

  • Python.
  • Git, Github, Gitlab and Linux.
  • Forecasting with regression machine learning algorithms.
  • Flask API.
  • Heroku and Telegram bot.
  • Streamlit.

Health Insurance Cross-Sell

September 2021

Insurance All is a health insurance company and its products team is analyzing the possibility of offering a new product, automobile insurance, for its health insurance clients.
Similar to its health insurance, customers of this new insurance plan would have to pay an annual plan to be insured by Insurance All in case of an eventual car accident or damage.

In this project, I developed a Machine Learning algorithm that increases the number of interested customers contacted by 1,316 and 2,259 for 20,000 and 40,000 sales team contacts, respectively, so that the estimated revenue increases are US$ 131,600 and US$ 225,900.
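The figures above come from ranking customers by predicted interest and contacting the top of the list instead of a random selection. The sketch below illustrates that comparison on toy, made-up data; the function name, scores and labels are hypothetical. The revenue per interested customer is not stated explicitly in the project summary, but the reported numbers imply US$ 100 per customer (131,600 / 1,316 and 225,900 / 2,259).

```python
def extra_interested(scores, interested, k):
    """Interested customers in the top-k by score, minus the random-contact expectation."""
    ranked = sorted(zip(scores, interested), key=lambda pair: pair[0], reverse=True)
    reached_by_model = sum(flag for _, flag in ranked[:k])
    expected_random = k * sum(interested) / len(interested)
    return reached_by_model - expected_random


# Toy data (made up): model scores and whether each customer is truly interested.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
interested = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
gain = extra_interested(scores, interested, k=5)

# Revenue per interested customer implied by the reported figures.
revenue_per_customer = 131_600 / 1_316
print(gain, revenue_per_customer, gain * revenue_per_customer)
```

The same calculation, applied to the real ranked list with k = 20,000 or k = 40,000, is what turns a ranking metric into an estimated revenue increase.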

Tools:

  • SQL, Postgres.
  • Python.
  • Git, Github, Gitlab and Linux.
  • Learning-to-rank machine learning algorithms.
  • Flask API.
  • Heroku and Google Sheets.

CONTACT

Feel free to contact me in case of questions about my projects, data science opportunities and any other reason you think is relevant ;)