José Pedro Pinto
Background

Greetings!
I'm José Pedro Pinto

I'm a 25-year-old with a master's degree in data science,
currently working at INESC TEC as a machine learning researcher

Meet me!



Education and job experience CV

Bachelor's

In 2019 I completed my bachelor's degree in Computer Science at FCUP (Faculdade de Ciências da Universidade do Porto). The course had a strong focus on theoretical mathematical concepts and statistics, as well as general computer hardware and software knowledge. I completed it with a final average of 16 (out of 20).

Master's

In 2021 I completed my master's degree in Data Science at FCUP (Faculdade de Ciências da Universidade do Porto). The course placed a strong focus on machine learning, cloud computing and statistics. I finished with a final average of 18 (out of 20). My dissertation tackled a practical problem, the identification of the most important power transformer gases for failure detection, and arrived at a solution that significantly reduces operational costs. For its contributions to the field, the dissertation was awarded a grade of 19 (out of 20).

HDP summer internship

In the summer of 2019 I completed my first internship at HDP (Health Data Pioneers), a German company based in Munich. The internship focused on front-end web development for a drag-and-drop form creation tool. I used JavaScript (ES6), React and Docker. The MVP was finished shortly after I started and was followed by iterative improvements in both the number and the quality of features.

INESC TEC summer internship

In the summer of 2020 I completed my second internship at INESC TEC, a Portuguese company. The goal of the internship was to use deep learning models, together with hyperparameter tuning and transfer learning, to classify LiDAR point clouds. This goal was accomplished with very good results, especially considering the limitations of both the dataset and the available computational resources.

Efacec transformer 4.0 project

Since 2020 I have been part of Transformer 4.0, a multidisciplinary project led by Efacec that aims to bring the power transformer landscape into the new age of the IoT, continuous online monitoring and machine learning applications. As the project's machine learning specialist, I developed a subset selection system, created a machine-learning-based synthetic data generation system and deployed a cluster computing machine learning infrastructure using RabbitMQ, Kafka, MongoDB, MariaDB, PostgreSQL and PySpark. The results of these tasks were shared in multiple scientific papers and technical reports.

PROJECTS

During my studies, internships and free time, I have developed a large number of projects of varying size and content.
Below I present some of the ones I consider most interesting.
Unfortunately, due to the proprietary nature of the code and data, I am unable to show content from some projects.

Masters in Data Science Thesis

Background

This thesis was developed under the research grant provided by the Efacec Transformer 4.0 project. It develops a solution to a novel problem in the field: identifying the most important Dissolved Gas Analysis (DGA) gases for fault prediction in power transformers.

Synthetic Data Generator

Background

An important part of my involvement in the Transformer 4.0 project was the development of a machine-learning-driven tabular data generator. This generator works by randomly sampling rows and columns of a provided dataset and then applying data imputation techniques to fill in the missing values.
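
The project's code is proprietary, so the following is only a minimal sketch of the general idea described above, not the actual generator. It assumes that "sampling rows and columns" means resampling rows and treating randomly chosen cells as missing, and it uses plain column-mean imputation as a simple stand-in for the imputation techniques. The function name `generate_synthetic_rows` and the parameters `mask_fraction` and `seed` are hypothetical.

```python
import random
from statistics import mean


def generate_synthetic_rows(data, n_rows, mask_fraction=0.3, seed=0):
    """Illustrative sketch: resample rows, blank random cells, impute them.

    `data` is a list of equal-length rows of floats. Column-mean imputation
    is an assumption made for brevity, not the method used in the project.
    """
    rng = random.Random(seed)
    n_cols = len(data[0])
    # Column means of the original data, used as the imputed values.
    col_means = [mean(row[j] for row in data) for j in range(n_cols)]

    synthetic = []
    for _ in range(n_rows):
        # 1. Sample a source row at random (with replacement).
        row = list(rng.choice(data))
        # 2. Randomly decide which cells of the copy count as missing.
        for j in range(n_cols):
            if rng.random() < mask_fraction:
                # 3. Fill the "missing" cell with the imputed value.
                row[j] = col_means[j]
        synthetic.append(row)
    return synthetic


if __name__ == "__main__":
    original = [[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]]
    for new_row in generate_synthetic_rows(original, n_rows=4):
        print(new_row)
```

A model-based imputer (for example a nearest-neighbour or regression imputer) would produce more varied values than the column mean; the mean is used here only to keep the sketch dependency-free.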

DGA Paper

Background

While working on my master's thesis, I wrote a paper on the same topic. This paper, titled "Optimal Gas Subset Selection for Dissolved Gas Analysis in Power Transformers", is essentially a more compact version of the thesis and was submitted to the International Journal of Health Prognostics (IJPM).

Synthetic Data Generator Paper

Background

As with other projects during my work at INESC TEC, a paper on the synthetic data generator was written. The paper, which showcases the framework, innovations such as imputation-based generation and the proposed API, is currently under consideration for publication.

Fault Diagnosis Paper

Background

A smaller part of my work at INESC TEC was the study of different power transformer problems and existing solutions. This resulted in a literature review of power transformer diagnosis methodologies, published in "Engineering Applications of Artificial Intelligence".

BIM point clouds Paper

Background

During my 2020 summer internship I worked with deep learning for computer vision on the task of point cloud classification. Later I contributed to a paper entitled "Exploiting BIM objects for synthetic data generation towards indoor point classification using deep learning", detailing my approach to this problem and the results obtained.

Farfetch brand recommender system

Background

This was the last project of the Advanced Topics in Data Science course, in which anonymized Farfetch data was used to develop a brand recommendation system. Association rules, collaborative filtering and an open-ended, student-selected approach (for which I chose LSTMs) were used for this task.

Time Series prediction with LSTMs

Background

This was the first project of DDDM, where we had to produce predictions for an unknown, difficult time series. We could use any method we wanted, as only the final predictions counted towards the grade. I used LSTMs and obtained the best results in the class.

Optimization problem using AMPL

Background

This was the third project of DDDM, where we had to solve a difficult optimization problem within a restricted amount of computational time.
In this facility location problem my solution surpassed all others in the class (including the professor's), being the only one to reach the optimal value.

Connect 4 web app

Background

This was the final project of the TW course in my bachelor's degree.
It involved the creation of a full-stack Connect 4 web game, with base functionality, AI, high scores, local scores and more.

Python pygame bundle

Background

This was one of the first projects I worked on.
With the intent of learning some programming, getting some visual feedback and also having a bit of fun, I created a compilation of small games and applications in Python using Pygame.

Want to get in touch?

More Information

For more information, check my GitHub page, send me an email or give me a call.