Deep learning for vision and applications

Goals

The objective of this training is to learn about the latest Deep Learning architectures applied to computer vision, specifically pose detection, object detection, segmentation, depth extraction and 3D model generation, with an approach that combines theory and practice.

Program

1. Introduction to computer vision

What is an image?
Multispectral imaging
Data sources
Practical examples

2. Rasterized data

Express data in matrix format
Graphs
Practical examples

3. Classical computer vision algorithms

Filters
Feature extraction
Statistics
Advantages and disadvantages
Practical examples

4. Satellite images

Geolocated images
State-of-the-art architectures
Practical examples

5. Advanced deep learning architectures

State of the art of Deep Learning in vision (Review of popular architectures, latest trends)
Models based on Transformers (Vision Transformer)
Practical examples

6. Pose detection

Bottom-up and top-down approaches
2D pose detection
3D pose detection

7. Relevant state-of-the-art frameworks

Object detection
Segmentation (Segment Anything Model, SAM)
Depth and normal extraction
Transfer learning and fine-tuning of pre-trained models
Practical examples

8. 3D Reconstruction

State of the art in 3D reconstruction
Practical examples

Who is this course aimed at?

It is aimed at professionals with an intermediate to advanced level in computer vision and Deep Learning.

Requirements:

Machine Learning Fundamentals: Understand the concepts and paradigms of supervised and unsupervised learning, including the difference between them and how they are applied to vision problems.
Fundamentals of Deep Learning applied to vision: Be familiar with the basic concepts of deep neural networks and their application to computer vision problems, and know the fundamentals of Convolutional Neural Networks (CNNs) and the most widely used architectures.
Image processing: Knowledge of image processing, including concepts such as convolution, stride, padding, normalization, color spaces and the application of filters.
Python and Deep Learning Libraries: Basic knowledge of Python programming and familiarity with popular Deep Learning libraries such as PyTorch.
Gmail account
Laptop

Modality

In person

Faculty

Gorka Labarta – Big Data and Cognitive Systems Team of the Technological Institute of Aragón.
Rafael del Hoyo – Big Data and Cognitive Systems Team of the Technological Institute of Aragón.
Francisco Lacueva – Big Data and Cognitive Systems Team of the Technological Institute of Aragón.
David Abadía – Big Data and Cognitive Systems Team of the Technological Institute of Aragón.
Carlos Marañes – Big Data and Cognitive Systems Team of the Technological Institute of Aragón.

Time, date and place

Total duration: 12 hours
Dates: October 15, 16, 22 and 23, 2024
Hours: 4 to 7 p.m.
Place: Technological Institute of Aragón. C/ María de Luna, 7 (white building), 50018 Zaragoza
Maximum number of attendees: 15 people