Deep learning for vision and applications

Goals

The objective of this training is to learn about the latest Deep Learning architectures applied to computer vision, specifically in the areas of pose detection, object detection, segmentation, depth extraction, and 3D model generation, combining theory with practice.

Program

1. Advanced Deep Learning Architectures (1.5 hours)

  • State of the art of Deep Learning in vision (Review of popular architectures, latest trends)
  • Models based on Transformers (Vision Transformer)
  • Practical examples

2. Pose detection 

  • Bottom-up and top-down approaches
  • 2D pose detection
  • 3D pose detection
  • Practical examples

3. Relevant state-of-the-art frameworks

  • Object detection
  • Segmentation (Segment Anything Model, SAM)
  • Depth and normal extraction
  • Transfer learning and fine-tuning of pre-trained models
  • Practical examples

4. 3D reconstruction 

  • State of the art in 3D reconstruction with NeRF (Neural Radiance Fields)
  • Practical examples

Who is this course aimed at?

This course is aimed at professionals and academics with an intermediate to advanced level in computer vision and Deep Learning.

Requirements:

  • Machine Learning Fundamentals: Understand the concepts and paradigms of supervised and unsupervised learning, including the difference between them and how they are applied to vision problems.
  • Fundamentals of Deep Learning applied to vision: Be familiar with the basic concepts of deep neural networks and their application to computer vision problems, and know the fundamentals of Convolutional Neural Networks (CNNs) and the most widely used architectures.
  • Image processing: Knowledge of image processing, including concepts such as convolution, stride, padding, normalization, color spaces, and filtering.
  • Python and Deep Learning Libraries: Basic knowledge of Python programming and familiarity with popular Deep Learning libraries such as PyTorch.
  • Gmail account
  • Own computer

Faculty

  • David Abadía – Big Data and Cognitive Systems Team of the Technological Institute of Aragón.
  • Carlos Marañés – Big Data and Cognitive Systems Team of the Technological Institute of Aragón.
  • Gorka Labarta – Big Data and Cognitive Systems Team of the Technological Institute of Aragón.
  • Rafael del Hoyo – Big Data and Cognitive Systems Team of the Technological Institute of Aragón.

Time, date and place

  • Total duration: 12 h
  • Dates: 15, 15, 22, 27 October 2024
  • Hours: 4 to 7 p.m.
  • Place: Technological Institute of Aragón. C/ María de Luna, 7 (white building), 50018 Zaragoza
  • Maximum number of attendees: 15 people

Registration