Explore The Unknown

My research focuses on 3D representation learning, video generation, self-supervised learning, few-shot learning, and domain adaptation.

AI Research
2024 • Technical Blog

Recursive Temporal-Consistent Content Generation on Latent Variables via Alpha Diffusion Framework: Integrating Global and Local Contextual Modeling for 30-Second Sequences

Temporal-consistent Alpha blend diffusion Recursive generation

This paper introduces a novel framework for recursively generating temporally consistent content sequences of 30-second duration using an Alpha Diffusion architecture. By integrating global and local contextual modeling, our approach ensures coherence across temporal scales while maintaining high fidelity in content generation. The global context captures overarching structural patterns, while the local context refines fine-grained details, enabling seamless transitions and long-term consistency...

AI Research
2023 • Technical Blog

Diffusion-Based One-Shot Video Generation via Pose-Guided Temporal Consistency and Spatial Alignment

Temporal-consistent One-shot video generation Pose-guided

We propose a novel framework that integrates human pose information as a guiding mechanism for the text-to-video (T2V) synthesis process. Our model leverages pose estimation and motion priors to ensure anatomically plausible and contextually relevant human movements, bridging the gap between textual semantics and visual dynamics...

AI Research
2022 • Technical Blog

Metacast

Sports Volumetric Capture Real-time 3D

Unity Metacast is a platform that uses volumetric technology, which encompasses the process of capturing, viewing, and interacting with the real world in 3D, from moving people to static objects. This content can then be viewed from any angle, at any moment in time, giving audiences the ability to see every bead of sweat, blow, takedown, and submission as if they were going toe-to-toe on the famous Octagon canvas themselves.

AI Research
2022 • Technical Blog

Metacast: 3D Human Pose Estimation

3D Human Pose Estimation Point Cloud Multi-person

We propose a novel point cloud-based framework for robust multi-person 3D pose estimation. Our method integrates three key innovations...

AI Research
2022 • Technical Blog

Metacast: Multi-View 3D Human Reconstruction

3D Human Reconstruction Multi-view Multi-person

This project designs and implements a robust deep learning framework for dynamic 3D scene reconstruction that leverages multi-view camera systems to model complex human interactions in densely populated environments.

AI Research
2021 • European Society for Medical Oncology

A deep radiomics approach to assess PD-L1 expression and clinical outcomes in patients with advanced non-small cell lung cancer treated with immune checkpoint inhibitors: A multicentric study

Feature Analysis Deep radiomics Transfer learning

Recent studies suggest that imaging-based signatures can be a promising non-invasive approach to predicting response to immune checkpoint inhibitors (ICI). In this multicentric study, we explored the use of radiomics models to characterize the relationship between PD-L1 expression, imaging features, and clinical outcomes...

AI Research
2019 • 22nd International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI)

Task Adaptive Metric Space for Medium-Shot Medical Image Classification

Meta Learning Medium-shot Metric Adaptation

In the medical domain, one challenge of deep learning is to build sample-efficient models from a small amount of labeled data. In recent years, meta-learning has become an important approach to few-shot image classification...

AI Research
2019 • Journal of Gastroenterology

Artificial Intelligence for Real-Time Multiple Polyp Detection with Identification, Tracking, and Optical Biopsy During Colonoscopy

Detection Tracking Real-time

Artificial Intelligence (AI) has shown great promise for colon polyp detection, but high-level clinical performance remains elusive. Rather than focusing simply on improving AI detection, we explored a novel approach to identify and “track” polyps in real time.

AI Research
2019 • US Patent, Application US16/535,547

Apparatus and method for detecting, classifying and tracking road users on frames of video data

Classification Detection Tracking

An apparatus and a method for detecting, classifying and tracking road users on frames of video data are provided. An object detection unit identifies an object in a frame of the video data...

AI Research
2018 • Montreal AI Symposium (MAIS)

A Single Framework for Domain Adaptation and Generalization in Medical Image Analysis

Domain Adaptation MRI Meta Learning

Machine learning models perform best when tested on target (test) domains that are similar to the source (train) domains they were trained on. However, model generalization can be hindered when there is significant covariate shift between the target and source domains. In this work, we employ a meta-learning framework to generalize across different imaging modalities, viz. FLAIR, T1w, T1gd, and T2w, to segment lesions in brain volumes by learning domain-agnostic features.

AI Research
2018 • Master Thesis • McGill University

Deep-learning-based Multiple Object Tracking in Traffic Surveillance Video

Optical Flow Tracking Kalman Filter

Multiple object tracking (MOT) is an important topic in computer vision. One of its important applications is traffic surveillance, where it is used to examine potential risks at traffic intersections and to analyze road usage. In this thesis, we propose a powerful and efficient model for solving MOT problems in traffic surveillance environments...

Quantum Computing
2016 • 13th Conference on Computer and Robot Vision (CRV)

Generation of Spatial-temporal Panoramas with a Single Moving Camera

image stitching panorama VR

The development of image stitching techniques, which take multiple images and stitch them together to make natural-looking panoramas, is an integral part of the new wave in visual media: 360-degree surround displays such as the Oculus Rift. However...