Papers - Data Phoenix

YOLO-World: Real-Time Open-Vocabulary Object Detection

YOLO-World extends YOLO with open-vocabulary detection through vision-language modeling and pre-training on large-scale datasets. It detects objects zero-shot with high efficiency, and the authors report that it outperforms many state-of-the-art methods in both accuracy and speed.

Feb 12, 2024

by Sophia

OMG-Seg: Is One Model Good Enough For All Segmentation?

OMG-Seg is One Model that is Good enough to handle all segmentation tasks efficiently and effectively, including image semantic, instance, and panoptic segmentation, their video counterparts, open-vocabulary settings, and prompt-driven interactive segmentation.

Feb 08, 2024

by Sophia

InstantID: Zero-shot Identity-Preserving Generation in Seconds

InstantID, powered by diffusion models, offers plug-and-play image personalization in various styles using one facial image, ensuring high fidelity. It demonstrates remarkable efficiency and performance, making it highly beneficial for applications requiring identity preservation.

Feb 05, 2024

by Sophia

Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors

QA4RE is a framework that aligns relation extraction (RE) with question answering (QA). It enables LLMs to outperform strong zero-shot baselines by a large margin. This work illustrates a promising way of adapting LLMs to challenging tasks by aligning them with more common instruction-tuning tasks like QA.

May 28, 2023

by Sophia

Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold

This paper introduces DragGAN, a new way of controlling GANs in which users interactively "drag" any points of an image to reach target points. It deforms an image with precise control over where pixels go, which makes it possible to manipulate pose, shape, expression, and more.

May 26, 2023

by Sophia

Segment Anything Model

SAM is a promptable segmentation system with zero-shot generalization to unfamiliar objects and images, without the need for additional training. The model was trained on Meta AI’s SA-1B dataset for 3 to 5 days on 256 A100 GPUs. Give it a try!

May 17, 2023

by Sophia

BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects

BundleSDF is a near real-time method for 6-DoF tracking of an unknown object from a monocular RGBD video sequence, while performing neural 3D reconstruction of the object. The method significantly outperforms existing approaches.

May 14, 2023

by Sophia

Neural Preset for Color Style Transfer

Neural Preset is a technique that uses AI to generate and transfer color styles. It can extract color styles from given reference images, store them as presets, and apply them to other images and videos, producing output with target color styles. Check it out!

May 12, 2023

by Sophia

BloombergGPT: A Large Language Model for Finance

BloombergGPT is a 50-billion-parameter language model trained on a wide range of financial data. It is validated on standard LLM benchmarks, open financial benchmarks, and a suite of internal benchmarks that most accurately reflect the authors' intended usage.

Apr 25, 2023

by Sophia

S-NeRF: Neural Radiance Fields for Street Views

In this paper, the authors propose a new street-view NeRF (S-NeRF) that jointly handles novel view synthesis of both large-scale background scenes and moving foreground vehicles. Learn more about their approach and experimental results!

Mar 27, 2023

by Sophia

Universal Guidance for Diffusion Models

Typical diffusion models cannot be conditioned on other modalities without retraining. This work presents a universal guidance algorithm that enables diffusion models to be controlled by arbitrary guidance modalities without the need to retrain any use-specific components.

Mar 17, 2023

by Sophia

3D Generation on ImageNet

In this paper, the authors develop a 3D generator with Generic Priors (3DGP), a 3D synthesis framework that makes more general assumptions about the training data, and show that it scales to challenging datasets like ImageNet. It is based on three new ideas. Read the paper to learn about them!

Mar 07, 2023

by Sophia