Skills

85,427

Reusable AI skills and capabilities for agent workflows

Showing packages with source repositories & descriptions only

Stable Diffusion Akashic Records | Maks-s/sd-akashic

A compendium of information regarding Stable Diffusion (SD)

Skill: Inbox: Stable Diffusion

1 dir

rinongal/textual_inversion

Official code, data, and sample inversions for the Textual Inversion paper

Skill: Hypertechniques

1 dir

deforum-art/sd-webui-deforum

Deforum extension for AUTOMATIC1111's Stable Diffusion webui (wiki docs: https://github.com/deforum-art/sd-webui-deforum/wiki)

Skill: Hypertechniques

1 dir

Sanster/lama-cleaner

Image inpainting tool powered by state-of-the-art AI models

Skill: Creative Uses of Generative AI Image Synthesis Tools

1 dir

AgaMiko/pixel_character_generator

Generating retro pixel game characters with Generative Adversarial Networks. Dataset "TinyHero" included.

Skill: Creative Uses of Generative AI Image Synthesis Tools

1 dir

TencentARC/GFPGAN

GFPGAN aims to develop practical algorithms for real-world face restoration

Skill: Image Restoration

1 dir

AILab-CVC/VideoCrafter

Open Diffusion Models for High-Quality Video Generation

Skill: Video and Animation

1 dir

THUDM/CogVideo

Text-to-video generation

Skill: Video and Animation

1 dir

baowenbo/DAIN

Depth-Aware Video Frame Interpolation (CVPR 2019)

Skill: Video and Animation

1 dir

lucidrains/musiclm-pytorch

Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in PyTorch

Skill: Audio and Music

1 dir

MubertAI/Mubert-Text-to-Music

A simple notebook demonstrating prompt-based music generation via the Mubert API

Skill: Audio and Music

1 dir

magenta/magenta

Magenta's official GitHub repository

Skill: Audio and Music

1 dir

p0n1/epub_to_audiobook

EPUB to audiobook converter, optimized for Audiobookshelf

Skill: Text-to-speech (TTS) and avatars

1 dir

Shaunwei/RealChar

AI Character/Companion in Real Time

Skill: Text-to-speech (TTS) and avatars

1 dir

KangweiiLiu/Awesome_Audio-driven_Talking-Face-Generation

A curated list of resources on audio-driven talking face generation

Skill: Text-to-speech (TTS) and avatars

1 dir

neonbjb/tortoise-tts

A multi-voice TTS system trained with an emphasis on quality

Skill: Text-to-speech (TTS) and avatars

1 dir

ggerganov/whisper.cpp

Port of OpenAI's Whisper model in C/C++. It can be executed locally.

Skill: Speech-to-text (STT) and spoken content analysis

1 dir

shashikg/WhisperS2T

An Optimized Speech-to-Text Pipeline for the Whisper Model

Skill: Speech-to-text (STT) and spoken content analysis

1 dir

Vaibhavs10/insanely-fast-whisper

Accelerates transcription by combining OpenAI's Whisper Large v2, HF Transformers, Optimum, and flash attention

Skill: Speech-to-text (STT) and spoken content analysis

1 dir

facebookresearch/seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Skill: Speech-to-text (STT) and spoken content analysis

1 dir

BradyFU/Awesome-Multimodal-Large-Language-Models

Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.

Skill: Multimodal

1 dir

roboflow/awesome-openai-vision-api-experiments

Examples showing how to use the OpenAI vision API to run inference on images, video files and webcam streams

Skill: Multimodal

1 dir

facebookresearch/ImageBind

ImageBind: One Embedding Space to Bind Them All

Skill: Multimodal Embedding Space

1 dir

gabolsgabs/DALI

a large Dataset of synchronised Audio, LyrIcs and vocal notes

Skill: Datasets

1 dir