Artificial Intelligence
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
https://github.com/Vision-CAIR/MiniGPT-4
Stable Diffusion is a text-to-image AI model. It is trained on millions of image and text-description pairs found on the internet. Because it has seen so many, the model learns which text descriptions are associated with which images. As a result, if you put in a prompt like “A Photo Read more…
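Here is a minimal sketch of the idea in code, using the Hugging Face diffusers library (the model ID and prompt are illustrative assumptions, not taken from the post):

```python
# Minimal text-to-image sketch with Hugging Face diffusers.
# Model ID and prompt are illustrative, not from the post.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# The prompt is the text description; the model returns a matching image.
image = pipe("a photo of an astronaut riding a horse").images[0]
image.save("output.png")
```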
CFG Scale The Classifier-Free Guidance scale is a parameter that controls how much the model should respect your prompt.

1 – Mostly ignore your prompt.
3 – Be more creative.
7 – A good balance between following the prompt and freedom.
15 – Adhere more to the prompt.
30 – Strictly follow the prompt.

Below are Read more…
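As a hedged illustration, the CFG scale corresponds to the guidance_scale argument in the diffusers library (the values below mirror the scale described in the excerpt; the model ID and prompt are assumptions):

```python
# CFG scale maps to `guidance_scale` in diffusers.
# Setup is repeated here so the sketch is self-contained.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "a castle on a hill at sunset"
creative = pipe(prompt, guidance_scale=3).images[0]   # more creative
balanced = pipe(prompt, guidance_scale=7).images[0]   # balanced
strict = pipe(prompt, guidance_scale=15).images[0]    # adhere more to the prompt
```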
Two heads? Extra fingers? Here’s a guide to fixing these common problems in Stable Diffusion image generation. Two-head problems If you browse AI image sites, it’s not unusual to see Stable Diffusion images with two heads joined together. This is usually caused by using a portrait image size. Any Read more…
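As a hedged sketch of the size effect: SD v1 models are trained at 512×512, so very tall portrait sizes tend to duplicate the subject (the model ID, prompt, and sizes below are illustrative assumptions):

```python
# Tall portrait sizes far from SD v1's 512x512 training resolution
# often duplicate the subject; keeping dimensions closer to 512 helps.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "portrait photo of a woman, studio lighting"
risky = pipe(prompt, width=512, height=1024).images[0]  # prone to two heads
safer = pipe(prompt, width=512, height=768).images[0]   # closer to training size
```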
Building good prompts is the first step every Stable Diffusion user tackles. Anatomy of a good prompt A good prompt needs to be detailed and specific. A good process is to look through a list of keyword categories and decide whether you want to use any of them. The keyword Read more…
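As a hedged sketch of that keyword-category process (the categories and keywords here are illustrative assumptions, not the post’s list):

```python
# Pick a keyword per category, then join them into one detailed,
# specific prompt. Categories and keywords are illustrative only.
categories = {
    "subject": "a young woman reading a book",
    "medium": "digital painting",
    "style": "art nouveau",
    "lighting": "soft window light",
    "detail": "highly detailed, sharp focus",
}
prompt = ", ".join(categories.values())
print(prompt)
# a young woman reading a book, digital painting, art nouveau, ...
```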
LoRA models are small Stable Diffusion models that apply tiny changes to standard checkpoint models. They are usually 10 to 100 times smaller than checkpoint models, which makes them very attractive to people with extensive collections of models. What are LoRA models? LoRA (Low-Rank Adaptation) is a training technique Read more…
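A rough sketch of why LoRA files are so small (layer sizes are illustrative; this shows the low-rank idea, not the actual Stable Diffusion implementation):

```python
# LoRA stores a weight delta as two low-rank factors B (d x r) and
# A (r x d) with r << d, instead of a full d x d matrix.
import torch

d, r = 768, 8                      # layer width vs. LoRA rank
W = torch.randn(d, d)              # frozen checkpoint weight
A = torch.randn(r, d) * 0.01       # trainable low-rank factors
B = torch.zeros(d, r)              # zero-init so training starts from W
alpha = 1.0                        # merge strength

W_adapted = W + alpha * (B @ A)    # tiny change applied to the base weight

full_params = d * d                # 589,824 values for a full delta
lora_params = d * r + r * d        # 12,288 values -> 48x smaller here
print(full_params / lora_params)
```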
Hypernetwork is a fine-tuning technique initially developed by Novel AI, an early adopter of Stable Diffusion. It is a small neural network attached to a Stable Diffusion model to modify its style. Where is the small hypernetwork inserted? It is, of course, inserted into the most critical part of the Stable Diffusion model: Read more…
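A hedged sketch of the shape of the idea, assuming the NovelAI-style placement where two small networks transform the key and value inputs of cross-attention (sizes and architecture are illustrative assumptions):

```python
# Two tiny residual MLPs that transform the keys and values fed into
# a cross-attention layer. An untrained module (f(x) ~ 0) leaves the
# base model unchanged. Sizes are illustrative assumptions.
import torch
import torch.nn as nn

class HypernetworkModule(nn.Module):
    def __init__(self, dim: int, hidden: int = 320):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.net(x)  # residual: output = x + f(x)

dim = 768
hyper_k, hyper_v = HypernetworkModule(dim), HypernetworkModule(dim)

context = torch.randn(1, 77, dim)          # text-encoder output
k, v = hyper_k(context), hyper_v(context)  # modified keys and values
```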
Preface With ChatGPT so popular right now, there is always more to say about it, so let me write up some background on its predecessors.

GPT overview GPT (Generative Pre-trained Transformer) is a family of natural-language-processing models developed by OpenAI. These models deliver outstanding results on question answering, text summarization, and similar tasks, and, even more impressively, they come out of unsupervised learning. On many tasks, GPT models achieve better comprehension and execution than the best supervised models of their time, without any fine-tuning on labeled examples. Let’s walk through the three generations of GPT:

GPT-1 Improving Language Understanding by Generative Pre-training
GPT-2 Language Models are Unsupervised Multitask Learners
GPT-3 Language Models are Few-Shot Learners

This assumes everyone is familiar with NLP terminology and the Transformer architecture; if not, you can read up on them on your own. We will go through the papers one by one and work out the basic goals and concepts, the training datasets, the model architecture and its applications, and the results and evaluation, which should be about enough. First, an all-in-one comparison chart to enjoy.

GPT-1 Before GPT-1, most SOTA NLP models were trained with supervision on specific tasks, such as sentiment classification and textual entailment. Generally speaking, supervision carries two inherent drawbacks:

It requires large amounts of labeled data to learn a specific task, and the labeling process is slow and expensive.
Training for one specific task also means the model cannot transfer or extend to other task settings.

This paper proposed a different idea: learn a generative model from unlabeled data, then fine-tune it for downstream tasks, such as classification or sentiment analysis. Unsupervised learning served as the pre-training objective for supervised fine-tuned models, hence the name Generative Pre-training.

GPT-1 learning objectives and concepts The objective of the unsupervised language model (pre-training) is as follows: $L_1(\mathcal{U}) = \sum_i \log P$ Read more…
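For reference, the full form of this truncated objective, as given in the GPT-1 paper (not quoted from this post), is the standard left-to-right language-modeling likelihood, where k is the context window size and Θ are the model parameters:

```latex
% Pre-training objective from the GPT-1 paper (Radford et al., 2018):
% maximize the log-likelihood of each token given its k preceding tokens.
L_1(\mathcal{U}) = \sum_i \log P\left(u_i \mid u_{i-k}, \dots, u_{i-1}; \Theta\right)
```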
Stable Diffusion is a deep-learning model. We will dig deep into how Stable Diffusion works. Stable Diffusion is a text-to-image model: give it a text prompt, and it will return an image matching the text. Diffusion model Stable Diffusion belongs to a class of deep-learning models called diffusion models. They are generative models, Read more…
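A hedged sketch of the forward (noising) half of a diffusion model, which the network learns to reverse (the schedule values are common illustrative choices, not from the post):

```python
# Forward diffusion: noise an image x0 over T steps using a linear
# beta schedule; x_t can be sampled in closed form from x0.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # noise schedule (illustrative)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative signal retention

def q_sample(x0: torch.Tensor, t: int) -> torch.Tensor:
    """Sample x_t ~ q(x_t | x_0) in closed form."""
    noise = torch.randn_like(x0)
    a = alphas_bar[t]
    return a.sqrt() * x0 + (1.0 - a).sqrt() * noise

x0 = torch.rand(3, 64, 64)   # stand-in "image"
x_mid = q_sample(x0, 500)    # partially noised
x_end = q_sample(x0, 999)    # nearly pure noise
```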
Many sampling methods are available in AUTOMATIC1111: Euler a, Heun, DDIM… What are samplers? How do they work? What are the differences between them? Which one should you use? You will find the answers in this article. We will discuss the samplers available in the AUTOMATIC1111 Stable Diffusion GUI. What is Sampling? To Read more…
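As a hedged sketch, here is what a single step of the plain Euler sampler looks like in the k-diffusion formulation AUTOMATIC1111 builds on (the denoise stand-in and sigma values are illustrative assumptions):

```python
# One Euler step: estimate the clean image, take the derivative
# dx/dsigma, and move x toward the next (lower) noise level.
import torch

def euler_step(x, sigma, sigma_next, denoise):
    denoised = denoise(x, sigma)        # model's estimate of the clean image
    d = (x - denoised) / sigma          # derivative dx/dsigma
    return x + d * (sigma_next - sigma)

# Toy run with a stand-in "model" that predicts an all-zero image.
sigmas = [14.6, 9.0, 5.5, 3.0, 1.5, 0.5, 0.0]
x = torch.randn(4, 64, 64) * sigmas[0]
for s, s_next in zip(sigmas[:-1], sigmas[1:]):
    x = euler_step(x, s, s_next, lambda xt, sig: torch.zeros_like(xt))
```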