From b3a8a53f383649a4060776346e601814c78df9d4 Mon Sep 17 00:00:00 2001
From: Victor Hall
Date: Sun, 9 Jun 2024 11:51:03 -0400
Subject: [PATCH] update docs

---
 doc/ADVANCED_TWEAKING.md |  2 +-
 doc/CAPTION_COG.md       | 12 ++++++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/doc/ADVANCED_TWEAKING.md b/doc/ADVANCED_TWEAKING.md
index 0aa1768..ea3103e 100644
--- a/doc/ADVANCED_TWEAKING.md
+++ b/doc/ADVANCED_TWEAKING.md
@@ -320,7 +320,7 @@ While the calculation makes sense in how it compensates for inteval and total tr
 
 If you use `ema_strength_target` the actual calculated `ema_decay_rate` used will be printed in your logs, and you should pay attention to this value and use it to inform your future decisions on EMA tuning.
 
-[Experimental results](https://discord.com/channels/1026983422431862825/1150790432897388556) for EMA on Discord.
+[Experimental results](https://discord.com/channels/1026983422431862825/1150790432897388556) for general use of EMA on Discord.
 
 ## AdaCoor optimizer
 
diff --git a/doc/CAPTION_COG.md b/doc/CAPTION_COG.md
index f44d233..2b1a877 100644
--- a/doc/CAPTION_COG.md
+++ b/doc/CAPTION_COG.md
@@ -1,3 +1,15 @@
+# Synthetic Captioning
+
+Script now works with the following:
+
+    --model "THUDM/cogvlm-chat-hf"
+
+    --model "THUDM/cogvlm2-llama3-chat-19B"
+
+    --model "xtuner/llava-llama-3-8b-v1_1-transformers"
+
+    --model "THUDM/glm-4v-9b"
+
 # CogVLM captioning
 
 CogVLM ([code](https://github.com/THUDM/CogVLM)) ([model](https://huggingface.co/THUDM/cogvlm-chat-hf)) is, so far (Q1 2024), the best model for automatically generating captions.
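The doc change above lists four Hugging Face model IDs now accepted by the captioning script's `--model` flag. As a rough sketch of how one of them might be passed on the command line: the script name `caption_cog.py` and the `--image_dir` flag are assumptions (only `--model` and the model IDs come from the patch), so treat this as illustrative, not as the project's documented invocation.

```shell
# Pick any one of the four model IDs listed in the patch.
MODEL="THUDM/cogvlm2-llama3-chat-19B"

# Hypothetical invocation: script name and --image_dir are assumptions,
# only --model and its value are taken from the patched docs.
CMD="python caption_cog.py --model \"$MODEL\" --image_dir ./input"

# Print the command instead of running it, since the script is not part of this patch.
echo "$CMD"
```

Swapping `MODEL` for `"THUDM/glm-4v-9b"` or either of the other two listed IDs would follow the same pattern.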