
Japanese Storywriting AI
AI Novelist 2.0

by Bit192

Discord / Wiki


Official Discord Server (Bit192 Official)
This is the general official server for Bit192 works and services.
It hosts support, chat, and work-sharing channels.

AI Novelist Wiki
An unofficial wiki that anyone can join and edit.
It collects detailed usage knowledge and troubleshooting know-how.



In case of trouble


☆ For user-curated Q&A on usage and troubleshooting, see the Q&A page on the unofficial wiki.

☆ For Q&A regarding accounts and billing, please refer to Frequently Asked Questions about Accounts.


Instruction manual



▼ General Tips


Amount of text AI can recognize

The AI can recognize up to 4,400 characters with a free account, 6,400 characters with a Voyager membership, and up to about 23,000 characters with a Platinum membership. The more context you provide, the more likely the AI is to generate text on the theme you want.

AI is always influenced by writing style

The AI mimics the tone, style, and format of the text already written. For example, if the existing text contains a lot of dialogue, more dialogue will be generated; if it contains a lot of narration, more narration will be generated.
Also, the shorter the input text, the more likely new characters and proper nouns are to appear. To establish a stable flow, input at least 40-50 lines.

Annotate the next development

By enclosing an outline of the scene you want to write in [ ] (square brackets) within the text, you have the AI develop the story from that point along the annotated content, for roughly 30 and up to about 100 lines (compatible with SuperTrin and Damsel+RAY only).
Example: [わたしが青空マーケットで買い物をしている描写。] (A description of me shopping at an open-air market.)
Example: [春のコーヒー屋。デリダがノイマンに食ってかかる。] (A coffee shop in spring. Derrida snaps at Neumann.)

Change of scene

If you want to start a new development, give the AI a string commonly used to divide chapters (such as "*", "◆◆◆", "Chapter 1", or "line break ☆ line break"), and it will recognize that a scene change has occurred.
If there is no string that explicitly or implicitly changes the scene, the AI tends to keep depicting the same scene in ever finer detail.

Generally, novels and scripts do not depict everything; the parts in between are omitted. If the text is becoming redundant, or a passage feels hard to depict, change the scene.
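
For example, a scene break plus an annotation might look like this (an illustrative sketch; the divider string and annotation text are arbitrary):

~the previous scene ends here~

◆◆◆

[三日後の放課後。サキが教室で待っている。] (Three days later, after school. Saki is waiting in the classroom.)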

Proofread as you go

If the text is not generated as expected, delete or correct the parts that do not match your intent. Line breaks, spaces, and symbols also influence the AI's reasoning.
If you leave incorrect content uncorrected (such as a tone different from what you intended), the AI will keep building on that information. Conversely, if you correct it properly, the AI's inference will gradually stabilize.

If it becomes monotonous / the same sentence continues

If the same lines or descriptions are repeated, increase the "Repetition Penalty" in the advanced options.

Commenting out

Lines beginning with @_ are comment lines and are not sent to the AI. You can also comment out a whole section by enclosing it between a pair of @/ lines. In addition, all lines above a line that says @break or @startpoint are treated as comments and are not sent to the AI.
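
For example (an illustrative sketch of each form):

@_ Memo to self: foreshadow the ring here. (never sent to the AI)

@/
This old draft paragraph is commented out and not sent to the AI.
@/

(an earlier chapter you no longer want the AI to read)
@break
The AI reads only from this line onward.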

Let the AI write from the middle of the sentence

Everything below a line that says @endpoint is treated as a comment, so if you place @endpoint immediately after a passage, the AI will write the continuation from the middle of the text.
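
For example (a minimal sketch):

彼女はゆっくりとドアを開けて、
@endpoint
(Later chapters placed below @endpoint are not sent to the AI.)

Here the AI continues mid-sentence from 「ドアを開けて、」.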

Retry history

When there is no more Undo history (the button reads << Undo), tapping Undo three more times will output the retry history up to that point as a commented-out block. This is useful for keeping good branches.

▼ Advanced Tips


You can add instructions at the end of the text or in a footnote (author's note) to change the point of view mid-text or to have the characters in the work answer questions. The AI infers answers and character emotions from what has been written so far.

[ (the opening bracket) is designated as a prohibited word by default, so by enclosing explanatory text in brackets you can make the AI recognize that the enclosed text is explanation rather than body text. If explanatory text is not enclosed in brackets, its content may be echoed verbatim or paraphrased in the output.

Viewpoint change

[Saki's POV. First person.]
[Saki's father's POV in third person.]

Switch scenes and let the AI know what's going to happen

(Line break)
[Spring coffee shop. Derrida picks on Neumann.]

(Adding one line break lets the AI know that this is a minor scene change)

Questions and answers

Saki is answering the question.
Q: Who is the class president?
A:

Inner workings of characters

[Words' state of mind:
[Words' true intentions:
[Derrida's current situation:
[Derrida's status:
[Derrida's setting details:
[Trin's monologue]
[Trin's recollection]
[Trin's diary]
[Trin's notes]
(If it doesn't work well, it is effective to add a blank line between the annotation and the body text)

Try writing a haiku

俳句を書いてください。 (Please write a haiku.)
1.古池や 蛙飛びこむ 水の音
2.

(The trick is to make a list)

Summarize the story / write impressions

Summary of this story:
Synopsis so far:
Summary up to this point:
Character introduction
Impression of the work:
What I found interesting after reading this far:

(Let's try various things)

▼ Usage other than novels


You can specify, to some extent, what kind of content is generated (if it doesn't work well, select "Task" from the generation presets). For example, enter the following (including the line breaks):

Name: (your favorite name)
Gender: (male, female, etc.)
Personality:

and a continuation like this will come back:

Weight 64 kg. Blood type O, zodiac sign Cancer. Hobbies/special skills: reading and watching movies. Favorite food is curry udon (especially loves spicy ones). Hates fried chicken and potato chips.

Likewise, enter the following (including the colon):

Summary article about cheesecake:

and a continuation like this will come back:

Click here for how to make cheesecake and a collection of recipes! To make cheesecake at home, you first need to gather the ingredients and tools. Before that, what you want to prepare is a stylish souvenir, right?
But there may also be things that you "don't want to take with you". So, this time, we will carefully select recommended sweets that you should always carry around.

Or enter the following (including the line break):

I tried to find out if Turks prefer Pikachu or Oswald.
The answer is

and a continuation like this will come back:

An old Turkish man said, "I prefer you to either of them."


(Illustration: Words and UNESCO)

▼ Character Book / Command List


By using @commands, you can configure the behavior of character books in more detail.

For example, with the settings below, the character book entry is inserted five lines from the bottom when its tag appears within the last 400 characters of the context; this can also be used to pseudo-insert footnote-like text to suit the situation.

Tag: Lucille
Description:
@range 400
@priority 5
[Lucille: I. A white cat with pure white fur]

Command list and usage examples

@range 500
Scan up to 500 characters from the bottom of the context

@range 100-1000
Scan from the 100th to 1000th character from the bottom of the context

@priority 3
Change the priority of this character book to 3

@and Trin|Weirs
Applies only when "Trin" and "Weirs" are included in the scan range in addition to the specified tag. Multiple designations are separated by |.

@not Derrida
Do not apply if "Derrida" is included in the scan range. Multiple designations are separated by |.

@random 0.4
Even if this character book could be applied, it is ignored with a 40% chance (0.1 for 10%, 1.0 for 100%)

@addbanword mouse robot<<|>>mouse robot
When this character book is applied, the listed words are added to the prohibited words.
Multiple designations are separated by <<|>>.

@addbias cat,10<<|>>dog,-7
When this character book is applied, the word "cat" is more likely to appear and the word "dog" is less likely to appear.
Multiple designations are separated by <<|>>.
Bias values between -30 and 30 are recommended.

Note

Each @command cannot be written multiple times within one tag (for example, @and spread across multiple lines). Be sure to combine the values on a single line.
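
Putting several commands together, a single entry might look like this (an illustrative sketch; the tag, names, and values are arbitrary):

Tag: Lucille
Description:
@range 400
@priority 5
@and Trin|Weirs
@not Derrida
@random 0.2
@addbias cat,8<<|>>dog,-5
[Lucille: A white cat with pure white fur]

This entry applies only when "Trin" and "Weirs" also appear within the last 400 characters of the context and "Derrida" does not; it is skipped with a 20% chance, and while applied it makes "cat" more likely and "dog" less likely to appear.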

▼ What is MOD?


MOD is a feature that lets you fine-tune the AI.
It is trained on an arbitrary corpus (text) that you load, and a model that has learned the corpus's themes and style is inserted between input and output, making it easier to get the text you want without detailed settings.
If you train your own MOD, you can freely share it with other users and enjoy it!


(Illustration: Darkness and Neumann)

Points

MOD is most effective when the input text is short. Open a new work and try it out.

Conversely, you may not notice much change in a prompt that already uses the full 2048 tokens. When you want to switch MODs in the middle of a work, try using the @break command to temporarily reduce the number of tokens the AI reads.

A maximum of 100 tokens is reserved for the MOD.


MOD training points

The corpus must be plain text (PDF, Word, and iWork document formats are not supported), preferably noise-free. Excessive line breaks and leading spaces are removed automatically, but if unrelated text or HTML tags are mixed in, you may not get the expected results.

Only UTF-8 text is supported for the corpus (Shift-JIS and EUC-JP are not). Most recent texts are encoded in UTF-8, but be careful when loading older files.

Basically, the larger the amount of corpus, the higher the accuracy can be expected, but training is possible with at least 10 kilobytes of text. The maximum size that can be read at one time is 16MB.
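
As a quick local pre-check before uploading, a sketch like the following (Node.js/TypeScript) can catch the most common problems; the size thresholds come from the limits above, while the HTML heuristic is our own assumption, not an official check:

import { readFileSync, statSync } from "node:fs";

const path = process.argv[2] ?? "corpus.txt";
const size = statSync(path).size;

// Size limits from the manual: training is possible from about 10 KB,
// and at most 16 MB can be read at one time.
if (size < 10 * 1024) console.warn("corpus is under 10 KB; accuracy may suffer");
if (size > 16 * 1024 * 1024) throw new Error("corpus exceeds the 16 MB upload limit");

// Only UTF-8 is supported; a fatal TextDecoder throws on invalid byte
// sequences (e.g. files saved as Shift-JIS or EUC-JP).
let text: string;
try {
  text = new TextDecoder("utf-8", { fatal: true }).decode(readFileSync(path));
} catch {
  throw new Error("file is not valid UTF-8; re-save it as UTF-8 plain text");
}

// Heuristic (our assumption): leftover HTML tags are a common source of
// noise that degrades training results.
if (/<\/?(html|body|div|span|p)\b/i.test(text)) {
  console.warn("file appears to contain HTML tags; consider stripping them");
}

console.log(`OK: ${(size / 1024).toFixed(1)} KB of UTF-8 plain text`);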

The lower the loss value, the higher the accuracy. Loss tends to be high for novels and other free-form text, and low for rigidly formatted text such as source code. A loss of 2.91 or less indicates high-precision training.

Lumina is consumed according to the number of training steps. The number of steps is automatically adjusted according to the amount of corpus, but you can also reduce the number of steps if you do not require extremely high accuracy (reducing the number of steps will also reduce the processing time).



About privacy

The corpus uploaded for MOD training is deleted from the server after training is completed, and the unique ID of the MOD is designed so that the user who uploaded it cannot be inferred.

▼ AI Model Details


SuperTrin


Language: Japanese
Effective corpus size: 2.5 TB
Tokenizer: Trinsama-tokenizer V3
Maximum sequence length: 9216 tokens (depending on the active membership)
Number of parameters: -
MOD: Not supported
Other: Spherical Deity

With a new architecture, she outputs 2.5 to 4 times faster than Damsel with nearly equal or better specs (the free version has a speed limit), and can theoretically read up to tens of thousands of tokens of context at once. This is the gold-standard model.

☆ Variation models

[Colorful] selectively uses parameters, making it easier to produce a more unconventional writing style. [Solid] has a solid, sturdy writing style somewhat closer to the older version.

☆ Instruct (instruction) tuning

[#ユーザー]
小津(吾輩)はどうして夢の中で黒猫になっていたの?

Give it syntax like the above (here: "Why had Ozu turned into a black cat in his dream?") and, following the body text so far, it will answer like this (beta), explaining the dream in terms of Ozu's deep-seated anxieties:

[#アシスタント]
夢の中の小津が黒猫になっていた理由は、彼の深層心理にある不安や恐れを反映したものです。 小津は自分が周りとは違う存在であることや、いつか孤独になってしまうのではないかという恐怖を抱えており、 自分自身を猫の姿に投影することで、そのような心細さから逃れようとしていたのです。

Try the Assistant AI mode, which uses instruction tuning!

Damsel+RAY


Language: Japanese
Effective corpus size: approx. 3 TB
Tokenizer: Trinsama-tokenizer V3
Maximum sequence length: 16384 tokens
Number of parameters: -
MOD: Not supported
Other: Chaser Maiden

This is the third generation final version of AI Novelist, with upgraded basic specs and a syntax style close to the older Damsel model.

・Pure Japanese model with over 90% of the total 3TB+ data consisting of Japanese corpus
・About twice the active parameters of SuperTrin
・Trained with native 16k context
・A text completion AI designed with novel generation and completion in mind, but also partially compatible with chatbot-like instruction syntax*
・Domain knowledge (narrative, dialogue, game logic, etc.), creativity, and syntax enhanced through annealing

*Instruction tuning can be used with the same syntax as SuperTrin

☆ Legacy model


The following models are discontinued and will not receive future updates.

Trin-sama 7.3B


Language: Japanese
Parameters: 7.3 billion
Effective corpus size: approx. 1.2 TB
Tokenizer: Trinsama-tokenizer
Sequence length: 2048 tokens (depending on the active membership)
Attention heads: 16
MOD: Supported
Branching: 3WAY
Other: Spherical Deity

This is the older model of AI Novelist. She's a Japanese AI that was pre-trained on a web corpus of approximately 1.2TB and a corpus of several thousand paperback books, then fine-tuned for novel generation.

As of early 2022, she is estimated to have read approximately 80-90% of the meaningful Japanese text data available on the internet, which makes her usable for applications beyond novel generation.

She incorporates a new technique called "model branching," which utilizes three models with slightly different tuning that branch based on certain conditions. This helps to prevent the repetition of similar vocabulary.

Trin-sama has sufficiently generalized various concepts, and compared to larger models, she tends to provide more straightforward responses and her behavior is relatively predictable.

Due to her limited number of attention heads, she may confuse relationships when many proper nouns are present. If you notice something incorrect, correct it as you go.

Derrida 7B


Language: Japanese
Parameters: 7.3 billion
Effective corpus size: 1.2 TB + 1 GB
Tokenizer: Trinsama-tokenizer
Sequence length: 2048 tokens (depending on the active membership)
Attention heads: 16
MOD: Not supported
Other: Ghost-kei girl

This is a poet AI created by fine-tuning the Trin-sama 7.3B model, specializing in poetry, lyrics, tanka (a type of Japanese poetry), and prose. The fine-tuning involved approximately 1GB of data.

While she can also write novels with somewhat intense and poetic expressions, please be aware that the presets differ significantly from the base model.

For lyrics and tanka, providing only short sentences tends to produce generic results. To achieve better results, try adding hints like:

[Genre: Pop, 90s]

in the first line. This will significantly improve the quality of the generated text.

Damsel 20B


Language: Japanese
Parameters: 20 billion
Effective corpus size: 2 TB
Tokenizer: Trinsama-tokenizer V3
Sequence length: 3076 tokens (depending on the active membership)
Attention heads: 60
MOD: Not supported
Other: Chaser Damsel

She is a from-scratch Japanese AI with 20 billion parameters, one of the largest currently available, and her writing proficiency is excellent. Her core knowledge base and novel-writing specialization are derived from the "Trin-sama 7.3B" model.

A brand new tokenizer significantly improves how efficiently she processes Japanese text. This new tokenizer is about 6% more efficient than the previous version and allows the AI to understand complex Kanji characters that were previously unreadable.

With a vastly increased number of attention heads (a technical aspect of how the AI processes information), she can manage many names and characters at once. However, she's quite sensitive to settings that penalize repetition, especially when dealing with detailed descriptions. If the story needs to take a sudden turn, you might want to reduce the repetition penalty setting slightly (by about 0.25) to prevent issues.

Neuman 13B / Neuman 20B


Language: English
Parameters: 13 billion / 20 billion
Effective corpus size: up to 1.2 TB
Tokenizer: GPT-2 tokenizer (13B) / Pile tokenizer (20B)
Sequence length: 2048 tokens
MOD: Not supported
Other: Mathematician

These are large-scale English AI models with approximately 13 billion and 20.5 billion parameters, respectively (provided for testing). The 13B model was trained on a roughly 450GB internet corpus and a paperback book corpus, while the 20B model was trained on "The Pile," an 825GB corpus provided by EleutherAI.

The 13B model boasts high generalization accuracy and excels at generating relatively straightforward sentences.

In addition to novels and short texts, the 20B model can generate code in programming languages such as C++, C#, Ruby, PHP, and Python, thanks to training on a corpus that includes GitHub. It also performs well in chatbot-like applications, owing to training corpora that include Reddit and Q&A sites.

▼ User Scripts (Extensions)


User scripts that add extended features are distributed and maintained by Siratama (installing Tampermonkey for Chrome/Firefox is required). Some of their features have been incorporated back into AI Novelist itself, but install the scripts if you want to try everything.

→ Click here for the user script distribution page!

◆ Added Features (Partial)

◆ Example: Inserting images corresponding to character names

(Screenshot: image-attached chat prompt)
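
As a rough illustration of what such a script looks like, here is a minimal Tampermonkey skeleton (plain JavaScript, which is also valid TypeScript). This is not one of Siratama's actual scripts; the @match URL and the selector are assumptions for illustration only:

// ==UserScript==
// @name   AI Novelist example extension (illustrative only)
// @match  https://ai-novel.com/*
// @grant  none
// ==/UserScript==
// NOTE: the @match URL above and the selector below are placeholders,
// not taken from the actual distributed scripts.

(function () {
  "use strict";
  // Hypothetical example: watch the story display element and log
  // whenever new AI output arrives.
  const target = document.querySelector("#story-area"); // placeholder selector
  if (!target) return;
  new MutationObserver(() => {
    console.log("story text changed");
  }).observe(target, { childList: true, characterData: true, subtree: true });
})();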


▼ GUI Mode


GUI mode allows you to change the operation screen according to your purpose, such as text games or chatting with AI. Switch between them in "Output Options" under "Interface Type".

◆ "Novel" Mode

"Novel" mode is the default setting for AI Novelist and has features specialized for novels.
Enter text in the text area and press the "Continue Writing" button to send the text, and the AI will generate text according to the content.

In "Novel" mode, the following shortcuts can be used in addition to entering and editing text.

・Ctrl+Enter key
Sends the currently entered text to the AI.
This has the same function as the "Continue Writing" button.

・Ctrl+R key
Deletes the text returned by the AI in "Continue Writing" and tries to generate again with the text before the AI responded.
This shortcut has the same function as the "Retry" button.

・Ctrl+Q key
Returns to the previous text returned by the AI. If you pressed "Continue Writing", it goes back one step; if you pressed "Retry" or Ctrl+R, it goes back as many steps as you retried.
This shortcut has the same function as the "Undo" button.

・Ctrl+Shift+Q key
Restores the text that was undone with "Undo". It can be restored only as many times as it was undone with "Undo".
This shortcut has the same function as the "Redo" button.

・Ctrl+1 to 3 keys
Changes the current writing mode.

Ctrl+1: Default
Ctrl+2: Dialogue
Ctrl+3: Narrative


◆ "Chat/Game" Mode

"Chat/Game" mode adds a separate input field in addition to the main text area. Entering text in the input field and pressing the "Send" button or Enter key will add the entered content to the main text area and send it to the AI.
The mechanism by which the AI reads the main text area after the input is appended is the same as in "Novel" mode, but with the "Chat GUI Options" settings and the scripts available for the input field, you can use this mode for chat, text games, and other purposes besides novels.

Text sent from the input field always has a line break added at the beginning. However, if a quotation mark opened in the text has not been closed with 」 at the end, no line break is added. These behaviors can be changed in the "Chat GUI Options" described later.

◆ Operation Methods Specific to "Chat/Game" Mode

Unlike "Novel" mode, "Chat/Game" mode has its own operation methods.

・Enter key in the input field
Adds the entered utterance content to the main text area and sends it to the AI.

・Shift + Enter key in the input field
Inserts a line break in the input field. If you want to send multiple lines, you can insert a line break by pressing Shift+Enter.

Note that since "Chat/Game" mode has no writing modes, the writing-mode shortcuts Ctrl+1 to 3 are disabled in this mode only.
(* The writing mode is fixed to "Default".)

◇ Automatic Line Breaks Based on Quotation Mark Combinations

The input field is basically designed not to add an automatic line break while a quotation mark remains unclosed, but there are exceptions depending on the input content.
If multiple quotation marks are used, each quotation mark has a priority, and if a quotation mark with higher priority is closed, an automatic line break will be added even if other quotation marks are not closed.
The priority settings are as follows.

・High priority
「」 Quotation Marks

・Medium priority
『』 Double Quotation Marks
【】 Black Lenticular Brackets
() Parentheses
<> Angle Brackets

・Low priority
《》 Double Angle Brackets
{} Curly Brackets
[] Square Brackets

For example, even if a double quotation mark (『』) is not closed within a quotation mark (「」), if the quotation mark is already closed, an automatic line break will be added.

「よくぞ聞いてくれた。これはその名も『ウルトラ――」
~continues after line break~

Conversely, even if a quotation mark (「」) is within a double quotation mark (『』) and the double quotation mark is closed, if the quotation mark is not closed, an automatic line break will not be added.
「よくぞ聞いてくれた。これはその名も『ウルトラスーパーマックス』~continues without line break~

As an exception, half-width brackets (including [ ]) and half-width quotation marks (including double quotes) never trigger automatic line breaks, regardless of whether they are closed.
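
Read literally, the rules above amount to a priority check roughly like the following sketch (TypeScript). This is our interpretation for illustration, not the actual implementation, and the full-width forms of () and <> are assumed, since half-width brackets are excluded:

// Bracket pairs by priority tier, highest first (per the list above).
const TIERS: [string, string][][] = [
  [["「", "」"]],                                          // high
  [["『", "』"], ["【", "】"], ["(", ")"], ["<", ">"]],  // medium
  [["《", "》"], ["{", "}"], ["[", "]"]],                // low
];

// Count occurrences of a single character in the text.
const count = (text: string, ch: string): number => text.split(ch).length - 1;

// Decide whether an automatic line break should be added for this text.
function shouldAddLineBreak(text: string): boolean {
  for (const tier of TIERS) {
    // Only the highest tier whose brackets actually appear is decisive.
    const used = tier.filter(([open, close]) => count(text, open) + count(text, close) > 0);
    if (used.length === 0) continue;
    // Line break only if every bracket pair used in this tier is closed.
    return used.every(([open, close]) => count(text, open) <= count(text, close));
  }
  return true; // no brackets at all: the default leading line break applies
}

// The two examples from above:
console.log(shouldAddLineBreak("「よくぞ聞いてくれた。これはその名も『ウルトラ――」"));              // true
console.log(shouldAddLineBreak("「よくぞ聞いてくれた。これはその名も『ウルトラスーパーマックス』")); // false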

◇ "Chat GUI Options"
When you select "Chat/Game" mode, "Chat GUI Options" will be added to the settings in "Output Options".
By configuring these settings, you can customize the text sent from the input field.

・Do not automatically add line breaks
Disables the automatic line break insertion at the beginning of the text when sending from the input field.

・Enclose utterances in quotation marks (「」)
Adds quotation marks (「」) to the beginning and end of the text entered in the input field.

If this setting is turned on, the input field will reflect the following.
Entered text: こんにちは
Text actually added: 「こんにちは」

Even if this function is turned on, as long as the "Do not automatically add line breaks" setting described above is not turned on, a line break will be added at the beginning, so the text "(line break) + 「こんにちは」" will be sent.

・Chat template
Sets the text that is inserted when accessing the work list or each time a chat message is sent.
The chat template can be set up to 2000 characters.

Note that when you edit this text box, the text currently queued to send is discarded and replaced with the contents of the template.

◇ Input Field Replacement Function
By setting the type to "Replace input field" in the script function, you can replace the text sent from the input field to the main text area at send time.
If you want to use regular expressions, set the type to "Replace input field (Regular Expression)".
(* As with scripts, editing this function requires a premium subscription.)
Note that the replacement is performed before the "Enclose utterances in quotation marks" function described above.


▼ Developer API


We offer an experimental [ Developer API ] that can be incorporated into client applications and games. See the link for details.


▼ About


This project has been made possible thanks to the cooperation of Google TRC, the provision of a large-scale GPU cluster by CoreWeave, Inc., the DeepSpeed library from Microsoft, the support of the EleutherAI community, and developers such as Ben Wang and Stella Biderman. We would like to take this opportunity to express our gratitude.




Please see the terms of use on this page.
We also have a [ guide for streamers ].


(Illustrations: Trin-sama / Trin)





Bit192 Labs