tomaarsen / twitchmarkovchain

Twitch Bot for generating messages based on what it learned from chat

License: MIT License

Python 100.00%

python twitch bot twitch-bot markov markov-chain twitchbot markovchain

twitchmarkovchain's Introduction

TwitchMarkovChain

Twitch Bot for generating messages based on what it learned from chat

Explanation

When the bot has started, it will start listening to chat messages in the channel listed in the settings.json file. Any chat message not sent by a denied user will be learned from. Whenever someone then requests a message to be generated, a Markov Chain will be used with the learned data to generate a sentence. Note that the bot is unaware of the meaning of any of its inputs and outputs. This means it can use bad language if it was taught to use bad language by people in chat. You can add a list of banned words it should never learn or say. Use at your own risk.

Whenever a message is deleted from chat, its contents will be unlearned at 5 times the rate at which a normal message is learned. The bot avoids learning from commands and from messages containing links.
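The filtering described above could be sketched roughly as follows. This is only an illustration: it assumes that commands start with `!` and that links can be spotted with a simple regular expression. The names `should_learn`, `DENIED_USERS` and `LINK_PATTERN` are hypothetical and not taken from the bot's source.

```python
import re

# Hypothetical stand-ins for the "DeniedUsers" setting and for link detection.
DENIED_USERS = {"streamelements", "nightbot", "moobot", "marbiebot"}
LINK_PATTERN = re.compile(r"https?://|www\.|\b\w+\.(?:com|net|org|tv)\b", re.IGNORECASE)


def should_learn(user: str, message: str) -> bool:
    """Return True if a chat message is a candidate for learning."""
    if user.lower() in DENIED_USERS:
        return False  # sent by a denied (bot) account
    if message.startswith("!"):
        return False  # assumed command prefix
    if LINK_PATTERN.search(message):
        return False  # message appears to contain a link
    return True
```

A real link detector would need a broader pattern; the short list of domain endings here only keeps the example small.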

How it works

Sentence Parsing

To explain how the bot works, I will provide an example situation with two messages that are posted in Twitch chat. The messages are:

Curly fries are the worst kind of fries
Loud people are the reason I don't go to the movies anymore

Let's start with the first sentence and parse it as the bot does. To do so, we split the sentence into sections of keyLength + 1 words. As keyLength is set to 2 in the Settings section, each section has 3 words.

Curly fries are the worst kind of fries
[Curly fries:are]
      [fries are:the]
            [are the:worst]
                [the worst:kind]
                    [worst kind:of]
                          [kind of:fries]

For each of these sections of three words, the last word is considered the output, while all other words are considered inputs. These words are then turned into a variation of a Grammar:

"Curly fries" -> "are"
"fries are"   -> "the"
"are the"     -> "worst"
"the worst"   -> "kind"
"worst kind"  -> "of"
"kind of"     -> "fries"

This can be considered a mathematical function that, when given input "the worst", will output "kind". In order for the program to know where sentences begin, we also add the first keyLength words to a separate database table, where a list of possible starts of sentences resides.

This exact same process is applied to the second sentence as well. After doing so, the resulting grammar (and our corresponding database table) looks like:

"Curly fries" -> "are"
"fries are"   -> "the"
"are the"     -> "worst" | "reason"
"the worst"   -> "kind"
"worst kind"  -> "of"
"kind of"     -> "fries"
"Loud people" -> "are"
"people are"  -> "the"
"the reason"  -> "I"
"reason I"    -> "don't"
"I don't"     -> "go"
"don't go"    -> "to"
"go to"       -> "the"
"to the"      -> "movies"
"the movies"  -> "anymore"

and in the database table for starts of sentences:

"Curly fries"
"Loud people"

Note that the | is read as "or". For the "are the" rule above, this means: if the given input is "are the", then the output is either "worst" or "reason".

In practice, more frequent phrases will have higher precedence. The more often a phrase is said, the more likely it is to be generated.
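The parsing steps above can be sketched in a few lines of Python. This is a simplified illustration, not the bot's actual code: it splits on whitespace, keeps everything in memory rather than in a database, and uses the hypothetical names `learn`, `grammar` and `starts`. A `Counter` records how often each output follows a key, which is one way to make frequent phrases more likely.

```python
from collections import Counter, defaultdict

KEY_LENGTH = 2  # mirrors the "KeyLength" setting


def learn(message, grammar, starts):
    """Add one chat message to the grammar and to the sentence starts."""
    words = message.split()
    if len(words) <= KEY_LENGTH:
        return  # too short to form a single key -> output rule
    starts[tuple(words[:KEY_LENGTH])] += 1
    for i in range(len(words) - KEY_LENGTH):
        key = tuple(words[i:i + KEY_LENGTH])
        grammar[key][words[i + KEY_LENGTH]] += 1


grammar = defaultdict(Counter)
starts = Counter()
learn("Curly fries are the worst kind of fries", grammar, starts)
learn("Loud people are the reason I don't go to the movies anymore", grammar, starts)

print(dict(grammar[("are", "the")]))  # {'worst': 1, 'reason': 1}
print(sorted(starts))                 # [('Curly', 'fries'), ('Loud', 'people')]
```

If a third message containing "are the worst" were learned, the count for "worst" would rise to 2, so a weighted random pick would favour it over "reason".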

Generation

When a message is generated with !generate, a random start of a sentence is picked from the database table of starts of sentences. In our example the randomly picked start is "Curly fries".

Now, in a loop:

The output for the input is generated via the grammar.
And the input for the next iteration in the loop is shifted:
- Remove the first word from the input.
- Add the new output word to the end of the input.

So, the input starts as "Curly fries". The output for this input is generated via the grammar, which gives us "are". Then, the input is updated: "Curly" is removed, and "are" is added. The new input for the next iteration is therefore "fries are". This process repeats until no more words can be generated or a word limit is reached.

A more programmatic example of this would be this:

# This initial sentence is either from the database for starts of sentences,
# or from words passed in Twitch chat
sentence = ["Curly", "fries"]
for i in range(sentence_length):
    # Generate a word using last 2 words in the partial sentence,
    # and append it to the partial sentence
    sentence.append(generate(sentence[-2:]))

It's common for an input sequence to have multiple possible outputs, as we can see for "are the" in the previous grammar. This allows learned information from multiple messages to be merged into one message. For instance, some potential outputs from the given example are:

Curly fries are the reason I don't go to the movies anymore

Loud people are the worst kind of fries
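For readers who want to try the loop themselves, here is a self-contained sketch using the example grammar. It is a simplification under stated assumptions: the grammar is a plain dictionary instead of a database, all outputs are equally likely, and generation stops as soon as no rule matches. The names `GRAMMAR`, `STARTS` and `generate_sentence` are hypothetical.

```python
import random

# Grammar from the example: each key of two words maps to its possible
# next words. Weights are omitted, so every option is equally likely.
GRAMMAR = {
    ("Curly", "fries"): ["are"],
    ("fries", "are"): ["the"],
    ("are", "the"): ["worst", "reason"],
    ("the", "worst"): ["kind"],
    ("worst", "kind"): ["of"],
    ("kind", "of"): ["fries"],
    ("Loud", "people"): ["are"],
    ("people", "are"): ["the"],
    ("the", "reason"): ["I"],
    ("reason", "I"): ["don't"],
    ("I", "don't"): ["go"],
    ("don't", "go"): ["to"],
    ("go", "to"): ["the"],
    ("to", "the"): ["movies"],
    ("the", "movies"): ["anymore"],
}
STARTS = [("Curly", "fries"), ("Loud", "people")]


def generate_sentence(start=None, max_words=25, rng=random):
    """Extend a start of two words until no rule applies or the limit is hit."""
    sentence = list(start or rng.choice(STARTS))
    while len(sentence) < max_words:
        options = GRAMMAR.get(tuple(sentence[-2:]))
        if not options:
            break  # no known continuation for the last two words
        sentence.append(rng.choice(options))
    return " ".join(sentence)


print(generate_sentence(("Curly", "fries")))
```

Starting from "Curly fries", this prints one of the two sentences that begin with those words, depending on which option is picked at "are the".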

Commands

Chat members can generate chat-like messages using the following commands (Note that they are aliases):

!generate [words]
!g [words]

Example:

!g Curly

Result (for example):

Curly fries are the reason I don't go to the movies anymore

The bot will, when given this command, try to complete the start of the sentence which was given.
- If it cannot, an appropriate error message will be sent to chat.
Any number of words may be given, including none at all.
Everyone can use it.

Furthermore, chat members can find a link to How it works by using one of the following commands:

!ghelp
!genhelp
!generatehelp

The use of this command makes the bot post this message in chat:

Learn how this bot generates sentences here: https://github.com/CubieDev/TwitchMarkovChain#how-it-works

Streamer commands

All of these commands can be whispered to the bot account, or typed in chat.

To disable the bot from generating messages, while still learning from regular chat messages:

!disable

After disabling the bot, it can be re-enabled using:

!enable

Changing the cooldown between generations is possible with one of the following two commands:

!setcooldown <seconds>
!setcd <seconds>

Example:

!setcd 30

This sets the cooldown between generations to 30 seconds.
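One way to picture the cooldown is as a timestamp of the last successful generation. The sketch below is an assumption about how such a check could look, not the bot's implementation; the `Cooldown` class and its methods are hypothetical. It follows the rule from the Settings section that only successful generations restart the cooldown.

```python
import time


class Cooldown:
    """Track the time since the last successful generation."""

    def __init__(self, seconds, clock=time.monotonic):
        self.seconds = seconds
        self._clock = clock
        self._last_success = None

    def remaining(self):
        """Seconds left before another generation is allowed (0 if ready)."""
        if self._last_success is None:
            return 0.0
        elapsed = self._clock() - self._last_success
        return max(0.0, self.seconds - elapsed)

    def mark_success(self):
        """Call only after a message was generated; failures do not count."""
        self._last_success = self._clock()


cooldown = Cooldown(seconds=30)  # comparable to "!setcd 30"
if cooldown.remaining() == 0:
    cooldown.mark_success()
```

Passing the clock in as a parameter makes the class easy to test without waiting for real time to pass.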

Moderator commands

All of these commands must be whispered to the bot account. Moderators (and the broadcaster) can modify the blacklist to prevent the bot from learning words it shouldn't.

To add a word to the blacklist, a moderator can whisper the bot:

!blacklist <word>

Similarly, to remove a word from the blacklist, a moderator can whisper the bot:

!whitelist <word>

And to check whether a word is already on the blacklist, a moderator can whisper the bot:

!check <word>

Settings

This bot is controlled by a settings.json file, which has the following structure:

{
  "Host": "irc.chat.twitch.tv",
  "Port": 6667,
  "Channel": "#<channel>",
  "Nickname": "<name>",
  "Authentication": "oauth:<auth>",
  "DeniedUsers": ["StreamElements", "Nightbot", "Moobot", "Marbiebot"],
  "AllowedUsers": [],
  "Cooldown": 20,
  "KeyLength": 2,
  "MaxSentenceWordAmount": 25,
  "MinSentenceWordAmount": -1,
  "HelpMessageTimer": 18000,
  "AutomaticGenerationTimer": -1,
  "WhisperCooldown": true,
  "EnableGenerateCommand": true,
  "SentenceSeparator": " - ",
  "AllowGenerateParams": true,
  "GenerateCommands": ["!generate", "!g"]
}

Parameter	Meaning	Example
`Host`	The URL that will be used. Do not change.	`"irc.chat.twitch.tv"`
`Port`	The Port that will be used. Do not change.	`6667`
`Channel`	The Channel that will be connected to.	`"#CubieDev"`
`Nickname`	The Username of the bot account.	`"CubieB0T"`
`Authentication`	The OAuth token for the bot account.	`"oauth:pivogip8ybletucqdz4pkhag6itbax"`
`DeniedUsers`	The list of (bot) accounts whose messages should not be learned from. The bot itself is automatically added to this.	`["StreamElements", "Nightbot", "Moobot", "Marbiebot"]`
`AllowedUsers`	A list of users with heightened permissions. Gives these users the same power as the channel owner, allowing them to bypass cooldowns, set cooldowns, disable or enable the bot, etc.	`["Michelle", "Cubie"]`
`Cooldown`	A cooldown in seconds between successful generations. If a generation fails (e.g. because of inputs it can't work with), the cooldown is not reset and another generation can be done immediately.	`20`
`KeyLength`	A technical parameter which, in my previous implementation, would affect how closely the output matches the learned inputs. In the current implementation the database structure does not allow this parameter to be changed. Do not change.	`2`
`MaxSentenceWordAmount`	The maximum number of words that can be generated. Prevents absurdly long and spammy generations.	`25`
`MinSentenceWordAmount`	The minimum number of words that can be generated. Might generate multiple sentences, separated by the value from `SentenceSeparator`. Prevents very short generations. -1 to disable.	`-1`
`HelpMessageTimer`	The number of seconds between help messages that link to How it works. -1 for no help messages. Defaults to once every 5 hours.	`18000`
`AutomaticGenerationTimer`	The number of seconds between automatically sent generated messages, as if someone wrote `!g`. -1 for no automatic generations.	`-1`
`WhisperCooldown`	Allows the bot to whisper a user the remaining cooldown after that user has attempted to generate a message.	`true`
`EnableGenerateCommand`	Globally enables/disables the generate command.	`true`
`SentenceSeparator`	The separator between multiple sentences. Only relevant if `MinSentenceWordAmount` > 0, as only then can multiple sentences be generated. Sensible values for this might be `", "`, `". "`, `" - "` or `" "`.	`" - "`
`AllowGenerateParams`	Allow chat to supply a partial sentence which the bot finishes, e.g. `!generate hello, I am`. If `false`, all values after the generation command will be ignored.	`true`
`GenerateCommands`	The generation commands that the bot will listen for. Defaults to `["!generate", "!g"]`. Useful if your chat is used to commands with `~`, `-`, `/`, etc.	`["!generate", "!g"]`

Note that the example OAuth token is not an actual token, but merely a generated string to give an indication what it might look like.

I got my real OAuth token from https://twitchapps.com/tmi/.
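As an illustration of how a settings file like the one above could be read and checked, here is a small sketch. It is not the bot's own loader: the function `parse_settings`, the `DEFAULTS` subset and the sanity checks (for example, that `Channel` starts with `#`, as in the examples) are assumptions made for this example.

```python
import json

# Defaults copied from the example settings.json above (subset only).
DEFAULTS = {
    "Host": "irc.chat.twitch.tv",
    "Port": 6667,
    "Cooldown": 20,
    "KeyLength": 2,
    "MaxSentenceWordAmount": 25,
    "MinSentenceWordAmount": -1,
    "SentenceSeparator": " - ",
    "GenerateCommands": ["!generate", "!g"],
}
REQUIRED = ("Channel", "Nickname", "Authentication")


def parse_settings(text):
    """Parse settings JSON, fill in defaults and run basic sanity checks."""
    settings = {**DEFAULTS, **json.loads(text)}
    missing = [key for key in REQUIRED if key not in settings]
    if missing:
        raise ValueError(f"Missing settings: {', '.join(missing)}")
    if not settings["Channel"].startswith("#"):
        raise ValueError('"Channel" should start with "#"')
    if not settings["Authentication"].startswith("oauth:"):
        raise ValueError('"Authentication" should start with "oauth:"')
    return settings


example = '{"Channel": "#CubieDev", "Nickname": "CubieB0T", "Authentication": "oauth:<auth>"}'
print(parse_settings(example)["Cooldown"])  # 20
```

To read an actual file, the text could come from `open("settings.json", encoding="utf-8").read()`.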

Blacklist

You may add words to a blacklist by putting each one on a separate line in blacklist.txt. Words are matched case-insensitively. By default, this file only contains <start> and <end>, which are required for the current implementation.

Words can also be added to or removed from the blacklist via whispers, as described in the Moderator commands section.
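A case-insensitive blacklist check could look roughly like this. The helper names `parse_blacklist` and `contains_banned_word` are hypothetical, and the word-by-word comparison on whitespace-separated words is an assumption about the matching, not a description of the bot's code.

```python
def parse_blacklist(text):
    """Return the set of banned words, one per line, lowercased."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def contains_banned_word(message, blacklist):
    """Check a message word by word, ignoring case."""
    return any(word.lower() in blacklist for word in message.split())


blacklist = parse_blacklist("<start>\n<end>\nBadWord\n")
print(contains_banned_word("that is a BADWORD", blacklist))  # True
print(contains_banned_word("Curly fries", blacklist))        # False
```

In practice the text passed to `parse_blacklist` would be the contents of blacklist.txt.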

Requirements

Python 3.6+
Module requirements
- Install these modules by running pip install -r requirements.txt on the command line.

Among these modules is my own TwitchWebsocket wrapper, which makes building a Twitch chat bot a lot easier. This repository can be seen as an implementation using this wrapper.

Contributors

My gratitude is extended to the following contributors who've decided to help out.

@DoctorInsano - Several small fixes and improvements in v1.0.
@justinrusso - Several features, refactors and fixes, that represent the core of v2.0 and v2.1.

Other Twitch Bots

TwitchAIDungeon
TwitchGoogleTranslate
TwitchCubieBotGUI
TwitchCubieBot
TwitchRandomRecipe
TwitchUrbanDictionary
TwitchRhymeBot
TwitchWeather
TwitchDeathCounter
TwitchSuggestDinner
TwitchPickUser
TwitchSaveMessages
TwitchMMLevelPickerGUI (Mario Maker 2 specific bot)
TwitchMMLevelQueueGUI (Mario Maker 2 specific bot)
TwitchPackCounter (Streamer specific bot)
TwitchDialCheck (Streamer specific bot)
TwitchSendMessage (Meant for debugging purposes)

twitchmarkovchain's Issues

No such table

I'm trying to run the bot but it just outputs this

[2021-09-20 01:10:14,417] [Database] [INFO ] - Updating Database to new version - supports better punctuation handling.
[2021-09-20 01:10:14,423] [Database] [INFO ] - Created a copy of the database called "MarkovChain_yabbe_modified.db". The update will modify this file.
Traceback (most recent call last):
  File "C:\Users\wrk\Desktop\bt\TwitchMarkovChain\MarkovChainBot.py", line 570, in <module>
    MarkovChain()
  File "C:\Users\wrk\Desktop\bt\TwitchMarkovChain\MarkovChainBot.py", line 30, in __init__
    self.db = Database(self.chan)
  File "C:\Users\wrk\Desktop\bt\TwitchMarkovChain\Database.py", line 96, in __init__
    self.update_v3(channel)
  File "C:\Users\wrk\Desktop\bt\TwitchMarkovChain\Database.py", line 411, in update_v3
    modify_start(table)
  File "C:\Users\wrk\Desktop\bt\TwitchMarkovChain\Database.py", line 331, in modify_start
    data = self.execute(f"SELECT * FROM {table};", fetch=True)
  File "C:\Users\wrk\Desktop\bt\TwitchMarkovChain\Database.py", line 516, in execute
    cur.execute(sql)
sqlite3.OperationalError: no such table: MarkovStartA

What am i missing?

Disabling Whispers

Is it possible to disable whispers or the possibility for a setting to turn them off?

I've been fiddling with it for a while but I'm not having any luck disabling whispers which is getting the bot shadowbanned from Twitch.

Sending whispers does nothing

When I try and send !enable to my bot through a whisper, this message appears in my CLI
[2021-11-18 22:47:20,713] [__main__] [INFO ] - Your settings prevent you from sending this whisper.

I can't find anything in any of the .py files that even returns that message.

Database error

Hello,
I need help with running the bot. When I run it it gives me this error:
File "c:\Users\Uživatel\Desktop\TwitchMarkovChain-master\Database.py", line 160, in execute
with sqlite3.connect(self.db_name) as conn:
sqlite3.OperationalError: unable to open database file

Do you know how to fix it?

Thanks

SAJBTR0N

bot breaking up words with single quotation marks in them

Trying to generate sentences with !g and words such as "I'm", "haven't", "can't" etc. doesn't work, I believe they are saved properly in the database and the bot uses them normally when it picks them itself, but it breaks if you try to put them in a !g command.

Typing "!g i've" makes the bot say "I haven't learned what to do with "i 've" yet."
Typing "!g can't" makes it say "I haven't learned what to do with "ca n't" yet."

It does have data for the actual words in the database but it looks like something goes wrong with splitting the words and putting them back together when generating.

Direct recursion is disallowed when it's not guaranteed anyway

Comments with "x x x" should be entered in the database if at least k% of messages of the format "x x y" (where y can equal x) are of the format "x x y" where y ≠ x

This will allow for the bot to produce messages like "LUL LUL LUL BANNED" if "LUL LUL BANNED" and "LUL LUL LUL" are both in the database. Of course, the message could be 20 LULs, but the probability of this is ((100-k)/100)^20 which is gonna be pretty low if k is something like 50
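As a quick check of the estimate above, taking the formula from this issue as given and assuming k = 50:

```python
k = 50  # threshold percentage from the proposal (assumed value)
p_repeat = (100 - k) / 100   # chance of one more repeated word
p_twenty = p_repeat ** 20    # formula given in the issue
print(f"{p_twenty:.2e}")     # 9.54e-07
```

That is roughly one in a million, which supports the claim that such long repeats would be rare.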

Reduce the amount of learned information required to start outputting

The lines
https://github.com/CubieDev/TwitchMarkovChain/blob/c7b8639bf85ae129f4a1e3f0dd757ecdef75f098/Database.py#L226-L230

will pick a character and then select data that starts with that character. However, this means that the bot needs to learn sentences that start with any of the 27 possible tokens before it stops saying "There is not enough learned information yet.".
This can definitely be improved.
Obviously a select on all MarkovStart tables as opposed to just one will work, but I would love to keep the program somewhat efficient even if it has enormous amounts of information.

This can also be fixed by altering the structure of the database, which currently consists of very many tables to avoid the seemingly exponential increase in query time relative to the row count.
Perhaps switching to a different database structure will help too.

Is it possible to make the bot generate a reply after someone sends a message in the chat?

I've been using this bot for quite a while now and I must say it's one of the best I've seen out there. Even though we're using Portuguese instead of English with it, it got the hang of the basic structure pretty quickly and is already experimenting with combining other words and phrases. I would like to know if it's possible to add a feature where the bot generates an answer after somebody sends a message in chat, instead of on the timer. To be honest, I don't know 100% how this bot works, which is why I'm interested to know if such a thing is possible.

Thanks for the attention, Diogo

More frequent errors occurring

For some time now (I tried to work out at which point it started, but can't) I have been getting more frequent crashes of the bot and have to restart it manually. Right now it is almost daily. This is the error message that I got:

bot stuck to the wrong account

apologies if this is the wrong place for this kind of issue, but I seem to have gotten my installation of the bot stuck to the wrong account. I understand how it happened (I used the wrong OAuth password) but I can't figure out how to fix it, as no amount of modifying (or even deleting) files on my computer changes anything. presumably there are files uploaded elsewhere that would need to be modified/deleted to get this set up on the right account, but I don't know how to do that.

Send message on a timer that explains how the Bot works

A timed, toggleable message explains how the bot works by linking to https://cubiedev.github.io/TwitchMarkovChain/#how-it-works.
This will also remind users that the bot is live.

The internal Timer between a message can then also be used for implementing #8.

Login authentication fails

There have been increasing problems with the program crashing more regularly over the last months, but I was always able to restart it and it worked again just fine. But since yesterday, the authentication fails within the first three steps when I try to run it. I tried updating the OAuth token, but it didn't help. I guess the old authentication method is finally not working with Twitch anymore. :/
I guess this project is dead, but I just wanted to reach out anyways. My viewers always loved the feature!

Incompatible databases

Is it possible to make databases from older versions of the bot work with the newest one? It breaks whenever I give it my database that's got the most chat messages gathered.

How to activate bot

This is a very naive question, but I have managed to download and install the bot using conda.

How do I activate it? Do I simply need to link it to a bot account, or is there something else I need to do?

how to make bot only learn messages from a specific person

i was thinking of using self.check_if_permissions(m) to accomplish this but i couldnt get it to work, any help on this?

No such table error

When I try to run the MarkovChainBot.py file in Command Prompt, I get the following text:

C:\Users\USER ONE\Documents\Kadoomed\TwitchMarkovChain-master\TwitchMarkovChain-master>python MarkovChainBot.py
[2021-11-02 14:48:53,620] [Database] [INFO ] - Updating Database to new version - supports better punctuation handling.
[2021-11-02 14:48:53,621] [Database] [INFO ] - Created a copy of the database called "MarkovChain_kadoomed_modified.db". The update will modify this file.
Traceback (most recent call last):
  File "C:\Users\USER ONE\Documents\Kadoomed\TwitchMarkovChain-master\TwitchMarkovChain-master\MarkovChainBot.py", line 570, in <module>
    MarkovChain()
  File "C:\Users\USER ONE\Documents\Kadoomed\TwitchMarkovChain-master\TwitchMarkovChain-master\MarkovChainBot.py", line 30, in __init__
    self.db = Database(self.chan)
  File "C:\Users\USER ONE\Documents\Kadoomed\TwitchMarkovChain-master\TwitchMarkovChain-master\Database.py", line 96, in __init__
    self.update_v3(channel)
  File "C:\Users\USER ONE\Documents\Kadoomed\TwitchMarkovChain-master\TwitchMarkovChain-master\Database.py", line 411, in update_v3
    modify_start(table)
  File "C:\Users\USER ONE\Documents\Kadoomed\TwitchMarkovChain-master\TwitchMarkovChain-master\Database.py", line 331, in modify_start
    data = self.execute(f"SELECT * FROM {table};", fetch=True)
  File "C:\Users\USER ONE\Documents\Kadoomed\TwitchMarkovChain-master\TwitchMarkovChain-master\Database.py", line 516, in execute
    cur.execute(sql)
sqlite3.OperationalError: no such table: MarkovStartA

Again, I have very little experience with Python so I'm not sure what this means, but it looks like it's trying to check data in a table which hasn't been created in the database is that correct? I can't see an obvious reason for this to happen to me but not to others who are running the same script. Any ideas?

merging databases

Hey, I'm wondering if there's a way to combine the data gathered from several different channels into one db file, instead of being limited to data from one channel at a time

Support for the bot speaking without having to type !g or !generate?

Not a very important feature but could be neat to have the bot just keep talking with a cooldown between messages.

Error: Unrecognized command: /mods

For some time now (I believe it can't be more than a few weeks) the bot seems to have problems fetching the mod list, resulting in an error. Sometimes it runs afterwards anyway (see second picture), but sometimes it just ends in an error.

No pyvenv.cfg file

Hello

I, a total Python noob, have foolishly decided to try and install this chatbot. I think I've got it all set up correctly, installed the websocket in the same folder and edited the settings file with the required info and OAuth code, but when I try to run the MarkovChainBot.py file nothing happens. If I try to execute it in command prompt I get the error "No pyvenv.cfg file"

What am I missing?

cannot whisper !generate to bot

unsure if a bug or not implemented yet but while other commands such as !setcd can be whispered to the bot and it will listen to them, whispering !generate seems to be ignored by the bot

HTTP error 404 while installing the requirements

trying to install the requirements for the bot to run results in exit code: 128 "The unauthenticated git protocol on port 9418 is no longer supported". Requirements page seems to be unreachable, resulting in an error 404 when opened.

No spaces before quotes

The detokenizer isn't prepending spaces before quotes like it says it should in the Tokenizer.py examples.
If I use one of the examples on the detokenizer:

["He", "said", "''", "heya", "!", "''", "yesterday", "."]

it returns:

He said"heya!" yesterday.

Issue with sqlite3.OperationalError: unable to open database file

Very new to vscode/python so its very likely a me issue, but does anyone know why this might be?
Here is the error i got:
line 500, in execute_commit
    with sqlite3.connect(self.db_name) as conn:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.OperationalError: unable to open database file

Problem with TwitchWebSocket while trying to run MarkovChainBot.py

Hello! I'm a long time user of your program and so far it has been working flawlessly! But recently i formatted my SSD and while trying to set up the bot again i installed Python, Git and ran the requirement.txt command with no problems, but as soon as i tried to run the file MarkovChainBot.py it gave me this error:

Could not find platform independent libraries <prefix>
Traceback (most recent call last):
  File "C:\Users\Administrator\Downloads\TwitchMarkovChain-2.4\TwitchMarkovChain-2.4\MarkovChainBot.py", line 4, in <module>
    from TwitchWebsocket import Message, TwitchWebsocket
  File "C:\Users\Administrator\Downloads\TwitchMarkovChain-2.4\TwitchMarkovChain-2.4\TwitchWebsocket.py", line 8, in <module>
    from TwitchWebsocket.Message import Message
ModuleNotFoundError: No module named 'TwitchWebsocket.Message'; 'TwitchWebsocket' is not a package

I tried to download the TwitchWebSocket myself and copy the message and twitchwebsocket files into the folder but to no avail, rerunning the requirements command just gave back that the requirement was already satisfied, so i'm not sure what's going on or how to fix, any help is highly appreciated!

Not enough learned information

I don't seem to understand how much information is needed for the bot to work. It seems that no matter how many messages I send, !generate just gives me "There is not enough learned information yet." I really have no clue what I'm doing wrong, if anybody could help me.

no such table: MarkovStartA Error

after installing requirements and generating the settings.json.

sudo python3 MarkovChainBot.py
[2021-10-10 16:35:36,001] [Database] [INFO ] - Updating Database to new version - supports better punctuation handling.
[2021-10-10 16:35:36,003] [Database] [INFO ] - Created a copy of the database called "MarkovChain__modified.db". The update will modify this file.
Traceback (most recent call last):
  File "MarkovChainBot.py", line 570, in <module>
    MarkovChain()
  File "MarkovChainBot.py", line 30, in __init__
    self.db = Database(self.chan)
  File "/home/rainer/TwitchMarkovChain/Database.py", line 96, in __init__
    self.update_v3(channel)
  File "/home/rainer/TwitchMarkovChain/Database.py", line 411, in update_v3
    modify_start(table)
  File "/home/rainer/TwitchMarkovChain/Database.py", line 331, in modify_start
    data = self.execute(f"SELECT * FROM {table};", fetch=True)
  File "/home/rainer/TwitchMarkovChain/Database.py", line 516, in execute
    cur.execute(sql)
sqlite3.OperationalError: no such table: MarkovStartA