
labs-countervandalism-cvnbot's Introduction

CVNBot

Support

Contribute

Found a bug? Please report it to our issue tracker.

Build

The software is written in C# and was originally created as a Visual Studio project. We use Mono to run the executable and msbuild to build it.

Recommended installation methods:

For standalone command-line installations on Mac or Windows, see monodevelop.com.

Currently supported versions of Mono: 6.12

Once mono is installed, build the project. The below uses Debug, for local development. (See Installation for how to install it in production):

countervandalism/CVNBot:$ msbuild src/CVNBot.sln /p:Configuration=Debug

Once built, you can run it:

countervandalism/CVNBot/src/CVNBot/bin/Debug:$ mono CVNBot.exe

Versioning

We use the Semantic Versioning guidelines as much as possible. Releases will be numbered in the following format: <major>.<minor>.<patch>

For more information on SemVer, please visit https://semver.org/.

License

See LICENSE.

labs-countervandalism-cvnbot's People

Contributors

anticompositenumber, az1568, davidkarlas, kashifkhan, krinkle, marcoaureliowm, rstular, universal-omega

labs-countervandalism-cvnbot's Issues

Include git hash in IRC 'version' command

It would be nice to somehow in the build process include the current git HEAD hash so that it can be echoed out at run-time from the CVNBot version command (Program::BotConfigMsg).
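
One possible shape for this, as a minimal sketch: it assumes the build writes the hash into the assembly's informational version as a "+hash" suffix (for example "1.22.0+abc1234"). That suffix convention, and the helper names SplitVersion and ReadInformationalVersion, are assumptions for illustration and not something the project does today.

```csharp
using System;
using System.Reflection;

static class VersionSketch
{
    // Splits a hypothetical "<version>+<git hash>" string into its two parts.
    public static string[] SplitVersion(string informational)
    {
        int plus = informational.IndexOf('+');
        if (plus < 0)
        {
            // No hash suffix present: report the hash as unknown.
            return new[] { informational, "unknown" };
        }
        return new[] { informational.Substring(0, plus), informational.Substring(plus + 1) };
    }

    // Reads AssemblyInformationalVersionAttribute from the running assembly, if set.
    public static string ReadInformationalVersion()
    {
        var attr = Assembly.GetExecutingAssembly()
            .GetCustomAttribute<AssemblyInformationalVersionAttribute>();
        return attr != null ? attr.InformationalVersion : "0.0.0";
    }

    static void Main()
    {
        string[] parts = SplitVersion(ReadInformationalVersion());
        Console.WriteLine("version " + parts[0] + " (git " + parts[1] + ")");
    }
}
```

The run-time half is the easy part; how the hash gets stamped in at build time is left open here.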

Include a report of the text triggering "Possible gibberish?" on IRC

Here is a sample from #cvn-wp-en:

[22:28] <+CVNBot1> User [[en:User:RedSox39]] Possible gibberish? [[en:2016–17 Alabama A&M Bulldogs basketball team]] (+1608) Diff: https://en.wikipedia.org/w/index.php?diff=752428797&oldid=752422654 ""
[22:28] <+CVNBot1> User [[en:User:Bryan McLaude]] Possible gibberish? [[en:Nuestro Amor]] (+1548) Diff: https://en.wikipedia.org/w/index.php?diff=752428880&oldid=752425435 "Added 'Promotion- Singles' section."
[22:29] <+CVNBot1> IP [[en:User:2601:542:C002:FB00:149C:FBF5:615B:6E]] blanked [[en:Dick Mehen]] (-6499) Diff: https://en.wikipedia.org/w/index.php?diff=752428898&oldid=752428818
[22:29] <+CVNBot1> User [[en:User:0xF8E8]] Possible gibberish? [[en:Felipe Andreoli]] (+1757) Diff: https://en.wikipedia.org/w/index.php?diff=752428944&oldid=751293099 "fulfilling talk page edit request"
[22:30] <+CVNBot1> User [[en:User:Noha307]] used edit summary "delete(?!d)" [[en:List of surviving Focke-Wulf Fw 190s]] (-754) Diff: https://en.wikipedia.org/w/index.php?diff=752429051&oldid=752428803 "/* List of replicas */ Delete 990011 & Unknown Entries – Moved to Table Below"
[22:31] <+CVNBot1> User [[en:User:Betsuperhulk]] Possible gibberish? [[en:Template:MGM]] (+3482) Diff: https://en.wikipedia.org/w/index.php?diff=752429069&oldid=752426600 ""
[22:31] <+CVNBot1> User [[en:User:Avabaz]] Possible gibberish? [[en:User:Idychang/sandbox]] (+3709) Diff: https://en.wikipedia.org/w/index.php?diff=752429117&oldid=752424271 "/* Applications */"
[22:31] <+CVNBot1> User [[en:User:Social.Team]] Copyvio? [[en:Attorney Elie Mahfoud Mahfoud]] (+3625) URL: https://en.wikipedia.org/w/index.php?oldid=752429172&rcid=886791737 "[[WP:AES|←]]Created page with '{{subst:Biography}}'"
[22:32] <+CVNBot1> User [[en:User:Betsuperhulk]] Large removal [[en:Template:MGM]] (-3231) Diff: https://en.wikipedia.org/w/index.php?diff=752429194&oldid=752429069 ""
[22:32] <+CVNBot1> User [[en:User:Michaelsmani]] Possible gibberish? [[en:Chaebol]] (+2634) Diff: https://en.wikipedia.org/w/index.php?diff=752429251&oldid=752420814 "/* Monopolistic Behavior */ Sources."

I hope that makes it obvious why I am requesting this: include a report of the text triggering "Possible gibberish?" on IRC, please!

Support running CVNBot on dotnet

(To new users: I've inherited this project to run and maintain as a volunteer. While I've done software development small and large for many years, I have no experience with C#, .NET and Mono specifically - beyond an annual server upgrade or little patch here and there for this one project.)

Ref #66.
Ref #13.

It is my understanding that:

  • It seems preferable not to lock into Mono-specifics. Making sure we can run on dotnet as well seems like a good long-term move.
  • The project is currently based on .NET Framework, which seems to be discouraged nowadays. It is mainly intended for Windows desktop applications and is only partially implemented for Mono/Linux, although those parts are built into Mono, which is what currently makes this convenient. Historically this makes sense, as .NET Framework is what .NET/Mono applications were generally based on. But nowadays there is a standardised and slimmer base (".NET Core") that is fully supported cross-platform.

Todo:

  • Ship the Mono libraries we use explicitly as a package dependency, instead of assuming it from the run-time
    • This is only Mono.Data.Sqlite currently I believe. And as I understand it, the DLL for Mono.Data.Sqlite would work as-is for dotnet.
  • Target ".NET Core" instead of ".NET Framework". Done by changing the TargetFramework element in the csproj file from e.g. net472 to netcoreapp3.1
    • This will presumably provide fewer and possibly different built-in APIs. I expect some changes may be needed which will need to be figured out and made accordingly.
  • Verify and integrate continuously with dotnet and mono side-by-side in CI.

CVNBot: not able to reload wikis

From https://phabricator.wikimedia.org/T152494#2863019:

[13:20]	<mafk> CVNBot8 reload ff.wikipedia
[13:20]	<CVNBot8> Unable to reload:
Unable to retrieve https://ff.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=namespaces&format=xml from server.
Error was: Error getting response stream (Write: The authentication or decryption has failed.): SendFailure

Sorry for reporting here but I don't know where else to do so.

Evaluate running from a shared binary

I'm not entirely sure, but it should be possible to have the multiple instances on the same server run from the same executable.

Right now the executable assumes a specific working directory where everything is self-contained. And I wouldn't be surprised if some parts of the code actually lookup paths relative to where the binary is stored as opposed to relative to the current working directory.

Status quo:

  • Moving the executable currently makes the process fail.
  • Not running it from the bot's own working directory also makes the process fail.

Assuming multiple instances of the same executable isn't infeasible in general, we can make it work for multiple instances either by standardising on utilizing the current working directory for everything, or (better yet) by passing the bot directory via command-line argument.

We'd need to make sure that argument gets preserved on respawn/restart.
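
A minimal sketch of the command-line variant, assuming the bot directory is passed as the first argument and falls back to the current working directory. The helper names ResolveBotDir and RestartArguments are hypothetical; CVNBot.ini is the configuration file name seen in the bot's logs.

```csharp
using System;
using System.IO;

static class BotDirSketch
{
    // Returns the bot directory: the first argument if given, else the working directory.
    public static string ResolveBotDir(string[] args, string cwd)
    {
        string dir = args.Length > 0 ? args[0] : cwd;
        return Path.GetFullPath(dir);
    }

    // Builds the argument string for a respawn so the bot directory is preserved.
    public static string RestartArguments(string exePath, string botDir)
    {
        return "\"" + exePath + "\" \"" + botDir + "\"";
    }

    static void Main(string[] args)
    {
        string botDir = ResolveBotDir(args, Directory.GetCurrentDirectory());
        // Every file lookup would be made relative to botDir, not to the executable.
        Console.WriteLine("Would load configuration from " + Path.Combine(botDir, "CVNBot.ini"));
        Console.WriteLine("Would respawn with: mono " + RestartArguments("CVNBot.exe", botDir));
    }
}
```

Resolving the directory once at startup and passing the absolute path on restart keeps the respawned process independent of whatever working directory it inherits.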

Implement support for adminbots

Currently a bot can't be an admin and an admin can't be a bot.
When a botadmin does, for example, mass-blocking of open proxies (like WindAdminBot on ru.wikipedia and RonaldB on nl.wikipedia), there are a few options:

Botadmin = admin (default). Actions are parsed (Autoblacklist occurs, and both the action and the list-addition are reported in the channel). The reporting in the channel causes flooding and most likely a delay in the feed (a huge message stack-up), and it will take long for the bot to catch up afterwards (if at all).

Botadmin = bot (possible right now). Actions are not parsed, because the first check in Program.ReactToRCEvent() is the is_bot check, which, if true, returns directly. No Autoblacklist, no reporting of anything in the channel.

Botadmin = bot & admin (requested feature). Actions are parsed, and a new check is added to Program.ReactToRCEvent() that causes botadmins to be acted upon in the backend (Autoblacklist) but without reporting back to the channel (ideally of both the action and the list-addition, but just the action would already make a significant difference). Hiding the list-addition success message as well may require a change in the AddToXX functions and/or listman.addUserToList().
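
The three options can be summarised as a small decision function. This is only an illustration of the requested behaviour: the Decide helper and its string results are hypothetical and do not mirror the actual structure of Program.ReactToRCEvent().

```csharp
using System;

static class BotAdminSketch
{
    // Returns what the bot would do with a log action by the given kind of user.
    public static string Decide(bool isBot, bool isAdmin)
    {
        if (isBot && isAdmin)
        {
            return "autoblacklist-only";   // requested: act in the backend, stay silent
        }
        if (isBot)
        {
            return "ignore";               // today: the is_bot check returns directly
        }
        return "autoblacklist-and-report"; // default handling
    }

    static void Main()
    {
        Console.WriteLine(Decide(true, true));
        Console.WriteLine(Decide(true, false));
        Console.WriteLine(Decide(false, true));
    }
}
```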


Imported from https://jira.toolserver.org/browse/SWMTBOT-25

  • Reporter: Krinkle
  • Created: 12 September 2010 21:48:49

Errors when an account is registered by another user

When a user creates another user account, CVNBot reports twice that the first account was registered.

Real example from today (March 14) on #cvn-wp-es:

20:49:09 <+CVNBot3> Usuario nuevo [[es:User:Fffgfdghfdgfdgf]] creado. Bloquear: http://es.wikipedia.org/wiki/Special:Blockip/Fffgfdghfdgfdgf
20:50:44 <+CVNBot3> Usuario nuevo [[es:User:Fffgfdghfdgfdgf]] creado. Bloquear: http://es.wikipedia.org/wiki/Special:Blockip/Fffgfdghfdgfdgf

While CVNBot3 should have reported:

20:49:09 <+CVNBot3> Usuario nuevo [[es:User:Fffgfdghfdgfdgf]] creado. Bloquear: http://es.wikipedia.org/wiki/Special:Blockip/Fffgfdghfdgfdgf
20:50:44 <+CVNBot3> Usuario nuevo [[es:User:Fffgfdghfdgfdg]] creado. Bloquear: http://es.wikipedia.org/wiki/Special:Blockip/Fffgfdghfdgfdg

(Note the final "f"; I know this is not the clearest example, but it is a real one, a vandalism.)

You can check these registrations on https://es.wikipedia.org/wiki/Especial:Registro/Fffgfdghfdgfdgf.

Thanks in advance.

Command "reload" broken due to HTTPS

Since HTTPS is enforced on Wikimedia wikis, the reload command seems to be broken.

For example:

<Krinkle> CVNBot8 reload wuu.wikipedia
<CVNBot8> Unable to reload: Unable to retrieve http://wuu.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=namespaces&format=xml from server. Error was: Error getting response stream (Write: The authentication or decryption has failed.): SendFailure

Cannot load web resources because "SecureChannelFailure#012" occurred

Oct 11 17:02:13 cvn-app9 CVNBot.exe[1486]: ERROR [Main] CVNBot.Program [CVNBot27] Add project failed#012System.Exception: Unable to retrieve https://hr.wikipedia.org/w/api.php?format=xml&action=query&meta=siteinfo&siprop=namespaces from server. Error was: Error: SecureChannelFailure (The authentication or decryption has failed.)#12 at CVNBot.CVNBotUtils.GetRawDocument (System.String url) [0x0006c] in <41b7a280f8604822832dac88d2357f10>:0 #12 at CVNBot.Project.GetNamespaces (System.Boolean snamespacesAlreadySet) [0x0002c] in <41b7a280f8604822832dac88d2357f10>:0 #12 at CVNBot.Project.RetrieveWikiDetails () [0x00000] in <41b7a280f8604822832dac88d2357f10>:0 #12 at CVNBot.ProjectList.AddNewProject (System.String projectName, System.String interwiki) [0x0022e] in <41b7a280f8604822832dac88d2357f10>:0 #12 at CVNBot.Program.Irc_OnChannelMessage (System.Object sender, Meebey.SmartIrc4net.IrcEventArgs e) [0x00596] in <41b7a280f8604822832dac88d2357f10>:0

Release 1.22

  • Promote master branch to 1.22.0-beta.1 (this allows us to distinguish with BotName version on IRC from 1.22.0-alpha).
  • Upgrade 1 CVNBot to the latest version.
    https://github.com/countervandalism/infrastructure/blob/master/tasks.yaml#L98
  • Slowly upgrade the other CVNBot instances on cvn-app servers to the latest version.
  • In case of any issues, fix the bugs, increase beta version, and start again.
  • Once applied to all instances without issues for a week, update change log, and bump master branch to 1.22.0 (stable).
  • Create tag v1.22.0 in git.
  • Bump master branch to 3.0.0-alpha.

Deployment for 1.22.0-beta.1

  • CVNBot18 (cvn-app8)

Deployment for 1.22.0-beta.2

  • cvn-app8
    • Cubbie
    • CVNBot1
    • CVNBot2
    • CVNBot3
    • CVNBot4
    • CVNBot5
    • CVNBot18

Deployment for 1.22.0-beta.3

  • cvn-app8
    • Cubbie
    • CVNBot1
    • CVNBot2
    • CVNBot3
    • CVNBot4
    • CVNBot5
    • CVNBot12
    • CVNBot13
    • CVNBot14
    • CVNBot18
    • CVNBot20
    • CVNBot21
    • SWBot3
  • cvn-app9
    • CVNBot6
    • CVNBot7
    • CVNBot8
    • CVNBot9
    • CVNBot10
    • CVNBot16
    • CVNBot17
    • CVNBot19
    • CVNBot22
    • CVNBot23
    • CVNBot24
    • CVNBot25
    • CVNBot26
    • CVNBot27
    • CVNBot28
    • CVNBot29

Unable to reload - returns 404

Same as #20 and #25 but this time it is not fetching api.php but:

[17:41:41] <Hauskatze> CVNBot5 reload meta.wikimedia
[17:41:41] <CVNBot5> Unable to reload: Unable to retrieve https://meta.wikimedia.org/w/index.php?title=MediaWiki:Undeletedarticle&action=raw&usemsgcache=yes from server. Error was: The remote server returned an error: (404) Not Found.

Indeed visiting https://meta.wikimedia.org/w/index.php?title=MediaWiki:Undeletedarticle&action=raw&usemsgcache=yes returns 404, while https://meta.wikimedia.org/w/index.php?title=MediaWiki:Undeletedarticle does exist.

Avoid adding duplicate keys to the message attributes

Errors like "Key duplication when adding: watchword" are popping up fairly often in the debug broadcast.

I believe they are caused by attribs.Add("watchword", ***) occurring twice (or more) during a single ReactToRCEvent() call.

This should be fairly easy to fix: check attribs.ContainsKey("watchword") and, if true, call attribs.Remove("watchword") before adding another.
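
As a minimal sketch of an alternative to the check-and-remove approach: assigning through the indexer overwrites an existing key instead of throwing. The example uses Dictionary<string, string> for illustration (the actual type of attribs is not shown here), and SetAttrib is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;

static class AttribsSketch
{
    // Sets a key without throwing when it already exists (the indexer overwrites).
    public static void SetAttrib(Dictionary<string, string> attribs, string key, string value)
    {
        attribs[key] = value;
    }

    static void Main()
    {
        var attribs = new Dictionary<string, string>();
        SetAttrib(attribs, "watchword", "first");
        // A second attribs.Add("watchword", ...) would throw ArgumentException here.
        SetAttrib(attribs, "watchword", "second");
        Console.WriteLine(attribs["watchword"]);
    }
}
```

Either way the last value wins; the open question is whether the second watchword should replace the first or be ignored.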


Imported from https://jira.toolserver.org/browse/SWMTBOT-28

  • Reporter: Krinkle
  • Created: 21 September 2010

Investigate memory leak

The bots build up memory usage steadily while running. Over a week's time of running half a dozen bots, it accumulates to several gigabytes of memory being allocated.

Consider hiding deletion events in CVN channels

There are sometimes many deletion log actions by an admin who is not a bot. Maybe it is scripted, or maybe there is just a lot to delete.

(From #countervandalism earlier today)

loftyabyss: bots could possibly handle repeated actions better... bswiki has had deletions for hours, so -sw can't be used for much else in the meantime...

loftyabyss: Krinkle: deleter wasn't a bot, but was clearly running some kind of script, but I thought maybe the cvn bots could ignore repeated actions, or perhaps ignore those on the whitelist, I tried adding them but they still appeared in the channel

Current state:

  • CVNBot hides all actions by users on its bot list by default.
  • CVNBot hides all actions made with a "flood" or "bot" flag by default.
  • CVNBot supports showing block actions, delete actions and other log actions by admins.
  • For each type of action (edit, new page, new account, delete page, protect page, block user, etc.) it is possible to decide when to show it:
    • 1= Show always.
    • 2= Hide for trusted users (local admins, local bots, global whitelist), but show the rest.
    • 3= Hide always.

For log actions like "delete page", the different ways to show it do not make much difference, since generally the only users able to delete pages are admins. So hiding them for trusted users is almost the same as hiding always.
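
To make the point concrete, here is a small sketch of the three levels as a function. ShouldShow is a hypothetical helper, and treating any unknown level as "hide" is an assumption of the sketch.

```csharp
using System;

static class ShowLevelSketch
{
    // level: 1 = show always, 2 = hide for trusted users, 3 = hide always.
    public static bool ShouldShow(int level, bool userIsTrusted)
    {
        switch (level)
        {
            case 1: return true;
            case 2: return !userIsTrusted;
            default: return false; // assumption: unknown levels behave like 3
        }
    }

    static void Main()
    {
        // A deletion is made by an admin, so the user is trusted:
        // levels 2 and 3 give the same result.
        Console.WriteLine(ShouldShow(2, true) == ShouldShow(3, true));
    }
}
```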

So the questions are:

  • Is it sometimes useful for CVNBot to report page deletions?
  • Is it useful in #cvn-sw?
  • Is it useful in other channels?

The bot is configured for each channel separately, so we can change this one at a time. There is no need for all channels to agree :)

Trim whitespace when dealing with list manager

Just a simple one, but I don't know right now at which locations in the code it's done best.

When copying/pasting, ignoring whitespace around the input prevents some confusion and speeds up the process, both when requesting info and when inserting/deleting/updating info.

In all cases "CVNBot xx yy John" should be treated equally as "CVNBot xx yy John ".

  • trim() the input in the Command-thing
  • Before putting it live, check a database dump to get an idea of whether there are un-trimmed items in the database, and how many. These should either be fixed or, alternatively, be deleted from the database, since they don't work in practice anyway because the wiki trims whitespace as well.

Imported from: https://jira.toolserver.org/browse/SWMTBOT-12

  • Reporter: Krinkle
  • Created: 13 June 2010 19:10:59

Fix broken handling of newusers/create2 events

It seems the pattern for these log messages changed at some point in 2009, and the logs have been full of errors about them since.

# Project.cs
            rCreate2Regex = new Regex( namespaces["2"]+@":(.*): \[\[" );

# RCReader.cs
            Match mc2 = ((Project)Program.prjlist[rce.project]).rCreate2Regex.Match(rce.comment);

Example of actual irc.wikimedia.org entries:

# 2016-11-02 on #nl.wikipedia

[[Speciaal:Log/newusers]] create2  * BRPots *  created new account Gebruiker:BRPwiki: eerder fout gemaakt

# 2016-11-02 on #nl.wikipedia
[[Speciaal:Log/newusers]] create2  * Sherani koster *  created new account Gebruiker:Rani farah koster

# 2017-10-13 on #en.wikipedia:
[[Special:Log/newusers]] create2  * Ujju.19788 *  created new account User:Upendhare

The MediaWiki legacy-irc message for this is newuserlog-create2-entry which contains:

created new account $1

This was changed in 2009 (diff), before that it was:

created account for $1

I guess the user name used to be wrapped in [[ and ]], but no longer is. This makes it slightly more difficult to extract, given that the log comment is also optional and appended after an (optional) colon following the username. But I guess colons are illegal in usernames, so changing the pattern as follows should work.

-            rCreate2Regex = new Regex( namespaces["2"]+@":(.*): \[\[" );
+            rCreate2Regex = new Regex( namespaces["2"]+@":([^:]+)" );
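
A quick check of this idea against the sample entries above, as a sketch: it uses a capture group of one or more non-colon characters, ([^:]+), after the namespace prefix. ExtractNewUser is a hypothetical helper, and wrapping the namespace name in Regex.Escape is an addition of the sketch, not part of the existing code.

```csharp
using System;
using System.Text.RegularExpressions;

static class Create2Sketch
{
    // userNamespace is the localized "User" namespace name, e.g. "Gebruiker" on nl.wikipedia.
    public static string ExtractNewUser(string userNamespace, string comment)
    {
        var rCreate2 = new Regex(Regex.Escape(userNamespace) + @":([^:]+)");
        Match m = rCreate2.Match(comment);
        return m.Success ? m.Groups[1].Value : null;
    }

    static void Main()
    {
        // With an optional log comment after the user name:
        Console.WriteLine(ExtractNewUser("Gebruiker",
            "created new account Gebruiker:BRPwiki: eerder fout gemaakt"));
        // Without a log comment:
        Console.WriteLine(ExtractNewUser("Gebruiker",
            "created new account Gebruiker:Rani farah koster"));
        Console.WriteLine(ExtractNewUser("User",
            "created new account User:Upendhare"));
    }
}
```

The capture stops at the first colon after the namespace prefix, so the optional log comment is left out of the user name.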

Fix unmatched block type errors

Jan  6 14:58:57 cvn-app8 CVNBot.exe[2312]: WARN  [RCReader] CVNBot.RCReader [SWBot3] 
Unmatched block type in simple.wikipedia:
#00314[[#00307Special:Log/block#00314]]#0034 block#00310 #00302#003 #0035*#003 #00303<admin>#003 #0035*#003  #00310blocked [[#00302User:<target>#00310]] with an expiration time of 3 months (anonymous users only, account creation blocked): <comment>#003

Jan  6 13:52:08 cvn-app8 CVNBot.exe[15307]: WARN  [RCReader] CVNBot.RCReader [Cubbie] 
Unmatched block type in commons.wikimedia:
#00314[[#00307Special:Log/block#00314]]#0034 block#00310 #00302#003 #0035*#003 #00303<admin>55#003 #0035*#003  #00310blocked [[#00302<target>#00310]] with an expiration time of indefinite (account creation blocked, cannot edit own talk page): <comment>#003

Jan  6 14:46:34 cvn-app8 CVNBot.exe[15307]: WARN  [RCReader] CVNBot.RCReader [Cubbie] 
Unmatched block type in commons.wikimedia: #00314[[#00307Special:Log/block#00314]]#0034 block#00310 #00302#003 #0035*#003 #00303<admin>#003 #0035*#003  #00310blocked [[#00302<target>1#00310]] with an expiration time of indefinite (account creation blocked): <comment>#003

Might be related to df39aaa

Add user to bot list without removing from whitelist

Currently, if a sysop on the whitelist is flooding a channel with deletions, we'd have to do something like this.

Savh: CVNBot10 bot add Alan p=uz.wiktionary x=10 r=flooding
CVNBot10: Alan is already on whitelist, cannot add to bot list
Savh: CVNBot10 wl del Alan
CVNBot10: Deleted Alan from global whitelist
Savh: CVNBot10 bot add Alan p=uz.wiktionary x=10 r=flooding
CVNBot10: Added: Alan is on uz.wiktionary bot list, added by Savh until 02:05, 18 January 2015 ("flooding")
Savh: CVNBot10 wl add Alan
CVNBot10: Added: Alan is on global whitelist, added by Savh until the end of time ("No reason given")
Savh: CVNBot10 wl add Alan r=Trusted
CVNBot10: Updated: Alan is on global whitelist, added by Savh until the end of time ("Trusted")

It would be nice if we could add a user to a botlist without removing from whitelist.


Similar issue with admin list: #26

Slashes in titles

I've confirmed in several channels (#cvn-wp-en and #cvn-wp-nl) that pages that contain a slash in their title (e.g. "Wikipedia:Sandbox/Subpage" or "Sand/box") are NOT reported by the bot. I've been unable to find in the source code why this is.

Though I can't say for sure, I think it's got something to do with line 47 in RCReader.cs:

static Regex fullString = 
new Regex(@"^\x03" + @"14\[\[\x03" + @"07(?<title>.+?)\x03" + @"14\]\]\x03" + @"4 (?<flag>.*?)\x03" + @"10 \x03"
 + @"02(?<url>.*)\x03 \x03" + @"5\*\x03 \x03" + @"03(?<user>.*?)\x03 \x03" + @"5\*\x03 (?<szdiff>.*?) \x03" + @"10(?<comment>.*)\x03$");

A subpage edit in irc.wikimedia.org/#en.wikipedia looks like the following (without the colors, but check the irc.wikimedia.org-channel to see them)

rc:
[[User talk:82.74.192.60/Archive]] http://en.wikipedia.org/w/index.php?diff=364691190&oldid=364691134 * 82.74.192.60 * (+2256) test

I'm not a master of regexes, so I didn't see anything wrong with it at first sight, but someone else might understand why it would fail on a "/".

Note: Deletions of slash-containing-pagetitles do get reported by the bot (ie. Admin Krinkle deleted [[commons:Test/Sub]]).

Also note that with all Log-events the "page" in the irc.wikimedia stream is [[Special:Log/thing]], so it doesn't completely fail at slashes.

But edits and creations of such titles seem not to be reported by the bot under any circumstances (anonymous users, watched pages, blacklisted users, large edits, anything).
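
One way to narrow this down is to run the fullString pattern quoted above against a synthetic feed line. The sketch below rebuilds a line from the colour-code layout that the pattern itself expects (the BuildLine helper and the exact layout are assumptions derived from the regex, not a captured feed line) and checks whether a title containing a slash is matched.

```csharp
using System;
using System.Text.RegularExpressions;

static class FullStringSketch
{
    // Same pattern as quoted above from RCReader.cs.
    static readonly Regex fullString =
        new Regex(@"^\x03" + @"14\[\[\x03" + @"07(?<title>.+?)\x03" + @"14\]\]\x03" + @"4 (?<flag>.*?)\x03" + @"10 \x03"
        + @"02(?<url>.*)\x03 \x03" + @"5\*\x03 \x03" + @"03(?<user>.*?)\x03 \x03" + @"5\*\x03 (?<szdiff>.*?) \x03" + @"10(?<comment>.*)\x03$");

    // Builds a synthetic line with an empty flag field, in the layout the pattern expects.
    public static string BuildLine(string title, string url, string user, string szdiff, string comment)
    {
        const string C = "\u0003"; // mIRC colour control character
        return C + "14[[" + C + "07" + title + C + "14]]" + C + "4 " + C + "10 "
            + C + "02" + url + C + " " + C + "5*" + C + " "
            + C + "03" + user + C + " " + C + "5*" + C + " "
            + szdiff + " " + C + "10" + comment + C;
    }

    // Returns the captured title, or null when the line does not match.
    public static string MatchTitle(string line)
    {
        Match m = fullString.Match(line);
        return m.Success ? m.Groups["title"].Value : null;
    }

    static void Main()
    {
        string line = BuildLine("User talk:82.74.192.60/Archive",
            "http://en.wikipedia.org/w/index.php?diff=364691190&oldid=364691134",
            "82.74.192.60", "(+2256)", "test");
        Console.WriteLine(MatchTitle(line) ?? "(no match)");
    }
}
```

If the synthetic line matches with the slash intact, the pattern itself is probably not what drops these edits, and the cause would have to be looked for in later processing. That only holds as far as the synthetic line resembles the real feed.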


Imported from http://meta.wikimedia.org/w/index.php?title=Talk:Countervandalism_Network/CVNbot&oldid=1994784#Slashes_in_titles


Imported from https://jira.toolserver.org/browse/SWMTBOT-6

  • Reporter: Krinkle
  • Created: 03 June 2010 13:51:14

#cvn-wikidata loses RCReader connection

It seems for weeks now, the CVNBot for cvn-wikidata (CVNBot18) keeps losing its irc.wikimedia.org connection. That is, within minutes or hours of restarting the bot, we stop receiving messages indefinitely. After this amount of time, something happens after which it presumably is no longer connected to irc.wikimedia.org. I say presumably since the logs say it is connected. But from my own observations, it is not in any channel, and /whois says "No such nickname online".

During this time, the logs are filled with repeated messages every minute stating that it has connected to irc.wikimedia.org, but there are no signs of any errors or any other explanation for what happened.

Jun 22 10:21:41 cvn-app8 CVNBot.exe[21520]: INFO  [Main] CVNBot.Program [CVNBot18] [1997kB] ordered a restart
Jun 22 10:21:42 cvn-app8 CVNBot.exe[21520]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org
Jun 22 10:21:42 cvn-app8 CVNBot.exe[21520]: INFO  [Main] CVNBot.Program [CVNBot18] Executing: mono /srv/cvn/services/cvnbot/CVNBot18/CVNBot.exe
Jun 22 10:21:42 cvn-app8 CVNBot.exe[11922]: INFO  [Main] CVNBot.Program [CVNBot18] Loaded main configuration from /srv/cvn/services/cvnbot/CVNBot18/CVNBot.ini
Jun 22 10:21:42 cvn-app8 CVNBot.exe[11922]: INFO  [Main] CVNBot.Program [CVNBot18] Loading messages from ./Console.msgs
Jun 22 10:21:42 cvn-app8 CVNBot.exe[11922]: INFO  [Main] CVNBot.ProjectList [CVNBot18] Reading projects from ./Projects.xml
Jun 22 10:21:42 cvn-app8 CVNBot.exe[11922]: INFO  [Main] CVNBot.Program [CVNBot18] Connected to irc.libera.chat
Jun 22 10:21:42 cvn-app8 CVNBot.exe[11922]: INFO  [Main] CVNBot.Program [CVNBot18] Joining feed channel: #cvn-wikidata
Jun 22 10:21:42 cvn-app8 CVNBot.exe[11922]: INFO  [Main] CVNBot.Program [CVNBot18] Joining control channel: #cvn-bots
Jun 22 10:21:42 cvn-app8 CVNBot.exe[11922]: INFO  [Main] CVNBot.Program [CVNBot18] Joining broadcast channel: #cvn-broadcast
Jun 22 10:21:42 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Thread started
Jun 22 10:21:42 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org
Jun 22 10:21:42 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Joining 1 channels
Jun 22 10:21:52 cvn-app8 CVNBot.exe[11922]: INFO  [Thread Pool Worker] CVNBot.ListManager [CVNBot18] Tim threw away 74 items
Jun 22 10:43:13 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org
Jun 22 10:44:38 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org
Jun 22 10:45:42 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org
Jun 22 10:46:46 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org
Jun 22 10:47:50 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org
Jun 22 10:48:54 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org
Jun 22 10:49:58 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org
Jun 22 10:51:02 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org
Jun 22 10:52:06 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org
Jun 22 10:53:10 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org
Jun 22 10:54:14 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org
Jun 22 10:55:18 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org
Jun 22 10:56:22 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org
Jun 22 10:57:26 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org
Jun 22 10:58:30 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org
Jun 22 10:59:34 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org
Jun 22 11:00:38 cvn-app8 CVNBot.exe[11922]: INFO  [RCReader] CVNBot.RCReader [CVNBot18] Connected to irc.wikimedia.org

Settings to suppress database connection ListManager.ClassifyEditor (CVNBlackRock)

Going to build in a setting (boolean) which when enabled will prevent the bot from opening a database connection when trying to get a usertype in ListManager.ClassifyEditor() (which will then fallback to the regex to return 3 (anon) or 4 (user)).

The reason for this is the CVNBlackRock bot, a bot that monitors a lot of big and medium wikis that don't want a channel or don't have one yet (running SWMTBot). It exists simply to close the gap in reading block events for Autoblacklisting and greylist triggers for Autogreylisting.
CVNBlackRock doesn't have any feed channel output, and together with this setting it will be super fast (no output queue, no delay waiting for the database) and will just broadcast to other bots.


Imported from https://jira.toolserver.org/browse/SWMTBOT-30

  • Reporter: Krinkle
  • Date: 26 September 2010 02:14:57

Evaluate whether custom message queue is still needed

Since the upgrade of Meebey.SmartIrc4net from 0.4 to 1.1 (e2d92e7), I think our custom message queue has become obsolete.

I disabled some of the hackier parts that caused warnings in the new compiler after the upgrade, but I suspect the remaining custom code can be removed as well.

We currently have our own custom message queue thread, that continuously polls the high-priority and low-priority queues to send messages. And the thread is also disabled/resumed by our custom code based on how many bytes we've sent recently and when the last PING/PONG happened.

As far as I can see, this is all handled by SmartIrc4net and no longer requires us doing this. Would be good to try and get rid of it and see what happens.

Secondary account creations are displayed as if the creator's account is being created

User:A is registered already and creates an account for User:B. CVNBot will however display that User:A has been created.

  • Recent example: <CVNBot5> New user [[m:User:JGulingan (WMF)]] created. Block: https://meta.wikimedia.org/wiki/Special:Blockip/JGulingan_%28WMF%29

  • Real log: (User creation log); 22:26 . . User account EYildrim (WMF) (talk | contribs | block) was created by JGulingan (WMF) (talk | contribs | block) and password was sent by email ‎(WMF Fellow)

  • Suggested output: New user $new_user was created by $creator. Block: $blocklink

Thank you.

Ignore multiple spaces between user name and expiry in list commands

# Without expiry, fine.
CVNBot14 bl add 71.237.252.43  r=vandalism on en.wikipedia
<•CVNBot14> Added: 71.237.252.43 is on global blacklist, added by Krinkle until 02:38, 27 May 2018 ("vandalism on en.wikipedia")

# With two spaces before "x=", adds it as user name, bug!
CVNBot14 bl add 71.237.252.43  x=7440  r=vandalism on en.wikipedia
 <•CVNBot14> Added: 71.237.252.43  x=7440 is on global blacklist, added by Krinkle until 02:39, 27 May 2018 ("vandalism on en.wikipedia")

# With one space before "x=", works as expected.
<•Krinkle> CVNBot14 bl add 71.237.252.43 x=5000 r=vandalism on en.wikipedia
<•CVNBot14> Updated: 71.237.252.43 is on global blacklist, added by Krinkle until 10:43, 20 November 2018 ("vandalism on en.wikipedia")

# With two spaces before "x=", while bad, worked for "bl add", but fails for "bl del", double bug!
CVNBot14 bl del 71.237.252.43  x=7440
<•CVNBot14> Deleted 71.237.252.43 from global blacklist

This means they can be added, but not removed. Anyway, we should fix it so that multiple spaces are ignored.

The problem is at https://github.com/countervandalism/CVNBot/blob/6c60cd4bd1d0f6d8a67847b91ae7fa89570ee827/src/CVNBot/ListManager.cs#L30

^(?<cmd>add|del|show|test) (?<item>.+?)(?: p=(?<project>\S+?))?(?: x=(?<len>\d{1,4}))?(?: r=(?<reason>.+?))?$            

It only considers x= as expiry if there is no space after its value, or if there is exactly one space between its value and r=. Given there were multiple spaces, it was not matched, and was instead consumed by the <item> match.
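
A possible fix, sketched under the assumption that the pattern can simply be relaxed: each single space before a parameter becomes " +", and " *" is allowed before the end of the input. The Parse helper and its array result are hypothetical and only there to make the sketch testable.

```csharp
using System;
using System.Text.RegularExpressions;

static class ListCommandSketch
{
    // Variant of the pattern above that tolerates runs of spaces between parts.
    static readonly Regex rCommand = new Regex(
        @"^(?<cmd>add|del|show|test) +(?<item>.+?)(?: +p=(?<project>\S+?))?(?: +x=(?<len>\d{1,4}))?(?: +r=(?<reason>.+?))? *$");

    // Returns { cmd, item, len, reason }, or null when the input does not match.
    public static string[] Parse(string input)
    {
        Match m = rCommand.Match(input.Trim());
        if (!m.Success)
        {
            return null;
        }
        return new[]
        {
            m.Groups["cmd"].Value, m.Groups["item"].Value,
            m.Groups["len"].Value, m.Groups["reason"].Value
        };
    }

    static void Main()
    {
        string[] p = Parse("add 71.237.252.43  x=7440  r=vandalism on en.wikipedia");
        Console.WriteLine("item=[" + p[1] + "] len=[" + p[2] + "] reason=[" + p[3] + "]");
    }
}
```

Because the item group is lazy and every separator accepts one or more spaces, the item no longer absorbs the extra space or the x= parameter, and the same parsing applies to "bl del".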

Allow a global bot list

CVNBot6: There are no global admins/bots. Specify a project with the p= parameter.

Please add support for a global bot list. Not only do global bots exist, but sometimes it's good to have some individuals temporarily ignored by the bots to avoid flooding, like when we perform global renames.

Adding to this list should be possible without needing to remove the user from any other list they are on (see #18 & #26).

Thank you!

Evaluate whether userlist-thread lock is working

Gendarme provides various warnings for the following type of issue:

190. WriteStaticFieldFromInstanceMethodRule
* Target:   System.Void CVNBot.ListManager::AddGroupToList()
* Details:  The static field 'currentGetThreadWiki', of type 'System.String'. is being set in an instance method.

..

192. WriteStaticFieldFromInstanceMethodRule
* Target:   System.String CVNBot.ListManager::ConfigGetAdmins(System.String)
* Details:  The static field 'currentGetThreadWiki', of type 'System.String'. is being set in an instance method.

199. WriteStaticFieldFromInstanceMethodRule

Problem: This instance method writes to static fields. This may cause problem with multiple instances in multithreaded applications.
* Severity: Medium, Confidence: High
* Target:   System.Void CVNBot.ListManager::BatchGetAllAdminsAndBots()
* Source:   debugging symbols unavailable, IL offset 0x010e
* Details:  The static field 'currentGetThreadMode', of type 'System.String'. is being set in an instance method.

Solution: Move initialization to the static constructor or ensure appropriate locking.
More info available at: https://github.com/spouliot/gendarme/wiki/Gendarme.Rules.Concurrency.WriteStaticFieldFromInstanceMethodRule(2.10)

I'm not entirely convinced this logic is actually working. At least in case of BatchGetAllAdminsAndBots it seems like the state is checked, but then it continues regardless.

We should see if there's a better way to deal with this.

Add support for revision-delete log actions

We already show various log actions (block, page deletion etc.). It'd be nice to also add a subscription for revision deletion and other visibility changes.

Sample:

Regular page deletion (already handled):

12:46 rc-pmtpa: [[Special:Log/delete]] delete * Krinkle *  deleted "[[MediaWiki:ArquivarPEs.js]]": Test

Revision deletion / visibility change (not yet handled):

12:46 rc-pmtpa: [[Special:Log/delete]] revision * Krinkle *  Krinkle changed visibility of a revision on page [[Main Page]]: content hidden and edit summary hidden

Code:

/src/CVNBot/RCReader.cs#L344-L347

We'll need to figure out a way to parse the log summary from the localisation messages so that #cvn-sw, for example, shows all changes in the same language (e.g. English).

We'll also want inclusion of this event to be configurable as #cvn-sw might not actually want to include this in order to reduce noise.
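
As a rough illustration of the parsing and the opt-in setting, the Python sketch below splits a log line into its parts and checks a per-channel allow-list. It assumes the plain-text shape of the two samples above; the real feed may carry colour codes or different spacing. `LOG_LINE`, `ENABLED_LOG_ACTIONS`, `parse_log_line`, and `should_report` are hypothetical names, not part of CVNBot. The sketch keys on the action field (`delete`, `revision`) rather than on the localised summary; whether that is enough to report everything in one language is an open assumption.

```python
import re

# Assumed plain-text shape of a log entry, based on the two samples above:
#   [[Special:Log/<type>]] <action> * <user> *  <summary>
LOG_LINE = re.compile(
    r"\[\[Special:Log/(?P<log_type>[^\]]+)\]\] "
    r"(?P<action>\S+) \* (?P<user>.+?) \*\s+(?P<summary>.*)"
)

# Hypothetical per-channel setting: (log type, action) pairs to report.
ENABLED_LOG_ACTIONS = {("delete", "delete"), ("delete", "revision")}


def parse_log_line(line):
    """Return the parts of a log entry as a dict, or None if it is not one."""
    match = LOG_LINE.search(line)
    return match.groupdict() if match else None


def should_report(entry, enabled=ENABLED_LOG_ACTIONS):
    """Report an entry only if its (log type, action) pair is enabled."""
    return (entry["log_type"], entry["action"]) in enabled
```

A channel that wants less noise could then drop `("delete", "revision")` from its own set.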


Originally requested by Moira2 (#cvn-wp-nl / #wikipedia-nl).

Support for IP ranges in block lists

As mike suggested in 2008, support for IP ranges should be implemented.

At the very least this should cover the ranges MediaWiki supports when blocking, but I think it would be better to accept a more flexible start-end range and, as a bonus, also accept the ranges MediaWiki supports.

For example, the following would then be possible:

  • CVNBot bl add 127.0.0.1-127.0.1.214

And to make it compatible with autoblacklistings that follow a range block:

  • CVNBot bl add 127.0.0.0/25 x=0 r=Autoblacklist: Repeated vandalism, {{schoolblock}} on commons.wikimedia

which the software would convert to

  • CVNBot bl add 127.0.0.0-127.0.0.127

See also:
https://commons.wikimedia.org/wiki/User:Kanonkas/Tools#Range_block_calculators

Then, when performing the "intel" or "bl show" command, it could return something like this:

  • CVNBot intel 127.0.0.50
    CVNBot: 127.0.0.50 in range 127.0.0.0-127.0.0.127 is on global blacklist, added by Krinkle until the end of time ("Autoblacklist: Repeated vandalism, {{schoolblock}} on commons.wikimedia")
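
The conversion and lookup described above can be sketched with Python's standard `ipaddress` module. This is only an illustration of the arithmetic, not a proposal for the C# implementation; `parse_range`, `format_range`, and `in_range` are hypothetical helper names.

```python
import ipaddress


def parse_range(spec):
    """Turn '127.0.0.0/25' or '127.0.0.1-127.0.1.214' into (first, last)."""
    if "/" in spec:
        # CIDR notation: the first and last addresses of the network.
        network = ipaddress.ip_network(spec, strict=False)
        return network[0], network[-1]
    if "-" in spec:
        first, last = (
            ipaddress.ip_address(part.strip()) for part in spec.split("-", 1)
        )
        if first > last:
            raise ValueError("range start is after range end: " + spec)
        return first, last
    # A single address is treated as a range of one.
    address = ipaddress.ip_address(spec)
    return address, address


def format_range(first, last):
    """Render a range in the start-end form used in the examples above."""
    return "{}-{}".format(first, last)


def in_range(ip, first, last):
    """True if `ip` lies within the inclusive range (same IP version assumed)."""
    return first <= ipaddress.ip_address(ip) <= last
```

With these helpers, `format_range(*parse_range("127.0.0.0/25"))` gives `127.0.0.0-127.0.0.127`, and `in_range("127.0.0.50", ...)` is true for that range.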

Imported from https://jira.toolserver.org/browse/SWMTBOT-7

  • Reporter: Krinkle
  • Created: 03 June 2010

Implement MySQL support

SQLite performs poorly at scale, especially the way CVNBot uses it right now: the .NET/C# implementation essentially reopens and closes a 30MB+ file on every query, and thus for every packet received from the recent changes feeds.

There is also a ton of complexity right now from the 25 CVNBot instances trying to sync their lists through a private "broadcast" IRC channel. This federation feature is cool and super useful when the bots run on different servers. But for many years now we've consolidated them all under Wikimedia Cloud so they can totally use a shared database instead. The broadcasting feature could then be turned off in our primary deployment.

Protection flags displayed inside article name

<CVNBot5> Admin [[m:User:Xaosflux]] Protected [[m:User talk:A-Chinese-Wikipedian ‎[edit=autoconfirmed] (expires 13:58, 23 August 2018 (UTC))‎[move=autoconfirmed] (expires 13:58, 23 August 2018 (UTC))]] "Persistent vandalism"

The closing ]] of the article link should come at the end of the article name, before the protection flags.
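
A minimal Python sketch of one way to separate the title from the flags, assuming the flags always begin with a bracketed `action=level` group and may be preceded by whitespace and a left-to-right mark (U+200E), as in the sample above. `PROTECTION_FLAGS` and `split_title_and_flags` are hypothetical names. The heuristic relies on square brackets not being valid in MediaWiki page titles.

```python
import re

# Assumption: protection details start at the first "[<action>=<level>]"
# group, optionally preceded by spaces and a left-to-right mark (U+200E).
PROTECTION_FLAGS = re.compile(r"[\s\u200e]*(\[\w+=[\w-]+\].*)$")


def split_title_and_flags(text):
    """Split 'Title [edit=...] (expires ...)' into (title, flags)."""
    match = PROTECTION_FLAGS.search(text)
    if match is None:
        return text, ""
    return text[: match.start()], match.group(1)
```

The report could then be built as `[[m:` + title + `]]` followed by the flags, so that only the page name sits inside the link.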

Fatal error on quit

After running the command CVNBot1 quit in #cvn-bots, CVNBot1 quit as expected. But Cubbie also quit at the same time, which seems rather suspicious.

Syslog for Cubbie:

Jan  6 16:53:01 cvn-app8 CVNBot.exe[15307]: FATAL [Main] CVNBot.Program [Cubbie] Error occurred in Main IRC try clause!#012
System.TypeLoadException: Failure has occurred while loading a type.#012
  at Meebey.SmartIrc4net.IrcClient._Event_PRIVMSG (Meebey.SmartIrc4net.IrcMessageData ircdata) [0x0004f] in <10085ae304b34015a0d4b345e6c4c3d5>:0 #012
  at Meebey.SmartIrc4net.IrcClient._HandleEvents (Meebey.SmartIrc4net.IrcMessageData ircdata) [0x00162] in <10085ae304b34015a0d4b345e6c4c3d5>:0 #012  at Meebey.SmartIrc4net.IrcClient._Worker (System.Object sender, Meebey.SmartIrc4net.ReadLineEventArgs e) [0x0000e] in <10085ae304b34015a0d4b345e6c4c3d5>:0 #012  at (wrapper delegate-invoke) <Module>:invoke_void_object_ReadLineEventArgs (object,Meebey.SmartIrc4net.ReadLineEventArgs)#012  at Meebey.SmartIrc4net.IrcConnection.ReadLine (System.Boolean blocking) [0x000b7] in <10085ae304b34015a0d4b345e6c4c3d5>:0 #012  at Meebey.SmartIrc4net.IrcConnection.Listen (System.Boolean blocking) [0x0000e] in <10085ae304b34015a0d4b345e6c4c3d5>:0 #012  at Meebey.SmartIrc4net.IrcConnection.Listen () [0x00001] in <10085ae304b34015a0d4b345e6c4c3d5>:0 #012
  at CVNBot.Program.Main () [0x009bc] in <b57dca1001394178a439d85f73850da0>:0

Possibly related, when I gave SWBot3 the command to quit, it did leave, but it also logged a fatal error while doing so:

Jan  6 16:53:31 cvn-app8 CVNBot.exe[2312]: INFO  [Main] CVNBot.Program [SWBot3] Krinkle ordered a quit
Jan  6 16:53:31 cvn-app8 CVNBot.exe[2312]: FATAL [Main] CVNBot.Program [SWBot3] Error occurred in Main IRC try clause!#012
System.BadImageFormatException: type 0x00 not handled in do_mono_metadata_parse_type on image /srv/cvn/services/cvnbot/SWBot3/CVNBot.exe#012
File name: 'CVNBot'#012
  at CVNBot.Program.Irc_OnChannelMessage (System.Object sender, Meebey.SmartIrc4net.IrcEventArgs e) [0x00330] in <b57dca1001394178a439d85f73850da0>:0 #012  at Meebey.SmartIrc4net.IrcClient._Event_PRIVMSG (Meebey.SmartIrc4net.IrcMessageData ircdata) [0x0004f] in <10085ae304b34015a0d4b345e6c4c3d5>:0 #012  at Meebey.SmartIrc4net.IrcClient._HandleEvents (Meebey.SmartIrc4net.IrcMessageData ircdata) [0x00162] in <10085ae304b34015a0d4b345e6c4c3d5>:0 #012  at Meebey.SmartIrc4net.IrcClient._Worker (System.Object sender, Meebey.SmartIrc4net.ReadLineEventArgs e) [0x0000e] in <10085ae304b34015a0d4b345e6c4c3d5>:0 #012  at (wrapper delegate-invoke) <Module>:invoke_void_object_ReadLineEventArgs (object,Meebey.SmartIrc4net.ReadLineEventArgs)#012  at Meebey.SmartIrc4net.IrcConnection.ReadLine (System.Boolean blocking) [0x000b7] in <10085ae304b34015a0d4b345e6c4c3d5>:0 #012  at Meebey.SmartIrc4net.IrcConnection.Listen (System.Boolean blocking) [0x0000e] in <10085ae304b34015a0d4b345e6c4c3d5>:0 #012  at Meebey.SmartIrc4net.IrcConnection.Listen () [0x00001] in <10085ae304b34015a0d4b345e6c4c3d5>:0 #012
  at CVNBot.Program.Main () [0x009bc] in <b57dca1001394178a439d85f73850da0>:0

This may have to do with the fact that I replaced the .exe files on the server as part of a deployment shortly before issuing these commands.

get(admin|bots)/batchgetusers should fully update the list of users

Apologies if this already happens.

The getadmins/getbots/batchgetusers commands should fully update the list of admins, bots, or admins and bots. That means not only adding missing people to the list, but also removing people who are no longer in those groups.

Example: a wiki has sysops A, B, and C. B is desysopped and D is promoted to sysop. After the next update the list should contain A, C, and D, and B should have been removed.
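
The requested behaviour amounts to a set difference in both directions. The Python sketch below shows that logic only; `sync_group` and `apply_sync` are hypothetical names, and it assumes the stored entries being compared are the ones that came from an earlier group import (manually added entries would need to be kept out of the comparison).

```python
def sync_group(stored, current):
    """Return (to_add, to_remove) so that the stored list ends up equal to `current`."""
    stored, current = set(stored), set(current)
    return current - stored, stored - current


def apply_sync(stored, current):
    """Apply the additions and removals and return the updated list as a set."""
    to_add, to_remove = sync_group(stored, current)
    return (set(stored) | to_add) - to_remove
```

For the example above, `sync_group({"A", "B", "C"}, {"A", "C", "D"})` yields D to add and B to remove.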

As I said, I am not sure whether this already happens, but if it does not, it would be good to implement it so that we have a reliable list of users.

Thank you.

Avoid failing to process a message if one regex fails

When processing an event from RCReader, if one of the regexes on the BES is invalid for some reason, the resulting exception is currently not caught until the outer try/catch in RCReader.rcirc_OnChannelMessage, which then ignores the message and moves on.

Oct 13 00:04:19 cvn-app8 CVNBot.exe[8766]: ERROR [RCReader] CVNBot.RCReader Failed to handle RCEvent

System.ArgumentException: parsing "(?:A[^\A-z0-9]N[^\A-z0-9]A[^\A-z0-9]N[^\A-z0-9]I[^\A-z0-9]Z[^\A-z0-9]I|__ping__:Vito-Genovese, HakanIST)" - Unrecognized escape sequence \A.
at System.Text.RegularExpressions.RegexParser.ScanCharEscape ()
at System.Text.RegularExpressions.RegexParser.ScanCharClass (..)
at System.Text.RegularExpressions.RegexParser.CountCaptures ()
at System.Text.RegularExpressions.RegexParser.Parse (..)
at System.Text.RegularExpressions.Regex..ctor (..)
at System.Text.RegularExpressions.Regex.IsMatch (..)
at System.Text.RegularExpressions.Regex.IsMatch (..)
at CVNBot.ListManager.matchesList (System.String title, System.Int32 list) 
at CVNBot.Program.ReactToRCEvent (CVNBot.RCEvent r)
at CVNBot.RCReader.rcirc_OnChannelMessage (System.Object sender, Meebey.SmartIrc4net.IrcEventArgs e)

Instead, we should catch these within ListManager.matchesList and just return false, the same way as if the regex was working but didn't match (which is the most common case).
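
The idea can be sketched as follows in Python (an illustration only, not the C# code in `ListManager.matchesList`). The function name `matches_list` and the logger name are assumptions. Note one deliberate difference from the proposal above: rather than returning false at the first broken pattern, this sketch skips it and keeps checking the remaining patterns.

```python
import logging
import re

log = logging.getLogger("listmanager")


def matches_list(title, patterns):
    """Return True if any pattern matches; an invalid pattern counts as no match."""
    for pattern in patterns:
        try:
            if re.search(pattern, title, re.IGNORECASE):
                return True
        except re.error as err:
            # Log and skip the broken entry instead of aborting the whole event.
            log.warning("Skipping invalid pattern %r: %s", pattern, err)
    return False
```

Logging the offending pattern also makes it easier to find and remove the broken list entry later.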

Greylist should not override Blacklist

When a blacklisted user (in this case, blacklisted via a block on another channel) matches a greylist rule, the greylist match should not override the blacklist.

Not sure if this is intended...


Error "IRC: Closing Link" should be handled

# CVNBot10
Jun  8 23:44:32 cvn-app9 CVNBot.exe[3319]: INFO  [Main] CVNBot.Program [CVNBot10] Connected to chat.freenode.net
Jun  8 23:44:32 cvn-app9 CVNBot.exe[3319]: INFO  [Main] CVNBot.Program [CVNBot10] Joining feed channel: #cvn-sw
Jun  8 23:44:32 cvn-app9 CVNBot.exe[3319]: INFO  [Main] CVNBot.Program [CVNBot10] Joining control channel: #cvn-bots
Jun  8 23:44:32 cvn-app9 CVNBot.exe[3319]: INFO  [Main] CVNBot.Program [CVNBot10] Joining broadcast channel: #cvn-broadcast
Jun  8 23:44:32 cvn-app9 CVNBot.exe[3319]: INFO  [RCReader] CVNBot.RCReader [CVNBot10] Thread started
Jun  8 23:44:32 cvn-app9 CVNBot.exe[3319]: INFO  [RCReader] CVNBot.RCReader [CVNBot10] Connected to irc.wikimedia.org
Jun  8 23:44:32 cvn-app9 CVNBot.exe[3319]: INFO  [RCReader] CVNBot.RCReader [CVNBot10] Joining 144 channels
Jun  8 23:44:42 cvn-app9 CVNBot.exe[3319]: INFO  [Thread Pool Worker] CVNBot.ListManager [CVNBot10] Tim threw away 1 items
Jun  8 23:45:07 cvn-app9 CVNBot.exe[3319]: ERROR [Main] CVNBot.Program [CVNBot10] IRC: Closing Link: 185.15.56.20 (Connection timed out)

The bot process remained alive with no action. It did not try to reconnect, restart, or exit (for external auto-restart from cron).

It just stayed running without actually being in the feed channels on Freenode.
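
One of the options mentioned above, exiting so that an external supervisor such as cron can restart the process, could look roughly like this Python sketch. It assumes access to the raw IRC line, where a server-initiated disconnect arrives as an `ERROR` message; `is_fatal_irc_error` and `handle_line` are hypothetical names, and the injectable `exit_func` exists only to keep the sketch testable.

```python
import sys


def is_fatal_irc_error(raw_line):
    """True for a server ERROR message, which is sent before the link is closed."""
    return raw_line.startswith("ERROR ")


def handle_line(raw_line, exit_func=sys.exit):
    """Exit with a non-zero status on ERROR so an external supervisor can restart."""
    if is_fatal_irc_error(raw_line):
        exit_func(1)
        return True
    return False
```

Reconnecting in-process would be an alternative design; exiting is simply the smallest change that avoids a process that stays alive without being connected.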
