User talk:Legobot




Index

Can somebody help me with indexing on my talk page? I use /Archives/<year>/<month>, and have already supplied the first_archive. Index. I don't see my name in the list of errors. Am I supposed to wait for it to cycle? --pro-anti-air ––>(talk)<–– 00:35, 7 October 2025 (UTC)[reply]

MFD clerk down?

@Legoktm: Legobot has not edited WP:MFD since October 2; are you able to give it a nudge? Thanks, HouseBlaster (talk • he/they) 22:28, 7 October 2025 (UTC)[reply]

This seems to be happening again; another nudge would be appreciated :) HouseBlaster (talk • he/they) 21:09, 7 December 2025 (UTC)[reply]
@Legoktm: friendly nudge :) HouseBlaster (talk • he/they) 19:28, 16 December 2025 (UTC)[reply]
@HouseBlaster: sorry about the delay, it's fixed now. Legoktm (talk) 02:01, 21 December 2025 (UTC)[reply]

Curious if Legobot is still fixing Linter font errors

I stumbled across this Linter fix, which seemed like an easy one that Legobot should have gotten to. Is Legobot still patrolling the obsolete tag lists for all possible font tag fixes? – Jonesey95 (talk) 23:52, 4 December 2025 (UTC)[reply]

Hey @Jonesey95, it was busted for the same reason the MFD archiving task above was also broken. I should have fixed it, we'll find out tomorrow. Also, I finally deployed the code I worked on back in January(!!) to support using arbitrary regexes for fixes. Hopefully shouldn't be too difficult to expand it if everything else is working again... Legoktm (talk) 05:22, 21 December 2025 (UTC)[reply]
Fingers crossed. We have a couple million more Linter errors to fix, and many of them need a regex bot or script to handle efficiently. – Jonesey95 (talk) 17:08, 21 December 2025 (UTC)[reply]
@Jonesey95: Legobot is still skipping all userspace pages, based on a suggestion you made during the BRFA (though now that I re-read it, I think you might've meant for that to be temporary). Do you think it should keep skipping those or should I remove that restriction? Legoktm (talk) 17:27, 30 December 2025 (UTC)[reply]
I recommend working on User pages at this point. My caveat was for the initial run and deliberately used the words "for a while" to indicate that there was plenty of work to be done in other namespaces. When I edit User pages, I typically amend my edit summary to include a phrase like "I hope you don't mind this minor cleanup edit in your user space" or "Feel free to restore, preferably if errors are fixed", since User space is generally allowed more latitude for experimental and draft content and formatting. – Jonesey95 (talk) 17:33, 30 December 2025 (UTC)[reply]
I also think it's safe to let the bot fix userspace. I've hardly ever been reverted fixing issues. Gonnym (talk) 19:42, 30 December 2025 (UTC)[reply]
Thanks both, it's running now. :) Legoktm (talk) 04:05, 31 December 2025 (UTC)[reply]
How's that regex code implementation looking? I keep looking at these 4,000 pages, and the Welcome messages without ending /div tags, and some of the other giant blobs at Wikipedia:Linter/Signature submissions that require a bit of regex, and hoping that Legobot will be able to tackle some of them. I have a bunch of the regexes ready, but I don't have bot-making ability, and doing thousands of identical replacements manually gets old fast. – Jonesey95 (talk) 00:28, 14 January 2026 (UTC)[reply]

Is Legobot ignoring certain errors when checking before saving?

If I recall correctly, Legobot checks its proposed revisions before saving to see if there are any remaining Linter errors. Is it still doing that? I support the methodology; I'm asking because of the follow-up question below.

If it is doing that, is it ignoring duplicate id and background color errors? It probably should, since those are basically down-the-road errors that don't need to be fixed right now. They are also often caused by templates, so editing a given page that transcludes one of those templates won't fix the errors on that page. Am I making sense? Basically, if Legobot is checking for remaining errors, it should check only for those currently listed on the Firefly table. – Jonesey95 (talk) 18:10, 31 December 2025 (UTC)[reply]
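A minimal sketch of the pre-save check suggested above, assuming the bot has the list of lint categories still present after its proposed edit. The function names are hypothetical and the ignore list is an assumption ("duplicate-ids" is the name used later in this thread; the night-mode category name is assumed); this is not Legobot's actual code.

```python
# Categories that should not block a save. Assumed names, for illustration only.
IGNORED = frozenset({"duplicate-ids", "night-mode-unaware-background-color"})


def blocking_errors(remaining, ignored=IGNORED):
    """Return the remaining lint categories that should still block a save."""
    return sorted(set(remaining) - ignored)


def should_save(remaining, ignored=IGNORED):
    """Save only when every remaining category is on the ignore list."""
    return not blocking_errors(remaining, ignored)
```

With this shape, a page whose only leftover errors are on the ignore list would still be saved, while any other leftover category keeps blocking the edit.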

Correct, it does check there are no more lint errors, and doesn't ignore any type. Matrix mentioned the same thing regarding the dark mode/background color ones. I haven't had the chance to fully read up on it but I was also leaning that way.
I think the duplicate ID ones Legobot shouldn't ignore, since it's using an HTML parser and that's invalid HTML. For ones where the duplication isn't via a template, I was wondering if we could just auto-renumber them, e.g. id="Foo", id="Foo2", id="Foo3", etc. Legoktm (talk) 03:15, 2 January 2026 (UTC)[reply]
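The auto-renumbering idea (id="Foo", id="Foo2", id="Foo3") could be sketched roughly as follows. This is an illustration under assumptions, not the bot's implementation: it only sees literal double-quoted id attributes in the wikitext, it cannot touch ids generated by templates, and it does not update any links that point at the renamed anchors. The function name is hypothetical.

```python
import re

# Match id="..." but not attributes such as data-id="...".
ID_ATTR = re.compile(r'(?<![\w-])id="([^"]*)"')


def renumber_duplicate_ids(wikitext):
    """Keep the first use of each id; give later duplicates a numeric suffix."""
    used = set(ID_ATTR.findall(wikitext))  # every id already on the page
    seen = set()

    def repl(match):
        name = match.group(1)
        if name not in seen:
            seen.add(name)
            return match.group(0)  # first occurrence stays as written
        n = 2
        while f"{name}{n}" in used:  # skip suffixes that are already taken
            n += 1
        new_name = f"{name}{n}"
        used.add(new_name)
        return f'id="{new_name}"'

    return ID_ATTR.sub(repl, wikitext)
```

Checking against the ids already on the page avoids creating a new duplicate when, say, "Foo2" is already in use elsewhere.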
The problem with duplicate ids is that many (most?) of them are caused by CS1 citation templates, and there is no desire at all to change the behavior of those templates to fix what is seen as a non-problem. And the problem with night mode errors, as I said above, is that many of them are caused by templates and can't be resolved in the wikitext of a given page. I'd be curious to know how many "Publish" operations are skipped solely because there are night mode or duplicate id errors remaining on a page. It seems a shame to force humans to perform bot-fixable edits.
Does Legobot at least ignore the "hidden" Linter errors like "large-tables" (id 20) and "missing-image-alt-text" (id 23)? Those are not reported in "Page information", primarily because the WMF developers are still working out the kinks to figure out whether the tracking of those conditions even makes sense to perform. – Jonesey95 (talk) 04:53, 2 January 2026 (UTC)[reply]
I agree with Jonesey here, the bot should ignore the duplicate id and background color as those aren't things we can really fix now. They require much more investigation and discussion (and code changes) so holding off on fixing fixable issues for them isn't really helpful to the project (and won't be a spam on watchlist as again, no fix for those two issues is on the current radar). Gonnym (talk) 19:51, 2 January 2026 (UTC)[reply]
@Jonesey95, @Gonnym: Yes, the hidden Linter error categories are ignored because they just don't appear in the API output.
As for numbers, I created a new webpage (warning: large page, will take some time to load) that groups all the pending edits by which errors are blocking them. In short: duplicate-ids is blocking 213 pages, night mode is 15k pages, duplicate-ids + night mode is 503 pages.
I do want to look at the duplicate-ids problem a bit more, e.g. this one seems trivially fixable by a bot just adjusting one of the IDs. My spot checking is that the majority are from templates, but I'm guessing there's some minority not from templates and just bad copy-paste issues. Legoktm (talk) 05:25, 4 January 2026 (UTC)[reply]
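The grouping described above (pending edits bucketed by the set of errors blocking them) can be sketched like this. The input shape and function name are assumptions made for illustration; the real page may be generated quite differently.

```python
from collections import defaultdict


def group_by_blockers(pending):
    """Group page titles by the exact set of lint categories blocking them.

    pending: mapping of page title -> iterable of remaining lint categories.
    """
    groups = defaultdict(list)
    for title, categories in pending.items():
        groups[frozenset(categories)].append(title)
    return dict(groups)
```

Using a frozenset as the key keeps "duplicate-ids + night mode" separate from pages blocked by only one of the two, which matches the three counts quoted above.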
Fascinating list. I find it odd that it lists so many html5 misnesting errors that aren't actually present. We eliminated those errors from all of en.WP long ago. [edited to add: I think I get it, after working on some of the pages in the list: The html5 misnesting errors are present after Legobot applies its fixes. It is happening to me as well. They usually start out as regular misnesting errors, but the parser sees them differently after some wikitext changes are applied.] – Jonesey95 (talk) 14:49, 4 January 2026 (UTC)[reply]
Hmm, I need to look at this closer. In e.g. this page it's turning a <font> into a <span> instead of a <div>, which is causing the html5-misnesting issue...but I already have code to check for block elements, which should have it use a div. Legoktm (talk) 20:12, 4 January 2026 (UTC)[reply]
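For readers following along, the span-versus-div decision can be sketched as below. This is an illustration only, with a hypothetical function name and an abbreviated list of block-level tags; it is not the check Legobot uses. It inspects literal HTML tags only, so block content produced by wikitext itself (blank-line paragraphs, list markup and so on) would not be detected by a check of this kind.

```python
from html.parser import HTMLParser

# Abbreviated, assumed list of block-level tags.
BLOCK_TAGS = frozenset({
    "div", "p", "table", "ul", "ol", "li", "dl", "blockquote",
    "pre", "center", "h1", "h2", "h3", "h4", "h5", "h6",
})


class _TagCollector(HTMLParser):
    """Collect the names of all start tags seen in a fragment."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        self.tags.add(tag)  # HTMLParser lowercases tag names


def wrapper_for(inner_html):
    """Return "div" if the fragment contains a block-level tag, else "span"."""
    collector = _TagCollector()
    collector.feed(inner_html)
    collector.close()
    return "div" if collector.tags & BLOCK_TAGS else "span"
```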
Why does it want to change [[Clay Felker]] to [[Clay Felker|<span style="color: yellow;">Clay Felker</span>]]? Gonnym (talk) 10:02, 5 January 2026 (UTC)[reply]
There's a <font size="+3" color="yellow" face="Comic Sans MS">...</font> element enclosing the whole of the page. All child elements inherit the size and face, but not all inherit the colour - a notable case being the <a>...</a> element (because clickable links should be distinguishable from plain text by their colours). Some years ago, a previous version of MediaWiki, upon encountering a construct like
<font color="yellow">[[Clay Felker]]</font>
would twist this around and emit HTML like
<a href="/wiki/Clay_Felker" title="Clay Felker"><span style="color:yellow;">Clay Felker</span></a>
MediaWiki no longer does that, which is why we launched the good ship delinter some years back. It's not just for picking up unclosed inline elements. --Redrose64 🦌 (talk) 12:12, 5 January 2026 (UTC)[reply]
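The rewrite discussed above, where the colour moves inside the link label, can be illustrated with a small regex sketch. It is deliberately narrow: it assumes a font tag carrying only a double-quoted color attribute and wrapping a single unpiped link, and the function name is hypothetical. It is not the bot's actual fix.

```python
import re

# <font color="X">[[Target]]</font> with no pipe in the link.
FONT_LINK = re.compile(
    r'<font color="([^"]+)">\[\[([^\[\]|]+)\]\]</font>',
    re.IGNORECASE,
)


def move_color_into_link(wikitext):
    """Rewrite a coloured unpiped link so the colour is applied inside the label."""
    return FONT_LINK.sub(
        r'[[\2|<span style="color: \1;">\2</span>]]',
        wikitext,
    )
```

Applied to the example above, `<font color="yellow">[[Clay Felker]]</font>` becomes `[[Clay Felker|<span style="color: yellow;">Clay Felker</span>]]`, which keeps the link text yellow now that links no longer inherit the colour from an enclosing element.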

Update rfcbot with a replacement anchor template

Hi @Legoktm, Happy new year! I created an issue and PR to implement WhatamIdoing's suggested improvement above this comment. I don't have an environment set up to test, though. (Posting here for visibility; apologies if it's spam. I also completely understand if including this change is not something you are up for.) Based on this discussion near the linked comment, we'd like to keep the rfcId in the section text after the rfc template is removed. Dw31415 (talk) 14:28, 1 January 2026 (UTC)[reply]

Thanks @Dw31415, I made a small fix and deployed the change. We'll see how it goes when an RFC expires. (fyi @Redrose64) Legoktm (talk) 03:03, 3 January 2026 (UTC)[reply]
Thank you! I’ll try to get a PHP environment set up before requesting any more changes. Dw31415 (talk) 03:46, 3 January 2026 (UTC)[reply]
Honestly I'm very overdue to port this to Rust. I don't know if that's a better language for you or not. Legoktm (talk) 16:35, 3 January 2026 (UTC)[reply]
I’d love to learn Rust so I’d be interested in collaborating on that. I’ve been spending a bunch of time over the holidays on the RfC bot ideas from Marshall and WhatamIdoing. I’ll have to see what’s realistic after normal life resumes. Dw31415 (talk) 17:27, 3 January 2026 (UTC)[reply]
Success! --Redrose64 🦌 (talk) 10:09, 3 January 2026 (UTC)[reply]
Woot. I made a small change to add a newline after the anchor template so it doesn't get merged onto the same line like in this edit. Legoktm (talk) 16:03, 3 January 2026 (UTC)[reply]

RFC database now public

Probably most of interest to @Redrose64 and @Dw31415...the RFC database is now public and named: s51043__rfcbot_p. It can be queried from Quarry (example) or directly on the server by any Toolforge user. Maybe it is interesting to others, but now I'm no longer the only person with access to it.

Possibly more importantly, while moving the database over, I noticed that it was using the latin1 charset, instead of utf8mb4!!! That would be a pretty solid (and slightly embarrassing) explanation for why it couldn't handle unicode characters in titles properly! So I've switched it over to utf8mb4🤞. (If that causes any issues I have a pre-charset conversion backup.) Legoktm (talk) 04:21, 4 January 2026 (UTC)[reply]
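To illustrate why a latin1 column would mangle some titles, here is a small Python sketch. The sample title is an assumption chosen because ț (U+021B) has no latin1 encoding, and Python's "replace" error handling only approximates what a database does when it cannot store a character; the helper name is hypothetical.

```python
def roundtrip(text, charset):
    """Encode and decode text, replacing characters the charset cannot hold."""
    return text.encode(charset, errors="replace").decode(charset)


title = "Siege of Neam\u021b Citadel"  # assumed example title containing ț

lossy = roundtrip(title, "latin-1")  # ț cannot be represented, becomes "?"
intact = roundtrip(title, "utf-8")   # UTF-8 can represent every character
```

Here `lossy` comes out as "Siege of Neam? Citadel" while `intact` equals the original title, which is the kind of difference the charset switch is meant to remove.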

Aha! Maybe that will eliminate the need to create redirects like Talk:Siege of Neam? Citadel. --Redrose64 🦌 (talk) 17:52, 4 January 2026 (UTC)[reply]
Thanks! I downloaded the full list: https://quarry.wmcloud.org/query/100675# Dw31415 (talk) 20:17, 4 January 2026 (UTC)[reply]

Looping on page

So sorry. Please see Wikipedia:Bots/Noticeboard#Dw31415 - DwAlphaBot - SodiumBot conflict on RfCHistory Dw31415 (talk) 19:32, 9 January 2026 (UTC)[reply]