- Hello!! I am Cyberpower678. I am your typical run of the mill user here on Wikipedia.
- I specialize in bot work and tools, but I lurk around RfPP, AfD, AIV, and AN/I, as well as RfA. If you have any questions in those areas, please feel free to ask. :-)
- I also serve as a mailing list moderator and account creator over at the Account Creation Center. If you have any questions regarding an account I created for you, or the process itself, feel free to email the WP:ACC team or me personally.
- At present, I have helped to create accounts for 2435 different users.
- Disputes or discussions that appear to have ended or are disputed will be archived.
All the best.—cyberpower
01:05, 5 December 2016
different robots.txt for http and https
In some cases, domains have a different robots.txt for http and https. Example:
While the https version allows nothing, the http version is much more liberal. In cases with different robots.txt files, the Internet Archive crawler should try both protocols if the first one disallows archiving of the page.
- https://wayback.archive.org/save/https://www.wdr.de/tv/kopfball/sendungsbeitraege/2012/1230/lippen.jsp fails
- https://wayback.archive.org/save/http://www.wdr.de/tv/kopfball/sendungsbeitraege/2012/1230/lippen.jsp works fine.
Boshomi (talk) 17:36, 1 December 2016 (UTC)
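To illustrate the point about protocol-specific rules, here is a minimal sketch using Python's standard `urllib.robotparser`. The two robots.txt bodies and the `ExampleArchiver` user agent are assumptions made up for the illustration; they are not the actual files served by wdr.de or the agent the Internet Archive uses.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt body lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt contents, one per protocol (assumed for illustration).
HTTPS_ROBOTS = "User-agent: *\nDisallow: /\n"          # blocks everything
HTTP_ROBOTS = "User-agent: *\nDisallow: /private/\n"   # blocks one directory only

PAGE = "www.wdr.de/tv/kopfball/sendungsbeitraege/2012/1230/lippen.jsp"
AGENT = "ExampleArchiver"  # hypothetical crawler name

https_ok = allowed(HTTPS_ROBOTS, AGENT, "https://" + PAGE)
http_ok = allowed(HTTP_ROBOTS, AGENT, "http://" + PAGE)
print("https allowed:", https_ok)
print("http allowed:", http_ok)
```

With rules like these, the same page is blocked over https and fetchable over http, which is the situation described above.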
- This is something the Wayback Machine should do, not the bot. But also, as this proves, content can be different on different protocols; it usually isn't, but it can be.—cyberpowerMerry Christmas:Unknown 14:54, 2 December 2016 (UTC)
- I'll message the devs of IA to see if they are willing to do this.—cyberpowerMerry Christmas:Unknown 14:55, 2 December 2016 (UTC)
- Thank you! I had trouble finding the right person at IA to contact. I only know the contact page. Boshomi (talk) 19:33, 2 December 2016 (UTC)
In the above example, the IA API returns the http version even though the https version is requested; the API is flexible anyway in returning whichever is available, so long as it was saved to begin with. -- GreenC 15:53, 2 December 2016 (UTC)
- This is correct. The problem is that URLs with the unlucky protocol never get archived if they are used in a Wikipedia article. Normally the crawlers work fast; the majority of newly linked URLs get archived hours or days after the Wikipedia edit. Boshomi (talk) 19:33, 2 December 2016 (UTC)
FYI: In November we made several thousand bot edits on dewiki to replace dead URLs with IA URLs, e.g. de:special:diff/160202002. We did this using lists of manually checked IA URLs. Permalinks to the lists are linked in the bot edits. (As of now, a second semi-automatic edit is necessary to insert the correct template.) Boshomi (talk) 19:33, 2 December 2016 (UTC)
WaybackMedic checked every Wayback link on English Wikipedia (about 1 million as of September) and kept a database of all that were inoperable due to robots.txt (about 12000). I just checked, and only about 600 of those links are https. I could write a script for those 600 to see how many have a working http version that can be saved to Wayback, but honestly I don't think it's going to be very many. It would be a lot of overhead during the save process for a low return. Maybe a solution is to check the other protocol only after the link is established as blocked by robots.txt? -- GreenC 18:55, 3 December 2016 (UTC)
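The "only check after a robots.txt block" idea could be sketched roughly as follows. This is not how the Wayback Machine, IABot, or WaybackMedic actually work; `try_save`, its return values (`"saved"`, `"robots_blocked"`), and `fake_try_save` are hypothetical stand-ins for whatever performs the real save request.

```python
from typing import Callable, Optional
from urllib.parse import urlsplit, urlunsplit


def swap_scheme(url: str) -> Optional[str]:
    """Return the same URL with http and https exchanged, or None otherwise."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return urlunsplit(parts._replace(scheme="http"))
    if parts.scheme == "http":
        return urlunsplit(parts._replace(scheme="https"))
    return None


def save_with_fallback(url: str, try_save: Callable[[str], str]) -> Optional[str]:
    """Try the URL as given; try the other protocol only after a robots.txt block.

    try_save is a hypothetical callable returning "saved", "robots_blocked",
    or any other string for unrelated failures.
    """
    result = try_save(url)
    if result == "saved":
        return url
    if result != "robots_blocked":
        return None  # unrelated failure: no extra request is made
    alternate = swap_scheme(url)
    if alternate is not None and try_save(alternate) == "saved":
        return alternate
    return None


def fake_try_save(url: str) -> str:
    """Stand-in for a real save request: https is blocked, http is saved."""
    return "robots_blocked" if url.startswith("https://") else "saved"


saved_url = save_with_fallback(
    "https://www.wdr.de/tv/kopfball/sendungsbeitraege/2012/1230/lippen.jsp",
    fake_try_save,
)
print(saved_url)
```

Because the second request is made only after an explicit robots.txt block, links that save normally cost no extra overhead.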
InternetArchiveBot marked link "permanent dead link", but link seems okay.
Hi~ In this edit of Dia_(moon), InternetArchiveBot marked a link as "permanent dead link", but the link seemed to work fine for me. Zeniff (talk) 04:11, 3 December 2016 (UTC)
- There's an ongoing problem in the labs environment where that site comes back dead due to a lack of response from the server. It's possible that labs, where the bot lives, is blacklisted. I'm working on a workaround for that.—cyberpowerMerry Christmas:Unknown 14:37, 3 December 2016 (UTC)
Multiple IA Bot issues
- Reporting bugs does not work; I got a login page. IIRC I have no WMFlabs account, certainly no phab account, and I haven't used my global MW account for at least a year.
- original issue reported on Talk:Baen_Free_Library#External_links_modified.
- why on earth do you want other bot issues on your talk page instead of the bot talk page?
–220.127.116.11 (talk) 14:04, 3 December 2016 (UTC)
- That's not an issue. Accounts are required to prevent abuse. This is why those that can't follow those instructions due to those restrictions can still leave me a message here.
- You reported a non-issue. If it's empty, it won't do anything. That's a date format parameter users can easily fill to change the date formats of the citation.
- I don't monitor other talk pages as I do my own. Keeping all my conversations in one place is ideal for me.
- Cheers.—cyberpowerMerry Christmas:Unknown 14:41, 3 December 2016 (UTC)
FYI: Regular hits on InternetArchiveBot
Hi Cyberpower678 and @Kaldari:. Just wanted to let you know that the archive bot is showing up on a regular basis at Special:Log/Spamblacklist (18 times in the last 30 days). I have cleaned up a couple where I could remove the spam URL, though others are more integral. Not sure whether it matters to you, just passing the message on. If it does matter or you have suggestions, please let me know.
Interestingly the bot does show an interesting progression through enWP, and it will be a long time until we get back to the start to repeat the process. — billinghurst sDrewth 01:07, 4 December 2016 (UTC)
- It's not avoidable. IABot has no clue what's on the blacklist, and will unfortunately not be able to. On the other hand, IABot is only running at 1/28th of its maximum speed on enwiki, so....—cyberpowerMerry Christmas:Unknown 01:09, 4 December 2016 (UTC)
Recent problems with the Wayback Machine
The Wayback Machine has been hit or miss lately (since December 3, at least in my experience) and this may be affecting InternetArchiveBot, so please check that bot's edits (at least selectively). Regards. – Allen4names (contributions) 22:11, 5 December 2016 (UTC)
- Most of the archives are probably already stored in IABot's internal DB. So there isn't much need for worrying. :-)—cyberpowerMerry Christmas:Unknown 22:17, 5 December 2016 (UTC)
External links modified
Hi, "fellow Wikipedian".
There are many ways of introducing yourself, but I feel the one which your bot has chosen is perhaps one of the most annoying possible. Most bots just do their jobs quietly and unobtrusively, but yours (although it does its job very well) leaps out as if it wanted very much to be a close personal friend rather than just a background process. We are all involved in WP, but I'm afraid your bot's greeting simply gets my goat, and I find myself disinclined (or even discouraged) to check its very worthwhile edits, because a bot with a personality is, to say the least, wildly irritating. See, for example, Self-Satisfied Door. Cheers, >MinorProphet (talk) 13:57, 6 December 2016 (UTC)
- Huh? It's a standard boilerplate message, asking the user to verify what it did was correct.—cyberpowerMerry Christmas:Unknown 14:08, 6 December 2016 (UTC)