Wikipedia talk:CSVLoader: Difference between revisions

Revision as of 16:02, 15 September 2011

Tag extension

Would there be any benefit in developing a tag extension to process CSV (or hyphen-separated, or whatever) files, given that no spreadsheet applications I know of are equipped to handle piped wikitext? Tisane (talk) 12:47, 2 June 2010 (UTC)[reply]

Is your question related to the CSV plugin? I do not understand what you are looking to do. Ganeshk (talk) 02:09, 3 June 2010 (UTC)[reply]

Creating new pages in ta.wiktionary

 Done Thank you very much indeed, Ganesh, for teaching me over the phone, particularly in our mother tongue (Tamil). My first entry is in ta.wiktionary. We are going to discuss the upload of the one lakh words. --தகவலுழவன் (talk) 02:24, 16 July 2010 (UTC)[reply]

You are welcome. I am glad to hear that the first article is out. Let me know if you have any questions. Ganeshk (talk) 12:02, 16 July 2010 (UTC)[reply]

adding more categories in the existing page

You trained me well in creating new pages. I have a question about categorization in existing pages. Each word in ta.wiktionary comes under many categories, and sometimes a few categories need to be added to many words. How can I add those categories? The More... tab of AWB has an option to add only one category.--தகவலுழவன் (talk) 02:18, 18 July 2010 (UTC)[reply]

Use the Append/Prepend Text option to add any text you want. Check "Enabled", select "Append" and enter the new categories in the box below. Ganeshk (talk) 16:45, 18 July 2010 (UTC)[reply]

I tried that, but if one of the categories already exists, it is written a second time. The single-category option skips existing categories automatically. How can I add more than one category with the same duplicate-skipping behaviour?--தகவலுழவன் (talk) 02:05, 19 July 2010 (UTC)[reply]

mmm, I guess the only way is to run the bot once for each category. Finish the first category and move to the next and so on. Ganeshk (talk) 04:49, 25 July 2010 (UTC)[reply]
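
For readers who prepare page text outside AWB, the duplicate-skipping behaviour asked for above can be sketched in a few lines of Python. This is only an illustration, not how AWB or CSVLoader works internally: the function name `append_missing_categories` and the `namespace` parameter are hypothetical, and the case-insensitive match is a simplification of MediaWiki's real title rules.

```python
import re

def append_missing_categories(page_text, categories, namespace="Category"):
    """Append each category link unless the page already contains it.

    Hypothetical helper for illustration only. Matching ignores case,
    which is a simplification of how MediaWiki compares titles.
    """
    result = page_text.rstrip("\n")
    for name in categories:
        # Matches [[Category:Name]] and [[Category:Name|sort key]].
        pattern = (r"\[\[\s*" + re.escape(namespace) + r"\s*:\s*"
                   + re.escape(name) + r"\s*(\|[^\]]*)?\]\]")
        if not re.search(pattern, result, flags=re.IGNORECASE):
            result += f"\n[[{namespace}:{name}]]"
    # The text always ends with exactly one newline afterwards.
    return result + "\n"

page = "Some entry\n[[Category:Nouns]]\n"
updated = append_missing_categories(page, ["Nouns", "Animals"])
```

Running the function a second time on `updated` leaves it unchanged, which is the property the single-category option has and plain Append does not. On a wiki with a localized namespace name, that name would be passed as `namespace`.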

Differentiate your find and replace

  1. Could you please explain the difference between AWB's Find and Replace and the CSV Find and Replace?
  2. For each word, I need to replace the file (file:Example.jpg) in the template {{படம்|file:Example.jpg|{{PAGENAME}}}} with a specific file that differs from word to word.--தகவலுழவன் (talk) 14:22, 24 July 2010 (UTC)[reply]
I think AWB's Find and Replace will do for this requirement. Ganeshk (talk) 04:51, 25 July 2010 (UTC)[reply]

ta-wiktionary-test pages

Hope you are well. Could you please look at the 25 test pages created with your CSVLoader? Please give your view and a vote. See you then.--தகவலுழவன் (talk) 07:02, 7 October 2010 (UTC)[reply]

Thanks for making this available

I just found this page and I am quite eager to try it. I wrote some AWB modules with similar capabilities, see here, but this seems a much cleaner solution. I am planning to use it for creating and updating commons:category:Creator templates on Commons. Thanks --Jarekt (talk) 14:52, 4 February 2011 (UTC)[reply]

I just tested it to make sure it is compatible with the latest AWB version. Please try it and let me know your feedback. Ganeshk (talk) 00:15, 5 February 2011 (UTC)[reply]
I tried it and it did mostly what I wanted: I managed to upload 3 pages (like this). It basically did what I needed, except for the handling of foreign characters: the word "Zasów" was uploaded as "Zas�w". Other improvements I can see:
  • Allow resizing of the CSV loader Settings window, so there is more space for "article text".
  • Pick a fixed-width font for the "article text" window.
  • I found the name "article text" confusing, since it only makes sense if you operate in the Wikipedia article namespace. I was using it in the Commons creator namespace. Maybe "Append/Prepend/Replace text".
I am still eager to try your Find & Replace functionality. Thanks. This functionality is something I have been missing from AWB since the first time I tried it. Maybe you should try to distribute your plugin with the rest of the code? --Jarekt (talk) 04:54, 6 February 2011 (UTC)[reply]
Thanks for the feedback.
  • Please save the text file as UTF-8 to fix the foreign character issue. I tested it here. You can do this in Notepad by setting the encoding box on the Save dialog to UTF-8.
  • I will add the window resize request to my to-do list.
  • I have changed the font for the text to Lucida Console. Please download the new DLL, 1.0.0.10.
  • I fixed the text caption to say, Append/Prepend/Replace text. Please download the new DLL, 1.0.0.10.
I plan to add this to the rest of the AWB source at some point. It will help me out with the debugging as well. Ganeshk (talk) 16:33, 6 February 2011 (UTC)[reply]
Thanks. I will let you know how the new version works out. --Jarekt (talk) 04:21, 7 February 2011 (UTC)[reply]
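
For files that are easier to convert with a script than by re-saving in Notepad, the UTF-8 advice above can be sketched in Python. The source encoding is an assumption: cp1252 is a common Windows default, but the actual code page of the original file is not stated in this thread. The function names are hypothetical.

```python
from pathlib import Path

def transcode_to_utf8(raw, source_encoding="cp1252"):
    """Decode bytes from a legacy code page and re-encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")

def reencode_file(src, dst, source_encoding="cp1252"):
    """Write a UTF-8 copy of src to dst, working on bytes so that line
    endings are preserved exactly."""
    Path(dst).write_bytes(transcode_to_utf8(Path(src).read_bytes(), source_encoding))

# A plausible explanation of the garbling reported above (an assumption,
# not confirmed in the thread): a lone 0xF3 byte is not valid UTF-8.
legacy_bytes = "Zas\u00f3w".encode("cp1252")        # "Zasów" as saved in cp1252
garbled = legacy_bytes.decode("utf-8", errors="replace")
fixed = transcode_to_utf8(legacy_bytes).decode("utf-8")
```

Here `garbled` contains the replacement character in place of "ó", while `fixed` is the intended word.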

My second attempt worked better. I uploaded ~100 infobox templates based on CSV data scraped from Wikipedia. I had some issues though:

  1. The %%key%% keyword did not seem to work. It is AWB's built-in keyword that returns a name based on the page name, in a format compatible with DEFAULTSORT.
  2. I had to do 3 separate runs to create all templates. The first 2 runs created only a handful of pages and skipped most of the rows, but the 3rd one was the charm and created most of the templates. This issue only happened when bot autosaving was used; in manual mode no pages were skipped.
  3. Auto-loading of the list based on the first column has a small issue: if the CSV file appears to have some empty rows at the end, then "empty page names" are added to the list. There should be some check to prevent that.

Otherwise it is a great tool. Thanks again --Jarekt (talk) 20:41, 13 February 2011 (UTC)[reply]

  1. I suggest you use ## (any symbol will work) for the fields, for example ##key##. This will not conflict with AWB keywords.
  2. The tool will work with bot autosaving as well. Please check the skip conditions.
  3. I will look into the empty row issue.
Thanks for the feedback. Ganeshk (talk) 23:57, 13 February 2011 (UTC)[reply]
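
Until the empty row issue is addressed in the plugin, the kind of check requested in point 3 can be applied to the file beforehand. The following Python sketch is only an illustration of that check, not the plugin's code; `page_titles` and `title_column` are hypothetical names.

```python
import csv
import io

def page_titles(csv_text, title_column=0):
    """Return page names from one CSV column, skipping rows that are
    empty or whose title cell is blank (for example trailing rows that
    a spreadsheet exported as ',,')."""
    titles = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not any(cell.strip() for cell in row):
            continue  # completely empty row
        title = row[title_column].strip() if len(row) > title_column else ""
        if title:
            titles.append(title)
    return titles

sample = "Apple,fruit\nZas\u00f3w,village\n,,\n\n  ,\n"
titles = page_titles(sample)
```

With the sample above, only the two real page names survive; the three trailing blank rows produce no "empty page names".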

I did another round of working with CSVLoader - this time using replace feature in autosaving mode. It worked great as long as you stick precisely to the steps: create replacement rules, load the file and run from the beginning to the end. Any small alteration of the rules and the process stopped working:

  • For example, I usually like to do a lot of testing of a new script before I unleash it, and that usually broke the script. More precisely, if I pick file #1, #100, #200 and the last one from the list, it works fine, but when I then try file #2 I get the wrong substitution.
  • Another thing I like doing is testing and tweaking the replacement rules, and I was using several rules at the same time. Unfortunately, each time I look at my rules, all the ##keywords## are gone, replaced with the last substitution performed. I restore them all to the original state and do my tweak, but then substitution does not work without reloading the CSV file. So I had to reload the file. All those steps take a while. A better way would be to allow changing the rules without the need to redo all the steps.

But even with all those limitations this tool is much better than any alternative approach I have come up with, and it allows me to do semiautomatic cleanup edits in 3 phases: 1) run a bot to capture some text, like the author, from some collection of files or templates and save it to a file; 2) use a spreadsheet to correct/unify the text; 3) use CSVLoader to replace the original text with the corrected version. --Jarekt (talk) 14:12, 22 February 2011 (UTC)[reply]

  • Agreed, the plugin has limitations. The plugin stores the replacement rules (with the ##keywords##) when it is started. As each article gets processed, the replacement rules are purged and replaced with the actual values from the file. That is the reason you see the last substitution made. It will be difficult for the plugin to figure out that rules have been changed in the middle of the run. Do you have any suggestions on how this can be done?
  • I will check into your issue #1.
I am glad to hear that you are finding the plugin useful. Ganeshk (talk) 00:55, 24 February 2011 (UTC)[reply]
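
One possible direction for the rule-overwriting limitation discussed above is to never modify the stored rules at all: keep the rules with their ##keywords## as read-only templates and build a fresh, filled-in copy for each row. The Python sketch below illustrates the idea only. It is not the plugin's C# code, and `rules_for_row`, `apply_rules` and the sample field names are hypothetical.

```python
import re

PLACEHOLDER = re.compile(r"##(\w+)##")

def rules_for_row(rule_templates, row):
    """Build fresh (find, replace) pairs for one CSV row.

    rule_templates is never modified, so the ##keywords## stay visible
    and editable between pages. Unknown keywords are left as they are.
    """
    def fill(text):
        return PLACEHOLDER.sub(lambda m: row.get(m.group(1), m.group(0)), text)
    return [(fill(find), fill(repl)) for find, repl in rule_templates]

def apply_rules(page_text, rules):
    """Apply plain-text find and replace pairs in order."""
    for find, repl in rules:
        page_text = page_text.replace(find, repl)
    return page_text

templates = [
    ("author=##old##", "author=##new##"),
    ("[[Category:Stub]]", ""),            # a rule that uses no file data
]
rows = [
    {"old": "J. Smith", "new": "John Smith"},
    {"old": "A. Kowalski", "new": "Adam Kowalski"},
]
second = rules_for_row(templates, rows[1])
page = apply_rules("author=A. Kowalski [[Category:Stub]]", second)
```

Because each page gets its rules from the untouched templates, pages can be processed in any order, and a rule edited between pages simply takes effect on the next page.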

Error message

Status: New
Description:
Exception: NullReferenceException
Message: Object reference not set to an instance of an object.
Call stack:
   at WikiFunctions.Parse.FindandReplace.Decode(String text)
   at WikiFunctions.Parse.FindandReplace.AddNew(Replacement r)
   at CSVLoader.CSVLoader.ProcessArticle(IAutoWikiBrowser sender, IProcessArticleEventArgs eventargs)
   at WikiFunctions.Article.SendPageToPlugin(IAWBPlugin plugin, IAutoWikiBrowser sender)
   at AutoWikiBrowser.MainForm.ProcessPage(Article theArticle, Boolean mainProcess)
AWBPlugins AWBBasePlugins ListMakerPlugins
  • CSV Loader
  • No Limits Plugin
  • UserContribsNoLimitsForAdminAndBotsPlugin
  • UserContribsUserDefinedNumberForAdminAndBotsPlugin
  • WhatTranscludesPageNoLimitsForAdminAndBotsPlugin
  • WhatTranscludesPageAllNSNoLimitsForAdminAndBotsPagePlugin
  • CategoryNoLimitsForAdminAndBotsPlugin
  • CategoryRecursiveNoLimitsForAdminAndBotsPlugin

Jarekt (talk) 16:11, 21 March 2011 (UTC)[reply]

To duplicate: [encountered while processing page [1]]
Site URL: http://commons.wikimedia.org
Operating system: Microsoft Windows NT 5.1.2600 Service Pack 3
.NET FW Version: 2.0.50727.3615
AWB version: AutoWikiBrowser (5.2.0.0), WikiFunctions (5.2.0.0), revision 7471 (2010-12-17 01:03:47)
Workaround: deleting all replacement rules not using data pulled out of the file
Fixed in version:


Ganeshk, this is the error message I get if one of the replacement rules does not involve data pulled out of the file. Also, I do not seem to be able to use your tool with the "Find and replace" "Advanced Settings". I usually use "Advanced Settings" for everything since I find them more readable. Greetings. --Jarekt (talk) 16:11, 21 March 2011 (UTC)[reply]

Hi Jarekt, I will take a look at this. I am a little busy in RL right now. Ganeshk (talk) 00:54, 27 March 2011 (UTC)[reply]

Another issue

I ran into another issue whose cause I am not sure I understand: the tool worked fine doing find and replace for the first dozen or two records, but then something broke and it started inserting the same text into the remaining files, not matching the values in the spreadsheet. I suspect that the problem might be caused by the fact that I used the skip option (when some template is present) with your tool. Maybe there is a way to detect when the process breaks and stop it. --Jarekt (talk) 16:17, 30 March 2011 (UTC)[reply]

Two images, minor order change

Made a minor order change, moving the walkthrough up below the download subsection, and the history moved to bottom.

Here are two images to add at the beginning for the example page: [2][3] I would upload them myself but I don't upload images anymore because veteran editors here always delete them. Errectstapler (talk) 04:02, 14 April 2011 (UTC)[reply]

Hi Errectstapler, Thanks for the changes. The DLL need not be loaded each time if the CSVLoader.dll is copied to the AutoWikiBrowser folder. There is no loading required. That was the reason the walkthrough does not list that step. Ganeshk (talk) 10:52, 14 April 2011 (UTC)[reply]

clarification

RE: Copy the downloaded CSVLoader.dll file to the AutoWikiBrowser folder

Is this the plugins folder? Errectstapler (talk) 15:57, 27 April 2011 (UTC)[reply]

No, it is the root folder where the AutoWikiBrowser.exe resides. Ganeshk (talk) 03:58, 31 August 2011 (UTC)[reply]

Also the pictures on User:Ganeshk/CSVLoader/Walkthrough are of a text document not a csv document. The instructions describe a csv document. Errectstapler (talk) 16:21, 27 April 2011 (UTC)[reply]

CSV file is a text file with delimited data. Ganeshk (talk) 03:58, 31 August 2011 (UTC)[reply]

merged instructions into walkthrough

I boldly merged the instruction into the walkthrough page. These were almost the same instructions, repeated twice :) Errectstapler (talk) 16:41, 27 April 2011 (UTC)[reply]

the edit summary

We (Sodabottle, Drsrisenthil and I) are creating new pages in ta.wiktionary as you guided us. Your CSV loader has reduced our workload by 90%. Thank you very much indeed. The remaining 10% lies in the edit summary section of AWB and in making internal links. As you instructed, we are using an OpenOffice spreadsheet: column A for the heading, column B for its meaning. Is it possible to paste the column B content automatically into the AWB edit summary for each new word? The edit summary differs from word to word, and having the meaning there would make it easy to patrol the recent changes of ta.wiktionary. Otherwise, we have to open every new page to verify it. Please add an option (i.e. "also paste in the edit summary") to the CSV loader. We are constantly moving ta.wiktionary ahead. See the position of ta.wiktionary among all other wiktionaries now. Thanks in advance.தகவலுழவன் (talk) 00:04, 13 June 2011 (UTC)[reply]

Okay. I will look into adding this functionality. Ganeshk (talk) 11:14, 13 June 2011 (UTC)[reply]
 Done This has been implemented. Please download the new version, 1.0.0.11. Ganeshk (talk) 03:42, 31 August 2011 (UTC)[reply]
Great. With this implementation, you have reduced our patrolling time. Thanks indeed.--தகவலுழவன் (talk) 03:05, 3 September 2011 (UTC)[reply]
Thanks for this nifty feature Ganesh. As Tha.Uzhavan says, this has reduced patrolling/verifying time greatly :-)--Sodabottle (talk) 12:23, 14 September 2011 (UTC)[reply]
Glad to hear that. Happy to help. :) Just noticed that tawikt moved to 9th position. Congrats. Ganeshk (talk) 04:30, 15 September 2011 (UTC)[reply]
Fixed the issues that you reported over e-mail. Please download the new version, 1.0.0.12. Ganeshk (talk) 14:46, 3 September 2011 (UTC)[reply]
Thanks Ganesh, your plugin is an indispensable tool for our Tamil wiki (especially Tamil wiktionary) experience. I appreciate your effort and prompt help.--Senthi (talk) 16:02, 15 September 2011 (UTC)[reply]
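
For anyone preparing similar data outside AWB, the request in this thread (column A as the heading, column B as the meaning, and the meaning reused as the edit summary) can be sketched in Python. This is an illustration of the idea only, not the code in version 1.0.0.11. The name `pages_with_summaries` and the `max_summary_chars` cut-off are hypothetical; the real edit summary length limit is not stated here.

```python
import csv
import io

def pages_with_summaries(csv_text, max_summary_chars=200):
    """Yield (title, meaning, edit_summary) for each usable CSV row.

    Column A is the headword and column B its meaning. The summary is
    the meaning, cut to an assumed maximum length.
    """
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank or incomplete rows
        title, meaning = row[0].strip(), row[1].strip()
        yield title, meaning, meaning[:max_summary_chars]

entries = list(pages_with_summaries("cat,a small animal\ndog,a loyal animal\n"))
```

Each new page then carries its own meaning in the edit summary, which is what makes patrolling from the recent changes list possible without opening every page.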

Share CSV-files

Please post your .csv files in your namespace and post a link to them below. Other projects could use the raw data to create articles in their language. Thanks in advance - Grashoofd (talk) 20:05, 2 July 2011 (UTC)[reply]

I too am looking forward to those files. --தகவலுழவன் (talk) 07:19, 3 July 2011 (UTC)[reply]
By the way, usable databases for bot writing are also VERY welcome. It's a maze out there.. Grashoofd (talk) 16:32, 3 July 2011 (UTC)[reply]