Wikipedia talk:WikiProject Spoken Wikipedia

Any userboxes?

I've just signed up and contributed my first article, and I'd like to advertise the project by way of a userbox (I have a... thing... for userboxes.) Anybody designed one yet? Chris W. (talk) 00:15, 18 January 2013 (UTC)

Hey Chris, you can check out some of the userbox options for this project by clicking here. Hope that helps you! Welcome to the project! Cognate247 (talk) 00:39, 18 January 2013 (UTC)
Cheers! Chris W. (talk) 01:59, 18 January 2013 (UTC)

Hi, is there any video for learning how to make audio for wiki pages? I have edited one page and also tried to edit some others. I am a learner.

Spoken Wikivoyage

There is a discussion going on at the Wikivoyage sister site about spoken articles at voy:Wikivoyage_talk:Image_policy#Audio_files, if anyone would care to share their experiences from WP --Inas (talk) 22:32, 7 March 2013 (UTC)

better categorization of attributes of the speaker (e.g. age, gender, native language, accent/dialect)

If articles could be filtered by those attributes, it would greatly increase the value of Spoken Wikipedia. Plainly listing the articles is insufficient for the majority of people (i.e. people who are neither dyslexic nor blind), because whether the spoken version is preferable to the text version depends on subjective experience, which is influenced only by attributes of the speaker. E.g. simply due to my sexuality, I can pay better attention to female speakers and less attention to male speakers compared to reading text, which is very important to me because I have attention issues. In my native language (German), there are several dialects and I cannot (properly) understand all of them. For example, Bavarian, Swiss or Austrian dialects are only as understandable to me as Dutch, although they are considered to be the same language, often even if the speaker tries not to let their local dialect influence their speech. Even apart from those major problems, being able to filter by those attributes would generally make articles easier and more pleasant to listen to, and hence improve the experience and attract more people to it. 77.181.213.56 (talk) 08:37, 25 April 2013 (UTC)

By the way, here is an improvised bash script that guesses the speaker's sex from the first minute or so of a recording. It has issues with noise and microphone hum, and it is slow (2-5 seconds per file on an i5-2500), but it works quite well. I tested it on 100 files.

 
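#!/bin/bash
# Guess the speaker's sex for every .ogg file in DIR using aubionotes.
# Pass "mplayer" as the second argument to play each file after classifying it.
if [[ -z "$1" ]]; then echo "USAGE: $0 DIR [mplayer]" >&2; exit 1; fi
cd "$1" || exit 1

YELLOW_="\033[01;33m"
GREEN_="\033[01;32m"
RED_="\033[01;31m"
NORM_="\033[00m"

# MYSTERYVALUE below THRESHOLD_F means female, above THRESHOLD_M means male.
THRESHOLD_F=10
THRESHOLD_M=50
# In between, THRESHOLD_OF_DOUBT separates "likely female" from "likely male".
THRESHOLD_OF_DOUBT=26

for MYFILE in *.ogg; do
	[[ -e "$MYFILE" ]] || continue	# DIR contains no .ogg files
	# From the first 400 lines of output, take the leading number of each
	# matching line and keep the last 90 of those numbers.
	PROGOUT="$(aubionotes -i "$MYFILE" -p mcomb | head -n 400 | grep "^[0-9]*\.[0-9]*[[:space:]]*[0-9]*\.[0-9]*" | grep -os "^[0-9]*\.[0-9]*" | tail -n 90)"
	# Count how many of those numbers have an integer part below 50.
	MYSTERYVALUE=0
	for i in $PROGOUT; do
		if [ "${i/.*}" -lt 50 ]; then
			MYSTERYVALUE=$((MYSTERYVALUE + 1))
		fi
	done

	echo -e "${YELLOW_}MYSTERYVALUE: $MYSTERYVALUE ${NORM_}" 1>&2

	if [ "$MYSTERYVALUE" -lt "$THRESHOLD_F" ]; then
		echo -en "$MYFILE\t ${GREEN_}female"
	elif [ "$MYSTERYVALUE" -gt "$THRESHOLD_M" ]; then
		echo -en "$MYFILE\t ${RED_}male"
	elif [ "$MYSTERYVALUE" -lt "$THRESHOLD_OF_DOUBT" ]; then
		echo -en "$MYFILE\t ${GREEN_}likely female"
	else
		echo -en "$MYFILE\t ${RED_}likely male"
	fi
	echo -e "${NORM_}"
	# Optionally play the file that was just classified.
	if [[ "$2" == "mplayer" ]]; then "$2" "$MYFILE" >& /dev/null; fi
done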

77.181.243.178 (talk) 11:58, 25 April 2013 (UTC)

Could someone improve my recording, please?

Hello! I recorded myself reading the article Pacific Southwest Airlines Flight 1771, and I've been editing it. However, I discovered that some words weren't really pronounced, I think, in a way one could understand. So a few days later I re-recorded the sentences containing those words. I've already removed the background noise, but the two recordings sound different when spliced together. I mean, there's a sudden change in the sound. Do you get me? Well, even if you don't, you will when you hear it. I didn't want to publish this file, because it's quite uncomfortable to listen to... Can anyone recommend someone I could talk to, to see if they can even out my recording? If it's really unrecoverable, I have another version, but some words are mispronounced... I could also read the whole article again in a single recording, edit everything again, and try to pronounce the words correctly. But that would take several more hours of my time, and I don't really want to do it... And there's the risk that I'll mispronounce some words again without noticing at the time. So I'd prefer to ask someone to improve this recording, which is well pronounced but uncomfortable to listen to. Whom can I contact to ask this favour? Thanks in advance for any reply! -- Sim(ã)o(n) * Wanna talk? See my efforts? 10:39, 20 June 2013 (UTC)

Possible donation of professional narrator time by audiobook producer

Hi everyone -- I'm posting as a volunteer here, not in my role as a WMF staffer. This is unrelated to any WMF business.

I had a conversation with some people who work at a large provider of audiobooks. They are interested in recording Wikipedia articles with their professional narrators. I'm an audiobook addict myself, and thought this would be great. They need volunteers to help them figure out the details though -- where to put files, license issues, etc. I told them about how some institutions, like the British Museum, have hired "Fellows" from the Wikimedia community to advise and help with that kind of thing.

If anyone is interested in helping them, please let me know. Please email me at [my name]@gmail.com

They weren't ready to announce this publicly yet, so I'm not putting their name here. But I think they'll be happy to announce as soon as they can find someone from the community to be their guide in this unfamiliar space. Zackexley (talk) 19:58, 25 June 2013 (UTC)

One of your project's articles has been featured

Hello,
Please note that Child, which is within this project's scope, has been selected as one of Today's articles for improvement. The article was scheduled to appear on Wikipedia's Main Page in the "Today's articles for improvement" section for one week, beginning today. Everyone is encouraged to collaborate to improve the article. Thanks, and happy editing!
Delivered by Theo's Little Bot at 00:08, 12 August 2013 (UTC) on behalf of the TAFI team

Albert Bridge

Someone has expressed an interest in speaking and recording the Albert Bridge, London article. Please see Talk:Albert Bridge, London. Simply south...... fighting ovens for just 7 years 09:31, 16 August 2013 (UTC)

Use Speech Synthesis To Read All Pages Aloud Automatically

Now that Chrome and Safari (Firefox is on its way, I heard) support the Speech Synthesis part of Web Speech, perhaps an add-on/plugin or a simple box could be added to all of the English pages that plays back the text. I am willing to investigate it (starting with an attempt to implement it in JavaScript in a regular web page), if the project thinks it might work well. PhistucK (talk) 11:28, 7 February 2014 (UTC)

Just completed my first recording

It isn't perfect by any means and I'm unfamiliar with audio editing software at the moment, but I'd appreciate any feedback on my recording of our British Empire article. Sorry about my voice, it kind of drones. Here it is. --Andrew 13:19, 7 March 2014 (UTC)

I think your recording is just fine, although the sound level is inconsistent. Make sure you never change any settings while recording, and manipulate the files only AFTER you have cut everything together. I edited your recording and uploaded it. I did the following:
  • aligned the sound levels (still not perfect, there is a lot of clipping, and I don't have much experience either)
  • removed an ugly squeal („fiep“) at minute 44
  • changed from stereo to mono
  • reduced the bitrate (130 kbit/s is far too much for a voice recording in ogg-format)
If you have any questions, feel free to ask. --LordOider (talk) 14:11, 12 March 2014 (UTC)
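
The stereo-to-mono and bitrate steps listed above could be done from the command line roughly as follows. This is only a sketch, not the procedure actually used for the edit: it assumes an ffmpeg build with libvorbis support, the quality level 2 is an arbitrary pick for speech, and `reencode_for_voice` and `DRY_RUN` are hypothetical names made up for the example.

```shell
#!/bin/sh
# Sketch: downmix a recording to mono and re-encode it as lower-bitrate Ogg Vorbis.
# Assumes an ffmpeg build with libvorbis; reencode_for_voice and DRY_RUN are
# made-up names for this example.
reencode_for_voice() {
    # $1 = input file, $2 = output file
    # -ac 1           downmix to one channel (mono)
    # -c:a libvorbis  encode the audio as Vorbis
    # -q:a 2          variable-bitrate quality level (assumed adequate for speech)
    set -- ffmpeg -i "$1" -ac 1 -c:a libvorbis -q:a 2 "$2"
    if [ -n "$DRY_RUN" ]; then
        echo "$*"      # only print the command
    else
        "$@"           # actually run ffmpeg
    fi
}

# Print the command that would be run, without touching any files.
DRY_RUN=1 reencode_for_voice input.ogg output.ogg
```

Running the function without `DRY_RUN` set performs the conversion; it is worth listening to the result before uploading, since the right quality level depends on the recording.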

Could someone help and/or advise with Jaeger (clothing)?

I may have a go at recording but probably need to do more reading, as I am struggling a bit with IPA and how it works. Would somebody be able to take a look at Jaeger and possibly contribute an opinion/recording? I'm British, like the brand, and have always known and heard it pronounced as 'yayger' – I'm not sure if the phonetic rendition as it stands matches this, so I would really welcome help (the article was marked as having multiple issues and I'm working through them). If appropriate, I'd be really grateful to have a speech version on here, as it is a tricky one for anyone unfamiliar with the brand. Many thanks for any support you can offer. Also, sorry if I've posted this request in the wrong place, but I couldn't find the right place for general IPA help. Libby norman (talk) 12:55, 16 April 2014 (UTC)

External website that might help this project.

Hi. I just wanted to share something with you that I found this evening while recording a piece of audio in Maltese. It's a website called CuePrompter. Basically, you take your text, load it into the website, and it opens a second window on your browser, fully controllable for speed, forwards, backwards and pause. There is also no character limit on how much text you can load.

This second screen acts as a teleprompter, meaning you don't have to concentrate on using your mouse to scroll through text while reading an article. I thought I'd bring it to your attention in case it's useful to you. The site says it can be used for either commercial or non-commercial purposes, in which case, I think Wikipedia should be ok with it. Have a look and throw some thoughts at me :) Many thanks, CharlieTheCabbie (talk) 22:47, 14 May 2014 (UTC)

Thanks Charlie - this is a good resource. Arbitrarily0 (talk) 01:57, 10 August 2014 (UTC)

Ogg vorbis, or not?

Hi there, I was just having a look at doing some recording. Wikipedia:WikiProject Spoken Wikipedia says Ogg vorbis "is neither very well supported nor lossless", but then Wikipedia:WikiProject Spoken Wikipedia/Recording guidelines says to use Ogg vorbis. So... is anyone able to explain what I should be using? hamiltonstone (talk) 01:00, 10 August 2014 (UTC)

Ogg Vorbis is the preferred format, see also Commons:Audio. Lossless recording produces huge files, which makes streaming impossible for people with a slow internet connection. Uploading would also take very long, and the upload limit of currently 100 MB would limit the length of recordings to about 20 minutes. Spoken articles are also typically not edited afterwards, so if anyone keeps lossless versions of the recordings, it should be the narrators themselves. Furthermore, IMHO the average recording quality is so low that using a lossless format is absurd. Only compatibility is still a problem, especially in Apple's iWorld AFAIK.
Thanks for your note, I'll edit this section on the project page. --LordOider (talk) 11:24, 10 August 2014 (UTC)
Thank you, Lord. hamiltonstone (talk) 11:40, 10 August 2014 (UTC)
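
The file-size argument in this thread can be checked with a little arithmetic. The sketch below assumes 1 MB = 1,000,000 bytes and a constant bitrate; the helper name `max_minutes` and the example bitrates (uncompressed 16-bit 44.1 kHz stereo and mono PCM, and a 48 kbit/s Vorbis encode) are illustrative assumptions, not figures from the project guidelines.

```shell
#!/bin/sh
# Sketch: how many whole minutes of audio fit under an upload limit?
# max_minutes is a made-up helper; 1 MB is taken as 1,000,000 bytes.
max_minutes() {
    # $1 = size limit in MB, $2 = bitrate in bit/s
    echo $(( $1 * 8 * 1000000 / ($2 * 60) ))
}

max_minutes 100 1411200   # 16-bit 44.1 kHz stereo PCM
max_minutes 100 705600    # 16-bit 44.1 kHz mono PCM
max_minutes 100 48000     # assumed 48 kbit/s Ogg Vorbis
```

Under these assumptions the three calls print 9, 18 and 277, so an uncompressed mono recording runs out of room after roughly 18 minutes, in line with the "about 20 minutes" mentioned above, while a low-bitrate Vorbis file of the same size holds several hours.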

Mechanical reading of articles.

Hi, has the mechanical reading of articles been previously discussed? And if not, do people think this is worth pursuing? AbhiSuryawanshi (talk) 12:50, 26 September 2014 (UTC)

AbhiSuryawanshi I have tried to search the archives but was unable to find any discussion. There are a lot of archives here which are not sorted. For the automation issue, I think that eventually someone should manually check the archives to see what past discussion exists. Blue Rasberry (talk) 14:33, 26 September 2014 (UTC)
I'm not sure what advantage would be gained by creating audio files of mechanically-read articles. I would guess the more useful things to do would be more on the developer side, such as implementing tools that can generate new audio versions on the fly and optimizing the pages so that screenreader software can easily parse them. I haven't been impressed with the speech synthesis engines I've seen before, though, so there's probably still a good deal of utility to be had by having humans read the articles. 0x0077BE [talk/contrib] 14:47, 26 September 2014 (UTC)
Yes, I agree that human-read is best, but machine-read is better than nothing. I have actually listened to a great deal of machine-read text and enjoy it. It would be nice to have the option to have it read on the fly or to download it, for those with an intermittent connection. Doc James (talk · contribs · email) (if I write on your page reply on mine) 19:02, 26 September 2014 (UTC)
My point was that mechanical text and human-read text should be two separate endeavors, for the most part. One person could feasibly write a program that adds the ability to listen to or download any Wikipedia article you want and accepts various screenreader voice pack formats, etc., and that would cover all of Wikipedia at once in a versatile way. I don't think that should be mixed in with the spoken / audio versions of articles, because it's a different solution to the same problem with its own advantages and disadvantages.
If we were to just use existing screenreader software to manually generate the audio versions packaged with articles, that would be the worst of all possible solutions, because it combines everything that's bad about human-read audio files (you can't choose the accent / gender / quality / speed of the voice for yourself, they have to be manually updated whenever the article is updated, and they only cover a curated set of articles) with everything that's bad about machine-read articles (primarily that the emphasis and pronunciation are likely to be off, especially with regard to names and non-native sounds).
Even as a temporary measure I think it's a bad idea, because there's no way to tell, with the current interface, whether something is a machine-reading of the article or not, and someone generating a lot of low-quality machine-read articles and uploading them could make it very hard to tell which articles still need spoken versions of them. 0x0077BE [talk/contrib] 19:33, 26 September 2014 (UTC)

Have a look at Pediaphon. This service is linked on the German project page of the spoken Wikipedia, might be worth being linked here as well. --LordOider (talk) 19:19, 26 September 2014 (UTC)

Breakdown by section

I was thinking of taking on the NASA article, per a request on that page, and I notice it's incredibly long. Likely what I'm going to do is read it one section at a time, then stitch those together in the final version. I think I can probably include some sort of chapter-break metadata in the final output file to delineate it by section, but it seems like it would almost be better in general to break up articles by section and have separate files for each section, which are automatically stitched together in software. Even the most stable articles are probably going to see some change here and there over time, and I imagine it would make it easier to update the spoken version of articles if you didn't have to re-record the entire thing every time you make a change.

Of course, the downside is that you could very easily end up with a sort of "voice fragmentation" where an article is read by 10 different people as it gets updated piecemeal, but we could always address that in the review process - i.e. if the change in voice is too disruptive reject the update. 0x0077BE [talk/contrib] 17:56, 30 September 2014 (UTC)

As there are hundreds of thousands of interesting articles and only a few hundred of them have been read aloud, I think re-recording isn't an issue. Also, the spoken Wikipedia doesn't and can't claim to be up to date. However, splitting long recordings is advisable for several reasons.
Chapters would be great, but as far as I know neither common media players nor the Wikimedia software could handle/display this. --LordOider (talk) 22:15, 7 October 2014 (UTC)
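
Stitching separately recorded sections into one file, as proposed in this thread, could look roughly like this. It is only a sketch: it assumes ffmpeg is installed, that all section files share the same codec, sample rate and channel count, and that file names contain no single quotes; `make_concat_list` and the file names are hypothetical.

```shell
#!/bin/sh
# Sketch: join per-section recordings into one file with ffmpeg's concat demuxer.
# make_concat_list and the file names below are made up for this example.
make_concat_list() {
    # Print one "file '<name>'" line per argument, in the order given.
    for f in "$@"; do
        printf "file '%s'\n" "$f"
    done
}

# Show the list for two hypothetical section files.
make_concat_list 01-intro.ogg 02-history.ogg

# To actually join them (needs ffmpeg and the files to exist):
#   make_concat_list 01-intro.ogg 02-history.ogg > sections.txt
#   ffmpeg -f concat -safe 0 -i sections.txt -c copy article.ogg
```

With this layout, updating one section would only mean replacing that section's file and re-running the join; the stream copy avoids re-encoding, so it only works when the section files were encoded with matching settings.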

Comment on the WikiProject X proposal

Hello there! As you may already know, most WikiProjects here on Wikipedia struggle to stay active after they've been founded. I believe there is a lot of potential for WikiProjects to facilitate collaboration across subject areas, so I have submitted a grant proposal with the Wikimedia Foundation for the "WikiProject X" project. WikiProject X will study what makes WikiProjects succeed in retaining editors and then design a prototype WikiProject system that will recruit contributors to WikiProjects and help them run effectively. Please review the proposal here and leave feedback. If you have any questions, you can ask on the proposal page or leave a message on my talk page. Thank you for your time! (Also, sorry about the posting mistake earlier. If someone already moved my message to the talk page, feel free to remove this posting.) Harej (talk) 22:48, 1 October 2014 (UTC)

Request for feedback

Hello! I decided I wanted to try my hand at recording an article, and I would like to know what I could do to improve future recordings. I apologize if this is not the correct place to do this, as I am relatively uninitiated, and Wikipedia's somewhat labyrinthine guidelines can be overwhelming. Please let me know if I'm overlooking anything.

Sspungy (talk) 02:53, 2 October 2014 (UTC)

Listened to it. I think you did a great job. Really high sound quality, very clearly read. Here are my notes:
  • I don't think you needed to read the table of contents, that seems like extraneous information, though I don't think it's particularly bad, either.
  • At 8m18s, the recording volume temporarily increases drastically, for some reason. You may want to re-record that section or just make it consistent with the rest of the recording.
  • It's possible that this could use dynamic range compression. I tried running it in Audacity to see if it sounds better, but I was having some problems with that. Other than the glitch at 8:18, the sound seemed pretty consistent to me, but I listened to it in isolation, so you may want to download a few other articles or some high-quality podcasts and just make sure that it's mastered at approximately the "industry standard" level (i.e. it doesn't sound "too quiet"). 0x0077BE [talk/contrib] 15:05, 2 October 2014 (UTC)
You have a very good voice. It's pleasant to listen to this audio article. Thank you!
If someday you make an audio version of the article Raoul Wallenberg, or of any other "Good article", that would be great! :) -- Andrew Krizhanovsky (talk) 14:44, 2 October 2014 (UTC)

Mom & Me & Mom

Hello, thanks to User:Sspungy for recording this article; it was a very cool thing to do. However, I need to bring to your attention that you've mispronounced Maya Angelou's name. Please refer to the first sentence in her bio article; it demonstrates the correct way, along with references supporting it. I have no idea how you'd go about fixing it, but I thought you'd like to know. Christine (Figureskatingfan) (talk) 00:12, 6 October 2014 (UTC)

Agh, I was worried I would end up doing something silly like that. I'll go back and re-record the whole thing, it's honestly not a problem for me. I'm pretty embarrassed about mispronunciations of proper nouns in general, so I prefer that someone tell me if I did something like that rather than letting it slide.
Also, I recently tried to provide a spoken word version of another recently featured article, Meerkat Manor, and I'm once again deathly worried about mispronunciations (despite having to pause the recording to open a new tab to consult Merriam-Webster every 30 seconds). If anyone happens to have the time and patience, I'd appreciate feedback there as well, as I have no issue with doing a full re-recording. I know it's not exactly a perfect solution, as the featured article status has already come and gone.
I'll be sure to keep all of this in mind when I try recording another article.
Sspungy (talk) 04:12, 6 October 2014 (UTC)
I think it's excessive and needless to re-record a whole article just because a name that is mentioned twice was mispronounced. If anything, re-record the two sentences containing the name and cut them into your existing recording. If they sound slightly different because your setup has changed, that would be bearable.
You are not a professional speaker, nor is there any review process here, so mistakes are likely to be the rule rather than the exception. --LordOider (talk) 22:51, 7 October 2014 (UTC)
I was hoping, when I brought it to your attention, that there was some way to slice the correct pronunciation into the recording. Angelou's name is mentioned more than two times in this article; it's mentioned several times. The name of a book's author and the central figure in it (since it's an autobiography) is important to pronounce correctly. Perhaps what you can do next time is get feedback, if possible, from an article's main editor. If I knew of your intention to record the article, I probably would've warned you about it, since it's a mistake many people make--even President Obama when he presented Angelou the Presidential Medal of Freedom! ;) Christine (Figureskatingfan) (talk) 06:06, 8 October 2014 (UTC)
Oh sorry, I searched the article for her full name. Her last name appears so often, it might be less work re-recording the whole article. --LordOider (talk) 11:41, 12 October 2014 (UTC)

Infoboxes, units

I took a crack at File:En-Deepwater_Horizon-article.ogg. In the article, all quantities are given first in imperial units and then, parenthetically, in metric units. It may just be because I am, admittedly, not particularly good at this (one reason I'm doing it is to get better at this sort of thing), but I found that reading out both sets of units really disturbed the flow of the article, so I left out all the metric ones. What do people think about this unit situation? In the future, should I keep the secondary units even at the cost of the flow of the article?

Another issue I have is with infoboxes and other tabular information. I included the infobox on Deepwater Horizon, but I don't know that it's adding anything to the spoken version of the article. I'm planning on re-mastering it anyway (the speech volume leveling I used added more distortion than I originally thought), and I'm considering either clipping out the infobox entirely or at least moving it to the end of the article. For small infoboxes with sort of useful information (like infoboxes on person articles), I can see a case for inclusion, but for things that are essentially long tables of numbers, I think it might be worthwhile to advise against their inclusion. 0x0077BE [talk/contrib] 14:19, 20 October 2014 (UTC)