Checking article quality through number of visitors

Discussion:

John Erling Blad

2008-03-10 22:00:39 UTC

Has anyone done any research on how the number of visitors relates to
article quality? I believe the two are related somehow, but I'm not sure
how the relation can be modeled. The idea is to count the visitors who
read a particular segment of the article, and then accept that segment
as correct once a sufficient number of visitors have read it. It can
work together with a system for writer grading, where this system
changes the revision's grade, starting from whatever grade the writer has.

Compared to this, "stable versions" is like having a visitor with
ultimate power to mark the revision as good. This system does not give
the visitors such ultimate power; in fact it gives each of them no
more than a small fraction of the power necessary to claim the revision
is free of vandalism. Combined, I guess it is possible to make a system
that is better than either of them alone.

Any real vandalism will most likely never be marked as good, because the
limit can be set so high that the vandalism will be found by someone
long before the revision is marked as "patrolled". Most likely nothing,
or very little, of the revision will then survive, so the revision
itself will never be marked as patrolled. If a known good writer
contributes a revision, it gets a flying start and needs few visitors
("anonymous patrollers") before it is marked as "good". If the writer is
unknown, the revision needs a lot of visitors before it is marked as good.
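
A minimal sketch of the rule described above, assuming each visit adds a
small fixed credit and the writer's grade supplies a starting credit. The
names (Revision, VISIT_CREDIT, GOOD_THRESHOLD, visits_needed) and all
numbers are hypothetical and chosen only for illustration; the post does
not give concrete values:

```python
import math
from dataclasses import dataclass

# Hypothetical parameters; the post does not specify any of these.
GOOD_THRESHOLD = 100.0   # credit needed before a revision counts as "good"
VISIT_CREDIT = 1.0       # credit contributed by a single visit


@dataclass
class Revision:
    author_grade: float      # starting credit taken from the writer's grade
    visits: int = 0
    reverted: bool = False   # set when someone removes the revision's content

    def record_visit(self) -> None:
        # Visits only count while the revision is still live.
        if not self.reverted:
            self.visits += 1

    @property
    def credit(self) -> float:
        return self.author_grade + self.visits * VISIT_CREDIT

    @property
    def is_good(self) -> bool:
        # A reverted revision can never be marked as good.
        return not self.reverted and self.credit >= GOOD_THRESHOLD


def visits_needed(author_grade: float) -> int:
    """Visits a fresh revision still requires before it is marked as good."""
    remaining = GOOD_THRESHOLD - author_grade
    return max(0, math.ceil(remaining / VISIT_CREDIT))
```

With these assumed numbers, a writer graded 90 needs 10 visits, while an
unknown writer graded 0 needs 100. Setting GOOD_THRESHOLD high is what
gives ordinary readers time to spot and revert vandalism first.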

Even very seldom read articles have several visitors each week, and
over a year this adds up to a considerable number of visitors.
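
As a rough worked example of that arithmetic, assuming a hypothetical
article with three visits per week and a hypothetical requirement of 100
visits (neither figure comes from the post):

```python
import math


def weeks_until_good(visits_required: int, visits_per_week: float) -> int:
    """Whole weeks needed to collect visits_required visits at a steady rate."""
    if visits_per_week <= 0:
        raise ValueError("visits_per_week must be positive")
    return math.ceil(visits_required / visits_per_week)


# Three visits a week add up to 3 * 52 = 156 visits in a year.
visits_in_a_year = 3 * 52
weeks = weeks_until_good(100, 3)
```

Under these assumptions the revision would reach the limit in 34 weeks,
that is, within the first year.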

John

Travis Kriplean

2008-03-10 22:08:00 UTC

You can check out Priedhorsky's work at
http://www.cs.umn.edu/~reid/papers/group282-priedhorsky.pdf. They
synthesize page view data and create a model of value and damage based
on it. They also make that page view data available so other people can
play with it if they want.

Wilkinson's work might also be relevant:
http://ws2007.wikisym.org/space/WilkinsonHubermanPaper. I think that
they use page view data in order to calculate a page-rank of sorts for
each article.

Travis

#HU MEIQUN#

2008-03-11 01:59:47 UTC

Dear John and all,

With regard to your discussion on examining Wikipedia article quality,
the idea overlaps somewhat with our intuition of measuring article
quality based on the authority of authors/reviewers. We have been
researching from this perspective. Below is the abstract of our
research paper, titled "Measuring Article Quality in Wikipedia: Models
and Evaluation", published in CIKM 2007:

"
Wikipedia has grown to be the world largest and busiest free
encyclopedia, in which articles are collaboratively written and
maintained by volunteers online. Despite its success as a means of
knowledge sharing and collaboration, the public has never stopped
criticizing the quality of Wikipedia articles edited by non-experts and
inexperienced contributors. In this paper, we investigate the problem of
assessing the quality of articles in collaborative authoring of
Wikipedia. We propose three article quality measurement models that make
use of the interaction data between articles and their contributors
derived from the article edit history. Our Basic model is designed based
on the mutual dependency between article quality and their author
authority. The PeerReview model introduces the review behavior into
measuring article quality. Finally, our ProbReview models extend
PeerReview with partial reviewership of contributors as they edit
various portions of the articles. We conduct experiments on a set of
well-labeled Wikipedia articles to evaluate the effectiveness of our
quality measurement models in resembling human judgment.
"

We would appreciate your comments and suggestions.

Regards,
Meiqun HU

Lars Aronsson

2008-03-11 14:37:34 UTC

It works by counting the visitors that reads a particular
segment of the article, and then will accept the particular
segment as correct when a sufficient number of visitors has been
visiting.

How do you count visitors to segments? Statistics for visitors to
articles are now available, but they don't go down to the segment
level. It is also much harder to determine edits at the segment level
than to just count edits per article. Is it really worth the extra
effort?

--
Lars Aronsson (lars-***@public.gmane.org)
Aronsson Datateknik - http://aronsson.se

John Erling Blad

2008-03-11 16:46:55 UTC

If such a thing were to be done, then I guess some extra code would be
necessary. It is not necessary to check which part of an article someone
has visited, as a plain count of visits to the overall article would be
a viable simplification. I think such a simplification is possible, but
it will increase the number of necessary visits, especially before edits
to the last part of a large article can be flagged as clean, or even
correct.

Some code would probably have to be written to build a model describing
which parts of an article are read by a visitor, given the size of the
article.
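
One possible shape for such a model, purely as a sketch: assume the
chance that a visitor reads as far as a given byte offset decays
exponentially with that offset. The decay form, the constant
MEAN_READ_BYTES and both function names are assumptions made for
illustration, not something measured or proposed in this thread:

```python
import math

# Hypothetical: average number of bytes a visitor reads before leaving.
MEAN_READ_BYTES = 4000.0


def read_probability(offset_bytes: int) -> float:
    """Assumed chance that a visitor reads as far as offset_bytes."""
    return math.exp(-offset_bytes / MEAN_READ_BYTES)


def page_views_needed(segment_reads: int, offset_bytes: int) -> int:
    """Article-level page views needed before the segment at offset_bytes
    has collected segment_reads expected reads."""
    return math.ceil(segment_reads / read_probability(offset_bytes))
```

Under this assumption an edit at the top of the article needs exactly as
many page views as segment reads, while an edit far down a large article
needs many more, which matches the expectation that plain article counts
increase the number of necessary visits.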

John
