Scribd
www.charlestonco.com 31
doi:10.5260/chara.12.2.31
Date of Review: August 27, 2010
Composite Score: ★★★
Reviewed by Marie-Elise Waltz
Special Projects Librarian, Center for Research Libraries
<mwaltz@crl.edu>
their publications were being illegally shared. To prevent illegal uploads, Scribd has created a Copyright Management System (CMS), which compares the metadata of new documents with that of copyrighted works already contained in the CMS. Publishers and any other user can upload metadata for copyrighted materials to the CMS. When someone uploads a document, the CMS compares the uploaded text to what is already in the database. Works that duplicate metadata already present in the CMS are deleted. Documents whose rights cannot be transferred by the content producer will be removed.

Metadata is also part of the content supplied by those who contribute documents. Content producers who upload their documents add their own document description information. Producers choose a title and a controlled field from sixteen categories and additional subcategories (creative writing: poetry) at the time of upload. No additional metadata is required of the document producer. In fact, producers are not required to submit an author’s name. The name field is populated with the document producer’s Scribd handle. This sometimes makes it difficult to know the source of a public document. At the time of upload, providers may choose to add additional information in two free-text fields: Tags, which are single keywords that describe a document, and Description, which is a free-text field with unlimited characters.

Discovery

Although documents can be found through Google and other search engines, Scribd also provides search tools on its own site. Scribd has basic and advanced search tools with Boolean searching and searching limited to document title, content, tags, or description. There is also the option to use proximity operators and “fuzzy matching” to search. Search strategies based on keywords or metadata fields submitted by the document producer are also available.

Often when searching Scribd the problem is not finding results; it is sorting through them. A novice user may be intimidated when a search on the words “Chicago History” returns 112,540 documents with no obvious tools for narrowing these results. Scribd’s results page is confusing. It presents an image of each document’s cover or first page, a title, and an excerpt from either a short description by the content provider or the first words of the publication. When a producer has not provided a description, the available information is sometimes nonsensical. It appears this nonsensical text is created when the field is taken from the beginning of the document, which is often a table of contents or a dedication. Other information on the results page includes the producer’s Scribd handle, the number of comments about the document, and the date the document was uploaded. The information presented is not always useful for helping a user judge the document’s value quickly. Instead, he often must click into the document to assess its usefulness. In addition, since the author’s Scribd handle is often an alias, it is difficult to identify the source of the document. To add to this problem, the author may have closed her Scribd account, and so she is no longer available through the Scribd network. Not having a source means the content is not verifiable and so may not be usable for scholarly purposes.

Narrowing Results. Scribd offers some tools for narrowing results, though they are not explained and are not always useful. The user can narrow results by restricting them to category, length (number of pages), file type, language, or price (free or for sale). He cannot combine several narrowing terms, however, which limits the benefits of these tools. Narrowing by length of document or format is not always particularly useful, and the potential for excluding relevant documents is high. Narrowing by category is the most useful option, but it is usually too broad to do a good job because every document in Scribd is required to be assigned to one of only sixteen categories. This means every category is heavily used. To illustrate the problem, a search on World War II results in 1.4 million documents. If one narrows on the category Business/Law (yes, business and law are combined into one category), results are reduced to 39,900 documents. Alternatively, if one narrows on the category Government documents, results are reduced to 24,380 documents. No user will be willing to go through this number of documents.

A Search Guide Would Be Good. Scribd needs a search guide for its search tools. A guide would explain how the metadata is created and how to effectively search and browse documents. The advanced search screen provides some short instructions on how to use some of the advanced search algorithms and operators, but there is a need for clear, step-by-step instructions.

Alternative Ways to Find Documents. Other methods for discovering documents in Scribd do not involve the search tools. The Groups feature is used to share documents among all Scribd users or within private groups. Groups has the potential to be useful as a method for building smaller collections of Scribd documents. It avoids the scattershot results of the search tools. In one example, there is a group that is collecting Federal Registers found among Scribd’s documents.

Another option for those who do not elect to use search tools is the Explore pages. Explore offers three pages for discovery. These pages are great for a new user who wants to know what is available within Scribd. Users can browse documents in three subareas: Category, Trending, and Topic. Trending uses statistical information collected by Scribd to present the most popularly viewed Scribd documents. Trending is an interesting option because it exposes popular documents, some of them unexpected and interesting. Topic assembles pages based on tags, a voluntary field assigned by the document producer at the time of upload. Scribd takes the most popularly assigned tags and exposes them on this page. Popular tags include the words “Hollywood” and “self-help.” These tags are not useful for those researching scholarly subjects, though they are perhaps of interest to a more general audience. Fewer documents are available on the Explore pages because many document producers do not include tags or a description when uploading a document. Scribd plans to create tools that will encourage users to tag documents they read, and, if this goal is achieved, it has the potential to make Scribd’s Explore pages more relevant.

Once a user has identified a document he wants to read, Scribd offers different options for access to the document. Documents can be read at the Scribd site, downloaded to a local device, or printed out, depending on the content contributor’s permissions and the user’s choice. Purchased publications have the same platform options.

Critical Evaluation

Scribd’s strength is in its open content and its ability to provide users with a resource for easily storing an unlimited number of documents online. Scribd offers document owners several important benefits, among them unlimited storage, archiving capabilities, and control over their content (users may disable comments, printing, and copying of their documents). Scribd offers small publishers (for example, Pratham Books, mentioned earlier) exposure to a more global audience. It also offers a place for electronic grey literature to be distributed. Services like Groups have a great deal of potential for providing a way for people to share content across the world. It is certainly a great Web site for someone who seeks free reading matter. There are
interesting documents to browse as you wait for your next flight. This is all very useful, especially considering it is free.

Scribd’s biggest weakness is in its search tools. The tools do not provide the right search capabilities for slicing through the huge number of documents that reside within the system. Paging through results is slow, and the available options for viewing pages and narrowing results are not good enough. Also, the controlled search terms for categories are too broad to be effective for searching; categories such as business/law and magazines/newspaper do not define the subject matter adequately. The absence of good searching documentation and guidance is a problem that needs to be addressed. Categories and topics need to be explained, as do the algorithms behind how these pages are constructed.

Content: ★★★
Quantity trumps quality on the Scribd Web site. A lot of unreliable anonymous content is present, but there is also a lot of useful content. Among what is useful and free are case law, government reports, and science data.
User Interface/Searchability: ★★
The Scribd search tools need further refinement. Results do not provide enough tools for quick decision making or search refinement. Browse pages are interesting and help new users to get a handle on what content is available. In particular, the Trending page is interesting. An online search guide is needed.
Pricing: N/A
Prices for documents and commercial publications are determined by the contributor, and so Scribd is not accountable for pricing. A quick, informal survey shows prices for documents on Scribd are usually similar to those on other online bookstore sites such as Amazon and Barnes & Noble.

Contract Provisions and Authentication

Scribd’s terms of use are fairly standard. One must be at least 13 years old, or over the age of majority in the user’s jurisdiction, and not have been previously suspended or removed from Scribd. Organizational representatives must warrant that they are authorized representatives. All content uploaded to Scribd must be original or appropriately licensed to the producer. Privacy terms include the disclosure that information provided in a user profile is available to organizations for marketing or promotional use and may be collected and used by others without restriction. Documents uploaded from non-U.S. locations are transferred outside their original jurisdiction into the U.S. or other countries. Scribd conforms to the Digital Millennium Copyright Act (DMCA), which requires that it protect copyright and content holders’ rights. Users who repeatedly infringe on these rights are suspended. Content is for personal use and is owned by the producer. Scribd reserves the right to use uploaded content to promote its services.

Users can browse Scribd and read documents without an account. An ID and password are required for those who want to download an e-book, leave a comment, or upload content. Users can sign up via their Facebook account or with an e-mail address.

References

Harvard School of Engineering and Applied Sciences. Q&As with notable SEAS alumni, Trip Adler ’06, A.B. (Computer Science). Harvard University. <http://www.seas.harvard.edu/news-events/publications/qa/trip-adler>.
Kim, Theodore. “Plano Blog.” [Weblog entry.] The Plano Morning Jog. Dallas Morning News. 12 Aug. 2010. <http://planoblog.dallasnews.com/archives/2010/08/the-plano-morning-jog-thurs-au-1.html>. 27 Aug. 2010.
Laird, Michelle. Scribd Marketing Department, Scribd. Personal interview. 11 Aug. 2010.
Quantcast. The Scribd Network. Quantcast. <http://www.quantcast.com/scribd.com>.

The Charleston Advisor / October 2010 www.charlestonco.com