AAAS > International > Africa

   
   

Contents

Introduction

What Is Feasibility?

Methodology

Summary of Results

A Closer Look at Each University:
Zambia
Makerere
Ghana
Cheikh Anta Diop

Recommendations

Conclusions and Next Steps

Acknowledgements

 
 

Methodology

Our methodology was simple: we went to each university and downloaded journal articles under typical, rather than ideal, local conditions. In some cases we also tested journal access during late evening or weekend hours, when the access center would normally be closed, for the sake of comparison, and achieved better results. However, since local users would not ordinarily have access at those times, only the results obtained during the usual, higher-traffic hours are used to reflect the current feasibility of journal access from these locations.

The journals we accessed represent the two basic types of online journals: those formatted as PDF files and those formatted in HTML. An article in PDF format must be downloaded in its entirety, graphics and all, before it can be viewed, and additional software (e.g., Adobe Acrobat Reader) is then needed to view it. PDF files tend to be large, typically in the 250-700 kilobyte (kb) range and often a megabyte or more, although smaller articles also exist. By contrast, an HTML article downloads like a web page: the text tends to appear very quickly and the graphics come in more gradually. The graphics in HTML articles first appear as small versions, with an option to download larger versions, so the total "weight" of an HTML article is usually less, and often much less, than that of the same article in PDF (see Tables 1 and 2). Science magazine and Nature are HTML-based journals that also offer a PDF download option; none of the PDF-based journals in the IDEAL catalog offered an HTML option. Note that overall file size varies much less among HTML articles than among PDF files. HTML articles with their default graphics are consistently in the 75-150 kb range, whereas PDF files vary from less than 100 kb to more than 3 megabytes.

Table 1. Typical file sizes in kilobytes: HTML vs. PDF

Article   HTML version (default)   HTML version (large)   PDF version
1         89                       143                    105
2         77                       129                    106
3         75                       120                    138
4         130                      314                    355
5         120                      312                    420
6         147                      455                    498
7         110                      237                    656
8         89                       143                    3077

Note: HTML files are composites of multiple text and graphics files. The size given in the first column ("default") is the sum of the text plus the initial default graphics, i.e., the article as it appears in its entirety without the extra step of downloading the larger versions of the same graphics. The second column ("large") shows the total size in kilobytes counting the larger, optional graphics. A PDF file is a single file that includes text and high-quality graphics.
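
The difference in variability between the two formats can be checked directly from the Table 1 figures. The short Python sketch below is only an illustration: it re-enters the table's sizes by hand (the variable and function names are ours, not part of the study) and reports the smallest and largest size in each column, along with the ratio between them.

```python
# Sizes in kilobytes, copied from Table 1 (articles 1 through 8).
html_default = [89, 77, 75, 130, 120, 147, 110, 89]
html_large = [143, 129, 120, 314, 312, 455, 237, 143]
pdf = [105, 106, 138, 355, 420, 498, 656, 3077]


def spread(sizes):
    """Return (smallest, largest, largest/smallest ratio) for one column."""
    smallest, largest = min(sizes), max(sizes)
    return smallest, largest, largest / smallest


for label, column in [
    ("HTML (default)", html_default),
    ("HTML (large)", html_large),
    ("PDF", pdf),
]:
    smallest, largest, ratio = spread(column)
    print(f"{label:15s} {smallest:5d} to {largest:5d} kb  (x{ratio:.1f})")
```

On these eight articles the default HTML versions span about a factor of two, while the PDF versions span a factor of nearly thirty.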

 

Table 2. Download times for same article: HTML vs. PDF

Article   Size (PDF version)   HTML download time (text only)   HTML download time (with graphics)   PDF download time
1         80 kb                45 seconds                       58 seconds                           2 min. 30 seconds
2         83 kb                53 seconds                       1 min. 22 seconds                    1 min. 47 seconds
3         99 kb                48 seconds                       1 min. 28 seconds                    2 min. 31 seconds
4         198 kb               1 min. 5 seconds                 1 min. 42 seconds                    4 min. 12 seconds
5         303 kb               58 seconds                       1 min. 31 seconds                    9 min. 45 seconds
6         530 kb               52 seconds                       2 min. 20 seconds                    11 min. 4 seconds
7         828 kb               45 seconds                       2 min. 44 seconds                    25 min. 32 seconds

Note: These figures include experiments conducted at the University of Ghana and Cheikh Anta Diop University (Dakar, Senegal).
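
To put the PDF times in Table 2 in perspective, each row can be converted into an effective transfer rate by dividing the file size by the download time. The Python sketch below does only that arithmetic. It assumes that "kb" in the table means kilobytes, as in the text above, and the names used are illustrative. Because the figures come from more than one university, the rates mix different local conditions and should be read as rough indications only.

```python
# (PDF size in kb, PDF download time as (minutes, seconds)), from Table 2.
pdf_downloads = [
    (80, (2, 30)),
    (83, (1, 47)),
    (99, (2, 31)),
    (198, (4, 12)),
    (303, (9, 45)),
    (530, (11, 4)),
    (828, (25, 32)),
]


def throughput_kb_per_s(size_kb, minutes, seconds):
    """Effective transfer rate in kb per second for one complete download."""
    return size_kb / (minutes * 60 + seconds)


for article, (size_kb, (minutes, seconds)) in enumerate(pdf_downloads, start=1):
    rate = throughput_kb_per_s(size_kb, minutes, seconds)
    print(f"Article {article}: {size_kb} kb in {minutes} min. {seconds} s = {rate:.2f} kb/s")
```

For all seven PDF downloads the effective rate works out to roughly 0.5 to 0.8 kb per second.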

 

We logged our results both manually, by timing each download and noting when it completed, and automatically, using the proxy server's logging capability. The automatic log records, to the millisecond, how long it took to download a file completely. This is particularly useful for evaluating PDF download times, since a PDF is a single file containing the entire article, graphics and all, and it does not become usable until it is completely downloaded. By contrast, HTML-based articles, as composites of separate text and graphics files, are logged by the proxy server as multiple file downloads, with no way to tell which files belong together as a single article. In addition, an HTML article becomes useful before all of its component pieces have downloaded, since the text tends to appear first and can be read while the graphics are still on the way. The automatic log is therefore a less useful measure of how quickly HTML articles download, because the point at which a download becomes useful differs from the point at which it is complete. For HTML articles, then, the manually timed entries are perhaps more enlightening, although less "accurate," than the raw transfer-rate data.
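
As an illustration of how per-file timings might be pulled out of a proxy log, the Python sketch below reads lines in Squid's native access.log layout, in which the second field is the elapsed time in milliseconds and the seventh is the requested URL. This is an assumption on our part: the study does not say which log format was used, and the sample lines, host names, and addresses below are invented for the example rather than taken from the actual logs.

```python
def parse_squid_line(line):
    """Split one native-format access.log line into the fields used here.

    Assumed field order: time, elapsed (ms), client, code/status, bytes,
    method, URL, ident, hierarchy/peer, content type.
    """
    fields = line.split()
    return {
        "elapsed_ms": int(fields[1]),
        "status": fields[3],
        "bytes": int(fields[4]),
        "url": fields[6],
        "mime": fields[9] if len(fields) > 9 else "-",
    }


def pdf_entries(lines):
    """Yield parsed entries for requests that look like PDF downloads."""
    for line in lines:
        if not line.strip():
            continue
        entry = parse_squid_line(line)
        if entry["url"].lower().endswith(".pdf") or entry["mime"] == "application/pdf":
            yield entry


# Invented sample lines, for illustration only.
sample_log = [
    "981173030.123 150000 192.168.1.10 TCP_MISS/200 81920 GET "
    "http://journals.example.org/article1.pdf - DIRECT/10.0.0.1 application/pdf",
    "981173090.456 45000 192.168.1.10 TCP_MISS/200 20480 GET "
    "http://journals.example.org/article1.html - DIRECT/10.0.0.1 text/html",
    "981173095.789 13000 192.168.1.10 TCP_MISS/200 9216 GET "
    "http://journals.example.org/fig1_small.gif - DIRECT/10.0.0.1 image/gif",
]

for entry in pdf_entries(sample_log):
    seconds = entry["elapsed_ms"] / 1000
    print(f"{entry['url']}: {entry['bytes']} bytes in {seconds:.3f} s")
```

The HTML page and its graphic appear in the sample as two unrelated log lines, which is the grouping problem described above: nothing in the log ties them together as one article.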

In addition to testing the Internet connections by downloading journal articles, we also worked in each case to improve the connections, and succeeded. We brought with us a FreeBSD Unix system running the Squid proxy server and transferred it to three of the four universities, with markedly improved performance. The fourth university (in Senegal) was committed to using Linux and a different proxy server, but we were able to improve its authentication server, which again enhanced network performance.

   
 
