IT 223: Data Analysis and Statistical Software
Fall 2005

Assignment 1

Due Friday, January 13, before 11:30 pm by DL Web Submission

Overview

For this assignment, you will collect two sets of data and compare them using simple descriptive statistics and graphs prepared by software such as Excel. Data collection consists of using ping to time the round-trip travel of a packet over the internet.

Data Collection

Perform the following steps to collect 2 sets of at least 30 timings:

  1. Select a Unix-based computer with internet access. Linux, Unix, or Mac OS X (which is built on Unix) will work. Consider using your student email (students.depaul.edu) account, a CTI Linux account (hostnames are ctilinux1.cstcis.cti.depaul.edu and ctilinux2.cstcis.cti.depaul.edu), or a hawk account. The ping tool on MS Windows does not provide enough precision for this assignment. If you cannot get ping to run on a Unix/Linux computer, I have created a web application that runs the command for you on a Linux server. A second web application is also available; it may be the better choice, since it always shows fractions of milliseconds regardless of how long the round trip takes.
  2. Use the command line interface for the computer. On a Macintosh, it can be found as Terminal in the following menu hierarchy: Applications-->Utilities. On Unix and Linux, you are probably already using a command line interface when you log in.
  3. Identify a pingable server in a faraway place. For example, by typing australia university into Google, I found links to Australian Web servers. Most of these Web servers (e.g. www.rmit.edu.au) were pingable.
  4. Run ping so that it issues round-trip timings. On Unix, this command is ping hostname. For hostname, specify the name of the server you identified in the previous step. To obtain help on ping, type man ping on Unix/Mac/Linux.
  5. Execute ping at least 30 times and record the time of each round trip in milliseconds, keeping precision to at least one tenth of a millisecond (note that Unix/Mac/Linux gives you fractions of milliseconds while MS Windows rounds to the nearest millisecond).
  6. Select a different host or a different computer to collect a second data set of 30 timings.
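
The timings in step 5 can be copied by hand, but they can also be pulled out of saved ping output. The following is a minimal Python sketch, assuming your ping prints each reply with a field of the form time=231.4 ms, as common Linux and Mac OS X versions do. The function name parse_ping_times, the host name, and the sample text are made up for illustration; the values are not real measurements.

```python
import re

# Matches the round-trip field in a ping reply line, e.g. "time=231.4 ms".
# Assumption: the local ping prints times in this form; check your own output.
# The summary line ("... time 3004ms") has no "=" and is deliberately skipped.
TIME_FIELD = re.compile(r"time[=<]\s*([0-9]+(?:\.[0-9]+)?)\s*ms")


def parse_ping_times(ping_output):
    """Return the round-trip times (in milliseconds) found in ping output."""
    times = []
    for line in ping_output.splitlines():
        match = TIME_FIELD.search(line)
        if match:
            times.append(float(match.group(1)))
    return times


# Illustrative text only (made-up values, placeholder host), not real data.
SAMPLE_OUTPUT = """\
64 bytes from host.example (192.0.2.1): icmp_seq=1 ttl=50 time=231.4 ms
64 bytes from host.example (192.0.2.1): icmp_seq=2 ttl=50 time=229.8 ms
Request timeout for icmp_seq 3
64 bytes from host.example (192.0.2.1): icmp_seq=4 ttl=50 time=240 ms
"""

if __name__ == "__main__":
    print(parse_ping_times(SAMPLE_OUTPUT))
```

On Linux and Mac OS X, a command such as ping -c 30 hostname > run1.txt stops after 30 packets and saves the output to a file that a function like this could read. Lost packets produce no timing, so check that you still have at least 30 values.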

For both sets of data, make sure you note the conditions under which you collected the data. This includes the operating system, the time of day, and what you know about the internet connection.
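
One way to keep the conditions together with the timings is to store both in a single CSV file, which Excel can open directly. This is only a sketch: the function name write_timings_csv, the column names, and every value shown are hypothetical placeholders, not a required format.

```python
import csv
import io


def write_timings_csv(out_file, timings_ms, conditions):
    """Write the collection conditions as note rows, then one timing per row."""
    writer = csv.writer(out_file)
    # Conditions go first, marked with "#" so they are easy to spot in Excel.
    for name, value in conditions.items():
        writer.writerow(["# " + name, value])
    writer.writerow(["trial", "round_trip_ms"])
    for trial, time_ms in enumerate(timings_ms, start=1):
        writer.writerow([trial, time_ms])


if __name__ == "__main__":
    # Placeholder values for illustration only; substitute your own notes and data.
    example_conditions = {
        "operating system": "Linux",
        "time of day": "evening",
        "connection": "campus network",
    }
    buffer = io.StringIO()
    write_timings_csv(buffer, [231.4, 229.8, 240.0], example_conditions)
    print(buffer.getvalue())
```

To write a real file instead of printing, open it with open("run1.csv", "w", newline="") and pass the file object as out_file.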

Statistics and Graphs

Calculate basic descriptive statistics (mean, median, standard deviation, and extreme values) and create a graph that shows the distribution of each data set. Choose good bin sizes for revealing the distribution. Make sure that your graphs are labeled well and allow for an easy comparison between your two sets of data. Links to creating histograms using Excel can be found on this page.
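
If you prefer to check your Excel results with a script, the statistics and histogram counts can be sketched in Python as follows. This assumes the sample standard deviation (dividing by n - 1) is the one wanted; the function names describe and bin_counts and the four data values are made up for illustration. Using the same starting point and bin width for both data sets keeps the two histograms directly comparable.

```python
import math
import statistics


def describe(timings_ms):
    """Return the basic descriptive statistics for one data set."""
    return {
        "n": len(timings_ms),
        "mean": statistics.mean(timings_ms),
        "median": statistics.median(timings_ms),
        "stdev": statistics.stdev(timings_ms),  # sample standard deviation (n - 1)
        "min": min(timings_ms),
        "max": max(timings_ms),
    }


def bin_counts(timings_ms, low, width, n_bins):
    """Count values in n_bins equal-width bins starting at low.

    Bin i covers [low + i*width, low + (i+1)*width); values outside
    that range are ignored, so choose low and n_bins to cover the data.
    """
    counts = [0] * n_bins
    for t in timings_ms:
        i = math.floor((t - low) / width)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts


if __name__ == "__main__":
    # Made-up values for illustration only; use your 30 or more timings.
    data = [10.0, 12.0, 14.0, 20.0]
    for name, value in describe(data).items():
        print(name, value)
    # Text histogram: one row per bin, one "*" per observation.
    low, width = 10, 5
    for i, count in enumerate(bin_counts(data, low, width, 3)):
        print(f"{low + i * width:>5} to {low + (i + 1) * width:<5} {'*' * count}")
```

Try several bin widths before settling on one: bins that are too wide hide the shape of the distribution, and bins that are too narrow leave mostly counts of 0 and 1.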

Summary

Write a paragraph that describes your results. The description should include the basic descriptive statistics and some general comments on the shape of each distribution. Provide a possible interpretation for any differences you find between your two sets of data. Make sure you reference and explain your graphs. Be very careful about how you support your interpretations with your results.

Submission

For submitting your assignment, assemble a brief report that consists of the following:

Your report should be well written and formatted. Please place all contents in one document using a common format (e.g. Word, PDF, RTF, HTML). You only need to submit this one document.