IT 223: Data Analysis and Statistical Software
Fall 2005
Assignment 1
Due Friday, January 13, before 11:30pm by DL Web Submission
Overview
For this assignment, you will collect two sets of data and compare
them using simple descriptive statistics and graphs prepared by
software such as Excel. Data collection consists of using ping to time the
round-trip travel of a packet over the internet.
Data Collection
Perform the following steps to collect 2 sets of at least 30
timings:
- Select a Unix-based computer with internet access. Linux, Unix or
Mac OS X (which is built on Unix) will work. Consider using your student
email (students.depaul.edu) account, a CTI Linux
account (hostnames are ctilinux1.cstcis.cti.depaul.edu and
ctilinux2.cstcis.cti.depaul.edu), or a hawk account.
The ping tool on MS Windows does not provide enough precision for this
assignment. If you cannot run ping on a Unix or Linux computer, I have
created a web application that runs the command for you on a Linux
server. A second web application is also available. It may be the
better choice, since it always shows fractions of a millisecond
regardless of how long the round trip takes.
- Use the command line interface for the computer. On a Macintosh,
it can be found as Terminal in the following menu hierarchy:
Applications-->Utilities. On Unix and Linux, you are
probably already using a command line interface when you log in.
- Identify a pingable server in a faraway place. For example, by
typing australia university into Google, I found links to
Australian Web servers. Most of these Web servers
(e.g. www.rmit.edu.au) were pingable.
- Run ping so that it reports round-trip timings. On
Unix, the command is ping hostname. For
hostname, specify the name of the server you
identified in the previous step. To obtain help on ping, type man
ping on Unix/Mac/Linux.
- Execute ping at least 30 times and record each round-trip time in
milliseconds, keeping at least one decimal place (tenths of a
millisecond). Note that Unix/Mac/Linux gives you fractions of
milliseconds, while MS Windows rounds to the nearest millisecond.
- Select a different host or a different computer to collect a
second data set of 30 timings.
For both sets of data, make sure you note the conditions under which
you collected the data. This includes the operating system, the time of
day, and what you know about the internet connection.
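If you save the ping output to a text file, you do not have to copy the
timings by hand. The following is a minimal Python sketch, not a
required part of the assignment. It assumes the per-reply lines contain
a field of the form time=201.3 ms, which is the layout printed by ping
on Linux and Mac OS X. The name parse_ping_times and the sample lines
are made up for illustration; the sample values are not real
measurements.

```python
import re

# Matches the per-reply timing field, e.g. "time=201.3 ms".
# Assumption: the Linux / Mac OS X ping layout; other systems may differ.
TIME_FIELD = re.compile(r"time[=<]\s*([0-9]+(?:\.[0-9]+)?)\s*ms")


def parse_ping_times(ping_output):
    """Return the round-trip times (in ms) found in raw ping output, in order."""
    return [float(m.group(1)) for m in TIME_FIELD.finditer(ping_output)]


# Made-up lines that only illustrate the layout; these are NOT real measurements.
SAMPLE_OUTPUT = """\
64 bytes from 192.0.2.1: icmp_seq=1 ttl=52 time=201.3 ms
64 bytes from 192.0.2.1: icmp_seq=2 ttl=52 time=198.7 ms
64 bytes from 192.0.2.1: icmp_seq=3 ttl=52 time=205 ms
"""

if __name__ == "__main__":
    for t in parse_ping_times(SAMPLE_OUTPUT):
        print(t)
```

On Linux and Mac OS X, ping -c 30 hostname stops after 30 replies, so
redirecting its output to a file gives one complete data set to parse.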
Statistics and Graphs
Calculate basic descriptive statistics (mean, median, standard
deviation, and extreme values) and create a graph that shows the
distribution of each data set. Choose good bin sizes for revealing
the distribution. Make sure that your graphs are labeled well and
allow for an easy comparison between your two sets of data. Links to
instructions for creating histograms in Excel can be found on this page.
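If you want to check the numbers your statistics program reports, the
same quantities can be computed in a few lines. This is a minimal
Python sketch under stated assumptions, not the required method: the
function names describe and bin_counts are hypothetical, the standard
deviation is the sample version (dividing by n - 1, as Excel's STDEV
does), and the bins are equal-width and half-open.

```python
import math
import statistics


def describe(times):
    """Basic descriptive statistics for a list of round-trip times (ms)."""
    return {
        "n": len(times),
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "stdev": statistics.stdev(times),  # sample standard deviation (n - 1)
        "min": min(times),
        "max": max(times),
    }


def bin_counts(times, low, width, n_bins):
    """Count values in n_bins equal-width bins starting at low.

    Bin i covers [low + i*width, low + (i+1)*width).
    Values outside the covered range are ignored.
    """
    counts = [0] * n_bins
    for t in times:
        i = math.floor((t - low) / width)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts
```

Calling bin_counts with the same low, width and n_bins for both data
sets puts the two histograms on identical bins, which makes them easier
to compare side by side.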
Summary
Write a paragraph that describes your results. The descriptions
should include the basic descriptive statistics and some general
comments on the shape of the distribution. Provide a possible
interpretation of any differences you find between your two sets of
data. Make sure you reference and explain your graphs. Be very
careful with how you support your interpretations with your results.
Submission
To submit your assignment, assemble a brief report that consists
of the following:
- A description of the data collection process and conditions
- A written summary of the results
- Appropriate tables and graphs pasted from the output of
your statistics program
- Possible reasons for your results
- A listing of your data in the appendix of your report
Your report should be well written and formatted. Please place all
contents in one document using a common format (e.g. Word, PDF, RTF,
HTML). You only need to submit this one document.