Introduction to the Internet and the World Wide Web


What is the Internet

  • Network: Communications system connecting two or more computers.
  • LAN: Local Area Network. Connects, usually by cable, a group of desktop PCs and other devices, such as printers, in an office or a building.
  • MAN: Metropolitan Area Network. A data network intended to serve an area the size of a large city. Such networks are being implemented with innovative techniques, such as running optical fiber through subway tunnels.
  • WAN: Wide Area Network. A communications network that uses such devices as telephone lines, satellite dishes, or radio waves to span a larger geographic area than can be covered by a LAN.
  • Internet: A network of networks. It is a worldwide network that connects more than 400,000 smaller independent networks or WANs in more than 200 countries. It joins many government, university and private computers together and provides an infrastructure for the use of e-mail, bulletin boards, file archives, hypertext documents, databases and other computational resources.

Basic Concepts of Internet and WWW

The Internet consists of:

  • Computer hosts (or servers): provide information, such as HTML pages, or services, such as file access and email.
  • Routers and switches: specialized computers that route traffic on the Internet between clients and servers, and among different hosts.
  • Communication channels: leased phone lines (such as T1 and T3), cable, satellite, and wireless. Together these channels make up the communication "backbone" of the Internet.

Bandwidth by Connection

  • Bandwidth: the capacity or amount of data that can be transmitted across a communication channel. The higher the capacity, the quicker web pages or files download on your computer. Usually measured in terms of bits per second (bps).
  • Telephone lines - 28 Kbps (kilobits per second) to 56 Kbps
  • DSL (Digital Subscriber Line) - 512 Kbps to 1.5 Mbps (megabits per second)
  • Cable modems - up to 10 Mbps
  • T1 leased line - 1.54 Mbps
  • T3 leased line - 44.7 Mbps
  • ATM (Asynchronous Transfer Mode) - 622 Mbps
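
To make the effect of bandwidth concrete, the sketch below estimates the ideal download time of a file over some of the connection types listed above. The 1 MB file size and the function name `download_seconds` are assumptions chosen for illustration, and the figures ignore protocol overhead and congestion, so real transfers are slower.

```python
# Connection speeds in bits per second, taken from the list above
# (upper end of each range).
SPEEDS_BPS = {
    "Telephone line (56 Kbps)": 56_000,
    "DSL (1.5 Mbps)": 1_500_000,
    "Cable modem (10 Mbps)": 10_000_000,
    "T1 leased line (1.54 Mbps)": 1_540_000,
    "T3 leased line (44.7 Mbps)": 44_700_000,
    "ATM (622 Mbps)": 622_000_000,
}


def download_seconds(size_bytes: int, bandwidth_bps: float) -> float:
    """Ideal transfer time: number of bits to send divided by bits per second."""
    return size_bytes * 8 / bandwidth_bps


if __name__ == "__main__":
    file_size = 1_000_000  # hypothetical 1 MB file (assumption for this example)
    for name, bps in SPEEDS_BPS.items():
        print(f"{name}: {download_seconds(file_size, bps):.2f} s")
```

Note the factor of 8: file sizes are usually quoted in bytes, while bandwidth is quoted in bits per second.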

Client-Server Model

  • Internet is based on client / server model
  • Client is end user’s computer or workstation with software that sends requests to a server
  • Server (or host) is remote computer with software that handles requests from clients
  • In the case of the Web: client software is a Web browser (running on the client computer), and the server software is a Web (or HTTP) server, running on the host computer. Examples of browsers are Internet Explorer and Netscape. Examples of Web servers are Microsoft IIS and Apache.
  • In the case of email: the client software is an email client (such as Outlook), and the server software is a mail server.
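
The request/response pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two ends run in one process and talk over a local socket pair rather than across the Internet, and the request text `GET greeting` and the reply strings are invented for the example, not part of any real protocol.

```python
import socket
import threading


def serve_one(conn: socket.socket) -> None:
    """Server side: read one request, decide on a reply, send it back."""
    with conn:
        request = conn.recv(1024).decode("utf-8")
        if request == "GET greeting":  # hypothetical request format
            reply = "OK hello from the server"
        else:
            reply = "ERROR unknown request"
        conn.sendall(reply.encode("utf-8"))


def ask(request: str) -> str:
    """Client side: send a request and wait for the server's reply."""
    client_end, server_end = socket.socketpair()  # two connected local sockets
    worker = threading.Thread(target=serve_one, args=(server_end,))
    worker.start()
    with client_end:
        client_end.sendall(request.encode("utf-8"))
        reply = client_end.recv(1024).decode("utf-8")
    worker.join()
    return reply


if __name__ == "__main__":
    print(ask("GET greeting"))
    print(ask("something else"))
```

A Web browser and a Web server follow the same pattern, except that the connection crosses the network and the messages follow HTTP.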

TCP/IP

  • Transmission Control Protocol / Internet Protocol
  • Protocol (set of rules) used for formatting, ordering and error checking data sent over a network
  • TCP - divides data into packets
  • IP - handles delivery of packets
  • All computers connected to Internet must "speak" TCP/IP

Additional Internet Protocols

  • http - HyperText Transfer Protocol: rules for transferring web pages
  • ftp - File Transfer Protocol: rules for downloading files like software programs
  • telnet - rules for logging into remote computers connected to the Internet
  • smtp - Simple Mail Transfer Protocol: rules for transferring email messages
  • wap - Wireless Application Protocol: used by wireless devices when they are used online
  • mailto - used on the client-side for sending email (via mail client software)
  • file - used for local file access (on the client computer)

IP Addresses

  • Each Internet host must have unique IP address (like a phone number)
  • Consists of set of 4 numbers, each separated by a period (dot)
  • Numbers range from 0 to 255
  • IP address for the main CTI server is: 140.192.32.136
  • The "domain name" corresponding to this address is "www.cs.depaul.edu"
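
The address rules above (four numbers, separated by dots, each from 0 to 255) are easy to check in code. The following is a minimal sketch; the function names `is_valid_ipv4` and `ipv4_to_int` are made up for this example. Python's standard `ipaddress` module offers a more complete implementation.

```python
def is_valid_ipv4(address: str) -> bool:
    """Return True if address is four dot-separated numbers, each 0 to 255."""
    parts = address.split(".")
    if len(parts) != 4:
        return False
    for part in parts:
        # Each part must consist of plain ASCII digits only.
        if not (part.isascii() and part.isdigit()):
            return False
        if not 0 <= int(part) <= 255:
            return False
    return True


def ipv4_to_int(address: str) -> int:
    """Pack the four numbers into the single 32-bit value that routers use."""
    if not is_valid_ipv4(address):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    value = 0
    for part in address.split("."):
        value = value * 256 + int(part)
    return value


if __name__ == "__main__":
    for candidate in ["140.192.32.136", "256.1.1.1", "1.2.3"]:
        print(candidate, is_valid_ipv4(candidate))
```

Because each of the four numbers fits in 8 bits, an address of this form is a 32-bit quantity, which is why the range of each number stops at 255.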

Domain Name System (DNS)

  • System for translating alphanumeric domain names into numeric IP addresses (and back)
  • Organized as a hierarchy of domains and subdomains
  • Domain names have format:

hostname.subdomain.toplevel-domain

  • Hostname - name given to the host computer (often www but not always)
  • Subdomain - name of a network (or subnetwork) to which the host computer belongs
  • Top Level Domains:
    .edu for education
    .org for non-profit organization
    .gov for government
    .com for commercial
    .net for network
  • New top level domains have been approved (.biz, .name, .info, etc.)
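
As an illustration of the hostname.subdomain.toplevel-domain format, the sketch below splits a domain name into those three parts. The function name `split_domain` is an assumption made for this example, and the rule "first label is the hostname, last label is the top-level domain" is a simplification of how names are actually delegated.

```python
# Top-level domain meanings, copied from the list above.
TOP_LEVEL_DOMAINS = {
    "edu": "education",
    "org": "non-profit organization",
    "gov": "government",
    "com": "commercial",
    "net": "network",
}


def split_domain(name: str) -> dict:
    """Split a name such as www.cs.depaul.edu into its three parts."""
    labels = name.lower().rstrip(".").split(".")
    if len(labels) < 3 or not all(labels):
        raise ValueError(f"expected hostname.subdomain.toplevel-domain: {name!r}")
    return {
        "hostname": labels[0],
        "subdomain": ".".join(labels[1:-1]),
        "toplevel": labels[-1],
    }


if __name__ == "__main__":
    parts = split_domain("www.cs.depaul.edu")
    print(parts)
    print(TOP_LEVEL_DOMAINS.get(parts["toplevel"], "unknown"))
```

This only takes the name apart. Finding the IP address behind a name requires asking a DNS server, which in Python can be done with `socket.gethostbyname` when a network connection is available.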


 

Packet Switching

  • Basic method used for all data transmission (email, web pages, etc.) via TCP/IP
  • Files and even email messages broken into small packets
  • Each packet includes IP address of sender and IP address of destination
  • Packets may travel different paths to destination
  • Packets are reassembled after all arrive at the destination computer

  • Packet switching demo:

http://www.pbs.org/opb/nerds2.0.1/geek_glossary/packet_switching_flash.html
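
The steps above can also be sketched in code. This is a simplified model, not how TCP/IP is actually implemented: the packet layout (a dictionary with `src`, `dst`, `seq`, `total` and `data` fields), the 8-byte packet size and the sender address 10.0.0.1 are assumptions made for the example. The sequence number is what lets the receiver restore the original order.

```python
import random


def packetize(message: bytes, src: str, dst: str, size: int = 8) -> list[dict]:
    """Break a message into small packets, each labeled with both addresses."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [
        {"src": src, "dst": dst, "seq": number, "total": len(chunks), "data": chunk}
        for number, chunk in enumerate(chunks)
    ]


def reassemble(packets: list[dict]) -> bytes:
    """Put packets back in order once all of them have arrived."""
    if not packets:
        return b""
    if len(packets) != packets[0]["total"]:
        raise ValueError("some packets are still missing")
    ordered = sorted(packets, key=lambda packet: packet["seq"])
    return b"".join(packet["data"] for packet in ordered)


if __name__ == "__main__":
    original = b"Files and email messages are broken into small packets."
    packets = packetize(original, src="10.0.0.1", dst="140.192.32.136")
    random.shuffle(packets)  # packets may take different paths and arrive out of order
    assert reassemble(packets) == original
    print(f"{len(packets)} packets reassembled correctly")
```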

URL - Uniform Resource Locator

  • Address in a format that identifies an individual object (web page, image, sound file, etc.) on the Internet

  • Analogy is a phone number

  • URL is unique for each object

  • Must be typed exactly (often case sensitive)

  • Format:   protocol://server name/path/file name

  • protocol = rules for transferring the data (for example, http for a web page)

  • server = fully qualified domain name (or IP address) of the host computer where object is located

  • path = folder or directory in the host computer where object will be found

  • file = file name of the object (web page, image, sound, etc.)
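
To see the four parts in practice, the sketch below takes a URL apart using Python's standard `urllib.parse` module. The function name `url_parts` and the dictionary keys are chosen here to match the format above; the URL in the demonstration is the packet switching demo link given earlier in these notes.

```python
import posixpath
from urllib.parse import urlsplit


def url_parts(url: str) -> dict:
    """Split a URL into protocol, server, path and file name."""
    parts = urlsplit(url)
    # The path component holds both the folder and the file name.
    folder, filename = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "server": parts.hostname,
        "path": folder,
        "file": filename,
    }


if __name__ == "__main__":
    demo = "http://www.pbs.org/opb/nerds2.0.1/geek_glossary/packet_switching_flash.html"
    for key, value in url_parts(demo).items():
        print(f"{key:8} = {value}")
```

The server name is not case sensitive, but the path and file name often are, which is why a URL must be typed exactly.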

Client-Server Architecture

[Diagrams not reproduced here]


Short History of Internet and WWW

When Did It All Start?

  • In 1945, Vannevar Bush wrote an article “As We May Think” describing a machine, Memex, containing human collective knowledge organized with “trails” linking materials of the same topic.

  • The article revolutionized thinking about information technology before modern computers even existed.

  • Memex is a hypothetical machine based on a dream: The information stored ought to be accessible.

  • We haven’t fulfilled the dream yet, but much has been achieved in 50 years.

Hypertext-Hyperlink-Hypermedia

  • Following the Memex idea, Ted Nelson developed the Xanadu project, which aimed at placing the entire world’s literary corpus on-line.

  • Ted Nelson coined the term hypertext in 1965.

  • A document is not contiguous but is a set of connected parts of documents. Hyperlinks are links that connect subdocuments. Hypermedia is a multimedia hypertext document.

ARPAnet

  • At the height of the Cold War, ARPA (Advanced Research Projects Agency) was created (1958).

  • The purpose was to outpace the Soviet Union in the race to master rocket technology.

  • In 1969, it was decided to link sensitive computer centers by a network in order to withstand a possible nuclear attack. The idea was to let the remaining centers keep communicating even after one of them was destroyed.

  • It connected government labs, major research centers and universities.

  • It existed until 1988 and was officially dismantled in 1990.

  • Backbone network speed: 64 Kbps

  • Major achievements: TCP/IP, Domain Name Service, e-mail (SMTP), FTP, Telnet, etc.

NSFnet and the Internet

  • DARPA, the Defense Advanced Research Projects Agency, still exists and the military have their own network but the original ARPAnet was integrated into the current Internet.

  • The National Science Foundation in the USA funded the NSFnet which was created in 1985.

  • Backbone network speed: T1 (1.5 Mbps) to T3 (45 Mbps)

  • It originally connected 5 major universities with supercomputer centers, but rapidly included other universities, research centers and private companies.

  • Replaced ARPAnet as the backbone of the Internet in 1990.

  • Other networks existed in North America and Europe and other places in the world.

  • BitNet, for instance, connected many research centers and universities.

  • Bridges connected these networks to create a larger international network: the Internet.

  • Late 90s: Internet2, funded by US universities, a sequel to NSFnet with new protocols.

Explosive Growth of Internet

The World Wide Web (WWW)

  • In 1990, Tim Berners-Lee developed an on-line hypertext-based system to help researchers at CERN in Switzerland share information across a diverse computer network.

  • He came up with the first versions of HTML (based on SGML) and of HTTP.

  • HTTP and HTML catapulted the Internet to new heights.

  • The WWW revolutionized the use of the Internet thanks to a user-friendly multimedia interface: the web browser.

  • Mosaic was developed at NCSA by students at the University of Illinois in 1993, among them Marc Andreessen, who went on to co-found Netscape in 1994.


HTTP - Hypertext Transfer Protocol

HTTP Basics

  • HTTP is the protocol responsible for transferring web (HTML) pages; the browser then displays them.

  • Uses the client/server model of computing. The client is the user’s web browser (e.g., Internet Explorer or Netscape). The server is the web server where the page resides (for example, www.nyt.com).

  • HTTP protocol involves the following elements:

    • Request ("I want something")

    • Response ("Here it is" or "Not found")

    • Headers

    • Body
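
The four elements above can be seen in the text that a browser and a server exchange. The sketch below builds a minimal HTTP/1.1 request and parses a response into status, headers and body. The function names `build_request` and `parse_response` are assumptions made for this example, the sample response is invented, and a real client (such as Python's `http.client`) handles many cases this sketch ignores.

```python
def build_request(host: str, path: str = "/") -> str:
    """Request: a request line, then headers, then a blank line."""
    lines = [
        f"GET {path} HTTP/1.1",  # "I want something"
        f"Host: {host}",         # header naming the server
        "Connection: close",
        "",                      # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines)


def parse_response(raw: str) -> dict:
    """Response: a status line, headers, a blank line, then the body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    fields = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "status": int(fields[1]),
        "reason": fields[2] if len(fields) > 2 else "",
        "headers": headers,
        "body": body,
    }


if __name__ == "__main__":
    print(build_request("www.nyt.com", "/index.html"))
    # A made-up "Not found" response, for illustration only.
    sample = "HTTP/1.1 404 Not Found\r\nContent-Type: text/html\r\n\r\n<h1>Not found</h1>"
    print(parse_response(sample))
```

A status of 200 corresponds to "Here it is" and 404 to "Not found".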