G22.3033-008 Web Services and Applications
Fall 2003

Class: Tuesdays, 5-7pm, WWH 102
Office hours: by appointment
Robert Grimm
rgrimm@cs.nyu.edu

[ Overview | Schedule | Assignments | Resources ]

Overview

Web Services and Applications explores how to build a web where content is dynamic and different services interact directly with each other. This course is motivated by the computing industry's current interest in web services (witness Microsoft's .NET). However, while exploring the relevant standards (SOAP, WSDL, and UDDI), the course is not limited to them and tries to be a more general exploration of web technologies.

More specifically, the goals for the course are:
* To understand current web technologies, including the underlying protocols and how to build services for a global audience.
* To hatch ideas for future research on web services and applications.
* To develop a methodology for building complex systems.

Course Components

To meet these goals, the course has three components:
* Reading assignments to introduce topics.
* Lectures and in-class discussions to deepen students' understanding.
* Programming assignments to provide students with hands-on experience.

In other words, the course is a combination of a research seminar and a systems building course.

Reading assignments mostly cover research papers. Each reading assignment needs to be completed before the corresponding lecture. Students also need to produce a written summary for each assigned paper. The reading summary should be about a paragraph long and describe in the student's own words (1) the main idea expressed in the paper, (2) the innovations (if any) described in the paper, (3) the student's criticisms (regarding soundness, methodology, elegance, etc.), and (4) possible research directions. Note that paraphrasing a paper's abstract or outline is not sufficient.
Summaries are submitted by emailing them in plain-text format (with hard line breaks after each line) to g22_3033_008_fa03-readings@cs.nyu.edu. Make sure the subject specifies the paper title. The archive of all summaries is here. Summaries are due at 10am on the day of the corresponding class!

Programming assignments are substantial, written in Java, build on each other, and are performed in teams of four (4) students. In the course of the semester, students explore the complete web services infrastructure (notably, the HTTP and SOAP engines) and also build their own applications. Furthermore, groups will perform significant testing and measurement studies, including regular interoperability testing (where one group tests another group's HTTP or SOAP engine).

There are no exams.

Prerequisites

Proficiency in Java, including how to write socket-based and multi-threaded code, is required. Experience with building large software systems and the ability to read and digest research papers are highly helpful.

Grading Policy

50% programming assignments, 30% reading assignments, 20% class participation.

Collaboration Policy

Students are encouraged to discuss class topics and readings with each other. However, each student must write his/her reading summary individually.

Students in different groups may help each other with general programming questions and interoperability testing. However, students in different groups must not exchange code or use code from outside sources (such as the Internet). The java.net.URL and java.net.URLConnection classes in the Java SDK are off-limits.

Schedule

Introduction 9/2/03
Slides: PowerPoint, PDF.
Required readings:
* Ed Ort. Web Services and the Sun ONE Developer Platform. Sun Microsystems, November 2002.
Further readings:
* Ethan Cerami. Web Services Essentials. Chapter 1, Introduction.
* Microsoft Corporation. Overview of the .NET Framework. .NET Framework Developer's Guide, Microsoft Corporation.

HTTP 9/9/03
Slides: PowerPoint, PDF.
Required readings:
* Simon E. Spero. Analysis of HTTP Performance Problems. W3C, July 1994.
* Balachander Krishnamurthy, Jeffrey C. Mogul, and David M. Kristol. Key Differences between HTTP/1.0 and HTTP/1.1. Proceedings of the WWW-8 Conference, Toronto, Canada, May 1999.
* Henrik Frystyk Nielsen, James Gettys, Anselm Baird-Smith, Eric Prud'hommeaux, Håkon Wium Lie, and Chris Lilley. Network Performance Effects of HTTP/1.1, CSS1, and PNG. Proceedings of the ACM SIGCOMM '97 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pages 155-166, Cannes, France, 1997.
Further readings:
* James Marshall. HTTP Made Really Easy. August 1997.
* E. James Whitehead Jr. and Yaron Y. Goland. WebDAV: A Network Protocol for Remote Collaborative Authoring on the Web. Proceedings of the European Computer Supported Cooperative Work Conference, 1999.
* James Whitehead. Lessons from WebDAV for the Next Generation Web Infrastructure. WWW-7 HTTP-Future Workshop, April 1998.

Building Fast Servers 9/16/03
Slides: PowerPoint, PDF.
Required readings:
* Vivek S. Pai, Peter Druschel, and Willy Zwaenepoel. Flash: An Efficient and Portable Web Server. Proceedings of the 1999 USENIX Annual Technical Conference, pages 199-212, Monterey, California, June 1999.
* Atul Adya, Jon Howell, Marvin Theimer, William J. Bolosky, and John R. Douceur. Cooperative Task Management without Manual Stack Management. Proceedings of the 2002 USENIX Annual Technical Conference, Monterey, California, June 2002.
Further readings:
* Gaurav Banga and Jeffrey C. Mogul. Scalable Kernel Performance for Internet Servers under Realistic Loads. Proceedings of the 1998 USENIX Annual Technical Conference, New Orleans, Louisiana, June 1998.
* Matt Welsh, David Culler, and Eric Brewer. SEDA: An Architecture for Well-Conditioned, Scalable Internet Services. Proceedings of the 18th ACM Symposium on Operating Systems Principles, pages 230-243, Banff, Canada, December 2001.

Clusters 9/23/03
Slides: PowerPoint, PDF.
Required readings:
* Eric Brewer. Lessons from Giant-Scale Services. IEEE Internet Computing, 5(4):46-55, July/August 2001.
* Yasushi Saito, Brian N. Bershad, and Henry M. Levy. Manageability, Availability, and Performance in Porcupine: A Highly Scalable, Cluster-Based Mail Service. ACM Transactions on Computer Systems, 18(3):298-332, August 2000.
Further readings:
* Thomas E. Anderson, David E. Culler, David A. Patterson, and the NOW Team. A Case for NOW (Networks of Workstations). IEEE Micro, 15(1):54-64, February 1995.
* Armando Fox, Steven D. Gribble, Yatin Chawathe, Eric A. Brewer, and Paul Gauthier. Cluster-Based Scalable Network Services. Proceedings of the 16th ACM Symposium on Operating Systems Principles, pages 78-91, Saint-Malo, France, October 1997.
* Vivek S. Pai, Mohit Aron, Gaurav Banga, Michael Svendsen, Peter Druschel, Willy Zwaenepoel, and Erich Nahum. Locality-Aware Request Distribution in Cluster-Based Network Servers. Proceedings of the 8th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 205-216, San Jose, California, 1998.

Caching 9/30/03
Slides: PowerPoint, PDF.
Required readings:
* Lee Breslau, Pei Cao, Li Fan, Graham Phillips, and Scott Shenker. Web Caching and Zipf-Like Distributions: Evidence and Implications. Proceedings of IEEE Infocom '99.
* Alec Wolman, Geoffrey M. Voelker, Nitin Sharma, Neal Cardwell, Anna Karlin, and Henry M. Levy. On the Scale and Performance of Cooperative Web Proxy Caching. Proceedings of the 17th ACM Symposium on Operating Systems Principles, pages 16-31, Kiawah Island Resort, South Carolina, December 1999.
Further readings:
* Anawat Chankhunthod, Peter Danzig, and Chuck Neerdaels. A Hierarchical Internet Object Cache. Proceedings of the USENIX 1996 Annual Technical Conference, pages 153-163, San Diego, California, January 1996.
* Renu Tewari, Michael Dahlin, Harrick M. Vin, and Jonathan S. Kay.
Design Considerations for Distributed Caching on the Internet. Proceedings of the IEEE International Conference on Distributed Computing Systems, pages 273-284, Austin, Texas, June 1999.
* Li Fan, Pei Cao, Jussara Almeida, and Andrei Z. Broder. Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol. IEEE/ACM Transactions on Networking, 8(3):281-293, June 2000.

Content: XML 10/7/03
Slides: PowerPoint, PDF.
Required readings:
* Dare Obasanjo. An Exploration of XML in Database Management Systems. 2001.
* Jérôme Siméon and Philip Wadler. The Essence of XML. Proceedings of the 30th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 1-13, 2003.
Further readings:
* Norman Walsh. A Technical Introduction to XML. xml.com, October 1998.
* Tim Bray. XML Namespaces by Example. xml.com, January 1999.
* John Boyer. Canonical XML. W3C Recommendation, March 2001.
* John Cowan and Richard Tobin. XML Information Set. W3C Recommendation, October 2001.
* Eric van der Vlist. Using W3C XML Schema. xml.com, October 2001.
* H. Kennedy. Binary Lexical Octet Ad-hoc Transport. Internet Engineering Task Force, RFC 3252, April 2002.
* Hartmut Liefke and Dan Suciu. XMill: An Efficient Compressor for XML Data. Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, pages 153-164, Dallas, Texas, 2000.

Content: Multimedia 10/14/03
Slides: PowerPoint, PDF.
Stefan Saroiu's OSDI talk slides: PowerPoint, PDF.
Required readings:
* Stefan Saroiu, Krishna P. Gummadi, Richard J. Dunn, Steven D. Gribble, and Henry M. Levy. An Analysis of Internet Content Delivery Systems. Proceedings of the 5th Symposium on Operating Systems Design and Implementation, pages 315-327, Boston, Massachusetts, December 2002.
* Steven McCanne, Van Jacobson, and Martin Vetterli. Receiver-Driven Layered Multicast.
Proceedings of the ACM SIGCOMM '96 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pages 117-130, Palo Alto, California, 1996.

Small Devices 10/21/03
Slides: PowerPoint, PDF.
Required readings:
* Armando Fox, Ian Goldberg, Steven D. Gribble, David C. Lee, Anthony Polito, and Eric A. Brewer. Experience with Top Gun Wingman: A Proxy-Based Graphical Web Browser for the 3Com PalmPilot. Proceedings of the IFIP International Conference on Distributed Systems Platforms and Open Distributed Processing (Middleware '98), Lake District, England, September 1998.
* Samuel Madden, Michael J. Franklin, Joseph M. Hellerstein, and Wei Hong. TAG: A Tiny Aggregation Service for Ad-Hoc Sensor Networks. Proceedings of the 5th Symposium on Operating Systems Design and Implementation, pages 131-146, Boston, Massachusetts, December 2002.
Further readings:
* Jason Hill, Robert Szewczyk, Alec Woo, Seth Hollar, David Culler, and Kristofer Pister. System Architecture Directions for Networked Sensors. Proceedings of the 9th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 93-104, Cambridge, Massachusetts, November 2000.

RPC 10/28/03
Slides: PowerPoint, PDF.
Required readings:
* Andrew D. Birrell and Bruce Jay Nelson. Implementing Remote Procedure Calls. ACM Transactions on Computer Systems, 2(1):39-59, February 1984.
* Dave Winer. XML-RPC Specification. UserLand Software, June 1999.
* Dave Winer and Jake Savin. A Busy Developer's Guide to SOAP 1.1. UserLand Software, April 2001.
Further readings:
* Ethan Cerami. Web Services Essentials. Part II, XML-RPC.
* Ethan Cerami. Web Services Essentials. Part III, SOAP.
* Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy. Lightweight Remote Procedure Call. ACM Transactions on Computer Systems, 8(1):37-55, February 1990.
* Andrew Birrell, Greg Nelson, Susan Owicki, and Edward Wobber. Network Objects.
SRC Research Report 115, December 1995.
* Leendert van Doorn, Martin Abadi, Mike Burrows, and Edward Wobber. Secure Network Objects. Proceedings of the 1996 IEEE Symposium on Security and Privacy, pages 211-221, Oakland, California, May 1996.
* Sun Microsystems. Java Remote Method Invocation Specification (revision 1.8). Sun Microsystems, 2002.

Security 11/4/03
Slides: PowerPoint, PDF.
Required readings:
* Alex C. Snoeren, Craig Partridge, Luis A. Sanchez, Christine E. Jones, Fabrice Tchakountio, Beverly Schwartz, Stephen T. Kent, and W. Timothy Strayer. Single-Packet IP Traceback. IEEE/ACM Transactions on Networking, 10(6), December 2002.
* Stuart Staniford, Vern Paxson, and Nicholas Weaver. How to 0wn the Internet in Your Spare Time. Proceedings of the 11th USENIX Security Symposium, San Francisco, California, August 2002.
* David Moore, Vern Paxson, Stefan Savage, Colleen Shannon, Stuart Staniford, and Nicholas Weaver. The Spread of the Sapphire/Slammer Worm. Technical report, January 2003.
Further readings:
* Eugene H. Spafford. The Internet Worm Program: An Analysis. Purdue Technical Report CSD-TR-823, Purdue University, December 1988.
* David Moore, Geoffrey M. Voelker, and Stefan Savage. Inferring Internet Denial-of-Service Activity. Proceedings of the 10th USENIX Security Symposium, Washington, DC, August 2001.
* Stefan Savage, David Wetherall, Anna Karlin, and Tom Anderson. Network Support for IP Traceback. IEEE/ACM Transactions on Networking, 9(3):226-237, June 2001.

Descriptions 11/11/03
Slides: PowerPoint, PDF.
Required readings:
* Ethan Cerami. WSDL Essentials. Chapter 6, Web Services Essentials, O'Reilly, February 2002.
* Tim Bray. What is RDF? xml.com, January 2001.
* Deborah L. McGuinness and Frank van Harmelen. OWL Web Ontology Language Overview. W3C Candidate Recommendation, August 2003.
Further readings:
* Renato Iannella. An Idiot's Guide to the Resource Description Framework. January 1999.
* Frank Manola and Eric Miller. RDF Primer.
W3C Working Draft, November 2002.
* Anupriya Ankolekar, Mark Burstein, Jerry R. Hobbs, Ora Lassila, David Martin, Drew McDermott, Sheila A. McIlraith, Srini Narayanan, Massimo Paolucci, Terry Payne, and Katia Sycara. DAML-S: Web Service Description for the Semantic Web. Proceedings of the First International Semantic Web Conference, pages 348-363, Sardinia, Italy, June 2002.
* Roxane Ouellet and Uche Ogbuji. Introduction to DAML: Part I. xml.com, January 2002.
* Roxane Ouellet and Uche Ogbuji. Introduction to DAML: Part II. xml.com, March 2002.
* Uche Ogbuji and Roxane Ouellet. DAML Reference. xml.com, May 2002.
* Michael K. Smith, Chris Welty, and Deborah McGuinness. OWL Web Ontology Language Guide. W3C Candidate Recommendation, August 2003.

Discovery 11/18/03
Slides: PowerPoint, PDF.
Required readings:
* uddi.org. UDDI Technical White Paper. September 2000.
* William Adjie-Winoto, Elliot Schwartz, Hari Balakrishnan, and Jeremy Lilley. The Design and Implementation of an Intentional Naming System. Proceedings of the 17th ACM Symposium on Operating Systems Principles, pages 186-201, Kiawah Island Resort, South Carolina, December 1999.
* Steven E. Czerwinski, Ben Y. Zhao, Todd D. Hodes, Anthony D. Joseph, and Randy H. Katz. An Architecture for a Secure Service Discovery Service. Proceedings of the 5th Annual ACM/IEEE International Conference on Mobile Computing and Networking, pages 24-35, Seattle, Washington, August 1999.
Further readings:
* Ethan Cerami. Web Services Essentials. Chapter 7, UDDI Essentials.

Active Everything 11/29/03
Slides: PowerPoint, PDF.
Required readings:
* Pei Cao, Jin Zhang, and Kevin Beach. Active Cache: Caching Dynamic Contents on the Web. Proceedings of the IFIP International Conference on Distributed Systems Platforms and Open Distributed Processing (Middleware '98), pages 373-388, Lake District, England, September 1998.
* Amin Vahdat, Michael Dahlin, Thomas Anderson, and Amit Aggarwal.
Active Names: Flexible Location and Transport of Wide-Area Resources. Proceedings of the 2nd USENIX Symposium on Internet Technologies and Systems, pages 151-164, Boulder, Colorado, October 1999.
* David Wetherall. Active Network Vision and Reality: Lessons from a Capsule-Based System. Proceedings of the 17th ACM Symposium on Operating Systems Principles, pages 64-79, Kiawah Island Resort, South Carolina, December 1999.

Pulling Back: REST vs. SOAP 12/2/03
Required readings:
* Tim Berners-Lee, James Hendler, and Ora Lassila. The Semantic Web. Scientific American, 284(5):34-43, May 2001.
* Paul Prescod. Roots of the REST/SOAP Debate. Proceedings of Extreme Markup Languages, Montréal, Canada, August 2002.

Assignments

Remember that all programming assignments are written in Java (without using java.net.URL and java.net.URLConnection) by groups of four students. For each assignment, each group needs to:
* Design and implement a server and a client. The server should be designed for a specific set of goals (such as performance, scalability, low resource consumption, resilience to denial of service attacks) and the client should exercise the server accordingly.
* Perform interoperability testing. Each group needs to test its client with another group's server and test its server with another group's client.
* Track its efforts. Each group needs to track the time in hours spent on (1) preparation, (2) design, (3) implementation, (4) basic testing and debugging, (5) interoperability testing, and (6) documentation and write-up. Each group also needs to track the lines of code (using JavaNCSS) and the number of bugs (including when they were introduced/fixed).
* Document its efforts. Each group needs to produce an approximately five-page extended abstract, which provides an overview of the design (including goals) and convinces me that the implemented client and server meet the goals.
The report also needs to cover the results of interoperability testing and the project's statistics (time spent, resulting lines of code, number of bugs). Finally, it should share any interesting anecdotes.
* Hand in its report and code. Each group needs to hand in an electronic copy of its report (PDF is preferred) and a ZIP file with all code. Note that each group's code is made available for research on cheating detection.

Assignments:
* #1: The HTTP Client. Due 9/23/03 before class!
* #2: Better Support for HTTP/1.1. Due 10/7/03 before class!
* Intermezzo: Fixing Bugs in Munin. Due 10/14/03 before class!
* #3: XML to S-Expressions. Due 10/28/03 before class!
  + The test document (use "Save Target As..." or similar to download).
* #4: The SOAP Calculator. Due 11/11/03 before class!
* #5: Your own application. Due 12/2/03 before class!

Resources

Books

Note that neither book is required; both are optional.
* Ethan Cerami. Web Services Essentials. O'Reilly, February 2002.
* Elliotte Rusty Harold and W. Scott Means. XML in a Nutshell, 2nd edition. O'Reilly, June 2002.

Mailing Lists
* g22_3033_008_fa03
* g22_3033_008_fa03-readings

Authorized Code
* Munin (v. 1.1.0), the NYU CS event-driven web server. Build and configuration instructions.
* httperf, a tool for measuring web server performance.
* Xerces, an XML parser maintained by the Apache XML project.
* Crimson, another XML parser maintained by the Apache XML project, part of the Java 2 Platform SDK 1.4.
* The TcpTunnelGui tool from the Apache SOAP project.

Axis Webcams
* Developer information.
* Camera 1: http://orwell1.cs.nyu.edu/ - facing one elevator lobby at 715 Broadway, 7th floor.
* Camera 2: http://orwell2.cs.nyu.edu/ - facing the other elevator lobby at 715 Broadway, 7th floor.
* Camera 3: http://66.93.85.13/ - facing uptown from Silver Towers.

Standards
* HTTP/1.0
* HTTP/1.1
* XML
* Namespaces in XML
* XML Schema Part 1: Structures
* XML Schema Part 2: Datatypes
* DOM
* XML-RPC
* SOAP
* WSDL
* UDDI
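
Illustration: HTTP over a plain socket

The assignments rule out java.net.URL and java.net.URLConnection, so a client has to speak HTTP directly over a socket. The following is only a minimal sketch of what that might look like for a single HTTP/1.0 GET; it is not the assignment's reference solution. The class name HttpSketch, its method names, and the host example.org are hypothetical, and the code uses language features (try-with-resources, StandardCharsets) that are newer than the Java SDK available when the course ran. A real client would also need to handle headers, message bodies, and the HTTP/1.1 features covered in the readings.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only.
class HttpSketch {

    // Builds a minimal HTTP/1.0 GET request. The Host header is not
    // required by HTTP/1.0, but it is included because many servers expect it.
    public static String buildGetRequest(String host, String path) {
        return "GET " + path + " HTTP/1.0\r\n"
             + "Host: " + host + "\r\n"
             + "\r\n";
    }

    // Extracts the numeric status code from a status line such as
    // "HTTP/1.0 200 OK". Throws IllegalArgumentException if the line is malformed.
    public static int parseStatusCode(String statusLine) {
        String[] parts = statusLine.trim().split("\\s+", 3);
        if (parts.length < 2 || !parts[0].startsWith("HTTP/")) {
            throw new IllegalArgumentException("Malformed status line: " + statusLine);
        }
        return Integer.parseInt(parts[1]);
    }

    // Opens a TCP connection, sends the request, and returns the status code
    // of the response. Headers and body are ignored in this sketch.
    public static int fetchStatus(String host, int port, String path) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildGetRequest(host, path).getBytes(StandardCharsets.US_ASCII));
            out.flush();

            BufferedReader in = new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.ISO_8859_1));
            String statusLine = in.readLine();
            if (statusLine == null) {
                throw new IOException("Connection closed before a status line arrived");
            }
            return parseStatusCode(statusLine);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length >= 1) {
            // Contacts the given host on port 80; requires network access.
            System.out.println(fetchStatus(args[0], 80, "/"));
        } else {
            // Without arguments, only show the request that would be sent.
            System.out.print(buildGetRequest("example.org", "/"));
        }
    }
}
```

Keeping request construction and status-line parsing in separate methods from the socket code means both can be checked without a network connection, which may help during basic testing before interoperability testing with another group's server.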