A Brief Introduction to the Web Services Protocol Stack: SOAP, WSDL and UDDI

Today, SOAP (Simple Object Access Protocol), WSDL (Web Services Description Language), and UDDI are emerging as the Internet de facto standards for Web services. SOAP has been accepted and is being standardized by the World Wide Web Consortium (W3C). WSDL has been submitted to the W3C for standardization, and is emerging as the de facto standard language for the description of Web services. UDDI is poised to be the de facto standard for the Web service repository and will be submitted to standards bodies in 2002.

This post presents a brief technical overview of SOAP, WSDL, and UDDI. It provides pointers to the specification for each protocol and to sources of additional information.

SOAP is a lightweight protocol initially proposed by Microsoft, IBM, and others, including Don Box and Dave Winer, who also contributed to the SOAP 1.1 specification. There also have been a number of W3C submissions related to SOAP, including the “SOAP with Attachments” note created by John Barton.

SOAP supports XML document exchange and provides a convention for Remote Procedure Call (RPC) using XML messages. SOAP specifies a wire protocol for facilitating highly distributed applications. SOAP is similar to DCOM and CORBA in that it provides an RPC mechanism for invoking methods remotely. SOAP differs in that it is a protocol based on open XML standards and XML document exchange rather than an object model relying on proprietary binary formats. Both DCOM and CORBA use binary formats for their payload (NDR and CDR, respectively). The SOAP gateway performs a similar function to DCOM and CORBA stubs: translating messages between the SOAP protocol and the language of choice. As a result, SOAP offers vendor, platform, and language independence. With SOAP, developers can easily bridge applications written with COM, CORBA, or Enterprise JavaBeans.

From the specification, SOAP is composed of three parts:

  • A framework describing how SOAP messages should be constructed
  • A set of encoding rules for exchanging data types
  • A convention for representing remote procedure calls

The encoding rules define how values of various data types are serialized in SOAP messages. The SOAP 1.1 specification bases its data encoding on XML Schema Structures and XML Schema Data Types, but also allows arbitrary data encoding like RDF. Supported types include simple types, like strings and enumerations, and complex types such as structures and arrays. The specification also describes a convention for performing RPC interactions using XML. SOAP messages can be sent over any transport protocol, including HTTP(S), SMTP, and FTP.
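To make the idea of encoding rules concrete, here is a minimal Python sketch that serializes simple values, a structure, and an array into XML with xsi:type annotations. It is only loosely modeled on the SOAP 1.1 encoding and is not a complete implementation of it; the element names (quote, symbol, price, history, item) and the Python-to-XSD type mapping are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
XSD = "http://www.w3.org/2001/XMLSchema"

# Assumed mapping from Python types to XML Schema simple types.
XSD_TYPES = {bool: "xsd:boolean", int: "xsd:int", float: "xsd:double", str: "xsd:string"}


def encode(name, value):
    """Encode a Python value as an XML element (simplified, illustrative rules)."""
    elem = ET.Element(name)
    if isinstance(value, dict):
        # Structure: one child element per named member.
        for key, member in value.items():
            elem.append(encode(key, member))
    elif isinstance(value, (list, tuple)):
        # Array: repeated child elements, here all named "item".
        for member in value:
            elem.append(encode("item", member))
    else:
        # Simple type: annotate with xsi:type and store the lexical value.
        elem.set(f"{{{XSI}}}type", XSD_TYPES[type(value)])
        elem.text = str(value).lower() if isinstance(value, bool) else str(value)
    return elem


quote = encode("quote", {"symbol": "DIS", "price": 34.5, "history": [33.9, 34.1]})
# ElementTree does not declare prefixes that appear only inside attribute
# values, so the xsd prefix is declared by hand on the root element.
quote.set("xmlns:xsd", XSD)
xml_text = ET.tostring(quote, encoding="unicode")
print(xml_text)
```

A real toolkit would also handle details such as array type attributes, null values, and multi-reference values, all of which this sketch leaves out.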

A SOAP message contains three primary pieces: an envelope; a header for adding application-specific features to a SOAP message, including authentication, transaction management, and payment; and a body that contains information intended for the recipient. An application receiving a SOAP message must identify all parts intended for it, verify that those parts are complete, and process them. Because a SOAP message can travel through a number of intermediaries, the SOAP actor attribute is used to indicate which recipient a header entry is intended for. The specification also defines a mustUnderstand attribute that indicates whether a specific header entry has to be understood and processed by the recipient.
The following illustrates how a simple request/response message could be written with SOAP:

Request

POST /StockQuote HTTP/1.1
Host: www.stockquoteserver.com
Content-Type: text/xml
Content-Length: nnnn
SOAPAction: "Some-URI"

<SOAP:Envelope
    xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP:Header>
    <t:Transaction xmlns:t="URI"
        SOAP:mustUnderstand="1">5</t:Transaction>
  </SOAP:Header>
  <SOAP:Body>
    <m:GetLastTradePrice
        xmlns:m="URI">
      <symbol>DIS</symbol>
    </m:GetLastTradePrice>
  </SOAP:Body>
</SOAP:Envelope>

Response

HTTP/1.1 200 OK
Content-Type: text/xml
Content-Length: nnnn

<SOAP:Envelope
    xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <SOAP:Header>
    <t:Transaction xmlns:t="URI"
        xsi:type="xsd:int"
        SOAP:mustUnderstand="1">5</t:Transaction>
  </SOAP:Header>
  <SOAP:Body>
    <m:GetLastTradePriceResponse
        xmlns:m="URI">
      <return>34.5</return>
    </m:GetLastTradePriceResponse>
  </SOAP:Body>
</SOAP:Envelope>
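
The exchange above can also be sketched programmatically. The following Python example uses only the standard library to build a GetLastTradePrice request, read the price out of a response, and list header entries flagged mustUnderstand that the receiver does not recognize. It assumes the SOAP 1.1 envelope namespace; the namespaces urn:example:quotes and urn:example:tx and all function names are hypothetical stand-ins, not part of any specification.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ET.register_namespace("SOAP", SOAP_NS)


def build_request(symbol, method_ns="urn:example:quotes"):
    """Build a request envelope; method_ns is a hypothetical service namespace."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{method_ns}}}GetLastTradePrice")
    ET.SubElement(call, "symbol").text = symbol
    return ET.tostring(envelope, encoding="unicode")


def parse_price(response_xml):
    """Extract the <return> value from the response body as a float."""
    envelope = ET.fromstring(response_xml)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    return float(body.find(".//return").text)


def not_understood(envelope_xml, understood_tags):
    """List mustUnderstand="1" header entries whose tag is not in understood_tags."""
    envelope = ET.fromstring(envelope_xml)
    header = envelope.find(f"{{{SOAP_NS}}}Header")
    if header is None:
        return []
    return [
        entry.tag
        for entry in header
        if entry.get(f"{{{SOAP_NS}}}mustUnderstand") == "1"
        and entry.tag not in understood_tags
    ]


# A response shaped like the one above, with placeholder namespaces filled in.
SAMPLE_RESPONSE = """<SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP:Header>
    <t:Transaction xmlns:t="urn:example:tx" SOAP:mustUnderstand="1">5</t:Transaction>
  </SOAP:Header>
  <SOAP:Body>
    <m:GetLastTradePriceResponse xmlns:m="urn:example:quotes">
      <return>34.5</return>
    </m:GetLastTradePriceResponse>
  </SOAP:Body>
</SOAP:Envelope>"""

print(build_request("DIS"))
print(parse_price(SAMPLE_RESPONSE))
print(not_understood(SAMPLE_RESPONSE, understood_tags=set()))
```

A receiver that finds a non-empty list from not_understood would be expected to reject the message with a fault rather than process it; that step is omitted here.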

Because SOAP messages can be carried over HTTP, they can easily pass through firewalls. Unlike other distributed object models that rely on dynamically assigned ports, SOAP can use HTTP’s standard port for transmitting data. The SOAP specification does not directly address security but relies on either the underlying protocol or conventions for securing data within the business payload. Using the HTTPS protocol, SOAP message exchanges can be kept private. A proposal for extending SOAP that uses W3C’s XML Digital Signature standard has been made by IBM and Microsoft (http://www.w3.org/TR/SOAP-dsig/) but has not yet been endorsed by W3C. SOAP messages can also be filtered easily at the firewall, because the header information can be required to carry meta-information about the interaction being made. This saves a firewall application from having to read the entire XML body to manage the request.
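
As a rough illustration of header-based filtering, the Python sketch below decides whether to admit a request by looking only at its HTTP headers, using the SOAPAction header as the meta-information. The allow-list and the action URIs are hypothetical, and a real firewall product would apply its own policy format.

```python
# Hypothetical allow-list of SOAPAction values the firewall lets through.
ALLOWED_ACTIONS = {"urn:example:quotes#GetLastTradePrice"}


def allow_request(headers):
    """Decide from HTTP headers alone, without parsing the XML body."""
    normalized = {name.lower(): value for name, value in headers.items()}
    # Only XML payloads are treated as SOAP in this sketch.
    if not normalized.get("content-type", "").startswith("text/xml"):
        return False
    action = normalized.get("soapaction")
    if action is None:
        return False
    # The header value is normally a quoted URI, so strip the quotes.
    return action.strip().strip('"') in ALLOWED_ACTIONS


print(allow_request({
    "Content-Type": "text/xml",
    "SOAPAction": '"urn:example:quotes#GetLastTradePrice"',
}))
```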

SOAP was never intended to provide a complete distributed object architecture. SOAP does not mandate a specific object model, so the specification does not address such issues as distributed garbage collection, message batching, objects-by-reference, object activation, or type safety. Although it is possible to handle request and response messages, the SOAP specification does not directly handle asynchronous communication. It is, however, possible to implement asynchronous communication using the SOAP protocol; for example, Microsoft’s BizTalk messaging profile uses SOAP as its wire format and implements asynchronous message exchange.

SOAP will need to be extended with standardized mustUnderstand header fields to encode attributes like “message sender,” “message recipient,” and “message correlation ID” in order to claim native support for asynchronous messaging. BizTalk and ebXML are examples of initiatives that have extended SOAP to handle asynchronous messaging. Although SOAP is protocol-neutral, the specification currently describes a transport binding only for HTTP.
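
To show what such header fields might look like, the following Python sketch adds sender, recipient, and correlation ID entries to a SOAP header and uses the correlation ID to match a later reply to a pending request. The namespace urn:example:messaging and the element names are invented for illustration; they are not taken from BizTalk, ebXML, or any standard.

```python
import uuid
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
MSG_NS = "urn:example:messaging"  # hypothetical namespace for the routing headers


def build_message(sender, recipient, payload, correlation_id=None):
    """Wrap a payload in an envelope carrying routing and correlation headers."""
    correlation_id = correlation_id or str(uuid.uuid4())
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    fields = (("Sender", sender), ("Recipient", recipient), ("CorrelationId", correlation_id))
    for name, value in fields:
        entry = ET.SubElement(header, f"{{{MSG_NS}}}{name}")
        entry.set(f"{{{SOAP_NS}}}mustUnderstand", "1")
        entry.text = value
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    ET.SubElement(body, "payload").text = payload
    return correlation_id, ET.tostring(envelope, encoding="unicode")


def correlation_of(envelope_xml):
    """Read the correlation ID back out of an envelope."""
    envelope = ET.fromstring(envelope_xml)
    return envelope.findtext(f"{{{SOAP_NS}}}Header/{{{MSG_NS}}}CorrelationId")


# The requester remembers the ID, and the reply arrives later on its own.
pending = {}
cid, request_xml = build_message("buyer", "seller", "order-123")
pending[cid] = "awaiting reply"
_, reply_xml = build_message("seller", "buyer", "order-123 accepted", correlation_id=cid)
matched = pending.pop(correlation_of(reply_xml))
print(matched)
```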

SOAP is clearly becoming one of the de facto standards for Web services. Because of its platform and language independence, and its ease of integration, many companies are embracing SOAP as a backbone for their Web services strategy. Since the initial release of the SOAP 1.1 specification in May 2000, more than 30 SOAP toolkits and development environments have been developed to support a wide variety of development platforms (J2EE, .NET, CORBA) and languages (Java, C#, C++, Visual Basic, Python). In addition, the W3C has formed a working group to draft the next generation of the SOAP protocol. A working draft of the SOAP 1.2 specification was released in July 2001.

WSDL is a language for describing the capabilities of Web services. Proposed by IBM and Microsoft, WSDL combines the best of IBM’s NASSL (Network Accessible Services Language) and Microsoft’s SOAP Contract Language. WSDL is based on XML and is a key part of the UDDI initiative. The WSDL document specification helps improve interoperability between applications, regardless of the protocol or the encoding scheme. The WSDL 1.1 specification defines WSDL as “an XML grammar for describing network services as collections of communication endpoints capable of exchanging messages.”

Essentially, a WSDL document describes how to invoke a service and provides information on the data being exchanged, the sequence of messages for an operation, protocol bindings, and the location of the service. A WSDL document defines services as a collection of endpoints, but separates the abstract definition from the concrete implementation. Messages and port types provide abstract definitions for the data being exchanged and the operations being performed by a service. A binding is provided to map to a concrete set of ports, usually consisting of a URL location and a SOAP binding. Figure 2 illustrates the various components in a WSDL document.
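
The split between abstract and concrete parts is easier to see in a small example. The Python sketch below embeds a deliberately simplified WSDL 1.1 document for a stock quote service and pulls out the abstract operations (from the port type) and the concrete endpoint (from the service port). The service, message, and binding names and the target namespace are assumptions; a complete binding would also describe the input and output bodies.

```python
import xml.etree.ElementTree as ET

# Simplified, illustrative WSDL 1.1 document; names are hypothetical.
WSDL_TEXT = """<definitions name="StockQuote"
    targetNamespace="urn:example:stockquote"
    xmlns:tns="urn:example:stockquote"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns="http://schemas.xmlsoap.org/wsdl/">
  <message name="GetLastTradePriceInput">
    <part name="symbol" type="xsd:string"/>
  </message>
  <message name="GetLastTradePriceOutput">
    <part name="return" type="xsd:float"/>
  </message>
  <portType name="StockQuotePortType">
    <operation name="GetLastTradePrice">
      <input message="tns:GetLastTradePriceInput"/>
      <output message="tns:GetLastTradePriceOutput"/>
    </operation>
  </portType>
  <binding name="StockQuoteSoapBinding" type="tns:StockQuotePortType">
    <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
    <operation name="GetLastTradePrice">
      <soap:operation soapAction="Some-URI"/>
    </operation>
  </binding>
  <service name="StockQuoteService">
    <port name="StockQuotePort" binding="tns:StockQuoteSoapBinding">
      <soap:address location="http://www.stockquoteserver.com/StockQuote"/>
    </port>
  </service>
</definitions>"""

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}


def describe(wsdl_text):
    """Return (abstract operation names, concrete endpoint URLs)."""
    root = ET.fromstring(wsdl_text)
    operations = [op.get("name") for op in root.findall("wsdl:portType/wsdl:operation", NS)]
    endpoints = [
        address.get("location")
        for address in root.findall("wsdl:service/wsdl:port/soap:address", NS)
    ]
    return operations, endpoints


operations, endpoints = describe(WSDL_TEXT)
print(operations, endpoints)
```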

The WSDL 1.1 specification has been submitted to W3C as a note. The W3C Web site (http://www.w3.org/TR/wsdl) has details. At a recent W3C Web services workshop, the Web services community recommended to W3C that an XML Service Description working group be formed to standardize a language for describing Web service interfaces and behavior. It is widely expected that WSDL 1.1 will be used as the starting point for this working group.

Ariba, IBM, and Microsoft developed the first version of UDDI, the Universal Description, Discovery and Integration specification. As the name suggests, UDDI allows a business to describe the services it offers and to discover and interact with other services on the Web. UDDI is also a cross-industry open specification that is built on top of existing standards like TCP/IP, XML, HTTP, DNS, and SOAP. At the heart of UDDI is the UDDI Business Registry, an implementation of the UDDI specification. With the registry, a business can easily publish services it offers and discover what services other businesses offer. The registry is created as a group of multiple operator sites. Although each operator site is managed separately, information contained within each registry is synchronized across all nodes.

As the following figure illustrates, there are four key data structures described in the UDDI specification.

[Figure: the four key UDDI data structures]

Business entities describe information about a business, including its name, description, services offered, and contact information. Business services provide more detail on each service being offered. Each service can have multiple binding templates, each describing a technical entry point for a service; for example, mailto, http, ftp, fax, and phone. Finally, tModels describe what particular specifications or standards a service uses. With this information, a business can locate other services that are compatible with its own system. UDDI also provides identifiers and categories to mark each entity using various classification systems; for example, D&B, NAICS, and SIC.
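
The relationship between the four structures can be modeled in a few lines. The Python sketch below uses dataclasses to mirror the containment described above and shows how a tModel key could be used to find compatible entry points. The field names, keys, and sample data are simplified assumptions and do not reproduce the actual UDDI schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TModel:
    """A reference to a specification or standard that a service uses."""
    key: str
    name: str


@dataclass
class BindingTemplate:
    """A technical entry point, tagged with the tModels it conforms to."""
    access_point: str
    tmodel_keys: List[str] = field(default_factory=list)


@dataclass
class BusinessService:
    name: str
    bindings: List[BindingTemplate] = field(default_factory=list)


@dataclass
class BusinessEntity:
    name: str
    description: str
    services: List[BusinessService] = field(default_factory=list)


def compatible_access_points(entity, tmodel_key):
    """Find entry points of a business that implement a given tModel."""
    return [
        binding.access_point
        for service in entity.services
        for binding in service.bindings
        if tmodel_key in binding.tmodel_keys
    ]


# Hypothetical registry data for illustration only.
quote_spec = TModel(key="tmodel-quotes-v1", name="Stock quote interface")
entity = BusinessEntity(
    name="Example Brokerage",
    description="Sample business",
    services=[
        BusinessService(
            name="Quotes",
            bindings=[
                BindingTemplate("http://www.stockquoteserver.com/StockQuote", [quote_spec.key]),
                BindingTemplate("mailto:quotes@example.com", []),
            ],
        )
    ],
)
print(compatible_access_points(entity, quote_spec.key))
```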

The specification also defines a Programmer’s API containing 30 messages for interacting with UDDI registries. Inquiry APIs are provided to locate businesses, services, bindings, or tModels. Publishing APIs are included for creating and deleting UDDI data in the registry. To invoke any of the Publishing APIs, valid authentication to the operator sites is required.

The UDDI APIs are also based on SOAP. Specifically, the SOAPAction HTTP header is required, and operator sites must support the default namespaces in any SOAP documents. XML namespaces are used to distinguish different XML sets or vocabularies. In SOAP, the Envelope and Body tags belong to the namespace http://schemas.xmlsoap.org/soap/envelope/.

The following illustrates how a UDDI call might be accomplished within a SOAP envelope:

POST /get_bindingDetail HTTP/1.1
Host: www.someoperator.org
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn
SOAPAction: ""

<?xml version="1.0" encoding="UTF-8" ?>
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
  <Body>
    <get_bindingDetail generic="1.0" xmlns="urn:uddi-org:api">
      <bindingKey>...</bindingKey>
    </get_bindingDetail>
  </Body>
</Envelope>
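
For readers who want to assemble such a call in code, here is a minimal Python sketch that builds the HTTP headers and SOAP payload for a get_bindingDetail inquiry. It only constructs the message and does not send it. The function name is hypothetical, and the binding key used in the usage line is a made-up placeholder rather than a real registry key.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
UDDI_NS = "urn:uddi-org:api"


def build_get_binding_detail(binding_keys):
    """Return (headers, payload bytes) for a get_bindingDetail inquiry."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{UDDI_NS}}}get_bindingDetail", {"generic": "1.0"})
    # One bindingKey element per binding template being requested.
    for key in binding_keys:
        ET.SubElement(call, f"{{{UDDI_NS}}}bindingKey").text = key
    payload = b'<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(envelope, encoding="utf-8")
    headers = {
        "Content-Type": 'text/xml; charset="utf-8"',
        "Content-Length": str(len(payload)),
        "SOAPAction": '""',  # an empty quoted string, as in the example above
    }
    return headers, payload


headers, payload = build_get_binding_detail(["example-binding-key"])
print(headers)
print(payload.decode("utf-8"))
```

The serializer picks its own namespace prefixes, so the output will not match the listing above character for character, but it describes the same XML document.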