Mashup Data Sources for Yahoo! Pipes
 
Mashups
It’s been called the essence of Web 2.0. It’s the ability to combine pieces of different web sites to create something new, something meaningful.  Something for you and the people who have your tastes. Your social network. Not some mass market portal built by corporate programmers who think that they know you and your personal tastes.
 
Some call it a composite web site and others a Mashup site; we call it amalgamating web data through the process of transcoding. Whatever. It’s about giving you the data that you want on your mobile phone or desktop browser. It’s Web 2.0. It’s about you.
 
Yahoo! Pipes
Providing conduits through the internet backbone, they combine and filter data feeds to create for you a stream of useful information. They're Yahoo! Pipes and now <alt> Mashup technologies can channel HTML content through them.
 

 
 
 
Screen Cast and PDF Version
 
A screen cast of the Mashup Data Sources for Yahoo! Pipes is available here:
 
If you are viewing an HTML version of this document, some images may have been corrupted during the conversion process. For best viewing, a PDF version is available here:
 
 
 

 
RSS Designer Features
 
The Mashup Designer for RSS Feeds extends the reach of our Mashup development tools to include RSS feeds as deployment clients. In addition to creating the Dynamic XSL that generates the RSS feed by dynamically extracting remote content from one or more web sites, the Mashup Designer for RSS Feeds provides an easy-to-use visual tool that helps filter the Mashup content. The filter choices that the user defines in the Mashup Designer for RSS Feeds are used in the code generation of the Dynamic XSL.
 
To understand the underlying technology of our Mashup tools—and in fact to build a Mashup Widget—you will use our StableDOM technology. The StableDOM product documentation is located at http://altmobile.com/pdf/StableDOM.pdf. The XML Transcoding product documentation is located at http://altmobile.com/pdf/XML%20Transcoding.pdf. Review the DOM Browser product document to understand how to create enterprise Mashups that integrate SQL data, Web Services data, and HTML content. It is located at http://altmobile.com/pdf/DOM%20Browser.pdf. And finally, our Mashup monitoring and metadata implementation is documented here:  http://altmobile.com/pdf/Mashup%20Tools.pdf.
 
Leveraging the StableDOM and code generation technologies, our Mashup Designer for RSS Feeds provides the following major capabilities:
 
  1. An Opera browser-based design tool.
  2. Fill-in-the-blank form to define your RSS feed generation properties.
  3. Fill-in-the-blank form to define your content extraction properties.
  4. Viewing of the Mashup content by accessing live data.
  5. Generation of a Dynamic XSL to create your RSS Feed.
 
The following is a screen shot of the Mashup Designer for RSS Feeds when first launched:
 
 
As will be discussed later, the Mashup Designer for RSS Feeds will dynamically reconfigure itself depending on the Mashup content. The above screen shot was generated when viewing TABLE content. The Mashup Content Viewer will display the element name in its title bar which is highlighted in red above.
 
 
Understanding HTML Elements
HTML elements may be categorized in several ways:
  1. Elements and text nodes. In HTML, text is always contained by an element. When we generate the RSS feed, we can extract the text nodes from the Mashup content or we can leave the HTML as is. For example, we can extract the text from a table column or leave intact the TD element which contains the text.
  2. Visual elements and non-visual elements. For example, a META element is a non-visual node but an LI element (that is, a list item) containing text content is a visual element.
  3. Container elements and non-container elements. For example, a P element may contain text content directly or it might contain a B element which contains some text. An IMG element, on the other hand, can never contain any other element or text node.
  4. Collection elements and non-collection elements. The HTML elements TABLE, OL, UL, and DL are classified as collection elements because they are defined to contain homogeneous child elements, that is, elements of the same type. For example, the TABLE element may contain one or more TR elements, which are the table rows. The list elements OL and UL may contain one or more LI elements, which are list items.
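
The categories above can be sketched as simple lookups. The following Python is illustrative only: the set names and the classify helper are assumptions made for this example, they are not part of the Mashup tools, and the sets are deliberately incomplete.

```python
# Hypothetical lookup tables for illustration; deliberately incomplete.
NON_VISUAL = {"meta", "script", "style", "link", "title", "head"}
NON_CONTAINER = {"img", "br", "hr", "input", "meta", "link"}  # never hold children
COLLECTION_CHILD = {"table": "tr", "ol": "li", "ul": "li"}    # collection -> child type


def classify(tag):
    """Return the categories described above for an HTML element name."""
    name = tag.lower()
    return {
        "visual": name not in NON_VISUAL,
        "container": name not in NON_CONTAINER,
        "collection": name in COLLECTION_CHILD,
        "child_type": COLLECTION_CHILD.get(name),
    }


for tag in ("TABLE", "P", "IMG", "META"):
    print(tag, classify(tag))
```

A lookup of this kind is one way a design tool could decide which feed generation and extraction options to offer for a given root element.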

Defining RSS Feed Generation Properties
    
The Mashup Designer for RSS Feeds allows you to define the following feed generation properties:
 
  1. Generate an RSS feed containing one item. The RSS item will contain the root HTML element generated by the Mashup.

    In the below screen shot, the Mashup result is an HTML TABLE. The user has selected not to filter the result and therefore the RSS feed will be generated with one item that contains the HTML TABLE element.

    The feed properties that should be selected:



    And the Mashup result, which is a TABLE element, will be contained in the feed.



    Let's see this example with a simple P element:



    And an IMG element:



  2. Generate an RSS feed containing one item for each child element of the root HTML element generated by the Mashup. This option is only available for HTML collection elements such as TABLE, OL, and UL.

    In the below screen shot, the Mashup result is an HTML TABLE. The user has selected to filter the result and therefore the RSS feed will be generated with one item for each aspect of the TABLE element specified in the filter:



    In this screen shot, the user defined a column filter:



    And for an OL or UL element, there is one filter option:


    which is to include all LI elements.
  3. Generate an RSS feed containing one item for each child element of the root HTML element generated by the Mashup but apply a filter to exclude some child elements. This option is only available for the TABLE element.

    Though it is better to set column headers in the THEAD, most websites define the headers as the first row. Consequently, the RSS feed should probably exclude the first row.



    In this screen shot, the user can set this filter in the TR Exclusion Rule section:



 
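
The three feed generation choices above can be sketched in a few lines. The Dynamic XSL itself is not reproduced in this document, so the following Python only mirrors the choices. The build_feed function, its parameters, and the channel values are assumptions made for this example, and the input is assumed to be well-formed XHTML.

```python
import xml.etree.ElementTree as ET


def build_feed(root, per_child=False, exclude_first=False, title="feed item"):
    """Build an RSS 2.0 tree from a Mashup result element.

    per_child=False -> one item holding the whole root element.
    per_child=True  -> one item per child (for example, one per TR).
    exclude_first   -> skip the first child, for example a header row.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Mashup feed"         # placeholder value
    ET.SubElement(channel, "link").text = "http://example.com/"  # placeholder value
    ET.SubElement(channel, "description").text = "Generated from Mashup content"

    sources = list(root) if per_child else [root]
    if per_child and exclude_first:
        sources = sources[1:]

    for source in sources:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        # Store the serialized HTML as text; it is escaped when the feed is written out.
        ET.SubElement(item, "description").text = ET.tostring(source, encoding="unicode")
    return rss


table = ET.fromstring(
    "<table><tr><td>City</td></tr><tr><td>Oslo</td></tr><tr><td>Lima</td></tr></table>"
)
one_item = build_feed(table)                                     # choice 1
per_row = build_feed(table, per_child=True, exclude_first=True)  # choices 2 and 3
print(len(one_item.findall("./channel/item")))
print(ET.tostring(per_row, encoding="unicode"))
```

With the sample table, the first call yields a single item holding the whole TABLE, while the second yields one item per remaining TR after the header row is excluded.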
Defining Text Extraction Properties
 
The Mashup Designer for RSS Feeds allows you to define the following text extraction properties:
 
  1. Extract Element. Selecting this option will preserve the HTML element and its text content as returned from the Mashup or any filters defined in the RSS Feed Generation Properties. This is the sole option for elements that cannot contain text content such as the IMG element as seen here:




    Other elements such as the TABLE, OL, and UL elements also exhibit this constraint:


  2. Extract InnerText. Selecting this option will extract the child text nodes from their container HTML elements. In this screen shot example, the user wants to create an RSS feed containing one RSS item for each LI element. The RSS item will only contain the text content of the LI and not the LI element:



More precisely, the code generation technology uses these options to create a Dynamic XSL that contains the filtering and extraction logic in addition to the StableDOM functions. When invoked, the Dynamic XSL generates the desired RSS feed.
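
The difference between the two extraction options can be sketched as follows. This Python is only an illustration of the idea and not the generated Dynamic XSL. The extract function and its mode values are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def extract(element, mode="element"):
    """Return the item body for one element.

    mode="element"   -> keep the element and its content (Extract Element).
    mode="innertext" -> keep only the text nodes (Extract InnerText).
    """
    if mode == "innertext":
        return "".join(element.itertext()).strip()
    return ET.tostring(element, encoding="unicode")


ol = ET.fromstring("<ol><li>First <b>story</b></li><li>Second story</li></ol>")
for li in ol:
    print(extract(li, "element"), "|", extract(li, "innertext"))
```

For an element such as IMG, which has no text nodes, the inner-text mode returns an empty string, which is consistent with Extract Element being the only useful option for it.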
 
 
Viewing the Mashup Content
    
The Mashup Designer for RSS Feeds allows you to interactively view the Mashup content. To fetch the Mashup content, select the "Fetch Content" link as highlighted in this screen shot:
 
 
The Content Viewer will contain the Mashup content as seen here:
 
 
Generating Your RSS Feed
 
The Mashup Designer for RSS Feeds generates a Dynamic XSL which will be displayed in a new editor window. To invoke this service, select the link "Generate RSS Feed" as seen here:
 

 
The Mashup Designer for RSS Feeds will notify the user that the Dynamic XSL was successfully created as seen here:
 
 
 
And an example editor window containing the Dynamic XSL is seen here:
 
 
You can manually customize the Dynamic XSL as needed. For example, the default name of each RSS item is "feed item" as seen here:
 
 
It is also important to note that if the text of the Mashup source contains HTML entities, the code generation technology encloses that text in a CDATA node, as seen here:
 
 
 
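
As a rough illustration of CDATA wrapping, the following Python sketch wraps content only when it contains markup or entity references. The function names and the trigger rule are assumptions made for this example; the rule actually used by the code generation technology is not documented here.

```python
def wrap_cdata(text):
    """Wrap text in a CDATA section, splitting any embedded "]]>" terminator."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def item_description(content):
    """Use CDATA only when the content holds markup or entity references."""
    needs_cdata = "&" in content or "<" in content
    body = wrap_cdata(content) if needs_cdata else content
    return "<description>" + body + "</description>"


print(item_description("Caf&eacute; &amp; bar"))
print(item_description("plain text"))
```

Inside a CDATA section an XML parser does not interpret entity references, so an HTML-only entity such as &eacute; passes through to the RSS reader without making the feed invalid XML.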
You can then publish the Dynamic XSL and view its content in an RSS reader such as Firefox as seen here:

or integrate with Yahoo! Pipes or Google to perform GeoRSS.
 
Later in this document, we show how to integrate with Yahoo! Pipes.
 
Launching the Mashup Designer for RSS Feeds
    
As mentioned earlier, the Mashup Designer for RSS Feeds extends the reach of our Mashup development tools to include RSS Feeds as deployment clients. To launch the Designer, you should have already created a monitored Mashup by using the StableDOM Browser to visualize and select the remote HTML content.
 
The StableDOM HTML transcoding product documentation is located at http://altmobile.com/pdf/StableDOM.pdf. The XML Transcoding product documentation is located at http://altmobile.com/pdf/XML%20Transcoding.pdf. Review the DOM Browser product document to understand how to create enterprise Mashups that integrate SQL data, Web Services data, and HTML content. It is located at http://altmobile.com/pdf/DOM%20Browser.pdf. And finally, our Mashup monitoring and metadata implementation is documented here:  http://altmobile.com/pdf/Mashup%20Tools.pdf.
 
From the Mashup Monitor Tool, select the appropriate "RSS Icon link" as seen in the following screen shots:



Subsequently, the following events occur:
 
  1. An intermediate Dynamic XSL, representing the Mashup source and capable of dynamically accessing the remote content, will be generated. This will be displayed in an editor window.
  2. An RSS Designer Object Server will be launched on a random port to serve the Dynamic XSL. Just as a normal Object Server provides a micro web server that enables remote access to the XML or XSL content of the text editor (transcoding as needed from remote sites), the RSS Designer Object Server also launches a micro web server.

    In addition to all the features, HTTP debugging/monitoring, and manageability of a standard Object Server, the RSS Designer Object Server provides the following additional capabilities:
    a. Generates the Mashup Designer for RSS Feeds web page customized to support RSS development.
    b. Implements the previously described "Fetch Content" service which is accessible through the Mashup Designer for RSS Feeds. The Fetch Content service returns the updated content generated by the Dynamic XSL.
    c. Implements the previously described "Generate RSS Feed" service which is accessible through the Mashup Designer for RSS Feeds.
  3. The default browser, which should be the latest version of the Opera browser, will be launched to access the RSS Designer Object Server.
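
The idea of a micro web server on a random port can be sketched with the Python standard library. This is not the Object Server implementation. The start_server function and the sample content are assumptions made for this example, and binding to port 0 is simply the usual way to ask the operating system for any free port.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def start_server(content, host="127.0.0.1", port=0):
    """Serve `content` as XML; port 0 asks the OS for any free port."""
    body = content.encode("utf-8")

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "application/xml; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):  # keep the sketch quiet
            pass

    server = HTTPServer((host, port), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, server.server_address[1]


server, port = start_server("<rss version='2.0'><channel/></rss>")
conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
conn.request("GET", "/")
print(port, conn.getresponse().read().decode("utf-8"))
conn.close()
server.shutdown()
server.server_close()
```

The port number returned by start_server plays the role of the port shown in the editor window: a client needs it, together with the IP address, to reach the server.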
This is a screen shot of the Dynamic XSL editor window displaying the IP address and port number of the spawned RSS Designer Object Server:


 
The next section describes another RSS feed generation technology, called the RSS Object Server, which is better suited to creating a feed from multiple Mashup data sources and/or static content. Additionally, the RSS Designer Object Server provides access, via the Mashup Designer for RSS Feeds, to the Dynamic XSL source code that generates the RSS feed, whereas the RSS Object Server does not expose the program that generates the RSS feed.
 
RSS Object Server Design Features
 
Much like the RSS Designer Object Server, the RSS Object Server extends the reach of our Mashup development tools to include RSS feeds as deployment clients, thereby enabling any Mashup to be a data source for a Yahoo! Pipe or other RSS consumers. It differs from the RSS Designer Object Server as documented below.
 
Leveraging the StableDOM and code generation technologies, our RSS Object Server provides the following major capabilities:
 
  1. Intelligent transformation of HTML and XHTML Mashup content to RSS.
  2. Encoding of HTML character entities from HTML Mashup content.
  3. Optional wrapping of Mashup content in a CDATA node.
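
To illustrate why HTML character entities need attention, the following Python sketch shows one plausible way to make HTML text safe for an XML-based feed. The encode_for_rss function is an assumption made for this example; how the RSS Object Server actually encodes entities is not described in this document.

```python
import html
from xml.sax.saxutils import escape


def encode_for_rss(html_text):
    """Replace HTML-only entities (for example &nbsp;) with characters XML can carry."""
    decoded = html.unescape(html_text)  # &eacute; becomes the character itself, &amp; becomes &
    return escape(decoded)              # re-escape only &, < and > for XML


print(encode_for_rss("Caf&eacute; &amp; bar &lt;open&gt;"))
```

XML predefines only a handful of entities, so named HTML entities such as &nbsp; or &eacute; must either be converted, as sketched here, or shielded from the parser by a CDATA node.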
 
 
 
Launching the RSS Object Server
 
As mentioned earlier, the RSS Object Server extends the reach of our Mashup development tools to include Yahoo! Pipes as deployment clients. To launch the RSS Object Server, you should have already created a Mashup and generated the Dynamic XSL.
 
 
Once a Dynamic XSL has been created, you should launch an RSS Object Server by selecting the menu item "RSS Object Server>>With CDATA Wrapper" or "RSS Object Server>>Without CDATA Wrapper" from the popup menu as seen in the following screen shot:

 
    

 
Just as a normal Object Server provides a micro web server that enables remote access to the XML or XSL content of the text editor (transcoding as needed from remote sites), the RSS Object Server also allows you to launch a micro web server. This is seen in this screen shot:
 
 
 
In the above screen shot, you see how to launch an Object Server on a specific port. Unix users will be alerted about using restricted ports.
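
The restricted-port alert can be pictured with a small check. The following Python is a sketch under the common Unix convention that ports below 1024 require elevated privileges. The port_warning function and its message are assumptions made for this example and do not reproduce the product's alert.

```python
import os


def port_warning(port, is_unix=(os.name == "posix")):
    """Return a warning for ports that usually need elevated rights on Unix, else None."""
    if not 0 < port < 65536:
        raise ValueError("port must be between 1 and 65535")
    if is_unix and port < 1024:
        return "Port %d is restricted; it typically requires root privileges." % port
    return None


print(port_warning(8080))
print(port_warning(80))
```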
 
Review the DOM Browser product document to understand how to use the Object Server functionality: http://altmobile.com/pdf/DOM%20Browser.pdf.