Submitting Content

THE WORK (AS DEFINED BELOW) IS PROVIDED UNDER THE TERMS OF THIS LICENSE. THE WORK IS PROTECTED BY COPYRIGHT AND/OR OTHER APPLICABLE LAW. ANY USE OF THE WORK OTHER THAN AS AUTHORIZED UNDER THIS LICENSE OR COPYRIGHT LAW IS PROHIBITED.

NANOHUB.ORG ACCEPTS AND AGREES TO BE BOUND BY THE TERMS OF THIS LICENSE. THE LICENSOR (YOU) GRANTS NANOHUB.ORG THE RIGHTS CONTAINED HERE IN CONSIDERATION OF NANOHUB.ORG'S ACCEPTANCE OF SUCH TERMS AND CONDITIONS.

  1. Definitions
    1. "Collective Work" means a work, such as a periodical issue, anthology or encyclopedia, in which the Work in its entirety in unmodified form, along with a number of other contributions, constituting separate and independent works in themselves, are assembled into a collective whole. A work that constitutes a Collective Work will not be considered a Derivative Work (as defined below) for the purposes of this License.
    2. "Derivative Work" means a work based upon the Work or upon the Work and other pre-existing works, such as a translation, musical arrangement, dramatization, fictionalization, motion picture version, sound recording, art reproduction, abridgment, condensation, or any other form in which the Work may be recast, transformed, or adapted, except that a work that constitutes a Collective Work will not be considered a Derivative Work for the purpose of this License. For the avoidance of doubt, where the Work is a musical composition or sound recording, the synchronization of the Work in timed-relation with a moving image ("synching") will be considered a Derivative Work for the purpose of this License.
    3. "Licensor" means the individual or entity that offers the Work under the terms of this License.
    4. "Original Author" means the individual or entity who created the Work.
    5. "Work" means the copyrightable work of authorship offered under the terms of this License.
  2. License Grant

    Subject to the terms and conditions of this License, Licensor hereby grants nanoHUB.org a worldwide, royalty-free, non-exclusive, perpetual (for the duration of the applicable copyright) license to exercise the rights in the Work as stated below:

    1. to reproduce the Work, to incorporate the Work into one or more Collective Works, and to reproduce the Work as incorporated in the Collective Works;
    2. to create and reproduce Derivative Works;
    3. to distribute copies or phonorecords of, display publicly, perform publicly, and perform publicly by means of a digital audio transmission the Work including as incorporated in Collective Works;
    4. to distribute copies or phonorecords of, display publicly, perform publicly, and perform publicly by means of a digital audio transmission Derivative Works;

    The above rights may be exercised in all media and formats whether now known or hereafter devised. The above rights include the right to make such modifications as are technically necessary to exercise the rights in other media and formats. All rights not expressly granted by Licensor are hereby reserved, including but not limited to the rights set forth in Sections 4(d) and 4(e).

  3. Representations, Warranties and Disclaimer

    UNLESS OTHERWISE MUTUALLY AGREED BY THE PARTIES IN WRITING, LICENSOR AND NANOHUB.ORG BOTH OFFER THE WORK AS-IS AND MAKE NO REPRESENTATIONS OR WARRANTIES OF ANY KIND CONCERNING THE WORK, EXPRESS, IMPLIED, STATUTORY OR OTHERWISE, INCLUDING, WITHOUT LIMITATION, WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, NONINFRINGEMENT, OR THE ABSENCE OF LATENT OR OTHER DEFECTS, ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS, WHETHER OR NOT DISCOVERABLE. SOME JURISDICTIONS DO NOT ALLOW THE EXCLUSION OF IMPLIED WARRANTIES, SO SUCH EXCLUSION MAY NOT APPLY IN ALL SITUATIONS.

  4. Limitation on Liability

    EXCEPT TO THE EXTENT REQUIRED BY APPLICABLE LAW, IN NO EVENT WILL LICENSOR HOLD NANOHUB.ORG LIABLE ON ANY LEGAL THEORY FOR ANY SPECIAL, INCIDENTAL, CONSEQUENTIAL, PUNITIVE OR EXEMPLARY DAMAGES ARISING OUT OF THIS LICENSE OR THE USE OF THE WORK, EVEN IF LICENSOR AND/OR NANOHUB.ORG HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

  5. Termination
    1. This License and the rights granted hereunder will terminate automatically upon any breach by nanoHUB.org of the terms of this License. Individuals or entities who have received Collective Works or Derivative Works from nanoHUB.org under this License, however, will not have their licenses terminated provided such individuals or entities remain in full compliance with those licenses. Sections 1, 2, 5, 6, 7, and 8 will survive any termination of this License.
    2. Subject to the above terms and conditions, the license granted here is perpetual (for the duration of the applicable copyright in the Work). Notwithstanding the above, Licensor reserves the right to release the Work under different license terms or to stop distributing the Work at any time; provided, however that any such election will not serve to withdraw this License (or any other license that has been, or is required to be, granted under the terms of this License), and this License will continue in full force and effect unless terminated as stated above.
  6. Miscellaneous
    1. Each time nanoHUB.org distributes or publicly digitally performs the Work, a Collective Work, or a Derivative Work, the Licensor offers to the recipient a license to the Work on the same terms and conditions as the license granted to nanoHUB.org under this License.
    2. If any provision of this License is invalid or unenforceable under applicable law, it shall not affect the validity or enforceability of the remainder of the terms of this License, and without further action by the parties to this agreement, such provision shall be reformed to the minimum extent necessary to make such provision valid and enforceable.
    3. No term or provision of this License shall be deemed waived and no breach consented to unless such waiver or consent shall be in writing and signed by the party to be charged with such waiver or consent.
    4. This License constitutes the entire agreement between the parties with respect to the Work licensed here. There are no understandings, agreements or representations with respect to the Work not specified here. Neither Licensor nor nanoHUB.org shall be bound by any additional provisions that may appear in any communication between the parties. This License may not be modified without the mutual written agreement of the Licensor and nanoHUB.org.

PUNCH: Purdue University Network Computing Hub

Developed at: Purdue University
Deployed: 1995
Retired: April 2007
Usage: 4,600 users, 395,000 simulations

PUNCH began as a project to create an Internet computing platform that would provide a distributed computing portal on the early World Wide Web. Applications could be installed "as is" with simple web-form interfaces for uploading and editing input files, running the application in batch mode, and then retrieving, viewing, or post-processing output files. Alternatively, more elaborate web interfaces could be constructed within PUNCH to drive an application's many flags, preprocessors, and postprocessors automatically. Later versions of PUNCH added virtual filesystem support on execution nodes, support for VNC-based graphical, interactive application use, and early links to Condor and Globus submission. PUNCH was designed to give users access to complex software packages without forcing them through difficult installation processes or arcane commands.

Early implementations of PUNCH focused on scientific simulation applications for three different disciplines: Computational Electronics (CE Hub), Computer Architecture Research and Education (NETCARE), and Electronic Design Automation (EDA Hub). Of these, the Computational Electronics Hub was the most successful. First brought online in 1996, it was transformed a few years later into a hub specialized in nanotechnology, becoming the first incarnation of "nanoHUB".

Architecture

The architecture of a PUNCH system consisted of a web server and one or more backend execution nodes where applications were run. In the PUNCH environment, all users' operations ran in the context of captive shadow accounts. The applications provided in this environment were necessarily highly restricted so that an individual user could not exploit flexibility in the software to run arbitrary commands or access other users' files.

Later developments for PUNCH included support for graphical, interactive sessions via VNC, with connections made directly to the applications running on the backend systems. To enable this, a virtual filesystem subsystem was also developed to export user home directories to remote execution hosts. The job scheduler was broken out into a distinct subsystem, where some primitive load balancing was done to improve the overall efficiency of PUNCH jobs. Early testing of integrated Condor and Globus submission was complete by 2001.
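The primitive load balancing mentioned above can be sketched as a least-loaded choice among live execution nodes. This is an illustrative assumption about the scheduler's logic, not PUNCH's actual code; the data shape and function name are invented:

```python
def pick_execution_node(nodes):
    """Select a live execution node with the lowest load.

    `nodes` maps hostname -> {'alive': bool, 'load': float}; this
    data shape is an assumption for illustration only.
    """
    candidates = {host: state for host, state in nodes.items() if state["alive"]}
    if not candidates:
        raise RuntimeError("no execution nodes available")
    # Primitive load balancing: the lowest reported load wins.
    return min(candidates, key=lambda host: candidates[host]["load"])
```

A scheduler of this kind needs only periodic load reports from each node, which keeps the subsystem small at the cost of ignoring memory pressure and job mix.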

Although it was a significant achievement for its time, PUNCH suffered from a number of shortcomings. First, it required a specialized web server: PUNCH's operations were integrated into its own HTTP 1.0 server, so modern features of Apache could not be fully utilized, and active development of the HTTP server portions of PUNCH fell behind. PUNCH was also designed in a research-project environment where robustness, reliability, and maintainability of the codebase were not high priorities; over time, this led to a great deal of maintenance work to keep the services operating properly. Additionally, although simple input-file applications could be added to PUNCH quickly and easily, constructing a better, more intuitive form-input system for an application could be extremely time-consuming. Finally, while graphical interface support was added, the principal mode of operation was still form-based, a user interface that modern computer users found increasingly undesirable, and the form-based batch application support code in PUNCH was, like the HTTP code, becoming unnecessary baggage.

Because of these shortcomings, several projects emerged in attempts to improve upon PUNCH. In spite of its problems, PUNCH remained in operation until 2007, concurrently with all of the middleware systems that followed it (described below), and its 12-year operational lifespan is a testament to its success.


In-VIGO: In-Virtual Grid Organizations

Developed at: University of Florida, ACIS
Deployed: May 2005
Retired: February 2006
Usage: 1,300 users, 11,000 sessions

The In-VIGO project built upon lessons learned from the development of PUNCH. Namely:

  • The maintainability of the system is predicated on the use of standard components.
  • The isolation of accounts, networks, and applications is crucial to the security of the system.

As a result, the In-VIGO system was designed with the following features:

  • All hosts used in the system were (VMware) virtual machines in order to separate the In-VIGO environment from the physical hosting environment.
    • A front-facing web server provided a form-based interface for application sessions. It was built from standard software components, such as Apache, Java, Tomcat, and MySQL.
    • A firewall host provided a static mapping of ports to internal VNC servers. This allowed for greater isolation between execution hosts and the outside world.
    • One or more backend systems provided application execution environments.
    • A fileserver provided central storage space.
  • Applications ran on each execution node in one of several shadow accounts so that one application could not affect another application running on the same host.
  • A virtual file system based on NFS provided the ability to map directories on the file server to any shadow account on any execution host. This allowed all files to be stored as a single user (the owner of the web server) on the file server.
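The virtual file system's directory mapping can be sketched as constructing a per-session NFS mount that places a user's directory (stored under a single owner on the file server) into a shadow account's home. All hostnames, paths, and names here are hypothetical; In-VIGO's real layout is not documented in this text:

```python
def shadow_mount_command(user, shadow_account, fileserver="fs.invigo.example",
                         storage_root="/export/storage"):
    """Build the NFS mount command that maps a user's directory on the
    central file server into a shadow account's home directory on an
    execution host. The hostname and paths are hypothetical.
    """
    source = f"{fileserver}:{storage_root}/{user}"   # single-owner storage on the file server
    target = f"/home/{shadow_account}"               # shadow account home on the execution host
    return ["mount", "-t", "nfs", source, target]
```

On session teardown, the matching unmount would release the shadow account for reuse by another user.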

In-VIGO was used through nanoHUB by bridging operations from one web server to the other and occasionally directly modifying entries in the In-VIGO database. Because the combined system was spread over two web servers and two databases, session handling was complicated, application startup took several seconds, and error handling was tenuous. In the event of a failure, it was difficult for the end user to diagnose the problem.

In-VIGO had a rich capability for specifying form-based interaction, which was, unfortunately, not directly applicable to the mostly graphical applications used in nanoHUB. Deployment of any application nevertheless required configuration of the command-line-based tool, and additional operations were necessary at invocation time to access a graphical interface for the application. This style of specification contributed to delays in both application deployment and middleware session startup.

Further information about In-VIGO is available at the In-VIGO page.


Narwhal

Developed at: Purdue University
Deployed: January 2006
Retired: May 2007
Usage: 6,000 users, 60,000 sessions, 220,000 simulations

Design goals

Narwhal was built with the following goals in mind:

  • Support only graphical sessions; provide no form-based interaction.
  • Leverage the existing nanoHUB LDAP authentication mechanisms so that first-class accounts and file storage could be used for all applications.
  • Make the system as small, fast, and reliable as possible.
  • Automatically recover from failure as much as possible.

Architecture

Narwhal was built in the following manner:

  • A program running on the nanoHUB web server was used to control operations.
    • All authorization was performed by the web server; no further authorization was done by Narwhal.
    • Operations and accounting were done through the nanoHUB web server's existing MySQL database.
    • All interaction between hosts was done using SSH.
  • Additional service software ran on the following hosts:
    • One or more backend execution hosts provided application environments.
    • A central file server provided file storage using conventional multi-user NFS services.
    • A firewall host was used to dynamically map network ports from the outside world to VNC servers running on the backend execution hosts. Dynamically mapping ports avoids the error-prone maintenance of static mapping. The flexibility of this mapping also allowed for the simple deployment of an embedded file transfer mechanism integrated with Rappture.
  • All hosts ran on (Xen) virtual machines in order to separate the nanoHUB environment from the physical hosting environment.
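The firewall host's dynamic port mapping can be sketched with iptables DNAT rules: an externally visible port is forwarded to the VNC server's port (5900 plus the display number, per VNC convention) on a backend execution host. The chain and rule details here are a plausible sketch, not Narwhal's actual implementation:

```python
def vnc_port(display):
    """VNC display N conventionally listens on TCP port 5900 + N."""
    return 5900 + display

def dnat_rule(external_port, backend_host, display):
    """Build an iptables DNAT rule that forwards an externally visible
    port on the firewall host to a VNC server on a backend execution
    host. Chain choice and rule details are illustrative assumptions.
    """
    return ("iptables -t nat -A PREROUTING -p tcp "
            f"--dport {external_port} "
            f"-j DNAT --to-destination {backend_host}:{vnc_port(display)}")
```

Because each rule is generated per session, the mapping can be installed when a session starts and deleted when it ends, avoiding the error-prone upkeep of a static table.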

Benefits

Because the Narwhal system constituted a session controller for Unix systems, it could naturally provide a number of other benefits. Primarily, it allowed for the invocation of workspace sessions consisting of a typical Unix/X11 window manager in an environment that was the same as the general application environment. This allowed for ease of transition between development and deployment of applications. Because all files were owned by the respective users, it was possible to employ per-user quotas on the fileserver so that one user could not accidentally or deliberately use all of the storage space.

Components of Narwhal acted in parallel to allow for fast session creation. For instance, dynamic port mapping could be done at the same time as filesystem configuration. Furthermore, pre-caching of services was done as much as possible. VNC servers were allocated in advance of their use so that when a user requested a new application session, an appropriate execution host could be selected and the application started without having to wait for the display server to start. The user-perceived application start time was reduced to 4-8 seconds.
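The pre-caching of VNC servers can be sketched as a small pool that is kept topped up ahead of demand, so that acquiring a server for a new session is immediate. The class and callable names are illustrative, not Narwhal's actual code:

```python
import collections

class VncPool:
    """Keep a few VNC servers started ahead of demand so that a new
    session never waits for display-server startup."""

    def __init__(self, start_server, target=4):
        self.start_server = start_server   # callable that launches a server and returns a handle
        self.target = target               # desired number of idle, pre-started servers
        self.idle = collections.deque()
        self.refill()

    def refill(self):
        # Top the pool back up to its target size.
        while len(self.idle) < self.target:
            self.idle.append(self.start_server())

    def acquire(self):
        server = self.idle.popleft()       # hand out a pre-started server immediately
        self.refill()                      # replace it (in Narwhal, this would happen asynchronously)
        return server
```

In a real deployment the refill would run in the background; doing it inline here keeps the sketch short.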

Narwhal incorporated a mechanism that monitored the presence of a viewer on each VNC server. If a particular application's VNC server lacked an observer for longer than a configured timeout, the application was terminated. This automatically prevented unchecked growth in resource consumption and relieved users of the burden of manually terminating every application they were no longer interested in. The cumulative time that each session was viewed was reported to the middleware and recorded in the statistics database, where application developers could use it for further analysis.
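The viewer-presence check can be sketched as a periodic sweep that flags sessions whose VNC server has gone unobserved beyond the timeout. The data shape is an assumption for illustration:

```python
def find_idle_sessions(sessions, now, timeout):
    """Return the ids of sessions whose VNC server has had no viewer
    for longer than `timeout` seconds.

    `sessions` maps session id -> timestamp when the last viewer
    disconnected, or None while a viewer is still connected; this
    shape is an illustrative assumption.
    """
    return [sid for sid, last_viewed in sessions.items()
            if last_viewed is not None and now - last_viewed > timeout]
```

A reaper loop would call this periodically and terminate each returned session's application.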

Reliability

Reliability of the Narwhal system was achieved through redundancy, retries, and triage of subsystems. The core philosophy was that some systems would occasionally fail, and that the end user should not be made aware of this unless there was no other recourse. For instance, VNC servers sometimes failed at startup due to locking failures or unknown race conditions. When a host was selected on which to start a VNC server, the status of the operation was checked; if a failure was detected, the operation was retried, and if failure persisted, the conditions were recorded and the system avoided that particular subsystem. A secondary or tertiary choice of host was then used and the process continued. Only when all avenues were exhausted were the user and administrator notified that some larger, general failure had occurred. Because resources such as VNC servers were pre-started in advance of need, the additional delay in creating a server was hidden from the end user.
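The retry-and-triage flow described above can be sketched as follows; the function names and retry count are illustrative assumptions:

```python
def start_with_failover(hosts, try_start, retries=2):
    """Try to start a VNC server: retry on the same host, then fail
    over to the next host, recording hosts that persistently fail.

    `try_start(host)` is assumed to return a server handle on success
    and raise an exception on failure.
    """
    failed = []
    for host in hosts:
        for _ in range(retries):
            try:
                return host, try_start(host)
            except Exception:
                continue               # transient failure: retry on the same host
        failed.append(host)            # persistent failure: triage this host and move on
    raise RuntimeError(f"all execution hosts failed: {failed}")
```

Only the final RuntimeError would ever surface to the user; every earlier failure is absorbed by the retry and failover logic.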

Testability

Narwhal made it possible to stress-test systems in advance of production use. In a development environment, an administrator could start, for instance, 1000 random application sessions in rapid succession. Alternatively, 100 sessions could be started simultaneously (as would happen when students in a laboratory classroom started to use the system). Tests of this nature uncovered and corrected bugs in the software, and the expected behavior and capacity of the system could be established in advance of production needs.
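A stress test of this kind can be sketched with a thread pool that launches many sessions concurrently; `start_session` is a stand-in for whatever entry point the middleware actually exposed:

```python
import concurrent.futures
import random

def stress_test(start_session, apps, n_sessions, parallel=100):
    """Launch many application sessions concurrently, as in the
    classroom scenario: `parallel` sessions in flight at once until
    `n_sessions` have been started. `start_session(app)` is assumed
    to return a session id or raise on failure.
    """
    results = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=parallel) as pool:
        futures = [pool.submit(start_session, random.choice(apps))
                   for _ in range(n_sessions)]
        for future in concurrent.futures.as_completed(futures):
            results.append(future.result())   # re-raises any session-start failure
    return results
```

Running this against a development deployment exposes race conditions and capacity limits before real users do.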

About the name

The name "narwhal" was not an acronym and did not refer to anything. It was a name chosen at random in the hope that someone would come up with something more meaningful. For lack of a better name, most people referred to it, colloquially, as "In-VIGO Lite."


Maxwell

Developed at: Purdue University
Deployed: May 2007

Maxwell is a middleware system designed to provide the next-generation capabilities of a true science gateway. Despite the stability and reliability of the Narwhal system, its shortcomings were clear and increasing in importance:

  • Users with browsers that lacked necessary software (Java and JavaScript) were given no clues as to why their application session did not work. Such problems were responsible for a significant portion of our support requests.
  • Remote users who were behind a filtering firewall often could not connect to the broad range of ports used by Narwhal's dynamic VNC port router. The use of such firewalls is becoming increasingly common in both corporate and academic environments. This problem was also responsible for a significant portion of the incoming requests for support.
  • Security of VNC transport was weak due to lack of end-to-end encryption of the network connection. A motivated attacker could eavesdrop on the network connection and interpret the keyboard and mouse activity. Although this was of little concern for typical nanoHUB applications, improved security of workspace sessions was a high priority.
  • File transfer operations were also not encrypted and could be inspected by eavesdroppers and potentially exploited.
  • Preconfigured sizes for VNC sessions were a limiting factor for load balancing as well as application configuration. New sessions always had to be allocated to the execution hosts holding pre-started VNC servers of the appropriate display geometry. Furthermore, the inability to make a running application "just a little bit larger" was a perennial deficiency.
  • Running applications were shared among a pool of machines. Although Unix access controls separated processes from each other, users with interactive workspace sessions could identify other users as well as observe the names of the programs that they were running. Furthermore, workspace users effectively had unrestricted access to the local network which made it inappropriate to grant such capabilities to users who were unknown or not believed to be trustworthy.
  • Recording of statistics about individual simulations within an application session was done by writing information to an output log that was subsequently read and interpreted by the middleware. Although this was sufficient for captive simulation applications, it could not be used to identify programs run by interactive workspace users or jobs executed in remote venues such as Condor or PBS.

Design goals

To address the deficiencies in Narwhal, a system was designed with the following new features:

  • An enhanced VNC server and client were developed that were capable of:
    • integral diagnosis of browser deficiencies.
    • transiting firewalls by use of a proxy.
    • full SSL encryption of the entire session.
    • negotiating a connection through a nanoHUB port router that used a single, standard port (563) tolerated by filtering firewalls.
    • arbitrary dynamic resizing of the framebuffer.
    • an upcall mechanism to invoke limited actions on the browser, such as initiating file upload or download.
  • File transfers are performed using secure HTTPS through the central nanoHUB web server. Furthermore, the invocation no longer happens through a dedicated connection through a port router to a specialized Rappture application. File import and export can be initiated by general applications and by interactive workspace users.
  • The use of the OpenVZ virtualization system was introduced in order to provide independent, secure virtual machines for all applications.
  • A submission mechanism was introduced that would interact with the middleware to register information about users' jobs as well as route them to remote resources.

Architecture

Compared to previous middleware systems, Maxwell is very web-server-centric. Instead of using a port router "firewall" host, the web server uses a daemon to dynamically relay incoming VNC connections to the execution host on which an application session is running. Instead of using the port router to set up a separate channel for a file import or export operation, Maxwell uses VNC to trigger an action on the browser, which relays the file transfer through the main nanoHUB web server. The primary advantage of consolidating these capabilities into the web server is that it limits the entry point to the nanoHUB to one address: www.nanohub.org. This simplifies the security model and reduces the number of independent security certificates to manage.
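The relay daemon's core job, copying bytes between an incoming VNC connection and the execution host, can be sketched as a minimal two-way TCP relay. This is a generic sketch, not Maxwell's actual daemon, which would additionally handle SSL and session lookup:

```python
import socket
import threading

def relay(client_sock, backend_host, backend_port):
    """Relay one incoming VNC connection to the execution host where
    the session runs: copy bytes in both directions until one side
    closes. Session routing and encryption are omitted from this sketch.
    """
    backend = socket.create_connection((backend_host, backend_port))

    def pump(src, dst):
        # Copy bytes in one direction until the source closes.
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
        dst.close()

    # One direction runs in a helper thread; the other runs here.
    threading.Thread(target=pump, args=(client_sock, backend), daemon=True).start()
    pump(backend, client_sock)
```

The web server would spawn one such relay per accepted connection after looking up which execution host owns the session.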

One disadvantage of consolidating most communication through the web server is the lack of scalability when too much data is transferred by individual users. To avoid a network traffic jam, the web server can be replicated and clustered under one name by means of DNS round-robin selection.

The backend execution hosts that support Maxwell can operate with conventional Unix systems, Xen virtual machines, and a new form of virtualization based on OpenVZ. For each system, a VNC server is pre-started for every session. When OpenVZ is used, that VNC server is started inside of a virtual container. Processes running in that container cannot see other processes on the physical system, see the CPU load imposed by other users, dominate the resources of the physical machine, or make outbound network connections. By selectively overriding the restrictions imposed by OpenVZ, we can synthesize a fully private environment for each application session that the user can use remotely.

About the name

The improvements over Narwhal represented a shift in philosophy from a middleware system that allocated resources to one that regulated access between systems and between zones of security. This is similar in spirit to James Clerk Maxwell's Dæmon—a thought experiment where a hypothetical shutter regulated the movement of molecules between two chambers. Furthermore, the use of the term daemon was already prevalent in the vernacular of software system architecture. It was for these reasons that the name "Maxwell" was selected.