WebSTAR/5.0 Design
WebSTAR is a threaded server, which means all active tasks being performed by the server reside in the same process space. Each thread has a specific purpose: some threads are for handling connections, other threads handle idle-time tasks, while still others are specific to start-up and shut-down tasks.
The server implements a single class, WS_Thread, which encapsulates all common features of these threads and the basic implementation of all WSAPI callbacks. Some callbacks, such as the stream calls, are available to all threads and are therefore implemented in this base class. Other callbacks, such as WSAPI_SendHTTPData, are only valid for a specific subclass so the base implementation simply returns WSAPI_E_WrongRoleForCall. All of these callbacks are virtual functions, so any plug-in can call any callback from any thread and still get the correct implementation.
The data carried by the class is often referred to as the "context" for that thread. For a request-processing thread this information can be extensive, while an idle-time thread may carry very little information. When a plug-in is called, the thread's object pointer is cast to a void* and is saved into the WSAPI_CommandPB.api_data field along with the ID of the plug-in being called. When the plug-in invokes one of the server's callback functions, this pointer is cast back to a WS_Thread* and the appropriate member function is invoked.
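The round trip described above can be sketched as follows. This is only an illustration under stated assumptions: the result-code values, the plugin_id field name, and the simplified WSAPI_SendHTTPData signature are stand-ins, not the real WSAPI declarations.

```cpp
#include <string>

// Result codes named in the text; the numeric values are assumptions.
enum WSAPI_ErrorCode { WSAPI_I_NoErr = 0, WSAPI_E_WrongRoleForCall = -1 };

// Simplified stand-in for the parameter block.
struct WSAPI_CommandPB {
    void* api_data;   // opaque pointer to the calling thread's context
    long  plugin_id;  // ID of the plug-in being called (field name assumed)
};

class WS_Thread {
public:
    virtual ~WS_Thread() = default;
    // Base implementation: the call is not valid for this kind of thread.
    virtual WSAPI_ErrorCode SendHTTPData(const std::string&) {
        return WSAPI_E_WrongRoleForCall;
    }
};

class WS_Connection : public WS_Thread {
public:
    std::string sent;  // stands in for the network connection
    WSAPI_ErrorCode SendHTTPData(const std::string& data) override {
        sent += data;
        return WSAPI_I_NoErr;
    }
};

class WS_Idle : public WS_Thread {};  // inherits the "wrong role" behaviour

// Server side: store the context in the PB before calling the plug-in.
inline void PrepareCommandPB(WSAPI_CommandPB& pb, WS_Thread* ctx, long pluginID) {
    pb.api_data = static_cast<void*>(ctx);
    pb.plugin_id = pluginID;
}

// Callback entry point: recover the context and dispatch virtually.
inline WSAPI_ErrorCode WSAPI_SendHTTPData(WSAPI_CommandPB* pb, const char* data) {
    WS_Thread* self = static_cast<WS_Thread*>(pb->api_data);
    return self->SendHTTPData(data);
}
```

Because the pointer is stored as a WS_Thread* and cast back to the same type, the virtual call selects the subclass implementation without the plug-in knowing which kind of thread it is running on.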
Normally these thread objects are only accessed from their associated OS thread. However, this is not always the case. Some plug-ins need a much larger stack space than what is provided by the server. When these plug-ins receive a command, they launch their own thread and pass the parameter block to the new thread, putting the old thread to sleep until the new thread completes its task. In this case, the thread that is executing carries the old object with it and is accessing its data. In another common case, plug-ins need access to the callbacks from their own maintenance threads, which are always running, and therefore ask the server to provide a special parameter block for this purpose. In both cases you can pass a parameter block (and therefore its context) from one thread to another, but only one thread may use any given parameter block at any given time. This is because all of the code in WebSTAR is reentrant, but the data in a context/object is not thread-safe.
WS_Thread Descendants
There are five subclasses of the WS_Thread class:
1. WS_Main is the class for the main server thread. It is responsible for start-up, shut-down, and some major events.
2. WS_Connection is the class used for all HTTP requests, and generates all commands related to request-processing. It encapsulates the current protocol state, request parameters, network connections, and authentication state, among other things.
3. WS_ListenGroup is the class created when a plug-in calls WSAPI_ListenStream. It creates the WS_Listen threads, manages the server factory, and organises the listen shut-down process. WSAPI_ListenerRef is really a typecast pointer to this object.
4. WS_Listen is a thread that handles connections that arrive through a WS_ListenGroup and dispatches the WSAPI_ListenComplete command to the plug-in that originated the group.
5. WS_Idle is an idle thread for periodic maintenance. It is also the kind of object provided by calls to WSAPI_CreateContext. When used as a custom context, a new thread is not created; the object is simply created and returned without launching the corresponding thread.
WSAPI_ParamKeywords Implementation
Most callbacks are defined either in the WS_Thread class or the appropriate subclass. The major exception to this is Get/SetIndexedParameter, which is implemented in all of the classes. Each implementation examines the requested keyword, and if it doesn't know how to handle the keyword it calls the WS_Thread implementation.
This works out very nicely, since all of the global keywords can be implemented in WS_Thread, and each subclass only has to implement the keywords specific to that kind of context. Since some keywords are not applicable in some contexts, while a few keywords have different meanings in some contexts, this has been a useful design.
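A minimal sketch of this fallback pattern, assuming a simplified GetParameter signature and placeholder values; the real Get/SetIndexedParameter callbacks take a parameter block, an index, and a descriptor, which are omitted here.

```cpp
#include <string>
#include <utility>

// Keywords named elsewhere in this document; the enum itself is an assumption.
enum Keyword { piServicesList, piVHostName };

class WS_Thread {
public:
    virtual ~WS_Thread() = default;
    // Global keywords are handled here. Returns false for unknown keywords.
    virtual bool GetParameter(Keyword key, std::string& out) const {
        if (key == piServicesList) {
            out = "svcA\tsvcB";  // placeholder value for the sketch
            return true;
        }
        return false;
    }
};

class WS_Connection : public WS_Thread {
public:
    explicit WS_Connection(std::string vhost) : vhost_(std::move(vhost)) {}
    // Connection-specific keywords first, then defer to the base class.
    bool GetParameter(Keyword key, std::string& out) const override {
        if (key == piVHostName) {
            out = vhost_;
            return true;
        }
        return WS_Thread::GetParameter(key, out);
    }
private:
    std::string vhost_;
};
```

A keyword that only makes sense for a connection simply fails in any other context, because the base class does not know it.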
WSAPI/2 Changes
There are many changes to WSAPI in version 2. Some cope with process/thread model changes in MacOS X, others improve areas that were weak, and still others improve performance.
The most serious change for most plug-ins is in general connection processing. In the past, plug-ins were required to generate their own headers. This had a few undesirable side-effects. First, developers had to know a lot about the HTTP specification. Second, it was difficult to deal with persistent connections and chunked encoding, and supporting these features with dynamically generated data required a lot of additional code. Finally, it wasn't possible for one plug-in to affect the outgoing headers in another plug-in's responses (eg: adding an expired header).
In WSAPI/2, plug-ins don't generate their own headers. Instead, plug-ins use the piResponseHeaderField indexed parameter to set the response fields they want (most notably content-type) and set the response code in the PB before calling WSAPI_SendHTTPData, and the server takes care of the details. This includes deciding whether or not the connection can be persistent, whether to use chunked encoding, adding the Server field, etc.
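To illustrate the division of labour, the sketch below shows a server-side routine assembling a header from fields that plug-ins have set. The function name, the "WebSTAR" Server value, and the case-sensitive field map are assumptions made for the example; decisions about persistence and chunked encoding are left out.

```cpp
#include <map>
#include <string>

// Builds a response header from the status code and the fields set by
// plug-ins. The server adds its own Server field if none was supplied.
std::string BuildResponseHeader(int code, const std::string& reason,
                                std::map<std::string, std::string> fields) {
    if (fields.count("Server") == 0) {
        fields["Server"] = "WebSTAR";  // assumed value
    }
    std::string out = "HTTP/1.1 " + std::to_string(code) + " " + reason + "\r\n";
    for (const auto& field : fields) {
        out += field.first + ": " + field.second + "\r\n";
    }
    out += "\r\n";  // blank line ends the header
    return out;
}
```

Since every field passes through one place, a second plug-in can add or change a field before the header is sent, which was not possible when each plug-in wrote its own header text.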
Plug-ins that want to keep sending full headers must set the capFullHeaders flag during the Register or Init commands. Plug-ins that are linked to the WSAPI/1 frameworks will receive a Register command with the capFullHeaders bit already set.
There is also a new command, WSAPI_FileInfo, which is called after the Filter command but before the Run command. All plug-ins that have registered with WSAPI_RegisterFileInfo will receive this command. During this command, the plug-in may change how the request will be processed by changing parameters for that connection. This enables a plug-in to grab control of a request in the Run command simply by setting a parameter instead of re-writing the whole request, which is a dangerous and error-prone procedure. This is also much faster, and makes the Filter command largely obsolete.
Currently, the following parameters can be changed during the FileInfo command. Others may be added to the list in the future. Please let us know if any important ones are missing.
piAction
piActionPath
piFileMimeType
piURLPhysPath
Error Handling
Error handling also changes, to become much more flexible. Instead of an error file, the administrator specifies an error directory. By default, this is the root directory of the current virtual host. When an error is encountered, the server will look for an error file with a name containing the same number as the error (eg: 404.* for a 404 error, 403.* for a 403 error, etc.). If that file is not present, the server looks for a more generic filename (400.* for 4xx errors, 500.* for 5xx errors, etc.). If no such file is found, then the server will look for "error.*", and as a last resort will use a hard-coded response. This allows a site to have very customised pages for a wide range of errors.
The appropriate parameters (piScriptName, piURLPhysicalPath, etc) are changed to reflect the chosen error file, and the plug-in responsible for handling that kind of file is called to process the file.
Implementation Note: The error processing routine only checks for .html files. When the file-info cache is complete, this limitation will be lifted.
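The lookup order described in this section can be sketched as a function that lists the candidate file patterns for a status code. The function name is hypothetical, and the sketch only produces the patterns; matching them against the error directory is left to the caller.

```cpp
#include <string>
#include <vector>

// Returns the error-file patterns to try, most specific first.
std::vector<std::string> ErrorFileCandidates(int status) {
    std::vector<std::string> names;
    names.push_back(std::to_string(status) + ".*");       // e.g. 404.*
    const int generic = (status / 100) * 100;              // e.g. 400
    if (generic != status) {
        names.push_back(std::to_string(generic) + ".*");   // e.g. 400.*
    }
    names.push_back("error.*");                            // catch-all file
    return names;  // if none exists, the server uses its hard-coded response
}
```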
Virtual Hosts and Settings
A few changes have been made to improve a plug-in's ability to behave differently on a per-virtual-host basis:
More parameters are settable on a per-connection basis, especially since many features have been moved out of the server and into plug-ins.
Two parameters have been added: piVHostID and piVHostName. They are the virtual host ID (a long) and the virtual host user-interface name. They can be set during the WSAPI_Route command, and read during any other connection-related command. Knowing which virtual host is in effect for a connection is important for anything configured on a per-VH basis.
The piSettingsData parameter has been added. This is an indexed parameter, where the index is simply a long cast to the appropriate type. The index is the ID of the virtual host whose settings you wish to set or retrieve. The value of the parameter is any data you wish to store or retrieve for that virtual host, provided it is XML friendly (well-formed). The data is kept in the WebSTAR settings file for that virtual host. Using piSettingsData without an index indicates the system-wide settings. A plug-in can only retrieve/change its own settings.
Access/Auth
There have been substantial changes to the access-control model. Plug-ins registered to handle access issues determine which URLs are protected and the authentication methods to use. Authentication plug-ins generate challenges and verify credentials.
The new model also allows plug-ins to check a user's access rights to files other than the current URI, which is handy for things like #include directives.
Please read the security document for details.
IPv6 Compatibility
New Stream and Listen callbacks will have to be provided to work properly with IPv6. Depending on customer demand, this may be delayed until WSAPI 2.1.
Services
Plug-in-to-plug-in services must now be routed through the new WSAPI_RegisterService and WSAPI_CallService callbacks. This is to deal with the mixed thread model imposed by the combination of MacOS X and cooperative plug-ins. See the Preemption Support document for details.
Memory
Callbacks have been added to reallocate memory, and to provide faster pooled-memory access.
See the Preemption Support document for details.
Security API
The WSAPI security support, like the rest of the API, should be modular and flexible. Experience has shown that there is a need for diverse authentication methods, more granular access control, and support for the cooperation of multiple plug-ins that have no prior knowledge of each other.
The WSAPI security model has two parts: Authentication and Access. These functions are designed to be mostly independent of each other, possibly residing in different plug-ins supplied by different vendors. An access handler decides if a given URI is protected or not, whether or not the user has access, and the nature of access the user should be granted. An authentication handler challenges users to identify themselves, and determines if they are who they say they are.
When a request arrives, the server calls each registered access handler. Each handler may either accept or decline responsibility for controlling access to the URI in question. If all handlers decline, then a default access privilege of read-only is granted. If a handler accepts responsibility for a URI then all other handlers are ignored. The responsible access handler looks up the user's credentials and then either denies all access or sets the user's access permissions for that URI.
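The handler chain can be modelled as below. The outcome names, the permission flags, and the use of std::function are assumptions for the sketch; real handlers are plug-in commands that receive a parameter block.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical outcomes and permission flags for the sketch.
enum AccessResult { kNotHandled, kNoPermission, kGranted };
const unsigned long kPermRead  = 1UL;
const unsigned long kPermWrite = 2UL;

struct AccessDecision {
    AccessResult  result;
    unsigned long permissions;
};

using AccessHandler = std::function<AccessDecision(const std::string& uri)>;

// Asks each registered handler in turn. The first handler that accepts
// responsibility decides; if all decline, read-only access is granted.
AccessDecision ResolveAccess(const std::vector<AccessHandler>& handlers,
                             const std::string& uri) {
    for (const auto& handler : handlers) {
        const AccessDecision decision = handler(uri);
        if (decision.result != kNotHandled) {
            return decision;  // all remaining handlers are ignored
        }
    }
    return AccessDecision{kGranted, kPermRead};  // default privilege
}
```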
The authentication handler is activated when the access handler attempts to obtain the user's credentials. If the user has presented valid credentials the authentication handler sets the piUserKeyword parameter to reflect the user's name.
The authentication handler registers itself with the server during the Register command, providing its name (eg: "mySQL user database"). A server implementation may allow a single plug-in to register more than one method.
The WSAPI_Authenticate command is triggered when an access control plug-in attempts to look up the piUserKeyword parameter. The WSAPI_Authenticate command is only sent to the plug-in indicated by the access handler.
An authentication handler checks the credentials using the available HTTP header, client IP address, or any other information available through the PB. Because the realm name is required for most challenges, it is provided as part of the PB for this command.
If the credentials are present and valid, then the auth handler sets the piUserKeyword parameter to reflect the name of the authenticated user.
If no valid credentials are present then the auth handler issues a challenge by setting the applicable header fields through the indexed piResponseHeaderField parameter.
For example, if an auth plug-in wants to issue a challenge using the Basic HTTP authentication method, it might generate the challenge with the following:
    void IssueChallenge(WSAPI_CommandPBPtr pb)
    {
        char challenge[256] ;
        long len ;
        WSAPI_DescPtr desc ;

        // WWW-Authenticate: Basic realm="name"
        len = sprintf(challenge,
                      "Basic realm=\"%.200s\"",
                      pb->param.auth.realm) ;
        if(!WSAPI_NewDescriptor(pb, &desc, typeText,
                                challenge, len)) {
            WSAPI_SetIndexedParameter(pb, piResponseHeaderField, "WWW-Authenticate",
                                      desc) ;
            WSAPI_DisposeDescriptor(pb, &desc) ;
        }
    }
The access handler is called after the FileInfo handlers have been called, but before the preprocessors. There are three possible return values:
WSAPI_E_NotHandled - The handler is declining responsibility for this request.
WSAPI_I_NoPermission - The handler has determined that the request falls within its protected space, but it does not accept the credentials presented by the client (or the client did not provide any). The handler sets pb->param.accessControl.result to reflect the desired response code. The server will then handle the request as an error.
WSAPI_I_NoErr - The handler has determined the request falls within its protected space. The value returned in authPermissions is the flag indicating the level of access that should be granted.
If the return value is WSAPI_E_NoPermission or WSAPI_I_NoErr, the server will not ask any more plug-ins for access determinations. When WSAPI_E_NoPermission is returned, a value should be placed into the responseCode. If no change is made to the response code, the server will generate a 401 response using its normal error handling procedures.
To obtain the client's credentials, the handler must do three things:
1) Set piRealm to the name of the protected space. Most authentication methods require this (e.g.: the Basic method described in the HTTP specifications) and it is also used by most client applications as part of the prompt for credentials from the user. Therefore, it must be present before the authentication can take place.
2) Set piAuthMethod to the name of the desired authentication method. This will decide which authentication handler is called to obtain the credentials. A list of the available methods can be obtained with the piAuthMethodList keyword. This specification does not discuss how the administrator configures an access plug-in to associate a protection realm with an authentication method.
3) Retrieve the piUserKeyword. If the user has not yet been authenticated the server will send the WSAPI_Authenticate command to the appropriate plug-in. If that plug-in is successful in authenticating the user, then a user name will be returned. If no valid credentials are present, then the piUserKeyword parameter will be empty.
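The three steps can be modelled with a small class in which reading the user name triggers authentication on demand. The class, its method names, and the handler signature are hypothetical; in the real API these are the piRealm, piAuthMethod, and piUserKeyword parameters and the WSAPI_Authenticate command.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// An authentication handler returns the user name, or an empty string
// when no valid credentials were presented (signature assumed).
using AuthHandler = std::function<std::string(const std::string& realm)>;

class ConnectionAuth {
public:
    void RegisterMethod(const std::string& name, AuthHandler handler) {
        methods_[name] = std::move(handler);
    }
    void SetRealm(const std::string& realm) { realm_ = realm; }          // step 1
    void SetAuthMethod(const std::string& method) { method_ = method; }  // step 2

    // Step 3: if the user is not yet authenticated, call the handler
    // selected by the access plug-in. Stays empty when it fails.
    const std::string& UserKeyword() {
        if (user_.empty() && !realm_.empty()) {
            auto it = methods_.find(method_);
            if (it != methods_.end()) {
                user_ = it->second(realm_);
            }
        }
        return user_;
    }

private:
    std::map<std::string, AuthHandler> methods_;
    std::string realm_, method_, user_;
};
```

Once a user name has been obtained, later lookups return it without calling the authentication handler again.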
It is important to compute access for pb->param.accessControl.uri, NOT for the URI in the HTTP header or the value of the piScriptName parameter. This is to support requests for access to URIs other than those requested by the client (e.g.: a file to be included into the output).
To check a user's access to such a URI, a plug-in asks for the piPermissions keyword with an index of the requested URI:
WSAPI_GetIndexedParameter(pb, piPermissions, "/my/uri/some.html", &desc) ;
If the result is WSAPI_E_ParameterNotFound then the server doesn't provide this functionality. WSAPI_I_NoPermission indicates the client should not be given any kind of access to the requested URI. WSAPI_I_NoErr indicates that the descriptor contains an unsigned long with flags detailing the user's access permissions for the requested URI.
The addition of the piResponseHeaderField parameter implies a significant change in desired server behavior. The expectation is that the server will perform at least some minimal response header management, enough to add the specified lines to the outbound header. In practice, this isn't such a great burden, as most implementations already insert additional data into response headers generated by plug-ins as a work-around for a bug found in many copies of Netscape still in use on the Internet.
This specification does not attempt to define:
- how protection spaces are defined, parsed, or configured
- the semantic relationship between realm names and protection spaces
- the type or operation of authentication methods
- how various security plug-ins may share settings or other configuration information
- how the administrator associates/configures specific protection spaces to authentication methods
Preemptive Support
As MacOS X nears, it is expected that some plug-in developers will merely carbonize their plug-ins while others will want to take full advantage of the new operating system. MacOS X supports not only differing thread models (preemptive and cooperative), but also multiple executable formats (CFM and Mach-O). It is not alone in this regard, as other platforms have been known to support more than one executable type.
The purpose of this specification is to outline changes in the WSAPI necessary to allow full exploitation of the underlying operating system (regardless of brand), while at the same time allowing for easy portability for existing plug-ins.
If a plug-in sets the capIsPreemptive flag during the WSAPI_Register command, it is claiming to be preemptive. The plug-in must be completely thread-safe, meaning multiple commands can be dispatched to the plug-in and run concurrently.
If a plug-in fails to set the capIsPreemptive flag, then the server should treat the plug-in as only being safe for cooperative threading. A thread running within a cooperative plug-in should only be preempted by another thread executing code in the same plug-in when the running thread uses the server callbacks. Cooperative plug-in authors should assume that any callback can yield, and that network-related callbacks will almost certainly yield.
As plug-ins become more complex it has become common to access server callbacks "out of context", meaning outside the context of any command received from the server. Typically such plug-ins will create their own threads to perform various maintenance tasks that are too elaborate to be done inside the idle thread. To access server callbacks from these threads, developers have often used NULL parameter blocks, or even constructed fake parameter blocks to access callbacks that require a valid PB.
This specification adds callbacks for creating "out of context" PBs for use in such threads. On success, WSAPI_NewCommandPB() returns WSAPI_I_NoErr and changes the value of the customPB parameter to point to the new PB. If the callback returns an error code then the value of customPB is undefined. WSAPI_DisposeCommandPB() is used to dispose of the PB when it is no longer needed.
Custom PBs are not thread-safe. This means plug-ins that create custom PBs must ensure that any given PB is not used in two or more threads running concurrently. Server implementations should allow all callbacks to be accessed with custom PBs, excepting those that are restricted to specific run roles.
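One way a plug-in might enforce this rule is to wrap the custom PB together with a mutex, so that every use goes through a lock. The wrapper below is a sketch; the WSAPI_CommandPB stand-in and the GuardedPB class are assumptions, not part of the API.

```cpp
#include <mutex>
#include <utility>

// Stand-in for a custom PB obtained from WSAPI_NewCommandPB().
struct WSAPI_CommandPB { long counter; };

// Serialises all access to one PB across the plug-in's own threads.
class GuardedPB {
public:
    explicit GuardedPB(WSAPI_CommandPB* pb) : pb_(pb) {}

    // Runs fn(pb) while holding the lock and returns its result.
    template <typename Fn>
    auto Use(Fn fn) -> decltype(fn(std::declval<WSAPI_CommandPB*>())) {
        std::lock_guard<std::mutex> lock(mutex_);
        return fn(pb_);
    }

private:
    WSAPI_CommandPB* pb_;
    std::mutex mutex_;
};
```

A simpler alternative is to create one custom PB per thread, which avoids locking altogether.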
It has been well-established that heap management has a direct and significant impact on server performance. An HTTP request that does not stray from the critical path of the server code may require only four or five heap operations, but a request that produces dynamic output may require several hundred such operations. In a preemptive environment each request to the global heap requires synchronisation, which can quickly become very expensive. Clearly, high-performance plug-ins need a more efficient memory model. By associating a memory "pool" with each request, implementations can greatly reduce the occurrence of heap synchronisation.
The WSAPI_ReallocateMemory() callback is added because reallocation is typically much more efficient than the alternative. Some heap managers can "grow" a memory location much more efficiently than allocating a new space and moving the data. If NULL is returned, then the request failed and the old pointer is still valid. If a non-NULL value is returned then the request succeeded, and the old pointer is invalid. Note that in the ideal case, the old and new pointers will be the same, meaning the existing space was simply enlarged.
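The return-value contract implies the usual safe-growth pattern, sketched below. ReallocateMemory is a stand-in built on std::realloc because the real callback's signature is not given here; GrowBuffer is a hypothetical helper.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Stand-in for WSAPI_ReallocateMemory(); the real signature is assumed.
static void* ReallocateMemory(void* ptr, std::size_t newSize) {
    return std::realloc(ptr, newSize);
}

// Grows a buffer without losing it on failure.
bool GrowBuffer(char*& buffer, std::size_t newSize) {
    void* grown = ReallocateMemory(buffer, newSize);
    if (grown == nullptr) {
        return false;  // request failed: buffer is unchanged and still valid
    }
    buffer = static_cast<char*>(grown);  // the old pointer must not be used again
    return true;
}
```

Assigning the result straight back to the original pointer would leak the old block whenever the call fails, which is why the result goes through a temporary first.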
The WSAPI_AllocatePoolMemory(), WSAPI_ReallocatePoolMemory(), and WSAPI_FreePoolMemory() callbacks work almost identically to the existing memory calls. The difference is that they require a valid PB (unlike the existing calls), and memory MUST be freed/reallocated with the same PB as was used to allocate it.
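As an illustration of why a per-request pool reduces heap traffic, here is a minimal bump-allocator sketch. It is one possible pool design, not the server's implementation; the class name and chunk size are assumptions, and individual free/reallocate operations are omitted.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// One pool per request: many small allocations share a few heap chunks,
// and everything is released together when the pool is destroyed.
class RequestPool {
public:
    explicit RequestPool(std::size_t chunkSize = 4096)
        : chunkSize_(chunkSize), capacity_(0), used_(0) {}

    void* Allocate(std::size_t bytes) {
        const std::size_t align = alignof(std::max_align_t);
        bytes = (bytes + align - 1) / align * align;  // keep results aligned
        if (chunks_.empty() || used_ + bytes > capacity_) {
            // Only this branch touches the global heap.
            capacity_ = bytes > chunkSize_ ? bytes : chunkSize_;
            chunks_.push_back(std::unique_ptr<char[]>(new char[capacity_]));
            used_ = 0;
        }
        void* result = chunks_.back().get() + used_;
        used_ += bytes;
        return result;
    }

    std::size_t ChunkCount() const { return chunks_.size(); }

private:
    std::size_t chunkSize_, capacity_, used_;
    std::vector<std::unique_ptr<char[]>> chunks_;
};
```

Because a pool belongs to a single request, allocations from it need no synchronisation with other requests, which is also why memory must be returned through the same PB that allocated it.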
WSAPI has supported the concept of plug-in-to-plug-in services for a long time. Allowing plug-ins to talk to each other directly in order to share data and functionality that the server does not directly provide has proven to be a highly useful concept. It is used in the WebSTAR Data Cache, the VDMAPI, and the PIXO standard.
Having mixed threading and binary models introduces new complications to that process. Preemptive plug-ins should not be calling services located in cooperative plug-ins without some kind of additional synchronisation, and Mach-O binaries can not call services inside CFM binaries without some glue code to fix the cross-TOC calling conventions. While it is fairly easy for the server to prevent a plug-in from looking up a service registered by an incompatible plug-in, it is more desirable to allow all these plug-ins to interoperate without restriction. To that end, new service management callbacks are proposed. By passing all service calls through the server, this standard opens the possibility of cross-binary-format/thread-style service calls.
The user-defined WSAPI_ServiceFunc function type takes a valid PB and argument as input, and returns an unsigned long. Services are registered with the WSAPI_RegisterService() callback, which takes as its parameters a valid PB, a null-terminated service name, and a pointer to the registered function. Registering a NULL function un-registers a service by the same name registered by the same plug-in. Valid names are composed of alphanumeric characters, underscore "_", and spaces " ". An attempt to register a service under a name already used by a different plug-in will return the WSAPI_E_DuplicateService error.
The piServicesList parameter returns a TAB-delimited list of registered services.
WSAPI_CallService() causes the server to pass the given argument and parameter block to the service registered under the given name. If the service cannot be located, the server will return the new error code WSAPI_E_ServiceNotFound. If the server can not reconcile the differing thread/format types of the two plug-ins, the error code WSAPI_E_ServiceNotCompatible will be returned.
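The registration and call rules can be modelled with a small registry. The numeric error values, the void* argument types, the plug-in ID parameter, and the invalid-name code are assumptions; thread/format compatibility checks are not modelled.

```cpp
#include <cctype>
#include <map>
#include <string>

// Error names from the text with assumed values; WSAPI_E_InvalidServiceName
// is hypothetical.
enum ServiceStatus {
    WSAPI_I_NoErr = 0,
    WSAPI_E_DuplicateService,
    WSAPI_E_ServiceNotFound,
    WSAPI_E_InvalidServiceName
};

// Takes a PB and an argument, returns an unsigned long (types assumed).
typedef unsigned long (*WSAPI_ServiceFunc)(void* pb, void* arg);

class ServiceRegistry {
public:
    ServiceStatus Register(long pluginID, const std::string& name,
                           WSAPI_ServiceFunc func) {
        if (!ValidName(name)) return WSAPI_E_InvalidServiceName;
        auto it = services_.find(name);
        if (it != services_.end() && it->second.owner != pluginID) {
            return WSAPI_E_DuplicateService;  // name owned by another plug-in
        }
        if (func == nullptr) {                // NULL function un-registers
            if (it != services_.end()) services_.erase(it);
            return WSAPI_I_NoErr;
        }
        services_[name] = Entry{pluginID, func};
        return WSAPI_I_NoErr;
    }

    ServiceStatus Call(const std::string& name, void* pb, void* arg,
                       unsigned long& result) const {
        auto it = services_.find(name);
        if (it == services_.end()) return WSAPI_E_ServiceNotFound;
        result = it->second.func(pb, arg);
        return WSAPI_I_NoErr;
    }

    // TAB-delimited list of registered service names.
    std::string List() const {
        std::string out;
        for (const auto& entry : services_) {
            if (!out.empty()) out += '\t';
            out += entry.first;
        }
        return out;
    }

private:
    struct Entry { long owner; WSAPI_ServiceFunc func; };

    // Alphanumeric characters, underscore, and space only.
    static bool ValidName(const std::string& name) {
        if (name.empty()) return false;
        for (unsigned char c : name) {
            if (!std::isalnum(c) && c != '_' && c != ' ') return false;
        }
        return true;
    }

    std::map<std::string, Entry> services_;
};
```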
Backward compatibility: the use of the piServiceAddress parameter in new plug-ins is deprecated. Server implementations may continue to support it, but if they do so they should not allow a plug-in of one thread/format type to look up a service registered by a plug-in of a different thread/format type.
Server implementations should issue the WSAPI_Register and WSAPI_Init commands before becoming multi-threaded. Plug-in developers should assume that any commands received after WSAPI_Init take place in a multi-threaded environment.
The requirement for support of cooperative plug-ins is very deliberately defined in terms of how concurrent threads are scheduled within a given plug-in, as opposed to specifying the nature of the thread itself. This allows implementors to support cooperatively written plug-ins even on systems that lack the concept of non-preemptive threading. The WebSTAR/5 implementation creates only preemptive threads, but uses semaphores to prevent multiple threads from executing inside a cooperative plug-in. Developers can assume that their plug-in code won't be interrupted by another thread executing within the same plug-in, but should not assume they won't be preempted by other applications or threads operating within other plug-ins.
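The semaphore approach can be sketched with a per-plug-in gate that admits one thread at a time and is released while a server callback runs. The class names are hypothetical, and the binary semaphore is built from a mutex and condition variable purely for illustration.

```cpp
#include <condition_variable>
#include <mutex>

// Per-plug-in gate: a binary semaphore for cooperative plug-ins,
// and a no-op for plug-ins that set capIsPreemptive.
class PluginGate {
public:
    explicit PluginGate(bool isPreemptive)
        : cooperative_(!isPreemptive), busy_(false) {}

    void Enter() {                        // blocks until the plug-in is free
        if (!cooperative_) return;
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !busy_; });
        busy_ = true;
    }
    bool TryEnter() {                     // non-blocking variant
        if (!cooperative_) return true;
        std::lock_guard<std::mutex> lock(mutex_);
        if (busy_) return false;
        busy_ = true;
        return true;
    }
    void Leave() {
        if (!cooperative_) return;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            busy_ = false;
        }
        cv_.notify_one();
    }

private:
    const bool cooperative_;
    bool busy_;
    std::mutex mutex_;
    std::condition_variable cv_;
};

// Releases the gate for the duration of a server callback, which is the
// only point where another thread may run inside a cooperative plug-in.
class CallbackYield {
public:
    explicit CallbackYield(PluginGate& gate) : gate_(gate) { gate_.Leave(); }
    ~CallbackYield() { gate_.Enter(); }
    CallbackYield(const CallbackYield&) = delete;
    CallbackYield& operator=(const CallbackYield&) = delete;
private:
    PluginGate& gate_;
};
```

The server would call Enter() before dispatching a command to a plug-in and Leave() when the command returns, wrapping each callback body in a CallbackYield.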
Implementations that do not wish to support memory pools can simply map the pooled memory callbacks to their normal heap management functions. WebSTAR will initially do this until an appropriate pool manager can be incorporated into the server.