hosted services
FastCGI is a neat way of keeping an application resident in memory to service web requests without restarting the whole application for every hit. Given that Perl applications pay a lengthy start-up cost (compiling the script and loading its libraries), it is sensible to keep the application in memory with everything loaded and simply iterate the main() loop.
flush
One of the things I noticed immediately with mod_fcgid was that it didn't appear to have a way of flushing the output from the user application. The workaround is to add the following to the <VirtualHost> directive:
FcgidOutputBufferSize 0
(I'm not sure whether this incurs any performance penalty; I didn't bother to test in any scientific way.) Once the config is reloaded, run something like the following and watch the results:
    while( $request->Accept() >= 0 ) {
        print( "Content-type: text/html\r\n\r\n", ++$count );
        for( my $i = 0; $i < 3; $i++ ) {
            print "Hi!<br />\n";
            $request->Flush();
            sleep( 1 );
        }
    }
You should then see the lines appear one second apart.
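For reference, that directive sits inside the vhost alongside the usual mod_fcgid wiring. A minimal sketch, where the paths, server name, and handler extension are assumptions about your setup:

```apache
<VirtualHost *:80>
    ServerName example.com
    DocumentRoot /var/www/app

    # Hand .pl scripts to mod_fcgid
    AddHandler fcgid-script .pl
    <Directory /var/www/app>
        Options +ExecCGI
        Require all granted
    </Directory>

    # Disable output buffering so Flush() reaches the client immediately
    FcgidOutputBufferSize 0
</VirtualHost>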
Of course, there is more than one way to skin a cat.
cgi parameters with fcgid
Often, if you're using CGI.pm (as I do), you may find that parameters fetched with param() don't appear correctly after the first page access. You need to refresh them each time the application's while loop runs.
You may find it helpful to write your while loop something like this:
    #!/usr/bin/perl
    use strict;
    use warnings;

    use FCGI;
    use CGI;

    sub main {
        my $request = FCGI::Request();
        while( $request->Accept() >= 0 ) {
            my $e = $request->GetEnvironment();
            # Clear CGI.pm's cached query parameters from the previous
            # request before constructing a fresh CGI object.
            undef @CGI::QUERY_PARAM;
            my $q = CGI->new();
            # ...
        }
    }

    main();
There appear to be two main ways to interact with FCGI from a Perl script: CGI::Fast and FCGI. I have found CGI::Fast rather limited, whilst FCGI offers the majority of what CGI::Fast does by default; that is, everything except pre-loading with CGI.pm functionality. Do yourself a favour and use FCGI rather than CGI::Fast.
memory
The trouble with mod_fcgid.so is that request bodies larger than FcgidMaxRequestInMem are written to disk and then passed to the application, while smaller ones are read into memory. Neither option is ideal: memory can be saturated, and in the disk case, I/O is wasted simply shovelling data around.
If you want to run an application such as a file uploader, you can use mod_proxy_fcgi instead and send the traffic directly to the application, without staging it on disk or in memory. You'll get a faster response for your users, and other applications on the system won't risk memory or disk starvation.
ProxyPass "/myapplication/" "fcgi://localhost:9000/"
Beware though: if you use CGI;, then the moment you call new you'll be reading the whole request into memory again. Instead, if you want to read the POST data yourself, you can do something similar to this:
    my %request_params;    # populated with the environment on each Accept
    my $socket  = FCGI::OpenSocket( ":9000", 3 );
    my $request = FCGI::Request(
        \*STDIN, \*STDOUT, \*STDERR, \%request_params, $socket );

    while( $request->Accept() >= 0 ) {
        print( "Content-type: text/plain\r\n\r\n" );
        my ( $in, $out, $err ) = $request->GetHandles();

        # Stream the request body straight to disk in 16KB chunks.
        open( my $f, ">", "/tmp/destination_post_file" )
            or die "open: $!";
        binmode( $in );
        binmode( $f );
        my $buffer;
        while( my $br = sysread( $in, $buffer, 16384 ) ) {
            syswrite( $f, $buffer, $br );
        }
        close( $f );
        # no close($in): FCGI manages the request streams itself
    }
    FCGI::CloseSocket( $socket );
If you're running within Docker, you can run one application per container with a large swarm for redundancy, or use a watchdog program in your entrypoint to ensure the application is restarted should it die. Unlike mod_fcgid, you have to manage the processes yourself; however, mod_proxy_balancer can help you here if you make a job pool (Docker swarm, for instance).
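The entrypoint watchdog mentioned above can be a few lines of shell. This is a hypothetical sketch: run_with_restarts and MAX_RESTARTS are names invented here, and a real entrypoint might loop forever rather than give up.

```shell
#!/bin/sh
# Watchdog sketch for a container entrypoint: rerun the given command
# until it exits cleanly, giving up after MAX_RESTARTS failures.
run_with_restarts() {
    restarts=0
    while :; do
        "$@" && return 0                      # clean exit: stop supervising
        restarts=$((restarts + 1))
        echo "watchdog: restart #$restarts" >&2
        [ "$restarts" -ge "${MAX_RESTARTS:-5}" ] && return 1
        sleep 1
    done
}

# As a Docker ENTRYPOINT this would be invoked along the lines of:
# run_with_restarts /usr/bin/python /usr/local/bin/microservice.py
```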
python
Things are a little different with Python. Rather than speaking FCGI directly, applications usually implement WSGI, and flup's WSGIServer speaks FastCGI on their behalf. To start, you'll need this boilerplate code:
    from flup.server.fcgi import WSGIServer

    def app(environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/html')])
        yield '<html><head><title></title></head>\n' \
              '<body>\n' \
              '<p>hello world</p>\n'

    WSGIServer( app, bindAddress=( '0', 9001 ) ).run()
An interesting difference between Python's implementation of FCGI (and CGI in general) and Perl's is that the input data is not written to disk. What's also interesting about the yield approach is that the connection is set to close and chunked encoding is often not used (though I've had mixed results).
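If you do want the request body streamed to disk, as in the Perl uploader above, a WSGI handler can read environ['wsgi.input'] in fixed-size chunks rather than slurping it all at once. A minimal sketch; the 16KB chunk size and the use of a temp file are arbitrary choices for illustration:

```python
import os
import tempfile

def upload_app(environ, start_response):
    # Stream the request body to a temp file in 16KB chunks instead of
    # buffering the whole upload in memory.
    length = int(environ.get('CONTENT_LENGTH') or 0)
    stream = environ['wsgi.input']
    written = 0
    fd, path = tempfile.mkstemp(prefix='upload-')
    with os.fdopen(fd, 'wb') as out:
        while written < length:
            chunk = stream.read(min(16384, length - written))
            if not chunk:       # client closed early
                break
            out.write(chunk)
            written += len(chunk)
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [('saved %d bytes\n' % written).encode()]
```

This drops straight into the WSGIServer boilerplate above in place of app.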
spawn-fcgi
A great way to keep an FCGI application running (or microservice, if you want to call it that) is to combine spawn-fcgi and multiwatch. This combination allows you to start and expose a service daemon that listens on the FCGI protocol and keeps a number of worker processes running.
/usr/bin/spawn-fcgi -n -p 9000 -u nobody -- \
/usr/bin/multiwatch -f 5 -- \
/usr/bin/python /usr/local/bin/microservice.py
I find this a good balance. We could use Docker swarm to keep a service running with a number of workers, but that could be more overhead than we need; multiwatch can do that work for us without requiring additional containers. Docker swarm is still useful with this setup, as it performs part of a cluster-management role that is outside what multiwatch can do for us.