fatal startup errors

nk4um Moderator
Posts: 755
January 8, 2010 18:40
Jeff - you are brilliant! That is so clean. I've wanted to spend time working out a solution to this for some time and was worried that it would take a rewrite of the HTTPTransport to be able to declare it dynamically, outside of Jetty's declarative config. This really neatly solves the problem. Thanks! We'll definitely update the fulcrums to use this!
nk4um User
Posts: 101
January 8, 2010 18:22
control port
While I'm on the control port, it would also be useful to be able to specify the port used by the backend fulcrum via a system property, so that it could be changed from the command line. Fortunately this is very simple, just a one-line change to the Jetty config (etc/HTTPServerConfig.xml), from
<Set name="port">1060</Set>
to
<Set name="port">
  <SystemProperty name="netkernel.backend.port" default="1060" />
</Set>


Now I can add "-Dnetkernel.backend.port=1061" to my command line, which makes it much easier to run a separate dev instance on the same server.
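
For anyone unfamiliar with the pattern, here is a minimal Java sketch of what "system property with a default" means. It is not Jetty's actual implementation, just the equivalent lookup in plain Java; the class name BackendPort is made up for illustration, while the property name and the default of 1060 come from the config above.

```java
// Illustrative sketch only: shows how a -D system property with a
// fallback default resolves. BackendPort is a hypothetical class name.
final class BackendPort {
    // Property name and default taken from the config change above.
    static final String PROPERTY = "netkernel.backend.port";
    static final int DEFAULT_PORT = 1060;

    // Returns the port from -Dnetkernel.backend.port=..., or the default
    // when the property is unset or is not a valid integer.
    static int resolve() {
        return Integer.getInteger(PROPERTY, DEFAULT_PORT);
    }

    public static void main(String[] args) {
        System.out.println("backend port = " + resolve());
    }
}
```

Running this with "-Dnetkernel.backend.port=1061" on the java command line should report 1061, and without the flag it falls back to 1060.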

Obviously a similar thing could be done for the frontend fulcrum as well.
nk4um User
Posts: 101
January 8, 2010 17:51
I agree that "fatal" errors and shutting down should be avoided if at all possible. If this functionality existed, it should be reserved for really extreme cases (and the definition of extreme should be controllable by the user). I mentioned the control port only because I actually ran into the problem and ended up with 6 instances of NK running, only 1 of which was connected to anything, and without that connection I couldn't shut the others down. In my case the solution is a pre-start verification (i.e., before starting, check that there is not already a server running or listening on the control port), but that is not foolproof: it's a race condition, since another instance could grab the port between the check and the actual start.
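
A pre-start check along these lines could be sketched in Java as follows. This is only an illustration, not what I actually run: the class name PreStartCheck and the way the port is passed in are assumptions, and 1060 is simply the default backend port. It has the same race described above, because the probe socket is closed again before the real server binds.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Illustrative sketch only: PreStartCheck is a hypothetical helper,
// not part of NetKernel.
final class PreStartCheck {
    // Returns true if a listening socket could be bound to the port at the
    // moment of the call. The probe is closed straight away, so another
    // process can still take the port afterwards (the race condition).
    static boolean isPortFree(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Port to probe: first argument if given, otherwise 1060.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 1060;
        if (!isPortFree(port)) {
            System.err.println("Port " + port + " is already in use; not starting.");
            System.exit(1);
        }
        System.out.println("Port " + port + " looks free; continuing startup.");
    }
}
```

A launch script could run this first and only start the real instance when it exits with status 0.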

Your BootAssurance suggestion is significantly broader. I don't think I would use it, but I could see benefits if used in an automated update system, for example.
nk4um Moderator
Posts: 485
January 8, 2010 14:13
Boot Assurance feature?
Thinking about this a bit, how about an optional BootAssurance feature for enterprise edition which could be configured from the control panel? This would check the deployment state after startup and, if any errors and/or warnings were raised, a configured action such as shutdown or transition to a lower runlevel would be performed. (Switching to a lower runlevel could restore a stem system to a known clean state, ready to receive new updates.)
nk4um Moderator
Posts: 485
January 8, 2010 12:05
typo
Sorry, my last sentence wasn't sarcasm but a typo.

I should have said "If that shut down because of a config error it would not be very good!"
nk4um Moderator
Posts: 485
January 8, 2010 11:40
Interesting
It's interesting you mention this. With NetKernel 4 we have deliberately tried to move away from that approach. The idea is that the NetKernel instance is always on (24x7) and you just add, remove and reconfigure whenever. I understand that if the management interfaces are removed or broken that might not always be possible, but with proper safeguards these situations can be avoided. I like to think of how the Mars rovers work: they must always preserve core running features such as comms and upgradability/reconfiguration even if other stuff fails. If that shut down because of a config error it would be very good!

Tony
nk4um User
Posts: 101
January 7, 2010 19:00
fatal startup errors
It would be useful if there was a way to specify that particular errors are fatal and should completely shut down the kernel.

The best candidate for this would be the backend fulcrum: if it is unable to start up correctly (typically because the control port is in use), then the kernel shouldn't start up at all, because running without the control port makes it somewhat difficult to control.