Re: autosave/restore software fails




John Maclean wrote:
> 
> ... I suspect one of the main differences
> between the accelerator and the beamlines is that, there, the servers
> are on UPS power. If this problem causes so much downtime and
> particularly if it has the possibility of damaging instrumentation, then
> it might be prudent to move the servers on to UPS power. I believe ASD
> will shortly have a number of UPS units becoming spare, let me know if
> you would like one for your server (and possibly IOCs) and I will see if
> I can help you obtain one. This is probably the fastest and cheapest
> solution.

Typically a UPS can maintain power for no longer than half an hour.
For longer interruptions (like the last one) it cannot help.

I did not check the autosave code, but according to what people were
reporting yesterday, it looks like the software writes the .sav file
directly. If a power failure interrupts the write, the file is left
corrupted. If this is the case, it would make sense to introduce a
small modification: write the data to a scratch file and, only when
writing has finished successfully, close it and rename it to the .sav
file. That should reduce the chance of corruption to a minimum.
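
The scratch-file idea could look roughly like the sketch below. It is
written in Python for brevity and is not taken from the autosave code;
the function name write_sav_atomically and the scratch-file prefix are
hypothetical. It assumes the scratch file and the .sav file are on the
same filesystem, where a rename replaces the target in one step on
POSIX systems.

```python
import os
import tempfile


def write_sav_atomically(sav_path, lines):
    """Write lines to a scratch file, then rename it over sav_path.

    Hypothetical helper for illustration; not part of autosave.
    """
    directory = os.path.dirname(os.path.abspath(sav_path))
    # Keep the scratch file in the same directory as the target so the
    # final rename does not cross a filesystem boundary.
    fd, scratch_path = tempfile.mkstemp(dir=directory, prefix=".sav_scratch_")
    try:
        with os.fdopen(fd, "w") as scratch:
            for line in lines:
                scratch.write(line + "\n")
            scratch.flush()
            # Ask the OS to push the data to disk before the rename.
            os.fsync(scratch.fileno())
        # Replace the old .sav only after the new data is fully written.
        os.replace(scratch_path, sav_path)
    except BaseException:
        # Writing failed: remove the scratch file and leave the
        # previous .sav file untouched.
        if os.path.exists(scratch_path):
            os.unlink(scratch_path)
        raise
```

With this approach, a failure in the middle of writing leaves at worst
a stray scratch file; the previous .sav file keeps its old, complete
contents. Note that tempfile.mkstemp creates the file readable only by
its owner, so the permissions may need adjusting if other accounts
must read the .sav file.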

-- Sergey Stepanov