
Re: autosave/restore software fails

I'm going to go out on a limb here, without adequate
review of our archives.  Our discussion
(very fruitful and productive, thank you) leads me
to suspect that third-party NFS servers may be one of
the culprits for the underperformance of autosave/restore.
Running the autosave software as an OPI client
rather than in the IOC _may_ alleviate some of the observed
troubles, or it may not.

HOWEVER, the restore algorithm needs to be revisited.
1. Adequate diagnostics need to be provided upon success
   or failure of restore.  A BI PV would do this.
   PVs that fail to restore (or appear corrupt when read)
   should be reported to the VxWorks console.
2. Restore needs to fail safe!
   That means, when restore fails and is unable to
   make repairs, a really big notice needs to come up
   at the user level (not just some message on a VxWorks console).
3. If restore fails, restore could attempt reading the
   next most recent backup.  (Yes, I realize this is sort of how
   it already works, but a system of *.sav files such as suggested
   by Mark Rivers is a more obvious and uniform way to do this.
   Indeed, this numbered system of file versions is how backups
   of Linux log files are done on a routine basis.)
   On an autosave:
   file.sav.5 is deleted
   file.sav.4  --> file.sav.5
   file.sav.3  --> file.sav.4
   file.sav.2  --> file.sav.3
   file.sav.1  --> file.sav.2
   file.sav    --> file.sav.1
   file.sav is written, then 100% verified  (checksum anyone?)
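
To make the rotation and fallback above concrete, here is a minimal
sketch in Python.  It is not the autosave code and does not use any
EPICS or VxWorks API.  The function names (rotate, save, load_verified,
restore), the depth of 5, and the "#sha256" trailer line are all
assumptions made up for illustration; a real .sav file format would
need its own way of carrying the checksum.

```python
import hashlib
import os

MAX_BACKUPS = 5  # assumed depth, matching file.sav.1 .. file.sav.5 above


def rotate(path, depth=MAX_BACKUPS):
    """Shift path -> path.1 -> ... -> path.depth, dropping the oldest."""
    oldest = f"{path}.{depth}"
    if os.path.exists(oldest):
        os.remove(oldest)
    # Walk from the oldest surviving copy down so nothing is overwritten.
    for n in range(depth - 1, 0, -1):
        src = f"{path}.{n}"
        if os.path.exists(src):
            os.replace(src, f"{path}.{n + 1}")
    if os.path.exists(path):
        os.replace(path, f"{path}.1")


def load_verified(path):
    """Return the payload if the trailer checksum matches, else None."""
    try:
        with open(path, "rb") as f:
            raw = f.read()
    except OSError:
        return None
    body, sep, trailer = raw.rpartition(b"\n#sha256 ")
    if not sep:
        return None  # no trailer: treat as truncated or corrupt
    if hashlib.sha256(body).hexdigest().encode("ascii") != trailer.strip():
        return None
    return body


def save(path, data, depth=MAX_BACKUPS):
    """Rotate, write data plus a checksum trailer, then read back and verify."""
    rotate(path, depth)
    digest = hashlib.sha256(data).hexdigest()
    with open(path, "wb") as f:
        f.write(data)
        f.write(b"\n#sha256 " + digest.encode("ascii") + b"\n")
        f.flush()
        os.fsync(f.fileno())
    if load_verified(path) != data:
        raise IOError(f"verification failed after writing {path}")


def restore(path, depth=MAX_BACKUPS):
    """Try path, then path.1 .. path.depth, newest first.

    Returns (payload, source_file), or (None, None) if every copy fails.
    """
    candidates = [path] + [f"{path}.{n}" for n in range(1, depth + 1)]
    for candidate in candidates:
        data = load_verified(candidate)
        if data is not None:
            return data, candidate
    return None, None
```

A caller would treat a (None, None) result from restore() as the
"really big notice" case in item 2, and a source_file other than
file.sav as something worth reporting under item 1.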


Dr. Pete R. Jemian <jemian@uiuc.edu>            | UNICAT, Bldg 438D
Scientist                                       | Advanced Photon Source
Frederick Seitz Materials Research Laboratory   | Argonne National Laboratory
University of Illinois at Urbana-Champaign      | Argonne, IL  60439
Urbana, IL  61801                               | 630 - 252 - 0863
Education is the one thing for which people are willing to pay yet not