Re: BESSRC experience with Spec/Epics
- Subject: Re: BESSRC experience with Spec/Epics
- From: Guy Jennings <jennings@anl.gov>
- Date: Tue, 19 Nov 2002 18:53:15 -0600
On 11/19/02 4:53 PM, 'Tim Mooney' <mooney@aps.anl.gov> wrote:
>
> I haven't heard from users directly reporting long CA timeouts with spec,
> though I have heard of problems thought to be CA related--I think BESSRC
> may have seen some of this.
Yes, we have, so much so that we have largely given up using EPICS with spec.
>
> I agree that applications should not be expected to fix problems with network
> hardware, although they should make as much use as they can of error returns
> and connection-management messages from CA.
I'm reasonably convinced that our problems are not caused by network
hardware.
>
> I talked with Gerry yesterday, and have the impression that spec's doing the
> right things (I'm not a CA guru): when it does a ca_put() or ca_get() (the
> non-callback version) it calls ca_pend_io() with a user-specified timeout.
> It's ok for this timeout to be quite long, because ca_pend_io() will return
> as soon as it receives server replies to all the outstanding non-callback
> requests. Also, spec calls ca_pend_event() frequently (with a very short
> time value, because ca_pend_event() will never return before the specified
> time has elapsed), so CA should be getting enough processor time to do its
> business.
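To make the difference between the two waits concrete, here is a toy timing model in Python. It does not call CA at all; the function names and the list of reply arrival times are made up for illustration, and it only encodes the behaviour described in the paragraph above (one wait returns early once every reply is in, the other always sits out its full timeout).

```python
def pend_io_elapsed(timeout, reply_arrivals):
    """Model of a ca_pend_io()-style wait (hypothetical helper, not the CA API).

    reply_arrivals holds the times, in seconds after the call, at which
    each outstanding reply shows up. Returns (seconds_blocked, all_replied).
    """
    if not reply_arrivals:
        # Nothing outstanding, so there is nothing to wait for.
        return 0.0, True
    latest = max(reply_arrivals)
    if latest <= timeout:
        # Returns as soon as the last reply is in, however long the timeout.
        return latest, True
    # At least one reply is late: the caller sees a timeout.
    return timeout, False


def pend_event_elapsed(timeout):
    """Model of a ca_pend_event()-style wait: always blocks the full time."""
    return timeout
```

In this model a 10 s timeout costs only 0.05 s when the slowest reply takes 0.05 s, while the event-style wait costs whatever value it is given, which is why that value is kept short.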
>
> My understanding is that it's possible for CA to simply not send some messages
> if its 'send' buffer runs out of space and new messages continue to be added.
I didn't know this. Is this on the client? It would be consistent with
what we see. I get the impression that the errors are more likely when spec
has a large number of epics motors defined and specifically during the burst
of activity that occurs when leaving spec's 'config' screen.
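If that is what happens, a burst of requests for many motors would lose messages in exactly the way we see. The Python sketch below is only a toy model of that hypothesis, not CA's actual buffering: the class, the capacity of 16 and the "flush every N requests" knob are all assumptions chosen for illustration.

```python
from collections import deque


class BoundedSendBuffer:
    """Toy fixed-capacity send queue (hypothetical, not CA's real buffer)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = deque()
        self.dropped = []

    def queue(self, msg):
        # Assumed failure mode: a full buffer silently discards new messages.
        if len(self.pending) >= self.capacity:
            self.dropped.append(msg)
            return False
        self.pending.append(msg)
        return True

    def flush(self):
        # Hand everything queued so far to the "network" and empty the buffer.
        sent = list(self.pending)
        self.pending.clear()
        return sent


def burst(n_motors, capacity, flush_every=None):
    """Queue one request per motor; return (messages sent, messages dropped)."""
    buf = BoundedSendBuffer(capacity)
    sent = []
    for i in range(n_motors):
        buf.queue(f"motor{i}")
        if flush_every and (i + 1) % flush_every == 0:
            sent.extend(buf.flush())
    sent.extend(buf.flush())
    return len(sent), len(buf.dropped)
```

With 40 motors and room for 16 messages, `burst(40, 16)` sends 16 and drops 24, while `burst(40, 16, flush_every=8)` sends all 40. In this model the losses depend on the number of motors and on how often the buffer is flushed, not on network hardware.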
> It's also possible for CA to get insufficient CPU time to handle all the
> messages it's intended to handle. This could mean that a request doesn't
> get sent, that a sent request doesn't get received, that an acknowledge
> doesn't get sent, or that a sent acknowledge doesn't get received.
How much CPU time does it want? I've seen the problem on a dual 1.5GHz
Athlon system, which should be adequate for talking to a 25MHz 68040!
> As you note, there doesn't seem to be a way for the client always to know
> what has occurred. What could a client do in this case other than complain
> to the user or retry the operation (if the operation /can/ be retried)?
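The "retry, then complain" option can be sketched generically. The Python below is not spec's or CA's code: `CaTimeout`, `retry` and the attempt count are hypothetical names, and it is only safe for operations that can be repeated without side effects (reading a value, or writing an absolute position rather than a relative move).

```python
import time


class CaTimeout(Exception):
    """Stand-in for a timed-out request (hypothetical, not a CA status code)."""


def retry(operation, attempts=3, delay=0.0):
    """Call operation() until it succeeds or the attempts run out.

    Returns (result, attempts_used). If every attempt times out, the last
    CaTimeout is re-raised so the caller can complain to the user.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation(), attempt
        except CaTimeout as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)  # brief pause before trying again
    raise last_error
```

A caller would wrap only the repeatable operation, for example `retry(read_motor_position)` where `read_motor_position` is whatever function performs the get. An operation that cannot be retried safely still has to be reported to the user after the first failure.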
Guy Jennings
BESSRC CAT