General Principles

  • DO NOT use control-c to stop a script; this can leave the instrument in an unsafe state. Use kpfconfig.SCRIPTSTOP to request a stop (also available as a button in the KPF OB GUI).

...

Setting SCRIPTSTOP to “Yes” will request that the running script terminate in an orderly fashion. This may take several minutes depending on when the running script reaches a sensible breakpoint. It is important to use SCRIPTSTOP to halt a script (instead of hitting ctrl-c) because cleanup actions will be performed after a SCRIPTSTOP (e.g. turning off lamps to preserve their lifetime).
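
As a minimal sketch, the stop request could be issued from the command line, assuming the modify -s <service> <keyword>=<value> form used elsewhere on this page also applies to kpfconfig. The request_script_stop function and the KTL_MODIFY variable are hypothetical names introduced here so the command can be previewed without touching the instrument.

```shell
# Hypothetical wrapper: asks the running script to stop at its next breakpoint.
# KTL_MODIFY defaults to the real `modify` command (assumed to accept this
# form for kpfconfig); set it to `echo` to preview the arguments instead.
request_script_stop() {
    "${KTL_MODIFY:-modify}" -s kpfconfig SCRIPTSTOP=Yes
}

# Dry run: prints the arguments rather than writing the keyword.
KTL_MODIFY=echo request_script_stop
```

On the instrument host the function would be called without KTL_MODIFY set. The script may still take several minutes to reach a breakpoint after the request.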

...

The agitator can be reenabled by simply setting the keyword to “Yes”. This should only be done by WMKO staff and should only be done if the agitator is fully functional. A broken or misbehaving agitator mechanism presents a significant danger to the science fibers.

...

The cal source has likely been disabled for a reason. Reenabling it should only be done by WMKO staff with knowledge of why it was disabled in the first place.

SlewCal or Simultaneous Calibration Source is Wrong

...

These two sources are set from the same place and are always kept to the same value using the kpfconfig.SIMULCALSOURCE keyword.

Solution

This value should only be changed by WMKO staff. The choice of calibration source is influenced by our desire to maintain the lifetime of the various lamps and calibration sources and by the need for warmup times on certain sources. This is not something the user should adjust; contact a WMKO Staff Astronomer if you feel it is set incorrectly.

...

The FIU failed to reach destination mode (kpffiu.MODE) after repeated attempts.

An error code which you might see is:

Code Block
decode_write_response_event(): MODE (on behalf of MODE): ERR_WRITE_SW_ERROR (-5401) There was an error in the device-specific write routine for this keyword:  check the log files

Problem

One possible reason for this is that the fold mirror is in a limit switch. To establish if this is the case:

Code Block
gshow -s kpffiu FOLDLIM
     FOLDLIM = Positive

The normal state is FOLDLIM=“None”.
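
The limit check above can be scripted. The following is a minimal sketch, assuming gshow prints a single "KEYWORD = value" line as in the sample output. The names fold_limit_state and check_fold_limit are hypothetical helpers, not existing KPF tools, and the demonstration line feeds in the sample output rather than calling gshow.

```shell
# Hypothetical helper: reads `gshow -s kpffiu FOLDLIM` output on stdin and
# prints the value after "=" with surrounding whitespace trimmed.
fold_limit_state() {
    awk -F'=' '/FOLDLIM/ { gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2; exit }'
}

# Hypothetical helper: succeeds only when FOLDLIM reads "None" (the normal state).
check_fold_limit() {
    local state
    state=$(fold_limit_state)
    case "$state" in
        None) echo "OK: FOLDLIM=None" ;;
        "")   echo "UNKNOWN: no FOLDLIM line found"; return 2 ;;
        *)    echo "LIMIT: FOLDLIM=$state"; return 1 ;;
    esac
}

# Demonstration with the sample output from this page (no KTL access needed).
printf '     FOLDLIM = Positive\n' | check_fold_limit || true
```

On the instrument host this would be run as gshow -s kpffiu FOLDLIM | check_fold_limit.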

...

The detector has likely been disabled for a reason. Reenabling it should only be done by WMKO staff with knowledge of why it was disabled in the first place.

Detector Fails to Trigger Without Error

...

Green or Red CCDPOWER is “Not Configured”

Symptom

The CCDPOWER keyword in the kpfgreen or kpfred service is “Not Configured”. This may manifest as the system not being able to take exposures.

Problem

The core problem is not yet clear, but this appears to be something in the Archon controller itself. Investigation is ongoing as of this writing. “It is as if the Archon glitched and lost its ACF.”

Solution

Restart the relevant kpfgreen or kpfred service (we will use kpfgreen in this example):

kpf restart kpfgreen

After the restart, the heater configuration must immediately be reset to the nominal values to restore temperature control; this will also turn on CCDPOWER:

kpfmon-heater-copy green copytolive

...

Note: The above solution was made obsolete when the Archons were removed from the heater control loops.

Green or Red CCDPOWER Fails to Turn On

Symptom

When trying to power up the detector by setting the CCDPOWER keyword to “On”, the keyword does not successfully transition to on. The stderr file for the relevant camerad process (e.g. `gcamerad.stderr`) contains statements like:
(Archon::Interface::archon_cmd) ERROR: Archon controller returned error processing command: POWERON

Problem

There is something wrong in the ACF file for the Archon.

Solution

Resend the ACF file by either modifying the appropriate ACF keyword or using the kpfSetReadModeNormal or kpfSetReadModeFast scripts.

Ca HK Detector Stuck

Symptom

The HK detector is stuck in the exposing state.

or

The kpfexpose EXPOSE status is stuck and kpfexpose.EXPLAINNR is "hk:ACQPHASE=wait".

Problem

Some combination of hardware and software is stuck.

Solution

  1. Abort the existing HK exposure:

modify -s kpf_hk expose=abort

Alternatively, the same action is available via the FWWM menu under KPF Troubleshooting Menu → KPF Trouble Recovery → Reset Ca HK Detector

  2. Then restart the kpf_hk service:

kpf restart kpf_hk

Take test exposures to see if the system has recovered.

...

If the system is still sticking after test exposures:

  1. Power cycle the Andor Camera, and the HK Galil on power strip J, ports 1, 2, and 5, then restart kpf_hk again. This can be accomplished in one step using:

kpfPowerCycleCaHK

Alternatively, the same tool is available via the FWWM menu under KPF Troubleshooting Menu → KPF Trouble Recovery → Power Cycle Ca HK Detector

Exposure Meter in Error State

Symptom

The exposure meter is in an error state: kpf_expmeter.EXPSTATE=Error

Problem

Something in the SBIG CCD control software is unhappy.

Solution 1

First, try resetting the exposure meter detector:

modify -s kpf_expmeter EXPOSE=Reset

Check the status: gshow -s kpf_expmeter EXPSTATE should become “Ready”.
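
The status check can be repeated automatically after issuing the reset. The following is a minimal sketch, assuming gshow prints a single "KEYWORD = value" line as in the FOLDLIM example on this page. The names gshow_value, wait_expmeter_ready, and KTL_GSHOW are hypothetical, and the attempt count and delay are illustrative values rather than recommended settings.

```shell
# Hypothetical helper: prints the value from one "KEYWORD = value" line on stdin.
gshow_value() {
    awk -F'=' 'NF >= 2 { gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2; exit }'
}

# Hypothetical helper: polls EXPSTATE until it reads Ready or attempts run out.
# KTL_GSHOW defaults to the real gshow; arguments are attempts and delay (s).
wait_expmeter_ready() {
    local tries delay state i
    tries=${1:-10}
    delay=${2:-2}
    state=""
    i=1
    while [ "$i" -le "$tries" ]; do
        state=$("${KTL_GSHOW:-gshow}" -s kpf_expmeter EXPSTATE | gshow_value)
        if [ "$state" = "Ready" ]; then
            echo "EXPSTATE=Ready after $i check(s)"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "EXPSTATE=$state after $tries check(s)"
    return 1
}

# Demonstration with a stand-in for gshow (no KTL access needed).
fake_gshow() { echo "     EXPSTATE = Ready"; }
KTL_GSHOW=fake_gshow wait_expmeter_ready 3 0
```

On the instrument host, wait_expmeter_ready would be called without KTL_GSHOW set, after the EXPOSE=Reset command. A non-zero return indicates the state never reached Ready within the allowed checks.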

Solution 2

If the above solution fails to recover the system, power cycle the camera on power port L3.

...

Then do kpf restart kpf_expmeter

This is equivalent to

kpf restart kpf_expmeter1 and kpf restart kpf_expmeter2
kpf_expmeter1 is the exposure meter camera.
kpf_expmeter2 is the exposure meter DRP.

Sometimes kpf_expmeter2 is stuck in a busy state after a kpf_expmeter1 restart. Both kpf_expmeter1 and kpf_expmeter2 need to be ready before kpfexpose can trigger new exposures.

...

L0 files may be deprecated when switching to fast-readout mode. The observers may find broken L1/L2 files, and the L0 files may be missing green/red/CaHK components. If L0 files need to be regenerated, it is better to write them to a new directory (e.g., /sdata1701/kpfeng/DATE/new_L0) and inform Jeff Mader so he can ingest the new L0 files into the Keck Observatory Archive.

Agitator

Agitator Sounds Wrong or Speed is Wrong

Symptom

The agitator is not working as expected. This is usually seen in the agitator speed value (see screenshots) or by listening to the sounds.

...

[Image: Bad Agitator Speed.png]

You can listen to the agitator mechanism via the “KPF crypt M5075” camera on the facility camera list (note this is an internal web page at Keck). In normal operation the agitator makes a regular (roughly 1-2 Hz) mechanical oscillation sound. When the bad behavior above occurred it was either silent (note there is background fan noise on that camera) or would make a single mechanical “cachunk” sound, then stop.

Problem

The motor is not initialized properly.

Solution

Initialize using:

modify -s kpfmot agitmod=pos

modify -s kpfmot agitini=no

This leaves the system in Halt mode. Initialize using:

modify -s kpfmot agitmod=pos
modify -s kpfmot agitini=yes

Vacuum

Vacuum Chamber Vacuum Levels Rising

Symptom

Vacuum levels in the chamber are rising, but the vacuum levels at the pump are falling. These are kpfvac.VCH_HIVAC and kpfvac.VCART_HIVAC keywords respectively.

This might manifest as a “vac chamber trouble” alert from kpfmon.

Problem

The gate valve between the vacuum chamber and the pump has closed.

The gate valve is currently (late 2023) not instrumented and is controlled by compressed air from the facility. As long as facility compressed air is working, the valve is open and the pump should be keeping the vacuum chamber at good vacuum levels. If the gate valve closes, it is presumably because compressed air has failed.
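
The symptom can be expressed as a simple comparison of two readings of each keyword taken some time apart. The following is a minimal sketch. The gate_valve_suspect function is a hypothetical helper, and the numbers in the demonstration are made-up illustrative values, not real KPF readings.

```shell
# Hypothetical helper: takes earlier and later readings of kpfvac.VCH_HIVAC
# (chamber) and kpfvac.VCART_HIVAC (pump), in that order, and succeeds when
# the chamber value has risen while the pump value has fallen.
gate_valve_suspect() {
    awk -v c0="$1" -v c1="$2" -v p0="$3" -v p1="$4" \
        'BEGIN { exit !((c1 + 0 > c0 + 0) && (p1 + 0 < p0 + 0)) }'
}

# Demonstration with made-up values: chamber 1.0e-6 -> 5.0e-6, pump 2.0e-7 -> 1.5e-7.
if gate_valve_suspect 1.0e-6 5.0e-6 2.0e-7 1.5e-7; then
    echo "Chamber rising, pump falling: check facility compressed air"
fi
```

The comparison only flags the pattern described in the symptom. It does not confirm the valve position, since the gate valve is not instrumented.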

...

Solution

Restore compressed air.

Note that compressed air depends on HELCO power. If we’re on generator power, compressed air is not active. It should come back on its own once power is restored.

SoCal

Enclosure Won’t Open or Close

Symptom

Enclosure will not move. It may begin moving, then stop and reverse.

Problem

Enclosure motor is hitting an overcurrent limit.

To verify this is the problem, log in to the dome controller from kpfeng@kpfserver :

ssh socal

This connects to the Raspberry Pi controller in the dome enclosure. The username and IP address have been configured in the ~/.ssh/config file on kpfserver (you can ssh manually using pi@192.168.23.244), and the SSH key for kpfserver has been installed on the Pi, so it should not ask for a password. If it does, the password is in the usual showpasswords location.

View the dome log file at ~/dome.log and look for errors which indicate the nature of the problem.

The ~/grep_for_dome_error script will exclude many of the noisy, not useful lines in the dome.log file and help with examining the log.

Solution

If the log file indicates that overcurrent on the motors is the issue, ensure the mechanisms are clear of obstruction and reasonably well balanced (the balance doesn’t need to be perfect).

Enclosure Won’t Open

Symptom

Enclosure will not move. It may begin moving, then stop and reverse.

Problem

Enclosure motor is not getting current.

To verify this is the problem, log in to the dome controller from kpfeng@kpfserver :

ssh socal

This connects to the Raspberry Pi controller in the dome enclosure. The username and IP address have been configured in the ~/.ssh/config file on kpfserver (you can ssh manually using pi@192.168.23.244), and the SSH key for kpfserver has been installed on the Pi, so it should not ask for a password. If it does, the password is in the usual showpasswords location.

View the dome log file at ~/dome.log and look for errors similar to:

Code Block
2024-07-09 19:09:37,752 WARNING  Operation timed out (35.0 secs), max measured motor current: 0.0 A.

The ~/grep_for_dome_error script will exclude many of the noisy, not useful lines in the dome.log file and help with examining the log.
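
The zero-current timeout can also be picked out of the log directly. The following is a minimal sketch, assuming each warning appears on a single line in the format of the example above. The names timeout_currents and motor_not_powered are hypothetical helpers and are not related to the existing ~/grep_for_dome_error script.

```shell
# Hypothetical helper: reads dome.log text on stdin and prints the max measured
# motor current (in A) from each "Operation timed out" warning.
timeout_currents() {
    sed -n 's/.*Operation timed out.*max measured motor current: \([0-9.]*[0-9]\) A.*/\1/p'
}

# Hypothetical helper: succeeds if any timeout warning reports zero current.
motor_not_powered() {
    timeout_currents | awk '$1 + 0 == 0 { found = 1 } END { exit !found }'
}

# Demonstration using the example log line from this page.
sample='2024-07-09 19:09:37,752 WARNING  Operation timed out (35.0 secs), max measured motor current: 0.0 A.'
if printf '%s\n' "$sample" | motor_not_powered; then
    echo "Timeout with 0.0 A measured: motor is not getting current"
fi
```

On the Pi this would be run as motor_not_powered < ~/dome.log.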

Solution

Reboot the controller (Raspberry Pi) with sudo reboot and restart the kpfsocal3 dispatcher with kpf restart kpfsocal3.