A quick link and extract from an update on Wednesday in ESA's Science directorate web pages; it provides a nice overview of the 'return to science' situation. References to our very own VMC camera activities highlighted - and note very nice comments on teamwork! Click link to read the full report.
Full report via ESA Science & Technology
While full science operations have now been resumed, a number of tasks remain to be completed. Most important among these is the implementation of an OBCP scheduler. This will enable the spacecraft to operate autonomously for up to a week, compared to the few days that are possible with the current FAST system. Work is also in hand to resume operation of the Visual Monitoring Camera (VMC – the 'Mars webcam').
Enormous team effort
Completely redesigning the way in which Mars Express is controlled has involved an enormous amount of work for the mission control team at the European Space Operations Centre (ESOC), assisted by their counterparts at the European Space Astronomy Centre (ESAC), PI-teams, other ESA experts and partners in industry. Everyone involved with the mission is extremely grateful for their hard work.
Although the 'Express' in Mars Express highlights that the mission was developed in a short time and with a relatively modest budget, the ability to resume full operations after a very serious failure shows that the resulting design is both robust and flexible.
Mars Express has now been restored to full operational capability and its potential mission lifetime remains unchanged.
"Hang on, lads; I've got a great idea..."
The Italian Job, 1969
Last month, we spoke with several of the Mars Express team here at ESOC about their almost completed activities to restore, reconfigure and return Mars Express to service.
An interview with Mars Express Spacecraft Operations Engineer Daniel Lakey
Spacecraft Operations Engineer Daniel Lakey sitting beside the SSMM
A black box, edge-length 30 cm, is at the centre of the recent trouble with Mars Express.
Daniel Lakey, an engineer working on the mission at ESOC in Darmstadt, looks down at the engineering model of said black box sitting on his desk and recalls the seemingly endless night shifts he has had to pull because of its twin mounted on Mars Express, orbiting the Red Planet many millions of kilometres away.
In mid-August 2011, Mars Express unexpectedly placed itself into safe mode – think blue screen of death and reboot on a PC – because something went wrong either with the Solid-State Mass Memory (SSMM) housed inside this black box or with the on-board channels it uses to pass data to the spacecraft’s data management system (DMS) computer.
To extend the PC analogy, imagine that the memory chips in your computer, the RAM chips, or the memory controllers that tell them what to do, suffered a fault. The memory might continue functioning, apparently normal, but whenever an electronic signal tried to access the faulty unit, the operation would fail and the system would crash. That's what happened with Mars Express.
Holiday phone call at 3 AM
"I was on holidays in England when I got the call at three o'clock in the morning. Since I'm assigned as the mission's software coordinator, the problem fell in my area of responsibility," says Lakey.
Switching into safe mode means that the spacecraft automatically turns its solar panels to the Sun for maximum energy and its antenna to Earth for good communication – ostensibly very helpful in any untoward situation – but this process uses a significant amount of vital fuel. Every unnecessary safe mode reduces the life of this hugely valuable mission, and in safe mode, normal gathering of scientific data stops.
After an initial investigation, it was found that the safe modes were being triggered by the DMS computer whenever a batch of commands transferred from the SSMM was interrupted.
The problem: big command batches were being interrupted, triggering a fuel-gobbling safe mode
The SSMM is a large-capacity device, and it stores large numbers of commands sent by mission controllers and the instrument scientists, as well as raw data gathered by the instruments (prior to their being radioed back to Earth).
The SSMM then delivers a constant ‘stream’ of commands to the DMS computer one at a time; when the stream is interrupted – either due to a fault in the SSMM or due to some unknown problem with the on-board communication channels – the DMS detects the problem and auto-commands the spacecraft to switch to safe mode.
Taking action - but problems persist
At first, the flight control team executed the standard recovery procedures and restarted observations, hoping that Mars Express would function normally again.
But, frustratingly, safe modes happened two more times in the next few weeks, even though the engineers had tried switching on-board systems to use back-up communication channels (there is only one SSMM), among many other normal fixes. Nothing in the routine procedures, it seemed, could prevent the safe modes from occurring.
"We had to find a solution," says Lakey, "otherwise the mission would have soon been over."
By late August, the team had already gone through many night shifts trying to coax the recalcitrant spacecraft into some sort of stable configuration, with little luck.
"But then one day, an idea came to mind – while I was standing under the shower," says Lakey, with a laugh. "It occurred to me that, since something was happening to interrupt the flow of commands, triggering the safe mode, the solution might lie in by-passing the checks between the SSMM and the DMS computer, and finding a safe way to ignore problems with the link between the two."
With a little checking, Lakey was able to determine that the problem was, in fact, an issue of 'transient communication problems' between the SSMM and the computer. "When the main computer sees this interruption, it interprets it as a serious problem and stops executing its 'To-Do' list of commands – because it doesn't know whether the list is complete," says Lakey.
Fortunately, there's another, back-up, memory inside the DMS computer that could store the command stack, but it's much, much smaller than the SSMM, holding only 117 commands vs. over 3000.
So the engineers set about reconfiguring the spacecraft's systems to transfer commands from the SSMM to the onboard computer's memory in a different way. Rather than a constant stream of commands, one at a time, the commands would be transferred as a discrete block of commands relating to one complete spacecraft activity, just before that activity started.
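The block-transfer idea can be illustrated with a minimal Python sketch. It is not flight software: the Timeline class, the load_block method and the expected_count argument are hypothetical names invented for illustration, and only the figure of 117 commands comes from the interview. The sketch shows a small command store that accepts a block only when it has arrived complete and fits in the remaining space, and otherwise leaves its contents untouched.

```python
from dataclasses import dataclass, field

TIMELINE_CAPACITY = 117  # size of the back-up memory quoted in the interview


@dataclass
class Timeline:
    """Hypothetical model of the small command store in the DMS computer."""
    capacity: int = TIMELINE_CAPACITY
    commands: list = field(default_factory=list)

    def load_block(self, block, expected_count):
        """Accept a block only if it is complete and fits; otherwise change nothing."""
        if len(block) != expected_count:
            # The transfer was cut short: discard the partial block.
            return False
        if len(self.commands) + len(block) > self.capacity:
            # The block would overflow the small memory: refuse it.
            return False
        self.commands.extend(block)
        return True


timeline = Timeline()
accepted = timeline.load_block(["cmd_1", "cmd_2", "cmd_3"], expected_count=3)
rejected = timeline.load_block(["cmd_4"], expected_count=2)  # one command missing
print(accepted, rejected, len(timeline.commands))
```

In this toy model an interrupted transfer simply results in a rejected block that can be sent again, and the price is the capacity check: one block can never hold more than the small memory allows.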
"I thought we could use a trick, by packing the commands into smaller stacks and telling the on-board software to act only when it received a complete package. This 'all-or-nothing' scheme means we're no longer affected by the SSMM problems, but now we have more limits on what we can schedule in one go – but that's been proven to be acceptable."
But would they buy it?
As soon as he could, Lakey presented the idea to his colleagues.
"Perhaps predictably, they reacted with an operations engineer's traditional caution and scepticism. The first answers were, 'No, no, that won't work, No way...' But, after a lot of discussion, they slowly came around to 'Oh wait... maybe we should look at this... it could work'," Lakey recalls.
The solution: make command batches small
With a clear consensus and the approval of Mars Express Spacecraft Operations Manager Michel Denis, the team set to work designing operations procedures that could be implemented using reduced command stacks, working first on just a certain set of basic on-board activities. This was a huge challenge.
As designed, Mars Express normally makes use of thousands of commands; for example, it takes up to 50 separate commands to simply take a single photo of Mars using the HRSC camera. The new, reduced command stacks would prove worthless if engineers couldn't actually do anything with them.
Thus, making the solution work entails a massive amount of reprogramming to drastically reduce the number of commands needed to do anything on board. This work is what has kept the mission operations team on extended hours since November 2011.
Smiles all around
But, to everyone's delight, the solution is working and the team has substantially finished the work of converting thousands of commands into on-board procedures that can be used much more efficiently than the spacecraft's designers had ever envisaged.
"We are confident that all the Mars Express instruments and systems can be commanded using the reduced command stacks," says Lakey.
"Now, we only need a few commands to capture an image and we can switch on and operate all the instruments at one time," he explains.
"We can proudly say that Mars Express is working properly again – and, with luck – the fuel left could last for another ten years."
A comment today from Mars Express Spacecraft Operations Manager Michel Denis on this week's report: "MARSIS completes measurement campaign over Martian North Pole." The report gives good news!
"The Mars Advanced Radar for Subsurface and Ionosphere Sounding (MARSIS) instrument on board Mars Express has recently completed a subsurface sounding campaign over the planet's North Pole. The campaign was interrupted by the suspension of science observations several times between August and October due to safe modes and to anomalies in the operation of the spacecraft's Solid-State Mass Memory (SSMM) system. As MARSIS best observes in the dark, which for the North Pole only occurs every few years, it was among the first instruments to resume observations once a partial work-around for the problems had been implemented."
In his comment below, the 'FAST Method' that Michel refers to is the operations team's newly developed way of uploading commands to Mars Express, which avoids using the problematic Solid State Mass Memory (SSMM) for critical commanding.
The 'File-based Activities on Short Timeline' method essentially means that commands are grouped in very short self-contained files that can be loaded safely, in advance of execution, from the SSMM into an alternative memory unit (that is reliable but not as capacious as the SSMM).
The FAST method - loading short command files upon need into the short onboard mission timeline - was put into use at the end of October 2011 with the (excellent) result that we could save what was remaining of the North Pole observation campaign by the MARSIS radar.
The net loss in data collection was mitigated by using the existing MARSIS command sequences as soon as possible. Meanwhile, as for the other instruments, new MARSIS on-board control procedures (OBCPs) are under development and will allow operation with fewer commands, therefore enabling the operation of several science instruments in parallel.
My main point? We did our job: contrary to widespread received wisdom, the spacecraft operators' role is not to simply watch over (supposedly) boring routine operations during the many long years of a mission - nor simply saving a spacecraft that experiences problems. In fact, we are relied upon to deliver safely as much of the expected (precious) scientific data as possible within the resources available - despite adversity. And that's what we're doing!
This just in this morning from Jonathan Schulster on the Mars Express operations team:
The ASPERA (Energetic Neutral Atoms Analyser) instrument high voltage (kV) lines and equipment were successfully switched on today, a few minutes ago at Mars (~09:57 CET), using the new on-board control procedures (OBCP).
These will run for one hour until 10:40 CET today and the ASPERA science team will examine the recorded science data before giving the 'go-ahead' for full operations of ASPERA using only OBCPs starting 9 January 2012.
- Jonathan Schulster
Mars Express Mission Planning & Flight Control Team
Looks like another instrument is set to return to action! -- Daniel
Today's update comes courtesy of Jonathan Schulster, an engineer working on the Mars Express Mission Planning & Flight Control Team. Jonathan's in the Mars Express (MEX) Dedicated Control Room at ESOC this morning, where the first commands to switch on the ASPERA (Energetic Neutral Atoms Analyser) device were sent a few minutes ago. (ASPERA is studying the interaction between the solar wind and the Martian atmosphere.) -- Daniel
To allow the mission to restart operations of all instruments, we needed to write 'macro' on-board control procedures (OBCPs) to replace long sequences of telecommands with single 'start macro' telecommands that would fit into the restrictive memory space provided by the short mission time-line (Basically, all commands now have to be a lot shorter to be stored on board now that the solid state mass memory no longer functions properly - Ed.).
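As a rough illustration of the 'macro' idea, here is a minimal Python sketch. Everything in it is hypothetical: the OBCP names, the step names, the sequence lengths and the START_OBCP label are invented for this example and do not reflect the real Mars Express telecommand set. It only shows the principle that one short telecommand stored in the time-line can stand in for a long sequence held on board.

```python
# Hypothetical on-board library: each OBCP name maps to the long
# telecommand sequence it replaces. All names are illustrative only.
OBCP_LIBRARY = {
    "INSTRUMENT_ON": [f"TC_ON_STEP_{i}" for i in range(1, 13)],
    "INSTRUMENT_OFF": [f"TC_OFF_STEP_{i}" for i in range(1, 9)],
}


def start_macro(name):
    """Build the single telecommand that is stored in the short time-line."""
    if name not in OBCP_LIBRARY:
        raise KeyError(f"unknown OBCP: {name}")
    return ("START_OBCP", name)


def run_on_board(telecommand):
    """Expand a 'start macro' telecommand into its full sequence on board."""
    kind, name = telecommand
    if kind != "START_OBCP":
        raise ValueError(f"not a start-macro telecommand: {kind}")
    return list(OBCP_LIBRARY[name])


timeline = [start_macro("INSTRUMENT_ON"), start_macro("INSTRUMENT_OFF")]
executed = [step for tc in timeline for step in run_on_board(tc)]
print(len(timeline), "stored telecommands expand to", len(executed), "steps")
```

With these made-up numbers, two stored telecommands take the place of twenty individual steps, which is the kind of saving that lets an activity fit in a small memory.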
The first OBCPs to switch ASPERA on and off, and its 'high voltage' lines up and down - along with on/off for the PFS Fourier spectrometer and the start/stop pre-heating for the OMEGA infrared (IR) spectrometer 'scanner' - were uplinked to Mars Express on Monday, 5 December. These must now be flight tested on the spacecraft.
The OMEGA pre-heating test took place on 7 December during a pass over ESA's 35m ground station at Cebreros, Spain, between 08:00-08:30 GMT (09:00-09:30 CET).
Today, Friday, 9 December, also between 08:00-08:30 GMT, the ASPERA instrument switch on/off OBCPs were successfully flight tested. :-)
Pierre Choukroun (standing, left), Erhard Rabenau (standing, right) and Jonathan Schulster (sitting) in the MEX DCR this morning.
Next week we plan to perform a full test of ASPERA with high voltage up/down after confirmation from the science team that the on/off worked fine.