Loosely Synchronized Dual Processor Architecture Computer Science Essay

Modern automobiles are equipped with sophisticated distributed embedded systems, most of which have stringent safety requirements. The number of safety-related embedded systems responsible for the active and passive safety of the vehicle keeps increasing. The concept called drive-by-wire or x-by-wire aims at improving the safety of the passengers by replacing mechanical components with purely electronic or electromechanical components. Research has also been carried out towards a fully automated or assisted driving experience [ 4 ] [ 5 ] .

Since these electronic systems must handle highly safety-critical tasks without mechanical backup and operate in particularly harsh environments, their fault tolerance is of utmost importance. It must also be considered that the automotive industry is highly sensitive to cost. This report presents several hardware-based fault-tolerant architectures and approaches to use these architectures in a cost-effective manner by means of scheduling methodologies.

1.1. Automotive Safety Requirements

As defined in [ 6 ], safety integrity is the probability that a safety-related system performs its safety functions satisfactorily under all stated conditions for the stated period of time. The safety standards relevant to automobiles are IEC 61508 and ISO 26262; the latter assigns automotive safety integrity levels ( ASIL ) to the electronic components in a vehicle. The ASILs range from A to D in increasing order of risk ( level A corresponds to the least amount of risk ) [ 8 ] .

In the coming years, safety features ( including fault tolerance ) along with performance will be the distinguishing factor in the automotive field. Figure 1 shows how the safety requirements of electronic components in vehicles, as well as the degree of automation in this field, have increased over the years.

ASIL D is the most stringent requirement under the safety standards and demands fault tolerance against all types of hardware faults.

Faults are generally classified into:

• Transient faults: These occur due to external influences such as heat, electrical noise or radiation. Radiation can cause localized ionization events which in turn upset internal data states. The resulting error is also called a soft error. Soft errors significantly reduce system availability. The rate at which they occur is called the soft error rate ( SER ) . Soft errors are a growing concern in current technologies, where device sizes are shrinking [ 12 ] [ 13 ] [ 15 ] .

• Intermittent faults: These are transient faults that recur from time to time.

ITI Seminar: Safety of Automotive ICs Topic: Hardware-based Fault-Tolerance for Automotive Applications Sunil P 2

• Permanent faults: These faults are always reproducible and arise from hardware ( physical ) damage. For example, the high level of integration in chips leads to reduced pitch and wire width, which may result in an open or short circuit [ 11 ] .

Figure 1: Severity of failures in electronic driver assistance systems and drive-by-wire systems [ 5 ] .

When a fault is detected, depending on the ASIL, the system has to degrade or exhibit fault tolerance according to the following levels:

• Fail-Operational ( FO ) : The system can tolerate a single failure, which means it remains operational even after one failure. Highly safety-critical ASIL D systems, such as the electronic brake system in brake-by-wire and the sensors and actuators that support its functionality, must have this capability. Another example is the regenerative energy storage system in electric cars [ 16 ] .

• Fail-Safe ( FS ) : Even after a failure, the component continues to function and initiates actions so that the system reaches a safe state. A safe state for a vehicle is defined as standstill. This level of fault tolerance is required for systems in ASIL C and D. The mechanical backup for an electronic control system and the propulsion system in electric cars are examples of fail-safe units in a vehicle [ 5 ] [ 16 ] .

• Fail-Silent ( FSIL ) : Upon a failure, the affected component shuts down so that it does not wrongly influence other components [ 5 ] .

1.2. Analysis of Fault tolerant techniques

Fault-tolerant techniques can be categorized into:

• Information redundancy: This methodology works by adding extra information during data exchange. The extra information can be used for error detection and error correction. There are several forms of error-detecting codes, namely parity codes ( used in memory storage ) , duplication codes ( applied in communication systems ) , checksums ( applied in data transfer between memory elements ) , Berger codes ( which detect all unidirectional errors ) and cyclic redundancy check ( CRC ) codes. With error-correcting codes ( ECC ) such as Hamming codes, the detected errors can also be corrected. Memories and buses are usually protected by ECC. A hardware checker has to be implemented to evaluate the code words. To ensure the reliability of the hardware checker itself, a self-testing mechanism also needs to be implemented that can detect internal faults without external stimuli.

• Temporal redundancy: This technique repeats a computation two or more times and compares the results, checking for discrepancies. Time redundancy tries to avoid the additional hardware required by spatial and information redundancy by using the available time slots instead. Temporal redundancy was mainly aimed at detecting soft errors, but it can also detect permanent faults: a stuck-at fault in a bus line can be detected by sending the original data first, followed by the complement of the original data after a time interval. The disadvantage of temporal redundancy is that the repeated execution costs time, which makes it hard to apply to systems with hard real-time constraints.

• Spatial redundancy: In this technique a system has more components than are actually required for its functionality. These components can be identical to the existing ones or have different functionality. Spatial redundancy itself is broadly divided into:

o Passive hardware redundancy: To achieve fault tolerance, the replicated components execute the same tasks and their results are routed to a voter or a comparator which checks the validity of the results. Depending on the number of such components that are active, the system can switch between fail-operational, fail-safe and fail-silent modes.

o Active hardware redundancy: This method tries to achieve fault tolerance by fault detection, localization and recovery. In contrast to the passive scheme, there is no attempt to prevent faults from producing errors in the system. Once an error is detected, the system tries to locate the fault and then either reconfigures itself without the faulty component ( degraded mode ) or activates a standby component ( standby replacement ) .

o Hybrid redundancy: This method combines the attractive features of passive and active redundancy, which means the system is capable of both fault masking and reconfiguration.

The disadvantage of spatial redundancy is the cost of hardware, which increases from active to hybrid redundancy [ 10 ] [ 17 ] .
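The stuck-at detection described under temporal redundancy can be illustrated with a small simulation. This is only a sketch under stated assumptions: the bus model `make_bus`, the helper `send_with_complement` and the 8-bit width are hypothetical names and values chosen for illustration, not part of any cited design.

```python
def make_bus(width, stuck_at=None):
    """Model a bus; stuck_at maps a bit index to the value (0 or 1) it is stuck at."""
    mask = (1 << width) - 1
    stuck_at = stuck_at or {}

    def transfer(word):
        word &= mask
        for bit, value in stuck_at.items():
            if value:
                word |= 1 << bit      # line stuck at 1
            else:
                word &= ~(1 << bit)   # line stuck at 0
        return word & mask

    return transfer


def send_with_complement(transfer, word, width):
    """Send word, then its complement. Returns (received_word, fault_detected)."""
    mask = (1 << width) - 1
    first = transfer(word)
    second = transfer(~word & mask)
    # On a healthy bus the two receptions are exact bitwise complements.
    fault = (first ^ second) != mask
    return first, fault


healthy = make_bus(8)
faulty = make_bus(8, stuck_at={3: 0})   # bit 3 stuck at 0

print(send_with_complement(healthy, 0b10100101, 8))
print(send_with_complement(faulty, 0b10100101, 8))
```

In the faulty case the first transfer happens to arrive intact because bit 3 of the word is already 0; only the complemented transfer exposes the stuck line, which is why the second transmission is needed.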

The first section of this report mainly discusses spatially redundant architectures which can be applied to embedded systems in the automotive industry. The functionality, advantages and disadvantages of the architectures are discussed. The section ends with an example implementation of a fault-tolerant architecture in the industry.

Fault tolerance realised with redundancy incurs cost, and the next section of the report focuses on methods to reduce the cost of redundancy and gain maximum throughput from the architecture. Two methods, relaxed dedication and distributed temporal redundancy, are presented. This is followed by a comparison and a study of the performance improvement.

2. Fault Tolerant Multi-Core Architectures

Multi-core systems-on-chip are the new trend in embedded systems to increase performance so that sophisticated control functions can be implemented. In the automotive domain, multi-core is attractive because it ensures the required safety standards through redundancy and in addition brings higher computational power. With the current rate of technology scaling, more and more cores can be integrated into a chip. However, it is also known that such scaling makes electronic components more susceptible to soft errors due to external disturbances. Other known problems due to scaling down are variability and degradation ( aging ) [ 9 ] .

In the following sections, different multi-core architectures are discussed which can be used to effectively handle soft errors caused by transient faults.

2.1 Lock-Step Dual Processor Architecture

Figure 2 depicts the lock-step architecture, which uses two processors: a master CPU and a checker CPU.

The master accesses the memory, fetches instructions and data, and executes them. The checker core continuously executes the instructions on the bus that are fetched by the master. The results of the execution, both addresses and data, are fed to the monitor, which compares them with those from the master. The detection of any discrepancy indicates the presence of a fault in either of the CPUs.

The monitor, however, cannot detect bus or memory errors. Hence the bus and memory have to be protected with the help of ECC ( for example parity codes ) .

This architecture can work as a fail-silent node which is capable of detecting a single failure ( of either CPU ) [ 1 ] [ 17 ] [ 18 ] .
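The role of the monitor can be sketched in a few lines. The following is a behavioural model only, not a description of real hardware: `run_lockstep`, `good_alu` and `flaky_alu` are hypothetical names, and the injected fault (a flipped bit on multiplications) is an assumption made for the example.

```python
def run_lockstep(program, master_alu, checker_alu):
    """Run each instruction on both cores and compare the results.

    program is a list of (op, a, b) tuples. Returns (committed_results,
    fault_detected). On the first mismatch the node stops producing output,
    because the monitor cannot tell which of the two cores is wrong.
    """
    committed = []
    for op, a, b in program:
        master_result = master_alu(op, a, b)
        checker_result = checker_alu(op, a, b)
        if master_result != checker_result:
            return committed, True   # fail-silent: nothing further is committed
        committed.append(master_result)
    return committed, False


def good_alu(op, a, b):
    return a + b if op == "add" else a * b


def flaky_alu(op, a, b):
    # Hypothetical fault: bit 0 of every multiplication result is flipped.
    result = good_alu(op, a, b)
    return result ^ 1 if op == "mul" else result


program = [("add", 1, 2), ("mul", 3, 4), ("add", 5, 6)]
print(run_lockstep(program, good_alu, good_alu))
print(run_lockstep(program, good_alu, flaky_alu))
```

The second run commits only the first result and then reports a fault, which is the fail-silent behaviour the text describes: the error is detected, but it is neither located nor masked.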

Figure 2: Lock-Step dual processor architecture [ 1 ] .

2.2 Loosely-Synchronized Dual Processor Architecture

The loosely-synchronized architecture shown in Figure 3 has two independent CPUs, each of which accesses its own memory subsystem.

Figure 3: Loosely-Synchronized dual processor architecture [ 1 ] .

The loosely-synchronized dual processor architecture is an example of an asymmetric multiprocessing system in which synchronization and error checking are performed via interprocessor communication by the real-time operating system ( RTOS ) running on both CPUs.

For the purpose of error checking, the critical tasks have to be duplicated in both memories. They are executed in parallel and the results are exchanged. Each RTOS then checks the consistency of the results, and a fault is concluded in case of any discrepancy. When a fault is detected, the results are not committed and each CPU runs its own self-test to find the faulty component. If the check succeeds in identifying the faulty component, the system can continue to run ( in degraded mode ) with the help of the healthy component. Also, since the bus and memory system are replicated in this architecture, information redundancy such as ECC need not be implemented.

Cross-checking of results before they are committed is important. Time guards can therefore be used to restrict CPU access to the outputs to a predefined time window. This can be either implemented in hardware or performed by the RTOS. Another technique is for each processor to add a signature to the results. At the receiver, data is accepted only after checking both signatures.
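The signature technique can be sketched as follows. The source does not say how the signatures are formed, so this example assumes a keyed hash (HMAC-SHA256 from the Python standard library) with one hypothetical key per CPU; `KEYS`, `sign`, `accept` and `cross_check_and_send` are illustrative names only.

```python
import hashlib
import hmac

# Hypothetical per-CPU keys, known to the receiver as well.
KEYS = {"cpu0": b"key-of-cpu0", "cpu1": b"key-of-cpu1"}


def sign(cpu, payload: bytes) -> bytes:
    """Signature of one CPU over a result (assumed here to be an HMAC)."""
    return hmac.new(KEYS[cpu], payload, hashlib.sha256).digest()


def cross_check_and_send(result0: bytes, result1: bytes):
    """Release the output only if both CPUs computed the same result."""
    if result0 != result1:
        return None   # discrepancy: do not commit, each CPU starts its self-test
    return result0, {"cpu0": sign("cpu0", result0), "cpu1": sign("cpu1", result1)}


def accept(payload: bytes, signatures: dict) -> bool:
    """Receiver side: accept only data carrying valid signatures of both CPUs."""
    return all(
        cpu in signatures and hmac.compare_digest(signatures[cpu], sign(cpu, payload))
        for cpu in KEYS
    )


message = cross_check_and_send(b"\x10", b"\x10")
payload, signatures = message
print(accept(payload, signatures))                       # both signatures valid
print(accept(payload, {"cpu0": signatures["cpu0"]}))     # one signature missing
print(cross_check_and_send(b"\x10", b"\x11"))            # results disagree
```

A simple checksum per CPU would serve the same structural purpose; the keyed hash is used here only so that one CPU cannot produce the other's signature.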

This architecture can be compared to the lock-step architecture on the basis of the number of critical tasks to be handled. When there is a small number of critical tasks, the loosely-synchronized architecture can use both processors independently for executing non-critical tasks, thereby increasing the throughput.

On the other hand, in the loosely-synchronized architecture the critical task set has to be fully replicated in both memories, and the performance is identical to that of a single processor during regular execution. Here lock-step fares better in terms of performance because the self-checking algorithm is implemented in hardware ( CPU checker ) . The CPU master and checker run in lock-step and the results are fed into the monitor synchronously, which makes error detection faster [ 1 ] [ 18 ] .

2.3 Triple Modular Redundant ( TMR ) Architecture

The TMR architecture ( Figure 4 ) is a common form of passive hardware redundancy. The architecture has three identical processors which operate in lock-step, executing the instructions fetched from a single source ( RAM/Flash ) . A voter implemented in hardware checks the outputs from the three CPUs; it can detect and mask a single CPU failure, which means that this architecture is fail-operational. Since the memory and bus system are not replicated, they need to be protected by ECC ( such as parity bits ) .
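The masking behaviour of the voter can be shown with a minimal software model. This is a sketch, not the hardware voter itself; the function name `majority_vote` and the convention of returning the index of the disagreeing CPU are assumptions made for the example.

```python
from collections import Counter


def majority_vote(a, b, c):
    """Two-out-of-three vote over the outputs of three CPUs.

    Returns (value, faulty_index). faulty_index is None when all outputs
    agree; value is None when all three disagree and nothing can be masked.
    """
    outputs = (a, b, c)
    value, count = Counter(outputs).most_common(1)[0]
    if count == 3:
        return value, None
    if count == 2:
        faulty = next(i for i, out in enumerate(outputs) if out != value)
        return value, faulty
    return None, None


print(majority_vote(7, 7, 7))
print(majority_vote(7, 9, 7))
print(majority_vote(1, 2, 3))
```

A single wrong output is outvoted and the system keeps delivering the correct value, whereas two simultaneous faults with different wrong values cannot be masked.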

The system, however, depends on the reliability of the voter, since its failure results in the failure of the system. One method to ensure reliability is triplication of the voter: even if one of the voters fails, the system remains operational with the help of the other two voters.

Another difficulty faced by the voting mechanism is that in certain situations the results from the processors may not be identical even in a fault-free situation. As an example, analog-to-digital conversion can produce variations in at least the least significant bits. The approach to solve this problem is to select the mid value ( the value that lies between the other two ) of the results, which is called the mid-value select technique.
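Mid-value selection amounts to taking the median of the three results. A minimal sketch, with `mid_value_select` as a hypothetical function name and made-up 12-bit ADC readings:

```python
def mid_value_select(a, b, c):
    """Return the value that lies between the other two (the median of three)."""
    return sorted((a, b, c))[1]


# Fault-free readings that differ only in the least significant bits.
print(mid_value_select(512, 513, 511))
# One channel delivers a grossly wrong value; it is never the median.
print(mid_value_select(512, 4095, 511))
```

Unlike exact-match voting, this never reports a disagreement for small fault-free deviations, and a single outlier cannot be selected because it always ends up as the largest or smallest of the three values.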

Figure 4: Triple modular redundant ( TMR ) architecture [ 1 ] .

It should also be noted that, since all cores of a TMR system run in lock-step, its performance is similar to that of a single processor. TMR is nevertheless used in systems with high reliability requirements, where safety has higher priority than cost.

One such notable application is the Boeing 777 aircraft. The primary flight computer ( fly-by-wire ) of the aircraft must be highly reliable and therefore has three identical units in TMR configuration. Each of these units has three processors, again working in TMR configuration. In addition, the processors used in each unit are heterogeneous [ 17 ] [ 19 ] .

2.4 Dual Lock-Step Architecture

As depicted in Figure 5, the dual lock-step architecture has two fail-silent channels, each implemented with the lock-step architecture. The entire system acts as a fail-operational unit which can mask a single CPU fault.

Figure 5: Dual Lock-Step architecture [ 1 ] .

The critical tasks need to be replicated in both memories in order to check for errors. As an advantage over the loosely-synchronized model, this architecture need not perform a self-test to find the faulty component, because the critical code is duplicated and executed by both CPU masters, whose results are then verified by the respective CPU checkers.

Note that when a fault is detected, the faulty CPU master/checker channel can be masked and the other channel can still act as a fail-silent node [ 1 ] .
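The degradation path of the dual lock-step architecture can be modelled in a few lines. This is a behavioural sketch only; `lockstep_channel`, `dual_lockstep` and the state labels are hypothetical names chosen for illustration.

```python
def lockstep_channel(master_out, checker_out):
    """Fail-silent channel: deliver the master's output, or None on a mismatch."""
    return master_out if master_out == checker_out else None


def dual_lockstep(channel_a, channel_b):
    """Combine two fail-silent channels.

    Each channel is a (master_out, checker_out) pair. Returns (output, state),
    where state describes what the system can still guarantee afterwards.
    """
    out_a = lockstep_channel(*channel_a)
    out_b = lockstep_channel(*channel_b)
    if out_a is not None and out_b is not None:
        return out_a, "fail-operational"   # a further single fault can be masked
    if out_a is not None:
        return out_a, "fail-silent"        # channel B masked, A continues alone
    if out_b is not None:
        return out_b, "fail-silent"        # channel A masked, B continues alone
    return None, "silent"                  # both channels have shut down


print(dual_lockstep((5, 5), (5, 5)))
print(dual_lockstep((5, 4), (5, 5)))
print(dual_lockstep((5, 4), (6, 5)))
```

No self-test is needed to locate the fault: the channel whose own master and checker disagree identifies itself by going silent.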

2.5 Implementation of Fault-tolerant architectures and Comparisons

Cost needs to be considered when implementing the multi-core architectures discussed above. Using shared memory in the loosely-synchronized and dual lock-step architectures can save cost. To avoid or reduce the performance bottleneck resulting from two processors accessing the same memory, the memory can be split into four banks ( two for code and two for data ) and the bus system can be replaced with a cross-bar switch.

Figure 6 depicts a shared-memory ( SM ) loosely-synchronized architecture in which both processors share a memory subsystem.

Figure 6: SM Loosely-Synchronized dual processor architecture [ 1 ] .

Here the duplication of critical code becomes a trade-off between performance and memory size. Non-duplicated code suffers from slower execution, since the cores are not synchronized ( not in lock-step ) and memory accesses to the same location occur at different times. Duplicated code runs faster but consumes more memory space.

Figure 7 depicts the shared-memory dual lock-step architecture. The major plus point is that, since the memory subsystem is shared, the architecture provides the same fault tolerance as the TMR solution. In this case, the two channels operate in lock-step mode and an additional voter implemented in software carries out the result comparison.

Figure 7: SM Dual Lock-Step architecture [ 1 ] .

At the same time it also offers flexibility, as the two fail-silent channels can operate separately, which provides double the performance. Lock-step and parallel execution thus become two modes of operation for this architecture.

A comparison of the multi-core solutions discussed so far reveals some interesting facts:

• The lock-step architecture improves reliability but cannot offer any performance boost over the single processor architecture. Another disadvantage is the limited degradation possibility: it can only act as a fail-silent node. The encouraging fact is that this architecture has less area ( silicon ) overhead than the other architectures.

• The SM loosely-synchronized architecture, on the other hand, offers a degraded mode of operation. In order to run in lock-step mode the critical code has to be duplicated, which increases the memory cost and consumes a lot of area. Also, fault diagnosis is complicated and time-consuming, as it needs to be handled by the RTOS.

• The TMR solution requires less area than SM dual lock-step and also provides single-fault masking ability. The disadvantage is that its performance can only be that of the single processor architecture.

• The fail-silent channels of the SM dual lock-step architecture can run either in lock-step or independently. When running independently, they give the performance of a dual processor architecture. When in lock-step, reliability is achieved by providing the same single-fault masking capability as TMR. All this comes at the small cost of a modest increase in area because of the extra CPU added compared to TMR.

The SM dual lock-step solution is ideal for applications which require high computational power while maintaining the highest safety integrity level ( ASIL D ) described by the standards.

2.6. Multi-core Implementation in the Industry

Multi-core architectures have made their way into the automotive industry with the advent of computationally intensive applications and the reliability requirements of safety-critical systems. The AURIX family from Infineon is such an entry to the market, promising adherence to the ISO 26262 standard [ 7 ] .

Figure 8 depicts the architecture of the AURIX family of MCUs. It has three cores: two performance cores and one efficiency core. Fault tolerance is attained through a diverse lock-step core, which is implemented for one of the performance cores as well as for the efficiency core. The other performance core can run non-critical tasks which do not need fault tolerance. Multi-cores are susceptible to common-mode failures due to the clock tree, power supply or silicon substrate. A diverse lock-step core can prevent some of these common-mode failures, as there is physical separation between the cores and physical damage in one core may not affect the other core.

Figure 9 explains the operation of the lock-step core. The architecture is similar to the lock-step architecture, with a main core, a lock-step core and a comparator which are physically separated. In addition, delays have been introduced between their executions. The delay increases the probability of detecting faults due to external influences such as voltage spikes. In this architecture the comparator is a key component and must be highly reliable [ 7 ] .

Figure 8: AURIX architecture [ 7 ] .

Figure 9: Lockstep CPU in Infineon AURIX [ 7 ] .

3. Reducing Cost by effective scheduling techniques

The multi-core architectures discussed in the previous section aim to increase the reliability, safety and also the performance of the embedded systems in the vehicle.

To achieve the required fault tolerance level ( namely fail-operational, fail-safe or fail-silent ) , redundancy in some form has to be used, with which the performance of the system can also be improved. Here scheduling techniques can play an important role by helping to reduce the cost of redundancy and increase the performance of the system.

The electronic control units that handle safety-critical applications typically have two sets of tasks to execute, namely the critical task ( CT ) set and the non-critical task ( NCT ) set. The critical set is dedicated to execution on critical resources ( CR, also referred to as critical task resources, CTR ) while the non-critical task set runs on non-critical task resources ( NCTR ) . The distinction exists because, in order to maintain fault tolerance, the CTs are scheduled on the CRs in lock-step execution. The non-critical task subsystem does not usually interfere with the critical resources, thereby ensuring the timely detection of failures.

However, this can in turn lead to poor utilization of processing power, because the critical resources remain idle or underutilized when no critical tasks are running. This leads to an idea called on-demand redundancy, which helps to improve processor utilization and reduce cost [ 2 ] .

Two different techniques can be employed to achieve this goal.

• Relaxed dedication [ 2 ] : This method allows NCTs to be executed on CRs and thereby increases the throughput of NCTs.

• Distributed temporal redundancy [ 3 ] : This method relies on relaxing the lock-step without hampering the reliability of the system.

3.1. Relaxed Dedication

As the name suggests, relaxed dedication relaxes the requirement that only critical tasks can be executed on critical resources. This helps to reduce the hardware needed to schedule and execute the NCTs.

Critical tasks are traditionally scheduled with enough slack that, in case of a detected error, the task can be re-executed. For this purpose, retry slots are allotted after each CT execution so that the system can try to recompute the faulty operation. The retry slots are statically scheduled with the constraint that they do not cross the task deadlines [ 15 ] .

Under the assumption that the transient fault rate or soft error rate ( SER ) and the permanent fault rate due to hardware damage are low, these retry slots remain unutilized. Relaxed dedication proposes to schedule NCTs during these retry slots and in the idle slots between critical tasks.

Relaxed dedication can readily be deployed in systems with lock-step execution. The only design change required is to add a status bit that indicates whether the comparator logic is to be used or not.

When there is a switch from an NCT to a critical task, this bit is set, indicating that execution must be in lock-step and the results are to be compared.
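The effect of the status bit can be modelled as follows. This is a behavioural sketch under assumptions: the class `LockstepPair`, its `dispatch` method and the string task kinds are hypothetical, and a real implementation would gate the hardware comparator rather than run Python code.

```python
class LockstepPair:
    """Model of a critical resource pair with a comparator enable bit."""

    def __init__(self):
        self.compare_enabled = False   # the status bit described in the text

    def dispatch(self, task_kind, out0, out1):
        """Process the outputs of one time slot.

        For a CT both cores ran the same task and the outputs are compared.
        For NCTs each core ran its own task and both outputs are delivered.
        """
        self.compare_enabled = (task_kind == "CT")
        if self.compare_enabled:
            if out0 != out1:
                return "mismatch", None   # triggers the retry slot
            return "ok", [out0]
        return "ok", [out0, out1]


pair = LockstepPair()
print(pair.dispatch("NCT", 1, 2))   # two independent results, no comparison
print(pair.dispatch("CT", 3, 3))    # one compared result
print(pair.dispatch("CT", 3, 4))    # comparator reports a mismatch
```

With the bit cleared the pair behaves like two ordinary cores, which is where the additional NCT throughput comes from.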

3.1.1. Analytic Model

To understand the performance gains of relaxed dedication, the following analytical model can be used.

Assuming a dual modular redundancy ( DMR ) system with N NCTRs, the number of cycles available for NCT execution is given by

$W_{DMR} = \sum_{i=1}^{N} f_i t$

where $f_i$ is the clock frequency of NCTR$_i$ and $t$ is the hyperperiod.

As mentioned above, relaxed dedication aims to use the idle and retry slots for performance improvement. The number of cycles for NCTs in a DMR system with M critical resource pairs is given by

$W_{DMR+RD} = \sum_{i=1}^{M} 2 f_i t ( 1 - c_i ) + W_{DMR}$

Here $c_i$ is the fraction of time for which the pair CTR$_i$ is executing critical tasks, and hence $( 1 - c_i )$ gives the idle/retry slots which can be used for NCTs. Consider the simpler case for analysis in which all resources run at the same frequency $f$ and each CTR pair executes CTs for the same fraction of time $c$:

$W_{DMR+RD} = 2 M f t ( 1 - c ) + N f t$

The ratio is

$W_{DMR+RD} / W_{DMR} = ( 2 M f t ( 1 - c ) + N f t ) / ( N f t ) = 1 + 2 M ( 1 - c ) / N$

The equation reveals that when c = 0.5, M = 1 and N = 1, a DMR architecture with relaxed dedication provides twice the performance of a dedicated DMR. Also, as the number of critical resource pairs increases, so does the throughput of non-critical tasks [ 2 ] .
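The model is easy to evaluate numerically. The sketch below implements the cycle counts and the simplified ratio; the function names and the example frequency and hyperperiod are arbitrary values chosen for illustration.

```python
def nct_cycles_dmr(nctr_freqs, t):
    """Cycles available to NCTs in a dedicated DMR system: sum of f_i * t."""
    return sum(f * t for f in nctr_freqs)


def nct_cycles_dmr_rd(ctr_pair_freqs, ct_fractions, nctr_freqs, t):
    """Cycles available to NCTs with relaxed dedication.

    Each critical resource pair contributes both of its cores for the
    fraction (1 - c_i) of the hyperperiod in which no CT is running.
    """
    extra = sum(2 * f * t * (1 - c) for f, c in zip(ctr_pair_freqs, ct_fractions))
    return extra + nct_cycles_dmr(nctr_freqs, t)


def speedup(m, n, c):
    """Simplified ratio for equal frequencies and equal CT fraction c."""
    return 1 + 2 * m * (1 - c) / n


f, t = 100, 10   # arbitrary frequency and hyperperiod
print(nct_cycles_dmr_rd([f], [0.5], [f], t) / nct_cycles_dmr([f], t))
print(speedup(1, 1, 0.5))
print(speedup(2, 1, 0.5))
```

Both the general formula and the simplified ratio give a factor of two for c = 0.5, M = 1, N = 1, and adding a second critical pair raises the factor further, in line with the statement that NCT throughput grows with the number of critical resource pairs.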

3.1.2. Scheduling of critical and non-critical tasks

Critical tasks are usually scheduled on critical resources as shown in Figure 10. As can be observed, the tasks have retry slots after each scheduled instance and also idle time between them.

Figure 10: Critical tasks ( solid red ) and retry reservations ( green ) [ 2 ] .

cP0 and cP1 are the critical resource pair, while ncP0 is the non-critical resource.

Figure 11 shows that in dedicated architectures the non-critical tasks are scheduled only on the non-critical resource.

Figure 11: Dedication limits the scheduling opportunities of NCTs ( blue ) [ 2 ] .

This dedication of non-critical tasks to the non-critical resource severely degrades the throughput of NCTs while the critical resources remain idle.

The relaxed dedication methodology ( Figure 12 ) schedules the NCTs on the critical resources as well.

Figure 12: Relaxed Dedication increases the throughput of NCTs [ 2 ] .

In the scheduling process, the critical tasks are first scheduled using the list scheduling method. The scheduling is based on the laxity of each critical task, which is the difference between its latest/earliest finish time and its arrival time. The task with the least laxity is scheduled first. After this, non-critical tasks can be scheduled in the retry slots as well as in the idle periods.
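The two scheduling passes can be sketched for a single critical resource pair. This is a simplified illustration, not the algorithm of [ 2 ]: laxity is assumed here to be deadline minus arrival minus execution time, each retry slot is assumed to be as long as the task itself, NCTs are placed first-fit, and all function names and task values are hypothetical.

```python
def schedule_critical(tasks):
    """List-schedule CTs, least laxity first, each followed by a retry slot.

    tasks: list of (name, arrival, deadline, wcet).
    Returns a timeline of (start, end, label) entries.
    """
    timeline, now = [], 0
    # Laxity is assumed here to be deadline - arrival - wcet.
    for name, arrival, deadline, wcet in sorted(tasks, key=lambda x: x[2] - x[1] - x[3]):
        start = max(now, arrival)
        end = start + wcet
        retry_end = end + wcet
        if retry_end > deadline:
            raise ValueError(f"{name}: retry slot would cross the deadline")
        timeline.append((start, end, name))
        timeline.append((end, retry_end, "retry:" + name))
        now = retry_end
    return timeline


def free_slots(timeline, hyperperiod):
    """Retry reservations and idle gaps, both usable by NCTs in the fault-free case."""
    slots, cursor = [], 0
    for start, end, label in sorted(timeline):
        if start > cursor:
            slots.append((cursor, start))       # idle gap
        if label.startswith("retry:"):
            slots.append((start, end))          # unused retry reservation
        cursor = max(cursor, end)
    if cursor < hyperperiod:
        slots.append((cursor, hyperperiod))
    return slots


def place_ncts(slots, ncts):
    """First-fit NCTs (name, wcet) into the slots. Returns (placements, unplaced)."""
    slots = [list(s) for s in slots]
    placed, unplaced = {}, []
    for name, wcet in ncts:
        for slot in slots:
            if slot[1] - slot[0] >= wcet:
                placed[name] = slot[0]
                slot[0] += wcet
                break
        else:
            unplaced.append(name)
    return placed, unplaced


timeline = schedule_critical([("t1", 0, 10, 2), ("t2", 0, 6, 2)])
slots = free_slots(timeline, hyperperiod=12)
print(timeline)
print(slots)
print(place_ncts(slots, [("n1", 2), ("n2", 3), ("n3", 1), ("n4", 5)]))
```

In this run the short NCTs fit into the retry slots, while the longest one finds no slot of sufficient length and stays unplaced. The sketch does not merge adjacent slots, which a real scheduler might do.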

If a fault ( transient or permanent ) is detected by the hardware comparator, the non-critical tasks in the retry slots are pre-empted and the corresponding critical task is executed again in lock-step mode.

In relaxed dedication, the effectiveness also depends on the task sizes ( execution times ) of NCTs and CTs. When the ratio of CT size to NCT size is greater than one, the NCTs are smaller, which means that they can easily be scheduled in the retry or idle slots. This increases the workload of the CTRs and improves performance. If the ratio is less than one, the non-critical tasks are larger, which makes it difficult to fit them into the empty slots of the static schedule of the critical tasks.

Nevertheless, experiments and studies have shown that relaxed dedication significantly increases the cycles available for NCT execution. For example, relaxed dedication provides 73 % more cycles ( on average ) for NCTs in a DMR architecture [ 2 ] .

3.2. Distributed Temporal Redundancy

Relaxed dedication tries to improve upon the shortcomings of strict dedication by relaxing the requirement to execute NCTs only on NCTRs and scheduling them together with CTs. Distributed temporal redundancy ( DTR ) is another methodology which can likewise be employed to improve performance.

Distributed temporal redundancy relaxes two requirements of the traditional lock-step architecture:

• that the critical tasks are executed in lock-step, and

• that the critical tasks execute on critical task resources only,

provided that this relaxation still satisfies the timing constraints ( deadlines ) of the hard real-time tasks of the system.

This relaxation gives a great advantage in the performance and fault tolerance of the system:

• When the critical tasks are released from lock-step, they can be co-scheduled with the NCTs, which ensures better workload distribution among all cores.

• DTR can also provide a fault localization mechanism. Consider a dual modular redundant system: when a mismatch is detected on comparing the results of a critical task pair, the scheduler can initiate execution of that particular critical task on a third resource. The result from the third resource can be compared with the previous results, which helps to identify the faulty component.

This implies that the system offers the reliability and fault tolerance of a TMR architecture at no additional cost ( of having another critical resource ) .
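The localization step can be sketched as a comparison followed by an on-demand third execution. The function name `localize_fault`, the resource labels `c0` and `c1`, and the callable used to model the third resource are assumptions made for this example.

```python
def localize_fault(result_c0, result_c1, rerun_on_third):
    """Compare a critical task pair and localize a fault on demand.

    rerun_on_third is a callable that re-executes the task on a third
    resource; it is only invoked when the first two results disagree.
    Returns (committed_value, suspected_faulty_resource).
    """
    if result_c0 == result_c1:
        return result_c0, None
    third = rerun_on_third()
    if third == result_c0:
        return third, "c1"
    if third == result_c1:
        return third, "c0"
    return None, "unknown"   # no two results agree; nothing can be committed


print(localize_fault(4, 4, lambda: 4))
print(localize_fault(4, 5, lambda: 4))
print(localize_fault(1, 2, lambda: 3))
```

The third execution only costs time when a mismatch actually occurs, which is why the TMR-like masking comes without a permanently reserved third critical resource.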

3.2.1. Scheduling chances

In a traditional DMR system, the undertakings are being scheduled as shown in Figure 13.

Figure 13: Agenda in a DMR system, critical undertakings ( ruddy ) with retry reserves ( dashed red ) [ 3 ] .

The Texas indicates a critical undertaking case scheduled and rtx is the corresponding retry case of the undertaking.

The DTR approach applies the concept of temporal redundancy, which means that two copies of the same task are scheduled at two different times on the resources. It also relies on relaxing the lock-step. This in turn helps to co-schedule NCTs and CTs together. The main advantage is that the schedule can be derived based on optimisation and cost considerations. Joint scheduling gives the scheduler the opportunity for better utilisation of time slots and sharing of peripherals, and it increases the workload of the CTRs. With high utilisation of the CTRs it would be possible to save the chip area allotted to non-critical resources.

A bottleneck of the relaxed dedication approach was that the larger NCTs (those with longer execution times) were not able to use the smaller idle or retry slots in the critical resource's schedule. The DTR approach solves this problem since the lock-step requirement has been relaxed and co-scheduling is employed.

The schedules for a system using DTR are drawn up in such a way that the relaxation of lock-step does not affect the deadlines of the critical tasks.

Figure 14 shows a schedule using the DTR approach.

Figure 14: Relaxed lock-step schedule [3].

The DTR approach would help to schedule another set of NCTs, tripling their throughput. As seen in Figure 14, the critical instances t2 and t1,2 are scheduled on critical resources c0 and c1 without using lock-step. It can also be seen that NCTs requiring larger execution times can be scheduled on CTRs. The retry slots rt1,1, rt1,2 and rt2 have been reserved for execution on the non-critical resource nc0 in case a mismatch from a critical task pair is reported. In that event, the non-critical tasks running at that instant would be pre-empted.

The implementation of DTR requires some changes in the hardware. Since the tasks are not executed in lock-step, buffers have to be introduced to store the latest executed task result. The buffered value has to be retained until all copies of the task have been executed. In the task set above, the results of t1,2 and t2 have to be buffered until their copies have executed on another resource.
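The buffering rule above (retain a result until every copy of the task has executed, then compare) can be modelled in software as a minimal sketch. The class name `ResultBuffer`, its methods and the use of plain integers as results are assumptions made for illustration; the actual mechanism in [3] is implemented in hardware.

```python
from typing import Dict, List, Optional


class ResultBuffer:
    """Holds task results until every redundant copy has reported."""

    def __init__(self, copies: int = 2) -> None:
        self.copies = copies  # number of redundant executions per task
        self._pending: Dict[str, List[int]] = {}

    def report(self, task_id: str, result: int) -> Optional[bool]:
        """Store the result of one copy of a task.

        Returns None while copies are still outstanding. Once the last copy
        reports, the entry is released and the method returns True if all
        results match, or False on a mismatch.
        """
        entries = self._pending.setdefault(task_id, [])
        entries.append(result)
        if len(entries) < self.copies:
            return None
        del self._pending[task_id]
        return all(r == entries[0] for r in entries)

    def in_flight(self) -> int:
        """Number of tasks still waiting for a comparison."""
        return len(self._pending)


buf = ResultBuffer(copies=2)
first = buf.report("t2", 42)   # first copy finishes; nothing to compare yet
waiting = buf.in_flight()      # one task is held in the buffer
second = buf.report("t2", 42)  # second copy finishes; results are compared
```

The value returned by `in_flight()` corresponds to the number of tasks awaiting comparison, which is the quantity that determines how large the buffer has to be.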

3.2.2. Design considerations for implementing DTR in a fault-tolerant architecture

The DTR concept can be ported to the lock-step architectures discussed previously. In the lock-step architecture shown in Figure 15, two CPUs (c0 and c1) run the same code together and a hardware comparator monitors both the address and data outputs before they are sent to the memory subsystem.

Figure 15: Cores executing in lock-step; data and addresses are compared before being sent to the memory subsystem [3].

When DTR is applied to the system, lock-step is relaxed, so that the critical tasks execute at different times on the resources. Buffers are therefore required to store the previous task results. However, there can be a lot of data, which may lead to a high storage requirement and increased cost. To reduce the amount of data to be stored, the fingerprinting methodology could be used [14].

The fingerprinting method relies on a cyclic redundancy check (CRC) to compress and store the changes in architectural registers, memory values, and load and store addresses. A CRC has very good error detection capability; for example, a 16-bit CRC has a probability of 0.99998 of detecting an error. Though fingerprinting takes time for error detection, the scheduling algorithm ensures that the calculations complete before the task deadline.
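The fingerprint accumulation can be illustrated with a minimal sketch, assuming each state update is serialised as a one-letter kind, a 32-bit address and a 32-bit value. The record layout, the initial value 0xFFFF and the names `Fingerprint` and `fingerprint_of` are assumptions for illustration; the CRC-CCITT routine `binascii.crc_hqx` from the Python standard library stands in for whatever 16-bit polynomial the hardware in [14] uses.

```python
import binascii
from typing import Iterable, Tuple

Update = Tuple[str, int, int]  # (kind, address or register number, value)


class Fingerprint:
    """Accumulates state updates into one running 16-bit CRC."""

    def __init__(self) -> None:
        self.value = 0xFFFF  # assumed initial value

    def update(self, kind: str, address: int, data: int) -> None:
        # Serialise one update and fold it into the running CRC.
        record = (
            kind.encode("ascii")
            + address.to_bytes(4, "big")
            + data.to_bytes(4, "big")
        )
        self.value = binascii.crc_hqx(record, self.value)


def fingerprint_of(updates: Iterable[Update]) -> int:
    fp = Fingerprint()
    for kind, address, data in updates:
        fp.update(kind, address, data)
    return fp.value


# Two executions of the same task: the second suffers a single bit flip
# in a register write ("R"), followed by the same store ("S").
good = [("R", 3, 0x10), ("S", 0x2000, 0x10)]
bad = [("R", 3, 0x11), ("S", 0x2000, 0x10)]
mismatch_detected = fingerprint_of(good) != fingerprint_of(bad)

# If a corrupted run is assumed to produce a uniformly random fingerprint,
# it escapes detection with probability 2**-16.
detection_probability = 1 - 2 ** -16
```

Only the 16-bit fingerprint has to be buffered per task copy, regardless of how many updates were folded into it. The figure of 0.99998 quoted above is consistent with 1 - 2^-16 under the uniformity assumption stated in the code comment.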

Figure 16 depicts the fingerprint approach adopted in a multi-core system using DTR. The fingerprints are collected from each task and buffered. When all copies of the task have been executed, the buffered values are compared by the hardware comparator. If the comparison results in an exact match, one copy of the changes to external state (due to the task) can be released as output. Otherwise a third execution of the task can be scheduled on another resource. Upon completion of the third execution, the previously buffered values can be compared with the latest one and the faulty component can be identified.

Figure 16: Changes to registers, memory and memory addresses are accumulated in a single CRC fingerprint; fingerprint comparison has a high probability of detecting failures independent of the accumulated changes [3].

The buffers themselves could be subject to soft errors and permanent faults, and hence their content could be protected with error-correcting codes (ECC).

The NCTs are not fingerprinted or buffered when they execute on critical resources. Interference between NCTs and CTs can be prevented with memory protection schemes and with the help of the scheduling algorithm.

3.2.3. Performance estimation and comparison of results for DTR and RD

The performance of DTR is estimated with the help of simulated annealing and iterative scheduling. The simulated annealing technique permutes the assignment of NCTs to the resources and then uses iterative scheduling to schedule as many NCTs as possible. The experimental setup assumes M pairs of critical resources and N pairs of non-critical resources, a finite set of periodic CTs and an infinite set of NCTs.

The annealer builds up the assignment tree by permuting the possible assignments of NCTs to processors/resources. Figure 17 shows an assignment tree of NCTs for two resources.

Figure 17: Performance estimation assignment tree for two resources [3].

The tree has different levels; for example, level L0 implies a zero assignment while L1 shows two task assignments on two different resources. Nodes 11, 16, 17 and 4 are called terminal assignments, as any assignment after them cannot be scheduled: the list scheduler would produce an invalid schedule. That is, no more NCTs can be scheduled without compromising task deadlines. The annealer tries to find the optimal terminal assignment, which gives the highest resource utilisation.
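The notion of a terminal assignment can be made concrete with a small sketch. It replaces the simulated annealer of [3] with exhaustive enumeration, and replaces the list scheduler with a much simpler stand-in test (the assigned NCT cycles must not exceed the free cycles on each resource). The NCT lengths, the free-cycle budgets and all function names are assumptions chosen purely for illustration.

```python
from typing import List, Set, Tuple

# One tuple per resource, holding the lengths of the NCTs assigned to it.
Assignment = Tuple[Tuple[int, ...], ...]


def schedulable(assignment: Assignment, free: List[int]) -> bool:
    # Stand-in for the list scheduler: work must fit into the free cycles.
    return all(sum(tasks) <= cap for tasks, cap in zip(assignment, free))


def terminal_assignments(nct_lengths: List[int], free: List[int]) -> List[Assignment]:
    """Enumerate assignments to which no further NCT can be added."""
    terminals: Set[Assignment] = set()
    seen: Set[Assignment] = set()

    def expand(node: Assignment) -> None:
        if node in seen:
            return
        seen.add(node)
        children = []
        for r in range(len(free)):
            for length in nct_lengths:
                grown = tuple(sorted(node[r] + (length,)))
                child = node[:r] + (grown,) + node[r + 1:]
                if schedulable(child, free):
                    children.append(child)
        if not children:
            terminals.add(node)  # nothing more fits: terminal assignment
        for child in children:
            expand(child)

    expand(tuple(() for _ in free))  # root of the tree: zero assignment
    return sorted(terminals)


def utilization(assignment: Assignment, free: List[int]) -> float:
    return sum(sum(tasks) for tasks in assignment) / sum(free)


free_cycles = [5, 4]  # assumed free cycles on two resources
terminals = terminal_assignments([2, 3], free_cycles)
best = max(terminals, key=lambda a: utilization(a, free_cycles))
```

Exhaustive enumeration is only practical for toy inputs like this one; an annealer explores the same tree stochastically when it is too large to enumerate.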

When comparing the performance of DTR and RD, relaxed dedication is limited by the relative sizes of the critical and non-critical tasks. For example, when the DMR system had a critical task size larger than the non-critical task size, it fared better. This was because the NCTs needed to be small enough to fit into the retry/idle slots.

A system using DTR does not depend on relative task lengths, since CTs and NCTs are co-scheduled. It fares better than RD in terms of consistency and the number of cycles used for NCT execution. Experiments have shown that DTR could use 93% of the theoretical cycles for NCT execution and that its performance gain over RD is about 11% on average [3].

DTR can be used only when relaxing lock-step for the critical tasks does not affect their hard deadlines. The number of tasks waiting for comparison ("in-flight tasks") is a cost concern, since it determines the buffer size and the comparison logic, both of which are implemented in hardware. Co-scheduling CTs and NCTs increases the complexity of the scheduling algorithm, but several algorithms are available today that can handle this complexity. For example, algorithms such as the polling server and the total bandwidth algorithm can schedule aperiodic tasks among periodic critical tasks without affecting the deadlines.
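As an illustration of the last point, the deadline assignment rule of the total bandwidth server can be sketched as follows. Each aperiodic request k, released at time r_k with worst-case execution time C_k, receives the deadline d_k = max(r_k, d_(k-1)) + C_k / U_s, where U_s is the processor share reserved for aperiodic work and d_0 = 0. The function name and the example numbers are assumptions; the sketch also assumes that the system is scheduled by earliest deadline first and that the periodic utilisation plus U_s does not exceed 1.

```python
from typing import List, Tuple


def tbs_deadlines(
    requests: List[Tuple[float, float]], server_utilization: float
) -> List[float]:
    """Assign total-bandwidth-server deadlines to aperiodic requests.

    `requests` is a list of (release_time, worst_case_execution_time)
    pairs in release order.
    """
    if not 0 < server_utilization <= 1:
        raise ValueError("server utilization must be in (0, 1]")
    deadlines: List[float] = []
    previous = 0.0  # d_0
    for release, wcet in requests:
        previous = max(release, previous) + wcet / server_utilization
        deadlines.append(previous)
    return deadlines


# Assumed example: periodic critical tasks use 75% of a resource,
# so 25% is left as server bandwidth for aperiodic tasks.
deadlines = tbs_deadlines([(3, 1), (9, 2), (14, 1)], 0.25)
```

The aperiodic requests are then scheduled by deadline alongside the periodic tasks, so the bandwidth they consume never exceeds the reserved share.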

4. Conclusion

The trend towards scaling down in current silicon technology makes electronic components vulnerable to permanent and soft errors. As the automotive industry uses electronics for many of its safety applications, fault tolerance of the system is important.

This report introduces the safety requirements in the automotive sector and focuses on the available dual-core hardware architectures which can be used to ensure system reliability. There are different multi-core solutions and, based on the deciding factors of cost, fault tolerance and computation power, a suitable architecture can be selected. The SM dual lock-step architecture has been found to be promising in terms of providing fault tolerance and fault localisation and delivering high computational power when needed.

A multi-core architecture that ensures fault tolerance brings a new opportunity to increase computational power and reduce cost. This is important as the automotive industry is cost driven, so fault tolerance must not be an expensive option. The latter part of the report focuses on the scheduling mechanisms which could be adopted to use the resources (processor cores) efficiently. The relaxed dedication method increases the workload of critical resources and ensures better throughput for non-critical tasks. This method works well with systems having truly hard deadlines for their tasks. For systems with less strict requirements, temporal redundancy can be employed. One such method, the distributed temporal redundancy technique, is an improvement over relaxed dedication. It would give better performance and more cycles for NCT execution, and with the help of the scheduling algorithm it also provides fault tolerance similar to that of a TMR architecture.