

Document Identifier: DSP2061

Date: 2023-06-13

Version: 1.0.0

# PLDM Accelerator Modeling

Supersedes: None

Document Class: Informational

Document Status: Published

Document Language: en-US

**Copyright Notice**

Copyright © 2013, 2015, 2017, 2019, 2023 DMTF. All rights reserved.

DMTF is a not-for-profit association of industry members dedicated to promoting enterprise and systems management and interoperability. Members and non-members may reproduce DMTF specifications and documents, provided that correct attribution is given. As DMTF specifications may be revised from time to time, the particular version and release date should always be noted.

Implementation of certain elements of this standard or proposed standard may be subject to third-party patent rights, including provisional patent rights (herein "patent rights"). DMTF makes no representations to users of the standard as to the existence of such rights, and is not responsible to recognize, disclose, or identify any or all such third-party patent rights, owners, or claimants, nor for any incomplete or inaccurate identification or disclosure of such rights, owners, or claimants. DMTF shall have no liability to any party, in any manner or circumstance, under any legal theory whatsoever, for failure to recognize, disclose, or identify any such third-party patent rights, or for such party's reliance on the standard or incorporation thereof in its product, protocols or testing procedures. DMTF shall have no liability to any party implementing such standard, whether such implementation is foreseeable or not, nor to any patent owner or claimant, and shall have no liability or responsibility for costs or losses incurred if a standard is withdrawn or modified after publication, and shall be indemnified and held harmless by any party implementing the standard from any and all claims of infringement by a patent owner for such implementations.

PCI-SIG, PCIe, and the PCI HOT PLUG design mark are registered trademarks or service marks of PCI-SIG.

All other marks and brands are the property of their respective owners.

For information about patents held by third parties which have notified the DMTF that, in their opinion, such patents may relate to or impact implementations of DMTF standards, visit https://www.dmtf.org/about/policies/disclosures.php.

This document's normative language is English. Translation into other languages is permitted.

# CONTENTS

- Foreword
- Introduction
  - Document conventions
    - Typographical conventions
    - ABNF usage conventions
- 1 Scope
- 2 Normative references
- 3 Terms and definitions
- 4 Symbols and abbreviated terms
- 5 PLDM Accelerator Modeling overview
  - 5.1 General
  - 5.2 Model elements
    - 5.2.1 PLDM terminus
    - 5.2.2 Accelerator card
    - 5.2.3 Accelerator
    - 5.2.4 Memory
    - 5.2.5 Inter-Accelerator card connection
  - 5.3 Model sensors
    - 5.3.1 General
    - 5.3.2 Accelerator card temperature sensor
    - 5.3.3 Accelerator card power sensor
    - 5.3.4 Accelerator card fan speed sensor
    - 5.3.5 Accelerator card voltage sensor
    - 5.3.6 Accelerator card auxiliary device temperature sensor
    - 5.3.7 Accelerator card auxiliary device health sensor
    - 5.3.8 Accelerator card composite state sensor
    - 5.3.9 Accelerator temperature sensor
    - 5.3.10 Accelerator power sensor
    - 5.3.11 Accelerator composite state sensor
    - 5.3.12 Accelerator clock speed sensor
    - 5.3.13 Memory temperature sensor
    - 5.3.14 Memory error statistics
    - 5.3.15 Memory composite state sensor
  - 5.4 Hierarchy description of the Accelerator card model elements
    - 5.4.1 General
    - 5.4.2 Physical entities association
    - 5.4.3 Logical entity association
    - 5.4.4 Sensor association
  - 5.5 Relevant PLDM Type IDs
  - 5.6 Enumeration
    - 5.6.1 General
    - 5.6.2 Enumeration scheme
  - 5.7 Model illustration
    - 5.7.1 General
    - 5.7.2 Accelerator Card
    - 5.7.3 Accelerator
    - 5.7.4 Memory
  - 5.8 Events
    - 5.8.1 General
    - 5.8.2 Accelerator firmware version change
    - 5.8.3 Health and state sensors events notifications
- 6 Model use example
  - 6.1 General
  - 6.2 Model hierarchy
  - 6.3 Top-level TID
  - 6.4 Accelerator card
    - 6.4.1 General
    - 6.4.2 Accelerator card power sensor
    - 6.4.3 Accelerator card temperature sensor
    - 6.4.4 Accelerator card fan speed sensor
    - 6.4.5 Accelerator card voltage sensor
    - 6.4.6 Accelerator card auxiliary device temperature sensor
    - 6.4.7 Accelerator card auxiliary device health sensor
    - 6.4.8 Accelerator card composite state sensor
  - 6.5 Accelerator
    - 6.5.1 General
    - 6.5.2 Accelerator temperature sensor
    - 6.5.3 Accelerator power sensor
    - 6.5.4 Accelerator composite state sensor
    - 6.5.5 Accelerator clock speed sensor
  - 6.6 Memory
    - 6.6.1 General
    - 6.6.2 Memory temperature sensor
    - 6.6.3 Memory error statistics sensors
    - 6.6.4 Memory composite state sensor
- ANNEX A (informative) Notation and conventions
- ANNEX B (informative) Change log

# **Figures**

- Figure 1 – Inter-Accelerator card connection
- Figure 2 – Accelerator card PLDM model diagram
- Figure 3 – Hierarchy description using containerEntityContainerID referencing the containedEntityContainerID
- Figure 4 – Defining a logical association
- Figure 5 – Top-level sensor association
- Figure 6 – Example model diagram
- Figure 7 – Accelerator card model hierarchy
- Figure 8 – Accelerator card level elements
- Figure 9 – Accelerator card container PDR
- Figure 10 – Accelerator card power sensor PDR
- Figure 11 – Ambient Temperature sensor PDR
- Figure 12 – Accelerator card fan speed sensor PDR
- Figure 13 – Accelerator card voltage sensor PDR
- Figure 14 – Auxiliary device temperature sensor PDR
- Figure 15 – Auxiliary device health sensor PDR
- Figure 16 – Accelerator card composite state sensor PDR
- Figure 17 – Example model Accelerator
- Figure 18 – Accelerator entity association PDR
- Figure 19 – Accelerator temperature sensor PDR
- Figure 20 – Accelerator power sensor PDR
- Figure 21 – Accelerator composite state sensor PDR
- Figure 22 – Accelerator card clock speed sensor PDR
- Figure 23 – Example Memory model
- Figure 24 – Memory association PDR
- Figure 25 – Memory temperature sensor PDR
- Figure 26 – Memory correctable errors PDR
- Figure 27 – Memory uncorrectable errors PDR
- Figure 28 – Memory composite state sensor PDR

# **Tables**

- Table 1 – Type IDs used in the Accelerator card model
- Table 2 – Chosen enumeration limits in the model
- Table 3 – Example Enumeration Scheme with Type IDs
- Table 4 – TID PDR

# **Foreword**

The PLDM Accelerator Modeling (DSP2061) document was prepared by the Platform Management Communications Infrastructure (PMCI) Working Group of the DMTF.

DMTF is a not-for-profit association of industry members dedicated to promoting enterprise and systems management and interoperability. For information about the DMTF, see https://www.dmtf.org.

**Acknowledgments**

The DMTF acknowledges the following individuals for their contributions to this document:

**Editors:**

- Rama Rao Bisa – Dell Technologies
- Pavan Kumar Gavvala – Dell Technologies

**Contributors:**

- Bob Stevens – Dell Technologies
- Hemal Shah – Broadcom Inc.
- Patrick Caporale – Lenovo
- Yuval Itkin – Nvidia
- Eliel Louzoun – Intel Corporation
- Ryan Weldon – Groq
- Deepak Kodihalli – Nvidia
- Pierre-Philippe Stevens – Advanced Micro Devices
- Michael Garner – Meta

# **Introduction**

This document describes a modeling scheme for an Accelerator card using PLDM for Platform Monitoring and Control (DSP0248) semantics.

**Document conventions**

**Typographical conventions**

The following typographical conventions are used in this document:

- Document titles are marked in italics.
- Important terms that are used for the first time are marked in italics.
- Terms include a link to the term definition in the "Terms and definitions" clause, enabling easy navigation to the term definition.
- ABNF rules are in monospaced font.

**ABNF usage conventions**

Format definitions in this document are specified using ABNF (see RFC5234), with the following deviations:

- Literal strings are to be interpreted as case-sensitive Unicode characters, as opposed to the definition in RFC5234 that interprets literal strings as case-insensitive US-ASCII characters.

**Reserved and unassigned values**

Unless otherwise specified, any reserved, unspecified, or unassigned values in enumerations or other numeric ranges are reserved for future definition by the DMTF.

Unless otherwise specified, numeric or bit fields that are designated as reserved shall be written as 0 (zero) and ignored when read.

**Byte ordering**

Unless otherwise specified, byte ordering of multibyte numeric fields or bit fields is "Big Endian" (that is, the lower byte offset holds the most-significant byte, and higher offsets hold less-significant bytes).

**Other Conventions**

See ANNEX A for other conventions.
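
The reserved-value and byte-ordering conventions stated above can be illustrated with a minimal Python sketch. The field layout (a 32-bit value followed by a flags byte), the function names, and the reserved-bit mask are hypothetical and chosen only for illustration; they are not taken from this document or from any PLDM message definition. The conventions apply only "unless otherwise specified", so a referenced specification may define a different byte order for its own fields.

```python
import struct
from typing import Tuple

# Hypothetical layout: the upper four bits of the flags byte are reserved.
RESERVED_MASK = 0xF0


def encode_example(value: int, flags: int) -> bytes:
    """Pack a 32-bit value and a flags byte in Big Endian order.

    Reserved bits are written as 0, as the convention requires.
    """
    defined_flags = flags & ~RESERVED_MASK & 0xFF
    # ">" selects Big Endian: the lowest offset holds the most-significant byte.
    return struct.pack(">IB", value, defined_flags)


def decode_example(data: bytes) -> Tuple[int, int]:
    """Unpack the same layout, ignoring reserved bits when read."""
    value, flags = struct.unpack(">IB", data)
    return value, flags & ~RESERVED_MASK & 0xFF
```

In this sketch, encoding the value 0x12345678 places the byte 0x12 at offset 0 and 0x78 at offset 3, and any reserved flag bits supplied by the caller are cleared before transmission.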


# 1 Scope

This document defines an example data model for implementing the systems management of accelerators using PLDM for Platform Monitoring and Control (DSP0248) semantics. This document establishes a common framework that can provide implementation consistency between a system's Management Controller and the accelerators and accelerator cards that the system contains, focusing on FPGAs, GPUs, and similar devices that offload processing from the host CPU. This data model is assumed to be extensible to a variety of physical implementations and should not be construed to be limited to the examples herein.

Accelerator and Accelerator card implementations may include ancillary features, such as networking and storage, that have management schemas defined in other data models and specifications. The management of those features is outside the scope of this data model. The data model provided here focuses on the management of the accelerator features of the card, although composite sensors that return overall card status, for example, may include metadata from those other functional areas. For instance, it may be appropriate to use either DSP2054 or DSP0222 for the management of networking features that may be included on the accelerator or card.

# 2 Normative references

The following referenced documents are indispensable for the application of this document. For dated or versioned references, only the edition cited (including any corrigenda or DMTF update versions) applies. For references without a date or version, the latest published edition of the referenced document (including any corrigenda or DMTF update versions) applies. Unless otherwise specified, for DMTF documents this means any document version that has minor or update version numbers that are later than those for the referenced document. The major version numbers must match the major version number given for the referenced document.

- DMTF DSP0222, Network Controller Sideband Interface (NC-SI) Specification 1.1, https://www.dmtf.org/sites/default/files/standards/documents/DSP0222_1.1.0.pdf
- DMTF DSP0236, MCTP Base Specification 1.3, https://www.dmtf.org/sites/default/files/standards/documents/DSP0236_1.3.0.pdf
- DMTF DSP0240, Platform Level Data Model (PLDM) Base Specification 1.1, https://www.dmtf.org/sites/default/files/standards/documents/DSP0240_1.1.0.pdf
- DMTF DSP0241, Platform Level Data Model (PLDM) Over MCTP Binding Specification 1.0, https://www.dmtf.org/sites/default/files/standards/documents/DSP0241_1.0.0.pdf
- DMTF DSP0245, Platform Level Data Model (PLDM) IDs and Codes Specification 1.3, https://www.dmtf.org/sites/default/files/standards/documents/DSP0245_1.3.0.pdf
- DMTF DSP0248, Platform Level Data Model (PLDM) for Platform Monitoring and Control Specification 1.2, https://www.dmtf.org/sites/default/files/standards/documents/DSP0248_1.2.0.pdf
- DMTF DSP0249, Platform Level Data Model (PLDM) State Sets Specification 1.1, https://www.dmtf.org/sites/default/files/standards/documents/DSP0249_1.1.0.pdf
- DMTF DSP0257, Platform Level Data Model (PLDM) FRU Data Specification 1.0, https://www.dmtf.org/sites/default/files/standards/documents/DSP0257_1.0.0.pdf
- DMTF DSP0267, Platform Level Data Model (PLDM) for Firmware Update Specification 1.1, https://www.dmtf.org/sites/default/files/standards/documents/DSP0267_1.1.0.pdf
- DMTF DSP2054, Platform Level Data Model (PLDM) NIC Modeling Specification 1.0, https://dmtf.org/sites/default/files/standards/documents/DSP2054_1.0.0.pdf
- IETF RFC2781, UTF-16, an encoding of ISO 10646, February 2000, https://www.ietf.org/rfc/rfc2781.txt
- IETF STD63, UTF-8, a transformation format of ISO 10646, https://www.ietf.org/rfc/std/std63.txt
- IETF RFC4122, A Universally Unique Identifier (UUID) URN Namespace, July 2005, https://www.ietf.org/rfc/rfc4122.txt
- IETF RFC4646, Tags for Identifying Languages, September 2006, https://www.ietf.org/rfc/rfc4646.txt
- ISO 8859-1, Final Text of DIS 8859-1, 8-bit single-byte coded graphic character sets Part 1: Latin alphabet No.1, February 1998
- ISO/IEC Directives, Part 2, Rules for the structure and drafting of International Standards, https://www.iso.org/sites/directives/current/part2/index.xhtml
- IETF RFC5234, ABNF: Augmented BNF for Syntax Specifications, January 2008, https://tools.ietf.org/html/rfc5234
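
The version-matching rule for DMTF documents stated at the beginning of this clause can be read as a simple comparison. The following Python sketch shows one possible reading, assuming version strings of the form `major.minor.update`; the function names are hypothetical, and the sketch treats the referenced version itself as acceptable.

```python
from typing import Tuple

Version = Tuple[int, int, int]  # (major, minor, update)


def parse_version(text: str) -> Version:
    """Split a 'major.minor.update' string into integers."""
    major, minor, update = (int(part) for part in text.split("."))
    return major, minor, update


def satisfies_reference(candidate: str, referenced: str) -> bool:
    """Return True if `candidate` may stand in for the `referenced` version.

    The major versions must match, and the (minor, update) pair of the
    candidate must be the same as or later than that of the reference.
    """
    cand = parse_version(candidate)
    ref = parse_version(referenced)
    return cand[0] == ref[0] and cand[1:] >= ref[1:]
```

Under this reading, a reference to version 1.2.0 of a DMTF document is satisfied by 1.2.1 or 1.3.0, but not by 1.1.5 or 2.0.0.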

# 3 Terms and definitions

In this document, some terms have a specific meaning beyond the normal English meaning. Those terms are defined in this clause.

The terms "shall" ("required"), "shall not", "should" ("recommended"), "should not" ("not recommended"), "may", "need not" ("not required"), "can" and "cannot" in this document are to be interpreted as described in ISO/IEC Directives, Part 2, Clause 7. The terms in parentheses are alternatives for the preceding term, for use in exceptional cases when the preceding term cannot be used for linguistic reasons. Note that ISO/IEC Directives, Part 2, Clause 7 specifies additional alternatives. Occurrences of such additional alternatives shall be interpreted in their normal English meaning.

The terms "clause", "subclause", "paragraph", and "annex" in this document are to be interpreted as described in ISO/IEC Directives, Part 2, Clause 6.

The terms "normative" and "informative" in this document are to be interpreted as described in ISO/IEC Directives, Part 2, Clause 3. In this document, clauses, subclauses, or annexes labeled "(informative)" do not contain normative content. Notes and examples are always informative elements.

Refer to DSP0240 for terms and definitions that are used across the PLDM specifications.

# 4 Symbols and abbreviated terms

Refer to DSP0240 and DSP0248 for symbols and abbreviated terms that are used across the PLDM specifications. For the purposes of this document, the following additional symbols and abbreviated terms apply.

**4.1 PCB**

Printed Circuit Board

**4.2 FPGA**

Field Programmable Gate Array

**4.3 GPU**

Graphics Processing Unit

# 5 PLDM Accelerator Modeling overview

## 5.1 General

This document describes a hierarchical modeling scheme for an Accelerator card using PLDM for Platform Monitoring and Control (DSP0248) semantics. The model is scalable, allowing consistent modeling of Accelerator cards with different configuration options such as the number of Accelerators. While PLDM for Platform Monitoring and Control (DSP0248) is a published standard, using the model defined in this document simplifies interoperability by establishing a consistent schema.

The basic format that is used for sending PLDM messages is defined in DSP0240. The format that is used for carrying PLDM messages over a transport-layer protocol and medium is given in companion documents to the base specification. For example, DSP0241 defines how PLDM messages are formatted and sent using MCTP as the transport.

The model supports the following:

- Consistent modeling of an Accelerator card regardless of the specific configuration and resource count
- Accelerator card hardware structure description
- Reporting of configuration changes such as a firmware update

#### 298 5.2 Model elements

#### 299 **5.2.1 PLDM terminus**

PLDM for Platform Monitoring and Control <u>DSP0248</u> defines a single root for every device instance, referred to as a PLDM terminus and identified with a TID. Throughout this document, the term "MC" identifies a PLDM terminus which communicates with an Accelerator card.

When there are multiple Accelerators assembled on the same card, there may be a single Accelerator which reports all the sensors of all the elements on the Accelerator card to the MC. Alternatively, each Accelerator in the Accelerator card may present a separate PLDM terminus.

PLDM for Platform Monitoring and Control <u>DSP0248</u> does not allow associating components reported via different PLDM termini since every database is relative to a given PLDM terminus. To overcome this constraint, implementers can retrieve a globally unique ID (Board part number and serial number) from each TID and recognize that these TIDs belong to the same Accelerator card. The process to retrieve the globally unique ID (Board part number and serial number) from each TID is outside the scope of this document.

All PLDM IDs specified by the model in this document shall be consistent across all TIDs on a given card. This avoids conflicts from duplicate IDs in the combined model, which is generated by merging the TID-specific model elements reported as part of the overall model.
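
As an informative illustration, the grouping of TIDs by globally unique ID could be sketched as follows. The `TerminusInfo` type, its field names, and the sample part and serial numbers are hypothetical; how the MC obtains the board part number and serial number is outside the scope of this document.

```python
from collections import defaultdict
from typing import NamedTuple


class TerminusInfo(NamedTuple):
    """Identity data an MC has gathered for one PLDM terminus (hypothetical type)."""
    tid: int
    board_part_number: str
    board_serial_number: str


def group_termini_by_card(termini):
    """Map (part number, serial number) to the sorted TIDs found on that card."""
    cards = defaultdict(list)
    for terminus in termini:
        key = (terminus.board_part_number, terminus.board_serial_number)
        cards[key].append(terminus.tid)
    return {key: sorted(tids) for key, tids in cards.items()}


# Sample data is invented for illustration only.
termini = [
    TerminusInfo(tid=1, board_part_number="ACC-100", board_serial_number="SN0001"),
    TerminusInfo(tid=2, board_part_number="ACC-100", board_serial_number="SN0001"),
    TerminusInfo(tid=3, board_part_number="ACC-100", board_serial_number="SN0002"),
]
cards = group_termini_by_card(termini)
```

In this sketch, TIDs 1 and 2 share a part number and serial number and are therefore treated as belonging to the same Accelerator card, while TID 3 belongs to a different card.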

#### 5.2.2 Accelerator card

In this model, the Accelerator card is the top-level element of the hierarchy containing one or more Accelerators on a PCB. An Accelerator card is a hardware and software solution that offloads certain processing from the host processor. The Accelerator card in this document refers to various form factors and is represented with PLDM Entity ID code 68 for Add-in card. The Accelerator card may contain sensors.

#### 5.2.3 Accelerator

In this model, an Accelerator is the second-level element of the hierarchy containing one or more sensors. An Accelerator is a hardware device with a main function of offloading certain processing from the host processor. An Accelerator may contain sensors such as health state, power consumption, and temperature.

#### 5.2.4 Memory

The term "memory" in this document covers the internal memory of the Accelerator, memory chips installed on the PCB, and the DIMMs. In this model, the memory is at the second level of the hierarchy. A Memory may contain sensors such as temperature, health state, and error statistics.

#### 5.2.5 Inter-Accelerator card connection

The Accelerator cards may support communication with each other. Figure 1 depicts an Inter-Accelerator card connection; this connection may not be the only communication interface between Accelerator cards.



Figure 1 - Inter-Accelerator card connection

#### 5.3 Model sensors

#### 5.3.1 General

Attributes are reported by means of sensors. Numeric sensors are used to report specific measured attributes. State sensors report operational and/or health state. The default thresholds for all numeric sensors shall be set by the hardware vendor. The sensors can be associated with any entity such as the Accelerator card, Accelerator or Memory. The description of each sensor is applicable only for the implemented sensors and it is not mandatory to implement all the sensors described in this document. There may be auxiliary devices present on the accelerator card and each auxiliary device may present its own set of sensors.

Note: The Sensor Auxiliary Names PDR is recommended to provide the proper name of each sensor.



Figure 2 – Accelerator card PLDM model diagram

### 5.3.2 Accelerator card temperature sensor

The temperature sensor on the Accelerator card reports the card's ambient temperature and is represented using a numeric sensor. There may be multiple temperature sensors installed on the Accelerator card.

#### 5.3.3 Accelerator card power sensor

The power sensor on the Accelerator card reports the estimated or measured aggregate power consumption of the Accelerator card and is represented using a numeric sensor. An Accelerator card which cannot accurately report its real-time power consumption may report its estimated maximal power. When there are multiple Accelerators on the same Accelerator card, an Accelerator may have no visibility into the real-time information of the other Accelerators. For this reason, this sensor is only implemented when at least one of the following applies:

- There is only one Accelerator on the Accelerator card.
- A hardware sensor allows measuring and reporting the total card power consumption.
- The maximal estimated power is reported without being measured.
- The Accelerators can communicate with each other.

#### 5.3.4 Accelerator card fan speed sensor

The fan speed sensor on the Accelerator card reports the speed of an active cooling fan and is represented using a numeric sensor. An Accelerator card may have multiple fans installed, each potentially with its own speed sensor.

### 5.3.5 Accelerator card voltage sensor

The voltage sensors on the Accelerator card report various voltages on the card and are represented using numeric sensors. There may be multiple voltage sensors installed on the card.


#### 5.3.6 Accelerator card auxiliary device temperature sensor

The temperature sensor on the auxiliary device reports the ambient temperature of the auxiliary device and is represented using a numeric sensor. This document does not mandate having an auxiliary device temperature sensor.

#### 5.3.7 Accelerator card auxiliary device health sensor

The health sensor on the auxiliary device reports the health state of the auxiliary device and is represented using a state sensor. This document does not mandate having an auxiliary device health sensor.

#### 5.3.8 Accelerator card composite state sensor

The Accelerator card composite state sensor combines the Accelerator card thermal state sensor, the Memory operational fault state sensor, and the Accelerator card health state sensor. The Accelerator card health state is the aggregated health state of all the components on the card. The reported aggregated health state of the Accelerator card reflects the worst case of the reported health states for each of the elements monitored in the model. For example, if an Accelerator health state is non-critical and a memory health state is critical, then the Accelerator card health state may be set to critical in the Accelerator card composite state sensor.

When there are multiple Accelerators, an Accelerator may have no visibility into the real-time information of the other Accelerators. For this reason, this composite state sensor is only implemented when there is only a single Accelerator on the Accelerator card or when the Accelerator card has the needed visibility of all the components such as Accelerators and memory.

To determine the respective sensor states, the following steps shall be used: the Accelerator card thermal state sensor shall also reflect the auxiliary device temperature, and the Accelerator card health state sensor shall also reflect the auxiliary device health state.
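
The worst-case aggregation described above can be sketched as follows. This is an informative illustration only: the `Health` names and their numeric ranks are assumptions chosen so that a higher rank means a worse state, and they are not the state values defined in DSP0249.

```python
from enum import IntEnum


class Health(IntEnum):
    """Illustrative severity ranks (assumption); not the DSP0249 state values."""
    NORMAL = 0
    NON_CRITICAL = 1
    CRITICAL = 2
    FATAL = 3


def aggregate_health(component_states):
    """Return the worst (highest-ranked) health state among the components."""
    return max(component_states, default=Health.NORMAL)


# Accelerator is non-critical, memory is critical, auxiliary device is normal.
card_health = aggregate_health(
    [Health.NON_CRITICAL, Health.CRITICAL, Health.NORMAL]
)
```

With these inputs the aggregated card health is `Health.CRITICAL`, matching the example in the text. The auxiliary device health state is included in the input list because the card health state also reflects it.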

#### 5.3.9 Accelerator temperature sensor

The temperature sensor of the Accelerator reflects the device temperature and is represented using a numeric sensor. This sensor is typically located in the thermally sensitive areas on the Accelerator.

#### 5.3.10 Accelerator power sensor

The power sensor on the Accelerator reports the estimated or measured power consumption of the Accelerator and is represented using a numeric sensor. An Accelerator which cannot accurately report its real-time power consumption may report its estimated maximal power.

#### 5.3.11 Accelerator composite state sensor

The Accelerator composite state sensor combines the Accelerator Thermal trip state, Accelerator health state, Configuration valid state, Configuration change state, and Accelerator firmware version change state. The MC can use this sensor to identify issues with the Accelerator and to identify the specific maintenance operations that it needs to perform. These operations may include Accelerator reset, system-level shutdown for thermal protection, and other system-level maintenance.

Using the configuration change indication, the Accelerator notifies the MC to retrieve PDRs updated by the configuration change.

When a firmware update is detected, the composite state sensor can reflect this event to the MC, allowing the MC to take any action needed to respond to the update. Note that reading the new firmware version may be performed by the MC using protocols other than PLDM for Platform Monitoring and Control DSP0248, such as DSP0257 and/or DSP0267. Note also that the firmware update indication only reflects the conclusion of the firmware programming operation; it is device-specific whether this detection additionally implies that the new firmware is already active.

#### 5.3.12 Accelerator clock speed sensor

The clock speed sensor of the Accelerator is used to read the clock speed and is represented using numeric sensors. An Accelerator may have multiple clock domains, each with its own clock speed sensor.

### 5.3.13 Memory temperature sensor

The temperature sensors on the memory modules and internal memory report the memory temperatures and are represented using numeric sensors. There may be multiple memory temperature sensors installed on the internal memory, on the soldered memory, and on the DIMMs.

Memory that is soldered on the Accelerator card PCB may not have a temperature sensor on it. In this case, implementations may choose to have a temperature sensor near the soldered memory chips, calibrated to approximate the temperature of those memory devices.

### 5.3.14 Memory error statistics

The memory error statistics sensors report the memory error statistics (i.e., correctable errors and uncorrectable errors) and are represented using numeric sensors. Refer to the "sensorUnits enumeration" table in <u>DSP0248</u>.

#### 5.3.15 Memory composite state sensor

The memory composite state sensor combines sensors such as memory health state sensor, memory cache state sensor, memory error state sensor, and memory redundant activity state sensor. The MC can use this sensor to identify issues with the memory and to identify the specific maintenance operations that it needs to perform. Refer to Table 11 (Memory-Related State Sets) of DSP0249 for all memory-related sensors and their states.

### 5.4 Hierarchy description of the Accelerator card model elements

#### 5.4.1 General

PLDM Accelerator Modeling uses a hierarchical model. Refer to section 10 PLDM associations and section 11 Entity Association PDR of <u>DSP0248</u> to understand physical and logical associations.

#### 5.4.2 Physical entities association

Physical association is defined in <u>DSP0248</u> as a method to associate components which are physically connected to each other. The model uses this concept to describe the following structures:

- Content of the Accelerator card PCB
- Content of the Accelerators
- Content of the Memory Modules

A hierarchy entity is defined using an entity association PDR identified with a unique *containerID* identifier parameter. The entity association PDR's *containerEntityContainerID* references the PDR in which the entity is contained. This entity association PDR shall also contain the contained entities defined in DSP2054 for the elements shown inside the purple dotted line of Figure 2.

Figure 3 shows an example of how an Accelerator card entity association PDR references its container entity and contained entities:

#### **Accelerator card Entity Association PDR**

| Container ID  | 100  |
|---------------|------|
| Record Handle | 1100 |

| Container Entity              |    |                |
|-------------------------------|----|----------------|
| Entity Type                   | 68 | Add-in<br>card |
| Entity Instance Number        | 1  |                |
| Container Entity Container ID | 0  | System         |

| Association Type | Physical to Physical containment |
|------------------|----------------------------------|

| Contained Entity - Accelerator |     |                  |  |
|--------------------------------|-----|------------------|--|
| Entity Type                    | 149 | Accelerator      |  |
| Entity Instance Number         | 1   |                  |  |
| Container Entity Container ID  | 100 | Accelerator card |  |

| Contained Entity - Memory     |     |                  |  |
|-------------------------------|-----|------------------|--|
| Entity Type                   | 66  | Memory           |  |
| Entity Instance Number        | 1   |                  |  |
| Container Entity Container ID | 100 | Accelerator card |  |


Figure 3 – Hierarchy description using containerEntityContainerID referencing the containedEntityContainerID
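
As an informative illustration, the fields shown in Figure 3 could be held in a simple in-memory structure such as the one below. The class and field names are assumptions made for this sketch and do not reproduce the binary PDR layout defined in DSP0248; the numeric values are those shown in Figure 3.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Entity:
    """Entity identification triple (names are assumptions for this sketch)."""
    entity_type: int
    instance: int
    container_id: int


@dataclass
class EntityAssociationPDR:
    """Simplified view of an entity association PDR; not the wire format."""
    record_handle: int
    container_id: int
    association_type: str
    container: Entity
    contained: List[Entity] = field(default_factory=list)


def contained_entities_reference_container(pdr):
    """Check that every contained entity points back at this PDR's container ID."""
    return all(e.container_id == pdr.container_id for e in pdr.contained)


card_pdr = EntityAssociationPDR(
    record_handle=1100,
    container_id=100,
    association_type="physical-to-physical containment",
    container=Entity(entity_type=68, instance=1, container_id=0),  # Add-in card in System
    contained=[
        Entity(entity_type=149, instance=1, container_id=100),  # Accelerator
        Entity(entity_type=66, instance=1, container_id=100),   # Memory
    ],
)
consistent = contained_entities_reference_container(card_pdr)
```

The check mirrors the relationship in Figure 3: the contained Accelerator and Memory entities both carry container ID 100, which is the container ID of the Accelerator card PDR.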


### 5.4.3 Logical entity association

<u>DSP0248</u> defines logical association as a method to associate components which collectively form a shared property yet are not physically part of the same component. This model uses logical association to describe the following structures:

Figure 4 shows logical association between an Accelerator and a memory module:

| Channel #1 Entity Association PDR    |                     |                                                                     |
|--------------------------------------|---------------------|---------------------------------------------------------------------|
| Container ID                         | 900                 |                                                                     |
| Record Handle                        | 1180                |                                                                     |
| **Container Entity**                 |                     |                                                                     |
| Entity Type                          | 79                  | Processor/memory module (processor and memory together on a module) |
| Entity Instance Number               | 1                   |                                                                     |
| Container Entity Container ID        | 100                 | Accelerator card                                                    |
| Association Type                     | Logical containment |                                                                     |
| **Contained Entity - Accelerator**   |                     |                                                                     |
| Entity Type                          | 149                 | Accelerator                                                         |
| Entity Instance Number               | 1                   |                                                                     |
| Container Entity Container ID        | 100                 | Accelerator card                                                    |
| **Contained Entity - Memory Module** |                     |                                                                     |
| Entity Type                          | 66                  | Memory module                                                       |
| Entity Instance Number               | 1                   |                                                                     |
| Container Entity Container ID        | 100                 | Accelerator card                                                    |

Figure 4 - Defining a logical association

#### 5.4.4 Sensor association

As per DSP0248, numeric and state sensors are not included inside entity association PDRs. They are instead associated to the measured entity by directly referencing the EntityContainerID, EntityType, and EntityInstanceNumber of the measured entity in an entity association PDR. A sensor is identified by a unique Sensor ID value.

#### 5.4.4.1 Associating a sensor at the top level

When associating a sensor to the top-level entity, which is the system, the association uses the top-level containerEntityType, containerEntityInstanceNumber, and containerEntityContainerID parameters.

Figure 5 illustrates the association of a temperature sensor to the Accelerator card in the model.



Figure 5 - Top-level sensor association
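
As an informative illustration, a sensor association can be resolved by matching the sensor's entity reference against the entities known from the entity association PDRs. The type names and the `ENTITY_NAMES` lookup table are assumptions made for this sketch. The entity values follow the example figures in this document, and the record handle 1130 and Sensor ID 20 are the base values listed for Accelerator card temperature sensors in Table 3.

```python
from typing import NamedTuple, Optional


class EntityRef(NamedTuple):
    """Entity type, instance number, and container ID of a measured entity."""
    entity_type: int
    instance: int
    container_id: int


class NumericSensor(NamedTuple):
    """Simplified numeric sensor record; not the DSP0248 PDR layout."""
    record_handle: int
    sensor_id: int
    entity: EntityRef


# Entities taken from the example entity association PDR (lookup table is hypothetical).
ENTITY_NAMES = {
    EntityRef(68, 1, 0): "Accelerator card",
    EntityRef(149, 1, 100): "Accelerator",
    EntityRef(66, 1, 100): "Memory",
}


def measured_entity(sensor: NumericSensor) -> Optional[str]:
    """Return the name of the entity the sensor measures, or None if unknown."""
    return ENTITY_NAMES.get(sensor.entity)


card_temperature = NumericSensor(
    record_handle=1130, sensor_id=20, entity=EntityRef(68, 1, 0)
)
name = measured_entity(card_temperature)
```

Because the sensor carries the same entity type, instance number, and container ID as the Accelerator card container entity, it resolves to the Accelerator card without being listed inside the entity association PDR.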

### 5.5 Element PLDM Type IDs

The model assigns each component a Type ID selected from the available types defined in <u>DSP0249</u>. Table 1 lists the Type IDs used in the model:

Table 1 - Type IDs used in the Accelerator card model

| Component        | Type ID |
|------------------|---------|
| Accelerator card | 68      |
| Accelerator      | 149     |
| Memory Module    | 66      |

#### 5.6 Enumeration

#### 5.6.1 General

PLDM for Platform Monitoring and Control <u>DSP0248</u> uses enumerated IDs to define elements in the database. These IDs are labeled as:

- Container ID: unique for each container PDR in the model database
- Instance ID: unique for each entity type within a given hierarchy level
- Handle ID: unique ID for each PDR in the model database
- Sensor ID: unique for each sensor in the model database

The proposed model provides an example enumeration scheme for these IDs, allowing a reasonably scalable formulation. This model is only an example and implementations should not rely on these values.

#### 5.6.2 Enumeration scheme

The model assumes some maximal limits to define the enumerated values. These limits are provided as an example and can be adjusted according to the specific Accelerator card requirements.


The example model enumeration is designed to support an Accelerator card that does not exceed the following limits:

Table 2 - Chosen enumeration limits in the model

| Model Limit                             | Value |
|-----------------------------------------|-------|
| Max Accelerators                        | 10    |
| Max Memory Modules                      | 10    |
| Max board temperature sensors           | 10    |
| Max temperature sensors per Accelerator | 10    |

Note: If one of the above limits is insufficient for an Accelerator card, only the enumerated values will be affected and the model structure will not have to change.

Table 3 illustrates the enumeration scheme, calculated based on the above limits.


Table 3 – Example Enumeration Scheme with Type IDs

| Item                                       | Max | Base<br>Container | Max<br>Container | Base<br>Handle | Max<br>Handle | Base<br>Sensor ID | Max<br>Sensor ID | Base<br>Instance | Max<br>Instance | Type ID |
|--------------------------------------------|-----|-------------------|------------------|----------------|---------------|-------------------|------------------|------------------|-----------------|---------|
| Accelerator card                           | 1   | 100               |                  | 1100           |               |                   |                  | 1                | 1               | 68      |
| Accelerator card Composite State<br>Sensor | 1   |                   |                  | 1101           | 1101          | 5                 | 5                | 1                | 1               | 68      |
| Accelerator card Power Sensor              | 1   |                   |                  | 1102           | 1102          | 6                 | 6                | 1                | 1               | 68      |
| Accelerator card Temperature sensors       | 10  |                   |                  | 1130           | 1139          | 20                | 29               | 1                | 10              | 68      |
| Accelerator card fan speed sensor          | 10  |                   |                  | 1150           | 1159          | 40                | 49               | 1                | 10              | 68      |
| Accelerator card Voltage sensor            | 10  |                   |                  | 1170           | 1179          | 80                | 89               | 1                | 10              | 68      |
| Processor Memory Interface                 | 10  | 900               | 909              | 1180           | 1189          | 90                | 99               | 1                | 10              | 68      |
| Connectors                                 | 20  | 1040              | 1059             | 1190           | 1209          | 100               | 119              | 1                | 20              | 185     |
| Memory module                              | 10  | 1020              | 1029             | 1210           | 1219          |                   |                  | 1                | 10              | 66      |
| Memory composite state sensor              | 1   |                   |                  | 1220           | 1220          | 120               | 120              |                  | 1               | 66      |
| Memory temperature sensor                  | 20  |                   |                  | 1225           | 1244          | 125               | 144              | 1                | 20              | 66      |
| Memory module correctable Errors           | 10  |                   |                  | 1255           | 1264          | 150               | 159              |                  | 1               | 66      |
| Memory module uncorrectable Errors         | 10  |                   |                  | 1275           | 1284          | 180               | 189              |                  | 1               | 66      |
| Accelerators                               | 10  | 1000              | 1009             | 1295           | 1304          |                   |                  | 1                | 10              | 149     |
| Accelerator power sensor                   | 1   |                   |                  | 1310           | 1310          | 210               | 210              |                  | 1               | 149     |
| Accelerator State sensor                   | 1   |                   |                  | 1315           | 1315          | 220               | 220              |                  | 1               | 149     |
| Accelerator temperature sensor             | 10  |                   |                  | 1325           | 1334          | 240               | 249              | 1                | 10              | 149     |
| Accelerator clock speed sensor             | 10  |                   |                  | 1335           | 1344          | 260               | 269              | 1                | 10              | 149     |
| Accelerators Ports                         | 10  |                   |                  | 1345           | 1354          | 290               | 299              | 1                | 10              | 149     |
| Accelerators Port State                    | 10  |                   |                  | 1360           | 1369          | 320               | 329              | 1                | 10              | 149     |
| Accelerators Link Speed                    | 10  |                   |                  | 1380           | 1389          | 350               | 359              | 1                | 10              | 149     |
| Auxiliary Device Temp Sensor               | 1   |                   |                  | 1395           | 1395          | 380               | 380              |                  | 1               | 68      |
| Auxiliary Device health sensor             | 1   |                   |                  | 1400           | 1400          | 395               | 395              |                  | 1               | 68      |
| Plugs                                      | 20  | 1070              | 1089             | 1410           | 1429          | 410               | 429              | 1                | 20              | 214     |
| Plug Composite Sensor                      | 1   |                   |                  | 1430           | 1430          | 450               | 450              | 1                | 1               | 214     |
| Plug Power Sensor                          | 20  |                   |                  | 1440           | 1459          | 470               | 489              | 1                | 20              | 214     |
| Plug Temp Sensor                           | 10  |                   |                  | 1470           | 1479          | 510               | 519              | 1                | 10              | 214     |
| Cable                                      | 16  |                   |                  |                |               |                   |                  | 1                | 16              | 187     |
| Communication Channel                      | 100 | 800               | 899              | 1490           | 1589          |                   |                  | 1                | 100             | 79      |

| Calculated Model Constant | Model Sensors<br>described in this<br>doc | Common sensors<br>for NIC and<br>Accelerator | n/a |
|---------------------------|-------------------------------------------|----------------------------------------------|-----|
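
As an informative illustration of how the ranges in Table 3 follow from the chosen limits, each maximum value equals the base value plus the item count minus one. The function names below are assumptions made for this sketch; the base values and counts are taken from Table 3.

```python
def id_range(base, count):
    """Return the inclusive (base, max) range for `count` consecutive IDs."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return base, base + count - 1


def all_disjoint(ranges):
    """Return True if no two inclusive ranges share an ID."""
    ordered = sorted(ranges)
    return all(prev[1] < nxt[0] for prev, nxt in zip(ordered, ordered[1:]))


# Record handle ranges for three Accelerator card sensor groups in Table 3.
temperature_handles = id_range(1130, 10)
fan_speed_handles = id_range(1150, 10)
voltage_handles = id_range(1170, 10)

handles_ok = all_disjoint(
    [temperature_handles, fan_speed_handles, voltage_handles]
)
```

If a limit such as the number of temperature sensors is raised, only the computed ranges change; a disjointness check of this kind can help confirm that the adjusted ranges still do not collide.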

### 5.7 Model illustration

#### 5.7.1 General

The Accelerator card PLDM model is a hierarchical model. The following subclauses describe the model for each of the hierarchy levels.

#### 5.7.2 Accelerator Card

The Accelerator card top level may contain the PCB card, Accelerators, Memory modules, one or more thermal sensors, an Accelerator card composite state sensor, a fan speed sensor, a power sensor, and voltage sensors. The PCB power consumption is represented with a power sensor. The Accelerator card operational state is represented by a composite state sensor. When there are multiple Accelerators on the same card, Accelerator card sensors are typically only reported by the first Accelerator. The Accelerator card is responsible for determining the order of Accelerators in the card. Note that the top-level health state sensor of the composite state sensor may reflect the card-level sensors and the health states of Accelerators.

For the networking functionality shown inside the purple dotted line in Figure 2, refer to the Network port link speed sensor, Network port link state sensor, Pluggable module temperature sensor, Pluggable module power sensor, and Pluggable module composite state sensor sections of the DSP2054 specification.

#### 5.7.3 Accelerator

The Accelerator hierarchy represents the active device (or one of multiple devices) that performs the Accelerator control interface. An Accelerator is represented as a collection of sensors.

### 5.7.4 Memory

The Memory hierarchy represents a memory device (or one of multiple devices). A Memory is represented as a collection of sensors.

#### 5.8 Events

#### 5.8.1 General

This model supports using PLDM events as a method to notify the MC upon changes in the sensor readings/states as described in DSP2048. The following example events can be used with the model; an implementation may choose to support more events.

#### 5.8.2 Accelerator firmware version change

This event indicates to the MC that the firmware version of the Accelerator has changed. The MC may use the **GetPDRRepositoryInfo** command and check whether the *timestamp* parameter value has changed since it last read the PDRs. The MC may update the whole PDR repository by re-reading all the PDRs. The value used for the *timestamp* can be a virtual time value initialized by the Accelerator at device initialization.
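
As an informative illustration, the MC-side check could be sketched as follows. The `RepositoryTracker` class and the `get_repository_timestamp` callable are hypothetical; the callable stands in for issuing **GetPDRRepositoryInfo** over the transport and extracting the *timestamp* value from the response, which is treated here as an opaque comparable value.

```python
class RepositoryTracker:
    """Remembers the last seen repository timestamp (hypothetical helper)."""

    def __init__(self, get_repository_timestamp):
        # Callable that returns the timestamp from a GetPDRRepositoryInfo response.
        self._get_timestamp = get_repository_timestamp
        self._last_timestamp = None

    def needs_refresh(self):
        """Return True when the PDRs should be (re-)read by the MC."""
        timestamp = self._get_timestamp()
        changed = timestamp != self._last_timestamp
        self._last_timestamp = timestamp
        return changed


# Simulated responses: the timestamp changes on the third query.
responses = iter([100, 100, 250])
tracker = RepositoryTracker(lambda: next(responses))
results = [tracker.needs_refresh() for _ in range(3)]
```

In this sketch the first query always requests a read because no timestamp has been recorded yet. Only equality is compared, so a virtual time value initialized by the Accelerator works as well as a wall-clock time.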

#### 5.8.3 Health and state sensors events notifications

The sensors on the Accelerator card may report a change in value, health, or state using a PLDM state or numeric sensor event. Providing such a notification can significantly shorten the response time, compared to waiting for the MC to poll the sensors, for an occurrence that requires the MC to take an action such as increasing the airflow from a cooling fan.


# 6 Model use example

### 6.1 General

The following example for modeling an Accelerator card using PLDM for Platform Monitoring and Control DSP0248 describes an Accelerator card with the following attributes:

- Accelerator Card
  - Temperature Sensor
  - State Sensor
  - Fan speed Sensor
  - Voltage Sensors
  - Power Sensor
  - Auxiliary Device Temperature Sensor
  - Auxiliary Device Health Sensor
- Accelerator
  - Temperature Sensor
  - Power Sensor
  - State Sensor
  - Clock speed Sensor
- Memory
  - Temperature Sensor
  - Memory State Sensor
  - Memory Error statistics Sensor

Figure 6 illustrates the model which is used in the example.




Figure 6 - Example model diagram

### 6.2 Model hierarchy

The model PDRs identify the elements depicted in Figure 6. The hierarchies are illustrated in the following diagram. For simplicity, Figure 7 shows sensors of Accelerator and Memory Module.



Figure 7 - Accelerator card model hierarchy

### 6.3 Top-level TID

The terminus ID is identified by the terminus locator PDR. The TID defines the top-level entry point to the PLDM model. Because there is only one Accelerator on the Accelerator card in this example, there is only one TID.

Table 4 – TID PDR

| Field name            | Value | Description                       |
|-----------------------|-------|-----------------------------------|
| Container ID          | 0     | System                            |
| TID                   |       | Assigned by MC                    |
| Record Handle         | 1100  | Opaque number                     |
| Terminus Locator Size | 1     | Size of (EID) or size of (UID)    |
| Terminus Locator Type | 1     | MCTP EID                          |
| EID                   | EID   | MCTP assigned EID Value           |
| UID                   | UID   | Vendor provided UUID format value |

The TID value is assigned to the terminus by the MC. When the transport layer is MCTP, the terminus is identified using the Endpoint ID (EID) value. When using PLDM over RBT, the terminus locator PDR shall use the UID (instead of the EID). The UID value in the terminus locator PDR uses the device UUID value as the terminus UID. For more information regarding the terminus locator PDR, see <a href="DSP0248">DSP0248</a>.

#### 6.4 Accelerator card

#### 6.4.1 General

The top level of the model is the Accelerator card. The Accelerator card includes the physical elements which are an Accelerator (only one Accelerator in this example) and a memory module (only one memory module in this example).



Figure 8 - Accelerator card level elements

The sensors on the Accelerator card level are described using a reference to the measured entity, independent of the container that includes all the physical elements on the Accelerator card.

### **Accelerator card Entity Association PDR**

| Container ID  | 100  |
|---------------|------|
| Record Handle | 1100 |

| Container Entity              |    |             |
|-------------------------------|----|-------------|
| Entity Type                   | 68 | Add-In card |
| Entity Instance Number        | 1  |             |
| Container Entity Container ID | 0  | System      |

| Association Type | Physical to Physical containment |
|------------------|----------------------------------|

| Contained Entity – Accelerator |     |                  |
|--------------------------------|-----|------------------|
| Entity Type                    | 149 | Accelerator      |
| Entity Instance Number         | 1   |                  |
| Contained Entity Container ID  | 100 | Accelerator card |

| Contained Entity – Memory     |     |                  |
|-------------------------------|-----|------------------|
| Entity Type                   | 66  | Memory           |
| Entity Instance Number        | 1   |                  |
| Contained Entity Container ID | 100 | Accelerator card |

Figure 9 – Accelerator card container PDR

Note that the Accelerator card container ID, 100, is referenced by the sensors not included in the entity association PDR. The enumeration model shown in Table 3 includes the container ID for every hierarchy level.
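One way to read the container IDs is as links that lead from any entity up to the System (container ID 0). The sketch below walks those links using the values from the Accelerator card Entity Association PDR. The dictionary layout and the `container_chain` function are assumptions made for illustration, not a format defined by the specification.

```python
SYSTEM_CONTAINER_ID = 0

# Values copied from the Accelerator card Entity Association PDR.
# Entities are (entity type, instance number, container ID) tuples.
CARD_ASSOCIATION = {
    "container_id": 100,
    "container_entity": (68, 1, 0),        # Add-In card, contained by System
    "contained_entities": [
        (149, 1, 100),                      # Accelerator
        (66, 1, 100),                       # Memory
    ],
}


def container_chain(entity, associations):
    """Return the container IDs from the entity up to the System."""
    by_id = {pdr["container_id"]: pdr for pdr in associations}
    chain = []
    container_id = entity[2]
    while container_id != SYSTEM_CONTAINER_ID:
        # Stop on a loop or on a container no association PDR declares.
        if container_id in chain or container_id not in by_id:
            raise ValueError(f"cannot resolve container ID {container_id}")
        chain.append(container_id)
        container_id = by_id[container_id]["container_entity"][2]
    chain.append(SYSTEM_CONTAINER_ID)
    return chain
```

For the Accelerator entity `(149, 1, 100)`, the chain is container 100 (the Accelerator card) followed by container 0 (the System).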


#### 6.4.2 Accelerator card power sensor

| Field           | Value | Description                 |
|-----------------|-------|-----------------------------|
| Record Handle   | 1102  |                             |
| Sensor ID       | 6     |                             |
| Entity Type     | 68    | Add-In card                 |
| Entity Instance | 1     | Accelerator card Instance # |
| Container ID    | 0     | System                      |
| Base Unit       | 7     | Watts                       |
| Unit Modifier   | -1    | 0.1 watt resolution         |

Figure 10 – Accelerator card power sensor PDR
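The Unit Modifier can be read as a power-of-ten exponent applied to the Base Unit, which is why a modifier of -1 with Watts yields a 0.1 watt resolution. The following sketch shows that scaling. It assumes the reading needs no further resolution or offset correction (those fields are not shown in the tables of this clause), and the raw values used in the usage note are invented for illustration.

```python
from decimal import Decimal


def scale_reading(raw: int, unit_modifier: int) -> Decimal:
    """Scale a raw sensor count by 10 ** unit_modifier.

    Decimal is used so that values such as 123.4 are represented exactly.
    """
    return Decimal(raw).scaleb(unit_modifier)
```

Under these assumptions, `scale_reading(1234, -1)` gives 123.4 (watts, for the power sensor above), and a Unit Modifier of 0 leaves the raw value unchanged.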

#### 6.4.3 Accelerator card temperature sensor

| Field           | Value | Description                 |
|-----------------|-------|-----------------------------|
| Record Handle   | 1130  |                             |
| Sensor ID       | 20    |                             |
| Entity Type     | 68    | Add-In card                 |
| Entity Instance | 1     | Accelerator card Instance # |
| Container ID    | 0     | System                      |
| Base Unit       | 2     | Degrees Celsius             |
| Unit Modifier   | 0     | No need for scaling         |

Figure 11 – Ambient Temperature sensor PDR

#### 6.4.4 Accelerator card fan speed sensor

| Field           | Value | Description                 |
|-----------------|-------|-----------------------------|
| Record Handle   | 1150  |                             |
| Sensor ID       | 40    |                             |
| Entity Type     | 68    | Add-In card                 |
| Entity Instance | 1     | Accelerator card Instance # |
| Container ID    | 0     | System                      |
| Base Unit       | 19    | RPM                         |
| Unit Modifier   | 0     | No need for scaling         |

Figure 12 – Accelerator card fan speed sensor PDR

#### 6.4.5 Accelerator card voltage sensor

| Field           | Value | Description                 |
|-----------------|-------|-----------------------------|
| Record Handle   | 1170  |                             |
| Sensor ID       | 80    |                             |
| Entity Type     | 68    | Add-In card                 |
| Entity Instance | 1     | Accelerator card Instance # |
| Container ID    | 0     | System                      |
| Base Unit       | 5     | Volts                       |
| Unit Modifier   | -1    | 0.1 volt resolution         |

Figure 13 – Accelerator card voltage sensor PDR

#### 6.4.6 Accelerator card auxiliary device temperature sensor

| Field           | Value | Description                 |
|-----------------|-------|-----------------------------|
| Record Handle   | 1395  |                             |
| Sensor ID       | 380   |                             |
| Entity Type     | 68    | Add-In card                 |
| Entity Instance | 1     | Accelerator card Instance # |
| Container ID    | 0     | System                      |
| Base Unit       | 2     | Degrees Celsius             |
| Unit Modifier   | 0     | No need for scaling         |

Figure 14 – Auxiliary device temperature sensor PDR

#### 6.4.7 Accelerator card auxiliary device health sensor

| Field           | Value                       | Description                 |
|-----------------|-----------------------------|-----------------------------|
| Record Handle   | 1400                        |                             |
| Sensor ID       | 395                         |                             |
| Entity Type     | 68                          | Add-In card                 |
| Entity Instance | 1                           | Accelerator card Instance # |
| Container ID    | 0                           | System                      |
| Sensor Type     | 1                           | Health state                |
| Possible States | Refer to Table 1 of DSP0249 |                             |

Figure 15 – Auxiliary device health sensor PDR

#### 6.4.8 Accelerator card composite state sensor

| Record Handle                 | 1101 |             |
|-------------------------------|------|-------------|
| Entity Type                   | 68   | Add-In card |
| Entity Instance Number        | 1    |             |
| Container Entity Container ID | 0    | System      |

| Terminus Handle        | 0 |
|------------------------|---|
| Sensor ID              | 5 |
| Composite Sensor Count | 3 |

| Sensor Type     | 1                                  | Health state |
|-----------------|------------------------------------|--------------|
| Possible States | Refer to Table 1 of <u>DSP0249</u> |              |

| Sensor Type     | 21                                 | Thermal Trip |
|-----------------|------------------------------------|--------------|
| Possible States | Refer to Table 1 of <u>DSP0249</u> |              |

| Sensor Type     | 10                          | Memory Operational Fault status |
|-----------------|-----------------------------|---------------------------------|
| Possible States | Refer to Table 1 of DSP0249 |                                 |

Figure 16 – Accelerator card composite state sensor PDR
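A composite state sensor PDR groups several state sensors under one Sensor ID, and the Composite Sensor Count should agree with the number of Sensor Type entries that follow. The sketch below records the values from Figure 16 and checks that agreement. The `CompositeStateSensor` class and its field names are hypothetical, chosen to mirror the table labels.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CompositeStateSensor:
    """Hypothetical in-memory view of a composite state sensor PDR."""
    sensor_id: int
    composite_sensor_count: int
    sensor_types: List[int]

    def validate(self) -> None:
        """Raise ValueError if the count and the listed entries disagree."""
        if self.composite_sensor_count != len(self.sensor_types):
            raise ValueError(
                f"sensor {self.sensor_id}: count is "
                f"{self.composite_sensor_count} but "
                f"{len(self.sensor_types)} sensor types are listed"
            )


# Values from the Accelerator card composite state sensor PDR:
# Health state (1), Thermal Trip (21), Memory Operational Fault status (10).
card_sensor = CompositeStateSensor(
    sensor_id=5,
    composite_sensor_count=3,
    sensor_types=[1, 21, 10],
)
card_sensor.validate()
```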

### 6.5 Accelerator

#### 6.5.1 General

The Accelerator is an active device. Because it is a physical entity that does not include other entities, the Accelerator is not declared in its own PDR; it is instead declared in the Accelerator card container PDR. The Accelerator includes a set of device-level sensors. The following diagram illustrates the model sensors in the Accelerator:




Figure 17 – Example model Accelerator

The Accelerator content is declared using an entity-association PDR that includes the hierarchical description of the Accelerator. The device-level sensors are declared with separate PDRs using direct references to the measured entities.

633

| Container ID  | 1000 |
|---------------|------|
| Record Handle | 1295 |

| Container Entity              |     |                  |
|-------------------------------|-----|------------------|
| Entity Type                   | 149 | Accelerator      |
| Entity Instance Number        | 1   |                  |
| Container Entity Container ID | 100 | Accelerator card |

| Association Type | Physical to Physical containment |
|------------------|----------------------------------|

Figure 18 - Accelerator entity association PDR

#### 6.5.2 Accelerator temperature sensor

| Field           | Value | Description            |
|-----------------|-------|------------------------|
| Record Handle   | 1325  |                        |
| Sensor ID       | 240   |                        |
| Entity Type     | 149   | Accelerator            |
| Entity Instance | 1     | Accelerator Instance # |
| Container ID    | 100   | Accelerator card       |
| Base Unit       | 2     | Degrees Celsius        |

Figure 19 – Accelerator temperature sensor PDR

#### 6.5.3 Accelerator power sensor

| Field           | Value | Description            |
|-----------------|-------|------------------------|
| Record Handle   | 1310  |                        |
| Sensor ID       | 210   |                        |
| Entity Type     | 149   | Accelerator            |
| Entity Instance | 1     | Accelerator Instance # |
| Container ID    | 100   | Accelerator card       |
| Base Unit       | 7     | Watts                  |
| Unit Modifier   | -1    | 0.1 watt resolution    |

Figure 20 – Accelerator power sensor PDR

#### 6.5.4 Accelerator composite state sensor

| Record Handle                 | 1315 |                  |
|-------------------------------|------|------------------|
| Entity Type                   | 149  | Accelerator      |
| Entity Instance Number        | 1    |                  |
| Container Entity Container ID | 100  | Accelerator card |

| Terminus Handle        | 0   |
|------------------------|-----|
| Sensor ID              | 220 |
| Composite Sensor Count | 5   |

| Sensor Type     | 1                                  | Health state |
|-----------------|------------------------------------|--------------|
| Possible States | Refer to Table 1 of <u>DSP0249</u> |              |

| Sensor Type     | 21                                 | Thermal Trip |
|-----------------|------------------------------------|--------------|
| Possible States | Refer to Table 1 of <u>DSP0249</u> |              |

| Sensor Type     | 18                                 | Firmware Version |
|-----------------|------------------------------------|------------------|
| Possible States | Refer to Table 1 of <u>DSP0249</u> |                  |

| Sensor Type     | 15                          | Configuration |
|-----------------|-----------------------------|---------------|
| Possible States | Refer to Table 1 of DSP0249 |               |

| Sensor Type     | 16                          | Configuration Change |
|-----------------|-----------------------------|----------------------|
| Possible States | Refer to Table 1 of DSP0249 |                      |

Figure 21 – Accelerator composite state sensor PDR

#### 6.5.5 Accelerator clock speed sensor

| Field           | Value | Description            |
|-----------------|-------|------------------------|
| Record Handle   | 1335  |                        |
| Sensor ID       | 260   |                        |
| Entity Type     | 149   | Accelerator            |
| Entity Instance | 1     | Accelerator Instance # |
| Container ID    | 100   | Accelerator card       |
| Base Unit       | 20    | Hertz                  |
| Unit Modifier   | 6     | 1 MHz resolution       |

Figure 22 - Accelerator card clock speed sensor PDR

### 6.6 Memory

#### 6.6.1 General

The Memory is a physical entity in the model. The Memory is already declared within the Accelerator card container PDR. The Memory includes a set of device-level sensors. The Memory sensors cover all three types of memory: DIMM, internal memory, and soldered memory chips. The following diagram illustrates the model sensors in the Memory:

[Figure: Memory, with Memory Temperature, Memory State, and Memory Error Statistics sensors]

Figure 23 - Example Memory model

The Memory content is declared using an entity-association PDR that includes the hierarchical description of the Memory. The device-level sensors are declared with separate PDRs using direct references to the measured entities.

| Container ID  | 1020 |
|---------------|------|
| Record Handle | 1210 |

| Container Entity              |     |                  |
|-------------------------------|-----|------------------|
| Entity Type                   | 66  | Memory           |
| Entity Instance Number        | 1   |                  |
| Container Entity Container ID | 100 | Accelerator card |

| Association Type | Physical to Physical containment |
|------------------|----------------------------------|

Figure 24 – Memory association PDR
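Because each device-level sensor PDR refers directly to the entity it measures, a consumer of the PDRs can check that every sensor points at an entity that an entity association PDR declares. The sketch below does this for the Memory sensors defined in this clause. The data layout and the `unresolved_sensors` function are assumptions for illustration only.

```python
# Entity declared by the Memory association PDR:
# (entity type, instance number, container ID).
MEMORY_ENTITY = (66, 1, 100)

# Sensor ID -> measured entity, taken from the Memory sensor PDRs
# in this clause.
MEMORY_SENSORS = {
    120: (66, 1, 100),   # composite state sensor
    125: (66, 1, 100),   # temperature sensor
    150: (66, 1, 100),   # correctable errors
    180: (66, 1, 100),   # uncorrectable errors
}


def unresolved_sensors(sensors, declared_entities):
    """Return the sorted sensor IDs whose entity is not declared."""
    return sorted(
        sensor_id
        for sensor_id, entity in sensors.items()
        if entity not in declared_entities
    )
```

With the values above, `unresolved_sensors(MEMORY_SENSORS, {MEMORY_ENTITY})` returns an empty list, meaning every Memory sensor resolves to the declared Memory entity.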

#### 6.6.2 Memory temperature sensor

| Field           | Value | Description       |
|-----------------|-------|-------------------|
| Record Handle   | 1225  |                   |
| Sensor ID       | 125   |                   |
| Entity Type     | 66    | Memory            |
| Entity Instance | 1     | Memory Instance # |
| Container ID    | 100   | Accelerator card  |
| Base Unit       | 2     | Degrees Celsius   |

Figure 25 – Memory temperature sensor PDR

#### 6.6.3 Memory error statistics sensors

| Field           | Value | Description        |
|-----------------|-------|--------------------|
| Record Handle   | 1255  |                    |
| Sensor ID       | 150   |                    |
| Entity Type     | 66    | Memory             |
| Entity Instance | 1     | Memory instance #  |
| Container ID    | 100   | Accelerator card   |
| Base Unit       | 80    | Correctable Errors |

Figure 26 – Memory correctable errors PDR

| Field           | Value | Description          |
|-----------------|-------|----------------------|
| Record Handle   | 1275  |                      |
| Sensor ID       | 180   |                      |
| Entity Type     | 66    | Memory               |
| Entity Instance | 1     | Memory Instance #    |
| Container ID    | 100   | Accelerator card     |
| Base Unit       | 81    | Uncorrectable Errors |

Figure 27 – Memory uncorrectable errors PDR

#### 6.6.4 Memory composite state sensor

| Memory composite state sensor PDR |      |                  |
|-----------------------------------|------|------------------|
| Record Handle                     | 1220 |                  |
| Entity Type                       | 66   | Memory           |
| Entity Instance Number            | 1    |                  |
| Container Entity Container ID     | 100  | Accelerator card |

| Terminus Handle        | 0   |
|------------------------|-----|
| Sensor ID              | 120 |
| Composite Sensor Count | 4   |

| Sensor Type     | 1                           | Health state |
|-----------------|-----------------------------|--------------|
| Possible States | Refer to Table 1 of DSP0249 |              |

| Sensor Type     | 320                                 | Memory cache status |
|-----------------|-------------------------------------|---------------------|
| Possible States | Refer to Table 11 of <u>DSP0249</u> |                     |

| Sensor Type     | 321                                 | Memory error status |
|-----------------|-------------------------------------|---------------------|
| Possible States | Refer to Table 11 of <u>DSP0249</u> |                     |

| Sensor Type     | 322                                 | Redundant Memory activity status |
|-----------------|-------------------------------------|----------------------------------|
| Possible States | Refer to Table 11 of <u>DSP0249</u> |                                  |

Figure 28 – Memory composite state sensor PDR


## ANNEX A
(informative)

Notation and conventions

### A.1 Notations

Examples of notations used in this document are as follows:

- **2:N** In field descriptions, this will typically be used to represent a range of byte offsets starting from byte two and continuing to and including byte N. The lowest offset is on the left; the highest is on the right.
- **(6)** Parentheses around a single number can be used in message field descriptions to indicate a byte field that may be present or absent.
- **(3:6)** Parentheses around a field consisting of a range of bytes indicate that the entire range may be present or absent. The lowest offset is on the left; the highest is on the right.
- <u>PCIe</u> Underlined, blue text is typically used to indicate a reference to a document or specification called out in the "Normative references" clause or to items hyperlinked within the document.
- **rsvd** This case-insensitive abbreviation is for "reserved."
- **[4]** Square brackets around a number are typically used to indicate a bit offset. Bit offsets are given as zero-based values (that is, the least significant bit [LSb] offset = 0).
- **[7:5]** This notation indicates a range of bit offsets. The most significant bit is on the left; the least significant bit is on the right.
- **1b** The lowercase "b" following a number consisting of 0s and 1s is used to indicate the number is being given in binary format.
- **0x12A** A leading "0x" is used to indicate a number given in hexadecimal format.
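As an illustration of the bit-offset notation, the following sketch extracts a bit range such as [7:5] from an integer, treating the least significant bit as offset 0. The `extract_bits` function is a hypothetical helper written for this example and is not part of any PLDM definition.

```python
def extract_bits(value: int, msb: int, lsb: int) -> int:
    """Return bits [msb:lsb] of value, with the LSb at offset 0."""
    if lsb < 0 or msb < lsb:
        raise ValueError("expected msb >= lsb >= 0")
    width = msb - lsb + 1
    return (value >> lsb) & ((1 << width) - 1)
```

For example, bits [7:5] of the binary value 10100000b are 101b, and a single bit offset such as [4] corresponds to calling the helper with both offsets equal to 4.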


## ANNEX B
(informative)

Change log

| Version | Date      | Description              |
|---------|-----------|--------------------------|
| 1.0.0   | 5/25/2022 | Initial draft            |
| 1.0.0   | 6/13/2023 | Released for publication |