Last week, ARM released their Machine Readable Architecture Specification and I wrote about what you can do with it. But before you can do anything with the specification, you need to extract the useful bits from the release, so I thought I would try for myself and describe what I found out (and release some scripts that demonstrate/test what I am saying).
So what exactly is in the release?
It contains a lot of html files (see Milosch Meriac’s online v8-A html)
These are the human-readable version of the specification, all nicely hyperlinked. They are the best way to navigate the ARM specification.
It contains some pdf files
These contain the same information as the html files above. The “OPT” versions are equivalent to the standard version but they have had some simplifying optimisations applied to make them more readable. The “OPT_diff” versions show the difference between the standard version and the OPT version, which is the best way of understanding what the OPT version does.
It contains some xml files
This is the bit that we want! There are two files of particular interest:
- notice.xml: ARM’s legal notice
- shared_pseudocode.xml: hundreds of shared support functions used in the definition of instructions, page table walk, permission checking, exceptions/interrupts, etc.
The rest of the XML files are instruction encodings such as adc.xml.
The release also contains a few other files, such as a DTD file that defines the schema used by the XML files.
Inside the Shared XML files
The Shared XML files contain all the type definitions, constants, variables and functions required by the instructions and system support code. Here is how a typical function is presented in XML:
This shows a named “ps” section containing a small number of related definitions (just one in this case) and with the ASL implementation of the function enclosed in a “pstext” section. Almost all objects defined in the pstext section are tagged with “anchor” and almost all references to objects in the pstext section are tagged with “a”. (We will use these links later.) The rest of the XML attributes are mostly useful for generating HTML, so I will ignore them.
As we process this ASL code, it will be useful to track what definitions each “ps” section contains and what the section depends on. Here is a Python class to represent this (the full code is here).
And here is some code to read the XML, extract dependencies and package it up as an ASL object.
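As a stand-in for the listings, here is a minimal sketch of the idea rather than the code from my scripts. The XML fragment is hypothetical and cut down to the “ps”, “pstext”, “anchor” and “a” tags described above; the `link` attribute, the function name `read_asl` and the names inside the fragment are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: only the tags discussed in the text are kept, and the
# attribute values are invented for this example.
FRAGMENT = """\
<ps name="shared/functions/common/Example" sections="1">
  <pstext section="Functions">// Example()
bits(32) <anchor link="impl-shared.Example.1">Example</anchor>(bits(12) imm12)
    return <a link="impl-shared.Helper.2" file="shared_pseudocode.xml">Helper</a>(imm12);</pstext>
</ps>
"""

class ASL:
    """A chunk of ASL: its name, its text, what it defines and what it uses."""
    def __init__(self, name, code, defs, deps):
        self.name = name
        self.code = code
        self.defs = defs
        self.deps = deps

def read_asl(ps):
    """Build an ASL object from a <ps> element."""
    chunk = ps.find("pstext")
    # "anchor" tags mark definitions, "a" tags mark references.
    defs = {x.attrib["link"] for x in chunk.findall("anchor")}
    deps = {x.attrib["link"] for x in chunk.findall("a")}
    deps -= defs  # a chunk does not depend on itself
    # itertext() drops the tags and keeps only the ASL source text.
    code = "".join(chunk.itertext()).rstrip() + "\n"
    return ASL(ps.attrib["name"], code, defs, deps)

asl = read_asl(ET.fromstring(FRAGMENT))
print(asl.name, sorted(asl.defs), sorted(asl.deps))
```

Keeping `defs` and `deps` as sets of link names makes the later sorting and filtering steps simple set operations.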
Inside the Instruction XML files
The ARM architecture often contains several different encodings for a single instruction. Each instruction shares some common ASL code to execute the instruction and (optionally) to perform part of the decoding. The “pstext” sections containing these are labelled “Execute” and “Postdecode”. Some instructions are just aliases for other instructions so they don’t contain an execute section, and I will discard these instructions.
These pieces of ASL can be shared by several different instruction encodings and each encoding is accompanied by a piece of ASL to interpret the fields of the encoding.
Instruction encodings and register descriptions use a common section format called “regdiagram”. One of the key pieces of information about an instruction is which instruction set it belongs to. The XML uses four different tags: “T16” (16-bit Thumb encoding), “T32” (32-bit Thumb encoding), “A32” (ARM-32 encoding) and “A64” (ARM-64 encoding). The T16 encoding is 16 bits long and all the others are 32 bits long.
The “regdiagram” section contains a number of boxes corresponding to one or more contiguous bits within the encoding. The location of each box is specified by the width of the box and the highest bit position in the box. Awkwardly, the T16 encoding numbers its bits from 31 down to 16 instead of from 15 down to 0 so my script fixes that.
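The box arithmetic can be sketched as follows. This is an illustration under assumptions, not the code from my scripts: the diagram is a hypothetical, cut-down T16 example, and the attribute names `hibit`, `width` and `name`, plus the default width of 1, are assumed for the sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical 16-bit diagram whose boxes are numbered from bit 31 down to 16.
DIAGRAM = """\
<regdiagram form="16" psname="example">
  <box hibit="31" width="6" name="opcode"/>
  <box hibit="25" width="4" name="op"/>
  <box hibit="21" width="3" name="Rm"/>
  <box hibit="18" width="3" name="Rdn"/>
</regdiagram>
"""

def read_boxes(diagram, is_t16):
    """Return (lo, width, name) for each box, counting bit 0 as the LSB."""
    fields = []
    for box in diagram.findall("box"):
        width = int(box.attrib.get("width", "1"))
        hi = int(box.attrib["hibit"])
        if is_t16:
            hi -= 16  # renumber 31..16 as 15..0
        lo = hi - width + 1
        fields.append((lo, width, box.attrib.get("name", "_")))
    return fields

boxes = read_boxes(ET.fromstring(DIAGRAM), is_t16=True)
print(boxes)
```

Storing the lowest bit and the width, rather than the highest bit, makes it easy to turn each field into an ASL bit slice later.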
Some boxes have a ‘name’ attribute; I use “_” for any anonymous boxes.
And some boxes have a constant bitvector made up of “0”, “1”, “x”, “(0)” or “(1)”. 0 and 1 should be obvious, x means “don’t care” and (0) and (1) mean “should be 0/1 and it is UNPREDICTABLE what happens if they are not.” I use a suitable number of “x”s for any field with no constant specified.
The constant can also take the form “!= 1111” meaning “must not equal 1111”. This check is always replicated in the ASL code so I discard that information for now.
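A minimal sketch of these rules, assuming the constant for a box has already been gathered into one string (or `None` when the box has no constant). The function names are hypothetical. Treating “(0)” and “(1)” as don’t-care bits when building a match mask is a choice made for this sketch, not something the XML dictates.

```python
import re

# One token per bit: "(0)" and "(1)" must be tried before the bare characters.
TOKEN = re.compile(r"\(0\)|\(1\)|[01x]")

def box_pattern(constant, width):
    """Return a list of `width` tokens drawn from 0, 1, x, (0) and (1)."""
    if constant is None or constant.strip().startswith("!="):
        # No constant, or a "must not equal" constraint that the ASL rechecks.
        return ["x"] * width
    tokens = TOKEN.findall(constant)
    if len(tokens) != width:
        raise ValueError(f"constant {constant!r} does not fill {width} bits")
    return tokens

def mask_and_value(tokens):
    """Fold tokens (most significant first) into a (mask, value) pair."""
    mask = value = 0
    for t in tokens:
        mask, value = mask << 1, value << 1
        if t in ("0", "1"):  # only hard 0/1 bits take part in matching
            mask |= 1
            value |= int(t)
    return mask, value

tokens = box_pattern("1(0)x0", 4)
mask, value = mask_and_value(tokens)
```

An instruction word `w` then matches the encoding when `w & mask == value`, once the tokens of all boxes have been concatenated in order.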
Sometimes the XML splits a single field into two adjacent fields: typically one of the fields has a constant value. When this happens, the fields have a name like “reg<4:1>” and “reg<0>”. This is not very convenient for our purposes so I look for this pattern and merge them back into a single field called “reg”.
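The merge can be sketched like this, assuming each field is a hypothetical `(lo, width, name, pattern)` tuple listed from the most significant field downwards. It is a simplification: it strips the slice suffix from any sliced name and does not check that the slice bounds in the names agree with the widths.

```python
import re

# Matches names such as "reg<4:1>" or "reg<0>" and captures the base name.
SPLIT_NAME = re.compile(r"^(\w+)<(\d+)(?::(\d+))?>$")

def merge_split_fields(fields):
    """Merge adjacent pieces of one field; fields are most significant first."""
    merged = []
    for lo, width, name, pattern in fields:
        m = SPLIT_NAME.match(name)
        base = m.group(1) if m else None
        adjacent = bool(merged) and merged[-1][0] == lo + width
        if base and adjacent and merged[-1][2] == base:
            # Continuation of the previous piece: extend it downwards.
            _, prev_width, _, prev_pattern = merged[-1]
            merged[-1] = (lo, prev_width + width, base, prev_pattern + pattern)
        else:
            merged.append((lo, width, base or name, pattern))
    return merged

fields = [
    (12, 4, "cond", "xxxx"),
    (1, 4, "reg<4:1>", "xxxx"),
    (0, 1, "reg<0>", "1"),
]
print(merge_split_fields(fields))
```

The constant half of the split survives in the merged pattern, so no decoding information is lost by the merge.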
To finish off reading an encoding, we read the decode ASL and pick a good name for the instruction encoding.
Finally, the collection of the encodings, the postdecode ASL and the execute ASL are packaged up as an instruction named after the shared execute ASL.
And to read all the instructions in a directory, we use the following code:
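A minimal sketch of such a directory walk, not the code from my scripts. It assumes that each instruction file has a root element called `instructionsection` and that the execute ASL sits in a “pstext” element whose `section` attribute is “Execute”. Both names are assumptions here, as are the function names.

```python
import glob
import os
import xml.etree.ElementTree as ET

def execute_asl(root):
    """Return the text of the Execute pstext section, or None if absent."""
    for pstext in root.iter("pstext"):
        if pstext.attrib.get("section") == "Execute":
            return "".join(pstext.itertext())
    return None

def read_instructions(directory):
    """Return {file stem: execute ASL} for every non-alias instruction file."""
    instrs = {}
    for path in sorted(glob.glob(os.path.join(directory, "*.xml"))):
        root = ET.parse(path).getroot()
        if root.tag != "instructionsection":
            continue  # e.g. the legal notice or the shared pseudocode
        code = execute_asl(root)
        if code is None:
            continue  # alias: no execute section, so discard it
        stem = os.path.splitext(os.path.basename(path))[0]
        instrs[stem] = code
    return instrs
```

Calling `read_instructions` on the directory that holds the unpacked XML returns one entry per real instruction, keyed by file name without the extension.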
Sorting the shared code
Once you have extracted all the code, you are going to want to process it in some way. This will probably be easier to do if we arrange the ASL type and function definitions so that definitions always occur before their first use. So my script uses the dependencies that we extracted from the ASL to perform a topological sort of the code.
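One way to do such a sort is a depth-first traversal, sketched below. It assumes a hypothetical `chunks` dictionary that maps each chunk name to a `(defs, deps)` pair of sets. Cycles caused by mutually recursive functions are simply cut at the point where they are detected, which is a simplification chosen for this sketch.

```python
def topological_sort(chunks):
    """Order chunk names so that definitions come before their uses."""
    # Map every defined name to the chunk that provides it.
    provider = {d: name for name, (defs, _) in chunks.items() for d in defs}
    order, state = [], {}

    def visit(name):
        if name in state:
            return  # already emitted, or part of a cycle being visited
        state[name] = "active"
        for dep in sorted(chunks[name][1]):
            target = provider.get(dep)  # unknown names are ignored
            if target is not None and target != name:
                visit(target)
        state[name] = "done"
        order.append(name)

    for name in sorted(chunks):
        visit(name)
    return order

chunks = {
    "A": ({"a"}, {"b", "undefined"}),
    "B": ({"b"}, {"c"}),
    "C": ({"c"}, set()),
}
print(topological_sort(chunks))
```

The recursion depth follows the longest dependency chain, so a very deep chain may need an explicit stack instead of recursive calls.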
There are several modes it can work in:
- Sort all the code. This will include code used by instructions but also code used when an interrupt occurs or an external debugger is attached.
- Extract all the code that is used by AArch64 instructions. That is, instructions using the A64 encodings.
- Extract all the code that is used by AArch32 instructions. That is, instructions using the A32 encodings.
- Extract all the code used by AArch64 or AArch32 instructions.
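The extraction modes amount to a reachability computation: start from the ASL of the chosen instructions and follow dependencies until nothing new turns up. Here is a minimal sketch, using a hypothetical `chunks` dictionary of `(defs, deps)` sets. The chunk names and the way the starting set is chosen are invented for the example.

```python
def reachable(roots, chunks):
    """Return the names of all chunks needed, directly or not, by `roots`."""
    provider = {d: name for name, (defs, _) in chunks.items() for d in defs}
    needed, worklist = set(), list(roots)
    while worklist:
        name = worklist.pop()
        if name in needed:
            continue
        needed.add(name)
        for dep in chunks[name][1]:
            if dep in provider:  # ignore names that nothing defines
                worklist.append(provider[dep])
    return needed

# Hypothetical chunk names: two instructions and three support functions.
chunks = {
    "aarch64/add": (set(), {"AddWithCarry"}),
    "aarch32/adc": (set(), {"A32ExpandImm"}),
    "shared/AddWithCarry": ({"AddWithCarry"}, {"UInt"}),
    "shared/UInt": ({"UInt"}, set()),
    "aarch32/A32ExpandImm": ({"A32ExpandImm"}, set()),
}
a64_only = reachable(["aarch64/add"], chunks)
print(sorted(a64_only))
```

Sorting only the chunks in the returned set then gives the code for one of the narrower modes, and passing the instructions of both architectures as roots gives the combined mode.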
Conclusion
I hope this is useful for those who want to make use of ARM’s Machine Readable Architecture specification. The files are designed to meet many different purposes so it is not always obvious which parts of them are useful for your purpose. This is why I thought it would be a good idea to write some scripts that actually extract the code instead of just writing about how I believe you can do it.
At work, I have access to the raw files from which the XML files are built so it has been a while since I have tried to extract the specification from the XML and it has been interesting seeing how much easier it is to use the XML files than it was when I first started using the architecture specs. (But there were some issues that I had to work around as well: search the script for the word “workaround” for details.)
I would really welcome contributions from other people:
- I have a fairly narrow focus in what I want to do with the XML so my tools discard potentially useful information.
- I am also not a native Python speaker, as you can probably tell.
- Many of the people who want to use the machine readable specification are much more comfortable with functional languages, so translations are most welcome.
So if you have a suggestion for improving the scripts or you want other scripts, feel free to implement your suggestion and send me a pull request.
Enjoy!
p.s., unpacking the tarballs and extracting the code is just the beginning. I am working with Cambridge University’s REMS research group to convert the ASL to their SAIL language from which you can generate O’Caml, LEM and HOL versions of the spec (with more backends planned).
And if the SAIL version does not suit your needs, then you might want to know how to lex, parse, typecheck and execute ASL code yourself. I will describe those in future posts.
Related posts and papers
- Paper: End-to-End Verification of ARM Processors with ISA-Formal, CAV 2016.
- Paper: Trustworthy Specifications of ARM v8-A and v8-M System Level Architecture, FMCAD 2016.
- This post: Dissecting the ARM Machine Readable Architecture files
- Code: MRA Tools
- Paper: Who guards the guards? Formal Validation of the Arm v8-M Architecture Specification, OOPSLA 2017.