Bitcoin Improvement Proposals : BIP 0134


  BIP: 134
  Title: Flexible Transactions
  Author: Tom Zander <[email protected]>
  Status: Draft
  Type: Standards Track
  Created: 2016-07-27

Abstract

This BIP describes the next step in making Bitcoin's most basic element, the transaction, more flexible and easier to extend. At the same time this fixes all known cases of malleability and resolves significant amounts of technical debt.

Summary

Flexible Transactions relies on two facts: the first 4 bytes of a transaction determine its version, and the majority of clients apply a non-consensus rule (a policy) that rejects transaction version numbers other than those specifically defined by Bitcoin. This BIP chooses a new version number, 4, and defines that the data following the version bytes is in a format called Compact Message Format (CMF). CMF is a flexible, token-based format in which each token is a combination of a name, a format and a value. Because each token carries its name, a parser can skip unused tokens, and new tokens can be added in a simple manner in the future. Soft fork upgrades become much easier and cleaner this way.
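
As a rough illustration of the version check described above, the following Python sketch splits the leading 4 bytes from the rest of the data. It assumes the version is stored as a little-endian 32-bit integer, as in the existing transaction format. The names `split_version` and `is_flexible_transaction` are hypothetical and not taken from any implementation.

```python
import struct

FLEXIBLE_TX_VERSION = 4  # the version number chosen by this BIP


def split_version(raw):
    """Return (version, remaining_bytes) for a serialized transaction.

    Assumption: the version is a little-endian signed 32-bit integer.
    """
    if len(raw) < 4:
        raise ValueError("transaction is shorter than the 4 version bytes")
    (version,) = struct.unpack_from("<i", raw, 0)
    return version, raw[4:]


def is_flexible_transaction(raw):
    """True when the data after the version bytes should be read as CMF."""
    version, _ = split_version(raw)
    return version == FLEXIBLE_TX_VERSION
```

A caller would hand the remaining bytes to a CMF parser only when `is_flexible_transaction` returns true, and fall back to the existing parser otherwise.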

This protocol upgrade cleans up past soft fork changes such as BIP68, which reuse existing fields, and re-expresses them in a system that is much easier to maintain and to parse. It creates the building blocks that allow new features to be added much more cleanly in the future.

It also turns out to be possible to remove signatures from transactions with minimal software upgrades while still maintaining a coherent transaction history. Tests show that this can reduce space usage to about 75%.

Motivation

Token-based file formats are not new: systems like XML and HTML use a similar approach to allow future growth, and they have been quite successful for decades in part because of this property.

Bitcoin needs a similar way of making the transaction future-proof, because re-purposing unused fields for new features does not lead to maintainable code.

In addition, this protocol upgrade re-orders the data fields, which allows us to cleanly fix the malleability issue. Future technologies like the Lightning Network will therefore depend on this BIP being deployed.

At the same time, due to this re-ordering of data fields, it becomes very easy to remove signatures from a transaction without breaking its tx-id, which is great for future pruning features.

Tokens

In the compact message format we define tokens, and in this specification we define how these tokens are named, where they can be placed and which are optional. In XML terms, this specification would be the schema of a transaction.

CMF tokens are triplets of name, format (like PositiveInteger) and value. Names in this scope are defined much like an enumeration, where the actual integer value (id, below) is as important as the written name. If any token is found that is not covered in the following table, the transaction that contains it is invalid.

  Name           id  Format     Default Value  Description
  TxEnd          0   BoolTrue   Required       A marker that is the end of the transaction.
  TxInPrevHash   1   ByteArray  Required       TxId we are spending
  TxPrevIndex    2   Integer    0              Index in prev tx we are spending (applied to previous TxInPrevHash)
  TxInScript     3   ByteArray  Required       The 'input' part of the script
  TxOutValue     4   Integer    Required       Amount of Satoshis to transfer
  TxOutScript    5   ByteArray  Required       The 'output' part of the script
  LockByBlock    6   Integer    Optional       BIP68 replacement
  LockByTime     7   Integer    Optional       BIP68 replacement
  ScriptVersion  8   Integer    2              Defines script version for outputs following
  NOP_1x         1x  .          Optional       Values that will be ignored by anyone parsing the transaction
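
The table can be mirrored as an enumeration, which makes the rule "an unknown token invalidates the transaction" easy to express. The Python sketch below is only an illustration: the class and function names are hypothetical, and it assumes that the "1x" wildcard stands for the ten ids 10 through 19.

```python
from enum import IntEnum


class Token(IntEnum):
    """Token ids as listed in the table above."""
    TX_END = 0
    TX_IN_PREV_HASH = 1
    TX_PREV_INDEX = 2
    TX_IN_SCRIPT = 3
    TX_OUT_VALUE = 4
    TX_OUT_SCRIPT = 5
    LOCK_BY_BLOCK = 6
    LOCK_BY_TIME = 7
    SCRIPT_VERSION = 8


# Assumption: NOP_1x is read as the ids 10..19 (ten values).
NOP_IDS = frozenset(range(10, 20))
KNOWN_IDS = frozenset(int(t) for t in Token) | NOP_IDS


def is_known_token(token_id):
    """True if the id appears in the table, including the NOP range."""
    return token_id in KNOWN_IDS


def check_tokens(token_ids):
    """Raise ValueError on the first id that the table does not cover."""
    for token_id in token_ids:
        if not is_known_token(token_id):
            raise ValueError(f"unknown token id {token_id}: transaction is invalid")
```

Keeping the ids in one enumeration means a parser and a validator share a single definition of what the table allows.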

Scripting changes

In the current version of Bitcoin-script, version 1, there are various opcodes that are used to validate the cryptographic proofs that users have to provide in order to spend outputs.

OP_CHECKSIG is the best known of these and, as its name implies, it validates a signature. In the new version of 'script' (version 2) the data that is signed is changed to be equivalent to the transaction-id. This is a massive simplification and also the only change between version 1 and version 2 of script.

Serialization order

The tokens defined above shall be serialized in a specific order for the transaction to be valid. Serializing transactions in any other order would allow multiple interpretations of the data, which cannot be allowed. There is still some flexibility, and for that reason it is important for implementors to remember that the actual serialized data is used for the calculation of the transaction-id. Reading a transaction and writing it out again may produce different output, and when the txid changes, the signatures break.

At a macro level, the transaction has the following segments. The order of the segments cannot be changed, but segments can be skipped.

  Segment     Description
  Inputs      Details about inputs.
  Outputs     Details and scripts for outputs
  Additional  For future expansion
  Signatures  The scripts for the inputs
  TxEnd       End of the transaction

The TxId is calculated by taking the serialized transaction without the Signatures and the TxEnd and hashing that.
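
A minimal Python sketch of this calculation follows. It makes several assumptions that this document does not settle: the hash is the double SHA-256 used for existing Bitcoin transaction ids, each token is available as its already-serialized bytes, and any bytes that precede the tokens (such as the version) are passed in explicitly by the caller. The function names are hypothetical.

```python
import hashlib

TX_END = 0        # id of TxEnd
TX_IN_SCRIPT = 3  # id of TxInScript, the only tag in the Signatures segment


def double_sha256(data):
    # Assumption: same double SHA-256 as used for existing tx-ids.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def tx_id(tokens, leading_bytes=b""):
    """Hash the serialized transaction without Signatures and TxEnd.

    tokens: iterable of (token_id, serialized_bytes) in serialized order.
    leading_bytes: bytes before the first token that the caller wants hashed.
    """
    kept = [raw for token_id, raw in tokens
            if token_id not in (TX_IN_SCRIPT, TX_END)]
    return double_sha256(leading_bytes + b"".join(kept))
```

Because the TxInScript tokens are left out of the hashed data, two serializations that differ only in their signatures produce the same id in this sketch, which is the property the segment order is designed to provide.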

  Segment     Tags                            Description
  Inputs      TxInPrevHash and TxInPrevIndex  Index can be skipped, but in any input the PrevHash always has to come first
  Outputs     TxOutScript, TxOutValue         Order is not relevant
  Additional  LockByBlock LockByTime NOP_1x
  Signatures  TxInScript                      Exactly the same amount as there are inputs
  TxEnd       TxEnd

TxEnd is there to allow a parser to know when one transaction in a stream has ended, allowing the next to be parsed.

Notice that the token ScriptVersion is currently not allowed because we don't have any valid value to give it. But if we introduce a new script version it would be placed in the outputs segment.
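
The ordering rules above can be sketched as a small checker over the sequence of token ids. This is an illustration under stated assumptions rather than a consensus-accurate validator: it treats id 2 as the index tag of the Inputs segment, reads NOP_1x as ids 10 through 19, lets the Signatures segment be skipped entirely, and requires exactly one TxEnd as the final token. The function name `check_order` is hypothetical.

```python
INPUTS, OUTPUTS, ADDITIONAL, SIGNATURES, END = range(5)

# Map each allowed token id to the segment it belongs to.
SEGMENT_OF = {
    1: INPUTS, 2: INPUTS,          # TxInPrevHash, index tag
    4: OUTPUTS, 5: OUTPUTS,        # TxOutValue, TxOutScript
    6: ADDITIONAL, 7: ADDITIONAL,  # LockByBlock, LockByTime
    3: SIGNATURES,                 # TxInScript
    0: END,                        # TxEnd
}
# Assumption: NOP_1x covers ids 10..19. ScriptVersion (8) is left out
# because the text says it is currently not allowed.
SEGMENT_OF.update({nop_id: ADDITIONAL for nop_id in range(10, 20)})


def check_order(token_ids):
    """Return None when the order is acceptable, else an error message."""
    token_ids = list(token_ids)
    current = INPUTS
    previous = None
    for token_id in token_ids:
        segment = SEGMENT_OF.get(token_id)
        if segment is None:
            return f"token {token_id} is not allowed"
        if segment < current:
            return f"token {token_id} belongs to an earlier segment"
        if token_id == 2 and previous != 1:
            return "the index must directly follow a TxInPrevHash"
        current, previous = segment, token_id
    inputs = token_ids.count(1)
    signatures = token_ids.count(3)
    if signatures and signatures != inputs:
        return "number of TxInScript tokens differs from number of inputs"
    if token_ids.count(0) != 1 or token_ids[-1] != 0:
        return "exactly one TxEnd is expected, as the last token"
    return None
```

For example, `check_order([1, 2, 1, 5, 4, 6, 3, 3, 0])` returns `None`, while `check_order([4, 1, 0])` returns an error message because an input follows an output.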

Script v2

The default value of ScriptVersion is 2, as opposed to version 1 of script, which is in use today. Version 2 is mostly identical to version 1, including upgrades made to it over the years and in the future. The only exception is that OP_CHECKSIG is made dramatically simpler. The input-type for OP_CHECKSIG is no longer configurable: it is always '1', and the content that is signed is the txid.

TODO: does check-multisig need its own mention?

Block-malleability

Leaving the signatures out of the calculation of the transaction-id implies that the signatures are also not used for the calculation of the merkle tree. This means that changes in signatures would not be detectable, except of course by the fact that missing or broken signatures break full validation. It is nevertheless important to be able to detect modifications to such signatures without validating all transactions.

For this reason the merkle tree is extended to include (append) the hash of the v4 transactions. The merkle tree will continue to have all the transactions' tx-ids, but appended to that are the v4 hashes that include the signatures as well. Specifically, the hash is taken over a data-blob that is built up from:

1. the tx-id
2. the CMF-tokens 'TxInScript'
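
The construction above can be sketched in Python as follows. The sketch assumes double SHA-256 as the hash, assumes the TxInScript tokens are concatenated in serialized order after the tx-id, and assumes the leaf list is simply all tx-ids followed by the v4 hashes. None of these details are fixed by this document, and the function names are hypothetical.

```python
import hashlib


def double_sha256(data):
    # Assumption: same double SHA-256 as used elsewhere in Bitcoin.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def v4_hash(txid, tx_in_script_tokens):
    """Hash over the tx-id followed by the serialized TxInScript tokens."""
    return double_sha256(txid + b"".join(tx_in_script_tokens))


def merkle_leaves(transactions):
    """Build the leaf list: every tx-id first, then the v4 hashes.

    transactions: list of (txid, tx_in_script_tokens) pairs, where
    tx_in_script_tokens is None for transactions that are not v4.
    """
    txids = [txid for txid, _ in transactions]
    extra = [v4_hash(txid, scripts)
             for txid, scripts in transactions
             if scripts is not None]
    return txids + extra
```

With this layout a changed signature leaves the tx-id leaf untouched but alters the appended v4 hash, so the modification shows up in the merkle root without validating the transaction.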

Future extensibility

The NOP_1x wildcard used in the table explaining tokens is actually a list of 10 values that currently are specified as NOP (no-operation) tags.

Any implementation that supports the v4 transaction format should ignore these fields in a transaction, interpreting and using the transaction as if they were not present at all.
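
One way to picture this rule is a filter applied to the parsed tokens before they are interpreted. The Python sketch below assumes that the ten NOP tags are ids 10 through 19, and the name `without_nops` is hypothetical. It only affects the parsed view: the serialized bytes, and therefore the transaction-id, are left untouched.

```python
# Assumption: the ten NOP_1x tags are the ids 10..19.
NOP_IDS = frozenset(range(10, 20))


def without_nops(tokens):
    """Return the parsed (token_id, value) pairs with NOP tokens dropped."""
    return [(token_id, value) for token_id, value in tokens
            if token_id not in NOP_IDS]
```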

Future software may use these fields to decorate a transaction with additional data or features. Transaction-generating software should not casually use these tokens for its own purposes without cooperation and communication with the rest of the Bitcoin ecosystem, as miners certainly have the option to reject transactions that use tokens unknown to them.

Backwards compatibility

Fully validating older clients are not compatible with this change.

SPV (simplified payment verification) wallets need to be updated to receive or create the new transaction type.

This BIP introduces a new transaction format without changing or deprecating the existing one or any of its practices. Therefore it is backwards compatible for any existing data or parsing-code.

Reference Implementation

Bitcoin Classic includes this in its beta releases, and a reference implementation can be found at:

https://github.com/bitcoinclassic/bitcoinclassic/pull/186

Deployment

To be determined

References

//github.com/bitcoinclassic/documentation/blob/master/spec/compactmessageformat.md CMF

Copyright

Copyright (c) 2016 Tom Zander <[email protected]>

This document is dual-licensed under the Creative-Commons BY-SA license v4.0 and the Open Publication License v1.0 with the following licence-options:

Distribution of substantively modified versions of this document is prohibited without the explicit permission of the copyright holder. Distribution of the work or derivative of the work in any standard (paper) book form is prohibited unless prior permission is obtained from the copyright holder.