Suggested delimiter for use in streams #8

Open
opened 2018-04-19 08:39:58 +00:00 by aaronpeterson · 1 comment
aaronpeterson commented 2018-04-19 08:39:58 +00:00 (Migrated from github.com)

What would be the recommendation for specifying a delimiter when writing one binary object to a file at a time for the purpose of later processing as a node.js stream?

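Can I simply write the binary data with fs.writeFile, using the 'binary' encoding, followed by a trailing require('os').EOL? The end goal is to read the file back as a stream and decode each record in a transform, with something like [maxogden/binary-split](https://github.com/maxogden/binary-split) or [myndzi/binary-split-streams2](https://github.com/myndzi/binary-split-streams2).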

sitegui commented 2018-04-24 02:53:44 +00:00 (Migrated from github.com)

Hi @aaronpeterson ,

In general, no delimiter is safe, since any byte sequence can appear in the middle of an encoded value. For example:

let Type = require('js-binary').Type,
    schema = new Type('Buffer')
schema.encode(Buffer.from('0123456789', 'hex')) // <Buffer 05 01 23 45 67 89>
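
To make the collision concrete, here is a minimal sketch that uses plain Buffers rather than js-binary output. The payloads and the `splitOn` helper are made up for illustration; the second payload simply happens to contain the byte 0x0a, which is what a trailing '\n' delimiter would look like:

```js
// Hypothetical payloads: the second one happens to contain 0x0a ('\n')
const records = [Buffer.from([0x01, 0x02]), Buffer.from([0x03, 0x0a, 0x04])]
const delimiter = Buffer.from('\n')

// Write each record followed by the delimiter, as proposed in the question
const file = Buffer.concat(records.flatMap(record => [record, delimiter]))

// Illustrative helper: naive split on every occurrence of the delimiter
function splitOn(buffer, delim) {
    const parts = []
    let start = 0
    let index
    while ((index = buffer.indexOf(delim, start)) !== -1) {
        parts.push(buffer.subarray(start, index))
        start = index + delim.length
    }
    return parts
}

const parts = splitOn(file, delimiter)
console.log(records.length, parts.length) // prints "2 3"
```

Two records were written, but three come back, because the second record was cut at the delimiter byte inside its own payload.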

If you want to write multiple values into a file, I can give you 3 valid approaches:

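  1. Wrap each encoded buffer in a frame, prefixing it with its size, so that you can read the values back one at a time. Example sync code:
const fs = require('fs')

// Write a 4-byte big-endian length header followed by the payload
function writeFrame(fd, buffer) {
    let header = Buffer.alloc(4)
    header.writeUInt32BE(buffer.length, 0)
    fs.writeSync(fd, Buffer.concat([header, buffer]))
}

// Read the next frame from the current file position; returns null at end of file
function readFrame(fd) {
    let header = Buffer.alloc(4)
    if (fs.readSync(fd, header, 0, 4, null) < 4) return null
    let length = header.readUInt32BE(0)
    let buffer = Buffer.alloc(length)
    fs.readSync(fd, buffer, 0, length, null)
    return buffer
}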
  2. Choose one byte (or byte sequence) as the delimiter, but escape any occurrence of it inside the encoded binary on write, and unescape on read.

  3. Do not use any delimiter and rely on the lower-level Type#read() function. This works because Type#read() only reads up to the end of one record, so you can call it repeatedly to get multiple values:

// Suppose `buffer` has potentially multiple (full) encoded values
// `schema` is a Type instance
let ReadState = require('js-binary').ReadState,
    rs = new ReadState(buffer)

while (!rs.hasEnded()) {
    console.log(schema.read(rs))
}
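
Since the end goal is stream processing, the length-prefix framing from the first approach can also be decoded inside a Transform instead of with sync reads. The following is only a sketch under that assumption: `encodeFrame` and `FrameDecoder` are illustrative names, not part of js-binary, and the 4-byte big-endian header matches the sync example:

```js
const { Transform } = require('stream')

// Illustrative helper: prefix a payload with its length as a 4-byte big-endian integer
function encodeFrame(payload) {
    const header = Buffer.alloc(4)
    header.writeUInt32BE(payload.length, 0)
    return Buffer.concat([header, payload])
}

// Illustrative Transform: re-assembles frames from arbitrarily sized chunks
// and emits one Buffer per complete frame
class FrameDecoder extends Transform {
    constructor() {
        // Object mode on the readable side keeps frames from being merged together
        super({ readableObjectMode: true })
        this.pending = Buffer.alloc(0)
    }

    _transform(chunk, encoding, callback) {
        this.pending = Buffer.concat([this.pending, chunk])
        // Emit every frame that is fully available; keep the remainder for later
        while (this.pending.length >= 4) {
            const length = this.pending.readUInt32BE(0)
            if (this.pending.length < 4 + length) break
            this.push(this.pending.subarray(4, 4 + length))
            this.pending = this.pending.subarray(4 + length)
        }
        callback()
    }

    _flush(callback) {
        // Leftover bytes at the end mean the last frame was cut short
        callback(this.pending.length ? new Error('Truncated frame at end of stream') : null)
    }
}
```

Piping `fs.createReadStream(path)` into a `FrameDecoder` should then produce one 'data' event per frame, and each frame can be passed to `schema.decode()`. Chunk boundaries do not matter, because incomplete frames stay buffered until the rest arrives.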

Hope it works for you!

Reference: sitegui/js-binary#8