The stupid-simple messaging protocol is interesting. However, I would prefer something like D. J. Bernstein's netstrings rather than NL-terminated strings. Netstrings carry an explicit length field, so the receiver can allocate a buffer of the right size up front and avoid the complications of receiving NL-terminated data. SCGI and QMQP are examples of protocols that use netstrings. They are easier to handle, they stay easy to debug as long as verbs are expressed in ASCII, and they carry arbitrary binary data just as well.
> The famous Finger security hole may be blamed on Finger's use of the CRLF encoding. In that encoding, each string is simply terminated by CRLF. This encoding has several problems. Most importantly, it does not declare the string size in advance. This means that a correct CRLF parser must be prepared to ask for more and more memory as it is reading the string. In the case of Finger, a lazy implementor found this to be too much trouble; instead he simply declared a fixed-size buffer and used C's gets() function. The rest is history.
> In contrast, as the above sample code shows, it is very easy to handle netstrings without risking buffer overflow. Thus widespread use of netstrings may improve network security.
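For readers without DJB's sample code at hand, here is a minimal Python sketch of netstring encoding and incremental decoding. The function names and the `max_len` cap are my own assumptions for illustration, not part of the netstrings text or of SSMP.

```python
def encode_netstring(payload: bytes) -> bytes:
    """Frame payload as <decimal length>:<payload>,"""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstring(buf: bytes, max_len: int = 1 << 20):
    """Return (payload, rest), or None if buf does not hold a full netstring yet.

    max_len is a hypothetical cap chosen by the receiver, so a hostile
    length field cannot force a huge allocation.
    """
    max_digits = len(str(max_len))
    colon = buf.find(b":")
    if colon == -1:
        if len(buf) > max_digits:
            raise ValueError("length field too long")
        return None  # length field not complete yet
    digits = buf[:colon]
    if not digits.isdigit() or len(digits) > max_digits:
        raise ValueError("malformed length field")
    if len(digits) > 1 and digits.startswith(b"0"):
        raise ValueError("leading zeros are not allowed")
    n = int(digits)
    if n > max_len:
        raise ValueError("declared length exceeds receiver limit")
    end = colon + 1 + n
    if len(buf) < end + 1:
        return None  # payload or trailing comma still missing
    if buf[end:end + 1] != b",":
        raise ValueError("missing trailing comma")
    return buf[colon + 1:end], buf[end + 1:]
```

Because the length is known before the payload arrives, the receiver can reject or size the buffer before reading a single payload byte.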
My test for this kind of thing is this: suppose you need to make a high speed implementation in Verilog for an FPGA or ASIC. By high speed, I mean that you need to process multiple characters per cycle (say a 64-bit word at a time); if you have to make a decision on a byte-by-byte basis, it's too slow. This is a very possible scenario if the protocol catches on: for example, I made HDLC byte stuffing for PPP framing, 4 characters at a time (at least better than bit stuffing).
The CRLF encoding is not so bad in this case: read 8 characters and detect CR in parallel. If there is no CR in the word, just append the data to the buffer. When there is a CR, it's a big pain: you need to save the last word with byte masks, then shift any remaining bytes for the next input (and the entire next string is shifted by this left-over balance). You could try to make all strings a multiple of 8 in length to avoid this, but that adds overhead to every message, so it's inefficient; the hardware will just have to do the shifting.
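As a rough software model of the "check 8 characters at once" step, the sketch below uses the well-known zero-byte bit trick on a 64-bit word. Real hardware would more likely use eight parallel comparators; the function name and the little-endian packing are assumptions made for the illustration.

```python
ONES = 0x0101010101010101    # 0x01 in every byte lane
HIGHS = 0x8080808080808080   # 0x80 in every byte lane
MASK64 = (1 << 64) - 1


def first_match_in_word(word: bytes, target: int) -> int:
    """Return the index of the first byte equal to target in an 8-byte word, or -1."""
    if len(word) != 8:
        raise ValueError("word must be exactly 8 bytes")
    if not 0 <= target <= 0xFF:
        raise ValueError("target must be a byte value")
    # Little-endian packing puts the first stream byte in the lowest lane.
    v = int.from_bytes(word, "little")
    # After the XOR, lanes equal to target become zero.
    x = v ^ (target * ONES)
    # Classic zero-byte test: the high bit of a lane is set if that lane is zero.
    # Lanes above a true hit may be flagged spuriously, so only the lowest hit is used.
    hits = (x - ONES) & ~x & HIGHS & MASK64
    if hits == 0:
        return -1
    lowest = hits & -hits
    return (lowest.bit_length() - 1) // 8
```

A return value of -1 corresponds to the easy case (append the whole word); any other value is the byte-mask-and-shift case described above.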
OK, so now in your new format the hardware has to parse a variable length decimal number and convert it to binary (ideally in parallel), very fun! You could make the conversion byte at a time, but it's slow. You need to implement overflow detection.
At the very least use hex instead of decimal. Even in software you may need overflow detection. This is easy in hex, not so much in decimal. Better is to require the number to be a multiple of four or eight digits, even though this is a waste of bandwidth.
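To make the overflow point concrete, here is a small sketch comparing the two length formats. The function names, the 32-bit limit, and the 8-digit width are assumptions chosen for illustration. With fixed-width hex the digit count alone bounds the value; with decimal every step needs a range check.

```python
HEX_DIGITS = b"0123456789abcdef"


def parse_decimal_length(digits: bytes, max_value: int = 2**32 - 1) -> int:
    """Parse a variable-length decimal field, checking for overflow at each digit."""
    if not digits:
        raise ValueError("empty length field")
    value = 0
    for d in digits:
        if not 0x30 <= d <= 0x39:
            raise ValueError("non-decimal character in length field")
        digit = d - 0x30
        # value * 10 + digit > max_value  <=>  value > (max_value - digit) // 10
        if value > (max_value - digit) // 10:
            raise OverflowError("length does not fit in the target width")
        value = value * 10 + digit
    return value


def parse_fixed_hex_length(field: bytes, width: int = 8) -> int:
    """Parse a fixed-width hex field; 8 digits can never exceed 32 bits."""
    if len(field) != width:
        raise ValueError("length field has the wrong width")
    value = 0
    for c in field:
        nibble = HEX_DIGITS.find(bytes([c]).lower())
        if nibble == -1:
            raise ValueError("non-hex character in length field")
        value = (value << 4) | nibble  # each digit maps to exactly four bits
    return value
```

The hex version needs no arithmetic comparison at all, which is the property that makes it attractive in hardware.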
I assume you're coming from a hardware perspective - HDLC/PPP parsing in hardware might make sense in niche cases; though most protocols benefit much more from the flexibility and upgrade possibilities of a software implementation.
In this case, it is a messaging protocol. The incoming message essentially must be copied somewhere, therefore space must be allocated to store it, and therefore the length must be known.
CRLF requires either two passes (one to get the string length, another to copy the data) or continuously expanding storage, both of which are significantly more expensive than just parsing a short number at the beginning of the string.
PPP in hardware is for IP over SONET at 2.5 - 40 Gbps, good luck doing it in software...
In hardware there is no malloc- instead there is a pool of pages and the string would be stored as a linked list of such pages. Linux socket buffers do the same thing.
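A toy software model of that scheme, to show why the length does not need to be known in advance: pages are taken from a fixed pool as data arrives and chained together. The class names, the 64-byte page size, and the index-based links are assumptions for illustration, not a description of any real hardware design or of the Linux implementation.

```python
class PagePool:
    """Fixed pool of equally sized pages with a free list."""

    def __init__(self, page_count: int, page_size: int = 64):
        self.page_size = page_size
        self.pages = [bytearray(page_size) for _ in range(page_count)]
        self.next_page = [-1] * page_count   # link to the next page in a chain
        self.used = [0] * page_count         # bytes filled in each page
        self.free = list(range(page_count))

    def alloc(self) -> int:
        if not self.free:
            raise MemoryError("page pool exhausted")
        i = self.free.pop()
        self.next_page[i] = -1
        self.used[i] = 0
        return i


class Chain:
    """A string stored as a linked list of pages from the pool."""

    def __init__(self, pool: PagePool):
        self.pool = pool
        self.head = self.tail = pool.alloc()

    def append(self, data: bytes) -> None:
        pool, offset = self.pool, 0
        while offset < len(data):
            if pool.used[self.tail] == pool.page_size:
                new = pool.alloc()           # grow by one page, no copying
                pool.next_page[self.tail] = new
                self.tail = new
            start = pool.used[self.tail]
            n = min(pool.page_size - start, len(data) - offset)
            pool.pages[self.tail][start:start + n] = data[offset:offset + n]
            pool.used[self.tail] = start + n
            offset += n

    def read(self) -> bytes:
        out, i = bytearray(), self.head
        while i != -1:
            out += self.pool.pages[i][:self.pool.used[i]]
            i = self.pool.next_page[i]
        return bytes(out)

    def release(self) -> None:
        i = self.head
        while i != -1:
            nxt = self.pool.next_page[i]
            self.pool.free.append(i)
            i = nxt
        self.head = self.tail = -1
```

Growing the message never moves bytes that were already stored, which is why a delimiter-terminated format is not a problem for this kind of buffer.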
Which for some reason is not in the ABNF spec, and thus will be ignored up until somebody tries to implement this and fails a compatibility test.
Please put it in your ABNF spec. People use them, you know.
Yes, it's hard to specify. The problem is that you have both an uncapped ID and an uncapped PAYLOAD in the same message. I recommend giving ID a max length of, say, 32, and PAYLOAD then has a max length of 951 if I'm counting right.
Or you could consider that an IPv6 path MTU is at least 1280, and use that (or 1232) as your per message bound instead of 1024. You're sending a packet, might as well get full value.
> The comma makes it slightly simpler for humans to read netstrings that are used as adjacent records, and provides weak verification of correct parsing.
Pascal-style strings have the length stored in the first byte, so a string can only be 255 characters long. Netstrings have the length in text at the front of the string instead, so you need only preallocate a buffer of 20 characters to store the text representation of string lengths up to 2^64 bytes.
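The 20-character figure is easy to check, and the contrast with a one-byte length prefix can be shown in a few lines. The helper name `pascal_string` is just an illustrative assumption.

```python
def pascal_string(data: bytes) -> bytes:
    """Pascal-style string: a single length byte followed by the data."""
    if len(data) > 255:
        raise ValueError("a one-byte length prefix cannot describe more than 255 bytes")
    return bytes([len(data)]) + data


# Number of decimal digits needed to write 2^64 as text.
MAX_LENGTH_DIGITS = len(str(2**64))
```

So a receiver that refuses any length field longer than `MAX_LENGTH_DIGITS` characters needs only a small fixed buffer for the prefix, whatever the payload size.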
I like what I see in the grammar. Very compact and clean. It's actually smaller than the table of contents of the XMPP Core spec lol. That's the kind of difference that might make an ultra-robust, efficient implementation a bit easier. ;)
Note: Reminds me of when I was illustrating Oberon-2's complexity for C and C++ programmers by comparing the Oberon-2 BNF to their specs the same way. It's a good way to do a preliminary assessment of a protocol's or language's complexity and whether it's worth the trouble.
Also, since SSMP is completely text-based, does that mean the only way to send binary data is to base64-encode it? Or does it support a length header or chunking similar to HTTP?
It's slightly worse than that: a payload can be any 8-bit data... except a LF character. Which is to say, you will need to either escape or quote it in some way, which is to say you are on your own.
It might have been cleaner to specify base64 encoding or length-prefaced payloads (say, 16 bit int preface indicating length in bytes). As it is, you are on your own.
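For illustration, the 16-bit length-preface idea could look like the sketch below. This is not part of SSMP; the function names and the choice of network byte order are assumptions.

```python
import struct


def frame(payload: bytes) -> bytes:
    """Prefix payload with its length as an unsigned 16-bit big-endian integer."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a 16-bit length preface")
    return struct.pack("!H", len(payload)) + payload


def unframe(buf: bytes):
    """Return (payload, rest), or None if buf does not hold a full frame yet."""
    if len(buf) < 2:
        return None
    (n,) = struct.unpack_from("!H", buf, 0)
    if len(buf) < 2 + n:
        return None
    return buf[2:2 + n], buf[2 + n:]
```

With this framing a payload may contain LF (or any other byte) without escaping, at the cost of no longer being typeable by hand.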
A big advantage of LF-delimited over length-prefixed messages is netcat/telnet-friendliness. That was more valuable to us than being binary-clean as our use cases do not involve sending large binary messages.
I think you might want to make a distinction between a stream packet and a completed message.
If you're going for telnet compatibility then you'll want to terminate packets in CR+LF, but possibly expect to see only CR or LF from the client (ASCII mode).
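A tolerant receiver along those lines might split on CR+LF, a lone CR, or a lone LF. The sketch below is one way to do it under my own assumptions (the function name is hypothetical, and a trailing CR is held back until the next chunk shows whether an LF follows); it is not how SSMP specifies line endings.

```python
def split_lines(buf: bytes):
    """Split buf into complete lines, accepting CRLF, CR, or LF as terminators.

    Returns (lines, rest), where rest is the unterminated tail to keep
    for the next read.
    """
    lines = []
    start = i = 0
    n = len(buf)
    while i < n:
        c = buf[i]
        if c == 0x0A:                      # bare LF
            lines.append(buf[start:i])
            i += 1
            start = i
        elif c == 0x0D:                    # CR, possibly followed by LF
            if i + 1 == n:
                break                      # wait for more data before deciding
            lines.append(buf[start:i])
            i += 2 if buf[i + 1] == 0x0A else 1
            start = i
        else:
            i += 1
    return lines, buf[start:]
```

The caller would prepend `rest` to the next chunk read from the socket before calling `split_lines` again.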
Your stream could either be stateful (a message is always sent complete and in order, even if it takes multiple stream packets) or stateless* (different messages might have stream packets consecutively).
It would be more future proof if you started with a message grammar and then defined your protocol on top of that.
Neat, but it's not clear how client authentication is handled in SSMP. I have used XMPP in the past, and built-in identity was one of the biggest pluses of the protocol.
Netstrings are so brilliantly simple, see the Wikipedia page: https://en.wikipedia.org/wiki/Netstring
This is what DJB says about netstrings [1]:
> The famous Finger security hole may be blamed on Finger's use of the CRLF encoding. In that encoding, each string is simply terminated by CRLF. This encoding has several problems. Most importantly, it does not declare the string size in advance. This means that a correct CRLF parser must be prepared to ask for more and more memory as it is reading the string. In the case of Finger, a lazy implementor found this to be too much trouble; instead he simply declared a fixed-size buffer and used C's gets() function. The rest is history.
> In contrast, as the above sample code shows, it is very easy to handle netstrings without risking buffer overflow. Thus widespread use of netstrings may improve network security.
[1] http://cr.yp.to/proto/netstrings.txt
BTW, see Aaron Swartz's blog post on DJB, http://www.aaronsw.com/weblog/djb