Jq - a lightweight and flexible command-line JSON processor (stedolan.github.com)
143 points by tchalla on Oct 21, 2012 | 36 comments



It'd be great if the help text contained some examples rather than simply a pointer to the website, and if you added a man page. When you're in the middle of a terminal session, switching to a browser can be really disruptive.

Also, if you tag releases with a version number and make them available as downloads on GitHub, you'll have an easier time getting Jq into Homebrew, the OS X package manager. I wrote a formula for you, but AFAIK you'll need to add in a real version number before submitting a pull request to Homebrew: https://gist.github.com/3928074


Arch PKGBUILD would be much easier with a version number too...


They exist now!


Very nice tool! I'm getting syntax errors from the Twitter examples though (and every time I try to use underscores):

<edit>Created an issue on GitHub</edit>

  curl 'http://search.twitter.com/search.json?q=json&rpp=5&include_entities=true' | jq '.results[] | {from_user, text, urls: [.entities.urls[].url]}'

  error: Invalid character
  .results[] | {from_user, text, urls: [.entities.urls[].url]}
                    ^
  error: syntax error, unexpected IDENT, expecting '}'
  .results[] | {from_user, text, urls: [.entities.urls[].url]}
                    ^^^^
  2 compile errors
    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                   Dload  Upload   Total   Spent    Left  Speed
  100  5970  100  5970    0     0  11969      0 --:--:-- --:--:-- --:--:-- 23050
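
For readers who have not used jq: the filter in that command is meant to emit one object per search result, keeping `from_user` and `text` and collecting every URL under `entities.urls`. A rough Python sketch of the same projection follows. It assumes the response has the shape the filter implies; the function name `project_results` and the sample data are made up for illustration.

```python
import json

def project_results(response: dict):
    """Yield one trimmed-down object per entry in response["results"]."""
    for result in response["results"]:
        yield {
            "from_user": result["from_user"],
            "text": result["text"],
            # Collect the "url" field of every entry under entities.urls.
            "urls": [u["url"] for u in result["entities"]["urls"]],
        }

# Hypothetical sample shaped like the structure the filter expects.
sample = {
    "results": [
        {
            "from_user": "alice",
            "text": "json is fun",
            "entities": {"urls": [{"url": "http://example.com/a"}]},
            "id": 1,
        }
    ]
}

for item in project_results(sample):
    print(json.dumps(item))
```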


(author here) That bug was fixed a while ago, but I forgot to upload new binaries. Try again, they should be working now.



Same issue here.



See also RecordStream: https://github.com/benbernard/RecordStream

Someone mentioned jsawk in a sibling comment, too.



> You can download a single binary, scp it to a far away machine, and expect it to work.

Not if my machine is x86 and the server is SPARC.

Cool utility though.


Git clone and make, then. As long as you have a C compiler. And if you don’t, you probably aren’t using JSON.


Actually, I bet there are a ton of people using JSON who don't have a C compiler. The technologies are spaced by several decades, and I would imagine the majority of JSON people are using much higher-level languages than C.

If I want to munge JSON, I find it easier to do so in Python: just convert it to data structures, manipulate them, and serialize the result back to JSON.
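
A minimal sketch of that round trip using Python's standard `json` module. The helper name `rename_key` and the sample document are illustrative assumptions, not anything from the thread.

```python
import json

def rename_key(doc: str, old: str, new: str) -> str:
    """Parse JSON text, rename a top-level key, and serialize it back."""
    data = json.loads(doc)             # JSON text -> Python dict
    if old in data:
        data[new] = data.pop(old)      # manipulate it like any other dict
    return json.dumps(data, sort_keys=True)  # dict -> JSON text

print(rename_key('{"from_user": "alice", "text": "hi"}', "from_user", "user"))
```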


I would usually use some other language with a JSON library as well, but this utility is conveniently one-function, after Unix fashion. I like composability. Also, a C compiler is usually on hand whether you use it or not; even if none is, they’re not exactly hard to obtain.


"they're not exactly hard to obtain"

Except, of course, on Mac OS X, where you have to download Xcode (a couple of gigabytes) from Apple and install it to get GCC :)


> The technologies are spaced by several decades

This intrigues me. While the widespread use of JSON as JSON is perhaps what differentiates it from what I'm about to describe, I propose that the concept is an inevitable or obvious means of data representation and manipulation. As an arguably irrelevant anecdote to support this proposition: before I was introduced to JSON, I was organizing data in strings and in txt files in a format nearly identical to basic JSON, and writing simple functions to do essentially the same thing. I submit that I am not special or particularly clever, and therefore the baller-coders who came before me certainly had to have their own types of crap like this. If I didn't have a tablet in class I probably would have had their names in my notes, but I digress.

tl;dr - While the advent of JSON's incarnation and that of C are several decades apart, the underlying concepts are not.

tl;dr;dr - just realized your second sentence kinda says this.


Also see App::PipeFilter - https://metacpan.org/module/App::PipeFilter

The module is a framework/collection of shell pipelines which include JSON processing:

  curl -s 'http://api.duckduckgo.com/?q=poe&o=json' |
  jsonpath -o '$..Topics.*.FirstURL' -o '$..Topics.*.Text' |
  grep -i perl |
  jmap -i col0 -o url -i col1 -o title |
  json2yaml
  
Output:

  title: Perl Object Environment, a library for event driven multitasking for the Perl programming language
  url: http://duckduckgo.com/Perl_Object_Environment
Other structured data can be added to this module (looks like Ingy may be adding YAML - https://github.com/ingydotnet/app-pipefilter).


Wow, this compiles fast. And the Flex and Bison files are there. This seems hackable. Nice work!

My only questions are (they are always the same with any purported sed/awk "replacement"):

1. what was the problem you were trying to solve where sed and awk failed you, and

2. does this program operate line by line or does it read entire files into memory?

I had to deal with some JSON a while ago and threw together some sed like this just so I could read it:

    sed '
    s/,/&\
    /g;
    /^$/d;
    s/^[{][^}]/\
    &/g; 
    /\"/s/,/<##eol##>/g;
    s/ *//;
    ' |tr '\012' '\040' |sed '
    s/<##eol##>/\
    /g;
    s/\[/\
    &\
    /;
    s/\]/\
    &\
    /;
    s/,  /, /g;
    '
But any difficulty I have dealing with JSON I attribute to the pervasive use of JSON, not sed.


Thanks!

1. I think your example shows sed/awk's failings with JSON data :) I don't want to write a JSON parser by hand every time I want to pull a field out of an object, and parsing recursive structures with regexes is never a good plan.

2. It reads JSON items from stdin into memory, one at a time. So if the input is a single giant JSON value, it all goes into memory, but if it's a series of whitespace-separated values they'll be processed sequentially.

It's cat-friendly: if you do

    cat a | jq foo; cat b | jq foo
then it's the same as doing

    cat a b | jq foo
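
To make the "one value at a time" behaviour concrete, here is a small Python sketch of reading whitespace-separated JSON values in sequence. It is not jq's implementation; `iter_json_values` is a hypothetical helper, and for simplicity it takes the whole input as a string, whereas jq as described above only holds one value in memory at a time.

```python
import json

def iter_json_values(text: str):
    """Yield each top-level JSON value from whitespace-separated input."""
    decoder = json.JSONDecoder()
    pos, end = 0, len(text)
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            return
        value, pos = decoder.raw_decode(text, pos)
        yield value

# Two hypothetical "files", each ending in a newline.
a = '{"x": 1}\n'
b = '[2, 3] "four"\n'

# Processing the concatenation gives the same values as processing each
# file in turn, which is the cat-friendly property described above.
combined = list(iter_json_values(a + b))
separate = list(iter_json_values(a)) + list(iter_json_values(b))
print(combined == separate)
```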


1. But those are general statements. Opinions. What I mean is give me a specific case. A specific example, a specific block of JSON and a specific task. Once I have that, then I can ask myself "Is this something I would ever need to do or that I have to do on a regular basis?"

Sometimes I need to write one-off filters. There is just no getting around it. I have to choose a utility that gives maximum flexibility and is not too verbose; I don't like to type. Lots of people like Perl, and other similar scripting languages for writing one-off filters. But Perl, _out of the box_, is not a line-by-line filter. It's unlike sed/awk; it needs more memory. That brings us to #2.

2. If I understand correctly, jq is reading the entire JSON block into memory. This is what separates your program and so many other sed/awk "replacements" from sed and awk, the programs they purport to "replace". sed/awk don't read entire files into memory, they operate line-by-line and use a reasonably small buffer. Any sed/awk "replacement" would have to match that functionality. Given that sed/awk don't read in an entire structure (JSON, XML, etc.) before processing it, they are ideal for memory constrained environments. (As long as you don't overfill their buffers, which rarely happens in my experience.)

Anyway, so far I like this program. Best JSON filter I've seen yet (because I can hack on the lexer and parser you provided).

Well done.


Got the following compile error when running make (Lion + Xcode 4.5.1):

  bison -W -d parser.y -v --report-file=parser.gen.info -o parser.gen.c
  bison: invalid option -- W
  Try `bison --help' for more information.
  make: *** [parser.gen.c] Error 1


(author here) I haven't tried to build on a Mac in a while; bison there seems to support fewer options (must be an older version).

For now, I've just checked in the autogenerated parser, so that bison won't have to run when you build master. git-pull and try again :)


Works great now, thank you:)


FYI, if you have Python installed (and most do), you can just do

curl 'http://search.twitter.com/search.json?q=json&rpp=5&i... | python -mjson.tool
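
What `json.tool` does is roughly a parse followed by an indented re-dump. A minimal sketch of the same thing as a function; the name `pretty` is an assumption for illustration, and key ordering may differ from `json.tool` depending on the Python version.

```python
import json

def pretty(text: str) -> str:
    """Parse JSON text and re-serialize it with 4-space indentation."""
    return json.dumps(json.loads(text), indent=4)

print(pretty('{"results": [{"from_user": "alice", "text": "hi"}]}'))
```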


Very nice tool, congrats!

This is something everyone ends up needing at least one time, and we always end up using a language specific tool.

Nothing beats command-line piping.



When doing the tasks this seems to be targeting, Jansson has worked well for me.

http://www.digip.org/jansson/


Built for Windows using Cygwin, works a treat.


Builds with MinGW as well, although I did have to manually edit a header file because I couldn't get sed to work properly.


See also jsontool, http://trentm.com/json/


We periodically have to munge JSON at work—will definitely be trying this out.


This is awesome. Thank you.


Thanks; very nicely done. :-)


I vote for a man page!






