
This is the script I am running - https://pastebin.com/65GxJ9i9. The `-` after `<<` strips the leading tab indents in the heredoc.
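For anyone following along, here is a minimal sketch of that `<<-` behavior. It is not taken from the pastebin script. The variable names and the sample text are made up for illustration, and the tiny test scripts are built with `printf` so the tabs are written as `\t` and cannot be silently converted to spaces by an editor or a paste.

```shell
# Hypothetical demo, not part of lexit.sh.
# Case 1: body and terminator are indented with a real tab; <<- strips it.
script_tabs=$(printf 'cat <<-EOF\n\tindented with a tab\n\tEOF\n')
out_tabs=$(printf '%s\n' "$script_tabs" | sh)

# Case 2: body is indented with four spaces; <<- leaves spaces alone.
script_spaces=$(printf 'cat <<-EOF\n    spaces stay\nEOF\n')
out_spaces=$(printf '%s\n' "$script_spaces" | sh)

printf '[%s]\n' "$out_tabs"
printf '[%s]\n' "$out_spaces"
```

Only leading tabs are removed, so a heredoc indented with spaces reaches the command with its indentation intact.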

This is what it produces for me when I run `lexit.sh us.yahoo.com` - https://stuff-storage.sfo3.digitaloceanspaces.com/ee.txt




https://news.ycombinator.com/item?id=27490265 <-- yy054

The "gibberish" is GZIP compressed data. "yy054" is a simple filter I wrote to extract a GZIP file from stdin, i.e., discard leading and trailing garbage. As far as I can tell, the compressed file "ee.txt" is not chunked transfer encoded. If it were chunked, we would first extract the GZIP, then decompress and finally process the chunks (e.g., filter out the chunk sizes with the filter submitted in the OP).

In this case all we need to do is extract the GZIP file "ee.txt" from stdin, then decompress it:

    printf "GET /ee.txt\r\nHost: stuff-storage.sfo3.digitaloceanspaces.com\r\nConnection: close\r\n\r\n"|openssl s_client -connect 138.68.34.161:443 -quiet|yy054|gzip -dc > 1.htm
    firefox ./1.htm
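
A quick way to check whether "gibberish" is GZIP data is to look at the first two bytes, which are 1f 8b for a GZIP file. The sketch below assumes the input starts directly with the compressed data (for example a saved file, or the output of a filter that has already dropped any leading headers). `is_gzip` is a hypothetical helper name, not something from the linked scripts.

```shell
# Hypothetical helper: succeeds if stdin begins with the gzip magic bytes 1f 8b.
is_gzip() {
    # Read exactly two bytes and print them as hex with no spaces.
    magic=$(dd bs=1 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Example: compressed input passes the check, plain text does not.
if printf 'hello' | gzip -c | is_gzip; then echo "gzip"; else echo "not gzip"; fi
if printf 'hello' | is_gzip; then echo "gzip"; else echo "not gzip"; fi
```

With a saved file it could be used as `is_gzip < ee.txt && gzip -dc < ee.txt > 1.htm`.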
   
Hope this helps. Apologies, I initially guessed wrong about the here doc. I was not sure what was meant by "gibberish". It looks like the here doc is working fine.


New pastebin. I had a typo in the old one: https://pastebin.com/4j9Z3eCc


Need to get rid of the leading spaces on all lines except the "int fileno" line. Can also forgo the "here doc" and just save the lines between "flex" and "eof" to a file. Run flex on that file. This will create lex.yy.c. Then compile lex.yy.c.

The compiled program is only useful for filtering chunked transfer encoding on stdin. Most "HTTP clients" like wget or curl already take care of processing chunked transfer encoding. It is when working with something like netcat that chunked transfer encoding becomes "DIY". This is a simple program that attempts to solve that problem. It could be written by hand without using flex.
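
As a rough illustration of what "written by hand" might look like, here is a small shell sketch. It is not the flex program from the pastebin. `dechunk` is a hypothetical name, and the function assumes well-formed input with the headers already removed: a hex size line, that many bytes of data, a CRLF, repeated until a zero-size chunk. Chunk extensions are ignored and trailers are not read.

```shell
# Hypothetical helper: strip chunked transfer encoding framing from stdin.
dechunk() {
    while IFS= read -r line; do
        # Keep only the leading hex digits (drops the "\r" and any ";ext").
        size_hex=${line%%[!0-9A-Fa-f]*}
        # Stop on a line that does not start with a hex chunk size.
        [ -n "$size_hex" ] || break
        size=$((0x$size_hex))
        # A zero-size chunk marks the end of the body.
        if [ "$size" -eq 0 ]; then
            break
        fi
        # Copy exactly $size bytes of chunk data to stdout.
        dd bs=1 count="$size" 2>/dev/null
        # Consume the CRLF that follows the chunk data.
        IFS= read -r crlf
    done
}

# Example with two chunks, "hello" and " world".
printf '5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n' | dechunk
```

Reading one byte at a time with `dd bs=1` keeps the sketch simple and avoids reading past a chunk boundary, but it is slow on large responses, which is one reason a compiled filter is more practical.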


Okay, I'll give up for now. There are really no spaces in front of the lines. In pastebin, if you check the raw version, you'll see they are tabs, which get stripped out because I added a `-` before eof. Providing the file manually to flex also produces the same gibberish for me.



