More

wsmith · on Dec 3, 2016

The new garbage collector should be based on well-researched and proven algorithms, together with a couple of thoroughly evaluated innovations, where appropriate. The real innovation should be in the specific mix of techniques, forming a coherent and well-balanced system, with meticulous attention to detail and relentless optimization for performance.

Sometimes I feel there's an inverse correlation between the strength of claims for the future and the end result.

wsmith · on Dec 3, 2016

>Things still felt very new then, like Christmas. The excitement and freshness was like that.

> I had hit a ceiling in terms of what I was learning and I got bored, to be totally honest.

I wish more companies understood how common this is.

I also liked how you didn't investigate the idea of Kickstarter that much. I've heard other people say this too about good work they've done, that they fell into things. It makes me wonder if we're paying attention to the right signs before making decisions.

Which signs do you pay attention to now?

cassiemarketos · on Dec 3, 2016

I've said this before and I will say it again. A long time ago, a friend gave me a piece of advice, re: making career choices: "Don't think about what you want to do, think about how you want to feel."

So much about making the "right choice" is about understanding and being responsive to your emotional intuition -- not a category of feeling that you often hear celebrated in (speaking frankly) bro-heavy tech spaces.

I still pay attention to how an opportunity makes me feel: excited? afraid? leery?

Then I work backwards to: Why? (Sometimes, being a little afraid can be a good sign, haha -- it means something will challenge you.)

And I pay close attention to the people I would be working for: how do they communicate? who do they respect? how do they picture the future?

I also think about my own end goals. What am I looking for? Do I just need a paycheck? (That happens sometimes.) Is there a specific skill I'm trying to master? Will a role be a stepping stone toward a bigger picture, long term goal?

It's a confluence of factors, and there's no science on how to balance them against each other. Again: that emotional intuition will guide you. (Keep it well honed.)

Oof, long answer. Does that help at all? Feel like I might have gone off deep into left field with this one. :)

captain_crabs · on Dec 4, 2016

A friend linked just this comment to me, and as somebody amidst an existential crisis, this is such a great answer.

That emotional intuition - have you noticed or observed any ways in yourself, or in others, to really keep this sharp?

cassiemarketos · on Dec 4, 2016

Glad to hear it was helpful!

Yes, definitely. You have to carve out time for yourself, very deliberately, and be conscious about minimizing the amount of stimulation and distractions you are letting in. Spend time alone, go for long walks with your phone off, let your mind wander. Devote mental time to the things that scare you and trouble you -- really, really lean into them. (Another friend once said to me: "Embrace the struggle." Also a helpful mantra.) Therapy, if it's an option, can be great self maintenance. So is daily meditation, even if just for ten minutes. (It helps you get into the habit of unplugging.)

The work of focusing on your "feelings" is interesting because, often, it's actually the opposite of "focus." It's more like letting yourself drift freely and, in doing so, mapping your interior sea. :)

(Not to get tooooo hippy BS about it. Ha.)

captain_crabs · on Dec 5, 2016

You've characterized what the act of honing emotional intuition needs, then what one would actually look like: block distractions, go for a long walk, generally think about the things you're uncomfortable about but let your mind wander. I think the hard part is, how much time do you need to devote before you get something back out of it? Best policy is probably just "always do it."

And in the last line, you elicited a reaction from me to say "This isn't hippy bs" and buy in to what you've said. Thanks for the deceptively good answer :D

wsmith · on Nov 30, 2016

You may want to try using an ML algorithm called association rules, which produces rules automatically. Though acccounting for the sequence of events would be harder.

wsmith · on Nov 26, 2016

I once reduced the running time of a report from 45 minutes to 3 seconds (900x improvement) by moving the code inside the database.

If a programming language wants to stay fast it must eventually become a database. I realize this is an unpopular opinion but popularity is the wrong metric to judge by.

Normal_gaussian · on Nov 26, 2016

576 hours -> 12 minutes (2880)

Admittedly the code I was moving away from was hilariously bad. Threaded custom build of PHP bad. Then again I haven't tried to optimise the code I wrote at all

You are absolutely right about fast code becoming a database, this is simply down to the query planner - it can try and do the least possible work for the data you actually have.

inopinatus · on Nov 26, 2016

I recently used temporary per-transaction tables (CREATE TEMPORARY TABLE .. ON COMMIT DROP, basically CTEs that persist across statements, and that can be indexed) with json_to_recordset and turned a three-minute ruby ETL background process into a sub-1-second inline call.

majewsky · on Nov 26, 2016

CREATE TEMP TABLE is really awesome. Not really related, but I used it at my previous gig to optimize the unit tests. They would previously all use the same database that devs use during development, so scans over some large tables were particularly inefficient, and tests could break when someone modified table contents for some development task.

I implemented a small function where you could list the tables that you were working with, and it would create empty temporary tables with the same name in a namespace that takes precedence over the normal namespace in postgres' search_path, therefore giving you a blank slate to work with for the unit test, while the other tables were still populated with useful data. (Some of these data come from expensive infrastructure scanning jobs, so starting from an entirely empty database was not an option.)

debaserab2 · on Nov 27, 2016

What language were you using to do that?

Just curious - not sure if you were using Ruby like the parent above. I literally just built a library to back ActiveRecord models with temporary tables. It's very useful since you can use AR scopes off of your temporary table then (I had to do this instead of using CTE's because my DB currently doesn't support CTE's)

https://github.com/ajbdev/acts_as_temptable

Just thought I'd share it in case it helps, or to hear if anyone else has a better approach to this. (warning: it's very experimental and not production tested yet). I couldn't find any other gem that would do this.

inopinatus · on Nov 27, 2016

Cool idea.

How did you resist the temptation to call it temptation?

Although I think temptable is almost the opposite of contemptible, therefore also good going in the pun department.

My code is all Ruby, and I ended up pushing all the work into SQL and just eventually selecting with find_by_sql into existing models for the results. There is possibly concurrent invocation vs updates and a race against itself, so it's also all wrapped with a serializable transaction and starts with a mutex lock on the controlling model.

debaserab2 · on Nov 27, 2016

Hah! I was sitting around trying to think of a clever name, but then I got tired of sitting around and just went with temptable. Had I thought of temptation I would have gone with that.

The approach I went with works really well for cases when you want to persist the temporary table through the end of the request (it works good for master/index type views that may have a lot of sums/counts and filter options available on it).

harperlee · on Nov 27, 2016

One shower idea that I had is the concept of a clojure version that instead of compiling to JVM bytecode, javascript or CLR, compiled to postgreSQL. I think that would be awesome: you could just run the same functions, seamlessly, from the database, through the web server, to the browser. And, whilst of course you need to know about the limitations of the database, it could be great for pushing code to the database seamlessly.

Unfortunately I don't think I have the skills for that, so posting here in the hopes that someone that can likes the idea :)

brassic · on Nov 27, 2016

This is kind of what LINQ does. You write a query in C#. The structure of the query (like an abstract syntax tree) can be exposed to the query provider which can interpret it or compile it to the CLR or compile it to SQL or whatever.

Someone · on Nov 27, 2016

A sufficiently advanced LINQ (http://wiki.c2.com/?SufficientlySmartCompiler) would do wonders in some cases, but I haven't encountered it. AFAIK, LINQ to SQL only knows about joins and some aggregate functions.

One thing that makes improving it cumbersome is that the semantics of many operations are slightly different in the database than in C#. For example, SQL Server doesn't do NaNs and infinities, supports a zillion string string collations, and its date types may not 100% map to C# ones.

Also, databases may run stored procedures faster than the SQL that LINQ genrerates on the fly because they can cache their query plans (most databases will detect that an on-the-fly query is equal to one run recently, though, so this may be somewhat of a moot argument)

tracker1 · on Nov 27, 2016

Which can lead to great, or less ideal results.. Seeing some of the resulting queries via monitoring are wild though.

__s · on Nov 27, 2016

At work we've got a LINQ query that gets passed around a few functions, eventually getting into a core of 60 lines of LINQ logic. Colleague verified that chaining selects produces different output, but gets ran at the same speed (Chain selects in order to somewhat declare variables, Select(x => new { x, y = <query-x> }) then you can Select(xy => new { use xy.y multplie times }))

Sometimes I think I should just be using sql.. (which we do on other projects)

tracker1 · on Nov 28, 2016

Yeah, I've been pretty happy without ORM in node.js, I even wrote a semi-nice wrapper so I could turn template strings into parameterized queries. Made writing a bunch of migration scripts a cakewalk.

Sometimes it's really just easier to write SQL directly.

vincentdm · on Nov 27, 2016

Do you mean something like Datomic? It has been intriguing me for a while...

dom0 · on Nov 26, 2016

Hence Kx, ZODB.

takeda · on Nov 27, 2016

I'm not too familiar with ZODB, but it looks like it tries to impose OO on a database, while in reality relational model works best with data, so making programing language able to interact with data that way would be better. I think something like JOOQ[1] is close to that.

In order to get good performance you want to minimize number of back and forth requests over network. So instead of making a request to obtain list of items, and then fetching each of the item one by one (so called N+1 select issue) you will get a better performance if you make the database send only the data you want, nothing more, nothing less.

[1] http://www.jooq.org/

dom0 · on Nov 27, 2016

> I'm not too familiar with ZODB, but it looks like it tries to impose OO on a database, while in reality relational model works best with data, so making programing language able to interact with data that way would be better.

ZODB itself is essentially a transactional single-object store, where that "single object" is usually the root node of an arbitrary (possibly cyclic) object graph. Storage depends on the backend, nowadays you'd usually use file storage for development and relstorage for any kind of deployment.

It doesn't have any query language, Python is. There are no indices beyond those that you explicitly create (eg. a dictionary would be a simple index, but other packages provide more complex indices, like B-Trees - the nice thing here is that anything that goes into the DB is fully transacted, which removes a lot of headache that you'd have with other solutions (eg. flat XML files)).

ZODB thrives when your data model is too complex to easily or efficiently map into SQL, when you don't do or need efficient complex ad-hoc queries and when 180 % performance and concurrency (although this was recently improved through the MVCC reimplementation) isn't the highest priority. Since it's pretty good at caching (much better than most ORMs) performance usually doesn't suck.

the-dude · on Nov 26, 2016

ZODB is not a database, nor is it fast.

dom0 · on Nov 26, 2016

More like a database toolkit. Performance depends on application but was clearly never the top priority. (Can still be pretty impressive in comparison due to the rather effective instance caching)

PeCaN · on Nov 27, 2016

Yep, the K interpreter is pretty mediocre (it is very small and the language is inherently pretty fast, but the interpreter is otherwise not very advanced¹) but being directly integrated with a very optimized in-memory column-store database means it smokes anything else. Even the massive engineering effort in a JVM+RDBMS simply can't compete with running K directly on memory-mapped files.

1. I believe they added SIMD support a year or two ago and got a massive (2-10× IIRC) speedup.

wsmith · on Nov 24, 2016

A friend proposed another theory, inappropriate as it may seem. That it's a way to draw attention to Reddit needing to make money, and Steve, being the CEO, has to find a way to do it and take blame for it.

If so, he deserves to be applauded. There are things a founder/CEO has to do you can't say.

flopsey · on Nov 24, 2016

How would what he did do that?

Grangar · on Nov 24, 2016

Bad PR is still PR.

wsmith · on Nov 24, 2016

I don't see a Twitter clone getting enough users any time soon.

wsmith · on Nov 23, 2016

I want to get a job within 3 days, not 3 months.

bobyscaph · on Nov 26, 2016

I am in the same line. Time is key there. Companies & canditates are often on the same page on this.

Fast & efficient sourcing & applying process: Smart matching to offer me only relevant and up-to-date. One click to apply.

Real time update on processes I applied: Has it been reviewed but the company? Are they interested or not? How many candidate are still in the process?

cauterized · on Nov 26, 2016

I want to have time to think over and compare offers instead of being pressured to make a decision before they explode.

wsmith · on Nov 18, 2016

by relaxing constrains, constant expressions became so powerful that a C compiler can be implemented in!

I wonder how long it will take before C++ accidentally becomes a Lisp.

gumby · on Nov 18, 2016

I discussed this very thing with Stroustrup as far back as the mid 1990s. He was, for example, a big fan of Lisp's macro capabilities, and described templating as a way of providing this kind of capability while (back then it was a goal) the ability to compile C.

(Compiling C is mostly out the window now of course)

marssaxman · on Nov 18, 2016

Why do you say that compiling C is "mostly out the window"? It's difficult to imagine how C++ could change in a way that broke C compatibility without also breaking compatibility with a great deal of C++ code.

OskarS · on Nov 18, 2016

C++ isn't compatible C, it never was. There's lots of valid C code that isn't valid C++. For instance, this line:

    int *new = malloc(sizeof(int));

is valid C but invalid C++ for two different reasons: "new" is a keyword in C++, so you can't use it as an identifier, and C++ doesn't allow implicit casts from void*, while C does.

dmitrygr · on Nov 18, 2016

int new = old + 5;

:)

marssaxman · on Nov 18, 2016

Oh, sure, and the void pointer thing, etc., but that's all been true from the start. From gumby's comment I got the impression that there might be some more recent trend toward abandoning C compilation, and that's what I have a hard time imagining.

spc476 · on Nov 18, 2016

For years, Microsoft's C compiler only supported C89 and would choke on C99 code (that used C99 features). That has recently changed (I'm guessing with the release of C11, which made certain C99 features optional instead of mandatory) but for a long time, it seemed that Microsoft was saying "We don't care about C."

gumby · on Nov 18, 2016

That's basically what I mean: both the C++ and C committees have been OK with C++ not being a proper superset of C.

The reason I mentioned it is that some of the syntactic horrors of C++ could safely be thrown out the window these days.

BTW though I'm primarily still a Lisp programmer I do like C++

munificent · on Nov 19, 2016

It's true that the C circle in the Venn diagram leaks out of the C++ circle a bit, but I find in practice most of it is contained quite well inside the C++ one.

My current hobby project is a language VM that compiles cleanly as both C and C++. It took very little effort to get it working in C++ despite hacking on it for months as a strictly C project first.

knome · on Nov 18, 2016

C++'s template system is more of a dynamically typed Haskell.

xwvvvvwx · on Nov 18, 2016

This video (from the amazing Bartosz Milewski) does a great job of illustrating this:

https://vimeo.com/7211030

RcouF1uZ4gsC · on Nov 18, 2016

Learning Haskell will actually help you to understand C++ templates much better.

wsmith · on Nov 15, 2016

Is Spark faster than MemSQL?

gopalv · on Nov 15, 2016

MemSQL is a transactional database (system of record).

Spark is a way of processing data, ideally stored in a system of record (Hive/HDFS/S3/MemSQL etc).

They're not the same.

wsmith · on Nov 16, 2016

There are similarities. A database is also a way of processing data.

For the kinds of processing both Spark and MemSQL do (e.g. join operation) is Spark faster than MemSQL?

wsmith · on Nov 15, 2016

Quantopian will bill you immediately for the first month's subscription.

How much does a month's premium data subscription cost?

joshuapayne · on Nov 15, 2016

Each premium data set is priced differently. Prices are set by the data vendor, ranging from totally free to $150/month. Most are in the $5 - $50/month range. (I work at Quantopian).