I disagree. I think we should generalize more, beyond codes of conduct: If you are building an adjudication process for resolving non-criminal personal conflicts (whether that be a Code of Conduct, an HR department, a Title IX proceeding, a professional organization, or something else), you should take a look at Anglo-derived common law and the safeguards against abuse that have been evolved over the centuries.
That doesn't mean everything needs to go through the courts; it means that if your process allows something that Anglo common law does not allow, you should have a good answer for why that is. Does it allow anonymous accusations? Is the accused allowed to know the charges against them, before a finding of guilt is rendered? Is there a presumption of innocence? Is the accused allowed to have a trusted third party - one who knows the rules of the game - to advocate on their behalf? Who, exactly, is responsible for deciding matters of fact vs matters of "law"? Is there an appeals process to fix possibly incorrect decisions?
Going by the linked document by Valerie Aurora, a good Code of Conduct allows anonymous accusations, the accused does not get to know the charges against them before a finding is rendered, there is no presumption of innocence, the accused does not get a third party advocate, matters of fact are necessarily decided by the same committee that makes the rules, and there is no appeal process.
This doesn't mean that such a committee will always do wrong. But I think it's worth thinking about how people operating in bad faith (either on the committee, or reporters to it) can abuse those features to achieve goals that are not actually aligned with what the Code of Conduct is trying to do. Yes, it's true that people can't be put in jail for these sorts of things, but a poorly-run adjudication process can have significant negative personal and financial effects on people.
It's an ongoing debate, but there are some facts that don't change regardless of where you stand in it.
If you limit HFT, then markets /do/ become less efficient. When two trading venues for the same instrument exist, say, in Europe and the US, no participant benefits from a price disparity persisting for any length of time. If $GOOG tanks 10% in a day and a retail investor buys $GOOG at its old price in Europe, who is better off? Similarly for a seller in Europe: what if you were fleeced out of 10% because your holdings moved favourably in the moments before you sold?
HFT also makes trading cheaper for everyone. Much of the time these firms are primarily competing with each other. One way that competition manifests is in the bid/ask spread. With firms fighting for order flow, if they can improve their offer by a single cent to ensure price-time order book priority then they'll improve their price. For Joe Retail buying or selling that instrument, he just received a small improvement on the spread as a side effect of essentially duelling titans.
There are more aspects I'm not smart enough to discuss, like how HFT basically enables entire asset classes through dynamic hedging. The only reason option markets exist on most stocks is that the HFT counterparty that sold you the option rapidly buys and sells the underlying stock to keep its exposure to the position limited to the premium you paid for the option.
HFTs also provide some market stability, first through increased liquidity having a volatility-smoothing effect, and second through so-called "volatility compression", where the dynamic hedging of option market makers causes HFTs to buy when others are selling and sell when others are buying. This one has a darker side: depending on their aggregate positioning, HFTs will eventually begin to dump just like everyone else.
I believe HFTs are also necessary for exchange-traded funds to be priced correctly and function correctly. That's essentially because two markets always exist for an ETF: a primary market between dealers and the fund where creation/redemption units are traded, and the secondary market where regular folk buy its shares. Given the prevalence of ETFs as a retail investing vehicle, if they were mispriced this would be potentially disastrous for individual investors.
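To make the creation/redemption mechanism concrete, here's a toy sketch of the arbitrage that pins an ETF's price to its net asset value. All the numbers are hypothetical, and real trades involve fees, hedging and inventory risk:

```python
# Toy creation/redemption arbitrage -- all numbers hypothetical
nav = 100.00          # per-share value of the ETF's underlying basket
etf_price = 100.50    # ETF quote in the secondary market
unit = 50_000         # shares per creation unit (varies by fund)

# The ETF trades rich vs NAV, so an authorized participant buys the
# basket, exchanges it with the fund for new ETF shares (a creation),
# and sells those shares on-exchange, pocketing the gap
gross = (etf_price - nav) * unit
print(gross)          # 25000.0 before costs
```

Competition for exactly this trade (and the mirror-image redemption when the ETF trades cheap) is what keeps the secondary-market price glued to NAV.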
There are probably a bunch more good reasons for HFT; I'm not sufficiently versed in this stuff.
I have come to the conclusion that reddit is 2 apps/websites.
The first is a tiktok-esque waste-of-time, instant-gratification meme machine. Everyone in this group uses the native reddit app and doesn't care about dark patterns. They would never know that reddit's app is shit, because they literally don't care enough.
The second is a Hackernews-esque collection of hobbyist sub-forums. These people are invested in their hobbies and subreddits. They use reddit to interact and discuss, but also as a source of niche news for their hobby. Everyone here has a 3rd-party reddit app or uses RES (if you don't, please do). Unlike twitter, reddit lets 3rd parties offer feature-complete wrappers for reddit. This group has ad-block, but will occasionally give someone gold. This group hates reddit, but also has nowhere to go.
If a person tries to use reddit as both, it is an annoying experience. I use reddit entirely as the latter. The front page and r/all are garbage to me. Every super-popular (barring sports) subreddit is trash. But my niche subreddits are literally the best places on the internet to gain niche information.
Examples where the subreddit is the best source of open discussion on the internet for that niche: dota2, manga, soccer, metal, prog, civilized discourse, history, male fashion (kinda), calisthenics, small cities, fantasy fiction, niche YT channels, super authentic cooking....and that's just for me.
PSA: Use a 3rd-party reddit app (Sync is my preferred one; pay the $2, it is worth it). Use RES, and enforce filters strongly. Use RedditProTools to detect trolls, bias and top contributors. Use Imagus (hoverzoom has malware) as a pop-up image/video viewer. These will greatly enhance your reddit experience.
"Sapiens: A Brief History of Humankind" by Yuval Noah Harari.
Hands down the book that most influenced me. The book had (for me) not one but several simple-yet-profound ideas that were forever inserted into the foreground of how I make sense of the world. For example, the existence of shared myths that allow humans to cooperate on a large scale. Or how I, too, am religious, though I was sure I wasn't.
This is Matthew Butterick. I wrote “Why Racket? Why Lisp?”
As I allude there, Paul Graham’s writings about Lisp (mostly in Hackers & Painters) helped persuade me to explore Lisp languages. (Those writings have also persuaded many others.)
In particular, Arc's reliance on Racket persuaded me to take a serious look at Racket. So leaving aside quibbles about what “on top of” means — is Clojure not built “on top of” the JVM? Python “on top of” C? — Paul’s choice of Racket was influential in my choice too. (As it has been for many others.)
As for software being “built in [one’s] head,” that seems facially true of any software. The core thesis of “Beating the Averages” is that the tool you choose to get it out of your head and into the world matters. Having now had my own Lisp revelation, I not only buy Paul’s thesis, but I even think it could be strengthened: Lisp permits the implementation of a whole category of ideas that aren’t possible in other languages.
Moreover, Paul wrote that essay nearly 14 years ago. Since then, Lisps have gotten somewhat more popular (Clojure has led the pack). But as I say in the article, as a group, Lisps remain way behind the programming mainstream. So ultimately, my goal is not to evangelize for Racket and exclude other Lisps. I know Racket better because that’s what I use. But more people using all of them would be a great thing.
I'm actually using Pollen to create my online book[1] and it's been very good.
While I write my blog in markdown, it's super nice to be able to mix real code in with your markup language. For example, if you want to create a special layout for a specific page, or a table with subtly different properties than the rest, it's easy.
It's also very powerful to extend the markup itself. I added support for Tufte style sidenotes[2] which I use extensively throughout the book. This is the markup for the sidenotes:
Lisp is a pretty nice◊sn{cult} language.
◊ndef["cult"]{
Some may say it's the language to rule them all.
}
The way it uses X-expressions to represent text is really intuitive and easy to work with. I do think there's merit in doing this in Lisp instead of, say, Python, because the modeling maps so well to Lisp.
I'm a CS teacher and use Pollen to write my assignments (sample: https://drive.google.com/file/d/0B9DAZWpkQIDuUlA4SU5lcHItRDg...). It allows me to type assignments as fast as I can think them, in a specialized mark-up language for exactly the type of assignments I give. I can then apply various post-processing operations (like doing syntax highlighting, converting straight quotation marks to curly but not when inside sample code, numbering of problems, inserting special CSS to make the assignments print with nice margins / without breaking up headings and paragraphs that belong with them, etc.) all with very little pain.
It's also neat to show students, after they've spent months learning Racket, that all sorts of useful things -- including the writing of their assignments! -- can be automated with their new skills.
Here's my favorite interview question (spent 10 years as a quant, interviewed a bunch of people, most do not do well on this)
We're going to play a game. You draw a random number uniformly between 0 and 1. If you like it, you can keep it. If you don't, you can have a do-over and re-draw, but then you have to keep that final result.
I do the same. You do not know whether I've re-drawn and I do not know whether you've re-drawn. Decisions are made independently. We compare our numbers and the highest one wins $1.
What strategy do you use?
EDIT: The answer is not 0.5, it is not 0.7 either.
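For anyone who wants to poke at it numerically (spoiler warning), here's a sketch that assumes both players use a threshold strategy -- keep the first draw iff it is at least t -- and iterates discrete best responses until it reaches a symmetric fixed point:

```python
import numpy as np

def final_cdf(x, t):
    # CDF of a player's final number under threshold t:
    # with prob t they redraw (uniform on [0,1]),
    # with prob 1-t they keep a uniform draw from [t,1]
    return np.where(x < t, t * x, (1 + t) * x - t)

def win_prob(a, b, n=20_000):
    # P(threshold-a player beats threshold-b player):
    # integrate F_b(x) against a's final density (midpoint rule)
    x = (np.arange(n) + 0.5) / n
    density = np.where(x < a, a, 1.0 + a)
    return (final_cdf(x, b) * density).mean()

def best_response(b, grid):
    return grid[np.argmax([win_prob(a, b) for a in grid])]

grid = np.linspace(0.40, 0.90, 251)   # threshold candidates, step 0.002
t = 0.5
for _ in range(15):                    # iterate best responses to a fixed point
    t = best_response(t, grid)
print(round(t, 3))                     # settles near the golden ratio conjugate
```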
One is the derivatives quant, basically someone who spends a lot of time working out the values of derivatives. So you're looking at Ito calculus and iterative methods to find the value of contracts with weird clauses (knockouts, cliquets, vanillas, etc.), and then building spreadsheets to calculate the hedges for those trades as time goes by.
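As a minimal flavour of the pricing side, here's a Monte Carlo sketch for a plain vanilla European call under Black-Scholes assumptions; the parameters are illustrative, and real desks price far gnarlier contracts than this:

```python
import math, random

# Toy Monte Carlo price of a European call under geometric Brownian
# motion (Black-Scholes assumptions); parameters are illustrative
S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0

random.seed(0)
n = 100_000
payoff = 0.0
for _ in range(n):
    z = random.gauss(0.0, 1.0)
    # terminal stock price on one simulated path
    ST = S0 * math.exp((r - 0.5 * sigma ** 2) * T + sigma * math.sqrt(T) * z)
    payoff += max(ST - K, 0.0)

price = math.exp(-r * T) * payoff / n
print(round(price, 2))   # close to the Black-Scholes value of about 10.45
```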
The other is the strategy quant. Here you're looking for some way to beat the market. So you're reading in a bunch of data from various sources and applying some sort of quite open-ended analysis to it to come up with rules for a trading system. You then build a framework for trading such systems, which can get very involved, e.g. execution algos that read realtime market data.
I started off in derivatives as a trader but quickly moved on to the other. They're both fairly deep, and both require you to know coding and math.
The questions here seem to apply mostly to derivs quants, but strategy quants will find many of them applicable too.
1) Pricing quants
You work for a bank or an investment house like Goldman. You know stochastic calculus very well; you know finite differencing like the back of your hand. You'll know every way to look at a derivative product. You can program in Matlab. You create the next big thing, like CDOs or CDSs.
If you love math, this is where you want to be.
2) Quants who trade
You work at a hedge fund. You can program in python with scikit, or use R. You maybe don't know calculus as well as a pricing quant, but you know some area of the market much, much better. You know stats like it's your mother tongue.
3) Risk quant/programmer. You do modelling all day for a trader or risk manager. You can take a portfolio and model any feature that someone asks for. VaR, beta, greeks -- you can spit them out quickly. You know C++, Excel and R. You might have been an engineer in a former life.
This is often considered the low position in the quant hierarchy. This is how some programmers break into the industry.
4) I lied, there is actually a special category 4). These are the professors of quants. You sit around all day and think about the next big equation, or how to model derivatives better than Black-Scholes. If you are one of these people, chances are your name is known in the industry :)
EDIT: someone asked if you can do these without a math degree. The short answer is probably not. You'll need a math, engineering, or physics degree, unless you really are driven enough to learn the math yourself.
A comp sci degree might work, but you would be the exception and you'd be fighting uphill. Ask yourself honestly how many branches of mathematics you have taught yourself, and you'll get your answer. For most the answer is none; for a select few it's several.
Someone asked about brain teasers. They do get asked; you have to deal with it. Whining in an interview that this type of question gives no useful hiring signal won't get you hired. I don't ask them, but they are common :(
We throw around brain teasers at work during the day trying to stump each other. I guess some of it is trying to look smart. Some of it is that we just really like to dissect any problem and figure it out.
I think the biggest reason for asking brain teasers is that at a hedge fund there are no rules for how to make money, beyond staying legal. They want people who can think outside the box. It turns out that it's really tough to test for "can the candidate think outside the box?"
Trading is a higher-stress job than most other math or programming jobs. What we are trying to see is not only that you are smart, but that you can think on your feet and not get stressed out, because if an interview stresses you out, what's going to happen when a $50,000,000 position that should be going up starts to go down?
Are you going to complain that the model says it should go up and the market isn't being fair, or are you going to accept what's happening and get back to work? You'd be surprised at the number of people who chose option 1.
I ask a bunch of questions that some people feel are brain teasers. In between questions I'll throw out "what's the square root of 225?" to see how the candidate reacts. Good traders seem to be exceptional at mental math, with very, very few exceptions.
I'll also ask a lot of probability questions. Get ready to know your expected payoff if you gamble on dice games; it's basically what you do every minute of the day when trading :)
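As a flavour of that kind of question, here's a toy example of my own (not from any particular firm): you roll a die, get paid the face value, and may take one optional re-roll. The right play is to keep the first roll iff it beats the re-roll's expectation:

```python
from fractions import Fraction

faces = range(1, 7)

# Expected payoff of a single roll: (1+2+...+6)/6 = 7/2
ev_one = Fraction(sum(faces), 6)

# With one optional re-roll, keep the first roll iff it beats 7/2,
# i.e. keep 4, 5 or 6; otherwise re-roll and take the expectation
ev_reroll = Fraction(sum(max(Fraction(f), ev_one) for f in faces), 6)

print(ev_one, ev_reroll)   # 7/2 17/4
```

The option to re-roll is worth 3/4 of a point, which is exactly the sort of number you're expected to produce in your head.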
Now if you are a programmer, how do you get into the industry?
You need to know stats, machine learning, and programming, really well.
And I don't mean knowing machine learning like "I took an NLP library and stitched it together to do sentiment analysis on a corpus of text". I will ask what algorithms the underlying library used. You used an SVM? Great, talk to me about your kernel selection method. I want to know that you understand the math, and more importantly the assumptions and limitations of the library you are using.
The reason is that when you trade on a model based on your machine learning, I want to know that you know when it breaks. Finance is an industry that loves to model, yet has crashes that were predicted to be 1-in-1000-year events happening every 10 years.
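To make that bar concrete, here's the level of detail I mean: being able to write down, say, the RBF kernel an SVM library computes for you, rather than only calling it (the gamma value here is an arbitrary choice):

```python
import numpy as np

# k(x, y) = exp(-gamma * ||x - y||^2): the Gram matrix an SVM
# library builds internally when you select an RBF kernel
def rbf_gram(X, gamma=0.5):
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T   # pairwise squared distances
    return np.exp(-gamma * np.maximum(d2, 0.0))    # clamp tiny negatives

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = rbf_gram(X)
print(np.round(K, 3))
```

Knowing this form is also what tells you the limitations: gamma sets a length scale, and a wildly wrong one makes every point look either identical or unique.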
I love helping programmers who want to become quants get into the industry. Please feel free to ask if you have questions!
It seems very active right now.
Here is some further information: https://pymc-devs.medium.com/the-future-of-pymc3-or-theano-i...
I haven't really found references to its new name "Aesara".
Apparently, the main new feature for Theano will be the JAX backend.
Some thoughts, though, from my experience working with Theano, and also deep in the internals (trying to get further graph optimizations into theano.scan):
- Some parts of the code are not really clean.
- The code is extremely complex and hard to follow. See this: https://github.com/pymc-devs/Theano-PyMC/blob/master/theano/...
- This also made it very complicated to perform optimizations on the graph. See this: https://github.com/pymc-devs/Theano-PyMC/blob/master/theano/...
- In this specific case, it's also a problem of the API: theano.scan would return the whole sequence. But if you only need the last entry, i.e. y[-1], there is a very complicated optimization rule that checks for that. Basically, many optimizations around theano.scan are very complicated because of this.
- Here is one attempt for some optimization on theano.scan: https://github.com/Theano/Theano/pull/3640
- The graph building and especially the graph optimizations are very slow. This is because all the logic is done in pure Python. If you have big graphs, even just building up the graph can take time, and the optimization passes take much longer. This was one of the most annoying problems when working with Theano: the startup time to build the graph could easily take several minutes. I also doubt that you can optimize this very much in pure Python; I think you would need to reimplement it in C++ or so. When switching to TensorFlow, building the graph felt almost instant in comparison. I wonder if they have any plans for this in this fork.
- On the other hand, the optimizations on the graph are quite nice. You don't really have to care too much when writing code like log(softmax(z)); it will also be optimized to be numerically stable.
- The optimizations also went so far as to check whether some op can work in place on its input. That made writing ops more complicated, because if you want good performance, you would write two versions: one that works in place on the tensor and one that does not. And then two further versions again if you want CUDA as well.
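As an illustration of the log(softmax(z)) point above, here's what that rewrite buys you numerically (plain NumPy standing in for the graph optimizer):

```python
import numpy as np

z = np.array([1000.0, 1001.0, 1002.0])      # large logits

# Naive log(softmax(z)): exp overflows and the result is all-nan
with np.errstate(over='ignore', invalid='ignore'):
    naive = np.log(np.exp(z) / np.exp(z).sum())

# The stable rewrite an optimizer can substitute automatically:
# log_softmax(z) = (z - max(z)) - log(sum(exp(z - max(z))))
shifted = z - z.max()
stable = shifted - np.log(np.exp(shifted).sum())

print(naive)    # [nan nan nan]
print(stable)   # finite log-probabilities
```

Doing this substitution on the graph means users can write the readable form and still get the stable one.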